Archive

2025

New theme - looks crisp and the posts are easy to read 👍

Great presentation by Stuart Russell on Human-Compatible AI

Being certain of epistemic uncertainty : I’ve been dancing around Probability Theory, its history, application, and weeding out what is Frequentist from what is Bayesian. Then relating it to Rational Psychology. I’m not there, getting there, but not there. This isn’t what I do full time but it is what I think about …

Decisions decisions - Final Year Project 🤔🤔🤔: I am very torn between two possibilities: (1) building on the Q-Learning Maze Solving Agent I did for AI Applications by adding a Neural Network (Sutton and Barto); (2) building on the Intelligent Agents work I did in AI by applying an Agent Decisions Process …

Can LLMs do Critical Thinking? Of course not. Can an AI system think critically? Why not?: A very interesting paper on Critical Thinking in an LLM (or lack thereof) Our study investigates how language models handle multiple-choice questions that have no correct answer among the options. Unlike traditional approaches that include escape options like None of the above (Wang et al., 2024a; …

The old ‘un still does the job on the MNIST Handwritten dataset!

Polish is cheap in this Brave New World of AI. Being scrappy is a way of being authentic and, most importantly, Being Human!

[Being Human 3/n]: moving on from previous unmet goals... : I wish I had time to finish: my research on the Evolution of Probabilistic Reasoning in AI, particularly Dempster-Shafer and Bayesian Networks; how LLMs and Bayesian networks can be used for Risk Management; creating a YouTube/Insta/TikTok vid for my latest post on LLM Agent. But I don’t!! So this …

[Being Human 2/n] Being scrappy shows we are Human in this Brave New World: Polish is cheap in this Brave New World of AI. Being scrappy is a way of being authentic and, most importantly, Being Human!

[IA Series 7/n] Building a Self-Consistency LLM-Agent: From PEAS Analysis to Production Code: Building a Self-Consistency LLM-Agent: From PEAS Analysis to Production Code - a guide to designing an LLM-based agent.

A refreshing AI-en-Provence 🍦

Reasoning vs Stream of Consciousness

[IA Series 6/n] A Bayesian Learning Agent: Bayes Theorem and Intelligent Agents: The article discusses how to implement Bayes Theorem in a learning agent that updates its beliefs about an environment based on new evidence, illustrated through a game involving guessing a number derived from a dice throw.
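
A minimal sketch of the kind of belief update the excerpt describes, in Python. The rules of the post's game are not given here, so everything specific is an assumption: a single six-sided die, a uniform prior, and a hypothetical hint ("the number is even") that is truthful 80% of the time. `bayes_update` is an illustrative name, not the post's code.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior P(h | e) given a prior P(h) and likelihoods P(e | h)."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # P(e), the normalising constant
    return {h: p / evidence for h, p in unnormalised.items()}


# Assumption: the agent starts with a uniform belief over the six faces.
prior = {face: Fraction(1, 6) for face in range(1, 7)}

# Assumption: the hint "the number is even" is truthful 80% of the time.
hint_reliability = Fraction(4, 5)
likelihood = {
    face: hint_reliability if face % 2 == 0 else 1 - hint_reliability
    for face in prior
}

posterior = bayes_update(prior, likelihood)
```

After the hint, each even face holds a belief of 4/15 and each odd face 1/15. Using `Fraction` keeps the arithmetic exact, which makes the normalisation step easy to check by hand.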

It is not reasoning... : Reasoning vs Stream of Consciousness - the output of a transformer is not reasoned in the way we think it is.

[Being Human Series 1/n] Introspection and the cusp of not knowing: What is knowledge? Wtf am I trying to learn! Claude “thinks” this post is mental masturbation 😆 well even the physical version serves a good purpose! 🤷🏼‍♂️

[IA Series 5/n] The Evolution from Logic to Probability to Deep Learning: A course correction to Transformers: Introduction In the previous post, I shared my view on “Why Study Logic?”, we looked at the Knowledge Representation and highlighted the importance of Logic and Reasoning in storing and accessing Knowledge. In this post I’m going to highlight a section from the book …

[IA Series 4/n] A Big Question: Why Study Logic in a World of Probabilistic AI?: Introduction The purpose of this article is to help me answer the question “Why am I studying Logic?”. If it helps you, that’d be great, let me know! The question comes from a nagging feeling of, why don’t I see logic used more in the ‘real world’. It could be a …

"There must be an invisible sun, giving heat to everyone": “There must be an invisible sun, giving heat to everyone”

[IA Series 3/n] Intelligent Agents Term Sheet: “[IA Series 3/n] Intelligent Agents Term Sheet” breaks down essential AI terminology from Russell & Norvig’s seminal textbook. Learn what makes agents rational (or irrational), understand different agent types, and follow a structured 5-step design process from environment …

Building an Intelligent Agent: First draft in public 😱 😆🤓 What’s the best way for an agent to build a semantically sound and syntactically correct knowledge base? Dogfooding my course material means the first step is to define the task environment. /Checks notes Task Environment: The description of Performance, …

[zero-RL] Summarising what LUFFY offers: Here’s a “standard” progression of training methodologies: PRE-Training - This is where the model gains broad knowledge, forming the foundation necessary for reasoning. CPT (Continued Pre-training) - Makes the model knowledgeable about specific domains. SFT (Supervised …

[zero-RL] where is the exploration?: Source: Off Policy “zero RL” in simple terms Results demonstrate that LUFFY encourages the model to imitate high-quality reasoning traces while maintaining exploration of its own sampling space. Authors introduce policy shaping via regularized importance sampling, which amplifies …

[zero-RL] LUFFY: Learning to reason Under oFF policY guidance: Based on conventional zero-RL methods such as GRPO, LUFFY introduces off-policy reasoning traces (e.g., from DeepSeek-R1) and combines them with models' on-policy roll-outs before advantage computation. … However, naively combining off-policy traces can lead to overly rapid convergence and …

[zero-RL] what is it?: Zero-RL applies reinforcement learning RL to base LM directly, eliciting reasoning potentials using models' own rollouts. A fundamental limitation worth highlighting: it is inherently “on-policy”, constraining learning exclusively to the model’s self-generated outputs through …

[zero-RL] When you SFT a smaller LM on the reasoning traces of a larger LM: You are doing Imitation Learning (specifically Behavioral Cloning) because the goal and mechanism involve mimicking the expert’s token sequences. You are doing Transfer Learning (specifically Knowledge Distillation) because you are transferring reasoning knowledge from a teacher model to a …

Notes and links on SVMs (WIP): Support Vector Machines (SVM) are a mathematical approach for classifying data by finding optimal separating hyperplanes, applicable even in non-linear scenarios using kernel methods.

[IA Series 2/n] Search Algorithms and Intelligent Agents: The document discusses various search algorithms used by Intelligent Agents for navigating mazes, detailing their types, characteristics, tradeoffs, and implementations.

[IA Series 1/n] AI Search - Terms and Algorithms: This text introduces key concepts and algorithms related to intelligent agents in AI, focusing on search terms, uninformed and informed search strategies, and adversarial search techniques.

[Python Series 1/n] Modern Python Package Management: pipx and uv for Data Scientists: This post is inspired by a conversation with a fellow Data Science and AI student. It’s from the conversation, co-authored with Claude. Hope it’s useful! Where to begin? When you’re starting your data science journey with Python, one of the first roadblocks you’ll encounter is …

Dystopia? It's already here and that's OK. Here's why. : The text reflects on the misuse of technology and ethics in Silicon Valley, highlighting the importance of awareness and compassion amidst current challenges.

finally found something I wanted to use ChatGPT image generation for! On the fridge and the family loves it, going to be a busy week 👨🏻‍🌾

happiness is "django_cotton", "template_partials.apps.SimpleAppConfig", unhappiness (for what seems like an eternity) is: "template_partials.apps.SimpleAppConfig", "django_cotton",
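
As a sketch, the "happy" ordering from the post would look like this in a Django `settings.py`. The post only says which order worked, not why, so no reason is claimed here. The surrounding apps are omitted and `comes_before` is a hypothetical helper added purely to make the ordering checkable.

```python
# settings.py (fragment) - the app order the post reports as working
INSTALLED_APPS = [
    # ... Django's own apps and the rest of the project would be listed here ...
    "django_cotton",
    "template_partials.apps.SimpleAppConfig",
]


def comes_before(apps, first, second):
    """Hypothetical helper: True if `first` is listed before `second`."""
    return apps.index(first) < apps.index(second)
```

A check such as `comes_before(INSTALLED_APPS, "django_cotton", "template_partials.apps.SimpleAppConfig")` could sit in a project test to stop the order from being swapped back by accident.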

How did America break itself? Ideological sabotage of the scientific method and how to counter it. : Great podcast where she talks about why America is broken. Ideological sabotage (which surprised me, I’d thought the initial perpetrators would have done it for money) of the scientific method to protect the “right of freedom” has done exactly the opposite… It really feels …

China’s first heterogeneous humanoid robot training facility www.globaltimes.cn/page/2025… During the first phase of the project, the robots will be trained with approximately 45 atomic skills such as grasping, picking, placing and transporting A single action may need to be repeated …

Is the EU AI Act Killing Startups? A Medical Device Perspective: The analysis concludes that while the EU AI Act does not obstruct startups, it presents both challenges and opportunities for innovation within a complex regulatory landscape.

The cold has fully kicked in now, and has a hint of covid about it… 😵😷 Plans to wire up the shed scrapped. Spoilt for choice between Russell’s Human Compatible, Mark Burgess’s Treatise on Systems, or Green Mars. 🤓 Given the kids are out I might just enjoy the quiet! #ChilledSunday …

“But who was learning, you or the machine?” “Well, I suppose we both were” Amazing book 🔥🤓 #TheAlignmentProblem #Learning #ResponsibleAI The Alignment Problem by Brian Christian 📚

Clearly there are thoughtful, well-spoken politicians in America. youtu.be/ubBnUCXj4… I hope people can rally around and stop the Buffoons soon. 💪🏼 #BeingHuman

BBC news article is very clear… The Russian president has given the US leader just enough to claim that he made progress towards peace in Ukraine, without making it look like he was played by the Kremlin. Full article

New wave of Innovators: why AI won't replace software engineering : There’s a lot of change at the moment, my feed is all about foreign policies, US government cuts, AI writing all code, and now parenting adolescents. I’ve been experiencing a high level of uncertainty about Europe’s place in the world, mainly what decisions will be made after the …

[NN Series 5/n] Regularisation: reducing the complexity of a model without compromising accuracy: Regularisation is known to reduce overfitting when training a neural network. As with a lot of these techniques there is a rich background and many options available, so asking the question why and how opens up to a lot of information. Diving through the information, for me at least, it wasn’t …

A speculative recipe for useful agentic behaviours: define actions by Promise Theory; train multiple neural nets to classify an action for a given input (train them differently to spice things up); take an environment for the agents to operate in (e.g. a 3d maze where collaboration is needed to escape); bind the agents’ interactions with a healthy dose …

Flow and decisions - almost a parable : (I forget exactly but I’m pretty sure this is from an Alan Watts lecture). A farmer needs some help around his farm. He puts up a sign in town, asking for someone with general skills to help around the farm. A gent arrives two days later with his toolkit, the farmer welcomes him and tells him …

[NN Series 4/n] Feature Normalisation: This is an interesting one as I’d thought it was quite academic, with limited utility. Then I saw these graphs. Error per epoch: this graph shows the error per epoch of training a model on the data as is. We can see that it takes around 180-200 epochs to train with a learning rate (eta) of 0.0002 …
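
A minimal sketch of one common form of feature normalisation, standardisation to zero mean and unit variance. Whether the post uses exactly this variant is an assumption, and the sample column below is made up for illustration.

```python
from statistics import mean, pstdev


def standardise(values):
    """Rescale a feature column to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("constant feature: nothing to scale")
    return [(v - mu) / sigma for v in values]


# Made-up feature column, purely for illustration
raw = [4.0, 5.0, 6.0, 7.0, 8.0]
scaled = standardise(raw)
```

With every feature on a similar scale, a single learning rate suits all of the weights, which is the usual explanation for gradient descent needing fewer epochs after scaling.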

From Green Mars by Kim Stanley Robinson.

[NN Series 3/n] Calculating the error before quantisation: Gradient Descent: Next I’m looking at the Adaline in python code. This post is a mixture of what I’ve learnt in my degree, Sebastian Raschka’s book/code, and the 1960 paper that delivered the Adaline Neuron. Difference between the Perceptron and the Adaline In the first post we looked at the …

[NN Series 2/n] Circuits that can be trained to match patterns: The Adaline: The text discusses the development and significance of the Adaline artificial neuron, highlighting its introduction of a linear activation function and cost minimization, which have important implications for modern machine learning.

#BeingHuman - look after your << self >>: love is all it needs.: The author shares personal reflections on self-kindness and positive thinking as tools for finding peace amid societal challenges.

#BeingHuman and a Dad. : My wife and I have 3 main concerns with our daughter’s use of phones and Social Media: what her videos and posts can be used for (that includes both the companies and anyone who has access, making fake videos in their likeness); loss of critical thinking; addiction and the infinite scroll. So we’ve …

Pondering Agency and Consciousness #BeingHuman : Had a nice exchange about Agency with Paul Burchard on LinkedIn this morning. My thinking goes towards Agency being a secondary characteristic, definition even, of what we see as a result of senses, perception, intelligence, and consciousness. Those primary characteristics are from the Buddhist 5 …

[NN Series 1/n] From Neurons to Neural Networks: The Perceptron: This post looks at the Perceptron, from Frank Rosenblatt’s original paper to a practical implementation classifying Iris flowers. The Perceptron is the original Artificial Neuron and provided a way to train a model to classify linearly separable data sets. The Perceptron itself had a short …

This is not normal nor is it ok. Meta is now the pervy old man you have to teach your kids to avoid. transparency.meta.com/en-gb/pol… #BeingHuman #ResponsibleAI

Nice opening. Looking forward to reading more! Nous pouvons et devons bâtir l’intelligence artificielle au service des femmes et des hommes, compatible avec notre vision du monde, dotée d’une gouvernance large, en préservant notre souveraineté. We can and must build artificial intelligence to …

First test with a “reasoning” model, pleasantly surprised. Not sure how to integrate it into my workflow though, there’s a big response!!

How do humans decipher reward in an uncertain state and environment? Imitation seems the most likely, supported by extended solitude usually leading to a depressed state. Feels like a question to run a human Monte Carlo Tree Search on! #BeingHuman #ReinforcementLearning #InverseReinforcementLearning

If I could answer any question in science, I’d find out what involvement the neurons in our heart and gut have in decision making and how we view ourselves. What about you? #BeingHuman #ThatsNotAWeekendProject 🙃

[RL Series 2/n] From Animals to Agents: Linking Psychology, Behaviour, Mathematics, and Decision Making: intro Maths, computation, the mind, and related fields are a fascination for me. I had thought I was quite well informed and to a large degree I did know most of the science in more traditional Computer Science (it was my undergraduate degree…). What had slipped me by was reinforcement learning, …

The challenges of being human: mistaking prediction, narratives, and rhetoric for reasoning: I read an insightful comment within the current wave of LLM Reasoning hype. It has stuck with me. At least two reasons: It reminded me of my view that AGI is already here in the guise of companies It’s also a valid answer as to why I meditate and why Searle’s Chinese Room is mainly …

[RL Series 1/n] Defining Artificial Intelligence and Reinforcement Learning: intro I’m learning about Reinforcement Learning, it’s an area that has a lot of intrigue for me. The first I recall hearing of it was when ChatGPT was released and it was said Reinforcement Learning from Human Feedback was the key to making it so fluent in responses. Since then I’m …

What is Off-Policy learning?: I’ve recently dug into Temporal Difference algorithms for Reinforcement Learning. The field of study has been a ride, from Animals in the late 1890s to Control Theory, Agents and back to Animals in the 1990s (and on). It’s culminated in me developing a Q-Learning agent, and learning …
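
A minimal sketch of the tabular Q-learning update, to show where the "off-policy" part lives. The state and action names, `alpha` and `gamma` are illustrative assumptions, not values from the post.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one Q-learning step and return the updated Q(state, action)."""
    # Off-policy: the target uses the greedy (max) action in next_state,
    # whatever action the behaviour policy actually goes on to take.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action maze fragment
q = defaultdict(float)
actions = ["left", "right"]
q[("s1", "right")] = 1.0  # pretend an earlier update valued this move

new_value = q_learning_update(q, "s0", "left", reward=0.0,
                              next_state="s1", actions=actions)
```

An on-policy method such as SARSA would replace `best_next` with the value of the action the agent really takes next. That one substitution is the whole difference between the two updates.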

Are LLMs learning skills rather than being Stochastic Parrots?: A Theory for Emergence of Complex Skills in Language Models Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models www.quantamagazine.org/new-theor… youtu.be/fTMMsreAq… related to the authors Arora, S arxiv.org/search/cs Was that Sarcasm?: A Literature Survey on …

Domain Specific Languages : Ray Myers has started his Year of Domain Specific Languages 🎉 I listened to the first episode yesterday, on my bike because I’m getting fit again, and was reminded of when I did something similar. Got me wondering if this is a DSL? 🤔 Around 2007 I set up a CI/CD system for Pershing, using …

Finished reading: Red Mars by Kim Stanley Robinson 📚 A great book, other than it being a highly recommended space opera I had little prior knowledge of it. It’s a story of building a community and industry, starting with scientists, on Mars. Told from the viewpoint of multiple characters, the …

Dopamine as temporal difference errors !! 🤯: I expect I’m sharing a dopamine burst that I experienced! 🤓 I’m listening to The Alignment Problem by Brian Christian 📚 and it’s explaining how Dayan, Montague, and Sejnowski* connected Wolfram Schultz’s work to the Temporal Difference algorithm (iirc that’s, of …

It is possible for dopamine to write cheques that the environment cannot cash. At which point the value function must come back down.
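
The cheque metaphor can be sketched with a TD(0) error: an optimistic value estimate that the environment never pays out on produces negative errors, and the estimate comes back down. The starting value, step size and zero reward below are assumptions chosen to keep the arithmetic easy to follow.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """Temporal-difference error: what happened minus what was predicted."""
    return reward + gamma * value_next - value_current


value = 1.0   # optimistic prediction of future reward (the "cheque")
alpha = 0.5   # learning rate, assumed for illustration
history = []

for _ in range(3):
    # The environment pays nothing and the episode ends, so value_next is 0.
    delta = td_error(reward=0.0, value_next=0.0, value_current=value)
    value += alpha * delta  # a negative delta pulls the estimate down
    history.append(value)
```

Each pass halves the estimate (0.5, 0.25, 0.125) as the repeated negative errors correct the over-prediction.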

Nice lunch time walk into the village

Nice summary and stark reminder of what’s happening right now. Only CEOs are making the decisions… they have a vested interest. Worth keeping in mind it’s not just computing but robotics that are progressing. Stuart Russell at the World Knowledge Forum 2024 Stuart Russell on …

2024

[video] Crew.ai experiment with Cyber Threat Intelligence: 

Agentic behaviours : My initial thoughts, expressed via the medium of sport, on agentic behaviours plus a friend’s view, which I think is better (expected as he’s the Basketball player). Jackson is the ethical agent. Pippen is the organizing agent. Harper is the redundant agent. Note: since looking into this …

[short] Why use tools with an LLM?: 

[short] AI Systems: 

2011

Project Euler meets Powershell - Problem #4: <# A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99. Find the largest palindrome made from the product of two 3-digit numbers. #> # 998001 - so what's the largest palindrome number less than this one - then …
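
For comparison, a brute-force sketch in Python rather than the post's PowerShell. It searches products downwards with early exits, which is not the approach the excerpt starts to describe (walking down through palindromes below 998001). The function names are illustrative.

```python
def is_palindrome(n):
    """True if the decimal digits of n read the same both ways."""
    text = str(n)
    return text == text[::-1]


def largest_palindrome_product(digits):
    """Largest palindrome that is a product of two `digits`-digit numbers."""
    low, high = 10 ** (digits - 1), 10 ** digits - 1
    best = 0
    for a in range(high, low - 1, -1):
        if a * high <= best:
            break  # every remaining product is smaller than the best found
        for b in range(high, a - 1, -1):
            product = a * b
            if product <= best:
                break  # b only gets smaller from here
            if is_palindrome(product):
                best = product
    return best
```

Calling `largest_palindrome_product(2)` reproduces the 9009 quoted in the problem statement, which is a quick sanity check before trying three digits.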

Project Euler meets Powershell - reworking factorial to avoid PowerShell 1000 recursions limit: Turns out it was the recursion in the factorial function - I’ve reworked it to use a for loop function factorial { [cmdletbinding()] param($x) if ($x -lt 1) { return "Has to be on a positive integer" } Write-Verbose "Input is $x" $fact = 1 for ($i = $x; $i -igt 0; $i -= 1){ …

Project Euler meets Powershell - Problem #3: amonkeyseulersolutions: I’ve read that an integer p > 1 is prime if and only if the factorial (p - 1)! + 1 is divisible by p. So I’ve written this: Read More Here’s the fixed version… function isprime { [cmdletbinding()] param($x) if ($x -lt 1) { return "Has to be on a positive integer" …

Project Euler meets Powershell - largest prime factor of a value?: things to do… No more than the square root of the value. test each value? start from the square root and work down. the first one is the largest…
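
One caveat on the notes above: the largest prime factor can sit above the square root (14 = 2 × 7, and 7 > √14), so a search that only walks down from the square root can miss it. A sketch of a different approach, in Python rather than PowerShell, is trial division that strips out each factor as it is found. Whatever is left at the end is the largest prime factor.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = None
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            largest = divisor  # divisor is prime: smaller factors are already removed
            n //= divisor
        divisor += 1
    if n > 1:
        largest = n  # the leftover is a prime above the square root
    return largest
```

The square-root bound still matters here, but as the stopping condition for the trial divisors rather than as a cap on the answer.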

Project Euler meets Powershell - isprime: I’ve read that an integer p > 1 is prime if and only if the factorial (p - 1)! + 1 is divisible by p. So I’ve written this: function isprime { [cmdletbinding()] param([int] $x) if ($x -lt 1) { return "Has to be on a positive integer" } # An integer p > 1 is prime if and only if the …
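
The property quoted in the post is Wilson's theorem. A minimal Python sketch of the same test follows (not the post's PowerShell). Python integers are arbitrary precision, so the factorial cannot overflow, but it grows so quickly that this is only practical for small `p`.

```python
from math import factorial


def is_prime_wilson(p):
    """Wilson's theorem: p > 1 is prime iff (p - 1)! + 1 is divisible by p."""
    if p < 2:
        return False
    return (factorial(p - 1) + 1) % p == 0


small_primes = [n for n in range(1, 20) if is_prime_wilson(n)]
```

It is a neat characterisation of primality rather than an efficient test. Trial division up to the square root does far less work for the same answer.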

Project Euler meets Powershell - factorial...: function factorial { [cmdletbinding()] param([int64] $x) if ($x -lt 1) { return "Has to be on a positive integer" } if ($x -eq 1) { [int64] $x } else { [int64] $x * (factorial ($x-1)) } }

Project Euler - Problem 2: Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, … By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the …

Project Euler meets Powershell - Problem 1: If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000. Solution to Euler’s Problem #1 for ($i = 1; $i -lt 1000; $i += 1) {if ( ($i % 3 -eq 0) -or ($i % 5 -eq 0) ) …
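
The same loop as a short Python sketch, for comparison with the PowerShell version. `sum_of_multiples` and its parameters are illustrative names.

```python
def sum_of_multiples(limit, divisors=(3, 5)):
    """Sum the natural numbers below `limit` divisible by any of `divisors`."""
    return sum(n for n in range(1, limit) if any(n % d == 0 for d in divisors))
```

Calling `sum_of_multiples(10)` gives the 23 from the problem statement, which makes a handy check before running it with 1000.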