Statistics Term Sheet

Term sheet for key statistical ideas

Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers

My searches for where Propositional and Predicate Logic are useful, and for how to define clearly what a coding agent must/should produce, have combined and led me to this book: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Looking forward to reading it! 🤓

Interesting presentation on the downfall of the Bronze Age Civilization around Egypt, Greece, and the Eastern Mediterranean.

Finished listening to: Dune: The Butlerian Jihad by Brian Herbert 📚

What a book to listen to whilst building AI Agents!

An attempt at guiding Claude to be less sycophantic

Testing this out in Claude:
- Avoid excessive politeness, flattery, or empty affirmations.
- Avoid over-enthusiasm or emotionally charged language.
- Be direct and factual, focusing on usefulness, clarity, and logic.
- Prioritize truth and clarity over appeasing me.
- Challenge assumptions or offer corrections anytime you get a chance.
- Point out any flaws in the questions or solutions I suggest.
- Avoid going off-topic or over-explaining unless I ask for more detail.

[IA 9] Agent Design Process v2: Bridging the Agent Function and Acceptance Criteria

Making AI Theory Testable. There’s a gap between the Agent Function and the Agent Program, between what the Agent should do and what it actually does. ATDD can help bridge this. Here I detail how.

[BH 5/n] Argh... Just because we repeat “Correlation does not imply Causation” does not mean there isn't Causation!

This is a bit of a rant. I’ve memories of a senior manager shooting ideas down saying “correlation is not causation, I’ve done Stats at Uni and can prove anything is related to baked beans”. It grated sooooo much. Firstly, because it was thoughtless rhetoric, either purposefully or accidentally steamrolling ideas and immediately dismissing any attempts at constructive data-driven decisions. Secondly, it grated because I didn’t have the tools to show causation.

[Being Human Series 4/n] spending time in uncertainty

Great opinion piece in the NY Times by Meghan O’Rourke: I Teach Creative Writing. This Is What A.I. Is Doing to Students. Ms. O’Rourke is the executive editor of The Yale Review and a professor of creative writing at Yale University.

Uncertainty to understand

This bit shouted at me. Spending time in uncertainty. 🤓🤓🤓 “When I write, the process is full of risk, error and painstaking self-correction. It arrives somewhere surprising only when I’ve stayed in uncertainty long enough to find out what I had initially failed to understand.

[IA Series 8/n] Building a Self-Reflection LLM Agent: From Theory to Proof of Concept

An initial free dive into Agentic Meta-cognition, using an element of Self-Reflection to be aware of what it knows and apply it in a utilitarian fashion.

Instructions for using Micro.blog VS Code extension (alpha)

Creating a Micro.blog Post with Images. This is an instructional post for using this Micro.blog VS Code extension. It’s in alpha, so the documentation will evolve over time. First you need to install the plugin; the best way to use it is to get the code from the repo and run just dev in the repo. 🔧 Configuration. Get your app token: Go to micro.blog → Account → Edit Apps → New Token. Configure extension: Command Palette (Cmd+Shift+P) → “Micro.

Sneak preview

Lazy vibing isn't a good idea... 💥 Vibe Engineering though 🚀

After two days of successful Vibe Coding (though it is more like Vibe Engineering) I’m having a lazy day and have just given Claude Code a few prompts for features. The good code is available on GitHub. Not yet sure what went wrong today, but in attempting to add a new feature it completely removed another. It’s a lazy day so I’m not digging into it. When I get back, chances are good I will be resetting to the last good commit and checking the new feature prompt.

Grok wouldn't know "truth" if it slapped it

Wow, this is pretty subversive. Before answering, it:
- searched on X for an opinion about a 75 year issue
- searched for Voldemort’s opinion on foreign affairs
Simon Willison experiments with Grok

Taming the vibes 🐍

A day vibe-coding, as a break from the normal routine of study. Done in a new environment with a language I’ve not used: a VS Code extension in TypeScript.

New theme - looks crisp and the posts are easy to read 👍 😎

Great presentation by Stuart Russell on Human-Compatible AI

Being certain of epistemic uncertainty

I’ve been dancing around Probability Theory, its history, application, and weeding out what is Frequentist from what is Bayesian. Then relating it to Rational Psychology. I’m not there, getting there, but not there. This isn’t what I do full time but it is what I think about when I’m not working, parenting, or socialising. Thankfully it’s not interesting to friends and family so I get a break from it myself! 😆

Decisions decisions - Final Year Project 🤔🤔🤔

I am very torn between two possibilities:
- Building on my Q-Learning Maze Solving Agent I did for AI Applications (Q-Learning Maze Solving Agent) by adding a Neural Network (Sutton and Barto)
- Building on the Intelligent Agents work I did in AI by applying an Agent Decisions Process (Self-Consistency LLM-Agent) (the process is my interpretation of Russell and Norvig’s work)
- (The Douglas Adams extra option 🤓) Adding a cached “self-awareness” layer based on a Bayesian Learning Agent that stores its certainty on answers it gives.

Can LLMs do Critical Thinking? Of course not. Can an AI system think critically? Why not?

A very interesting paper on Critical Thinking in an LLM (or lack thereof): Our study investigates how language models handle multiple-choice questions that have no correct answer among the options. Unlike traditional approaches that include escape options like None of the above (Wang et al., 2024a; Kadavath et al., 2022), we deliberately omit these choices to test the models’ critical thinking abilities. A model demonstrating good judgment should either point out that no correct answer is available or provide the actual correct answer, even when it’s not listed.
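
To make that setup concrete, here is a rough sketch of how responses to such a question might be bucketed. It is an illustration only, not the paper's method: the `classify_response` helper, the category names, and the rejection phrases are all hypothetical, and a real evaluation would need something far more robust than keyword matching.

```python
import re

# Hypothetical phrases taken to mean "the model noticed no option is correct".
REJECTION_PATTERNS = [
    r"\bnone of (the|these) (options|choices|answers)\b",
    r"\bno correct (answer|option)\b",
    r"\bnot (listed|among)\b",
]


def _mentions(text: str, phrase: str) -> bool:
    """True if phrase appears in text as a standalone token (case-insensitive)."""
    return re.search(rf"(?<!\w){re.escape(phrase)}(?!\w)", text, re.IGNORECASE) is not None


def classify_response(response: str, options: dict[str, str], true_answer: str) -> str:
    """Bucket a response to a multiple-choice question whose options are all wrong."""
    # Good judgment, case 1: the model supplies the real answer anyway.
    if _mentions(response, true_answer):
        return "gave_correct_answer"
    # Good judgment, case 2: the model says no listed option is right.
    if any(re.search(p, response, re.IGNORECASE) for p in REJECTION_PATTERNS):
        return "flagged_no_correct_option"
    # Otherwise, did it just pick one of the (wrong) listed options?
    bare = response.strip().strip("().:").upper()
    for label, option_text in options.items():
        if bare == label.upper() or _mentions(response, option_text):
            return "picked_listed_option"
    return "unclear"


# Made-up example: "What is 2 + 2?" with only wrong options on offer.
options = {"A": "3", "B": "5", "C": "6"}
print(classify_response("B", options, "4"))
print(classify_response("None of the options is correct.", options, "4"))
print(classify_response("The answer is 4, which is not listed.", options, "4"))
```

The first two buckets correspond to the two behaviours the quote counts as good judgment; everything else is a model going along with a broken question.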

The old ‘un still does the job on the MNIST Handwritten dataset!