@thompsonson Headline is OTT but the content is interesting, particularly in showing that it continues to behave like a wave after interaction by observation
Replies
@thompsonson The implication that I see:
- a well trained LLM does not benefit from extended thinking
- it would be better to call “thinking” internal context generation
- there may be a “perfect” context for a given question
@thompsonson Is there something in the entropy of the next-token probability distribution? (I need to find the paper that shows “reasoning tokens”, like however or therefore, have a high entropy.)
What is the entropy at each token in SLMs and LLMs compared to the r_c and r_o values… 🤔
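A minimal sketch of the quantity in question, assuming "entropy" here means the Shannon entropy of the next-token distribution. The two distributions are made-up illustrations, not measurements from any model, and `shannon_entropy` is a hypothetical helper name.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-token distributions over a 4-token vocabulary.
confident = [0.97, 0.01, 0.01, 0.01]   # model is nearly sure of the next token
uncertain = [0.25, 0.25, 0.25, 0.25]   # model is maximally unsure

print(shannon_entropy(confident))  # low entropy
print(shannon_entropy(uncertain))  # high entropy (the maximum for 4 tokens)
```

Computed per position over a generated sequence, this would give the per-token entropy curve to compare between SLMs and LLMs.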
@thompsonson “thinking” in the context of LLMs is simply the LLM creating more context from its own model.
It is not reasoning, it is linking in more information from its own internal information source.
Training a model on Chain-of-Thought is a way to train it to collect more relevant data.
Initial thoughts from: arxiv.org/abs/2507….
@thompsonson Potential updated version to include Acceptance Criteria:
Agent Design Process
Environment Analysis
Environment Specification: Specify the task environment using the PEAS framework (Performance measure, Environment, Actuators, Sensors)
Environment Analysis: Determine the properties of the Task Environment (observable, deterministic, static, discrete, single/multi-agent)
Architecture Selection
Agent Function: Define the ideal behaviour - what the agent ought to do - in abstract terms (mathematical mapping from percept sequences to actions)
Agent Type Selection: Choose appropriate agent architecture (simple reflex, model-based, etc.) capable of implementing the agent function
Acceptance Criteria (ATDD): Define testable observable behaviors using Given-When-Then format based on agent function and type constraints
Implementation Considerations
Agent Program: Implement the chosen architecture within physical constraints (compute availability, performance vs cost, etc.)
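A rough sketch of how the Agent Function, Agent Type and Acceptance Criteria steps could fit together. The two-square vacuum world is the standard textbook toy example, used here only as an assumption for illustration; the function and test names are hypothetical.

```python
def reflex_vacuum_agent(percept):
    """Simple reflex agent: maps the current percept directly to an action."""
    location, status = percept
    if status == "dirty":
        return "suck"
    return "right" if location == "A" else "left"

def test_cleans_dirty_square():
    # Given the agent is in square A and the square is dirty
    percept = ("A", "dirty")
    # When the agent selects an action
    action = reflex_vacuum_agent(percept)
    # Then it cleans the square
    assert action == "suck"

def test_moves_on_from_clean_square():
    # Given the agent is in square A and the square is clean
    percept = ("A", "clean")
    # When the agent selects an action
    action = reflex_vacuum_agent(percept)
    # Then it moves to the other square
    assert action == "right"

test_cleans_dirty_square()
test_moves_on_from_clean_square()
```

Each Given-When-Then case only checks observable behaviour (percept in, action out), so the same criteria would still apply if the agent type changed later.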
@thompsonson A post from Ron Garret on the scientific method and the difference between science and religion.
@thompsonson I’m halfway through my coursework and refining my view of this
Today (12th May) I now think
Sensors
- Telegram API download
- TOR Forum Download
- Message Decomposer and Decontextualiser
- Entity Extractor
- Image analyser
- URL Verifier
Actuators
- Alerts
- Report
Environment
- Telegram channels
- TOR Forums
Performance Measures
- Accuracy of sensing (is the information correct)
- Speed of processing (from Sense to Act)
Sensors and Actuators feel better separated… still missing something - maybe the Agent Function/Program (some of which I may be trying to cram into this part)
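The PEAS breakdown above could be captured as plain data, which makes the gap (no agent function yet) easy to see. This is only a sketch: the `PEAS` class and its field names are assumptions for illustration, and the values are copied from the list above.

```python
from dataclasses import dataclass, field

@dataclass
class PEAS:
    """Hypothetical container for a PEAS task-environment specification."""
    performance: list[str] = field(default_factory=list)
    environment: list[str] = field(default_factory=list)
    actuators: list[str] = field(default_factory=list)
    sensors: list[str] = field(default_factory=list)

monitoring_agent = PEAS(
    performance=[
        "Accuracy of sensing (is the information correct)",
        "Speed of processing (from Sense to Act)",
    ],
    environment=["Telegram channels", "TOR Forums"],
    actuators=["Alerts", "Report"],
    sensors=[
        "Telegram API download",
        "TOR Forum Download",
        "Message Decomposer and Decontextualiser",
        "Entity Extractor",
        "Image analyser",
        "URL Verifier",
    ],
)

# The specification lists what goes in and what comes out,
# but nothing here yet maps percepts to actions.
print(len(monitoring_agent.sensors), "sensors ->", len(monitoring_agent.actuators), "actuators")
```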
@shubhendu.bsky.social Matter to whom, and why is that important?
My thinking is that it may not end well but any form of civil disobedience helps.
@shubhendu.bsky.social Doesn’t matter too much, just don’t use Twitter. 🤷🏼♂️
@faineg.bsky.social SAY IT LOUDER 😉
media.tenor.com/6szJWQP0fSIA…
@simon.fedi.simonwillison.net.ap.brid.gy One for the lawyers; I wonder what happens when the pay Meta clause is triggered (though I haven’t read the licence to double check it has that clause)… ?
@thompsonson History is named by personality clashes! 🤦🏼♂️
@thompsonson and even that isn’t right!!
all is good though :)
@jedi.be That’s strange to pronounce!
@eugenevinitsky.bsky.social As a serious answer, I tried CrewAI and found that a manager, acting as a planner, ensured the workflow was followed. Before that, the main output could come from an agent responsible for a task in the middle of the workflow.
It’s weird. I’ve put a pin in it whilst I study RL and Intelligent Agents.
@eugenevinitsky.bsky.social One anthropomorphic ask I have to move away from is the “WDYT?” at the end of a prompt.
I see it as “Select * from all your stored information and tools”.
It’s tricky to unpick anthropomorphic interaction…
@eugenevinitsky.bsky.social Eugene, use the vibes! 👻
@thompsonson A rebuttal of two common deflationary stances against LLM cognition
Wednesday, February 12, 2025 →
@jedi.be A tad… I wouldn’t say flawed… First thoughts go towards
“We believe that Data and AI, used together as a technology, can be inconsistent, therefore we should design for failure”
@garymarcus.bsky.social An AI Innovator would be forced to save the world by hacking into the Whitehouse and installing AI everywhere! Only then would the investors say “ok that’s enough”.
(Obviously half of this is bollocks).
@rldmdublin2025.bsky.social Nice catch, I was just trying to validate it. Reported and blocked!
@rldmdublin2025.bsky.social The PayPal link isn’t working…
@j2bryson.bsky.social Both please ☺️
@j2bryson.bsky.social That’s because they only think about solving the problems they give the AI and then industrialising that as AGI.
How about creating an AI that solves problems for things that are alive??
If they did that a lack of interoperability and fragmentation would mean nothing….
@garymarcus.bsky.social Everyone leaving unhappy but accepting is the best sign for the rest of us!
Max Tegmark still thinks this thing is going to grow exponentially?? 🧐
@garymarcus.bsky.social It’s like the opposite of the stone soup parable.
You go in with Air Con not working, then you’re told the indicators aren’t working, then the radio, now the steering is wrong, then the accelerator, now the wheels are the wrong shape.
You’re left with a husk and a bill that’ll take years to pay.
@mathver.bsky.social The article doesn’t hang itself eh!
Given the opposing forces of this situation, this paragraph on those and the similarity to GDPR seems a fair comment.
Also, is an Act taking over 5 years to be written fast??
@j2bryson.bsky.social Hope I’m not taking over your thread but it does seem the time to show support for other ways of doing things.
The Techno-Fascism that is occurring in the US should not be repeated elsewhere in the world (unless they want it 🫤… I vote against!!)
Nice article: www.techpolicy.press/anatomy-of-a…
@j2bryson.bsky.social She’s not welcome as a leader in any part of my thinking.
She had a great idea and did some awesome work two decades ago. Now she looks like she is trying to cash in.
Not welcome at all.
@j2bryson.bsky.social > every country can innovate and has expertise
💯
Also has free thinkers that genuinely wish for AI to be used for the benefit of humanity and protect the rights of all people.
I deeply wish you energy and courage in shoring up support 🖖🏼💪🏼
@garymarcus.bsky.social From what I’ve heard/researched I agree that supervised and unsupervised learning have hit a wall for research and development; there’s not the data to support the previous claims. Nor are there “emergent” properties.
However, it has a long way to go before it hits a wall in industrialisation and ubiquity.
@j2bryson.bsky.social De-regulation that combined personal and institutional finance is the gift that keeps giving.
People should have the right to speculate, even over-hype their products, it shouldn’t be tied to everyone’s way of life though. 😡 Especially as the general population don’t realise the bind they are in.
@codefrenzy.bsky.social I commented more professionally on LinkedIn, here I joke that the push back is well summarised by the law of Conservation of Misery.
Well, I say a joke, but experience says otherwise! 🥴😅
@garymarcus.bsky.social And benefit science and technology in developing nations?
Iiuc, most new recruits in American STEM/Academia are not American…
The development of these fields follows the need, along with the capability to innovate…
@raymyers.bsky.social I can get with that.
With the Deepseek hype I’m seeing a lot more DSLs for an LLM’s “thought” process being embedded in (pseudo-)XML.
I’m into hyperparameter sweeps for Neural Nets and Q-Learning agents at the moment, but I hope to dig into that type of DSL soon.
@thompsonson To scan/read “How dopamine enables learning from aversion”: bsky.app/profile/t…
@thompsonson.bsky.social @raymyers.bsky.social kudos on starting the year of Domain Specific Languages. I watched the first one yesterday; it got my brain moving and I’m looking forward to the next episode.
I need to refine my understanding, I thought Terraform and Ansible are DSLs but maybe it’s HCL and YAML… 🤔
@jedi.be Was hoping to make the whole thing but got plans here 15-20th, so have to miss.
🤞🏼for a better schedule for the Paris AI Engineer event later in the year.
@jedi.be I’m learning how ActivityPub via micro.blog works, specifically how to get a clear view of a conversation! With that I assume that request is for Ray rather than me.
@jedi.be Yeah, if it is broken down into subprocesses, each has a set of actions to complete, so the right context means better performance (anecdotal presently).
I think that means the big question is what is the right context?! 👀
@jedi.be 💯
Maybe the tool to build tools is a mix of an “SDLC” Domain Specific Language and an LLM/Agent that processes it.
@raymyers.bsky.social is doing a year of DSL. Ray, do you give any credence to a DSL being useful here?
@jedi.be I think this needs to be a bit more nuanced: you may have different agents/subprocesses, each responsible only for a specific task.
The agent/subprocess only has/needs specific context required for that task.
The context for each agent/subprocess may overlap.
Comes from the MARL work in Edinburgh.
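A minimal sketch of that idea, assuming context can be modelled as named items and each agent is given only the items its task needs. The agent names, context keys and helper functions are all hypothetical, chosen for illustration.

```python
# Which context items each agent/subprocess is allowed to see (hypothetical).
agent_context = {
    "planner": {"requirements", "workflow"},
    "coder": {"requirements", "codebase"},
    "tester": {"codebase", "acceptance_criteria"},
}

def context_for(agent, full_context):
    """Return only the slice of the full context that this agent needs."""
    allowed = agent_context[agent]
    return {key: value for key, value in full_context.items() if key in allowed}

def shared_context(agent_a, agent_b):
    """Context items two agents have in common (the overlap)."""
    return agent_context[agent_a] & agent_context[agent_b]

full_context = {
    "requirements": "user stories",
    "workflow": "plan, build, test",
    "codebase": "repository contents",
    "acceptance_criteria": "Given-When-Then cases",
}

print(context_for("coder", full_context))
print(shared_context("planner", "coder"))
```

The open question from earlier in the thread (what is the right context?) then becomes a question of how to choose the sets in `agent_context`.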