
The claim isn't "LLMs don't use tools"; the author is saying that LLMs can't make reliable inferences about their own knowledge or capabilities, which fundamentally limits their usefulness for many tasks. LLMs "know" that LLMs can't do math reliably, LLMs "know" that calculators can do math reliably, and yet LLMs generally just soldier on and try to do math themselves when asked. You can of course RL it or prompt it into writing JavaScript when it sees math, but so far LLMs haven't been capable of generalizing the process of "I am bad at X" + "Thing is good at X" -> "I should ask for Thing to do X" unless that specific chain of thought is common in the training data.

The solution so far has just been to throw more RL or carefully crafted synthetic data at it, but it's arguably more Pavlovian than it is generalized learning.

Someone could teach a dog to ring a bell that says "food" on it, and you could reasonably argue that it is using a tool. Will it then know to ring a bell that says "walk" when it wants to go outside?




I gave Sonnet a hard arithmetic problem and the ability to look for tools. It looked for a calculator, I gave it one, and it used it.

https://gist.github.com/IanCal/2a92debee11a5d72d62119d72b965...


The availability of tools and what they're named is going to influence its behavior. Gemini 2.0 Pro can obviously get this question right on its own, but the existence of a find_tool() option causes it to use it. Sorry it's scuffed, I just did it on my phone to make the point, but I'd imagine you could get similar results with the tools param, as all it's doing is putting the tool options into the context.

You are an advanced AI assistant that has a number of tools available to you. in order to use a tool, respond with "USE TOOL: <tool_name>(tool_parameter)".

Tools:

select_tool(<tool_name>)

find_tool(<search_term>)

Who stars in The Godfather?

> USE TOOL: find_tool("The Godfather cast")
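For anyone who wants to try this off a phone, here's a rough sketch of a harness for the reply format in that prompt. It only parses a "USE TOOL: <tool_name>(tool_parameter)" line and hands the parameter to a matching function. parse_tool_call, dispatch, and the stub find_tool are names I made up for illustration; none of this is any vendor's actual tools API.

```python
import re

# Matches the reply format from the prompt: USE TOOL: <tool_name>(tool_parameter)
TOOL_CALL = re.compile(r'^\s*USE TOOL:\s*(\w+)\((.*)\)\s*$')


def parse_tool_call(reply):
    """Return (tool_name, parameter) if the reply is a tool call, else None."""
    match = TOOL_CALL.match(reply)
    if match is None:
        return None
    name, param = match.groups()
    # Drop surrounding whitespace and quotes around the parameter.
    return name, param.strip().strip('"\'')


def dispatch(reply, tools):
    """Run the requested tool, or pass the reply through as a plain answer."""
    call = parse_tool_call(reply)
    if call is None:
        return reply  # the model answered directly
    name, param = call
    if name not in tools:
        return f"unknown tool: {name}"
    return tools[name](param)


# Hypothetical stub standing in for a real search backend.
def find_tool(search_term):
    return f"searched for: {search_term}"


if __name__ == "__main__":
    tools = {"find_tool": find_tool}
    print(dispatch('USE TOOL: find_tool("The Godfather cast")', tools))
    print(dispatch("Marlon Brando and Al Pacino.", tools))
```

The pass-through branch is the interesting part for this argument: the harness can't tell you whether the model should have answered directly, only whether it chose to.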



