Anthropic Killed Tool Calling: What It Means for Devs

basanta sapkota
“Approve every tool call.” Remember when that felt like the responsible, grown-up way to build agents?

Yeah… that vibe is fading. And it’s why the phrase Anthropic killed tool calling keeps ricocheting around YouTube, Reddit, and LinkedIn like it’s breaking news.

But breathe. Anthropic didn’t hit a big red button and erase tool calling from existence. What’s happening is quieter, weirder, and honestly kind of inevitable. Claude is drifting toward encapsulated, platform-managed tool use and more scalable agent orchestration, which means the old, developer-visible “tool calling loop” matters less than it used to.

If you’re building agents, MCP servers, or Claude-adjacent tooling, this is one of those shifts you ignore at your own peril.

Quick answer: did Anthropic kill tool calling… literally?

No.

People say “Anthropic killed tool calling” as shorthand for something more specific: Anthropic is de-emphasizing the classic, explicit function-calling workflow and moving toward tool use patterns that are more internalized, scalable, and harder to spoof.

Under the noise, there are two pretty real signals:

  • Anthropic is investing in advanced tool use patterns like tool discovery plus code-orchestrated tool calls, which cuts context bloat and reduces failure modes.
  • Anthropic tightened controls around who can use what “tooling” endpoints, especially where consumer subscriptions were being used like an API backend.

Let’s crack those open.

Why people keep saying it

Encapsulation: tool use becomes “Claude’s problem,” not ours

A YouTube breakdown literally titled “Anthropic Just Killed Tool Calling” frames the shift as Anthropic encapsulating tool usage so they can evolve tool behavior internally without exposing as much surface area to developers or third-party harnesses. Source: https://www.youtube.com/watch?v=8dVCSPXG6Mw

And yeah, I get the gut reaction. Fewer knobs. More “don’t worry, we’ll handle it.” If you’ve ever built a fragile tool loop and babysat it in production, you can probably feel why people are a little twitchy about this.

The bottleneck flipped: approving every call doesn’t scale

Nirant Kasliwal put it bluntly on LinkedIn. Anthropic’s product learning is basically killing the default pattern of “approve every action / tool call.” Because once agents get decent, constant human approvals become the limiting factor. Source: https://www.linkedin.com/posts/nirant_anthropics-product-findings-just-killed-activity-7431222981600792576-8MKm

In plain language: if the model is good enough, we become the latency. Not the network. Not the tools. Us.

The technical core: Anthropic’s “advanced tool use” direction

If you want the most concrete “okay, what actually changed” artifact, it’s this engineering post:

“Introducing advanced tool use on the Claude Developer Platform”
https://www.anthropic.com/engineering/advanced-tool-use

It introduces three ideas that explain most of the “Anthropic killed tool calling” chatter.

1) Tool Search Tool

Instead of jamming every tool schema into the prompt up front, Claude can use a Tool Search Tool to pull in only the tools it needs.

Anthropic puts real numbers on the pain:

  • A 5-server MCP setup can cost ~55K tokens in tool definitions before the conversation even starts.
  • They’ve seen 134K tokens consumed by tool definitions before optimization.
  • With tool search, they report an 85% reduction in token usage, with an example drop from ~77K → ~8.7K total context consumption before the “real work” begins.

If you’ve ever wired up Slack + GitHub + Jira + a couple internal APIs and then wondered why the model started acting… weirdly foggy, this is that problem.

2) Programmatic Tool Calling

Classic tool calling often turns into context pollution. Every intermediate step gets shoved back into the model’s context, and you pay in tokens and attention.

So Anthropic adds Programmatic Tool Calling. The idea is simple: run loops and conditionals in a code execution environment, call tools there, then only return the parts that matter to the model.

Less sludge in the context window. More signal.
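To make that concrete, here’s a rough sketch of the shape of it, written as plain Python rather than the actual Claude API surface. The search_issues callable is a hypothetical stand-in for whatever tool you’d actually wire up; the point is that the pagination and filtering never touch the model’s context.

from typing import Callable

def triage_stale_issues(search_issues: Callable[[int], list[dict]]) -> str:
    """search_issues(page) is a stand-in for a real tool (GitHub, Jira, an MCP server...)."""
    stale: list[dict] = []
    page = 1
    while True:
        batch = search_issues(page)   # tool call happens in code, not in the prompt
        if not batch:
            break
        stale += [i for i in batch if i.get("days_open", 0) > 30]
        page += 1
    oldest = sorted(stale, key=lambda i: -i["days_open"])[:5]
    # The model never sees the raw pages, only this compact summary.
    return f"{len(stale)} stale issues; oldest: " + ", ".join(i["title"] for i in oldest)

The model plans the triage and reads the one-line summary; the loop, the retries, and the raw API responses stay out of its context window entirely.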

3) Tool Use Examples

JSON schema tells an agent what’s valid, not what’s smart.

Tool examples teach conventions and “how people actually call this thing,” which in practice can be the difference between a tool that works 60% of the time and one that works 90% of the time. Anyone who has watched an LLM technically follow a schema while doing something completely unhinged knows what I mean.
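As a sketch of what that can look like (the input_examples field name here is my assumption; check the advanced tool use post for the exact schema), imagine a tool definition that ships a canonical call alongside its schema:

create_pr_tool = {
    "name": "github.createPullRequest",
    "description": "Create a pull request",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "base": {"type": "string"},
            "head": {"type": "string"},
        },
        "required": ["title", "head"],
    },
    # Assumed field name: examples teach conventions the schema can't express,
    # like imperative PR titles and branch naming.
    "input_examples": [
        {"title": "Fix login redirect loop", "base": "main", "head": "fix/login-redirect"},
    ],
}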

“Anthropic killed tool calling” also means spoofing got harder

The Augmented Mind post “The End of the Claude Subscription Hack” argues that Anthropic shut down a gray-area workflow: using Claude Pro/Max consumer subscriptions as a cheap API for external agent swarms by spoofing the Claude Code client. Source: https://augmentedmind.substack.com/p/the-end-of-the-claude-subscription-hack

A sticky detail from the write-up is the reported error:

“This credential is only authorized for use with Claude Code and cannot be used for other API requests”.

According to the post, Anthropic tightened things like:

  • Token scope and client binding, with subscription tokens bound to official clients
  • Telemetry as a gate, where official clients send extra signals
  • Abuse/misuse detection, aimed at high-volume automation patterns

So if someone’s workflow depended on “tool calling” through unofficial harnesses… yeah, it’s going to feel like a rug pull. What really died is subscription arbitrage + client spoofing.

How to adapt your agent architecture

1) Treat tool definitions like a budget, because they are

If you’re using MCP or a big tool suite, plan for tool sprawl. It creeps up on you.

Some messy-but-real rules of thumb:

  • Namespace tools clearly. Think github.createPullRequest vs github.pr.create.
  • Don’t ship 80 tools when 12 will do. You’ll regret it.
  • Return compact results. Tokens are money and also attention.

Anthropic’s tool ergonomics guidance is worth your time:
Anthropic’s “Writing effective tools for agents” https://www.anthropic.com/engineering/writing-tools-for-agents
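Before you start pruning, it helps to know the damage. Here’s a quick-and-dirty audit sketch; the chars-divided-by-four heuristic is a crude stand-in for a real tokenizer, so treat the output as a ballpark, not a bill:

import json

def estimate_tool_budget(tools: list[dict]) -> int:
    """Rough token estimate (~4 chars per token) for the tool definitions you preload."""
    total = 0
    for tool in sorted(tools, key=lambda t: -len(json.dumps(t))):
        tokens = len(json.dumps(tool)) // 4
        total += tokens
        print(f"{tool['name']:<40} ~{tokens} tokens")
    print(f"{'TOTAL before any real work':<40} ~{total} tokens")
    return total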

Example: defer loading tools (conceptually)

{
  "tools": [
    { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" },
    {
      "name": "github.createPullRequest",
      "description": "Create a pull request",
      "input_schema": { "type": "object", "properties": { "title": { "type": "string" } } },
      "defer_loading": true
    }
  ]
}

That defer_loading pattern comes straight from the advanced tool use post.

2) Move orchestration logic into code when you can

If your agent does “search → filter → retry → summarize,” that’s a loop. LLMs can do loops, sure. They’re just not cheap about it.

A pattern I keep coming back to:

  • Use Claude for planning and picking the right tool
  • Use code for iteration, retries, transforms
  • Send Claude the final distilled artifacts, not the whole kitchen sink

Pseudo-structure:

plan = llm("Figure out steps and which tools to use")
results = []
for step in plan:
    result = run_tool(step.tool, step.args)
    results.append(result)
summary = llm("Summarize only the key results", context=distill(results))

3) Separate “interactive” and “automation” credentials

If you’re building internal tooling, don’t build on the assumption that consumer auth tokens will keep behaving like API keys. The Augmented Mind story is basically a reminder that vendors lock this down the second abuse shows up.
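One boring but effective move is to make that split explicit in configuration, so an interactive session and a background agent can never share a credential by accident. The environment variable names below are made up; the separation is the point:

import os

def api_key_for(workload: str) -> str:
    """workload is 'interactive' (humans in the loop) or 'automation' (agents, CI, cron)."""
    env_var = {
        "interactive": "ANTHROPIC_KEY_INTERACTIVE",   # hypothetical names; use your own
        "automation": "ANTHROPIC_KEY_AUTOMATION",
    }[workload]
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No credential configured for {workload!r} workloads")
    return key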

Common mistakes when “Anthropic killed tool calling” becomes your headline

Mistake 1: assuming you can’t build agent workflows anymore
You can. The center of gravity is shifting toward platform-native tool use, not DIY spoofed clients.

Mistake 2: shoving everything into the model context
Anthropic’s own warning label is right there in their numbers: 55K+ tokens in tool defs, and 134K seen in the wild.

Mistake 3: treating schemas as documentation
Agents need examples and conventions. Humans do too, honestly. Schemas are necessary. They’re just not the whole story.

What I’m taking away

“Anthropic killed tool calling” isn’t a funeral for tools. It’s a sign the ecosystem is growing up.

Tool definitions are getting too big to preload. Orchestration is moving into code. Client spoofing and subscription-based automation are getting shut down. And humans approving every single tool call… yeah, that road kind of ends.

Want one concrete next step that won’t waste your afternoon? Audit your tool library and measure how many tokens you burn before the agent even starts doing real work. Then start thinking seriously about tool discovery and code orchestration.

And if you want a related rabbit hole, here’s the next read mentioned in the original draft: ZeroClaw vs OpenClaw
https://www.basantasapkota026.com.np/2026/02/zeroclaw-vs-openclaw-is-zeroclaw.html

If you hit weird breakage that made you mutter “Anthropic killed tool calling,” drop a comment with your setup. I’m genuinely curious what’s actually failing out in the wild.
