Allow adding cache control header in Agent Call

It would be incredible if we were able to add the cache_control header to the messages we send to the LLMs using the new AI Workbench. This feature is available at least on Claude, with more info here: https://docs.claude.com/en/docs/build-with-claude/prompt-caching
This approach, and I quote, "significantly reduces processing time and costs for repetitive tasks or prompts with consistent elements." This would enable substantial savings on token usage!


Looking at the documentation, they claim this is especially useful for:

  • Prompts with many examples
  • Large amounts of context or background information
  • Repetitive tasks with consistent instructions
  • Long multi-turn conversations


I would underline repetitive tasks, which are one of the advantages of agentic AI: an agent can call the same tools again and again with the same system prompt.


The example the documentation gives is caching an entire book. In our case, this could be large documents with company details, or a fairly big system prompt.
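To make the idea concrete, here is a minimal sketch of what a request with a cacheable system prompt looks like, following the payload shape from Anthropic's prompt caching documentation. The model id, prompt, and document text are placeholders, not anything from the Workbench:

```python
def build_cached_request(system_prompt: str, document: str, question: str) -> dict:
    """Build a Messages API request body whose large, stable parts are cacheable."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder model id
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": system_prompt},
            {
                "type": "text",
                "text": document,
                # Everything up to and including this block gets cached;
                # only the per-turn user message below is reprocessed.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": question}],
    }

request = build_cached_request(
    "You are an assistant for company X.",  # hypothetical system prompt
    "…large company document…",             # hypothetical cached context
    "What are the payment terms?",
)
print(request["system"][1]["cache_control"])  # {'type': 'ephemeral'}
```

The key point for the feature request is just that one extra field on a content block: everything else stays a normal chat request.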

I think cache_control could be supported as a standard option. Alongside that, the "Extra Body" param approach used on the AI Model invocation could also be applied here; there is a lot of customisability that could be leveraged.
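As a sketch of the extra-body idea: a generic escape hatch could patch provider-specific fields like cache_control onto an otherwise standard request. The function and field names below are assumptions for illustration, not the Workbench's actual API:

```python
def mark_cacheable(request: dict) -> dict:
    """Return a copy of the request with cache_control attached to the
    last system block, as an extra-body-style tweak (hypothetical helper)."""
    # Copy the request and its system blocks so the original is untouched.
    patched = {**request, "system": [dict(block) for block in request.get("system", [])]}
    if patched["system"]:
        # Cache everything up to and including the last system block.
        patched["system"][-1]["cache_control"] = {"type": "ephemeral"}
    return patched

base = {
    "model": "claude-sonnet-4-5",  # placeholder
    "system": [{"type": "text", "text": "…big shared system prompt…"}],
    "messages": [{"role": "user", "content": "hello"}],
}
cached = mark_cacheable(base)
```

The appeal of an escape hatch like this is that the platform does not need to model every provider-specific option; it just forwards extra fields through.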