The Token Problem

Every time a developer uses an LLM to work with your API, the model needs to “read” your documentation. And every token has a cost.

But it’s not just about direct cost; token count also affects:

  • Response quality (irrelevant tokens dilute the model’s attention)
  • Response speed (fewer tokens = faster)
  • Context limits (models like Claude have 200K-token context windows)

Methodology

We tested five APIs of varying size:

| API | Endpoints | Schemas | OpenAPI Size |
|---|---|---|---|
| Petstore (example) | 15 | 12 | 45 KB |
| Weather API | 42 | 28 | 120 KB |
| E-commerce | 89 | 65 | 280 KB |
| SaaS CRUD | 156 | 110 | 520 KB |
| ERP Enterprise | 340 | 280 | 1.2 MB |

We measured tokens using tiktoken (cl100k_base). Note that cl100k_base is OpenAI’s tokenizer; Claude uses its own proprietary tokenizer, so the absolute counts are an approximation, but the relative savings between formats hold.


Results: Tokens per Format

Petstore (45 KB original)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 4,200 | $0.04 |
| Swagger UI HTML | 12,500 | $0.12 |
| llms.txt | 1,100 | $0.01 |

Savings: 74% vs JSON, 91% vs HTML


Weather API (120 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 28,400 | $0.28 |
| Swagger UI HTML | 65,000 | $0.65 |
| llms.txt | 8,200 | $0.08 |

Savings: 71% vs JSON, 87% vs HTML


E-commerce (280 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 68,000 | $0.68 |
| Swagger UI HTML | 185,000 | $1.85 |
| llms.txt | 22,000 | $0.22 |

Savings: 68% vs JSON, 88% vs HTML


SaaS CRUD (520 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 145,000 | $1.45 |
| Swagger UI HTML | 420,000 | $4.20 |
| llms.txt | 48,000 | $0.48 |

Savings: 67% vs JSON, 89% vs HTML


ERP Enterprise (1.2 MB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 380,000 | $3.80 |
| Swagger UI HTML | 1,200,000 | $12.00 |
| llms.txt | 125,000 | $1.25 |

Savings: 67% vs JSON, 90% vs HTML


Savings Summary

| API Size | JSON → llms.txt | HTML → llms.txt |
|---|---|---|
| Small (<50 KB) | 74% | 91% |
| Medium (50–300 KB) | 69% | 88% |
| Large (300 KB–1 MB) | 67% | 89% |
| Enterprise (>1 MB) | 67% | 90% |

Average savings: ~70% vs JSON, ~89% vs HTML
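
The savings figures above follow directly from the token counts. A quick TypeScript helper (the function name is mine, not part of swagent) reproduces them:

```typescript
// Percentage of tokens saved by switching from a verbose format to llms.txt
function savingsPct(verboseTokens: number, llmsTokens: number): number {
  return Math.round((1 - llmsTokens / verboseTokens) * 100)
}

// Petstore: 4,200 JSON tokens vs 1,100 llms.txt tokens
console.log(savingsPct(4200, 1100))   // → 74
console.log(savingsPct(12500, 1100))  // → 91 (vs Swagger UI HTML)
```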


Real-World Cost Impact

Assuming your team makes 100 prompts per day that include your API spec in context:

| Format | Tokens/prompt | Monthly Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 50,000 | $150/month |
| Swagger UI | 120,000 | $360/month |
| llms.txt | 18,000 | $54/month |

Annual savings: $1,152 vs JSON, $3,672 vs HTML

And this is just for a small team. Companies with multiple teams can save thousands of dollars per year.
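
The monthly figures can be reproduced with a small helper. Taken at face value, the table implies a rate of about $1 per million input tokens; treat that rate as an assumption for illustration, not a published price:

```typescript
// Monthly input-token cost: tokens per prompt × prompts per day × 30 days,
// priced in USD per million tokens (pricePerMTok is an assumed rate)
function monthlyCostUSD(
  tokensPerPrompt: number,
  promptsPerDay: number,
  pricePerMTok: number,
): number {
  return (tokensPerPrompt * promptsPerDay * 30 * pricePerMTok) / 1_000_000
}

console.log(monthlyCostUSD(50_000, 100, 1))  // → 150 (OpenAPI JSON)
console.log(monthlyCostUSD(18_000, 100, 1))  // → 54  (llms.txt)
```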


Response Quality

Do fewer tokens mean worse quality? To find out, we asked Claude to:

  1. Find a specific endpoint
  2. Generate code to create a resource
  3. Explain how to authenticate

| Metric | JSON | HTML | llms.txt |
|---|---|---|---|
| Correct endpoint | 95% | 88% | 97% |
| Functional code | 89% | 72% | 94% |
| Correct auth | 91% | 85% | 96% |

llms.txt is not just cheaper; it also produces better responses. The compact format helps the model focus on what matters.


How to Implement llms.txt

Install the generator:

```bash
npm install swagent
```

Generate llms.txt from your OpenAPI:

```bash
swagent generate --input openapi.json --docs ./docs
```
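
For reference, llms.txt (per the llmstxt.org proposal) is plain Markdown: an H1 title, a blockquote summary, then sections of annotated links. A generated file might look roughly like this (an illustrative sketch with made-up endpoints and URLs, not swagent’s exact output):

```markdown
# Petstore API

> REST API for managing pets, orders, and users.
> Base URL: https://petstore.example.com/v2

## Endpoints

- [GET /pets](docs/pets.md): List pets, filterable by status
- [POST /pets](docs/pets.md): Create a pet (requires api_key header)
- [GET /pets/{id}](docs/pets.md): Fetch a single pet by ID
```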

Or integrate directly into your server:

```typescript
import { fastifySwaggerToLlms } from 'swagent/adapters/fastify'

// Serve the generated llms.txt alongside your existing docs
fastify.get('/llms.txt', async (request, reply) => {
  const spec = fastify.swagger({ yaml: true }) // spec from @fastify/swagger
  const llms = await fastifySwaggerToLlms(spec)
  reply.header('Content-Type', 'text/plain; charset=utf-8')
  return llms
})
```

Conclusion

If your API has more than 20 endpoints, you’re already losing money every time someone uses an LLM with your documentation. The llms.txt format:

  • ✅ Reduces token costs by ~70%
  • ✅ Improves LLM response quality
  • ✅ Works with any OpenAPI (2.0 or 3.x)
  • ✅ Ships as plain text: no dependencies, no JavaScript

The only cost is generating it once in your build.

Try SWAGENT →