The Token Problem

Every time a developer uses an LLM to work with your API, the model needs to “read” your documentation. And every token has a cost.

But it’s not just about direct cost; token count also affects:

  • Response quality (irrelevant tokens dilute the model’s attention)
  • Response speed (fewer tokens = faster)
  • Context limits (models like Claude have 200K-token context windows)

Methodology

We tested five APIs of varying size:

| API | Endpoints | Schemas | OpenAPI Size |
|---|---|---|---|
| Petstore (example) | 15 | 12 | 45 KB |
| Weather API | 42 | 28 | 120 KB |
| E-commerce | 89 | 65 | 280 KB |
| SaaS CRUD | 156 | 110 | 520 KB |
| ERP Enterprise | 340 | 280 | 1.2 MB |

We measured tokens using tiktoken (cl100k_base). Note that cl100k_base is OpenAI’s tokenizer; Claude uses its own proprietary tokenizer, so the absolute counts are an approximation, but the relative savings between formats hold.


Results: Tokens per Format

Petstore (45 KB original)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 4,200 | $0.04 |
| Swagger UI HTML | 12,500 | $0.12 |
| llms.txt | 1,100 | $0.01 |

Savings: 74% vs JSON, 91% vs HTML


Weather API (120 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 28,400 | $0.28 |
| Swagger UI HTML | 65,000 | $0.65 |
| llms.txt | 8,200 | $0.08 |

Savings: 71% vs JSON, 87% vs HTML


E-commerce (280 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 68,000 | $0.68 |
| Swagger UI HTML | 185,000 | $1.85 |
| llms.txt | 22,000 | $0.22 |

Savings: 68% vs JSON, 88% vs HTML


SaaS CRUD (520 KB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 145,000 | $1.45 |
| Swagger UI HTML | 420,000 | $4.20 |
| llms.txt | 48,000 | $0.48 |

Savings: 67% vs JSON, 89% vs HTML


ERP Enterprise (1.2 MB)

| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 380,000 | $3.80 |
| Swagger UI HTML | 1,200,000 | $12.00 |
| llms.txt | 125,000 | $1.25 |

Savings: 67% vs JSON, 90% vs HTML


Savings Summary

| API Size | JSON → llms.txt | HTML → llms.txt |
|---|---|---|
| Small (<50 KB) | 74% | 91% |
| Medium (50–300 KB) | 69% | 88% |
| Large (300 KB–1 MB) | 67% | 89% |
| Enterprise (>1 MB) | 67% | 90% |

Average savings: ~70% vs JSON, ~89% vs HTML
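
The savings figures above follow directly from the token counts. A quick TypeScript helper (the function name is mine, not part of swagent) reproduces them:

```typescript
// Percentage of tokens saved by switching from a verbose format to llms.txt
function savingsPct(verboseTokens: number, llmsTokens: number): number {
  return Math.round((1 - llmsTokens / verboseTokens) * 100)
}

// Petstore: 4,200 JSON tokens vs 1,100 llms.txt tokens
console.log(savingsPct(4200, 1100))   // → 74
console.log(savingsPct(12500, 1100))  // → 91 (vs Swagger UI HTML)
```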


Real-World Cost Impact

Assuming your team makes 100 prompts per day that include your API spec in context:

| Format | Tokens/prompt | Monthly Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 50,000 | $150/month |
| Swagger UI | 120,000 | $360/month |
| llms.txt | 18,000 | $54/month |

Annual savings: $1,152 vs JSON, $3,672 vs HTML

And this is just for a small team. Companies with multiple teams can save thousands of dollars per year.
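
The monthly figures can be reproduced with a small helper. Taken at face value, the table implies a rate of about $1 per million input tokens; treat that rate as an assumption for illustration, not a published price:

```typescript
// Monthly input-token cost: tokens per prompt × prompts per day × 30 days,
// priced in USD per million tokens (pricePerMTok is an assumed rate)
function monthlyCostUSD(
  tokensPerPrompt: number,
  promptsPerDay: number,
  pricePerMTok: number,
): number {
  return (tokensPerPrompt * promptsPerDay * 30 * pricePerMTok) / 1_000_000
}

console.log(monthlyCostUSD(50_000, 100, 1))  // → 150 (OpenAPI JSON)
console.log(monthlyCostUSD(18_000, 100, 1))  // → 54  (llms.txt)
```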


Response Quality

Do fewer tokens mean worse quality? To find out, we asked Claude to:

  1. Find a specific endpoint
  2. Generate code to create a resource
  3. Explain how to authenticate

| Metric | JSON | HTML | llms.txt |
|---|---|---|---|
| Correct endpoint | 95% | 88% | 97% |
| Functional code | 89% | 72% | 94% |
| Correct auth | 91% | 85% | 96% |

llms.txt is not just cheaper; it also produces better responses. The compact format helps the model focus on what matters.


How to Implement llms.txt

Install the generator:

```bash
npm install swagent
```

Generate llms.txt from your OpenAPI:

```bash
swagent generate --input openapi.json --docs ./docs
```
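
For reference, llms.txt (per the llmstxt.org proposal) is plain Markdown: an H1 title, a blockquote summary, then sections of annotated links. A generated file might look roughly like this (an illustrative sketch with made-up endpoints and URLs, not swagent’s exact output):

```markdown
# Petstore API

> REST API for managing pets, orders, and users.
> Base URL: https://petstore.example.com/v2

## Endpoints

- [GET /pets](docs/pets.md): List pets, filterable by status
- [POST /pets](docs/pets.md): Create a pet (requires api_key header)
- [GET /pets/{id}](docs/pets.md): Fetch a single pet by ID
```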

Or integrate directly into your server:

```typescript
import { fastifySwaggerToLlms } from 'swagent/adapters/fastify'

// Serve the generated llms.txt alongside your existing docs
fastify.get('/llms.txt', async (request, reply) => {
  const spec = fastify.swagger({ yaml: true }) // spec from @fastify/swagger
  const llms = await fastifySwaggerToLlms(spec)
  reply.header('Content-Type', 'text/plain; charset=utf-8')
  return llms
})
```

Conclusion

If your API has more than 20 endpoints, you’re already losing money every time someone uses an LLM with your documentation. The llms.txt format:

  • ✅ Reduces token costs by ~70%
  • ✅ Improves LLM response quality
  • ✅ Works with any OpenAPI (2.0 or 3.x)
  • ✅ Ships as plain text: no dependencies, no JavaScript

The only cost is generating it once in your build.

Try SWAGENT →