## The Token Problem
Every time a developer uses an LLM to work with your API, the model needs to “read” your documentation. And every token has a cost.
But it’s not just about direct cost. Token count also affects:
- Response quality (tokens wasted on bloated docs crowd out the context that actually matters)
- Response speed (fewer input tokens = faster responses)
- Context limits (models like Claude top out at 200K tokens)
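Before any precise measurement, a back-of-the-envelope check shows why size alone is a problem. This sketch uses the common ~4-characters-per-token rule of thumb; `estimateTokens` and `fitsInContext` are illustrative names, not part of any library:

```js
// Rough estimate: ~4 characters per token is a common rule of thumb
// for English prose and JSON (an approximation, not a guarantee).
const CHARS_PER_TOKEN = 4

function estimateTokens(sizeInBytes) {
  return Math.ceil(sizeInBytes / CHARS_PER_TOKEN)
}

function fitsInContext(sizeInBytes, contextLimit = 200_000) {
  return estimateTokens(sizeInBytes) <= contextLimit
}

// A 1.2 MB OpenAPI file alone overflows a 200K-token window
console.log(estimateTokens(1_200_000)) // 300000
console.log(fitsInContext(1_200_000))  // false
```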
## Methodology
We tested 5 real APIs of different sizes:
| API | Endpoints | Schemas | OpenAPI Size |
|---|---|---|---|
| Petstore (example) | 15 | 12 | 45 KB |
| Weather API | 42 | 28 | 120 KB |
| E-commerce | 89 | 65 | 280 KB |
| SaaS CRUD | 156 | 110 | 520 KB |
| ERP Enterprise | 340 | 280 | 1.2 MB |
We measured tokens with tiktoken (cl100k_base). Claude uses its own tokenizer, so absolute counts will differ slightly, but the relative savings between formats hold.
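The measurement loop itself is simple. The sketch below is illustrative: `measureFormats` is a made-up name, the tokenizer is injected (in the real runs it would wrap tiktoken's cl100k_base encoder), and the default rate reflects the roughly $10-per-million-token price implied by the cost columns below:

```js
// Count tokens and estimate cost for each documentation format.
// `countTokens` is injected so any tokenizer can be plugged in.
function measureFormats(docs, countTokens, usdPerMillionTokens = 10) {
  return Object.entries(docs).map(([format, text]) => {
    const tokens = countTokens(text)
    return { format, tokens, costUsd: (tokens / 1_000_000) * usdPerMillionTokens }
  })
}

// Example with a toy tokenizer that splits on whitespace
const results = measureFormats(
  { 'llms.txt': 'GET /pets list all pets' },
  (text) => text.split(/\s+/).length
)
console.log(results[0].tokens) // 5
```

With a real tokenizer, `countTokens` would be something like `(text) => encoder.encode(text).length`.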
## Results: Tokens per Format

### Petstore (45 KB original)
| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 4,200 | $0.04 |
| Swagger UI HTML | 12,500 | $0.12 |
| llms.txt | 1,100 | $0.01 |
Savings: 74% vs JSON, 91% vs HTML
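For a sense of where the savings come from, here is an illustrative llms.txt excerpt for a Petstore-style API (invented content, not the actual generated file; the URL and fields are made up):

```text
# Petstore API

> REST API for managing pets, orders, and store inventory.

Base URL: https://api.example.com/v1
Auth: API key in the `X-API-Key` header

## Endpoints

- GET /pets — list pets (query: status, tag, limit)
- POST /pets — create a pet (required: name, status)
- GET /pets/{petId} — fetch a single pet
- DELETE /pets/{petId} — remove a pet
```

One plain-text line per endpoint replaces dozens of lines of nested JSON schema.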
### Weather API (120 KB)
| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 28,400 | $0.28 |
| Swagger UI HTML | 65,000 | $0.65 |
| llms.txt | 8,200 | $0.08 |
Savings: 71% vs JSON, 87% vs HTML
### E-commerce (280 KB)
| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 68,000 | $0.68 |
| Swagger UI HTML | 185,000 | $1.85 |
| llms.txt | 22,000 | $0.22 |
Savings: 68% vs JSON, 88% vs HTML
### SaaS CRUD (520 KB)
| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 145,000 | $1.45 |
| Swagger UI HTML | 420,000 | $4.20 |
| llms.txt | 48,000 | $0.48 |
Savings: 67% vs JSON, 89% vs HTML
### ERP Enterprise (1.2 MB)
| Format | Tokens | Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 380,000 | $3.80 |
| Swagger UI HTML | 1,200,000 | $12.00 |
| llms.txt | 125,000 | $1.25 |
Savings: 67% vs JSON, 90% vs HTML
## Savings Summary
| API Size | JSON → llms.txt | HTML → llms.txt |
|---|---|---|
| Small (<50KB) | 74% | 91% |
| Medium (50-300KB) | 69% | 88% |
| Large (300KB-1MB) | 67% | 89% |
| Enterprise (>1MB) | 67% | 90% |
Average savings: ~70% vs JSON, ~89% vs HTML
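Each percentage above is just 1 − (llms.txt tokens ÷ original tokens), rounded to the nearest percent; a one-liner reproduces the rows:

```js
// Percent saved when replacing one format with another,
// rounded the same way as the tables above.
function savingsPercent(fromTokens, toTokens) {
  return Math.round((1 - toTokens / fromTokens) * 100)
}

console.log(savingsPercent(4_200, 1_100))       // 74  (Petstore, vs JSON)
console.log(savingsPercent(1_200_000, 125_000)) // 90  (ERP, vs HTML)
```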
## Real-World Cost Impact
Assume your team makes 100 prompts a day that include your API docs, at the same ~$10 per million input tokens implied by the tables above:

| Format | Tokens/prompt | Monthly Cost (Claude-4) |
|---|---|---|
| OpenAPI JSON | 50,000 | $1,500/month |
| Swagger UI | 120,000 | $3,600/month |
| llms.txt | 18,000 | $540/month |

Annual savings: $11,520 vs JSON, $36,720 vs HTML
And this is just for a small team. Companies with multiple teams can save thousands of dollars per year.
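The arithmetic generalizes to any team. A minimal sketch, assuming a flat input-token price and 30 billing days per month (`monthlyDocsCost` is an illustrative name; plug in your model's actual rate):

```js
// Monthly cost of shipping your API docs in every prompt.
function monthlyDocsCost(tokensPerPrompt, promptsPerDay, usdPerMillionTokens) {
  const tokensPerMonth = tokensPerPrompt * promptsPerDay * 30
  return (tokensPerMonth / 1_000_000) * usdPerMillionTokens
}

// e.g. 10K tokens/prompt, 50 prompts/day, at $3 per million tokens
console.log(monthlyDocsCost(10_000, 50, 3)) // 45
```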
## Response Quality
Do fewer tokens mean worse quality? We tested this by asking Claude to:
- Find a specific endpoint
- Generate code to create a resource
- Explain how to authenticate
| Metric | JSON | HTML | llms.txt |
|---|---|---|---|
| Correct endpoint | 95% | 88% | 97% |
| Functional code | 89% | 72% | 94% |
| Correct auth | 91% | 85% | 96% |
llms.txt isn’t just cheaper; it also produces better answers. The compact format helps the model focus on what matters.
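The scoring behind the table reduces to a pass rate per metric. A sketch of the grading step (the actual harness isn't shown in this post, so the names and record shape are illustrative):

```js
// Each trial records what the model answered; accuracy is the share
// of trials for which a given check passes, as a percentage.
function accuracy(trials, passed) {
  const hits = trials.filter(passed).length
  return Math.round((hits / trials.length) * 100)
}

const trials = [
  { expected: 'GET /pets', answered: 'GET /pets', codeRan: true },
  { expected: 'POST /orders', answered: 'POST /order', codeRan: true },
]

console.log(accuracy(trials, (t) => t.answered === t.expected)) // 50
console.log(accuracy(trials, (t) => t.codeRan))                 // 100
```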
## How to Implement llms.txt

Install the generator:

```bash
npm install swagent
```

Then generate llms.txt from your OpenAPI spec:

```bash
swagent generate --input openapi.json --docs ./docs
```
Or integrate directly into your server:

```js
import { fastifySwaggerToLlms } from 'swagent/adapters/fastify'

fastify.get('/llms.txt', async (request, reply) => {
  // Reuse the spec that @fastify/swagger already builds
  const spec = fastify.swagger({ yaml: true })
  const llms = await fastifySwaggerToLlms(spec)
  reply.header('Content-Type', 'text/plain')
  return llms
})
```
## Conclusion
If your API has more than 20 endpoints, you’re already losing money every time someone uses an LLM with your documentation. The llms.txt format:
- ✅ Reduces token costs by ~70%
- ✅ Improves LLM response quality
- ✅ Works with any OpenAPI (2.0 or 3.x)
- ✅ Is just a text file: no dependencies, no JavaScript

The only cost is generating it once at build time.