The Token Company — LLM Gateways Platform

Founded 2025 | San Francisco, CA | 1-10 people | Unknown

What is The Token Company?

The Token Company builds a drop-in compression API that preprocesses LLM inputs with fast ML models, removing redundant tokens from prompts, chat histories, and RAG documents. Part of the YC W2026 batch, it was founded by Otso Veistera, reportedly the youngest solo founder in YC history at 18 years old. YC partners reached out to him directly rather than through the standard application process.

The API compresses 100K tokens in under 100ms using purpose-built classification models (the bear-1 series) that identify and strip low-value tokens. These are not generative LLMs but fast, deterministic models. Counterintuitively, the company reports that compression can improve accuracy, which it attributes to downstream models focusing on higher-signal content. Published benchmarks show +2.7pp on financial QA with 20% fewer tokens and +4.0pp on reading comprehension with 17% fewer tokens.
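To make the idea of classification-based compression concrete, here is a toy sketch in Python. It is not the bear-1 model or the company's method: the filler-word list and the `toy_keep_score` function are hypothetical stand-ins for a trained classifier, and the 0.5 threshold is an arbitrary assumption. It only shows the general shape: score each token, drop the low scorers, keep the order, and return the same output for the same input.

```python
from typing import Callable, List

# Hypothetical stand-in for a learned token classifier. A real system would
# use a trained model; this word list exists only for illustration.
FILLER = {"basically", "actually", "really", "very", "just", "please", "kindly"}


def toy_keep_score(token: str) -> float:
    """Return a 'keep' score in [0, 1]: low for filler words, high otherwise."""
    return 0.1 if token.lower().strip(".,!?") in FILLER else 0.9


def prune_tokens(
    tokens: List[str],
    threshold: float = 0.5,
    score: Callable[[str], float] = toy_keep_score,
) -> List[str]:
    """Keep tokens whose score meets the threshold, preserving their order."""
    return [t for t in tokens if score(t) >= threshold]


prompt = "Please just summarize the really long quarterly report"
kept = prune_tokens(prompt.split())
print(" ".join(kept))  # summarize the long quarterly report
```

Because the scorer is a pure function of the token, the pruning is deterministic, which is the property the listing contrasts with generative rewriting.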

A named customer, Pax Historia (processing 193B tokens/month), ran a 268K-vote blind arena study showing compressed prompts outperformed uncompressed with a +5% purchase lift. The pricing is straightforward at $0.05 per 1M compressed tokens, and you only pay for tokens actually removed.
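A short worked example of the pricing arithmetic, as a sketch under stated assumptions: the fee is taken to be USD 0.05 per 1M tokens removed, the 20% removal rate is borrowed from the financial QA benchmark figure, and the LLM input price of USD 3.00 per 1M tokens is a made-up illustrative number, not a quote from any provider.

```python
# Assumption: the fee is charged per 1M tokens removed, at the listed USD 0.05.
PRICE_PER_MILLION_REMOVED = 0.05


def compression_fee(tokens_removed: int) -> float:
    """Fee in USD for the given number of removed tokens."""
    return tokens_removed / 1_000_000 * PRICE_PER_MILLION_REMOVED


def net_saving(input_tokens: int, fraction_removed: float,
               llm_price_per_million: float) -> float:
    """LLM input spend avoided minus the compression fee, in USD."""
    removed = round(input_tokens * fraction_removed)
    gross = removed / 1_000_000 * llm_price_per_million
    return gross - compression_fee(removed)


# Illustrative inputs only: 1M input tokens, 20% removed,
# hypothetical LLM input price of USD 3.00 per 1M tokens.
fee = compression_fee(200_000)
saving = net_saving(1_000_000, 0.20, 3.00)
print(f"fee: ${fee:.2f}, net saving: ${saving:.2f}")  # fee: $0.01, net saving: $0.59
```

Under these assumptions the fee is small relative to the avoided spend; the ratio narrows as the downstream model's input price falls.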

Key features

Core capabilities this platform advertises.

Token compression
Output quality improvement
Cost reduction middleware
LLM-agnostic

Strengths and tradeoffs

What this tool does well, and the limitations to keep in mind.

Pros

Extremely clear pricing at $0.05/1M tokens with a simple pay-for-what-you-remove model
Real customer validation with published blind arena results showing a +5% purchase lift
Counterintuitive but backed by published benchmarks: compression improves accuracy, not just cuts cost
Fast and deterministic: a non-generative ML model processes 100K tokens in under 100ms
YC partners sought out the founder, a strong validation signal

Cons

Solo 18-year-old founder creates execution risk for enterprise sales cycles
LLM providers may build native compression into their APIs
Competes with Compresr (also YC W2026) on a similar value proposition
Cost benefit of compression may shrink as long-context tokens get cheaper

Plans & pricing

What's included in each plan, and how the tiers compare.

Pay-per-use

$0.05/1M compressed tokens

Usage-based

Only pay for tokens removed
Sub-100ms processing
Tunable compression 0.0-1.0
REST API
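As a rough sketch of what calling a REST compression endpoint with a tunable 0.0-1.0 setting might look like, the Python snippet below builds (but does not send) a request. Everything specific here is an assumption for illustration: the URL is a placeholder, the `input` and `aggressiveness` field names are hypothetical, and the bearer-token header is a guess at the auth scheme. The official API documentation is the source of truth.

```python
import json
import urllib.request

# Hypothetical placeholder: the real endpoint is not given in this listing.
API_URL = "https://api.example.com/v1/compress"


def build_compress_request(text: str, aggressiveness: float,
                           api_key: str) -> urllib.request.Request:
    """Build a POST request; field names and auth header are assumptions."""
    if not 0.0 <= aggressiveness <= 1.0:
        raise ValueError("aggressiveness must be between 0.0 and 1.0")
    body = json.dumps({"input": text, "aggressiveness": aggressiveness})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_compress_request("long prompt text...", 0.5, api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
# Sending is left out on purpose; urllib.request.urlopen(req) would make the call.
```

Validating the range on the client side keeps an out-of-range setting from reaching the service at all.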

Common use cases

Teams looking to reduce LLM costs while improving quality

Token cost optimization
LLM output improvement
API cost reduction

Using The Token Company with Respan

The Token Company compresses LLM inputs while Respan monitors LLM outputs and performance. Together they optimize both the input side (cost reduction) and output side (quality monitoring) of LLM workflows.

Reduce LLM costs with Token Company compression while monitoring quality via Respan
Verify that compression maintains output quality using Respan evaluations
Track cost savings from compression alongside total LLM spend in Respan
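One way to picture the verification step is a side-by-side comparison of evaluation scores and input token counts for uncompressed and compressed runs. The Python sketch below is a minimal illustration only: the `EvalRun` record, the `summarize` helper, the 0.01 tolerance, and the sample numbers are all hypothetical, and it does not use or represent Respan's actual API or data model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class EvalRun:
    """One evaluated LLM call. Illustrative fields, not a Respan schema."""
    input_tokens: int
    score: float  # evaluation score in [0, 1], higher is better


def summarize(baseline: List[EvalRun], compressed: List[EvalRun],
              max_drop: float = 0.01) -> dict:
    """Compare mean score and total input tokens across the two groups."""
    base_score = mean(r.score for r in baseline)
    comp_score = mean(r.score for r in compressed)
    base_tokens = sum(r.input_tokens for r in baseline)
    comp_tokens = sum(r.input_tokens for r in compressed)
    return {
        "token_reduction": 1 - comp_tokens / base_tokens,
        "score_delta": comp_score - base_score,
        # Quality passes if the mean score fell by no more than max_drop.
        "quality_ok": comp_score >= base_score - max_drop,
    }


# Made-up sample data for illustration.
baseline = [EvalRun(1000, 0.80), EvalRun(2000, 0.70)]
compressed = [EvalRun(800, 0.82), EvalRun(1600, 0.70)]
print(summarize(baseline, compressed))
```

Reporting the token reduction and the score delta together keeps the cost saving from being read in isolation from quality.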
