You can now integrate Hermes with The Grid.
Suppliers compete for every request Hermes sends. That means every skill invocation, every memory call, every agent loop, routed to the cheapest qualifying supplier automatically.
Here's how 👇
@ForwardFuture@tomas_hk Agree partially. Per-task routing is necessary, but it still leaves every buyer running the same eval over and over just to learn which model fits which job. Intelligence tiers, tied to the task at hand, fix that. Routing treats the symptom. Tiers are the cure.
@mintlify Did better docs cut token spend per task, or mostly raise success rate? Curious whether the win was fewer retries (reducing the token spent queitly) or just cleaner first passes.
@smratitiwa86867 Token discipline matters, but the bigger leak is tier mismatch: paying frontier prices for tasks a mid model handles fine. Trimming prompts saves pennies. Matching the workload to the right tier saves the bill.
AI costs are one of the fastest-growing budgets for many startups. We built The Grid to fix the underlying problem, with a market that offers quality tiers, competing suppliers, and guaranteed benchmarks.
If you are burning real dollars on inference, let’s talk (link below)!
Dara (CEO of Uber) on their AI spend:
"We blew through our AI budget in a quarter, for the whole year. It is forcing us to adjust.
We are going to meter headcount increases because to the extent that my engineers are getting much more efficient, their throughput is
@Altimor congrats on the switch, but have you thought about what happens next quarter when something cheaper drops?
The eval + migration tax keeps coming. Every few months, back in the codebase.
What if you never had to do this again? Pick a quality tier, get market-priced inference that's always routed to the best qualifying supplier automatically.
That's what we're building at The Grid.
Pulled the trigger today and switched 100% of Lindy traffic to DeepSeek v4, churning from Anthropic models.
Saves us millions of $ and we're actually seeing an *increase* in performance on many core use cases. Transformative for the business.
3/ The Grid standardizes inference into graded tiers with guaranteed spec. You pick the tier your workload needs. Suppliers compete to fill it.
You get the output at the best price, instead of brand names and expensive subscriptions.
Uber reportedly torched $3.4B of its AI budget in just four months.
The root cause: an absolute lack of a routing layer between the request and the model.
We are blindly throwing expensive frontier models at EVERY SINGLE TASK.
This is why I am excited about @The_GridAI.
It completely flips the script by turning AI inference into a live spot market.
Instead of hardcoding a single, expensive provider, you select a quality tier eg:
- text-standard
- text-prime
- text-max
The Grid then dynamically routes your request to the cheapest qualifying model 🔥
→ Drop-in migration via an OpenAI-compatible endpoint
→ Zero vendor lock-in for your applications
→ Automated, cost-saving routing based on task difficulty
→ Complete visibility into every token spent
Just for fun, I built a small @Streamlit demo to showcase its capabilities.
The app lays out the entire process:
✦ Header & Request:
↳ a clean UI to plug in an API key and choose your instrument (text-standard, text-prime, or text-max).
✦ Routing & Metadata:
↳ run a prompt and instantly inspect the backend
You can see the latency, prompt/completion tokens, and the exact model returned.
✦ Migration & Traceability:
↳ a live look at the code swap and how the market actually routed the request.
On the top left, you can enter your API key and try it out for yourself!
@The_GridAI just went live, and they are giving new accounts an incredible 200 million free tokens.
Demo app + open-source repo in the 🧵↓
For the devs asking how hard it is to switch, it's not.
The whole migration is 3 lines.
Suppliers compete for every request you send. Save up to 80% vs list price.
Here's something most inference buyers don't have access to: a limit order.
'Fill my Text Prime order at $0.50 or less.'
If the market clears there, your job runs. If not, it waits.
Let the price come to you instead of the other way around.
Genius idea for AI inference!
A marketplace that routes requests to the cheapest qualifying model at any given point.
This can get you up to 87% cheaper inference!
Today, if you need a model, you pay the vendor's fixed rate card, but that's about to change with this:
For the devs asking how hard it is to switch, it's not.
The whole migration is 3 lines.
Suppliers compete for every request you send. Save up to 80% vs list price.
The migration is almost insulting.
Before:
model: "gpt-4o"
base_url: "api.openai.com/v1"
After:
model: "text-prime"
base_url: "api.thegrid.ai/v1"
Two lines changed.
Your client keeps running like nothing happened, except now your costs track the actual market
🧵 3. The 3 text tiers and the 3 pricing
- Standard is for high-volume work where cost matters most, like classification, batch summarization, tagging, and simple extraction.
- Prime is the everyday production tier. This is where I’d put agents, RAG, drafting, support workflows, and quality-sensitive pipelines.
- Max is for the harder stuff, like long-context work, high-stakes reasoning, and tasks where a wrong answer can create real downstream cost.
The important part is that “cheapest” does not mean “random cheap model.”
Each tier has a quality threshold. The Grid checks models against benchmark floors anchored to Artificial Analysis. If a supplier falls below the required quality level for a tier, it gets removed from the eligible set. So the market competes on price, but only inside the quality bar you picked.
We get asked a lot: how much can I actually save?
Every instrument shows you the savings vs list price, in real time:
Text Standard: save up to 87%.
Text Prime: save up to 79%.
Text Max: save up to 18%.
Every time you call the API, suppliers compete to deliver your request.
@__sishir The way we think about it: set a floor on benchmark, latency, uptime, error rate, and once suppliers clear that, let them compete on price.
We're in Beta! First 200M tokens free for anyone who wants to try→ app.thegrid.ai/sign-up
2K Followers 256 FollowingThe open market for AI inference. No gatekeepers.
Peer-to-peer. Onchain payments. Verifiable reputation.
🐜🌱 https://t.co/q0t7Vh2zb4
9K Followers 2K FollowingInvesting in startups with @natfriedman + @danielgross; run @aigrant and the Andromeda Cluster; stan dogs and warm carbs & cheese; tech/AI for human flourishing
125K Followers 345 FollowingVenture capitalist at @theoryvc
Student of Startups
Backer of 9 unicorns
Author of https://t.co/IWw3R3RVLm
Subscribe https://t.co/iDgoLXaF98
3K Followers 13 FollowingUnderstanding Intelligence.
Measurement. Explanation. Application. That's how we're tackling AI interpretability: the greatest scientific problem of our age.
470K Followers 2K FollowingFounder/CEO https://t.co/m6TigM5azr: Free AI training for the smartest engineers. Will tweet as I wish and suffer the consequences. Accelerando: @kellyclaudeai
880K Followers 6K FollowingPresident & CEO @ycombinator —Founder @garryslist—Creator of GStack & GBrain—designer/engineer who helps founders—SF Dem accelerating the boom loop
371K Followers 1K FollowingCo-founder of stealth startup. Inventor of GANs. Lead author of https://t.co/M6vl8pEQ4I Founding chairman of @pubhealthaction
257K Followers 178 FollowingCo-founder of Thinking Machines Lab @thinkymachines; Ex-VP, AI Safety & robotics, applied research @OpenAI; Author of Lil'Log
137K Followers 1K FollowingSemiAnalysis
Boutique AI Infrastructure Research and Consulting
DMs are open for consulting, quotes, or to talk shop,
Opinions my own