Stability of the user experience is a strong reason to run open source models on your own infra, and it doesn't come up as often as cost control or data privacy.
More in the video.
Really insightful talk from Mateusz Charytoniuk on building a self-hosted LLM stack at @rustikonconf today. He broke down where open source LLMs are headed, how to self-host them at scale with Paddler, and how Rust helps to make this ecosystem more secure and maintainable.
MCP servers aren't just technical projects. They can add real value by making your product accessible through conversational AI. Here's a short video essay on how to frame them as a business opportunity: youtube.com/watch?v=R78mBg…
We're organizing Post Software, an event where AI builders, designers, and professionals from other industries meet to explore how AI can create entirely new solutions, not just optimize what already exists.
Want to speak or get involved? Visit post-software.intentee.com
Paddler is our open-source platform for self-hosting LLMs at scale, and it comes with a web UI you can use to understand your cluster at a glance.
It gives you real-time insights into how your capacity is used, how many requests are being buffered, and any issues that might have come up.
It also provides a convenient way to manage your models (swap them dynamically, use custom chat templates, adjust inference parameters).
paddler.intentee.com/docs/starting-…
Per-token API pricing for LLM usage is convenient, but it makes costs highly unpredictable. Traffic spikes are one thing, but the way your users use your LLM-based features can also differ widely.
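To make the point concrete, here is a rough back-of-the-envelope sketch in Python. The price and the usage figures are made-up assumptions for illustration, not any provider's real rates. It shows how the same number of requests can produce very different bills depending on how many tokens each request consumes.

```python
# Hypothetical price for illustration only; not a real provider's rate.
PRICE_PER_MILLION_TOKENS_USD = 2


def monthly_cost(requests_per_day, tokens_per_request, days=30):
    """Estimate a monthly bill under simple per-token pricing."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


# Same traffic, different usage patterns (both figures are assumptions).
short_prompts = monthly_cost(requests_per_day=1000, tokens_per_request=500)
long_contexts = monthly_cost(requests_per_day=1000, tokens_per_request=8000)

print(f"short prompts: ${short_prompts:.2f}/month")
print(f"long contexts: ${long_contexts:.2f}/month")
```

With identical request counts, the bill scales linearly with tokens per request, so a shift in how users interact with a feature moves the cost even when traffic stays flat.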
The pre-alpha version of Poet, an open-source static site generator (github.com/intentee/poet), is out. Its custom syntax gives full control over how content is understood structurally and will allow us to add AI-based content analysis features. More to come :)
Technical products and developer tools need to be discoverable in AI platforms, let users talk to their docs, and ensure coding copilots can offer quality code suggestions for their technologies.
This is what we’re aiming for with Poet: poet.intentee.com
According to the latest Stack Overflow Developer Survey, most developers prefer both interactive formats and long-form articles when learning new technologies.
This means static site documentation should no longer be static.
Paddler lets you download models directly from Hugging Face (or from a local file path), both via API and our web admin panel. You can also swap them dynamically without needing to restart the entire setup.
More: paddler.intentee.com/docs/starting-…
Not every LLM task requires a massive number of parameters, and it’s not only a matter of cost savings. Smaller, specialized models can offer a better experience to the end user (in both response quality and performance).
If there's no available capacity, Paddler will buffer the incoming requests. Combined with autoscaling, it lets you handle traffic spikes without dropping requests. You can also scale from zero hosts to avoid overpaying for idle GPU capacity. paddler.intentee.com/docs/internals…
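The buffering idea can be illustrated with a small, generic Python sketch. This is not Paddler's implementation or API. The `BufferedBalancer` class, its `slots` parameter, and the `run_spike` helper are hypothetical names invented for this example, which only shows the general technique of holding requests until capacity frees up instead of rejecting them.

```python
import asyncio


class BufferedBalancer:
    """Toy model: a fixed number of slots; extra requests wait instead of failing."""

    def __init__(self, slots):
        # Each slot stands in for one unit of inference capacity (an assumption).
        self._slots = asyncio.Semaphore(slots)
        self.buffered = 0
        self.max_buffered = 0

    async def handle(self, request_id, work_seconds=0.01):
        # The request counts as buffered until it is given a slot.
        self.buffered += 1
        self.max_buffered = max(self.max_buffered, self.buffered)
        async with self._slots:
            self.buffered -= 1
            await asyncio.sleep(work_seconds)  # stand-in for real inference work
            return f"done:{request_id}"


async def run_spike(n_requests, slots):
    """Send n_requests at once and report the results and the peak buffer size."""
    balancer = BufferedBalancer(slots)
    results = await asyncio.gather(
        *(balancer.handle(i) for i in range(n_requests))
    )
    return results, balancer.max_buffered


if __name__ == "__main__":
    results, peak = asyncio.run(run_spike(n_requests=10, slots=2))
    print(f"completed {len(results)} requests, peak buffered: {peak}")
```

In this toy model every request in the spike eventually completes. A real deployment would also need buffer size limits and timeouts, which are left out here.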
Feature highlight: model swapping.
Paddler lets you swap models dynamically, without the need to restart the balancer or the agents. Available in the web admin panel or the API.
Learn more: paddler.intentee.com/docs/starting-…
Highlighting project Paddler
Build and scale your own LLM infrastructure, powered by llama.cpp
Mateusz and team worked hard over the last year and have significantly improved the project. Check it out and let them know about your experience
github.com/intentee/paddl…
Paddler 2.0 is out, and it’s now a complete open-source platform to run and scale open-source LLMs in your own infrastructure.
Docs: paddler.intentee.com
GH: github.com/intentee/paddl…