未达之风 @MLostWind
Joined September 2014
Tweets: 12
Followers: 16
Following: 435
Likes: 0
Ilya Sutskever just told the AI industry why scaling is finished. One word built it. One word is about to break it. Sutskever: “Scaling is just one word, but it’s such a powerful word because it informs people what to do.” For five years, that single word replaced an entire research culture. Nobody needed breakthroughs. They needed bigger checks. Sutskever: “If you mix some compute with some data into a neural net of a certain size, you will get results, and you will know that it will be better if you just scale the recipe up.” That’s not science. That’s a recipe. Sutskever: “Companies love this because it gives you a very low risk way of investing your resources.” The most transformative technology in human history ran on the same logic used to franchise a restaurant chain. More locations. More ingredients. Same recipe. Predictable returns. You didn’t need researchers who could see around corners. You needed accountants who could approve purchase orders. But recipes expire. Sutskever: “At some point though, pre-training will run out of data. The data is very clearly finite.” Five years of infrastructure. Five years of hiring. Five years of investor decks. All built on top of something temporary. Sutskever: “I don’t think that’s true.” The co-founder of OpenAI. The mind behind the breakthroughs that made this entire era possible. Saying more money won’t solve it. Sutskever: “In some sense we are back to the age of research.” Most of the companies racing to build AGI were never research companies. They were scaling companies. They hired for execution. Not discovery. They optimized for throughput. Not insight. The talent pipelines. The investor pitches. The board decks. All built around one assumption. That the recipe would never expire. It’s expiring. And the companies that spent five years perfecting the art of spending money are about to discover something. The next era demands what capital can’t purchase. An original idea.
NVIDIA vLLM NVL72 ADVANTAGE: GB200 NVL72 delivers up to 3x performance compared to B200 on @Kimi_Moonshot's Kimi K2.5. This is enabled by GB200's scale-up network, which allows for frontier inference optimizations like wide expert parallelism. Great work by @rogerw0108 @NVIDIAAIDev @vllm_project @inferact @simon_mo_ ! 🚀 Not only is SGLang optimized for disagg+wideEP but vLLM is optimized too!
🚀 Congrats @Alibaba_Qwen on releasing Qwen3.6-35B-A3B — day-0 support is now live in SGLang! The first open-weight Qwen3.6: 35B total params (3B active), same hybrid architecture as Qwen3.5, with major upgrades in agentic coding & thinking preservation. 🔧 Agentic Coding: frontend workflows & repo-level reasoning with greater fluency 🧠 Thinking Preservation: retains reasoning context from historical messages ⚡ Gated DeltaNet + Sparse MoE (256 experts, 8+1 active): high throughput, low latency 📏 262K native context, extensible to 1M Cookbook: cookbook.sglang.io/autoregressive… Launch with SGLang:
⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀 A sparse MoE model, 35B total params, 3B active. Apache 2.0 license. 🔥 Agentic coding on par with models 10x its active size 📷 Strong multimodal perception and reasoning ability 🧠 Multimodal thinking + non-thinking modes
Curious what's in the PR of almost 1400 kernels? Here we walk through a simple batched GEMM kernel 🟠 Tile size: M128, N16, K256 🟠 W4A16: matrix A is INT4 with a BF16 scaling factor for every 32 elements, matrix B is BF16 🟠 3 pipeline stages 🟠 1 CTA MMA 🟠 Static scheduler This warp-specialized kernel has the following warp roles: 🟠 Load A 🟠 Load A scaling factor (SF) 🟠 Load B 🟠 Cast A: Dequantize INT4 to BF16. Waits on Load A and Load A SF 🟠 MMA: Performs matmul. Waits on Cast A and Load B 🟠 Epilogue: Performs activation computation. Waits on MMA An interesting thing about this kernel is that its MMA uses TS mode, because dequantizing matrix A requires CUDA cores, which work on registers instead of TMEM. As shown in our microbenchmarking article, TS mode has slightly lower throughput due to an SMEM bandwidth bottleneck. In addition, @cursor_ai also showed that the CUDA core / Tensor Core compute gap creates bottlenecks. To mitigate these issues, the kernel uses pipelining, similar to what Cursor did. Microbenchmarking article: newsletter.semianalysis.com/p/dissecting-n… Cursor blog post: cursor.com/blog/kernels
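The W4A16 "Cast A" step described in the post above (INT4 values of A, one scaling factor per 32 elements, then a matmul against B) can be sketched as a plain-Python reference. This is only an illustration of the arithmetic, not the kernel from the PR: the nibble packing order (low nibble first), the signed two's-complement INT4 range, the use of Python floats in place of BF16, and all function names here are assumptions.

```python
GROUP = 32  # one scaling factor per 32 elements of A, as stated in the post


def unpack_int4(packed):
    """Unpack bytes into signed 4-bit values, low nibble first (assumed layout)."""
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Map 0..15 to -8..7 (two's complement; an assumption).
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


def dequantize_row(packed_row, scales_row):
    """Dequantize one row of A: value i is multiplied by the scale of group i // GROUP."""
    q = unpack_int4(packed_row)
    return [v * scales_row[i // GROUP] for i, v in enumerate(q)]


def w4a16_gemm(packed_a, scales, b):
    """Reference C = dequant(A) @ B, with A as packed INT4 rows and B as K x N floats."""
    n = len(b[0])
    c = []
    for packed_row, scales_row in zip(packed_a, scales):
        a_row = dequantize_row(packed_row, scales_row)
        c.append([sum(a_row[k] * b[k][j] for k in range(len(a_row))) for j in range(n)])
    return c


if __name__ == "__main__":
    # One row of A holding 32 INT4 values equal to 1, with a single scale of 2.0.
    packed_a = [bytes([0x11] * 16)]
    scales = [[2.0]]
    b = [[1.0, 0.5] for _ in range(32)]  # K = 32, N = 2
    print(w4a16_gemm(packed_a, scales, b))
```

In the real kernel the dequantize step and the matmul run in different warps and overlap through the pipeline stages; the sketch runs them back to back purely to show what each stage computes.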
Trtllmgen kernels are now open. Fastest prefill and decode kernels for our target workloads. We wrote these to win InferenceX, MLPerf, other benchmarks. Powering some of today’s top served models. Dive in, learn, use them, or level up your own. Enjoy. github.com/flashinfer-ai/…
Faster AI chips alone don't fix slow inference. The real bottleneck is data movement. In the decode era, how well your architecture moves data determines speed, throughput, and cost. Here's why Dataflow matters more than ever 👇 sambanova.ai/blog/why-dataf…
Banger paper from NVIDIA. Agentic reasoning needs models that are not just capable, but efficient at long-context inference. The agent model layer is moving toward open, long-context, high-throughput architectures. This paper introduces Nemotron 3 Super, an open 120B parameter model with 12B active parameters, built as a hybrid Mamba-Attention Mixture-of-Experts architecture. The headline numbers are strong: up to 1M context length, comparable accuracy on common benchmarks, and up to 2.2x higher throughput than GPT-OSS-120B and 7.5x higher throughput than Qwen3.5-122B. The model combines several efficiency bets, including NVFP4 pretraining, LatentMoE for accuracy per FLOP and per parameter, and MTP layers for native speculative decoding. It is trained on 25 trillion tokens, then post-trained with supervised fine-tuning and RL. Paper: arxiv.org/abs/2604.12374 Learn to build effective AI agents in our academy: academy.dair.ai
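Karpathy drops some code, and programmers across the internet evolve together! The master is at it again: he throws out a minimalist repo/gist, and the community treats it as a foundational skeleton and builds a pile of production-grade tools on top of it. This is not simple forking; it is genuine evolution from the ground up, from educational toy to serious work that can run research automatically, build knowledge bases automatically, and train a ChatGPT in 4 hours. I picked 4 "Karpathy-lineage evolutions" that are trending on X right now. Programmers will fall silent; AI enthusiasts will be thrilled: 1️⃣ autoresearch (github.com/karpathy/autor…) 630 lines of code that let an AI agent modify code, train models, score the results, and keep the best on its own. Humans sleep; it evolves. "Manual hyperparameter tuning that costs you your hair? Hand it to the machine!" (Someone has already remixed it into an ooda version that handles A/B testing and copy optimization alike.) 2️⃣ llmwiki series (github.com/lucasastorian/… and others) Evolved from Karpathy's LLM Wiki gist: the LLM is no longer a search engine but a "programmer" inside Obsidian that automatically summarizes, cross-references, and maintains the knowledge base in snowball fashion. RAG is crying in the bathroom, and Twitter exclaims "the knowledge base grew up by itself." 3️⃣ nanochat (github.com/karpathy/nanoc…) The master's latest "unhinged" work: the full-stack evolution of nanoGPT, which on a single GPU in 4 hours for $100 produces a ChatGPT clone that can chat, write poetry, and solve problems. The stated design goal is "maximally forkable"; expect it to become the next research harness! 4️⃣ micrograd / nanoGPT derivative playgrounds (silicon-more, napagrad, etc.) Evolved from the foundations of the Zero to Hero course, with the computation graph + training loop remixed in every way imaginable, becoming the AI primer and benchmark base for countless people. // Why are these projects so popular? Karpathy never hands you a black-box framework; he gives you a minimal, readable, hackable skeleton. When you fork it, you are not copying homework; you are recursively self-improving alongside a master. This is the highest form of open source: one person's code becomes the whole world's evolutionary tree.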
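Following Karpathy, YC's CEO @garrytan has also shared his own knowledge management method: gist.github.com/garrytan/49c88… Karpathy's: gist.github.com/karpathy/442a6… I visualized the core architecture, logic, and concepts of each approach to make them easier to follow.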
We deliver the lowest token cost through extreme co-design. As NVIDIA software optimizations increase token throughput, the value of your NVIDIA GPUs grows from the moment you invest in them. Learn more ➡️ nvda.ws/4me7HBr
🤖 From this week's issue: A practical guide to LLM inference optimization framed around the "efficient frontier" concept — five techniques that move production systems toward the latency/throughput Pareto boundary without additional hardware spend. cloud.google.com/blog/topics/de…
Camelia 🍒 @NORMAND266650
3 Followers 520 Following early 2000s baby who still feels 15 but somehow has rent due ☁️
victoria 🤍 @vicktoriablushz
14 Followers 725 Following selectively social until you say something real ◡
victoria ♡ @umi_alphaz
3 Followers 156 Following Just a delicate girl who feels golden and dreams blooming dreams 💌
QualityStocks🇺🇸 @Jiexo97637
53 Followers 2K Following 15-30% Monthly | 2 High-Conviction Stocks.Short-Term Gains: 15-20% in Days/Weeks.DM "JOIN" for WhatsApp Alerts. Live Trade Signals • Market Analysis
Beth Kindiig @BethKlndiig
170 Followers 7K Following Investor with higher returns than Wall Street's Old Boys Club. Audited portfolio, automated hedge and free weekly analysis.
Beth Kindigs @Beth_Kindiqs
884 Followers 3K Following Investor with higher returns than Wall Street's Old Boys Club. Audited portfolio, automated hedge and free weekly analysis
elecbee @elecbee
77 Followers 278 Following Elecbee is a fun site that offers high quality connector products that link all the core needs of the enterprise.
Amber Curzon @curzon_amber
247 Followers 1K Following
TAO Culture @taocultureau
1K Followers 2K Following #marketing #consultancy #humanresources #solutions #professionalservices #media #entertainment #sportsindustries
Digital Marketing Br @DMarketingBr
151 Followers 2K Following Digital marketing agency in Rio. Phone: (21) 3594-8773 [email protected]
Md Alamin Rubel @mdalaminrubel94
283 Followers 262 Following
RetailNext @RetailNext
18K Followers 17K Following RetailNext optimizes the shopper experience for retailers, brands, and malls with industry-leading advanced analytics. #retail #inspiringretail #analytics
Subquadratic @subquadratic
20K Followers 1K Following AI lab leading the subquadratic LLM revolution.
Alexander Whedon @alex_whedon
25K Followers 59 Following Building better algorithms. Co-Founder at @subquadratic
Zhihu Frontier @ZhihuFrontier
5K Followers 162 Following 🚀Bringing China's AI & tech trends, voices and perspectives to the global stage. ⚡️Powered by 知乎/https://t.co/OkIemRZdcj, China's leading knowledge community.
YIFENG LIU @YIFENGLIU_AI
430 Followers 85 Following CS Ph.D. student on LLM @ UCLA AGI Lab. Previous works: MARS, TPA, ByteDance Seed models and Kimi-1.5....
Chayenne Zhao @GenAI_is_real
12K Followers 484 Following Work Only With Those Who Beat Agents. Founding Member @radixark | SGLang & Large-scale RL @lmsysorg Prev: Tsinghua, CMU, UCLA, Amazon, ByteDance.
Macro_Lin | 市场... @LinQingV
51K Followers 2K Following Ex-quant & PM|AI chip design|Semis × Capital Markets|Not Financial Advice
Meenakshi Yadav @MeenakshiYACS
3K Followers 208 Following AI & Tech Content Creator | LinkedIn (130K) | YouTuber (64K) | Personal Branding & Influencer Marketing Expert | Founder at ACS | [email protected] ✉️
Daily Dose of Data Sc... @DailyDoseOfDS_
49K Followers 2 Following Delivering daily insights in DS, ML, RAGs, Agents & AI Engineering. Trusted by over 100k+ readers!
The Linux Foundation @linuxfoundation
587K Followers 9K Following A nonprofit organization enabling mass innovation through open source. #linux #kubernetes #riscv #hyperledger #anuket #openssf #openjs #o3de and more!
rohan anil @_arohan_
43K Followers 2K Following member of technical staff & co-founder of @coreautoai - and continuing to aspire to understand deep learning.
Bun @bunjavascript
80K Followers 7 Following Bun is a fast, all-in-one toolkit for installing, bundling, running and testing JavaScript & TypeScript. To install: `npm i -g bun`
davinci @leothecurious
7K Followers 1K Following robotics engineer, student of nature, perpetually curious.
Lu Gan @ganlumomo
719 Followers 315 Following Assistant Professor @GTaerospace; Leading @lunarlab_gatech; Previously @UMRobotics, @Caltech
Alpin @AlpinDale
7K Followers 996 Following Every age, it seems, is tainted by the greed of men. Rubbish to one such as I, devoid of all worldly wants. — I work on HPC and making AI run faster.
Yifan Zhang @yifanzhang_
14K Followers 3K Following PhD at @Princeton University, Princeton AI Lab Fellow. RL & LLM Reasoning, Pretraining & Language Modeling. Prev @ Seed @Tsinghua_Uni
Santiago @svpino
452K Followers 565 Following Computer scientist. I teach hard-core AI/ML Engineering at https://t.co/THCAAZcBMu. YouTube: https://t.co/pROi08OZYJ
Dan Kornas @DanKornas
92K Followers 541 Following AI/ML Engineer Youtube: https://t.co/pjpX8NvUn5 Newsletter: https://t.co/NMMvPSmzua AI Engineering Guild: https://t.co/t6IXkLDX2T
Weiwei Sun @sunweiwei12
911 Followers 260 Following PhD student @LTIatCMU | Interned at Google, ByteDance, Vector, Baidu | Working on LLM agents
Ahmad @TheAhmadOsman
62K Followers 396 Following ai, chips, systems engineering, infra & hardware · on a mission to build a frontier, infra-first AI Lab in the West · i mod GPUs on r/LocalLLaMA
AVB @neural_avb
12K Followers 337 Following Neural Breakdown on YT | Read research with AI: https://t.co/Ef6m4nUpcZ | Latest vid: RLMs, Synthetic data gen | Next: DPO
Birchlabs @Birchlabs
5K Followers 222 Following ML Engineer at Anlatan (@novelaiofficial). co-author of HDiT (Hourglass Diffusion Transformers). works on diffusion models and LLMs. Studying Japanese.
Teortaxes▶️ (Deep... @teortaxesTex
65K Followers 3K Following We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1
Alexandr Wang @alexandr_wang
504K Followers 858 Following chief ai officer @meta, founder @scale_ai. rational in the fullness of time
Rémi Ouazan @remi_or_
941 Followers 145 Following transformers maintainer at Hugging Face 🤗 | ex Apple | Inference, GPUs, kernels
Dirhousssi Amine @DirhousssiAmine
535 Followers 391 Following 🇲🇦 ML engineer - post training team @huggingface 🤗 Rustacean 🦀 ● BJJ competitor ● Longtime Martial Artist
LMSYS Org @lmsysorg
16K Followers 199 Following Large Model Systems Organization: Join our Slack: https://t.co/vzYOTP4w6C. We developed SGLang https://t.co/OjwQadINKU, Chatbot Arena (now @arena), and Vicuna!
SGLang @sgl_project
2K Followers 16 Following Home of the SGLang community. For SGLang deep-dives @lmsysorg 🔗 https://t.co/TKZIbqzeZA
Intel Business @IntelBusiness
170K Followers 2K Following Conversations, solutions, and thought leadership on how our data-centric solutions are the engine of business. That’s the power of #IntelInside.
SemiAnalysis @SemiAnalysis_
110K Followers 27 Following
Google Research @GoogleResearch
100K Followers 17 Following Impossible? Let’s see. From algorithms to neuroscience to AI, Google Research strives to progress science, advance society & improve billions of people’s lives.
MiniMax (official) @MiniMax_AI
101K Followers 851 Following Agent: @MiniMaxAgent Token Plan: https://t.co/BDCycxepZw API: https://t.co/fHRdSV7BwZ Community: https://t.co/uhxxfLgkLU
Yohei @yoheinakajima
125K Followers 12K Following VC by day @untappedvc, builder by night: @babyagi_, @pippinlovesyou @pixelbeastsnft. Build-in-public log: https://t.co/UdHHGbZba5
Peter Steinberger... @steipete
548K Followers 2K Following Polyagentmorous ClawFather. Came back from retirement to mess with AI and help a lobster take over the world. @OpenClaw🦞 + @OpenAI
Shengyuan @ShengyuanS
2K Followers 1K Following Trying to make small magic with AI. Staff of @Kimi_Moonshot. Opinions are my own.
DeepComputing @DeepComputingio
1K Followers 276 Following Official account of DeepComputing. Turning RISC-V into Reality! Our community: https://t.co/lPorkbQNdN
maharshi @maharshii
43K Followers 1K Following learning deeply about life one gradient step at a time - ml perf optimizer @ fal
Ray Dalio @RayDalio
2.2M Followers 93 Following Official account of Ray Dalio, founder of Bridgewater Associates, author of #1 New York Times bestseller 'Principles,' professional mistake maker
Electronics Weekly @ElectronicsNews
33K Followers 751 Following Follow us to get all the latest tech #EWnews for electronic components, products and design. Industry updates for events, awards and engineering #EWjobs
AnySilicon @AnySilicon
2K Followers 501 Following AnySilicon is a marketplace for companies to list, discover, and contact ASIC services companies and IP vendors around the world.
HotHardware @HotHardware
26K Followers 14K Following Cutting-Edge Tech News & Reviews For Over 25 Yrs https://t.co/JrruVWrkYt - https://t.co/oy3zeM0rOF Daily News Blast https://t.co/Iiati9MEKr
CodeWeavers @CodeWeavers
7K Followers 3K Following Developers to the core, who aren't afraid of any technical challenge. Creator of CrossOver™ software, offering PortJump™ and ExecMode™ services.
Tony Mongkolsmai @tonymongkolsmai
2K Followers 720 Following Distinguished Engineer @ NVIDIA. x-Intel. Tech, Sports, Music, Car Lover, Opinions my own, He/Him