Intuitive.Computer @Intuitive_Comp

Machine that thinks like you, and explains why.
intuitive.computer
Joined February 2026

Tweets: 3 · Followers: 0 · Following: 38 · Likes: 0

Anthropic @AnthropicAI

2 weeks ago

New Science Blog: Why has AI advanced faster in coding than in biology? To agents, bio databases are like cities built before cars—maddening to drive in because they're designed for different traffic. How do we build infrastructure agents can use? anthropic.com/research/agent…

Intuitive.Computer @Intuitive_Comp

5 months ago

arxiv.org/abs/2505.17117

Andrej Karpathy @karpathy

5 months ago

nanochat can now train a GPT-2 grade LLM for <<$100 (~$73, 3 hours on a single 8XH100 node).

GPT-2 is just my favorite LLM because it's the first time the LLM stack comes together in a recognizably modern form. So it has become a bit of a weird & lasting obsession of mine to train a model to GPT-2 capability but for much cheaper, with the benefit of ~7 years of progress. In particular, I suspected it should be possible today to train one for <<$100.

Originally in 2019, GPT-2 was trained by OpenAI on 32 TPU v3 chips for 168 hours (7 days), with $8/hour/TPUv3 back then, for a total cost of approx. $43K. It achieves 0.256525 CORE score, which is an ensemble metric introduced in the DCLM paper over 22 evaluations like ARC/MMLU/etc.

As of the last few improvements merged into nanochat (many of them originating in the modded-nanogpt repo), I can now reach a higher CORE score in 3.04 hours (~$73) on a single 8XH100 node. This is a 600X cost reduction over 7 years, i.e. the cost to train GPT-2 is falling approximately 2.5X every year. I think this is likely an underestimate because I am still finding more improvements relatively regularly and I have a backlog of more ideas to try.

A longer post with a lot of the detail of the optimizations involved and pointers on how to reproduce is here: github.com/karpathy/nanoc…

Inspired by modded-nanogpt, I also created a leaderboard for "time to GPT-2", where this first "Jan29" model is entry #1 at 3.04 hours. It will be fun to iterate on this further and I welcome help! My hope is that nanochat can grow to become a very nice/clean and tuned experimental LLM harness for prototyping ideas, for having fun, and ofc for learning.

The biggest improvements of things that worked out of the box and simply produced gains right away were 1) Flash Attention 3 kernels (faster, and allows window_size kwarg to get alternating attention patterns), 2) Muon optimizer (I tried for ~1 day to delete it and only use AdamW and I couldn't), 3) residual pathways and skip connections gated by learnable scalars, and 4) value embeddings. There were many other smaller things that stack up.

Image: semi-related eye candy of deriving the scaling laws for the current nanochat model miniseries, pretty and satisfying!
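The cost figures quoted in the post can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the post (32 TPU v3 chips, 168 hours, $8 per chip-hour, ~$73 for 3.04 hours). The variable names are illustrative, and the implied hourly node price is derived here as an assumption, not a figure the post states.

```python
# Figures quoted in the post.
tpu_chips = 32
tpu_hours = 168                 # 7 days
usd_per_tpu_hour = 8.0

nanochat_cost_usd = 73.0        # approximate figure from the post
nanochat_hours = 3.04
years = 7

# 2019 GPT-2 training cost: chips * hours * price per chip-hour.
gpt2_cost_usd = tpu_chips * tpu_hours * usd_per_tpu_hour

# Overall cost reduction and the equivalent constant yearly factor.
reduction = gpt2_cost_usd / nanochat_cost_usd
yearly_factor = reduction ** (1 / years)

# Assumption: the implied price of the 8XH100 node is just cost / hours.
implied_node_usd_per_hour = nanochat_cost_usd / nanochat_hours

print(f"GPT-2 (2019) cost: ${gpt2_cost_usd:,.0f}")
print(f"Cost reduction:    {reduction:.0f}X")
print(f"Yearly factor:     {yearly_factor:.2f}X")
print(f"Implied node rate: ${implied_node_usd_per_hour:.2f}/hour")
```

With these inputs the 2019 cost comes to about $43K, the reduction to just under 600X, and the yearly factor to roughly 2.5X, which matches the rounded numbers in the post.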
