Working on new methods for understanding machine learning systems and entangled quantum systems.
sites.google.com/view/jordanten…
Brisbane
Joined December 2009
New paper!
Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models
@METR_Evals showed that models' time horizons have doubled every few months. We ask: what length of tasks can models complete without any CoT?
Glad to have contributed to this new paper!
We measured the length of tasks LLMs can complete without CoT, which is a key proxy for the extent to which we can trust CoT monitors.
Result: the 50% no-CoT time horizons of frontier LLMs are ~3 minutes and double every 373 days.
We are starting a new, nonprofit alignment organization, ⊢ Sequent Research, bringing together researchers previously on UK AISI’s Alignment Team, Timaeus, and elsewhere to research how to align superintelligence. We are hiring! 🧵
Model Transparency at the @AISecurityInst evaluated Claude Mythos 5 for capabilities and behaviours relevant to monitorability, our first time doing this in pre-deployment testing! Details in thread 🧵
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. anthropic.com/institute/recu…
An obvious way to study whether a training technique removes misalignment is to run that technique on a model organism (MO).
But we've found that MOs are often weirdly fragile. E.g. training them to talk like a pirate often removes their bad behavior. 1/2
This report is an incredibly detailed and broad look into how it might become harder to monitor, audit or generally make confident claims about frontier AI systems. We interviewed an exceptional array of experts from multiple frontier labs, academia and industry. Worth a read!
The safety of advanced AI systems increasingly depends on the ability to oversee them. Our new report examines today’s AI oversight landscape, finding many pathways likely to lead to its degradation.🧵
I helped write this report on oversight of AI systems and how it could degrade - it's a great overview, and a good guide to what research directions might help us maintain the level of oversight we enjoy today
@1a3orn @jiaxinwen22 Though notably there's nothing requiring the discrete tokens to be legible English.
"Thinking Without Words: Efficient Latent Reasoning with Abstract Chain-of-Thought" is an example of learning to reason in an abstract discrete token vocabulary.
arxiv.org/abs/2604.22709
There are a lot of pathways via which AI oversight is likely to degrade! Latent reasoning architectures, situational awareness, representational drift... We wrote a report ranking them.
Here I'll go into some which worry me most 🧵
225 Followers 363 Following
No. 10 Downing Street Innovation Fellow | Research Scientist at AISI | Visiting Lecturer at Imperial College London
Working on AI Evaluation and AI for Medicine
2K Followers 2K Following
Open-source interpretability to seize the means of prediction. Postdoc w/ @davidbau @ndif_team @Northeastern. Prev: @GroNLP, @amazonscience
9K Followers 8K Following
Doing interesting research with TRMs. Maybe solved Bitcoin scaling.
Fd. @tradelayer @moralitylab
Build treasures in the heavens
Feed the hungry in war zones
645 Followers 499 Following
Deputy Director, Research Unit, UK AISI. Generalist who enjoys getting difficult things done and trying to make the world less bad. Views mine, rt!=endorse
2K Followers 5K Following
Building talent & community in AI safety. Currently @AISecurityInst, prev. @AnthropicAI. Philosophy, Politics, and Economics alumna @UniofOxford.
770 Followers 951 Following
Trying to figure out how AI works 🔍🧠
Currently at @ETH Zurich, previously @EPFL 🇨🇭
LLMs, interpretability, emergence, grokking 🤖
196 Followers 150 Following
AI Whisperer & Prompt Engineer. Articles on Artificial Intelligence & how AI text & images can be used in copywriting, marketing, art, design, & storytelling!
10K Followers 1K Following
Co-founder of Guidelight AI Standards (https://t.co/tNBPmVsPqo), ex-OpenAI safety researcher, writing at https://t.co/R5KV9j3lsG
14K Followers 35 Following
High-volume account of @ESYudkowsky, the original AI alignment guy. If it's missing punctuation, it's humor. If you can't tell, it's probably also humor.
2K Followers 1K Following
AI safety, Econ, new liberalism, math, and a bit of art history (as a treat)
Behavioral evaluations @TransluceAI. Prev Astra, MATS & Walmart's Econ Team
9K Followers 2K Following
AI forecasting and governance @AI_Futures_. Co-author of AI 2027 and the AI Futures Model. Also @aidigest_, @SamotsvetyF. Prev @oughtinc
1K Followers 15 Following
The official twitter of the Former Lighthaven PR Department a company for PR production for Lighthaven (not a sanctioned Lighthaven product).
34K Followers 5K Following
Econ student, liberal, aspie, bi. Michael Kremer stan. I ❤️ optimal auction design. Spend more on drugs. Open borders now! alt @NicholasD91704
3K Followers 340 Following
Researcher at AI Futures Project. AGI is going to be a really big deal, we don't know when it's going to happen, and we're not ready for it.