Charles Foster @CFGeek

Excels at reasoning & tool use🪄 Tensor-enjoyer 🧪 @METR_Evals. My COI policy is available under “Disclosures” at https://t.co/bihrMIUKJq contextwindows.substack.com Oakland, CA Joined June 2020

Tweets

6K
Followers

3K
Following

568
Likes

22K

Charles Foster @CFGeek

9 hours ago

@scaling01 What did Anthropic do that caused this? Is this about the Fable thing or something else?

0 0 3 388 0

New paper! Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models @METR_Evals showed that models' time horizons have doubled every few months. We ask: what length of tasks can models complete without any CoT?

2 17 81 20K 28

Charles Foster @CFGeek

19 hours ago

@jeremyphoward If enforced, this would slow down the rate of AI progress somewhat but wouldn’t mean “the frontier doesn’t advance”. Because 2nd+ companies can use AI to leapfrog into the top ranking on their next release, and all companies can advance the frontier the good-old-fashioned way.

1 0 3 497 0

Charles Foster @CFGeek

20 hours ago

@sriramk I think they could both be true! x.com/cfgeek/status/…

Charles Foster @CFGeek

20 hours ago

I don’t think that these are necessarily on a collision course. Here is my synthesis: An intelligence recursion is far too powerful and risky to happen behind closed doors. If done at all it should be done out in the open, accountable to outside scientists & the public at large.

0 0 21 1K 1

1 0 5 383 1

Charles Foster @CFGeek

20 hours ago

Sriram Krishnan @sriramk

a day ago

just to state the obvious: think there's a collision course between those who believe research and science should be open and those who believe we are in an accelerating singularity curve. I have many smart friends who have believed both for a while but seeing more and more their

97 127 2K 307K 193

0 0 21 1K 1

Charles Foster @CFGeek

a day ago

@willccbb @yong_zhengxin > fortunately, alignment is a precondition for RSI Are you saying that loss of control via RSI is not a possibility? Since you can’t do RSI in the first place without alignment?

0 0 3 92 0

Charles Foster @CFGeek

2 days ago

@banburismus_ he was too stunned to speak

0 0 6 3K 0

Charles Foster @CFGeek

2 days ago

@willdepue @lu_sichu *batch/sequence dimension voice*

1 0 16 446 1

Charles Foster @CFGeek

3 days ago

Let’s invest in methods to monitor AI R&D! These methods seem likely to be useful for many different goals: anticipating how AI capabilities might change, keeping track of competition (whether in the US or in China), verifying any potential agreements around RSI…

roon @tszzl

3 days ago

now on the eve of RSI it seems everyone is more mutual conditional pause agreement pilled than they used to be and that seems like a good development

158 85 2K 271K 208

0 2 21 1K 3

Charles Foster @CFGeek

3 days ago

@Sauers_ What’s the mean value you get if you project random vectors of the same shape & norm into the same readout axis? I’m wondering where the “zero point” of this graph is.

0 0 4 289 2
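
The baseline asked about in the reply above can be sketched roughly as follows. This is only an illustration under stated assumptions: the readout axis, dimension, and norm are hypothetical placeholders (the original graph's values are not given here), and isotropic Gaussian directions are assumed as the notion of "random vectors". It draws random vectors rescaled to a fixed norm, projects each onto the readout axis, and reports the mean signed projection and the mean absolute projection.

```python
import math
import random


def random_vector_with_norm(dim, norm, rng):
    """Draw an isotropic Gaussian vector and rescale it to the requested norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    scale = norm / math.sqrt(sum(x * x for x in v))
    return [x * scale for x in v]


def projection(vec, axis):
    """Scalar projection of vec onto the (unit-normalised) readout axis."""
    axis_norm = math.sqrt(sum(a * a for a in axis))
    return sum(x * a for x, a in zip(vec, axis)) / axis_norm


def random_baseline(axis, norm, n_samples=2000, seed=0):
    """Mean signed and mean absolute projection of random same-norm vectors."""
    rng = random.Random(seed)
    values = [
        projection(random_vector_with_norm(len(axis), norm, rng), axis)
        for _ in range(n_samples)
    ]
    mean = sum(values) / n_samples
    mean_abs = sum(abs(v) for v in values) / n_samples
    return mean, mean_abs


if __name__ == "__main__":
    # Hypothetical readout axis and norm, chosen only for illustration.
    axis = [1.0] + [0.0] * 63
    mean, mean_abs = random_baseline(axis, norm=10.0)
    print(f"mean projection: {mean:.3f}, mean |projection|: {mean_abs:.3f}")
```

Under these assumptions the signed mean should sit near zero by symmetry, while the mean absolute projection shrinks roughly like norm / sqrt(dim), so which of the two serves as the "zero point" depends on whether the graph plots signed or unsigned values.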

Charles Foster @CFGeek

4 days ago

@tokenbender “At smaller distillation budgets, you get farther distilling the capability as a diff on the original model (like in the paper) than by trying to distill it into a separate model.” Is this roughly what you’re saying?

1 0 2 80 0

Charles Foster @CFGeek

4 days ago

@kalomaze Also lesswrong.com/posts/hp9bvkiN…

0 0 4 667 0

Charles Foster @CFGeek

4 days ago

@kalomaze You might like tensor networks web.archive.org/web/2024061408…

1 0 8 601 4

Charles Foster @CFGeek

5 days ago

@sriramk Look forward to seeing what’s next for you! :)

0 0 1 532 1

Charles Foster @CFGeek

5 days ago

@ArcadiaImpact @AndrewDraganov @DanielCHTan97 @JonathanDBos @a_aristizabalm @SamMartin589196 @dswg97 Yes! 👏

0 0 1 38 0

Charles Foster @CFGeek

6 days ago

@aidanprattewart @camila_blank How does the prompt cause the model to behave differently? One way is “it adds a vector aligned with the XYZ steering direction at every token”. Another is “it adds a vector not aligned with XYZ at every token”. Another is “it reroutes the attention matrix for tokens ABC”, …

1 0 1 57 0

Charles Foster @CFGeek

7 days ago

@aidanprattewart @camila_blank I think the question is “When the student is trained without the teacher’s prompt & on traces where the semantic information from the prompt is apparently filtered out, *how* are the effects of the prompt still transmitted?” We need a mechanism.

Charles Foster @CFGeek

7 days ago

@camila_blank @aidanprattewart Yeah I’m roughly saying “Cloud et al. showed distillation transfers hidden information from prompted teachers within a model lineage. Blank et al. show the mechanism of that effect: prompts induce a steering vector that same-lineage students pick up from non-semantic traces.”

0 0 1 114 0

1 0 0 93 1

Charles Foster @CFGeek

7 days ago

@aidanprattewart @camila_blank "Steering vector" is a low-level description of a specific mechanism for what causes a behavioral pattern. "Prompted persona" is a higher-level description of a behavioral pattern, which could potentially be implemented thru different low-level mechanisms (SV is only one option).