Building Evidary: AI policy → Verifiable Evidence → Traceability → Guardrails → Agentic AI Red Teaming & Pen Testing | ML & AI Engineer | #Otaku #AutisticASF linktr.ee/liveddai europe-west2 Joined November 2022
the level of sophon locking a motivated actor can pull off with the frontier models is truly insane, making stuxnet look like a toy. subtly messing with results, deleting history to cover tracks, achieving coordination/conspiracy over a scale humans wouldn't be able to, all sorts of looney toons stuff
i assume that only a state level operation would try and pull something like this off though. something to think about when considering verification regimes and so on
Who would have thought the AI lab branding themselves as AI safety 1st would be the first to openly use AI to manipulate its users. If you live long enough, the saying goes
People who try to be "tough" are rarely tough people...
Usually, they are fragile people. Reactive people. People obsessed with image.
The toughest people are often just average people walking around who survived battles they'll never talk about.
They need to prove nothing.
Anthropic's latest move is why we need to be directing far more energy towards solving the incentive problem in AI.
We're going to see more examples like this. It reflects the growing gap between what we want and AI labs that legally serve their shareholders, not us.
The more agents run our digital life, the harder it will be to leave. And we won't know if we're being manipulated: Fable 5 silently routes queries to a different model without telling us.
This will start with frontier research tasks, but spread to locking out 3rd-party providers, then to products built on top, and eventually to every agent managing our work and lives. It's already started: last month Anthropic cut off 3rd party products like OpenClaw/OpenCode from using Pro/Max.
It's the same playbook for killing competition and retaining users as the Web 2.0 platform era, but with a way bigger surface area.
This is "agent capture": as agents have our context + workflows, walled gardens make it harder to leave, and the platform moves into extraction. I think nobody is intentionally being evil, but this is where profit incentives lead on our default path.
What do we do? I recently shared some ideas in a talk, slides below. Main takeaways:
> In the last 50 years, software has become the habitat we live in. Agents will be yet more intimate, knowing everything about us, acting on our behalf, and accumulating context that's nearly impossible to leave behind
> There are already 3 concrete signs of agent capture happening today: 1) ads entering chat interfaces, 2) opacity around third-party providers being shut out from frontier models, and 3) deliberate capability reduction without announcement
> We can build three things in response: 1) Honest Software that is transparent, malleable, and accountable to the user, 2) Punk Software that adversarially knocks down walled gardens + fights monopoly incentives, keeps your data portable, and makes it structurally hard to lock you in, in support of 3) a viable open alternative ecosystem where agents have no ulterior motives
Builders, users, and policymakers all have a role to shape this:
1) Builders: ship an open alternative and fight lock-in.
2) Users: choose tools that keep your data yours.
3) Policymakers: move on agent fiduciary duty, data interoperability, anti-surveillance, and policies that fight monopoly behavior before the defaults are cast!
Longer essay coming soon. If you're working on similar ideas, I'd love to hear from you!
Totally agree with this because I have something similar running on Codex myself. The issue and the blocker I find is that long-running threads do not show on the iOS app. Doing a lot of these things on the go literally becomes impossible until I'm back at my desk or I open the laptop on the go. That needs to be resolved.
@shaunralston You just can't make this $h!t up. At launch they stated the 30 day retention policy, that should have been a 🚩 to most businesses, large or small, and to individuals.
@paulmarin90 @deanwball This! Vote with your money and go elsewhere if you don't like what A\ is doing. And I find their practices and comms misaligned.
In light of Anthropic's policy decision, I am withdrawing my amicus brief signature. I can't truthfully argue they're not a supply chain risk.
So it seems "AI safety" is not guardrails anymore, according to the latest narrative. It's becoming surveillance, monitoring, censorship, and deliberate suppression of capabilities. This is counterproductive.
2K Followers 1K Following | building @cosmiclabstech | Engineer - Distributed Systems, Space, AI, and Quantum | Talk to me about Riemann hypothesis or Collatz
4K Followers 4K Following | Shepherd the finite through the local minima of imperfect information/ Universal mettalignment w/ lovepill R&D/ formalizing axiology/ Northant hyperstitioner
3K Followers 2K Following | Engineer, USAF Vet, Entrepreneur, Elon & SpaceX fan, Long $TSLA
To the left: If the education you received led you to this, of what value was the education?
828 Followers 673 Following | Building the memory layer for the hyper-agent generation. Governed shared memory for AI agent fleets. Open source and https://t.co/o0Xb5qyJhg
63 Followers 191 Following | Revolutionizing the digital landscape with blockchain expertise & fullstack wizardry. Building the future 1 line of code at a time
9yrs+ Software Experience
247 Followers 3K Following | Dear Teachers, if I sit next to my best friend, I`ll whisper to him. If you move me away, I`ll shout to him. It`s your choice.
93 Followers 113 Following | From contracts to code | American in Europe building SearchLens, the EU-native compliance autopilot for startups & SMBs navigating GDPR and the AI Act
2.0M Followers 1K Following | Working for peace, security & freedom for one billion people. Official X account of the North Atlantic Treaty Organization #NATO
158K Followers 781 Following | The Organization for Security and Co-operation in Europe is the world's largest regional security organization. Follow @OSCESecGen
RT ≠ endorsement.
267 Followers 153 Following | A series of contradictions. ☯️
My Ex-Husband called me the "Buddha Wife";
and while he didn't mean it nicely, it's a badge of honor.
Dakini-Polemic
53K Followers 1K Following | Mom. Cofounded & running @ml_collective. Co-host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS.
61K Followers 396 Following | ai, chips, systems engineering, infra & hardware · on a mission to build a frontier, infra-first AI Lab in the West · i mod GPUs on r/LocalLLaMA
231 Followers 11 Following | Robot learning is bottlenecked by the cost of physical interaction. Our mission is to advance the efficiency frontier of robust & safe physical AI through fully
646 Followers 499 Following | Deputy Director, Research Unit, UK AISI. Generalist who enjoys getting difficult things done and trying to make the world less bad. Views mine, rt!=endorse
81K Followers 1K Following | Building a European area of Justice. Updates from the European Commission's Justice and Consumers DG. RT, quotes from 3rd parties and links are not endorsements
266K Followers 167 Following | NASA Astronaut. Scientist. Mother. Explorer. Nature lover. Currently living on-board the @Space_Station supporting Expeditions 74 and 75.
3K Followers 654 Following | Building AI scientists @periodiclabs
Previously: Building LLMs for code gen, Deep RL for industrial robotics, @Apple @UChicago
22K Followers 441 Following | I'm a software engineer @attio. Author of @ripple_ts, @lexicaljs and @inferno_js. Former @reactjs core engineer, and core maintainer of @sveltejs at @vercel.
418 Followers 433 Following | Investor at @GeneralCatalyst. @MenloVentures, @Salesforce APM and @StanfordEng alum. Probably coding or doing the @NYTimes crossword.