ericmichael @ericmichael
Data + AI lead @ UTHealth RGV / UTRGV School of Medicine, BJJ Black Belt @ 80/20 Jiu-Jitsu Edinburg, TX. Joined July 2008.
Tweets: 169 · Followers: 808 · Following: 716 · Likes: 1K
@rankintweets iPad, Tailscale SSH, Tmux. It’s great.
Biggest issue with Claude Code with Opus 4.5 is still that after a few context compressions it’s a huge risk to continue working in the conversation.
@jeremyphoward It’s a great and fast model! Simple little tricks like adding a no-op scratchpad tool to let it “plan” in-context can improve tool calling while keeping things super fast.
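A minimal sketch of what a no-op scratchpad tool could look like, assuming the common JSON-schema "function" tool format used by OpenAI-compatible APIs. The tool name `scratchpad`, the `notes` parameter, and the `handle_tool_call` dispatcher are hypothetical names chosen for illustration, not anything from the original tweet.

```python
import json

# Hypothetical tool definition. The tool does nothing; its only purpose is to
# give the model a place to write a plan that then stays in the context.
SCRATCHPAD_TOOL = {
    "type": "function",
    "function": {
        "name": "scratchpad",
        "description": (
            "Write down your plan before calling other tools. "
            "This tool has no side effects."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "notes": {
                    "type": "string",
                    "description": "Free-form planning notes.",
                }
            },
            "required": ["notes"],
        },
    },
}


def handle_tool_call(name: str, arguments: str) -> str:
    """Dispatch a tool call; `arguments` is the JSON string sent by the model."""
    args = json.loads(arguments)
    if name == "scratchpad":
        # No-op: the notes are ignored here, nothing is stored or executed.
        _ = args["notes"]
        return "ok"
    raise ValueError(f"unknown tool: {name}")
```

Because the handler returns immediately, the extra tool call adds almost no latency beyond the tokens the model spends writing its notes.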
SAM 3D Body does a great job at reconstructing 3D scenes from 2D Brazilian Jiu-Jitsu images. Struggles a bit with entanglement/occlusion. Initial tests of using out-of-the-box SAM 3D embeddings for image -> position-embedding similarity show promise compared to CLIP and could have a ton of useful applications for coaches and athletes. Experimenting with fine-tuning an embedding model with contrastive loss to improve performance for position similarity.
@mervenoyann Replace detectron2/SAM2 with SAM3 + “person” prompt!
Very close to recreating the SAM3D Body playground. FastAPI + React + Three.js. Vibe coded.
1. Use SAM3 prompted to identify people. Expose UI for the user to select which identified people to generate meshes for.
2. Pass masks to SAM3D Body to generate meshes and poses.
3. Animate with Three.js.
FB repo for SAM3D has lots of references, but it works much better if you swap the repo’s human detection / masking (detectron or SAM2) for SAM3, which is promptable.
Is it just me or is Playwright MCP the only useful MCP server out there? Everything else seems like noise, or the MCP server ends up more limited than just using the tool’s CLI (chainable and composable through bash).
@AnthropicAI “Tools. Models have access to a wide array of software tools (often via the open standard Model Context Protocol).” Something about this line doesn’t sit right. There are so few useful MCP servers that throwing this line into the article feels like a poorly placed ad.
claude hates uv
I think we’re actually in agreement if I’m grokking this correctly. What I’m doing in practice is what I think you described in your summary: collecting a large number of diverse traces and hand-labeling them, splitting the dataset into train/dev/test, and optimizing the judge (allowing it to receive signal from dev human annotations) until TPR/TNR reach an acceptable threshold. The human signal is a key part of the judge training. Then optimizing the LLM (using GEPA) against the GEPA-trained judge.
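A rough sketch of the TPR/TNR check on a dev split, assuming binary pass/fail labels where `True` means the human marked the trace as passing. The function name, the sample labels, and the 0.6 threshold are made up for illustration.

```python
def tpr_tnr(human: list[bool], judge: list[bool]) -> tuple[float, float]:
    """Compare judge verdicts to human labels.

    TPR = share of human-passed traces the judge also passed.
    TNR = share of human-failed traces the judge also failed.
    """
    if len(human) != len(judge):
        raise ValueError("human and judge label lists must be the same length")
    tp = sum(h and j for h, j in zip(human, judge))
    fn = sum(h and not j for h, j in zip(human, judge))
    tn = sum(not h and not j for h, j in zip(human, judge))
    fp = sum(not h and j for h, j in zip(human, judge))
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


# Illustrative dev-split labels (not real data).
human_dev = [True, True, True, False, False]
judge_dev = [True, True, False, False, True]

tpr, tnr = tpr_tnr(human_dev, judge_dev)
THRESHOLD = 0.6  # assumed acceptance threshold
judge_is_aligned = tpr >= THRESHOLD and tnr >= THRESHOLD
```

Here TPR is 2/3 and TNR is 1/2, so this judge would not clear the assumed threshold yet and would go through another round of optimization before being reported on the held-out test split.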
GEPA prompt optimization: You don’t need a judge but it’s super helpful. GEPA is going to produce a bunch of experiments on your prompt and you need a way of scoring the _output_ of those runs for GEPA to provide good reflective feedback. If you have human-labeled data, the labels only apply to the specific prompt you used at that time. You need an automated way of scoring outputs with each prompt iteration to get the full power of GEPA approaches (IMO). You can do that with a well-aligned judge. Or you could do it with a code-based check. Your human-labeled data went stale the second a new prompt variation was mutated. Humans would need to go relabel outputs for each prompt variation. Aligning a judge allows you to automate this on _unseen_ prompt/output combos so your experts don’t have to relabel the outputs for every mutated prompt. Any clever way you have of using an LLM to match the output against human-labeled data _is_ a judge whether you realize it or not. Don’t want to make and align a judge? Write a code-based check!
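To make the "code-based check" idea concrete, here is a small sketch that scores any output deterministically and returns text feedback alongside the score. It does not call the GEPA library; the required keys, score values, and function names are assumptions for illustration, and wiring it into an optimizer is left out.

```python
import json

# Hypothetical output contract: the model must return a JSON object
# containing these keys.
REQUIRED_KEYS = ("answer", "sources")


def check_output(output: str) -> tuple[float, str]:
    """Return (score, feedback) for one model output, with no human in the loop."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError as exc:
        return 0.0, f"Output is not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return 0.0, "Output is valid JSON but not an object."
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        return 0.5, f"Missing keys: {', '.join(missing)}"
    return 1.0, "ok"


def score_prompt(outputs: list[str]) -> float:
    """Mean check score over the outputs produced by one prompt variation."""
    if not outputs:
        return 0.0
    return sum(check_output(o)[0] for o in outputs) / len(outputs)
```

Since the check runs on whatever output a mutated prompt produces, it never goes stale the way per-prompt human labels do, and the feedback string gives the optimizer something to reflect on.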
@gooby_esq I’d be curious to know what are these classes of problems that we expect an LLM to be able to solve but not be able to verify. It is likely the case that the judge needs access to data, tools, context to verify the solution.
Some reactions to my post are "wow, I'll never use Ghostty since you use AI." That's fine, I really don't care. But my friends, if you plan on avoiding all software that had any AI assistance in its dev, I have really bad news for you about the general software ecosystem.
I’m all for evals but with coding agents a lot of the eval work happened in post-training already and got built into the model. Not true for other domains. The number of tools required for effective coding agents is really small: bash, apply_diff. Every other tool is really just a nice-to-have. Most of the code for building a coding agent goes to controlling execution / UI / approvals more tightly, all of which would fall under the scope of automated testing rather than evals. So I can see why they would delay evals until vibe iteration runs out. I just made my own coding agent and the out-of-the-box performance of GPT-5 high is really strong. Devs have to weigh the time it takes to eval correctly vs waiting for the next coding RL’d model to be released. For any other domain it’s unlikely that the developer is also religiously using it and also a domain expert.
- level 1: Chatbots, prompting (identity, goal, starting context, guidelines), using OpenAI compatible APIs
- level 2: Single Agent + Basic Retrieval Tools, Tracing, Human Evals
- level 3: Single Agent + Action Oriented Tools, Guardrails, Approvals, Reference-Based and Reference-Free Automated Evals
- level 4: Deployment, CI/CD, Privacy, Compliance, BAAs, Enterprise considerations
- level 5: Multi-Agent Architectures (handoffs, agents as tools), DSPy / Prompt Optimization using Eval data
- level 6: Workflows vs Agents, Hybrid Approaches
- level 7: Measuring ROI and making business case for AI applications
- level 8: Measuring OSS performance vs proprietary across evals, cost, performance metrics

Hot takes:
- Agentic RAG > vector embedding based retrieval, easier to create and better out-of-the-box performance
- Caching beyond what API providers already provide is premature optimization
- Single Agent systems > Multi-Agent systems based on knowledge complexity and bang-for-buck performance
- Generic metrics like ROUGE, BLEU, etc. harm more than they help. Evals should be human-centered and focused on measuring the prevalence of _observed failure modes_ specific to the application domain instead of context-unaware metrics
- Inference, self-hosting, OSS not advisable until measuring the prevalence of failure modes through evaluations shows that OSS performance is worth it when compared to other factors such as cost (GPUs, salaries, development complexity), latency, etc.
- De-identification (PII redaction) for regulated industries like healthcare is more challenging and imposes more legal risk than simply building HIPAA compliant environments that are allowed to just process and store the PHI
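One way to read "measuring the prevalence of observed failure modes" is as a simple count over human-annotated traces. The sketch below assumes each reviewed trace carries a list of failure-mode tags; the tag names and the sample traces are invented for illustration.

```python
from collections import Counter


def failure_mode_prevalence(traces: list[dict]) -> dict[str, float]:
    """Fraction of traces exhibiting each failure mode (a trace counts once per mode)."""
    if not traces:
        return {}
    counts: Counter[str] = Counter()
    for trace in traces:
        # set() so a mode tagged twice on one trace is only counted once
        counts.update(set(trace["failures"]))
    return {mode: n / len(traces) for mode, n in counts.items()}


# Illustrative annotations (not real data).
annotated_traces = [
    {"id": 1, "failures": ["hallucinated_citation"]},
    {"id": 2, "failures": []},
    {"id": 3, "failures": ["hallucinated_citation", "wrong_units"]},
    {"id": 4, "failures": []},
]

prevalence = failure_mode_prevalence(annotated_traces)
```

Tracking these fractions across model or prompt changes gives an application-specific signal that a generic overlap metric does not.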
@iScienceLuvr I think this would be really useful in a healthcare setting where folks are worried about agentic web search leaking PHI via search queries. Seems to me like it would be a much safer solution than any kind of automated de-identification scheme.