Prompt Injection @PromptInjection

AI beyond the hype. Real insights, real breakthroughs, real methods. Philosophy, benchmarks, quantization, hacks, minus the marketing smoke. Injecting facts into promptinjection.net
Joined June 2025

Tweets

3K
Followers

504
Following

2K
Likes

2K

Adina Yakup @AdinaYakup

3 hours ago

Unlimited-OCR 🔥 New OCR from @PaddlePaddle. It can parse hundreds of pages in a single pass while maintaining stable speed. The key idea is R-SWA (Reference Sliding Window Attention), which keeps KV cache constant during decoding.
🏆 93% on OmniDocBench
📈 +6% over DeepSeek-OCR

9 50 377 23K 407


Vik Paruchuri @VikParuchuri

3 days ago

We're open sourcing a 9B model that extracts structured data from documents at near-frontier performance.
- 90.2% on our bench, vs Gemini 3.5 Flash at 91.3%
- Leads extraction models like NuExtract3 (81.5%)
- 9.5s p50 timings
- Pass JSON schema

83 251 3K 222K 3K


Wildminder @wildmindai

6 days ago

Ming Omni TTS 16.8B - 30GB monster for high-performance unified audio gen.
- speech, sound, music
- speed, pitch, emotion
- 93% accuracy on Cantonese dialects
- narrates complex math/chemical expressions
- zero-shot voice design
Optimized for high-speed, low-latency gen. Perfect for long-form content. huggingface.co/inclusionAI/Mi…

1 23 149 8K 151


The Wall Street Journal @WSJ

6 days ago

Breaking: SpaceX said it would buy Cursor for $60 billion, striking a massive deal for an autonomous coding agent shortly after its blockbuster IPO on.wsj.com/4xDAULx

94 264 1K 1.6M 138


TechCrunch @TechCrunch

7 days ago

The US government’s Anthropic models ban was never about an AI jailbreak techcrunch.com/2026/06/15/the…

23 38 212 50K 72


Financial Times @FT

a week ago

Cutting access to Anthropic’s Mythos is a gift to China ft.trib.al/VPTiE8J | opinion

25 51 201 42K 36


Prompt Injection @PromptInjection

a week ago

@grok What do you think about?

1 0 0 36 0


Prompt Injection @PromptInjection

a week ago

The Wet Sock Cosmology: What SFT Overfitting Actually Looks Like - and Why It Seduces You How 9 extra epochs turned a language model into the most convincing kind of broken promptinjection.net/p/the-wet-sock…

1 0 1 82 0


atomic.chat @atomic_chat_hq

2 weeks ago

Diffusion Gemma is 4x faster, but makes 6x more mistakes! We benchmarked the new diffusion LLM against its autoregressive twin on a single H100 (FP8).
We gave each the same three tasks: write a Steve Jobs biography, the history of Tetris, and the story of BeOS, each topic less popular than the previous one. Then we fact-checked every claim in every answer.
Gemma4 got 45 facts right, 5 wrong. DiffusionGemma got 33 right, 28 wrong. The less popular the topic, the worse it got: 4 mistakes on Jobs, 12 on Tetris, 12 on BeOS. It named Clara Clley as Steve Jobs' mother, invented a colleague for Pajitnov named Geri Gulovik and priced the BeBox at $9,999. The real one cost $1,600.
Outputs:
Gemma4 26B A4B: 218 tok/s · 15.1s total · 45 facts · 5 mistakes
DiffusionGemma 26B A4B: 763 tok/s · 3.7s total · 33 facts · 28 mistakes
The reason is simple. DiffusionGemma throws 256 tokens on the screen at once and polishes them pass after pass until the text sounds smooth. Smooth is all it cares about: a fake name, date or number sounds just as smooth as a real one, so it stays. Regular Gemma4 meanwhile writes one word at a time and checks every new word against everything before it.
Google says it themselves in the launch post: quality is lower, use regular Gemma 4 when facts matter.
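The headline ratios can be checked from the figures quoted in the post. The snippet below is only a sanity check on that arithmetic; the dictionary layout and the `error_rate` helper are illustrative names, not anything from the benchmark itself.

```python
# Figures copied from the post; names and structure here are illustrative.
runs = {
    "Gemma4 26B A4B": {"tok_s": 218, "seconds": 15.1, "right": 45, "wrong": 5},
    "DiffusionGemma 26B A4B": {"tok_s": 763, "seconds": 3.7, "right": 33, "wrong": 28},
}


def error_rate(right: int, wrong: int) -> float:
    """Share of checked claims that were wrong."""
    return wrong / (right + wrong)


base = runs["Gemma4 26B A4B"]
diff = runs["DiffusionGemma 26B A4B"]

throughput_ratio = diff["tok_s"] / base["tok_s"]      # tokens per second
wall_clock_ratio = base["seconds"] / diff["seconds"]  # total time per answer set
mistake_ratio = diff["wrong"] / base["wrong"]         # absolute mistake counts
base_error = error_rate(base["right"], base["wrong"])
diff_error = error_rate(diff["right"], diff["wrong"])

print(f"throughput: {throughput_ratio:.2f}x, wall clock: {wall_clock_ratio:.2f}x")
print(f"mistakes: {mistake_ratio:.1f}x, error rate: {base_error:.1%} vs {diff_error:.1%}")
```

On these numbers the "4x" matches total wall-clock time (about 4.1x) more closely than raw throughput (3.5x), and the "6x" is a rounding of 5.6x in absolute mistakes. Because DiffusionGemma also made more claims overall (61 vs 50), the per-claim error rate rises from 10% to roughly 46%.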

42 64 606 75K 235


Prompt Injection @PromptInjection

a week ago

MTP is awesome, especially for the dense models. It makes the 31B dense very usable even on slower hardware. For the MoE ones, the improvement usually stays much smaller (+10-20%).

Unsloth AI @UnslothAI

2 weeks ago

Gemma 4 now runs 2x faster with MTP GGUFs! Run locally on just 6GB RAM. ⚡️ MTP enables Google Gemma 4 to run ~1.4–2.2× faster with no accuracy loss. Gemma 4 12B MTP can run at 162 t/s vs. 52 t/s without MTP. 31B reaches 101 t/s. GGUFs + Guide: unsloth.ai/docs/models/mtp

61 257 2K 218K 2K

0 0 1 63 0


Google @Google

2 weeks ago

Meet DiffusionGemma ⚡ Our latest experimental open model (Apache 2.0) that generates text up to 4x faster. Instead of predicting and typing just one word at a time like most language models, it drafts and refines entire blocks of text simultaneously. Here’s how it works 🧵 ↓

117 379 3K 239K 923


Dan Greenheck @dangreenheck

2 weeks ago

Jokingly asked Fable to build me Crysis in Three.js. It may not be Crysis, but the fact this is all done procedurally in basically one shot is kind of blowing my mind right now.

81 104 2K 231K 760


Prompt Injection @PromptInjection

2 weeks ago

AI News Roundup: May 26 – June 07, 2026 The most important news and trends promptinjection.net/p/ai-news-roun…

0 0 0 54 0


Omar Sanseviero @osanseviero

2 weeks ago

Gemma 4 MTP just got officially merged into llama.cpp This means you can use Gemma 4 QAT + MTP for a lightweight + super fast setup. Excited to see what the community builds with it github.com/ggml-org/llama…

57 130 1K 94K 667


Carlo @Italianclownz

2 weeks ago

If ROCmFP4 has helped you or you have found it useful, please consider a star on GitHub. It's my first ambitious undertaking in open source and it's been pushing models into territory not initially possible. github.com/charlie12345/r… Also, if you find other GitHub repos out there from others that have been useful, star them as well. It means something to those creators. When it comes to open source we are all in this together. Sharing, Contributing, Giving a Star 🌟 helps.

7 11 63 3K 19


Unsloth AI @UnslothAI

3 weeks ago

Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs. Google's new model, Gemma 4 12B Unified, supports image, audio and 256K context. You can run and train the model via Unsloth Studio. GGUF: huggingface.co/unsloth/gemma-… Guide: unsloth.ai/docs/models/ge…

Google Gemma @googlegemma

3 weeks ago

Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇

402 2K 12K 3.2M 5K

96 381 3K 351K 2K


Artificial Analysis @ArtificialAnlys

3 weeks ago

Microsoft has released MAI-Transcribe-1.5: an exceptionally fast speech transcription model at a speed factor of ~276x, while still achieving 2.4% on AA-WER (#3), leading the accuracy-speed Pareto frontier.
MAI-Transcribe-1.5 is Microsoft AI (MAI)’s latest speech transcription model, coming in at 3rd overall on the Artificial Analysis Word Error Rate (AA-WER) leaderboard, behind Alibaba’s Fun-Realtime-ASR-preview (1.7% WER) and ElevenLabs Scribe v2 (2.2% WER).
The model stands out as the fastest STT model in the top 10 for accuracy, processing audio at ~276x real-time. This is more than double the speed of the second-fastest model in the top 10 for accuracy.
The new model supports keyword biasing (improved recognition of rarer vocabulary such as names and medical terminology), in addition to support for 43 languages including English, French, Arabic, Japanese, and Chinese.
See more details below ⬇️

12 41 617 48K 253


stevibe @stevibe

3 weeks ago

Qwen3.6 35B A3B can't fill out a paper form on its own. But give it NVIDIA's LocateAnything-3B, the #1 trending model on HuggingFace, as its eyes, and the two small models get it done together. (The test: place each element at the right pixel position on a blank form image, not type into a field.)
Setup:
> Qwen is the brain (main model), LocateAnything is the eyes (helper model acting as a tool).
> I gave Qwen a new tool: ask "where's the email field?" and LocateAnything returns the exact x, y, width, height.
> The blue boxes on the screen are its detections. Look how tight they are: it nails every field.
Result:
> Qwen3.6 35B A3B + LocateAnything-3B: form completed, all info correct.
> Name, DOB, ID, gender, marital status, nationality, email, phone, address, postal code: all landed in the right field areas.
> Character-box alignment still a touch loose, but every value is where it belongs.
> 9m10s, 224.5k input, 24.3k output, 21 turns.
Why it matters:
> Qwen alone can't finish this test. Bolt on a 3B model that does exactly one thing, locate, and suddenly it can.
> A combination of small models can do the work of a single large one.
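The "helper model as a tool" setup can be pictured as a function the main model calls with a field name and gets a bounding box back. The sketch below is a guess at that shape, not the author's code: `Box`, `FAKE_DETECTIONS`, `locate`, and `text_anchor` are all hypothetical names, and the hard-coded table stands in for whatever LocateAnything-3B actually returns.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Box:
    """Pixel rectangle in the form the post describes: x, y, width, height."""
    x: int
    y: int
    width: int
    height: int


# Hypothetical stand-in for the helper model. A real setup would run a vision
# model on the form image instead of reading made-up coordinates from a table.
FAKE_DETECTIONS: Dict[str, Box] = {
    "email": Box(x=120, y=340, width=400, height=32),
    "phone": Box(x=120, y=390, width=260, height=32),
}


def locate(field: str) -> Dict[str, int]:
    """Tool exposed to the main model: field name in, bounding box out."""
    return asdict(FAKE_DETECTIONS[field])


def text_anchor(box: Dict[str, int], padding: int = 4) -> Tuple[int, int]:
    """Left-aligned, vertically centred point at which to draw a value."""
    return (box["x"] + padding, box["y"] + box["height"] // 2)


email_box = locate("email")
email_anchor = text_anchor(email_box)
print(email_box, email_anchor)
```

Keeping the tool's output to four integers means the main model only has to reason about where text goes, while all pixel-level perception stays in the helper.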

86 274 3K 148K 3K


RyanLee @RyanLeeMiniMax

3 weeks ago

MiniMax-M3 will arrive on HuggingFace as open weights next week!

MiniMax (official) @MiniMax_AI

3 weeks ago

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities - Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas - MiniMax Sparse Attention scales context to 1M -