- Tweets: 29K
- Followers: 20K
- Following: 211
- Likes: 11K
Did nobody stress to him how important the all-purpose reference (全能参考) is?
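Magic! DeepSeekV4's context memory compressed to 1/10!

Everyone knows DeepSeekV4 supports a 1M context and has been heavily optimized: actually using the full 1M context takes only about 10 GB of VRAM (by comparison, DeepSeek-V3.2 needs roughly 84 GB). Then I just came across the FlashMemory paper, which pushes VRAM usage straight down to 1.3 GB, and output quality even goes up instead of down!

Bro, fooling your buddies is one thing, but fooling yourself is pointless. Really? Performance goes up after compression? I went straight to the details of the paper.

First, a recap of the conventional approach: every time the model emits a word, it has to look over the previous several hundred thousand words again (this is global attention).

FlashMemory's approach is to predict what will be needed in the future. It has a built-in Neural Memory Indexer (really just a small model) that proactively predicts which segments of the history the upcoming generation will need, and then prepares those segments in advance. From there, as long as the hit rate is very high, the gain is bound to hold.

In other words, its assumption is that not everything in the KV cache is needed for every generated word; the needed parts only have to be loaded ahead of time, on demand. It is much like doing homework with reference materials spread all over the desk: the optimization is to photograph just the parts of the references you will need, and look at the photos when the time comes.

That sounds simple, but the practical difficulty is that training a dedicated small indexer model would require loading the DeepSeek-V4 model into VRAM and training the two together, which is quite compute-hungry.

Hence the paper's second highlight: decoupled training. They train the indexer on its own as a standard dual-encoder (similar to the models used in search and recommendation). In this process the huge DeepSeek-V4 base model does not need to be loaded into VRAM at all. Training cost drops off a cliff, and the setup is compatible with standard retrieval training frameworks. (Put simply, it is trained with a generic method that predicts from the query which long passages need to be retrieved, so it is really a general-purpose model.)

Sounds plausible, but that only reduces VRAM usage; how does performance improve as well? The answer is attention denoising. Because each step brings only the memory chunks most relevant to the current generation into VRAM, the model never sees the irrelevant, redundant information during computation. This naturally acts as a kind of "denoising", which is why accuracy rises slightly even though VRAM usage falls. According to the official tests, accuracy on long-context benchmarks (such as LongBench-v2) ended up improving by 0.6% on average.

(There is also the question of how data is evicted from VRAM and how predictions drive prefetching; that part is great too, and very illuminating. I recommend reading the original paper; there is no room left here.)

Paper: arxiv.org/abs/2606.09079
Project: github.com/libertywing/Fl…
#FlashMemory #DeepSeekV4 #FlashMemoryDeepseekV4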
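The "predict which chunks are needed, then load only those" idea can be illustrated with a minimal Python sketch. This is not the paper's indexer: it assumes, dual-encoder style, that the query and every memory chunk have already been embedded into vectors, scores each chunk by dot product, and keeps the top k. The names `select_chunks` and `hit_rate` and the toy vectors are hypothetical, made up for illustration only.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, used here as the relevance score."""
    return sum(x * y for x, y in zip(a, b))


def select_chunks(query_vec: Sequence[float],
                  chunk_vecs: List[Sequence[float]],
                  k: int) -> List[int]:
    """Return the indices of the k highest-scoring chunks, in original order.

    Assumption: embeddings come from some dual encoder trained elsewhere;
    only the retrieval step is shown.
    """
    scores = [dot(query_vec, c) for c in chunk_vecs]
    ranked = sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def hit_rate(selected: Sequence[int], needed: Sequence[int]) -> float:
    """Fraction of the chunks that were actually needed that had been preloaded."""
    needed_set = set(needed)
    if not needed_set:
        return 1.0
    return len(needed_set & set(selected)) / len(needed_set)


# Toy data: four chunk embeddings and one query embedding (all hypothetical).
chunks = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
query = [1.0, 0.2]

preloaded = select_chunks(query, chunks, k=2)
print(preloaded)                    # indices of the chunks kept in fast memory
print(hit_rate(preloaded, [0, 1]))  # how many of the needed chunks were there
```

The hit rate is the quantity the tweet singles out: the memory saving only pays off if the preloaded chunks nearly always include the ones that generation actually needs.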
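Put that way, nobody uses a bare model in production. You first have to look at what level of problem "model + harness" can solve, and then at how much money and time it takes to solve that same problem. Whether it gets done in one shot or through an agent's iterative polishing, users don't care that much. Capability SOTA matters, and so does cost-effectiveness SOTA within a given capability tier.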
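In my experience, the stronger a model's one-pass ability (and the less thinking it needs to get there), the more it deserves to be called SOTA. Needing agentic coding to repair mistakes made on the first attempt is actually a sign of a weak model; at the very least, those should be fixed during interleaved thinking. Agentic coding is for handling engineering volume and runtime problems, not for fixing bugs that static checking alone would catch. Put more simply,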
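"Space data centers cost far less to operate than ones on the ground. No land to buy, no electricity bills, no big spending on water cooling: in a vacuum, dissipating heat just means radiating toward the pitch-black universe." Solar energy is free, but solar panels are not. To improve radiative efficiency you still need liquid cooling to spread heat across the radiating surface, and rack-to-rack interconnect speed is also a problem. Overall, it is not yet a cost-effective option; the only advantage is that it does not warm the Earth.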
Duck leg 12, goose leg 24; at a price of 16 you can't buy a goose leg 😂
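Some smartasses claim it is common knowledge that agent capability gets internalized into the model. In 2023 the very best models were still working on reasoning enhancement, and Claude 3.5 only appeared in June 2024.
2018–2020: knowledge-completion models
2021–2022: instruction and Q&A models
2022–2024: reasoning-enhanced models
2023–2024: tool-augmented models
2024–2025: workflow agents
2025–2026: policy agent models
2026+: autonomous agent models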
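For those who know how to use it, it has long been good enough; those who want a single sentence to work a miracle still can't get one.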
Tinyfool @tinyfool
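193K Followers 999 Following YouTuber. A programmer of 20 years, now retired and no longer coding for a living. Not an English teacher, but built "英语轻松读" and made well-received videos on learning English. Heart in Japan, currently in Tianjin. YouTube: 英语学习之路 https://t.co/fY05anVN8r 胡说八道 https://t.co/FQLaNWFD5j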
Michael Anti @mranti
356K Followers 11K Following 安替, Globus, Global Research. A veteran journalist on International Affairs, Harvard Nieman Fellow '08, TEDGlobal Speaker.
Baye @waylybaye
160K Followers 534 Following An independent programmer who makes a living selling apps. Building products while living around the world. Works: 熊猫吃短信, DAMA, ServerCat, OpenCat, Miley.
小径残雪 @xiaojingcanxue
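238K Followers 2K Following Not an accomplice to evil. This is a marketing account; all content may be reposted freely with the ID removed. Corrections of errors, omissions, and rumors are welcome. Unless explicitly marked as personal opinion or first-hand experience, everything here is reposted, and reposting does not imply endorsement. Please take it all as entertainment; taking it too seriously may lead to severe depression. If you experience cognitive dissonance, first ask yourself whether you are a "3D person". This marketing account does not accept moral lecturing such as "you didn't blur the victim's face" or "you are mocking a vulnerable group".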
倪爽 @nishuang
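103K Followers 797 Following 倪爽, design consultant. A senior designer with more than 20 years of experience, running Honest Dot Design / 倪爽设计工作室 in the United States and providing two kinds of design consulting, brand marketing design and UX design, to companies in China and North America.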
骆逸 @royxy
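46K Followers 1K Following Culture, architecture, history, feng shui, aesthetics, armchair politics, games, life, travel, photography, calligraphy, absurdist humor, and assorted useless knowledge. 有时频道: https://t.co/7ra78DfjcC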
luolei @luoleiorg
62K Followers 2K Following 💻 Full-Stack Developer / 📖 Lifetime Learner / 🚧 Building & Sharing / 🎬 YouTuber https://t.co/7g47VdNMnN / 📝 https://t.co/6ucNiJ2gyg / 👨🚀 More https://t.co/smbFyIb6Hc
forecho📈 @caizhenghai
43K Followers 4K Following A US stock investor for 8 years and a software developer for 13. - I may miss DMs on X; contact me: https://t.co/TpEhIUOWf4 - US stock navigator: https://t.co/hQXxFor1s6
Ding @dingyi
152K Followers 5K Following promote your product ‣ [email protected] newsletter ‣ https://t.co/q1JG1yCzdb refine your startup ‣ https://t.co/FfUYboxOr5 newest product ‣ https://t.co/cP6NQ3keo5
Samantha @samanthaZH12
2 Followers 100 Following
day day UP @day_up69393
70 Followers 4K Following
yin monkey @monkey5058
2 Followers 89 Following
www @wwolf1999
6 Followers 352 Following
Chihao Zhang @chihao_zha2604
3 Followers 133 Following
David li @WeberLee16
9 Followers 399 Following
andy @andy88667788
5 Followers 47 Following
Osama Mohamed @Osama11sso
15 Followers 924 Following
Helmut Kemper @HelmutKemper
159 Followers 4K Following
mingsh9 @mingsh6
8 Followers 262 Following
Shaw @Shaw1747052577
6 Followers 1K Following
林卡Linka @lnk151633844445
16 Followers 2K Following
JY Z @JunYuZzzzz
20 Followers 2K Following
baby o @o_baby12134
8 Followers 260 Following
yang yang @yangyang1056945
67 Followers 1K Following
Muztagh @muztag9
24 Followers 968 Following
Mike @MossMoss69
1 Followers 148 Following
caval zheng @cavalzheng
44 Followers 896 Following
你看这事闹的 @sylas641
12 Followers 403 Following Vibe coding and living. A CS Master's student just coasting for survival. Spare some change?
Larry Wang @larrywang7791
73 Followers 1K Following Connect East and West, contribute to a better world
jackma2023 @jackma202366107
0 Followers 7 Following
Jay Chaplain @j38509
17 Followers 954 Following
Freefisher007 @freefisher007
23 Followers 4K Following
Bobby @Bobby14572
25 Followers 1K Following
ichew13 | 🧙Dmt-Nat... @ichew13
137 Followers 1K Following Researcher of complex systems, blockchain, AI and digital health
Vivi @vivilinsv
27K Followers 9K Following TEDx Speaker | Human–AI relationships | AI & Crypto | Building @souli_ai 💗 Host @Vivi_Valley | Columnist @FTChinese | ex-Reuters TV | Author
Boriigloo @boriigloo
5 Followers 460 Following
dahai @zaihezhizhou
111 Followers 2K Following
Cox Barry @CoxBarry6
0 Followers 111 Following
Creo @Creo1318777
6 Followers 164 Following
ruanyf @ruanyf
202K Followers 372 Following Stay Focused, Keep Shipping. Build Early, Build Always. Improve yourself, Write solid/simple/stupid code.
CJ Zafir @cjzafir
62K Followers 982 Following I fine-tune small language models (SLMs) and beat large language models (LLMs)
Compute King @Compute_King
28K Followers 683 Following Husband & Father | Lifelong Learner | Creator & Innovator Semi · AI · HPC · GPU Computing Guru Global Compute Infrastructure & Bare-Metal Compute
Jianyang Gao @gaoj0017
3K Followers 895 Following Author of the RaBitQ quantization algorithm; Postdoc at @ETH on AI, ML System, Vector Database; prev. PhD @NTUsg; ICPC World Final;
Saining Xie @sainingxie
40K Followers 2K Following cofounder & chief science officer at @amilabs | faculty @nyu_courant | prev: @googledeepmind @meta (fair) @ucsandiego | ynwa
Marx Jia @MarxSiJie
5 Followers 62 Following
yan5xu @yan5xu
16K Followers 441 Following 🤖 Independent AI researcher | ex @ManusAI_HQ & @hey_im_monica. Tweets represent my personal views only and have nothing to do with my employer.
北美王路飞 @kingluffywang
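39K Followers 1K Following YouTube creator sharing knowledge about "artificial stupidity"; videos expose 雷公 LEI (aka 老雷, 雷公资本, LEI & Lonecapital), a habitual fleecer of retail investors in the stock world. Douyin: Kingluffywang Make Americas Great Again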
Nano Banana 2 @NanoBanana
162K Followers 3 Following Nano Banana 2 🍌🍌 the world's most powerful image editing and generation model! Try in the @GeminiApp
肖师傅 @xiaojietongxue
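22K Followers 523 Following A media veteran of 25 years, director of a Hong Kong Disneyland reality show, and Google digital marketing and e-commerce specialist. MBA from INSEEC, France 🇫🇷; winner of the US MarCom Platinum Award for digital marketing and the AMCP Hermes Creative Platinum Award. Four media and PR companies in mainland China, Hong Kong, Australia, and Spain. 👨🍼 Father of twins, now living in Madrid, Spain 🇪🇸.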
Amira Zairi @azed_ai
58K Followers 950 Following AI Educator & Creator | Ambassador @Adobe @LeonardoAi & @tripoai | Partner with leading brands | 📧Collab → [email protected] | HIGGS-6LKDK
LudovicCreator @LudovicCreator
38K Followers 2K Following AI VFX TEAM House of David S 2 Leonardo / KLING AI/RUNWAY /HAILUO / TOPAZ /LUMA/OpenArt CPP Adobe Firefly / LTX Ambassador https://t.co/4PWgTsNpBP
苏依丨Suyi @suyi_cc
797 Followers 545 Following Holder of BTC, BNB, ETH and TSLA stocks. FE & Node.js developer. Previously worked at SHEIN, TikTok, Antfin and Alibaba.
机器之心 JIQIZHIX... @jiqizhixin
19K Followers 1K Following China's leading media & information provider for #AI & #MachineLearning
陈凯文 KC @kc41165
2K Followers 1K Following Julius Caesar once said: Foolish man only believe what he wishes.
Ollin Boer Bohan @madebyollin
3K Followers 2K Following Independent researcher working on real-time on-device generative models. Made sdxl-vae-fp16-fix, TAESD/TAEHV, pokemon-emulation-via-dnn.
karminski-牙医 @karminski3
37K Followers 2K Following A coder, road bike rider, server fortune teller, electronic waste collector, co-founder of KCORES, ex-director at IllaSoft, KingsoftOffice, Juejin.
Jukka Seppänen @Kijaidesign
6K Followers 79 Following 3D modeling/printing artist, AI enthusiast, rookie Python coder.
Chinese Embassy in US @ChineseEmbinUS
345K Followers 631 Following Official twitter account of the Embassy of the People's Republic of China in the United States of America
saif.dev 👨🏻... @saifalfalah
29K Followers 2K Following Software Developer. Product: https://t.co/0o8B66p65y
Wan @Alibaba_Wan
29K Followers 13 Following Video & Image Generation Model from @Ali_TongyiLab Discord: https://t.co/RI6qTtAnFA
JUNDE WU @JundeMorsenWu
15K Followers 1K Following Building @onecontext_ai. Phd @UniofOxford. Founder of Panoptes (acq. @SenseTime_AI). Senior Scientist @Baidu_Inc.
楊戩 @fsbyangjian
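3K Followers 925 Following Cultivate a body of righteousness to hold up heaven and earth; keep three parts outlaw spirit to cow the petty; nurture seven parts chivalry to walk the world; carry three parts roguishness to make sport of the mortal world; hold a touch of arrogance to stand proud before lords and kings; offer a heart full of loyalty to shine like the sun and moon; hide a touch of clumsiness to stay out of harm's way; keep a sliver of sharp edge to cut down the wicked; round off one's corners to stay true to one's original heart. No private chats; if you want to DM, don't bother. Practicing lawyer, 大工業黨, rational, neutral, objective. Believes in universal values. Crypto people stay away; no explanations for crypto people, all blocked.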
DataVoid @DataPlusEngine
2K Followers 632 Following Independent ML researcher. The First step in knowing is admitting you don't
Ostris @ostrisai
13K Followers 312 Following AI / ML researcher and developer. Creator of AI Toolkit - https://t.co/Thqof0Gxpj Support my work - https://t.co/Isg2EXrP7s
JO. Z @jojodecayz
2K Followers 984 Following Founding member @ComfyUI - Product & GTM. Co-host @comfy_community
Black Myth @BlackMythGame
216K Followers 5 Following The only official account for the Black Myth series.
Ruiholy @Ruiholy
59K Followers 450 Following INTP,AI artist,( won three awards in Project Odyssey Season 1)Social Anxiety Disorder,Keep technology and thinking updated,crypto Enthusiast,
小鸡咯咯 @xiaoji_gege
18 Followers 163 Following