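A simplified tech learning plan. Start with a simple system:
Split the year into four phases.
Each quarter you learn one specific technical track, such as programming, systems, cloud, or security. Be realistic, and don't pile up courses you never apply. Track your progress with a kanban board and a monthly review. Most important: end each phase with a real project that pushes you a little, so the experience sticks.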
my 8 GB VRAM gaming laptop is absolutely going to hate me for this. but I still did it.
ran a 31b dense model (Gemma 4 31b Q4) with only 8 GB VRAM
last week I ran Gemma 4 26B A4B, a mixture-of-experts model, on my RTX 4060 and hit 25–28 tokens/sec using llama.cpp's new MTP support. smooth. snappy.
but MoE has a secret: it only activates 4B parameters per token despite having 26B total. that's why it flies.
so the real question started haunting me. what if I throw a full, no tricks, every parameter fires on every token, 31B DENSE model at the same machine?
# Hardware:
GPU: NVIDIA RTX 4060, 8 GB VRAM
RAM: 16 GB
CPU: Intel Core i7 H
Laptop. Gaming. Modest.
The model: gemma-4-31B-it-qat-UD-Q4_K_XL.gguf
(model's unsloth huggingface link in the comments)
This is Google DeepMind's flagship dense model in the Gemma 4 family, and it can run on a single consumer GPU. It packs a hybrid attention architecture, supports up to 256K context natively, and is QAT (Quantization Aware Training) optimized, meaning it retains far more quality than standard post-training quants at the same bit depth. This is NOT the MoE. This is 31 BILLION dense parameters, every single one of them loaded.
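A rough back-of-the-envelope for why this is a stretch on 8 GB. The bits-per-weight figure below is my assumption (a ballpark average for a 4-bit K-quant, not a number read from this GGUF), and the estimate ignores the KV cache and activations, so treat it as a sketch only.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: ~4.5 bits per weight on average for a 4-bit K-quant.
BITS_PER_WEIGHT = 4.5
VRAM_GB = 8.0  # from the hardware list in this post

dense_gb = weight_footprint_gb(31e9, BITS_PER_WEIGHT)
overflow_gb = max(0.0, dense_gb - VRAM_GB)

print(f"weights: {dense_gb:.1f} GB, beyond VRAM: {overflow_gb:.1f} GB")
```

Under that assumption the weights alone are more than double the VRAM, so a large share of the model has to sit in system RAM, which would fit with the low token rates reported in this post.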
# the flags I used:
-m gemma-4-31B-it-qat-UD-Q4_K_XL.gguf -cnv --spec-type draft-mtp --spec-draft-model mtp-gemma-4-31B-it.gguf --spec-draft-n-max 8 --spec-draft-p-min 0.6 -c 6000 -v
Multi Token Prediction (MTP) is still active here. Separate draft GGUF required, same as the 26B setup.
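For anyone wondering what draft-based speculative decoding does conceptually, here is a toy greedy sketch. It is not llama.cpp's implementation: `toy_draft`, `toy_target`, and `speculative_step` are hypothetical names, and I am assuming `n_max` and `p_min` play roughly the role that `--spec-draft-n-max` and `--spec-draft-p-min` play in the flags above. A real engine verifies all drafted tokens in one batched pass; this version calls the target step by step for clarity.

```python
from typing import Callable, List, Tuple

Token = int
DraftFn = Callable[[List[Token]], Tuple[Token, float]]  # (token, draft confidence)
TargetFn = Callable[[List[Token]], Token]               # target's greedy token


def speculative_step(prefix: List[Token], draft: DraftFn, target: TargetFn,
                     n_max: int = 8, p_min: float = 0.6) -> List[Token]:
    """Return the tokens produced by one draft-then-verify round."""
    # 1) Draft up to n_max tokens, stopping early when confidence is low.
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(n_max):
        tok, p = draft(ctx)
        if p < p_min:
            break
        proposed.append(tok)
        ctx.append(tok)

    # 2) Verify: keep drafted tokens only while they match the target.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target(ctx)
        if tok != expected:
            accepted.append(expected)  # target's correction ends the round
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3) Every draft matched: the target still contributes one more token.
    accepted.append(target(ctx))
    return accepted


def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10  # counts upward modulo 10


def toy_draft(ctx: List[Token]) -> Tuple[Token, float]:
    last = ctx[-1]
    tok = 0 if last == 3 else (last + 1) % 10  # deliberately wrong after 3
    return tok, 0.9


print(speculative_step([0], toy_draft, toy_target))  # [1, 2, 3, 4]
```

The output always matches what the target alone would have produced; the draft only changes how many tokens come out per round.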
# Results:
→ Decode: ~3 tokens/sec
→ Prefill: ~2 tokens/sec
→ Context: 6000 tokens
→ Hardware crying quietly in the corner: yes
so is 3 tps actually usable?
For real time back and forth chat? Not ideal. You're not having a fluid conversation at 3 tps.
but slow ≠ useless. And this is where it gets genuinely interesting.
think about how devs actually work in a real team. junior devs handle the routine work on their own. But when something is architectural, deeply complex, or needs serious reasoning? they walk down the hall and escalate to the senior.
That's exactly the local AI agent architecture this unlocks:
→ Fast orchestrator model (Gemma 4 26B MoE at 25+ tps) handles routing, simple queries, tool calls, memory. The junior dev.
→ Gemma 4 31B dense is the senior, called only when the fast model genuinely hits a wall. Hard multi-step reasoning. Complex code generation. Deep architectural decisions. The agentic loop stays fast. Only the hard hops touch the 31B. That's a legitimate production-grade local AI architecture on budget hardware. (requires two 8 GB GPUs)
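A minimal sketch of that junior/senior escalation, assuming the fast model can signal when it is out of its depth. Everything here is hypothetical: `Reply`, `route`, and the two stub functions stand in for whatever clients you use to talk to the two models, and the `confident` flag is just one possible escalation signal.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Reply:
    text: str
    confident: bool  # assumption: the fast model reports whether it can answer


def route(prompt: str,
          fast: Callable[[str], Reply],
          slow: Callable[[str], str]) -> Tuple[str, str]:
    """Try the fast model first and escalate only when it is not confident."""
    first = fast(prompt)
    if first.confident:
        return "fast", first.text
    return "slow", slow(prompt)


# Stubs standing in for the 26B MoE and the 31B dense model.
def fast_stub(prompt: str) -> Reply:
    hard = "architecture" in prompt.lower()
    return Reply(text=f"quick answer to: {prompt}", confident=not hard)


def slow_stub(prompt: str) -> str:
    return f"careful answer to: {prompt}"


print(route("rename this variable", fast_stub, slow_stub))
print(route("review the architecture of this service", fast_stub, slow_stub))
```

The point of the design is that the slow model is only ever invoked on the escalation path, so most turns pay only the fast model's latency.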
other workflows where 3 tps is completely fine:
- overnight batch jobs. summarize documents, extract structured data, review code. Fire it off. Sleep. wake up to results.
- One shot deep reasoning
- Silent code audit loops, you write and test, the 31B reviews diffs and flags issues in the background between your sprints
- Any workflow where output quality > output speed
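To get a feel for what an overnight batch can actually hold, here is a quick budget using the rates from the results above (~2 tokens/sec prefill, ~3 tokens/sec decode). The 8-hour window and the per-job token counts are assumptions I picked for illustration, and the function names are made up.

```python
def seconds_per_job(prompt_tokens: int, output_tokens: int,
                    prefill_tps: float = 2.0, decode_tps: float = 3.0) -> float:
    """Time to read the prompt plus time to generate the output."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


def jobs_per_window(window_hours: float, prompt_tokens: int,
                    output_tokens: int) -> int:
    """How many whole jobs fit in the window, run back to back."""
    window_s = window_hours * 3600
    return int(window_s // seconds_per_job(prompt_tokens, output_tokens))


# Assumed job shape: 2000-token prompt, 500-token answer (fits in -c 6000).
print(round(seconds_per_job(2000, 500) / 60, 1), "minutes per job")
print(jobs_per_window(8, 2000, 500), "jobs in 8 hours")
```

At these rates, reading the prompt takes longer than writing the answer, so short prompts matter more than short outputs when planning a batch.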
A few weeks ago, nobody was running a 30B+ dense model on a single consumer GPU with 8 GB VRAM. At all. Now we're doing it on an Intel i7-H gaming laptop with an NVIDIA RTX 4060, thanks to llama.cpp + QAT quants + MTP speculative drafting.
Google DeepMind said the Gemma 4 31B targets "consumer GPUs and workstations." They were not exaggerating. The hardware bar to run serious frontier class models locally keeps dropping.
the tools are here. the models are here. you just have to be willing to abuse your laptop a little.
what workflows would you actually run on a local 3 tps 31B dense model? genuinely curious. drop it below.
Run Gemma 4 26b MTP on 8 GB VRAM GPUs at 25+ tokens/second. Flags included!
local llm space is moving at terminal velocity. only 3 days ago google released gemma 4 26b a4b qat quants. more efficient than before, ran on 8gb vram at 20 tok/sec.
and now just a few hours ago,
Reinforcement Learning (RL) is quickly becoming the most important skill for AI researchers.
RL is more important now than it has ever been, but (probably due to its complexity) there aren’t a ton of great resources for learning it online.
Here are the best resources for learning RL for LLMs…
1/ RLHF book. Nathan is a long-time RL researcher and an expert on LLM alignment / post-training. He decided to write an entire book on (LLM-focused) RL techniques and has been slowly expanding / iterating on the book over the last year.
This is the most comprehensive RL resource that is currently available, and it’s an especially great resource for those who are unfamiliar with RL and still need to learn the basics.
2/ Spinning Up in Deep RL. Despite being created in 2018, this course from OpenAI has stood the test of time and is one of the best tutorials for learning RL.
This course builds up to understanding PPO, which is one of the most widely used algorithms for RL with LLMs.
Plus, understanding related algorithms (policy gradients, TRPO, etc.) will help a lot with gaining an understanding of new RL algorithms like GRPO.
3/ PPO / GRPO blog. Jimmy Shi (DeepMind) recently wrote a great blog explaining both PPO (RL algo traditionally used for RLHF) and GRPO (RL algo used for reasoning models).
This blog is great and it’s written in a way that is understandable for non-RL people.
4/ HuggingFace RL. HuggingFace has also published numerous useful blogs on the topic of RL.
Most recently, they published a blog that explains GRPO and PPO from the ground up (i.e., not assuming any background knowledge on RL). These blogs are inspired by the recent initiative from HuggingFace to create a fully open replication of DeepSeek-R1.
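For a taste of what these resources build up to, here is a small sketch of the two pieces that come up most: PPO's clipped surrogate term and GRPO's group-relative advantage. The function names are mine, the epsilon default of 0.2 is just a common choice, and using the population standard deviation for the group is an assumption; real implementations work on batched tensors of token log-probabilities.

```python
from statistics import mean, pstdev
from typing import List


def ppo_clipped_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO surrogate for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)


def grpo_advantages(rewards: List[float], tiny: float = 1e-8) -> List[float]:
    """Score each sampled answer relative to its own group: (r - mean) / std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std over the group
    return [(r - mu) / (sigma + tiny) for r in rewards]


# A ratio far above 1 with a positive advantage gets capped at (1 + eps) * A.
print(ppo_clipped_term(ratio=1.5, advantage=1.0))

# Four sampled answers to one prompt, two rewarded and two not.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))
```

The clipping stops a single update from moving the policy too far from the one that generated the samples, while the group normalization lets GRPO rank answers against each other without training a separate value model.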
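How fine is a heart that circumstances do not change
It stays true to good intentions and loyalty
Time has not shaken it, though perils multiply
Like a mountain, steadfast in its pride and height
Kindness is never lost, however long the absence
Its mark stays in hearts for a lifetime of years
People are known by their stands, not their speeches
And deeds raise the worth of the one who does them, and lift his standing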
Deep Freeze is a program that freezes the computer's system state, so that any change (installed programs, deleted files, viruses) disappears after a restart and the machine returns to the point at which you froze the system ⛄️.
Program link: 2u.pw/9zwIzD ⬇️
Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere.
GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans.
docs.z.ai/devpack/latest…
As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks.
API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License.
The future of AI is open, and it belongs to the people.
SOMEONE FOUND A WAY TO LEARN ANYTHING 10 TIMES FASTER WITH AI.
NotebookLM + Gemini + Obsidian.
13 minutes. Give it a chance.
Here it is. ⬇️
A Chinese developer runs the Llama 70B AI model locally on his MacBook throughout an 11-hour flight.
Save this post before you forget.
Imagine you're sitting by the window on a transatlantic flight, the people around you are paying $25 for painfully slow airplane Wi-Fi, and you have your MacBook open, playing on a completely different level!
This developer decided to cut himself off from the world: no API, no Claude subscription, no OpenAI. He relied entirely on Llama 3.3 70B running "offline" up in the clouds.
Anthropic pays more than $750,000 a year to engineers who can build large language models (LLMs) from scratch. Meanwhile, Stanford University has covered the whole field in a one-hour lecture that is completely free.