AI Engineering Weekly #5
This is a production risk signal: if your CI/CD pipelines or tooling depend on GitHub repos that got caught in the blast radius, they may have silentl
Signals
Anthropic's Claude Code source leak triggered a mass GitHub DMCA takedown
and the takedown accidentally nuked thousands of unrelated repos in the process.
Web
attn-rot (TurboQuant-style activation rotation for better quantization) merged into llama.cpp
local inference just got meaningfully tighter: Qwen 2.5-27B reportedly fits on a 16GB GPU at near-Q4_0 quality with this technique. Worth testing before your next hardware order.
GitHub
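For a rough sense of why a 27B model on a 16GB card is a tight fit, here is a minimal back-of-the-envelope sketch. It assumes the 27B parameter count from the item above and counts only the quantized weights; KV cache, activations, and runtime overhead are ignored, so real headroom is smaller. The bits-per-weight values are illustrative, with 4.5 being the effective rate of llama.cpp's Q4_0 (32 four-bit weights plus one fp16 scale per block). The function name is hypothetical.

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB.

    Ignores KV cache, activations, and runtime buffers (an assumption
    made to keep the estimate simple).
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    n_params = 27e9  # parameter count taken from the item above
    # Illustrative bits-per-weight values; 4.5 corresponds to Q4_0.
    for bpw in (3.5, 4.0, 4.5):
        size = weight_footprint_gib(n_params, bpw)
        print(f"{bpw} bits/weight: about {size:.1f} GiB of weights")
```

Whatever the estimate says, leave room for context length: the KV cache grows with it and can eat the remaining headroom.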
LiteLLM supply-chain attack hit Mercor and reportedly thousands of other orgs
if LiteLLM is in your stack as a routing layer, audit your dependency versions and review what credentials it touches.
Web
Intuit reports AI agents hitting 85% repeat usage by keeping humans in the loop
the production evidence here cuts against the "full autonomy" narrative; human-in-the-loop isn't a crutch, it's what drives retention.
Web
OpenAI quietly backed a lobbying group pushing age-verification requirements for AI
the backing was disclosed only after an investigation; worth tracking as regulatory-capture attempts around AI access controls accelerate.
Web
Iran war disrupting helium supply critical for chip fabrication, MRI magnet cooling, and quantum hardware
supply chain pressure on semiconductor fabs is real and not priced into most AI infrastructure roadmaps.
Web
The Take
The llama.cpp quantization merge and the LiteLLM supply-chain attack are the two things that require immediate action this week — one is an upgrade you want, one is a vulnerability you need to close. Audit your LiteLLM version and pinned dependencies today, then benchmark the attn-rot quant improvement against your local inference targets.
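A first pass at the LiteLLM audit could look like the sketch below. It reports the version installed in the current environment and flags requirement lines that are not pinned to an exact version. The helper names and the `requirements.txt` path are assumptions for illustration, and the script deliberately does not encode which versions are affected; check the reported version against the project's own advisory.

```python
import re
from importlib import metadata
from pathlib import Path


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def unpinned_requirements(requirements_text: str, package: str):
    """Return requirement lines for `package` that lack an exact `==` pin."""
    pattern = re.compile(
        rf"^{re.escape(package)}(?![A-Za-z0-9._-])(\[[^\]]*\])?\s*(?P<spec>.*)$",
        re.IGNORECASE,
    )
    flagged = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = pattern.match(line)
        if not match:
            continue
        spec = match.group("spec")
        # Treat ranges, bare names, and wildcard pins as unpinned.
        if not spec.startswith("==") or "*" in spec:
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    print("installed litellm version:", installed_version("litellm"))
    req_file = Path("requirements.txt")  # hypothetical location
    if req_file.exists():
        for line in unpinned_requirements(req_file.read_text(), "litellm"):
            print("not pinned to an exact version:", line)
```

This only covers direct requirements in one file. Lock files, transitive dependencies, and the credentials the router can read (provider API keys, environment variables) still need a manual review.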