Issue #3 2026-03-31 3 min read

AI Engineering Weekly #3

Signals

Google's TurboQuant memory market impact

A technical clarification thread on r/LocalLLaMA breaks down the TurboQuant/RaBitQ quantization claims. Separately, The Register reports that memory-maker shares dropped and some RAM prices eased, with Google's quantization work cited as a contributing factor. If aggressive model compression is already moving hardware markets, your infrastructure cost assumptions from six months ago are stale.

Axios npm package compromised

Malicious versions of Axios, one of the most-downloaded JS packages in existence, were pushing a remote access trojan; audit your supply chain now, especially any CI/CD pipeline that pulls Axios in transitively.

Web
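One way to start the audit is to list every copy of a package that your lockfile resolves, including nested transitive copies. The sketch below is a minimal illustration, assuming an npm `package-lock.json` with `lockfileVersion` 2 or 3 (where installs are listed under a `packages` map keyed by install path). The helper names `find_package_versions` and `flag_versions` are hypothetical, and the set of bad versions is something you supply from the official advisory; this issue does not list them.

```python
def find_package_versions(lock: dict, name: str) -> list[tuple[str, str]]:
    """Return (install path, version) for every copy of `name` in a parsed lockfile.

    Assumes lockfileVersion 2 or 3, where the "packages" map is keyed by
    install path, e.g. "node_modules/foo/node_modules/axios".
    """
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # The text after the last "node_modules/" is the installed package
        # name, so nested (transitive) copies are matched as well.
        if path.rpartition("node_modules/")[2] == name:
            hits.append((path, meta.get("version", "unknown")))
    return sorted(hits)


def flag_versions(hits: list[tuple[str, str]], bad_versions: set[str]) -> list[tuple[str, str]]:
    """Keep only the installs whose version appears in `bad_versions`.

    `bad_versions` is caller-supplied; take the real values from the
    advisory rather than from this sketch.
    """
    return [(path, version) for path, version in hits if version in bad_versions]
```

To use it, parse the lockfile with `json.load`, call `find_package_versions(lock, "axios")`, and pass the result to `flag_versions` along with the versions named in the advisory. Running the same check in CI against every repository's lockfile covers the transitive case.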

Ollama now powered by MLX on Apple Silicon (preview)

The native MLX backend means local inference on Mac gets a meaningful speed and efficiency bump without any user-side changes; worth testing if you're running dev workflows on Apple hardware.

Web

Google multi-agent study: 180 setups tested, multi-agent degraded performance

A community post summarizing Google's internal findings claims that multi-agent architectures made outcomes worse in most of the configurations tested; this matches what practitioners building in production have been reporting for a year.

Qwen 3.5-Omni results published by Alibaba, Qwen 3.6 spotted in model registry

Two data points in 24 hours signal that Alibaba is shipping fast; if you're evaluating open-weight multimodal options, the Qwen family is moving faster than most Western alternatives right now.

Web

LiteLLM drops Delve as a vendor

The most widely used open-source LLM gateway cut ties with a controversial observability startup; if you're using LiteLLM in prod, check whether any Delve integration is in your stack and what will replace it.

TechCrunch

Claude Code source partially leaked via npm map file

A source map left in the published npm package exposed Claude Code's internal structure; Anthropic hasn't commented publicly, but this is a reminder that shipping CLI tools via npm with source maps enabled is an opsec failure mode worth auditing in your own tooling.

Web
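A quick way to audit your own tooling is to unpack the tarball you are about to publish (for example, the one produced by `npm pack`) and look for source maps before it ships. The sketch below is one possible check, not a complete scanner: the function name `find_source_maps` is hypothetical, and it only looks for `.map` files and for `sourceMappingURL=` comments in `.js`, `.mjs`, and `.cjs` files.

```python
from pathlib import Path


def find_source_maps(package_dir: str) -> list[tuple[str, str]]:
    """Return (relative path, reason) for files that may expose source maps.

    Flags standalone `.map` files, plus JavaScript files that contain a
    `sourceMappingURL=` comment (which may point to or inline a map).
    """
    root = Path(package_dir)
    findings = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        if path.suffix == ".map":
            findings.append((relative, "source map file"))
        elif path.suffix in {".js", ".mjs", ".cjs"}:
            # Ignore decoding errors so minified or odd files do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="ignore")
            if "sourceMappingURL=" in text:
                findings.append((relative, "sourceMappingURL comment"))
    return sorted(findings)
```

An empty result does not prove the package is clean, but a non-empty one is a cheap signal to fail a release job on before anything reaches the registry.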

The Take

Supply chain (Axios), quantization economics (TurboQuant), and multi-agent reliability are all converging on the same message: the defaults you set up last year are wrong. Audit your npm dependencies, reprice your inference infrastructure assumptions, and stop defaulting to multi-agent architectures until you have evidence they help your specific workload.
