Issue #36 · 2026-05-12 · 2 min read

AI Engineering Signal #36

Thinking Machines ships TML-Interaction-Small, a 276B-A12B MoE model designed to listen while it speaks

Signals

Thinking Machines ships TML-Interaction-Small, a 276B-A12B MoE model designed to listen while it speaks

A model that listens while it speaks could make standard voice activity detection unnecessary for realtime voice agents.

TechCrunch

Nvidia releases official Rust-to-CUDA compiler (CUDA-oxide)

GPU programming in Rust becomes first-class, with no need for unsafe C bindings.

Web

Curl maintainer runs Anthropic’s Mythos security scanner, finds one real vulnerability and ~20 bugs

AI code review tools are starting to catch flaws in critical infrastructure.

Web

MiniCPM-V-4.6 open-weights vision-language model released

A new multimodal option for local deployment, continuing the small-model surge.

Web

Consumer GPU hits 500k-token context at 21 tok/s (coding)

Quantization and inference tricks make extreme context lengths practical on 48 GB of VRAM.

Reddit

arXiv preprint proposes “Containment Verification”

A proposed formal safety guarantee that is verifiable and does not depend on alignment training.

arXiv

The Take

Voice interaction, local multimodality, and applied AI security are all leaving the prototype phase at the same time. The tools that catch real-world vulnerabilities and stretch context windows to book length are ones you can deploy today: the frontier is collapsing into production.
