AI Engineering Signal #36
Thinking Machines ships TML-Interaction-Small, a 276B-A12B MoE model designed to listen while it speaks
Signals
Thinking Machines ships TML-Interaction-Small, a 276B-A12B MoE model designed to listen while it speaks
That could make standard voice activity detection obsolete for realtime voice agents.
TechCrunch
Nvidia releases official Rust-to-CUDA compiler (CUDA-oxide)
GPU programming in Rust becomes first-class, with no more unsafe C bindings.
Web
Curl maintainer runs Anthropic’s Mythos security scanner, finds one real vulnerability and ~20 bugs
AI code review tools are starting to catch flaws in critical infrastructure.
Web
MiniCPM-V-4.6 open-weights vision-language model released
A new multimodal option for local deployment, continuing the small-model surge.
Web
Consumer GPU hits 500k-token context at 21 tok/s (coding)
Quantization and inference tricks make extreme context lengths practical on 48 GB of VRAM.
ArXiv preprint proposes “Containment Verification”
The approach aims for a formal, verifiable AI safety guarantee that does not depend on alignment training.
ArXiv
The Take
Voice interaction, local multimodality, and applied AI security are all exiting the prototype phase simultaneously. The same tools that catch real-world vulnerabilities and stretch context windows to book length are the ones you can deploy today. The frontier is collapsing into production.