AI Engineering Signal #16

Signals

Claude Opus 4.7 ships 50% more expensive and immediately benchmarks worse than Opus 4.6 on long-context tasks

the score on the Thematic Generalization Benchmark dropped from 80.6 to 72.8 and MRCR long-context performance regressed noticeably; power users report consistent quality drops across coding and reasoning tasks.

Reddit

Qwen3.6-35B-A3B released, beats Opus 4.7 on image tasks

3B active params from 35B total (per the A3B in the model name), runs locally, outdraws frontier models on at least one creative benchmark.

Web

OpenAI Codex expanded to "almost everything"

agentic desktop control is now in Codex; in documented tests it hacked a Samsung TV and wrote a Chrome exploit, raising real attack-surface questions.

Web

Physical Intelligence claims robot brain generalizes to untaught tasks

if the claim holds under scrutiny, this is the embodied-AI generalization milestone people have been waiting for.

TechCrunch

Claude identity verification now requires passport or facial scan

driving local-model adoption; Anthropic is tightening access controls at the API edge.

Web

Researchers reveal new method for tuning superconductivity

materials-level control over the superconducting state could affect the economics of future compute substrates.

Web

Cloudflare launches inference-native AI platform for agents

purpose-built inference layer with agent-aware routing; worth evaluating if you're running multi-step agent workloads at scale.

Web

The Take

Anthropic shipped a regression alongside a price increase in the same week an open-weight local model beat it on a creative benchmark. The gap between what frontier APIs cost and what open-weight models can do is closing faster than the labs' release cadence can justify. If your stack is Opus-dependent, benchmark before you upgrade.
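
One way to make "benchmark before you upgrade" concrete is a small regression gate that compares a baseline and a candidate model on score and on score per unit of price. The sketch below is illustrative only. It reads the Thematic Generalization figures in this issue as Opus 4.6 and Opus 4.7 scores and treats the 50% price increase as a relative price of 1.5. The names ModelResult, compare, and max_score_drop are hypothetical and not part of any vendor SDK. In practice the scores would come from your own eval set on your own workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelResult:
    name: str
    score: float           # benchmark score, higher is better
    relative_price: float  # price relative to the baseline model (baseline = 1.0)


def compare(baseline: ModelResult, candidate: ModelResult,
            max_score_drop: float = 0.0) -> dict:
    """Summarize whether moving from baseline to candidate looks safe."""
    score_drop = baseline.score - candidate.score
    # Score obtained per unit of price, candidate relative to baseline.
    value_ratio = (candidate.score / candidate.relative_price) / (
        baseline.score / baseline.relative_price
    )
    return {
        "score_drop": round(score_drop, 2),
        "value_ratio": round(value_ratio, 3),
        # Gate only on regression; value_ratio is reported for context.
        "upgrade_ok": score_drop <= max_score_drop,
    }


# Assumed inputs taken from the figures in this issue; substitute your own evals.
old = ModelResult("Opus 4.6", score=80.6, relative_price=1.0)
new = ModelResult("Opus 4.7", score=72.8, relative_price=1.5)
report = compare(old, new)
print(report)
```

The gate deliberately fails on any score drop beyond the tolerance and reports value per price separately, since a pricier model that scores better can still be worth the upgrade.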
