AI Engineering Signal #21

Signals

OpenAI ships GPT-5.5, positioning it as a step toward a unified "super app"

benchmark results are circulating but independent evals are not yet in, so treat capability claims as provisional until third-party numbers land.

Web

DeepSeek V4 drops, Flash variant priced aggressively

near-frontier performance at a fraction of the API cost; worth benchmarking against your current stack this week.

Simon Willison

Qwen 3.6 27B ties Claude Sonnet 4.6 on agentic evals

a locally runnable open-weight model matching a top hosted model on agentic benchmarks is a meaningful threshold.

Reddit

Qwen 3.6 27B running at 85 TPS on a single RTX 3090

125K context and vision on consumer hardware significantly change the local deployment calculus.

Web

Claude Code post-mortem published by its creator

public acknowledgment of quality regressions in an agentic coding tool is rare; read it before deploying Claude Code in CI.

Simon Willison

Google reports 75% of new code is now AI-generated

up from roughly 50% in 2025 and 25% in 2024; the velocity of adoption inside a major engineering org is the signal, not the number itself.

Reddit

FairyFuse: multiplication-free LLM inference on CPUs via fused ternary kernels

if this holds up, it lowers the floor on the hardware that can run inference without a GPU.

arXiv

The Take

The open-weight tier is closing the cost and capability gap with hosted models faster than most production roadmaps assumed, while hosted providers respond with new releases that lack independent evals at launch. The teams that benchmark DeepSeek V4 and Qwen 3.6 against their actual workloads this week will have real data; everyone else is guessing.
