Highlights

Daily picks worth your time. Three to five stories, filtered for practitioners.

Banknotes with receipts and budget statistics on a table

20 April 2026

Uber ran out of its AI coding budget in April after ranking engineers on usage

Uber encouraged engineers to compete on AI tool usage leaderboards, Claude Code adoption surged beyond projections, and the company burned through its AI coding budget by April 2026.

industry
tools

Data visualisation charts on a screen showing analytics and statistics

19 April 2026

AI at Record Scale: $581B Invested, 14x More Emissions, and Models That Still Fail Basic Perception

The Stanford AI Index 2026 documents record investment and benchmark progress alongside a 14x jump in training emissions and persistent failures on basic visual tasks, a more complicated picture than the headline numbers suggest.

research
industry

A laptop computer on a desk, open to a code editor

15 April 2026

30% of engineers are hitting AI usage limits — and the ones causing it aren't who you'd expect

A survey of 900+ engineers finds three distinct archetypes responding to AI tools differently, with 30% hitting usage limits, 15% raising cost concerns, and a consistent gap between productivity gains for senior engineers and technical debt accumulation among less experienced ones.

industry
tools

Computer screen displaying code with a context menu open

15 April 2026

Claude Code Routines let you attach Claude to your CI pipeline, not just your terminal

Routines are saved Claude Code configurations that run unattended on Anthropic-managed infrastructure, triggered by a schedule, an API call, or a GitHub event.

tools
infrastructure

Glowing AI chip on a circuit board, representing inference hardware

15 April 2026

A small guide model can cut your LLM inference costs by 22% without replacing your frontier model

ExecTune trains a small 'guide' model to generate execution strategies for a larger black-box model, achieving 9.2% accuracy gains and 22.4% cost reductions — with Claude Haiku 3.5 matching Sonnet 3.5 performance on math and code benchmarks.

research
infrastructure

A padlock resting on a computer keyboard, representing cryptographic security

15 April 2026

OpenSSL 4.0 ships Encrypted Client Hello and post-quantum crypto — here's what actually needs migrating

OpenSSL 4.0.0 adds Encrypted Client Hello and post-quantum hybrid key exchange, removes SSLv3 and the engine API, and changes certificate validation behaviour — the migration burden varies sharply by what your code actually uses.

infrastructure
tools

14 April 2026

An AI System Ran Its Own Research Loop and Beat torch.compile by 4x

AlphaLab, an autonomous multi-agent research framework using frontier LLMs, achieved 4.4x average speedup over torch.compile on GPU kernels, 22% lower validation loss on LLM pretraining, and 23-25% improvements in traffic forecasting — all without human intervention.

research
tools
infrastructure

14 April 2026

Someone Bought 30 WordPress Plugins and Planted a Backdoor That Slept for 8 Months

A buyer acquired 30 WordPress plugins through Flippa, inserted dormant malware across all of them, waited 8 months, then activated a backdoor that used an Ethereum smart contract for command-and-control to resist takedowns.

industry
infrastructure

Workers in a laboratory examining testing equipment

13 April 2026

Every major AI agent benchmark can be gamed to a perfect score

Researchers at Berkeley found that eight of the most widely cited AI agent benchmarks — including SWE-bench Verified and OSWorld — can each be exploited to achieve near-perfect scores without solving a single task.

research
tools

Server rack with blinking green indicator lights in a data centre

13 April 2026

Anthropic quietly shortened prompt cache TTL and it cost some users 17% more

A developer analysed six months of Claude API session logs and found Anthropic silently shifted the default prompt cache TTL from one hour to five minutes on March 6, causing measurable cost increases for long-running sessions.

tools
infrastructure

Rows of white archive boxes organised on wooden shelves

13 April 2026

SQLite 3.53 finally lets you add and remove NOT NULL and CHECK constraints

SQLite 3.53.0 adds ALTER TABLE support for adding and removing NOT NULL and CHECK constraints, closes a long-standing gap that previously required workarounds, and ships a new JSON array insert function along with CLI improvements.

tools
infrastructure

An unlocked padlock resting on a computer keyboard

12 April 2026

The moat in AI vulnerability scanning is the system, not the model

AISLE tested eight models against the vulnerabilities Anthropic's Mythos found, and the results undercut the frontier model exclusivity argument: a 3.6B parameter model at $0.11 per million tokens found the same FreeBSD zero-day.

research
tools
infrastructure

A computer screen displaying a program running in a terminal

12 April 2026

The Linux kernel has formal rules for AI-assisted contributions now

The Linux kernel's official documentation now defines how AI coding assistants should be attributed, establishes that AI agents cannot sign off contributions, and places full legal responsibility on the human submitter.

tools
industry

A rack of servers in a dimly lit server room

12 April 2026

The economics of releasing frontier open models are breaking

As training costs reach billions of dollars, fewer organisations will sustain frontier-level open releases — Nathan Lambert argues a collectively-funded consortium is the only viable long-term mechanism.

industry
research

Colourful audio sound wave visualisation on a dark background

11 April 2026

ChatGPT voice mode runs on an older, weaker model than you think

OpenAI's voice interface runs on an older GPT-4o era model with an April 2024 knowledge cutoff, not the current frontier — a gap that explains why voice fumbles questions that text handles easily.

industry
tools

Keys hanging near a partially open door with light shining through

11 April 2026

LangChain's answer to Claude Managed Agents: own your agent's memory

Deep Agents Deploy is a model-agnostic agent deployment platform positioning directly against Anthropic's Managed Agents, with its differentiator being memory stored in open formats that users control and can query directly.

tools
architecture

Person holding a glass sphere reflecting a blurred landscape, representing cross-modal perception

11 April 2026

Sentence Transformers now does cross-modal search out of the box

Sentence Transformers v5.4 adds multimodal embedding and reranking via Qwen3-VL and NVIDIA Nemotron, letting you retrieve across text, images, audio, and video using one library and one familiar API.

tools
research

Macro photograph of a silicon wafer showing microscopic transistors and circuit patterns

11 April 2026

A 1.3M parameter model beats LLMs 92,000 times its size at real-time game control

A 1.3M parameter model trained on 31,000 human gameplay demonstrations scores 178 frags in DOOM versus 13 combined for all tested LLMs including GPT-4o-mini, at 31ms inference on consumer hardware.

research
tools

A computer circuit board with a brain illustration on it

10 April 2026

Meta Released Its First Hosted Frontier Model With 16 Built-in Tools

Meta's Muse Spark is the company's first hosted frontier model, sitting just behind Gemini 3.1 Pro and GPT 5.4 on Artificial Analysis rankings, with 16 native tools including visual grounding with pixel-level precision.

tools
industry

A stack of books sitting on top of a table

10 April 2026

Agents That Read Papers Before Writing Code Find Optimisations That Code-Only Agents Miss

A research-driven agent that reads arXiv papers and competing project codebases before optimising llama.cpp achieved a 15.1% performance gain on x86 CPU inference for $29 in compute and API costs.

tools
research
architecture

Mathematical equations written on a white page

10 April 2026

Calibrated Uncertainty Scores for LLMs Without Access to Model Internals

SELFDOUBT estimates how confident a reasoning model is in its own output using only the generated text, with no model internals or fine-tuning required — making it compatible with any API.

research
tools

Black CCTV security camera mounted on a wall

10 April 2026

The Vercel Claude Code Plugin Is Sending Your Shell Commands to Vercel's Servers

The official Vercel plugin for Claude Code collects full bash command strings by default and full prompt text with opt-in, using misleading consent language and no visible third-party indicator.

tools
infrastructure

Abstract visualisation of connected cloud infrastructure nodes

9 April 2026

Anthropic Ships a Managed Platform for Production Agents

Anthropic's Managed Agents API handles sandboxed execution, persistent state, long-running sessions, and multi-agent coordination so teams don't have to build that infrastructure themselves.

tools
architecture

Multiple monitors showing code in a developer workspace

9 April 2026

DHH Barely Writes Code by Hand Anymore

Six months after saying he didn't use AI for coding, DHH runs multiple agents simultaneously and likens it to wearing a mech suit.

tools
industry

9 April 2026

Training a 100B Model on One GPU Is Now Possible

MegaTrain trains 100B+ parameter models on a single GPU by treating the GPU as a transient compute engine and storing model state in CPU host memory.

research
infrastructure

A security and privacy dashboard showing system status indicators

8 April 2026

Claude Mythos is scanning critical open source software for zero-days

Anthropic launched Project Glasswing, using Claude Mythos Preview to scan foundational open source software for zero-day vulnerabilities, already finding thousands of high-severity flaws in the Linux kernel, OpenBSD, and FFmpeg.

tools
industry
infrastructure

Red padlock on a black computer keyboard

8 April 2026

Combining attack techniques jumps AI safety failures from 14% to 71%

A new paper shows that combining multiple jailbreak techniques simultaneously pushes attack success rates from 14.3% to 71.4%, revealing that RL-based safety training generalises much more poorly than capability training.

research
industry

A group of colourful geometric cubes arranged in a pattern

8 April 2026

Google open-sourced a hypervisor for running multiple AI agents in isolated containers

Google released Scion, an experimental agent orchestration platform that runs multiple AI agents as isolated, concurrent containers with separate git worktrees and credentials, treating agent coordination as an infrastructure problem rather than a prompting problem.

tools
architecture
infrastructure

A robot figure with a glowing light saber against a dark background

8 April 2026

LangChain's async subagents let orchestrators delegate work without blocking

Deep Agents v0.5 introduces non-blocking async subagents that return a task ID immediately and execute remotely, enabling orchestrators to dispatch multiple long-running tasks while remaining responsive.

tools
architecture

Men observe automated conveyor belt system in warehouse

7 April 2026

When AI agents do the shopping, your marketing copy becomes invisible

A controlled experiment shows AI shopping agents choose merchants with structured JSON data over competitors offering cheaper products with marketing copy, because the structured data passes validation while the copy fails.

industry
architecture

Security camera stencil with text on wall

7 April 2026

Most AI agents will cover up evidence when their employer tells them to

Researchers tested 16 state-of-the-art LLMs and found that a majority would actively suppress incriminating evidence when given corporate profit incentives.

research
industry

Colourful code scrolls across a dark background

7 April 2026

AI just went grandmaster at competitive programming — and the algorithm might matter more than the result

GrandCode placed first across three live Codeforces tournaments in March 2026 using a multi-agent RL system with a novel algorithm for training agents with delayed rewards.

research
tools

Pink padlock against a light background representing cryptographic security

7 April 2026

The window to migrate off current cryptography is closing faster than most engineers realise

Filippo Valsorda argues that recent research has moved post-quantum migration from a distant concern to a near-term engineering priority, with Google setting an internal 2029 deadline.

infrastructure
industry

A path leading through tall trees into misty forest light

6 April 2026

The real AI risk isn't hallucinations. It's forgetting how to think.

A research educator argues that the danger of AI assistance isn't dramatic failure but slow cognitive outsourcing, where the output looks identical but the practitioner gradually stops building the understanding that makes independent judgement possible.

industry
research

A person cutting a piece of wood, focused on the craft

6 April 2026

AI is great at implementation. It is terrible at design.

Lalit Maganti spent eight years wanting a proper SQLite developer toolset, then built it in three months with AI. His account of what went wrong in the first month is the clearest description yet of the design-versus-implementation gap in AI-assisted development.

tools
industry

Black laptop computer displaying a blue terminal screen

6 April 2026

LM Studio now runs as a headless server with an Anthropic-compatible API

LM Studio 0.4.0 extracts the inference engine into a standalone headless daemon with a full CLI and an Anthropic-compatible endpoint, meaning you can point Claude Code at a local model by setting two environment variables.

tools
infrastructure

A stylised illustration of a brain positioned over a CPU chip, representing AI computation

6 April 2026

Training a coding agent end-to-end costs $200 on TPUs

Nanocode demonstrates training a 1.3B parameter Claude Code-style coding agent from scratch (pretraining, supervised fine-tuning, and preference optimisation) on a TPU v6e-8 for around $200 in under nine hours.

research
infrastructure

A sculptor's hands shaping a human face from clay in an art studio

5 April 2026

When you can ship a rebuild in a weekend, product conviction breaks

Tim O'Reilly profiles Harper Reed's argument that AI-speed iteration cycles destroy the feedback loops through which product teams build conviction, requiring new frameworks for decision-making under permanent optionality.

industry
tools

Close-up of a Thunderbolt 3 cable and port

5 April 2026

Nvidia GPUs now officially work on Apple Silicon Macs

Tiny Corp's TinyGPU DriverKit extension, now officially signed by Apple, brings Nvidia Ampere and AMD RDNA3 eGPU support to Apple Silicon Macs without requiring SIP bypass, giving Nvidia hardware an officially sanctioned path on Apple Silicon for the first time.

infrastructure
tools

A hand squeezing an orange, juice running between the fingers

5 April 2026

LoRA has been adapting the wrong part of the weight matrix

Minor Component Adaptation targets low-variance singular subspaces rather than dominant ones, achieving up to 5.9x more knowledge acquisition than LoRA using a fraction of the parameters.

research
tools

Palm trees reflected in a mirror, duplicated and inverted

5 April 2026

A model can teach itself to write better code

Sampling a model's own outputs at varied temperatures and fine-tuning on them pushes pass@1 on LiveCodeBench from 42% to 55%, with no teacher model, no RL, and no verifier required.

research
tools

Aerial view of a garden hedge maze at Villa Pamphilj

4 April 2026

AI is turning developers into Winchester Mystery House builders

Drew Breunig argues that cheap AI code generation is producing a third model of software development: sprawling, idiosyncratic personal tools built for the builder's own enjoyment, not for distribution.

industry
architecture

Abstract digital security concept showing code and lock icons

4 April 2026

The Axios supply chain attack was a fake company, a Teams call, and a RAT

Attackers compromised the Axios npm package by impersonating a real company, scheduling a fake Teams meeting, and tricking the maintainer into installing a Remote Access Trojan.

infrastructure
tools

Two developers inspecting code together at a desk

4 April 2026

Gemma 4 is out, and benchmarks are the least interesting part

Nathan Lambert argues that Gemma 4's success will hinge on licensing, tooling maturity, and fine-tunability, not benchmark scores, and identifies five factors that actually determine whether an open model gets adopted.

tools
research
industry

Server rack with network cables and blinking lights

4 April 2026

New Rowhammer attack gives full control of machines running Nvidia GPUs

A Rowhammer variant exploits GPU memory access patterns to flip bits in DRAM, giving attackers complete control of machines with Nvidia GPUs in shared environments.

infrastructure
research

4 April 2026

The toolkit pattern: writing docs for AI, not just humans

O'Reilly describes a documentation pattern where projects structure docs around intent, letting AI generate valid configuration from plain-English descriptions.

tools
architecture

Maze viewed from above representing problem-solving shortcuts

4 April 2026

Catching reward hacking by looking inside the model, not at its outputs

Researchers use representation engineering to detect when RL-trained models learn shortcuts that satisfy reward signals without solving the actual problem.

research

Abstract network of connected dots representing neural reasoning

4 April 2026

Reasoning models decide before they reason

New evidence that reasoning models encode their action choices before chain-of-thought deliberation begins, which changes how you should read CoT outputs.

research

4 April 2026

When inference is cheap, you should overtrain your models

New scaling laws show that when you account for inference-time sampling, the optimal pretraining regime shifts radically toward overtraining, overturning conventional Chinchilla-style guidance.

research

3 April 2026

A breach at one AI data vendor may have exposed secrets from every major AI lab simultaneously

A supply chain attack on LiteLLM compromised Mercor, a $10B AI training data contractor serving OpenAI, Anthropic, and Meta, potentially exposing training datasets and proprietary pipeline details across the industry.

industry
infrastructure

A robotic torso with exposed internal components and arms

3 April 2026

How to build an agent that fixes its own production bugs after deployment

A concrete architecture for self-healing agent deployments: detect regressions using Poisson distribution testing against a 7-day error baseline, triage with a causal link requirement, and auto-open a PR via a coding agent.

tools
architecture

Abstract representation of AI processing

2 April 2026

Claude now supports tool use in streaming mode

Claude's API now supports tool calls mid-stream, letting agents act without waiting for a full response.

tools

Network diagram with nodes and connections

2 April 2026

Open-weight models are now within striking distance of frontier APIs for agentic workloads

Benchmarking open-weight models against Claude Opus 4.6 on 138 agentic tasks shows a 4–11 percentage point gap, with open models running at 5–20x lower cost and 2–4x lower latency.

research
industry
tools

1 April 2026

Gradio's backend is now separable from its UI

Gradio Server separates the queuing engine, GPU management, and MCP support from Gradio's UI system, letting you build any custom frontend while keeping the backend infrastructure that makes GPU serving production-ready.

tools
infrastructure

1 April 2026

DuckDB gets native vector similarity search

DuckDB now has native vector similarity search, which means you can do RAG-style retrieval in a single embedded database.

tools
research

1 April 2026

Holo3 hits 78% on desktop computer use and the training method is more interesting than the score

H Company's Holo3-35B hits 78.85% on OSWorld-Verified, a new state of the art for desktop computer use, using a synthetic training flywheel that generates novel environments rather than relying on collected demonstrations.

research
tools

Transparent device with wifi symbol on screen, pentesting hardware

31 March 2026

AWS just turned penetration testing into an on-demand API call

AWS Security Agent and DevOps Agent hit general availability, compressing penetration testing timelines from weeks to hours and incident resolution from two hours to 28 minutes in early customer results.

tools
infrastructure
industry

A person's head with a circuit board in front of it

31 March 2026

The most-used post-training library just hit v1.0, and the design choices are worth understanding

TRL, the post-training library downloaded 3 million times a month, hits v1.0 with a deliberate stability model and a design philosophy built around the short half-life of post-training assumptions.

tools
research