The LGTM
ai-models

Qwen-VLA Makes the Generalist-Robot Bet Look Less Theoretical

Qwen-VLA is a robotics paper, but the interesting part is not only robotics. It is another sign that the foundation-model interface is expanding from “read and write text” to “perceive, decide, and act” — with embodiment treated as context rather than a completely separate architecture. Alibaba’s Qwen team frames the
29 May 2026 3 min read
ai-models

AgentDoG 1.5 Is a Small-Model Guardrail for the Part of Agents People Keep Pretending Is Safe: the Trajectory

Most AI safety systems still inspect the part of an agent run that arrives after the damage is already done: the final answer. AgentDoG 1.5 is interesting because it moves the review point upstream, into the trajectory itself — the chain of observations, tool calls, approvals, command outputs, memory updates,
29 May 2026 3 min read
codex

Codex 0.136 Alpha Adds Image Generation, Guardian Metrics, and a Diff Security Lesson

Codex 0.136.0-alpha.1 looks like a minor alpha if you only read the release note. The note itself is almost comically blank: “Release 0.136.0-alpha.1.” The useful story is in the compare range, where OpenAI is doing the work that separates a flashy coding assistant from
29 May 2026 5 min read
claude-code

Claude API’s Mid-Conversation System Messages Are a Cost-Control Primitive, Not Just Prompt Ergonomics

Mid-conversation system messages sound like prompt ergonomics. They are not. For anyone building long-running Claude agents, they are a cost-control primitive with a security warning label attached. Anthropic’s May 28 platform release notes quietly shipped a set of API features around Opus 4.8 — refusal categories, task budgets, lower
29 May 2026 5 min read
claude-code

Claude Code’s Opus 4.8 Hotfix Is the Boring Launch Detail Teams Should Actually Read

The important part of Claude Code v2.1.156 is not that Anthropic shipped a hotfix. Hotfixes happen. The important part is what broke: Opus 4.8 thinking blocks were being modified in a way that caused API errors. That is a small release note with a large architectural smell.
29 May 2026 4 min read
qwen

Qwen Code’s May 29 Nightly Pushes the Agent Out of the Terminal and Into Team Chat

Qwen Code’s May 29 nightly is not a model launch, which is precisely why it is worth paying attention to. The release moves Alibaba’s terminal-first coding agent into Feishu/Lark team chat and adds telemetry groundwork for measuring skill-driven response-time improvements. That sounds like plumbing because it is.
28 May 2026 6 min read
openclaw

Opus 4.8 Breaks the Old Thinking Schema, and OpenClaw's Allowlist Lag Shows the Cost of Hardcoded Model Routing

The phrase “supports Claude” is no longer a useful product claim. Which Claude? Which provider wrapper? Which thinking schema? Which tool behavior? Which region? OpenClaw PR #87835 is a small fix with a large lesson: frontier-model compatibility is now capability routing, not a boolean checkbox. The immediate bug is narrow.
28 May 2026 4 min read
openclaw

OpenClaw's Prompt-Clobber Bug Shows Why System Prompt Ownership Is a Runtime Boundary

System prompts are still too often discussed like clever copywriting. In an agent runtime, they are closer to control-plane configuration: identity, policy, tool expectations, channel behavior, and the line between operator instruction and user-controlled context. That is why OpenClaw PR #87812 matters. It fixes a regression where active tool selection
28 May 2026 4 min read
openclaw

OpenClaw's Dependency Gate Treats Agent Skills Like a Supply Chain

OpenClaw PR #87791 is the kind of security work that will not trend, will not demo well, and will probably annoy contributors the first time it blocks a pull request. Good. Agent platforms need more friction in exactly these places. The PR turns dependency graph changes from a polite advisory
28 May 2026 4 min read
openclaw

OpenClaw's Billing Cooldown Fix Turns Cost Governance Into Recovery Governance

Billing failures are usually treated like accounting problems. In an agent runtime, they are scheduling problems, reliability problems, and occasionally self-inflicted outages with a receipt attached. OpenClaw PR #87694 is interesting because it fixes a bug that looks small on paper — stale provider cooldowns — but exposes a larger operational truth:
28 May 2026 4 min read
nvidia

NVIDIA’s ICRA Research Says Physical AI Is Becoming a Sim-to-Real Toolchain

Robotics has spent years producing videos that look ten years ahead of the actual deployment curve. NVIDIA’s ICRA 2026 research package is useful because it mostly avoids that trap. The interesting part is not that a robot arm grasped something or a humanoid walked somewhere. The interesting part is
28 May 2026 6 min read
azure-ai

Copilot Gets Claude Opus 4.8, and the 15X Multiplier Is the Real Governance Signal

GitHub just added Claude Opus 4.8 to Copilot, but the most important number in the announcement is not a benchmark. It is 15X. Claude Opus 4.8 is now generally available for GitHub Copilot Pro+, Business, and Enterprise users, with rollout across VS Code chat, ask, edit, and agent
28 May 2026 5 min read