The LGTM (Page 25)

NVIDIA’s New MoE Kernels Are a Reminder That AI Cost Gets Won in the Epilogue

AI cost does not only get decided in model architecture diagrams. It gets decided in the epilogue of a grouped GEMM, in whether quantization causes another memory pass, and in whether the CPU is still being asked to schedule dynamic expert work while thousands of GPUs wait politely. Not the

BioNeMo LoRA Makes Billion-Parameter Biology Models Feel Less Like a Cluster Reservation

Biology foundation models have had the same problem as every other foundation model category, only with more lab coats involved: the demos are impressive, the checkpoints are huge, and the moment a team needs to adapt one to its own assay or organism, the workflow starts looking like a cluster

Work IQ GA Turns Microsoft 365 Context Into a Metered Agent Primitive

Microsoft just put a meter on one of the most valuable ingredients in enterprise AI: the context of work itself. Work IQ, Microsoft’s API layer for grounding agents in Microsoft 365 context, reaches general availability on June 16 with consumption-based pricing through Copilot Credits, according to Microsoft Partner Center.

QwenPaw’s Agent OS Driver Turns MCP Into a Governed Capability Layer

QwenPaw’s Agent OS Driver work is a useful reminder that MCP adoption is the easy part. Letting an agent discover tools is not the same as deciding which tool it may call, under which identity, with which credentials, after which approval, and with what audit trail. The hard problem

Qwen Code Is Dogfooding Autonomous Bug Fixes Without Handing the Agent the Keys

The most credible autonomous coding feature is usually the least theatrical one. Qwen Code’s new scheduled autofix workflow does not promise to clear the backlog, replace maintainers, or make bug triage obsolete. It tries to fix at most one stale unattended bug per day, in public, with a claim,

Qwen Code 0.18.1 Makes Agent Runtime Reliability the Stable Release Story

Qwen Code 0.18.1 is not interesting because Alibaba cut another npm release. It is interesting because the stable line is starting to admit what coding agents really are: not chatbots with shell access, but small distributed systems that happen to use a model as one component. The release