AI Digest

Curated digest

Friday, May 29, 2026 · weekly-deep · deep · 11,489 tokens

🔥 TOP: what you absolutely have to see

  • Claude Opus 4.8 is now available: a modest but tangible improvement, with an emphasis on honesty. You can use it starting today, both on anthropic.com and via llm-anthropic 0.25.1 (which supports claude-opus-4.8 and the new -o fast 1 option for fast mode, for organizations with that feature enabled on their account). link
  • Anthropic raises $65B in a Series H at a $965B post-money valuation: run-rate revenue (an annualized figure) has already crossed $47B. A sign that the enterprise market is adopting Claude in earnest. link
  • Superpowers: a complete methodology for coding agents that works. It defines composable skills and a workflow (spec → approval → code) that you can apply directly with Claude Code, Codex CLI, Cursor, and others. If you work with coding agents, this is essential. link
  • Harness: a meta-skill that designs teams of specialized agents for Claude Code. You describe the domain and it generates agent definitions and skills in .claude/agents/ and .claude/skills/. Ideal if you're already experimenting with multi-agent setups in Claude Code. link
  • Free Week of Claude Code: a free week to try it if you haven't yet. link
  • SQLite added an AGENTS.md: a file in the official repository, apparently aimed at people pointing AI agents at the SQLite codebase rather than at SQLite's own development. It states plainly that the project does not accept agentic code, though the human developers will review a concise, well-written pull request as a proof of concept before reimplementing the changes themselves. Indirectly, it's a manifesto for how coding agents should operate in serious projects. link
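
For the Opus 4.8 item above, here is a minimal sketch of invoking the model through the llm CLI from Python. The model ID claude-opus-4.8 and the -o fast 1 option come from the llm-anthropic 0.25.1 release notes; the helper name build_llm_command and the sample prompt are hypothetical, and the sketch assumes the llm CLI and the llm-anthropic plugin are installed with an Anthropic key configured. It only builds and prints the command, so nothing is sent until you choose to run it.

```python
import shlex


def build_llm_command(prompt: str, model: str = "claude-opus-4.8", fast: bool = False) -> list[str]:
    """Build an argv list for the `llm` CLI without executing it (hypothetical helper)."""
    argv = ["llm", "-m", model]
    if fast:
        # `-o fast 1` is the option named in the llm-anthropic 0.25.1 release notes.
        # Assumption: your organization has fast mode enabled on its account.
        argv += ["-o", "fast", "1"]
    argv.append(prompt)
    return argv


cmd = build_llm_command("Summarize the Opus 4.8 release notes", fast=True)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

To actually send the prompt, pass the list to subprocess.run(cmd, check=True). Leave fast=False unless fast mode is enabled for your organization.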

📦 Claude / Anthropic ecosystem

  • Anthropic opens a Milan office: aimed at the Italian enterprise market. link
  • Simon Willison analyzes Anthropic's run-rate: the $47B annualized figure in detail and what it means for the enterprise market. link
  • Simon Willison says Anthropic and OpenAI have found product-market fit: he points to stories of companies surprised by the size of their LLM bills because their staff use the tools so heavily. link
  • Zot now supports Claude Opus 4.8: a terminal client for Claude. link
  • AINews: full coverage of the week, including the Series H, Opus 4.8, Dynamic Workflows, and ultracode. link
  • Cognition raises $1B in a Series D at a $26B valuation: a sign that the coding-agent market is huge and still growing. link
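
Run-rate revenue, as Simon Willison's post describes it, is an annualized projection, typically the most recent month multiplied by 12. The sketch below applies that convention to the figures quoted in this digest ($47B run-rate, $965B post-money valuation). The function names are hypothetical, and the valuation-to-run-rate multiple is a back-of-the-envelope figure derived here, not something Anthropic reported.

```python
def run_rate(latest_month_revenue: float) -> float:
    """Annualize one month of revenue using the 'multiply by 12' convention."""
    return latest_month_revenue * 12


def implied_monthly(run_rate_revenue: float) -> float:
    """Invert the convention: the monthly revenue implied by a run-rate figure."""
    return run_rate_revenue / 12


RUN_RATE = 47e9    # "crossed $47 billion", per the Series H announcement
VALUATION = 965e9  # post-money valuation from the same announcement

monthly = implied_monthly(RUN_RATE)
multiple = VALUATION / RUN_RATE  # derived here; not a reported metric

print(f"implied monthly revenue: ${monthly / 1e9:.2f}B")
print(f"valuation / run-rate:    {multiple:.1f}x")
```

Because the figure extrapolates a single month, it says nothing about seasonality or churn, so treat it as a snapshot rather than a forecast.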

🛠️ Dev tools & coding

  • The VibeSec Reckoning, from Thoughtworks: how to handle security when you're vibe coding. Write a security context file, be cautious with AI permission requests, keep a daily security intelligence feed, and give builders a secure-by-default harness and templates. A practical post from people who lived it. link
  • CodePulse: a token-efficient codebase indexer. Built for AI coding tools, it aims to minimize the tokens consumed when indexing. link
  • Playwright-MCP: lets AI agents drive Playwright tests. An MCP integration so coding agents can run and manage end-to-end tests automatically. link
  • Microsoft MarkItDown: a document-to-Markdown converter for LLMs. An official Microsoft project, useful if you're building pipelines that process PDFs, Office files, and the like before sending them to Claude. link
  • Lithium Core: an open-source toolkit for AI memory that scales, for multi-agent systems with persistent context. link
  • Build with Claude Code: new cohort (May 28-29). A hands-on ByteByteGo course on building with Claude Code. link
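
For the MarkItDown item above, here is a minimal sketch of the "convert before sending to an LLM" step. It assumes the markitdown package is installed and uses convert_local(), one of the narrower convert_* functions that the project's README recommends for untrusted inputs. The helper name to_markdown and the converter parameter are hypothetical, added so the function can be exercised with a stand-in object.

```python
from typing import Any, Optional


def to_markdown(path: str, converter: Optional[Any] = None) -> str:
    """Convert a local document to Markdown text (hypothetical helper).

    `converter` is any object exposing convert_local(path) whose result has
    a `.text_content` attribute; by default a MarkItDown instance is used.
    """
    if converter is None:
        # Imported lazily so this module still loads without markitdown installed.
        from markitdown import MarkItDown

        converter = MarkItDown()
    # convert_local() only reads local files, which keeps the I/O surface narrow.
    result = converter.convert_local(path)
    return result.text_content
```

Calling to_markdown("report.pdf") would return the document as Markdown text, ready to include in a prompt; the file name is only an example. MarkItDown performs I/O with the privileges of the current process, so sanitize paths that come from untrusted sources.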

🏗️ Software engineering

  • Cloudflare: how they built Town Lake (a unified data platform) and Skipper (an internal AI agent). A huge post covering the real architecture, the design decisions, and how an AI agent operates on top of real production data. A must-read if you're interested in big-tech system design plus applied AI. link
  • ByteByteGo: Failure Modes in distributed systems. A review of the most significant failure modes and the standard approaches to each. Solid material to keep fresh. link
  • ByteByteGo: how Airtable built the search layer behind its AI features. Real architecture, tradeoffs, and how they integrate semantic search with traditional structures. link
  • ByteByteGo: how CockroachDB built vector indexing at scale. A real implementation of vector indexes in a distributed database. link
  • ByteByteGo: how Vercel cut build wait times from 90 seconds to 5. Concrete optimizations in its build pipeline. link
  • ByteByteGo: RAGs vs Agents. When to use each pattern and how they complement each other. link
  • Martin Fowler: Fragments, May 27. Kent Beck and Fowler talk about LLM-augmented programming and what junior developers should focus on, plus a pointer to Ian Johnson's series on restructuring a legacy codebase. link

📚 Worth reading

  • The Age of Async Agents, with Cognition (Devin) and OpenInspect: an interview covering 80% Devin commits, spec-to-PR workflows, full VMs, and agent memory. link
  • When AI Starts Writing Systems Code: a post on what happens when LLMs start generating systems code (drivers, kernels, and so on). link
  • Adopt ≠ Adapt: a longitudinal analysis of LLM conversations. A paper that studies ~12k Bing Copilot users and finds that individual usage habits are overwhelmingly sticky. link
  • Frontier LLM agents can overcome the ontology curation bottleneck: a paper that revisits a gold-standard phenotype-annotation benchmark, previously used to compare human curators with an NLP tool, using five frontier LLM-based agents. link
  • Review Arcade: human alignment of LLM-generated reviews. An empirical study showing that alignment with human reviews is limited overall and varies substantially across prompts and models. link

💤 Skippable, but good to know

  • MoneyPrinterTurbo: a tool for automatically generating short videos with AI. More for a SaaS side project than for day-to-day work. link
  • datasette 1.0a31: now supports write queries and stored queries. Relevant if you use Datasette to explore data. link
  • markdown-svg-renderer by Simon Willison: renders fenced SVG code blocks in Markdown as images, with a tab to switch to the code view. Handy for technical documentation. link
  • Iran's Internet partially restored: data from Cloudflare Radar; more relevant as global context than for your day-to-day work. link

Fetched articles (49)

  • Introducing Claude Opus 4.8
    anthropic-news· 28-may

    May 28, 2026 · Product

  • Anthropic raises $65B in Series H funding at $965B post-money valuation
    anthropic-news· 28-may

    May 28, 2026 · Announcements

  • Anthropic opens Milan office to support Italian enterprise, research, and developers
    anthropic-news· 27-may

    May 27, 2026 · Announcements

  • Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes
    arxiv-ai· 29-may

    arXiv:2605.28965v1 Announce Type: new Abstract: Linking free-text phenotype descriptions to ontology terms, typically referred to as phenotype annotation, is essential for the cross-study integration of comparative morphological data. This labor intensive process has heavily relied on highly trained human experts, which makes it challenging to scale and thus a key bottleneck. Dahdul et al. (2018) established a Gold Standard (GS) of Entity-Quality (EQ) annotations across seven phylogenetic studies and used it to evaluate three human curators and the Semantic CharaParser NLP tool with ontology-based semantic similarity metrics; they reported that machine-human consistency was significantly lower than inter-curator (human-human) consistency. Here we revisit that benchmark with five frontier …

  • VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis
    arxiv-ai· 29-may

    arXiv:2605.28978v1 Announce Type: new Abstract: Finite Element Analysis (FEA) serves as the cornerstone of modern engineering design. However, its workflow is inherently complex and relies heavily on domain expertise. Although recent efforts have integrated Large Language Models (LLMs) into FEA, existing approaches face limitations in handling multimodal inputs and executing complex tasks. To address these limitations, we propose VFEAgent, an end-to-end multi-agent system designed to automate FEA modeling and simulation directly from input images and problem descriptions. Our methodology integrates two core components: (1) a multimodal vision-language multi-agent pipeline that employs ReAct-driven reasoning to extract structured FEA specifications from heterogeneous inputs and (2) a verif…

  • BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation
    arxiv-ai· 29-may

    arXiv:2605.28994v1 Announce Type: new Abstract: AI tools to support real world decision making must be able to build simulation models that inform their recommendations and render them interpretable. Tools that can automate aspects of modeling practice must complement human expertise, not replace it. The BEAMS Initiative aims to guide the development of AI tools for modeling and simulation toward forms that are responsible and ethical by establishing benchmarks for human centered modeling and simulation practices. The initiative uses open digital and organizational infrastructure to collaboratively evaluate AI tools for modeling and simulation. The open source sd ai project hosted by the initiative establishes transparency and enables contributions to be shared broadly. A steering group f…

  • Adopt $\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild
    arxiv-ai· 29-may

    arXiv:2605.29018v1 Announce Type: new Abstract: Although a growing body of research has begun to describe user--LLM interactions, the picture it paints is largely static; little is known about how individual users change their behavior over time. To address this gap, we analyze the conversational trajectories of $\sim$12,000 randomly sampled Microsoft Bing Copilot users and compare these with data from WildChat-4.8M. While the Copilot data contains significant population-level trends, we find that trends in individual user trajectories are much weaker; user habits prove to be overwhelmingly sticky. We also find stark differences between users of different activity levels: more active users have more successful conversations and use the LLM for more complex and professionally oriented task…

  • Orthogonal Concept Erasure for Diffusion Models
    arxiv-ai· 29-may

    arXiv:2605.28902v1 Announce Type: new Abstract: Concept erasure has emerged as a promising approach to mitigate undesired or unsafe content in diffusion models, yet existing methods still face significant limitations. While training-based methods are effective, their high computational cost limits scalability. Editing-based methods are more efficient and deployment-friendly, yet they struggle to simultaneously achieve precise concept erasure and preserve overall generative capacity. We identify this core limitation of the editing-based methods as reliance on additive parameter updates. Our empirical analysis reveals that concept semantics primarily depend on neuron direction rather than neuron magnitude, while overall generative capacity relies on the angular geometry of neurons. As addit…

  • Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction
    arxiv-ai· 29-may

    arXiv:2605.28849v1 Announce Type: new Abstract: Gradient temporal-difference methods provide stable off-policy prediction with linear function approximation, but their practical performance is strongly affected by the geometry induced by the auxiliary-variable metric. Existing Mirror-Prox TD methods typically use the feature covariance metric, whereas hybrid TD methods suggest that behavior-policy transition information can provide a more informative update geometry. This paper proposes a behavior-induced Mirror-Prox temporal-difference method, called STHTD-MP, which replaces the covariance metric in the primal-dual saddle-point formulation with the symmetric part of the behavior-policy Bellman matrix. The method keeps a single learning rate for the primal and auxiliary variables and appl…

  • Review Arcade: On the Human Alignment and Gameability of LLM Reviews
    arxiv-ai· 29-may

    arXiv:2605.28897v1 Announce Type: new Abstract: LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume that not only reviewers are using LLM-assistance, but also that authors use LLMs to revise their papers before submitting. In this work, we perform empirical experiments on papers from the 2025 ACL Rolling Review (ARR) to evaluate LLM reviews from both the author and the reviewer perspective. First, we identify a limited alignment of LLM reviews with human ones. In the best-case scenario, the alignment is reasonable. However, we also find that LLM-human alignment varies substantially across prompts and models. Finally, we investigate the scenario in which the author uses an iterative draf…

  • Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction
    arxiv-ai· 29-may

    arXiv:2605.28855v1 Announce Type: new Abstract: Temporal-difference learning with function approximation can be unstable under off-policy sampling. TDC stabilizes off-policy TD through an auxiliary covariance correction, and TDRC further regularizes this correction in a single-timescale recursion. This paper studies a behavior-aware replacement of the auxiliary covariance geometry in the linear prediction setting, which is the standard local model for understanding the feature-space dynamics of value-function approximation. We first replace the TDC auxiliary matrix $C$ by the behavior Bellman matrix $A_\mu$, yielding BA-TDC, and then regularize the same behavior-aware equation to obtain BA-TDRC. This two-step construction separates the contribution of behavior-aware geometry from the cont…

  • The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling
    arxiv-ai· 29-may

    arXiv:2605.28864v1 Announce Type: new Abstract: The Cognitive Categorical Transformer (CCT) is a 306M-parameter architecture that augments a pretrained GPT-2 Small backbone with cognitively grounded components derived from category theory and several inspirations from cognitive science. Under a matched-step protocol (215,000 optimizer steps, matched data, matched optimizer and schedule) on WikiText-103, CCT reaches 21.27 validation perplexity, compared with 24.19 for an identically fine-tuned GPT-2 Small baseline. The architecture therefore contributes a 2.92 PPL (12% relative) reduction beyond what in-domain fine-tuning alone provides. A retrain-from-scratch ablation that holds GT-Full simplicial message passing bypassed across the entire seven-phase activation schedule reaches 23.72 PPL…

  • Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems
    arxiv-ai· 29-may

    arXiv:2605.28883v1 Announce Type: new Abstract: Tropical forests worldwide are under intense deforestation pressure driven by economic and political interests, and scientific evidence suggests this deforestation contributes to climate change. This paper proposes a novel logging method for tropical forests, Ultra-Reduced-Impact-Encased-Logging (URIEL). This new method is based on heli-logging techniques combined with intensive use of robotics and AI integrated with post-harvest silvicultural treatments performed by drones. The concept of appropriate equipment for this method was developed, dimensions were determined, details were completed in a digital proof of concept, and an effective digital simulation and economic feasibility analysis were carried out for various helicopter-timber-dist…

  • How CockroachDB Built Vector Indexing at Scale
    bytebytego· 25-may

    In this article, we will look at how the CockroachDB engineering team built this index and the challenges they faced.

  • How Vercel Cut Build Wait Times From 90 Seconds To 5
    bytebytego· 26-may

    In this article, we examine the constraints Vercel faced, the choices they made in response, and the optimizations that produced the speedup.

  • Must-Know Failure Modes in Distributed Systems
    bytebytego· 28-may

    In this article, we will look at the most significant failure mode patterns in distributed systems and the standard approaches to deal with each of…

  • EP216: RAGs vs Agents
    bytebytego· 23-may

    Ask an LLM about your company's data and it will guess. The two patterns that fix this are RAG and agents, and they solve different problems.

  • How Airtable Built the Search Layer Behind Their AI Features
    bytebytego· 27-may

    In this article, we will look at how Airtable’s data infrastructure team built its architecture, the challenges they faced, the tradeoffs they accepted…

  • Build with Claude Code: New Cohort Launch
    bytebytego· 22-may

    The first cohort starts in about a week: May 28-29, 2026.

  • Iran's Internet is partially restored, Cloudflare Radar data shows
    cloudflare· 27-may

    Cloudflare Radar data confirms early indications of a partial Internet restoration in Iran, nearly three months after the shutdown began. Traffic spikes and DNS queries have risen, but network activity is currently just 40% of pre-shutdown levels.

  • How we built Cloudflare's data platform and an AI agent on top of it
    cloudflare· 28-may

    Here’s how we built Town Lake, Cloudflare's unified analytics platform, alongside Skipper, an internal AI agent running on top of it.

  • obra/superpowers
    github-trending

    An agentic skills framework & software development methodology that works. Superpowers Superpowers is a complete software development methodology for your coding agents, built on top of a set of composable skills and some initial instructions that make sure your agent uses them. Quickstart Give your agent Superpowers: Claude Code, Codex CLI, Codex App, Factory Droid, Gemini CLI, OpenCode, Cursor, GitHub Copilot CLI. How it works It starts from the moment you fire up your coding agent. As soon as it sees that you're building something, it doesn't just jump into trying to write code. Instead, it steps back and asks you what you're really trying to do. Once it's teased a spec out of the conversation, it shows it to you in chunks short enough to actually read and digest. After you've signed o…

  • byoungd/English-level-up-tips
    github-trending

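    An advanced guide to learn English which might benefit you a lot 🎉. An outrageous English learning guide / English learning tutorial / English learning / learn English. 简体中文 | English. Dedicated to my former beloved, W. Each of us lives in our own past; people take a minute to get to know someone, an hour to grow fond of someone, a day to fall in love with someone, and in the end a lifetime to forget someone. Project introduction: An advanced guide to learn English which might benefit you a lot. An outrageous English learning guide and tutorial. Recommended resource: ku0.com - 库. If, while using the AI learning plans in this guide, you need more stable and trustworthy AI accounts and API resources, take a look at our product: ku0.com - 库. ku0.com is a trusted AI resource library offering one-stop access to ChatGPT, Claude, and Gemini account top-ups, ready-made accounts, and account-pool resources. We use token quality checks and a unified gateway to filter out unstable, diluted, or impostor relay services, and rely on trusted account resources, quality reports, and access records to help you lower your AI usage costs and procurement risk. Background: Hello, friend, and welcome to the outrageous English learning guide. As your eyes meet these words, I sincerely hope this will be not just an arduous campaign to conquer English but a wonderful adventure that opens the door to wisdom. May this small page become the strings on which our hearts resonate, playing the heavenly music of language learning. Back in early July 2017, W., my crush who was preparing for the TOEFL, asked me a question: how do you learn English efficiently? While thinking about how to answer, I recalled passing 26 courses in a single semester of my senior year (19 of them retakes, 7 from the current semester), and the fact that I was lucky enough to place first in the province in both English and Chinese on the gaokao (Jiangsu paper), so perhaps I am barely qualified to offer a few tips on efficient study, as a modest starting point. Talking with her…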

  • harry0703/MoneyPrinterTurbo
    github-trending

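    Generate high-definition short videos with one click using large AI models. MoneyPrinterTurbo 💸 简体中文 | English. Given just a video topic or keyword, it fully automatically generates the video script, footage, subtitles, and background music, then composes them into a high-definition short video. Web interface. API interface. Features 🎯: a complete MVC architecture with a clear code structure that is easy to maintain, supporting both an API and a web interface; AI-generated video scripts, or your own custom script; multiple HD video sizes (portrait 9:16, 1080x1920; landscape 16:9, 1920x1080); batch video generation, so you can generate several videos at once and pick the one you like best; configurable clip duration, to adjust how often the footage switches; Chinese and English video scripts; multiple voice synthesis options with real-time preview; subtitle generation with adjustable font, position, color, and size, plus subtitle outline settings; background music, random or from a specified file, with adjustable volume; high-definition, royalty-free video footage sources, or your own local footage; integration with many models, including OpenAI, Moonshot, Azure, gpt4free, one-api, 通义千问 (Tongyi Qianwen), Google Gemini, Ollama, DeepSeek, MiniMax, 文心一言 (ERNIE Bot), Pollinations, and ModelScope. Users in China are advised to use DeepSeek or Moonshot as the LLM provider (directly accessible in China without a VPN; the free credits granted at sign-up are generally enough). Video demos 📺: portrait 9:16 ▶️ "How to add more fun to life" ▶️ "The role of money"; a more realistic synthesized voice ▶️ "What is the meaning of life"; landscape 16:9 ▶️ "What is the meaning of life" ▶️ "Why exercise". System requirements 📦: recommended system: Windows 10…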

  • microsoft/markitdown
    github-trending

    Python tool for converting files and office documents to Markdown. MarkItDown Important MarkItDown performs I/O with the privileges of the current process. Like open() or requests.get(), it will access resources that the process itself can access. Sanitize your inputs in untrusted environments, and call the narrowest convert_* function needed for your use case (e.g., convert_stream(), or convert_local()). See the Security Considerations section of the documentation for more information. MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines. To this end, it is most comparable to textract, but with a focus on preserving important document structure and content as Markdown (including: headings, lists, tables,…

  • revfactory/harness
    github-trending

    A meta-skill that designs domain-specific agent teams, defines specialized agents, and generates the skills they use. Harness — The Team-Architecture Factory for Claude Code English | 한국어 | 日本語 Harness is a team-architecture factory for Claude Code. Say "build a harness for this project" (English) or "하네스 구성해줘" (한국어) or "ハーネスを構成して" (日本語), and the plugin turns your domain description into an agent team and the skills they use — picked from six pre-defined team-architecture patterns. Overview Harness leverages Claude Code's agent team system to decompose complex tasks into coordinated teams of specialized agents. Say "build a harness for this project" and it automatically generates agent definitions (.claude/agents/) and skills (.claude/skills/) tailored to your domain. Category — Where Har…

  • Zot now supports Claude Opus 4.8
    hn-ai· 29-may

    Article URL: https://www.zot.sh Comments URL: https://news.ycombinator.com/item?id=48319524 Points: 9 # Comments: 0

  • Free Week of Claude Code
    hn-ai· 29-may

    Article URL: https://claude.ai/referral/pIpeQjEpEw Comments URL: https://news.ycombinator.com/item?id=48319662 Points: 1 # Comments: 0

  • Show HN: Open-source toolkit for AI memory that scales
    hn-ai· 29-may

    Article URL: https://github.com/0xJaksun/lithium-core Comments URL: https://news.ycombinator.com/item?id=48319144 Points: 1 # Comments: 0

  • Playwright-MCP – Let AI agents run and manage Playwright tests
    hn-ai· 29-may

    Article URL: https://github.com/Bairinikhi1/playwright-mcp Comments URL: https://news.ycombinator.com/item?id=48319131 Points: 1 # Comments: 0

  • CodePulse – token-efficient codebase indexer for AI coding tools
    hn-ai· 29-may

    Article URL: https://github.com/leogong99/codepulse Comments URL: https://news.ycombinator.com/item?id=48319172 Points: 2 # Comments: 0

  • Evidence that the first papal encyclical on AI was substantially written by AI
    hn-ai· 29-may

    Article URL: https://linch.substack.com/p/claude-author-of-the-humanitas Comments URL: https://news.ycombinator.com/item?id=48319229 Points: 1 # Comments: 0

  • Conversational LLM Client Made in Tkinter
    hn-ai· 29-may

    Article URL: https://meltdown.merkoba.com/index.html Comments URL: https://news.ycombinator.com/item?id=48319512 Points: 1 # Comments: 0

  • Funny but serious, Chieng issues an AI warning to grads
    hn-ai· 29-may

    Article URL: https://news.harvard.edu/gazette/story/2026/05/funny-but-serious-chieng-issues-an-ai-warning-to-grads/ Comments URL: https://news.ycombinator.com/item?id=48319405 Points: 1 # Comments: 0

  • When AI Starts Writing Systems Code
    hn-ai· 29-may

    Article URL: https://www.coreauto.com/blog/when-ai-starts-writing-systems-code Comments URL: https://news.ycombinator.com/item?id=48319293 Points: 1 # Comments: 0

  • What if remote working, not AI, is to blame for weak junior hiring?
    hn-ai· 29-may

    Article URL: https://www.ft.com/content/2205e2d0-50dc-4e80-9bf7-78d0272276c0 Comments URL: https://news.ycombinator.com/item?id=48319392 Points: 1 # Comments: 1

  • [AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode
    latentspace· 29-may

    Total Anthropic victory!

  • [AINews] Cognition raises $1B in $26B Series D
    latentspace· 28-may

    coding is an uncapped TAM market

  • The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
    latentspace· 28-may

    80% Devin Commits, Spec-to-PR Workflows, Full VMs, Agent Memory, and PMs Shipping Code

  • 🔬ESM: The Bitter Lesson is Coming for Proteins - Alex Rives, BioHub
    latentspace· 27-may

    Biohub’s Protein World Model: ESMC-6B, ESMFold2, 6.8B proteins, 1.1B structures, antibody design, SAEs, & the potential for programmable biology

  • The VibeSec Reckoning
    martin-fowler· 27-may

    Vibe coding has significantly accelerated software prototyping but AI agents frequently recommend insecure configurations, creating security problems. Gautam Koul, Lucian Moss, Neil Drew-Lopez, and Daberechi Ruth Edeokoh share their experience while building applications for Thoughtworks's global marketing. They learned that to combat this we need to write a security context file to guide the AI, be cautious with AI permission requests, create a daily security intelligence feed, and provide builders with a secure-by-default harness and templates. more…

  • Fragments: May 27
    martin-fowler· 27-may

    At the GOTO Conference in Copenhagen in 2025, Kent Beck and I spent some time on stage talking and answering questions from the audience - a format I refer to as “two old geezers on a park bench”. We talk about our experiences with LLM-augmented programming (at that point - October 2025), we show our frustration that things we’ve been saying for thirty years still need to be said, we say how anything like a manifesto reunion needs to be led by a younger generation, and opine on what junior developers should be focusing on in their career. ❄ ❄ ❄ ❄ ❄ Ian Johnson has written a series of posts about restructuring a gnarly codebase The story follows a real Laravel + React codebase over ~3 months and ~258 commits from a legacy monolith with no tests to a well-structured application with automat…

  • markdown-svg-renderer
    simonw· 28-may

    Tool: markdown-svg-renderer (https://tools.simonwillison.net/markdown-svg-renderer). A slightly customized Markdown rendering tool with special treatment for fenced code SVG blocks - it both renders the image and provides a tab for switching to the code view. You can paste in Markdown or give it a URL to a CORS-enabled Markdown file or Gist. Here's an example (https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Ffea4f7546626d627862dc241a4e3a86a) where it loads a Markdown file full of LLM pelican logs for Opus 4.8 (https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-some-pelicans). Tags: svg (https://simonwillison.net/tags/svg), …

  • llm-anthropic 0.25.1
    simonw· 28-may

    Release: llm-anthropic 0.25.1 (https://github.com/simonw/llm-anthropic/releases/tag/0.25.1)
      • New model: Claude Opus 4.8 (https://www.anthropic.com/news/claude-opus-4-8), model ID claude-opus-4.8.
      • New -o fast 1 option for fast mode (https://platform.claude.com/docs/en/build-with-claude/fast-mode), for organizations with that feature enabled on their account.
      • Default max_tokens for each model now defaults to that model's maximum output rather than 8,192. #72 (https://github.com/simonw/llm-anthropic/issues/72)
    See also my notes on Opus 4.8 (https://simonwillison.net/2026/May/28/claude-opus-4-8/) - I used this new release of …

  • Anthropic's run-rate revenue hits $47 billion
    simonw· 29-may

    The most interesting thing about Anthropic's $65B Series H announcement (https://www.anthropic.com/news/series-h) is this line (emphasis mine, on the $47 billion figure): "Since our Series G in February, adoption has continued to grow across global enterprise customers, and our run-rate revenue crossed $47 billion earlier this month." Anthropic have made a bit of a habit of sharing their "run-rate revenue" in this kind of announcement, which is an annualized projection of their current revenue - typically calculated by taking the most recent month and multiplying by 12. Earlier this year:
      • Apr 6, 2026 in (https://www.anthropic.com/news/google-broadcom-partnership-compute) Anthropic expands partnership with Google and B…

  • datasette 1.0a31
    simonw· 29-may

    Release: datasette 1.0a31 (https://github.com/simonw/datasette/releases/tag/1.0a31). Another significant alpha release, with two new headline features. "Datasette now offers users with the necessary permissions the ability to both execute write queries against their database and to save stored queries (renamed from 'canned queries') both privately and for use by other members of their Datasette instance." There's more detail in SQL write queries and stored queries in Datasette 1.0a31 (https://datasette.io/blog/2026/sql-write-queries/) on the Datasette blog, which now has three posts introducing new features (https://datasette.io/blog/) since the blog launc…

  • sqlite AGENTS.md
    simonw· 27-may

    sqlite AGENTS.md (https://github.com/sqlite/sqlite/blob/master/AGENTS.md). SQLite gained an AGENTS.md file five days ago (https://github.com/sqlite/sqlite/commit/a1e5778889252d2609a59fd9b819d70392c5789e) - but it's not intended for their own development, it's presumably aimed at people who are pointing agents at the SQLite codebase. It includes: "SQLite does not accept pull requests without prior agreement and/or accompanying legal paperwork that places the pull request in the public domain. However, the human SQLite developers will review a concise and well-written pull request as a proof-of-concept prior to reimplementing the changes themselves. SQLite does not accept agentic code. However the project will accept a…

  • I think Anthropic and OpenAI have found product-market fit
    simonw· 27-may

    Anthropic are strongly rumored (https://techcrunch.com/2026/05/20/anthropic-says-its-about-to-have-its-first-profitable-quarter/) to be about to have their first profitable quarter. Stories are circulating (https://www.theinformation.com/newsletters/applied-ai/uber-cto-shows-claude-code-can-blow-ai-budgets) of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market fit.
      • Enterprise customers are now paying API prices (https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices)
      • https://simonwillison.net/2026/May/27/product-market-fit/#i-think-they-ve-found-product-m…

  • Claude Opus 4.8: "a modest but tangible improvement"
    simonw· 28-may

    Anthropic shipped Claude Opus 4.8 (https://www.anthropic.com/news/claude-opus-4-8) today. My favourite thing about it is this note in the release announcement: "Users will find Opus 4.8 to be a modest but tangible improvement on its predecessor. There’s still more to be done: we’re working on developing and releasing models that provide many of the same capabilities as Opus at a lower cost." It's so refreshing to see an AI lab honestly describe a release as a minor incremental improvement over the previous model! Honesty seems to be a theme. Here's my other favorite note from that announcement: "One of the most prominent improvements in Opus 4.8 is its honesty. We train all our models to be honest---f…