- Introducing Claude Opus 4.8
anthropic-news· 28-may
May 28, 2026 · Product
- Anthropic raises $65B in Series H funding at $965B post-money valuation
anthropic-news· 28-may
May 28, 2026 · Announcements
- Anthropic opens Milan office to support Italian enterprise, research, and developers
anthropic-news· 27-may
May 27, 2026 · Announcements
- Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes
arxiv-ai· 29-may
arXiv:2605.28965v1 Announce Type: new Abstract: Linking free-text phenotype descriptions to ontology terms, typically referred to as phenotype annotation, is essential for the cross-study integration of comparative morphological data. This labor intensive process has heavily relied on highly trained human experts, which makes it challenging to scale and thus a key bottleneck. Dahdul et al. (2018) established a Gold Standard (GS) of Entity-Quality (EQ) annotations across seven phylogenetic studies and used it to evaluate three human curators and the Semantic CharaParser NLP tool with ontology-based semantic similarity metrics; they reported that machine-human consistency was significantly lower than inter-curator (human-human) consistency. Here we revisit that benchmark with five frontier …
- VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis
arxiv-ai· 29-may
arXiv:2605.28978v1 Announce Type: new Abstract: Finite Element Analysis (FEA) serves as the cornerstone of modern engineering design. However, its workflow is inherently complex and relies heavily on domain expertise. Although recent efforts have integrated Large Language Models (LLMs) into FEA, existing approaches face limitations in handling multimodal inputs and executing complex tasks. To address these limitations, we propose VFEAgent, an end-to-end multi-agent system designed to automate FEA modeling and simulation directly from input images and problem descriptions. Our methodology integrates two core components: (1) a multimodal vision-language multi-agent pipeline that employs ReAct-driven reasoning to extract structured FEA specifications from heterogeneous inputs and (2) a verif…
- BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation
arxiv-ai· 29-may
arXiv:2605.28994v1 Announce Type: new Abstract: AI tools to support real world decision making must be able to build simulation models that inform their recommendations and render them interpretable. Tools that can automate aspects of modeling practice must complement human expertise, not replace it. The BEAMS Initiative aims to guide the development of AI tools for modeling and simulation toward forms that are responsible and ethical by establishing benchmarks for human centered modeling and simulation practices. The initiative uses open digital and organizational infrastructure to collaboratively evaluate AI tools for modeling and simulation. The open source sd ai project hosted by the initiative establishes transparency and enables contributions to be shared broadly. A steering group f…
- Adopt $\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild
arxiv-ai· 29-may
arXiv:2605.29018v1 Announce Type: new Abstract: Although a growing body of research has begun to describe user--LLM interactions, the picture it paints is largely static; little is known about how individual users change their behavior over time. To address this gap, we analyze the conversational trajectories of $\sim$12,000 randomly sampled Microsoft Bing Copilot users and compare these with data from WildChat-4.8M. While the Copilot data contains significant population-level trends, we find that trends in individual user trajectories are much weaker; user habits prove to be overwhelmingly sticky. We also find stark differences between users of different activity levels: more active users have more successful conversations and use the LLM for more complex and professionally oriented task…
- Orthogonal Concept Erasure for Diffusion Models
arxiv-ai· 29-may
arXiv:2605.28902v1 Announce Type: new Abstract: Concept erasure has emerged as a promising approach to mitigate undesired or unsafe content in diffusion models, yet existing methods still face significant limitations. While training-based methods are effective, their high computational cost limits scalability. Editing-based methods are more efficient and deployment-friendly, yet they struggle to simultaneously achieve precise concept erasure and preserve overall generative capacity. We identify this core limitation of the editing-based methods as reliance on additive parameter updates. Our empirical analysis reveals that concept semantics primarily depend on neuron direction rather than neuron magnitude, while overall generative capacity relies on the angular geometry of neurons. As addit…
- Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction
arxiv-ai· 29-may
arXiv:2605.28849v1 Announce Type: new Abstract: Gradient temporal-difference methods provide stable off-policy prediction with linear function approximation, but their practical performance is strongly affected by the geometry induced by the auxiliary-variable metric. Existing Mirror-Prox TD methods typically use the feature covariance metric, whereas hybrid TD methods suggest that behavior-policy transition information can provide a more informative update geometry. This paper proposes a behavior-induced Mirror-Prox temporal-difference method, called STHTD-MP, which replaces the covariance metric in the primal-dual saddle-point formulation with the symmetric part of the behavior-policy Bellman matrix. The method keeps a single learning rate for the primal and auxiliary variables and appl…
- Review Arcade: On the Human Alignment and Gameability of LLM Reviews
arxiv-ai· 29-may
arXiv:2605.28897v1 Announce Type: new Abstract: LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume that not only reviewers are using LLM-assistance, but also that authors use LLMs to revise their papers before submitting. In this work, we perform empirical experiments on papers from the 2025 ACL Rolling Review (ARR) to evaluate LLM reviews from both the author and the reviewer perspective. First, we identify a limited alignment of LLM reviews with human ones. In the best-case scenario, the alignment is reasonable. However, we also find that LLM-human alignment varies substantially across prompts and models. Finally, we investigate the scenario in which the author uses an iterative draf…
- Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction
arxiv-ai· 29-may
arXiv:2605.28855v1 Announce Type: new Abstract: Temporal-difference learning with function approximation can be unstable under off-policy sampling. TDC stabilizes off-policy TD through an auxiliary covariance correction, and TDRC further regularizes this correction in a single-timescale recursion. This paper studies a behavior-aware replacement of the auxiliary covariance geometry in the linear prediction setting, which is the standard local model for understanding the feature-space dynamics of value-function approximation. We first replace the TDC auxiliary matrix $C$ by the behavior Bellman matrix $A_\mu$, yielding BA-TDC, and then regularize the same behavior-aware equation to obtain BA-TDRC. This two-step construction separates the contribution of behavior-aware geometry from the cont…
- The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling
arxiv-ai· 29-may
arXiv:2605.28864v1 Announce Type: new Abstract: The Cognitive Categorical Transformer (CCT) is a 306M-parameter architecture that augments a pretrained GPT-2 Small backbone with cognitively grounded components derived from category theory and several inspirations from cognitive science. Under a matched-step protocol (215,000 optimizer steps, matched data, matched optimizer and schedule) on WikiText-103, CCT reaches 21.27 validation perplexity, compared with 24.19 for an identically fine-tuned GPT-2 Small baseline. The architecture therefore contributes a 2.92 PPL (12% relative) reduction beyond what in-domain fine-tuning alone provides. A retrain-from-scratch ablation that holds GT-Full simplicial message passing bypassed across the entire seven-phase activation schedule reaches 23.72 PPL…
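The headline figures in the CCT abstract above can be checked with a few lines of arithmetic. This is a minimal sketch in Python that uses only the two perplexities quoted in the abstract; the variable names are illustrative and do not come from the paper.

```python
# Validation perplexities quoted in the CCT abstract (WikiText-103).
baseline_ppl = 24.19  # identically fine-tuned GPT-2 Small baseline
cct_ppl = 21.27       # Cognitive Categorical Transformer

# Absolute and relative reduction; these names are illustrative only.
absolute_reduction = baseline_ppl - cct_ppl
relative_reduction = absolute_reduction / baseline_ppl

print(f"absolute: {absolute_reduction:.2f} PPL")  # absolute: 2.92 PPL
print(f"relative: {relative_reduction:.1%}")      # relative: 12.1%
```

Both values agree with the rounded figures in the abstract (2.92 PPL, 12% relative).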
- Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems
arxiv-ai· 29-may
arXiv:2605.28883v1 Announce Type: new Abstract: Tropical forests worldwide are under intense deforestation pressure driven by economic and political interests, and scientific evidence suggests this deforestation contributes to climate change. This paper proposes a novel logging method for tropical forests, Ultra-Reduced-Impact-Encased-Logging (URIEL). This new method is based on heli-logging techniques combined with intensive use of robotics and AI integrated with post-harvest silvicultural treatments performed by drones. The concept of appropriate equipment for this method was developed, dimensions were determined, details were completed in a digital proof of concept, and an effective digital simulation and economic feasibility analysis were carried out for various helicopter-timber-dist…
- How CockroachDB Built Vector Indexing at Scale
bytebytego· 25-may
In this article, we will look at how the CockroachDB engineering team built this index and the challenges they faced.
- How Vercel Cut Build Wait Times From 90 Seconds To 5
bytebytego· 26-may
In this article, we examine the constraints Vercel faced, the choices they made in response, and the optimizations that produced the speedup.
- Must-Know Failure Modes in Distributed Systems
bytebytego· 28-may
In this article, we will look at the most significant failure mode patterns in distributed systems and the standard approaches to deal with each of…
- EP216: RAGs vs Agents
bytebytego· 23-may
Ask an LLM about your company's data and it will guess. The two patterns that fix this are RAG and agents, and they solve different problems.
- How Airtable Built the Search Layer Behind Their AI Features
bytebytego· 27-may
In this article, we will look at how Airtable’s data infrastructure team built its architecture, the challenges they faced, the tradeoffs they accepted…
- Build with Claude Code: New Cohort Launch
bytebytego· 22-may
The first cohort starts in about a week: May 28-29, 2026.
- Iran's Internet is partially restored, Cloudflare Radar data shows
cloudflare· 27-may
Cloudflare Radar data confirms early indications of a partial Internet restoration in Iran, nearly three months after the shutdown began. Traffic spikes and DNS queries have risen, but network activity is currently just 40% of pre-shutdown levels.
- How we built Cloudflare's data platform and an AI agent on top of it
cloudflare· 28-may
Here’s how we built Town Lake, Cloudflare's unified analytics platform, alongside Skipper, an internal AI agent running on top of it.
- obra/superpowers
github-trending
An agentic skills framework & software development methodology that works. Superpowers is a complete software development methodology for your coding agents, built on top of a set of composable skills and some initial instructions that make sure your agent uses them. Quickstart: Give your agent Superpowers: Claude Code, Codex CLI, Codex App, Factory Droid, Gemini CLI, OpenCode, Cursor, GitHub Copilot CLI. How it works: It starts from the moment you fire up your coding agent. As soon as it sees that you're building something, it doesn't just jump into trying to write code. Instead, it steps back and asks you what you're really trying to do. Once it's teased a spec out of the conversation, it shows it to you in chunks short enough to actually read and digest. After you've signed o…
- byoungd/English-level-up-tips
github-trending
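An advanced guide to learn English which might benefit you a lot 🎉. The Outrageous English Learning Guide / English learning tutorial / English learning / learn English. 简体中文 | English. Dedicated to W., once the love of my life. Each of us lives in our own past. People take a minute to get to know someone, an hour to come to like them, and a day to fall in love with them, yet in the end it takes a lifetime to forget them. Project introduction: An advanced guide to learn English which might benefit you a lot. The Outrageous English Learning Guide / English learning tutorial. Recommended resource: ku0.com - 库. If you need more stable, trustworthy AI accounts and API resources while following the AI study plans in this guide, take a look at our product: ku0.com - 库. ku0.com is a trusted AI resource library offering one-stop access to ChatGPT, Claude, and Gemini account top-ups, ready-made accounts, and account-pool resources. We use token quality checks and a unified gateway to screen out unstable, diluted, or impersonating relay services, and we use trusted account resources, quality-check reports, and access records to help you lower your AI usage costs and procurement risk. Background: Hello, friend, and welcome to the Outrageous English Learning Guide. As your eyes meet these words, I sincerely hope this will be not just an arduous campaign to conquer English but a wonderful adventure that opens the door to wisdom. May this small patch of paper and ink become the strings on which our minds resonate, playing the heavenly music of language learning. Back in early July 2017, W., the girl I adored, who was preparing for the TOEFL, asked me a question: how do you learn English efficiently? As I thought about how to answer, I recalled passing 26 courses in a single semester of my senior year (19 of them retakes, 7 from the current semester); on top of that, I had been lucky enough to take first place in the province in both English and Chinese on the gaokao (Jiangsu paper). So perhaps I am just about qualified to offer a few tips on efficient learning, in the hope that they prompt better ideas from others. Talking with her…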
- harry0703/MoneyPrinterTurbo
github-trending
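Generate high-definition short videos with one click using large AI models. MoneyPrinterTurbo 💸 简体中文 | English. Provide just a video topic or keyword, and it automatically generates the video script, footage, subtitles, and background music, then composites them into a high-definition short video. Web interface. API interface. Features 🎯: Complete MVC architecture with clearly structured, easy-to-maintain code, supporting both an API and a web interface. AI-generated video scripts, with the option to supply your own. Multiple HD video sizes: portrait 9:16 (1080x1920) and landscape 16:9 (1920x1080). Batch video generation: generate several videos at once and pick the one you like best. Configurable clip duration, for adjusting how often the footage switches. Video scripts in Chinese and English. Multiple voice-synthesis options, with real-time preview. Subtitle generation with adjustable font, position, color, and size, plus subtitle outline settings. Background music, random or from a specified file, with adjustable volume. Video footage is high-definition and copyright-free, and you can also use your own local footage. Integrates with many models, including OpenAI, Moonshot, Azure, gpt4free, one-api, Tongyi Qianwen (通义千问), Google Gemini, Ollama, DeepSeek, MiniMax, ERNIE Bot (文心一言), Pollinations, and ModelScope. Users in China are advised to use DeepSeek or Moonshot as the LLM provider (directly accessible within China without a VPN; the free credit on sign-up is generally enough). Video demos 📺: Portrait 9:16: ▶️ "How to Add More Fun to Your Life" ▶️ "The Role of Money". More realistic synthesized voice: ▶️ "What Is the Meaning of Life". Landscape 16:9: ▶️ "What Is the Meaning of Life" ▶️ "Why Exercise". Requirements 📦: Recommended system: Windows 10…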
- microsoft/markitdown
github-trending
Python tool for converting files and office documents to Markdown. Important: MarkItDown performs I/O with the privileges of the current process. Like open() or requests.get(), it will access resources that the process itself can access. Sanitize your inputs in untrusted environments, and call the narrowest convert_* function needed for your use case (e.g., convert_stream() or convert_local()). See the Security Considerations section of the documentation for more information. MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines. To this end, it is most comparable to textract, but with a focus on preserving important document structure and content as Markdown (including: headings, lists, tables,…
- revfactory/harness
github-trending
A meta-skill that designs domain-specific agent teams, defines specialized agents, and generates the skills they use. Harness — The Team-Architecture Factory for Claude Code English | 한국어 | 日本語 Harness is a team-architecture factory for Claude Code. Say "build a harness for this project" (English) or "하네스 구성해줘" (한국어) or "ハーネスを構成して" (日本語), and the plugin turns your domain description into an agent team and the skills they use — picked from six pre-defined team-architecture patterns. Overview Harness leverages Claude Code's agent team system to decompose complex tasks into coordinated teams of specialized agents. Say "build a harness for this project" and it automatically generates agent definitions (.claude/agents/) and skills (.claude/skills/) tailored to your domain. Category — Where Har…
- Zot now supports Claude Opus 4.8
hn-ai· 29-may
Article URL: https://www.zot.sh Comments URL: https://news.ycombinator.com/item?id=48319524 Points: 9 # Comments: 0
- Free Week of Claude Code
hn-ai· 29-may
Article URL: https://claude.ai/referral/pIpeQjEpEw Comments URL: https://news.ycombinator.com/item?id=48319662 Points: 1 # Comments: 0
- Show HN: Open-source toolkit for AI memory that scales
hn-ai· 29-may
Article URL: https://github.com/0xJaksun/lithium-core Comments URL: https://news.ycombinator.com/item?id=48319144 Points: 1 # Comments: 0
- Playwright-MCP – Let AI agents run and manage Playwright tests
hn-ai· 29-may
Article URL: https://github.com/Bairinikhi1/playwright-mcp Comments URL: https://news.ycombinator.com/item?id=48319131 Points: 1 # Comments: 0
- CodePulse – token-efficient codebase indexer for AI coding tools
hn-ai· 29-may
Article URL: https://github.com/leogong99/codepulse Comments URL: https://news.ycombinator.com/item?id=48319172 Points: 2 # Comments: 0
- Evidence that the first papal encyclical on AI was substantially written by AI
hn-ai· 29-may
Article URL: https://linch.substack.com/p/claude-author-of-the-humanitas Comments URL: https://news.ycombinator.com/item?id=48319229 Points: 1 # Comments: 0
- Conversational LLM Client Made in Tkinter
hn-ai· 29-may
Article URL: https://meltdown.merkoba.com/index.html Comments URL: https://news.ycombinator.com/item?id=48319512 Points: 1 # Comments: 0
- Funny but serious, Chieng issues an AI warning to grads
hn-ai· 29-may
Article URL: https://news.harvard.edu/gazette/story/2026/05/funny-but-serious-chieng-issues-an-ai-warning-to-grads/ Comments URL: https://news.ycombinator.com/item?id=48319405 Points: 1 # Comments: 0
- When AI Starts Writing Systems Code
hn-ai· 29-may
Article URL: https://www.coreauto.com/blog/when-ai-starts-writing-systems-code Comments URL: https://news.ycombinator.com/item?id=48319293 Points: 1 # Comments: 0
- What if remote working, not AI, is to blame for weak junior hiring?
hn-ai· 29-may
Article URL: https://www.ft.com/content/2205e2d0-50dc-4e80-9bf7-78d0272276c0 Comments URL: https://news.ycombinator.com/item?id=48319392 Points: 1 # Comments: 1
- [AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode
latentspace· 29-may
Total Anthropic victory!
- [AINews] Cognition raises $1B in $26B Series D
latentspace· 28-may
coding is an uncapped TAM market
- The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
latentspace· 28-may
80% Devin Commits, Spec-to-PR Workflows, Full VMs, Agent Memory, and PMs Shipping Code
- 🔬ESM: The Bitter Lesson is Coming for Proteins - Alex Rives, BioHub
latentspace· 27-may
Biohub’s Protein World Model: ESMC-6B, ESMFold2, 6.8B proteins, 1.1B structures, antibody design, SAEs, & the potential for programmable biology
- The VibeSec Reckoning
martin-fowler· 27-may
Vibe coding has significantly accelerated software prototyping, but AI agents frequently recommend insecure configurations, creating security problems. Gautam Koul, Lucian Moss, Neil Drew-Lopez, and Daberechi Ruth Edeokoh share their experience of building applications for Thoughtworks's global marketing. They learned that combating this requires writing a security context file to guide the AI, being cautious with AI permission requests, creating a daily security intelligence feed, and providing builders with a secure-by-default harness and templates.
- Fragments: May 27
martin-fowler· 27-may
At the GOTO Conference in Copenhagen in 2025, Kent Beck and I spent some time on stage talking and answering questions from the audience - a format I refer to as “two old geezers on a park bench”. We talk about our experiences with LLM-augmented programming (at that point - October 2025), we show our frustration that things we’ve been saying for thirty years still need to be said, we say how anything like a manifesto reunion needs to be led by a younger generation, and opine on what junior developers should be focusing on in their career. ❄ ❄ ❄ ❄ ❄ Ian Johnson has written a series of posts about restructuring a gnarly codebase. The story follows a real Laravel + React codebase over ~3 months and ~258 commits from a legacy monolith with no tests to a well-structured application with automat…
- markdown-svg-renderer
simonw· 28-may
<p><strong>Tool:</strong> <a href="https://tools.simonwillison.net/markdown-svg-renderer">markdown-svg-renderer</a></p> <p>A slightly customized Markdown rendering tool with special treatment for fenced code SVG blocks - it both renders the image and provides a tab for switching to the code view.</p> <p>You can paste in Markdown or give it a URL to a CORS-enabled Markdown file or Gist. <a href="https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Ffea4f7546626d627862dc241a4e3a86a">Here's an example</a> where it loads a Markdown file full of LLM pelican logs for <a href="https://simonwillison.net/2026/May/28/claude-opus-4-8/#and-some-pelicans">Opus 4.8</a>.</p> <p>Tags: <a href="https://simonwillison.net/tags/svg">svg</a>, <a href="https://simon…
- llm-anthropic 0.25.1
simonw· 28-may
<p><strong>Release:</strong> <a href="https://github.com/simonw/llm-anthropic/releases/tag/0.25.1">llm-anthropic 0.25.1</a></p> <blockquote> <ul> <li>New model: <a href="https://www.anthropic.com/news/claude-opus-4-8">Claude Opus 4.8</a> (<code>claude-opus-4.8</code>).</li> <li>New <code>-o fast 1</code> option for <a href="https://platform.claude.com/docs/en/build-with-claude/fast-mode">fast mode</a>, for organizations with that feature enabled on their account.</li> <li>Default max_tokens for each model now defaults to that model's maximum output rather than 8,192. <a href="https://github.com/simonw/llm-anthropic/issues/72">#72</a></li> </ul> </blockquote> <p>See also my <a href="https://simonwillison.net/2026/May/28/claude-opus-4-8/">notes on Opus 4.8</a> - I used this new release of <…
- Anthropic's run-rate revenue hits $47 billion
simonw· 29-may
<p>The most interesting thing about <a href="https://www.anthropic.com/news/series-h">Anthropic's $65B Series H announcement</a> is this line (emphasis mine):</p> <blockquote> <p>Since our Series G in February, adoption has continued to grow across global enterprise customers, and our run-rate revenue crossed <strong>$47 billion</strong> earlier this month.</p> </blockquote> <p>Anthropic have made a bit of a habit of sharing their "run-rate revenue" in this kind of announcement, which is an annualized projection of their current revenue - typically calculated by taking the most recent month and multiplying by 12.</p> <p>Earlier this year:</p> <ul> <li>Apr 6, 2026 in <a href="https://www.anthropic.com/news/google-broadcom-partnership-compute">Anthropic expands partnership with Google and B…
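The run-rate definition quoted above (most recent month multiplied by 12) can be sketched in a few lines of Python. The function names here are hypothetical, and the monthly figure is only a back-calculation implied by the $47 billion run-rate, not a number reported in the announcement.

```python
def run_rate(latest_month_revenue: float) -> float:
    """Annualize one month's revenue by multiplying it by 12."""
    return latest_month_revenue * 12


def implied_monthly_revenue(annual_run_rate: float) -> float:
    """Invert the run-rate calculation to get the implied monthly figure."""
    return annual_run_rate / 12


# $47 billion is the run-rate quoted above; the monthly value is implied only.
monthly = implied_monthly_revenue(47e9)
print(f"implied monthly revenue: ${monthly / 1e9:.2f}B")  # $3.92B
```

Under that definition, a $47 billion run-rate corresponds to roughly $3.9 billion of revenue in the most recent month.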
- datasette 1.0a31
simonw· 29-may
<p><strong>Release:</strong> <a href="https://github.com/simonw/datasette/releases/tag/1.0a31">datasette 1.0a31</a></p> <p>Another significant alpha release, with two new headline features.</p> <blockquote> <p>Datasette now offers users with the necessary permissions the ability to both <strong>execute write queries</strong> against their database and to <strong>save stored queries</strong> (renamed from "canned queries") both privately and for use by other members of their Datasette instance.</p> </blockquote> <p>There's more detail in <a href="https://datasette.io/blog/2026/sql-write-queries/">SQL write queries and stored queries in Datasette 1.0a31</a> on the Datasette blog, which now has <a href="https://datasette.io/blog/">three posts introducing new features</a> since the blog launc…
- sqlite AGENTS.md
simonw· 27-may
<p><strong><a href="https://github.com/sqlite/sqlite/blob/master/AGENTS.md">sqlite AGENTS.md</a></strong></p> <p>SQLite gained an AGENTS.md file <a href="https://github.com/sqlite/sqlite/commit/a1e5778889252d2609a59fd9b819d70392c5789e">five days ago</a> - but it's not intended for their own development, it's presumably aimed at people who are pointing agents at the SQLite codebase. It includes:</p> <blockquote> <p>SQLite does not accept pull requests without prior agreement and/or accompanying legal paperwork that places the pull request in the public domain. However, the human SQLite developers will review a concise and well-written pull request as a proof-of-concept prior to reimplementing the changes themselves.</p> <p>SQLite does not accept agentic code. However the project will accept a…
- I think Anthropic and OpenAI have found product-market fit
simonw· 27-may
<p>Anthropic are <a href="https://techcrunch.com/2026/05/20/anthropic-says-its-about-to-have-its-first-profitable-quarter/">strongly rumored</a> to be about to have their first profitable quarter. Stories <a href="https://www.theinformation.com/newsletters/applied-ai/uber-cto-shows-claude-code-can-blow-ai-budgets">are circulating</a> of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market fit.</p> <ul> <li><a href="https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices">Enterprise customers are now paying API prices</a></li> <li><a href="https://simonwillison.net/2026/May/27/product-market-fit/#i-think-they-ve-found-product-m…
- Claude Opus 4.8: "a modest but tangible improvement"
simonw· 28-may
<p>Anthropic shipped <a href="https://www.anthropic.com/news/claude-opus-4-8">Claude Opus 4.8</a> today. My favourite thing about it is this note in the release announcement:</p> <blockquote> <p>Users will find Opus 4.8 to be a modest but tangible improvement on its predecessor. There’s still more to be done: we’re working on developing and releasing models that provide many of the same capabilities as Opus at a lower cost.</p> </blockquote> <p>It's so refreshing to see an AI lab honestly describe a release as a minor incremental improvement over the previous model!</p> <p>Honesty seems to be a theme. Here's my other favorite note from that announcement:</p> <blockquote> <p>One of the most prominent improvements in Opus 4.8 is its <em>honesty</em>. We train all our models to be honest---f…