AI Digest

Curated digest

Monday, April 27, 2026 · light · short · 8,960 tokens

🔥 TOP: what you absolutely have to see

DeepSeek V4 Pro and Flash are out - Two MoE models (1.6T and 284B total parameters) with 1M-token context, almost on the frontier at a fraction of the price. Good for experimenting with agents without breaking the bank. link
EvanFlow: a TDD feedback loop for Claude Code - A project with 42 points on HN that builds a TDD feedback loop around Claude Code. If you code with Claude Code, this could change your workflow. link
mattpocock/skills: real agent skills for real engineers - Matt Pocock opened up his .claude directory with the skills he uses every day for real engineering: planning, PRDs, design. Ready to copy into your own Claude Code setup. link
Claude Platform on AWS (coming soon) - Anthropic is officially bringing Claude to AWS as a managed platform. If you work in the cloud, this is big. link

📦 Claude / Anthropic ecosystem

Feature request: Persona Profiles for Claude Code - An issue on the official repo asking for switchable configuration bundles. If you juggle multiple work contexts (full-stack, side project, etc.), give it a vote. link
OpenClaw: your own personal AI assistant, multichannel - An open-source personal assistant that runs on your own devices and answers on channels such as WhatsApp, Telegram, Slack, and Discord, with voice on macOS/iOS/Android. Looks promising for side projects where you want the assistant running on your own hardware. link
Squish: a local memory runtime for AI agents - Persistent memory for agents that runs locally. Could be useful if you build multi-agent systems. link

🛠️ Dev tools & coding

free-claude-code: route Claude Code through NVIDIA NIM, OpenRouter, or local models - A lightweight proxy that lets you use Claude Code without an Anthropic API key. Ideal for side projects where you want to minimize costs. link
GitNexus: a client-side knowledge graph of your codebase - Drop in a GitHub repo or ZIP file and it builds an interactive graph plus a Graph RAG Agent, all in the browser. For understanding large codebases without sending code anywhere. link
Beads: persistent graph memory for coding agents - Replaces markdown plans with a dependency-aware graph. Designed for long-horizon tasks without losing context. link
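
The idea behind a dependency-aware task graph can be sketched in a few lines. This is not Beads's implementation or its `bd` CLI: the task names and the `ready_tasks` and `full_order` helpers are hypothetical, and the sketch only uses Python's standard-library `graphlib` to show how a graph can tell an agent which tasks are unblocked, something a flat markdown plan does not encode.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
PLAN = {
    "design-schema": set(),
    "write-migrations": {"design-schema"},
    "build-api": {"write-migrations"},
    "write-docs": {"design-schema"},
}


def ready_tasks(plan, done):
    """Return tasks not yet done whose dependencies are all done, sorted by name."""
    return sorted(
        task
        for task, deps in plan.items()
        if task not in done and deps <= done
    )


def full_order(plan):
    """Return one valid execution order; raises graphlib.CycleError on cycles."""
    return list(TopologicalSorter(plan).static_order())


if __name__ == "__main__":
    print(ready_tasks(PLAN, done=set()))
    print(ready_tasks(PLAN, done={"design-schema"}))
    print(full_order(PLAN))
```

With nothing done, only `design-schema` is ready; once it is marked done, both `write-docs` and `write-migrations` become available while `build-api` stays blocked.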

🏗️ Software engineering

Jaeger v2 adopts OpenTelemetry at its core for AI agent observability - If you care about tracing and agents, this is key: Jaeger is being rebuilt to track LLM calls, tool calls, etc. link
Making a Landing Page Work for Both Humans and AI Agents - A practical post on designing content that AI agents can parse well. Relevant if your SaaS is going to be consumed by bots. link

📚 Worth reading

"The people do not yearn for automation": Nilay Patel's analysis of why AI is unpopular - A superb essay on "software brain" and why the general public rejects AI even as ChatGPT usage keeps climbing. Helps you understand the human side of what you build. link
I Learned to Stop Worrying and Love Coding with AI - An honest post from a dev who went from skepticism to seriously integrating Claude. Good real-world workflow tips. link

💤 Skippable but good to know

Record US power demand in 2026-2027 driven by AI and data centers - Macro context: energy demand is surging. Worth keeping on your radar if you care about the infrastructure side of AI. link
TypeScript's native port in Go (preview available) - Microsoft has a preview build on npm (@typescript/native-preview). It is not yet at full feature parity and not ready for production, but it is the future of TS. link
Awesome Codex Skills: curated skills for the Codex CLI/API - If you ever use Codex, here are skills for sending emails, creating issues, posting to Slack, etc. link

Fetched articles (50)

mattpocock/skills
github-trending
Agent Skills for real engineers. Straight from my .claude directory. Agent Skills For Real Engineers My agent skills that I use every day to do real engineering - not vibe coding. If you want to keep up with changes to these skills, and any new ones I create, you can join ~60,000 other devs on my newsletter: Sign Up To The Newsletter Planning & Design These skills help you think through problems before writing code. to-prd — Turn the current conversation context into a PRD and submit it as a GitHub issue. No interview — just synthesizes what you've already discussed. npx skills@latest add mattpocock/skills/to-prd to-issues — Break any plan, spec, or PRD into independently-grabbable GitHub issues using vertical slices. npx skills@latest add mattpocock/skills/to-issues grill-me — Get relent…
microsoft/typescript-go
github-trending
Staging repo for development of native port of TypeScript TypeScript 7 Not sure what this is? Read the announcement post! Preview A preview build is available on npm as @typescript/native-preview. npm install @typescript/native-preview npx tsgo # Use this as you would tsc. A preview VS Code extension is available on the VS Code marketplace. To use this, set this in your VS Code settings: { "js/ts.experimental.useTsgo": true } What Works So Far? This is still a work in progress and is not yet at full feature parity with TypeScript. Bugs may exist. Please check this list carefully before logging a new issue or assuming an intentional change. Feature Status Notes Program creation done Same files and module resolution as TS 6.0. Not all resolution modes supported yet. Parsing/scanning done Ex…
openclaw/openclaw
github-trending
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞 🦞 OpenClaw — Personal AI Assistant EXFOLIATE! EXFOLIATE! OpenClaw is a personal AI assistant you run on your own devices. It answers you on the channels you already use. It can speak and listen on macOS/iOS/Android, and can render a live Canvas you control. The Gateway is just the control plane — the product is the assistant. If you want a personal, single-user assistant that feels local, fast, and always-on, this is it. Supported channels include: WhatsApp, Telegram, Slack, Discord, Google Chat, Signal, iMessage, BlueBubbles, IRC, Microsoft Teams, Matrix, Feishu, LINE, Mattermost, Nextcloud Talk, Nostr, Synology Chat, Tlon, Twitch, Zalo, Zalo Personal, WeChat, QQ, WebChat. Website · Docs · Vision · DeepWiki · Gett…
ComposioHQ/awesome-codex-skills
github-trending
A curated list of practical Codex skills for automating workflows across the Codex CLI and API. Want skills that do more than generate text? Codex can send emails, create issues, post to Slack, and take actions across 1000+ apps. Quickstart: Add Skills to Codex. Install with the Skill Installer (recommended): git clone https://github.com/ComposioHQ/awesome-codex-skills.git cd awesome-codex-skills/awesome-codex-skills # Install one or more skills into $CODEX_HOME/skills (defaults to ~/.codex/skills) python skill-installer/scripts/install-skill-from-github.py --repo ComposioHQ/awesome-codex-skills --path meeting-notes-and-actions The installer fetches the skill and pl…
PostHog/posthog
github-trending
🦔 PostHog is an all-in-one developer platform for building successful products. We offer product analytics, web analytics, session replay, error tracking, feature flags, experimentation, surveys, data warehouse, a CDP, and an AI product assistant to help debug your code, ship features faster, and keep all your usage and customer data in one stack. PostHog is an all-in-one, open source platform for building successful products. PostHog provides every tool you need to build a successful product including: Product Analytics: Autocapture or manually instrument event-based analytics to understand user behavior and analyze data with visualization or SQL. Web Analytics: Monitor web traffic and user sessions with a GA-like dashbo…
trycua/cua
github-trending
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows). Build, benchmark, and deploy agents that use computers Choose Your Path Cua Driver - Background computer-use on macOS Drive any native macOS app in the background — agents click, type, and verify without stealing the cursor, focus, or Space, even on non-AX surfaces like Chromium web content and canvas-based tools (Blender, Figma, DAWs, game engines). Use with the CLI or MCP server for Claude Code, Cursor, and custom clients. Every session records as a replayable trajectory. /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-driver/scripts/install.sh)" Full tool reference, architecture …
Z4nzu/hackingtool
github-trending
ALL IN ONE Hacking Tool For Hackers. All-in-One Hacking Tool for Security Researchers & Pentesters. What's New in v2.0.0:
Feature | Description
🐍 Python 3.10+ | All Python 2 code removed, modern syntax throughout
🖥 OS-aware menus | Linux-only tools hidden automatically on macOS
📦 185+ tools | 35 new modern tools added across 6 categories
🔍 Search | Type / to search all tools by name, description, or keyword
🏷 Tag filter | Type t to filter by 19 tags: osint, web, c2, cloud, mobile...
💡 Recommend | Type r: "I want to scan a network" → shows relevant tools
✅ Install status | ✔/✘ shown next to every tool, so you know what's ready
⚡ Install all | Option 97 in any category: batch install at once
🔄 Smart update | Each tool has Update, which auto-detects git pull / pip upgrade / go install
📂 Open folder | Jump into any t…
curl/curl
github-trending
A command line tool and library for transferring data with URL syntax, supporting DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, MQTTS, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS. libcurl offers a myriad of powerful features curl is a command-line tool for transferring data from or to a server using URLs. It supports these protocols: DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, MQTTS, POP3, POP3S, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS. Learn how to use curl by reading the man page or everything curl. Find out how to install curl by reading the INSTALL document. libcurl is the library curl is using to do its job. It is readily available …
Alishahryar1/free-claude-code
github-trending
Use claude-code for free in the terminal, VSCode extension or via discord like openclaw 🤖 Free Claude Code Use Claude Code CLI & VSCode for free. No Anthropic API key required. A lightweight proxy that routes Claude Code's Anthropic API calls to NVIDIA NIM (40 req/min free), OpenRouter (hundreds of models), DeepSeek (direct Anthropic-compatible API), LM Studio (fully local), llama.cpp (local with Anthropic endpoints), or Ollama (fully local, native Anthropic Messages). Quick Start · Providers · Discord Bot · Configuration · Development · Contributing Claude Code running via NVIDIA NIM, completely free Features Feature Description Zero Cost 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio, Ollama, or llama.cpp Drop-in Replacement Set 2 env vars. No modi…
abhigyanpatwari/GitNexus
github-trending
GitNexus: The Zero-Server Code Intelligence Engine - GitNexus is a client-side knowledge graph creator that runs entirely in your browser. Drop in a GitHub repo or ZIP file, and get an interactive knowledge graph with a built-in Graph RAG Agent. Perfect for code exploration. GitNexus ⚠️ Important Notice: GitNexus has NO official cryptocurrency, token, or coin. Any token/coin using the GitNexus name on Pump.fun or any other platform is not affiliated with, endorsed by, or created by this project or its maintainers. Do not purchase any cryptocurrency claiming association with GitNexus. Join the official Discord to discuss ideas, issues etc! Enterprise (SaaS & Self-hosted) - akonlabs.com Building nervous system for agent context. Indexes any codebase into a knowledge graph - every dependency…
gastownhall/beads
github-trending
Beads - A memory upgrade for your coding agent bd - Beads Distributed graph issue tracker for AI agents, powered by Dolt. Platforms: macOS, Linux, Windows, FreeBSD Docs: https://gastownhall.github.io/beads/ Beads provides a persistent, structured memory for coding agents. It replaces messy markdown plans with a dependency-aware graph, allowing agents to handle long-horizon tasks without losing context. ⚡ Quick Start # Install beads CLI (system-wide - don't clone this repo into your project) curl -fsSL https://raw.githubusercontent.com/gastownhall/beads/main/scripts/install.sh | bash # Initialize in YOUR project cd your-project bd init # Tell your agent echo "Use 'bd' for task tracking" >> AGENTS.md Note: Beads is a CLI tool you install once and use everywhere. You don't need to clone this…
home-assistant/core
github-trending
🏡 Home Assistant: Open source home automation that puts local control and privacy first. Powered by a worldwide community of tinkerers and DIY enthusiasts. Perfect to run on a Raspberry Pi or a local server. Check out home-assistant.io (https://home-assistant.io) for a demo (https://demo.home-assistant.io), installation instructions (https://home-assistant.io/getting-started/), tutorials (https://home-assistant.io/getting-started/automation/) and documentation (https://home-assistant.io/docs/). Featured integrations: The system is built using a modular approach so support for other devices or actions can be implemented easily. See also the section on arch…
codecrafters-io/build-your-own-x
github-trending
Master programming by recreating your favorite technologies from scratch. Build your own <insert-technology-here> This repository is a compilation of well-written, step-by-step guides for re-creating our favorite technologies from scratch. What I cannot create, I do not understand — Richard Feynman. It's a great way to learn. 3D Renderer AI Model Augmented Reality BitTorrent Client Blockchain / Cryptocurrency Bot Command-Line Tool Database Docker Emulator / Virtual Machine Front-end Framework / Library Game Git Memory Allocator Network Stack Neural Network Operating System Physics Engine Processor Programming Language Regex Engine Search Engine Shell Template Engine Text Editor Visual Recognition System Voxel Engine Web Browser Web Server Uncategorized Tutorials Build your own Distributed…
The reporters at this news site are AI bots. OpenAI appears to be funding it
hn-ai · Apr 27
Article URL: https://modelrepublic.substack.com/p/the-reporters-at-this-news-site-are Comments URL: https://news.ycombinator.com/item?id=47916519 Points: 23 # Comments: 1
What type of code should you generate with AI?
hn-ai · Apr 27
I (and many people) have been thinking a lot about "what are the best tasks to do using AI." My personal framework I've been using is, you should use AI to generate code when: 1. The code is easy to validate 2. It is not important that a human understands it I find that the best thing to AI generate is something on the level of a pure function (easy to validate + not necessarily important to understand implementation as long as you understand the interface). I've tried doing things like generating whole services or applications, but those often violate rule 1 - it's hard to validate an entire application behaves "correctly" when "correctly" isn't really well defined - i.e., are there memory leaks, is it secure, is it able to be monitored, etc. I'm curious of others thoughts on this topic.…
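
The two rules above are easiest to see on a pure function. The sketch below is illustrative and not from the post: `slugify` is a hypothetical example of code that could reasonably be AI-generated, and `validate_slugify` shows why it is easy to validate from the interface alone, without reading the implementation.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def validate_slugify() -> None:
    """Interface-level checks: no knowledge of the implementation is needed."""
    samples = ["Hello, World!", "  AI   Digest 2026 ", "", "already-a-slug"]
    for text in samples:
        slug = slugify(text)
        # Output contains only lowercase letters, digits, and single hyphens.
        assert re.fullmatch(r"([a-z0-9]+(-[a-z0-9]+)*)?", slug), slug
        # The function is idempotent: slugifying a slug changes nothing.
        assert slugify(slug) == slug
    assert slugify("Hello, World!") == "hello-world"


if __name__ == "__main__":
    validate_slugify()
    print("all checks passed")
```

A whole service has no comparably small set of properties to assert, which is the post's point about rule 1.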
SpaceX warns probes into abusive AI imagery could cause headaches for IPO
hn-ai · Apr 27
Article URL: https://nypost.com/2026/04/24/business/elon-musks-spacex-warns-probes-into-sexually-abusive-ai-imagery-could-hurt-ahead-of-ipo/ Comments URL: https://news.ycombinator.com/item?id=47916744 Points: 5 # Comments: 0
I Am Doing This: The Origin Story of Project-AI
hn-ai · Apr 27
Article URL: https://zenodo.org/records/19592336 Comments URL: https://news.ycombinator.com/item?id=47917986 Points: 1 # Comments: 0
Claude Platform on AWS (Coming Soon)
hn-ai · Apr 27
Article URL: https://aws.amazon.com/claude-platform/ Comments URL: https://news.ycombinator.com/item?id=47917904 Points: 1 # Comments: 0
Show HN: ChatForm – Create an AI chat form in 1 minute
hn-ai · Apr 27
Article URL: https://chatform.000ooo.ooo/ Comments URL: https://news.ycombinator.com/item?id=47917834 Points: 1 # Comments: 0
Show HN: CheckThisOut – a filtered directory of side hustles and AI tools
hn-ai · Apr 26
Article URL: https://checkthisout.com/ Comments URL: https://news.ycombinator.com/item?id=47915576 Points: 1 # Comments: 0
Making a Landing Page Work for Both Humans and AI Agents
hn-ai · Apr 27
Article URL: https://docsalot.dev/blog/i-redesigned-my-landing-page-so-ai-agents-can-read-it Comments URL: https://news.ycombinator.com/item?id=47917329 Points: 3 # Comments: 0
Claude Feature Request: Persona Profiles – switchable bundles
hn-ai · Apr 27
Article URL: https://github.com/anthropics/claude-code/issues/53458 Comments URL: https://news.ycombinator.com/item?id=47916380 Points: 2 # Comments: 0
EvanFlow – A TDD driven feedback loop for Claude Code
hn-ai · Apr 27
Article URL: https://github.com/evanklem/evanflow Comments URL: https://news.ycombinator.com/item?id=47916909 Points: 42 # Comments: 16
Language Anchoring: A Systematic Method for LLM Multilingual Adaptation
hn-ai · Apr 27
Article URL: https://github.com/fkyah3/opencode-fkyah3 Comments URL: https://news.ycombinator.com/item?id=47916857 Points: 1 # Comments: 0
OGMA – persistent memory and dual-brain AI, newcomer seeks pro feedback
hn-ai · Apr 27
Article URL: https://github.com/kidshadow79/Ogma Comments URL: https://news.ycombinator.com/item?id=47916377 Points: 1 # Comments: 0
Aether – A GCP-Native Framework to Terminate LLM Agent Drift
hn-ai · Apr 27
Article URL: https://github.com/poinsettiaclg-gif/AETHER-core Comments URL: https://news.ycombinator.com/item?id=47916656 Points: 1 # Comments: 1
Human AI Collaboration in Literature
hn-ai · Apr 26
Article URL: https://indignified.com/history-of-human-ai-collaboration-in-literature/ Comments URL: https://news.ycombinator.com/item?id=47916083 Points: 2 # Comments: 0
The AI Guy [Google_made_an_Ooopsie]
hn-ai · Apr 27
Article URL: https://inv.nadeko.net/channel/UCUYWQzo6AtlFSTHab_qKQaA/community Comments URL: https://news.ycombinator.com/item?id=47917188 Points: 1 # Comments: 0
I Learned to Stop Worrying and Love Coding with AI
hn-ai · Apr 26
Article URL: https://jeffield.net/blog/claude-strangelove-or-how-i-learned-to-stop-worrying-and-love-coding-with-ai/ Comments URL: https://news.ycombinator.com/item?id=47915677 Points: 3 # Comments: 0
Google banks on AI edge to catch up to cloud rivals Amazon and Microsoft
hn-ai · Apr 27
Article URL: https://www.ft.com/content/2429f0f0-b685-4747-b425-bf8001a2e94c Comments URL: https://news.ycombinator.com/item?id=47916410 Points: 89 # Comments: 60
Draft's knowledge graph engine – deterministic codebase understanding for AI
hn-ai · Apr 27
Article URL: https://www.getdraft.dev/blog/local-graph-engine/ Comments URL: https://news.ycombinator.com/item?id=47917819 Points: 1 # Comments: 0
US power demand to reach record highs in 2026–2027 driven by AI and data centers
hn-ai · Apr 27
Article URL: https://www.reuters.com/business/energy/us-power-use-beat-record-highs-2026-2027-ai-use-surges-eia-says-2026-04-07/ Comments URL: https://news.ycombinator.com/item?id=47917664 Points: 6 # Comments: 3
UK departments at odds over energy demands of AI datacentres
hn-ai · Apr 26
Article URL: https://www.theguardian.com/technology/2026/apr/26/uk-departments-at-odds-over-energy-demands-of-ai-datacentres Comments URL: https://news.ycombinator.com/item?id=47915061 Points: 5 # Comments: 0
Claude 4.7 vs. ChatGPT 5.5
hn-ai · Apr 27
Article URL: https://www.tomsguide.com/ai/7-0-wipeout-i-put-chatgpt-5-5-and-claude-4-7-through-7-impossible-tests-and-the-results-shocked-me Comments URL: https://news.ycombinator.com/item?id=47917916 Points: 2 # Comments: 0
AI and Digitalization in HTA: Enhancing Evidence, Equity and Efficiency [video]
hn-ai · Apr 27
Article URL: https://www.youtube.com/watch?v=2wN0D-AqUQQ Comments URL: https://news.ycombinator.com/item?id=47917310 Points: 1 # Comments: 0
An AI driven WP theming workflow
hn-ai · Apr 27
Article URL: https://anchor.host/a-custom-wordpress-theme-from-scratch-in-2026-an-ai-driven-workflow/ Comments URL: https://news.ycombinator.com/item?id=47917942 Points: 1 # Comments: 0
Squish – a local memory runtime for AI agents
hn-ai · Apr 26
Article URL: https://squishplugin.dev/ Comments URL: https://news.ycombinator.com/item?id=47915017 Points: 3 # Comments: 0
Team9 Review: The Fastest AI Workspace for Small Teams
hn-ai · Apr 27
Article URL: https://team9.ai Comments URL: https://news.ycombinator.com/item?id=47917145 Points: 1 # Comments: 0
Jaeger adopts OpenTelemetry at its core to solve the AI agent observability gap
hn-ai · Apr 26
Article URL: https://thenewstack.io/jaeger-v2-ai-observability/ Comments URL: https://news.ycombinator.com/item?id=47915620 Points: 1 # Comments: 0
Banning AI Art – Wallhaven
hn-ai · Apr 27
Article URL: https://wallhaven.cc/forums/thread/4800 Comments URL: https://news.ycombinator.com/item?id=47917429 Points: 2 # Comments: 2
AI can cost more than human workers now
hn-ai · Apr 27
Article URL: https://www.axios.com/2026/04/26/ai-cost-human-workers Comments URL: https://news.ycombinator.com/item?id=47918009 Points: 2 # Comments: 0
CIOs struggle to find clarity in their organizations' AI strategies
hn-ai · Apr 26
Article URL: https://www.cio.com/article/4162949/cios-struggle-to-find-clarity-in-their-organizations-ai-strategies.html Comments URL: https://news.ycombinator.com/item?id=47915267 Points: 6 # Comments: 0
Show HN: Free On-Brand AI Ad Maker
hn-ai · Apr 27
Article URL: https://www.context.dev/free-tools/ad-maker Comments URL: https://news.ycombinator.com/item?id=47917519 Points: 1 # Comments: 0
[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips
latentspace · Apr 25
The prodigal Tiger returns... but is no longer the benchmarks leader.
llm 0.31
simonw · Apr 24
Release: llm 0.31 (https://github.com/simonw/llm/releases/tag/0.31). New GPT-5.5 OpenAI model: llm -m gpt-5.5 (#1418, https://github.com/simonw/llm/issues/1418). New option to set the text verbosity level (https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_new_params_and_tools#1-verbosity-parameter) for GPT-5+ OpenAI models: -o verbosity low. Values are low, medium, high. New option for setting the image detail level (https://developers.openai.com/api/docs/guides/images-vision#choose-an-image-detail-level) used for image attachments to OpenAI models: -o image_detail low - values are low…
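
The new verbosity option composes with the model flag on the command line. A minimal sketch, assuming the `llm` CLI from this release is installed and an OpenAI key is configured: the `build_llm_command` helper is hypothetical and only assembles the arguments, using the `-m gpt-5.5` and `-o verbosity` options named in the release notes.

```python
import shlex

# Verbosity values listed in the llm 0.31 release notes.
VERBOSITY_LEVELS = ("low", "medium", "high")


def build_llm_command(prompt: str, verbosity: str = "low") -> list[str]:
    """Assemble (but do not run) an llm invocation for GPT-5.5."""
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"verbosity must be one of {VERBOSITY_LEVELS}")
    return ["llm", "-m", "gpt-5.5", "-o", "verbosity", verbosity, prompt]


if __name__ == "__main__":
    argv = build_llm_command("Summarize today's digest", verbosity="low")
    # shlex.join quotes the prompt so the line can be pasted into a shell.
    print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run` to execute the command.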
The people do not yearn for automation
simonw · Apr 24
The people do not yearn for automation (https://www.theverge.com/podcast/917029/software-brain-ai-backlash-databases-automation). This written and video essay by Nilay Patel explores why AI is unpopular with the general public even as usage numbers for ChatGPT continue to skyrocket. It’s a superb piece of commentary, and something I expect I’ll be thinking about for a long time to come. Nilay’s core idea is that people afflicted with “software brain” - who see the world as something to be automated as much as possible, and attempt to model everything in terms of information flows and data - are becoming detached from everyone else. Quote: […] software brain has ruled the business world for a long time. AI has just made it easie…
DeepSeek V4 - almost on the frontier, a fraction of the price
simonw · Apr 24
Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December (https://simonwillison.net/2025/Dec/1/deepseek-v32/). They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-V4-Pro (https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro) and DeepSeek-V4-Flash (https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash). Both models are 1 million token context Mixture of Experts. Pro is 1.6T total parameters, 49B active. Flash is 284B total, 13B active. They're using the standard MIT license. I think this makes DeepSeek-V4-Pro the new largest open weights model. It's larger than Kimi K2.6 (1.1T) and GLM-5.1 (754B) and more than twice the size of DeepSeek V3.2 (685B).…
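
The parameter counts above imply that only a small share of each model's weights is used per token. A quick check of that arithmetic, using only the figures quoted in the post (the `active_fraction` helper is illustrative, and 1.6T is taken as 1,600 billion):

```python
# Total and active parameter counts quoted above, in billions.
MODELS = {
    "DeepSeek-V4-Pro": {"total_b": 1600, "active_b": 49},
    "DeepSeek-V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters active per token in a Mixture of Experts model."""
    return active_b / total_b


if __name__ == "__main__":
    for name, p in MODELS.items():
        share = active_fraction(p["total_b"], p["active_b"])
        print(f"{name}: {share:.1%} of parameters active per token")
    # Size of Pro relative to DeepSeek V3.2 (685B), per the post.
    print(f"Pro vs V3.2: {1600 / 685:.2f}x")
```

This works out to roughly 3.1% active for Pro and 4.6% for Flash, with Pro about 2.34 times the size of V3.2, consistent with "more than twice the size".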
Quoting Romain Huet
simonw · Apr 25
"Since GPT-5.4, we’ve unified Codex and the main model into a single system, so there’s no separate coding line anymore. GPT-5.5 takes this further, with strong gains in agentic coding, computer use, and any task on a computer." - Romain Huet (https://twitter.com/romainhuet/status/2047955381578838357), confirming OpenAI won't release a GPT-5.5-Codex model. Tags: generative-ai, gpt, openai, ai, llms…
WHY ARE YOU LIKE THIS
simonw · Apr 25
@scottjla on Twitter (https://twitter.com/scottjla/status/2047535371664457863) in reply to my pelican riding a bicycle (https://simonwillison.net/tags/pelican-riding-a-bicycle/) benchmark: "I feel like we need to stack these tests now" [Image: AI generated image. A pelican is riding a bicycle along a dirt track, chased by a police car. The pelican looks panicked, likely because there is an astronaut (with prehensile toes for some reason) riding the pelican clinging on to where its ears should be. The astronaut is being ridden by a horse, with an equally wild expression. A slice of pizza and a can and a cowboy hat are falling next to them. A road sign in the background reads WHY ARE YOU LIKE THIS. Source: https://static.simonwillison.net/…
GPT-5.5 prompting guide
simonw · Apr 25
GPT-5.5 prompting guide (https://developers.openai.com/api/docs/guides/prompt-guidance?model=gpt-5.5). Now that GPT-5.5 is available in the API (https://developers.openai.com/api/docs/models/gpt-5.5), OpenAI have released a wealth of useful tips on how best to prompt the new model. Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible response: "Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences." I've already noticed their Codex app doing this, and it does make longer running tasks feel less like …
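
One way to apply the recommended instruction is to append it to an application's system prompt. A minimal sketch, not taken from the guide: the `with_progress_preamble` helper and the base prompt are hypothetical, and only the quoted instruction comes from the excerpt above.

```python
# Instruction quoted from the GPT-5.5 prompting guide excerpt above.
PROGRESS_PREAMBLE = (
    "Before any tool calls for a multi-step task, send a short user-visible "
    "update that acknowledges the request and states the first step. "
    "Keep it to one or two sentences."
)


def with_progress_preamble(system_prompt: str) -> str:
    """Append the preamble instruction once, leaving other text untouched."""
    if PROGRESS_PREAMBLE in system_prompt:
        return system_prompt
    return f"{system_prompt.rstrip()}\n\n{PROGRESS_PREAMBLE}"


if __name__ == "__main__":
    base = "You are a coding assistant with access to repository tools."
    print(with_progress_preamble(base))
```

The membership check keeps the helper idempotent, so calling it on an already-extended prompt does not duplicate the instruction.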