AI Testing


How We Broke Top AI Agent Benchmarks: And What Comes Next

🤖 Benchmarks Breakthrough: AI Agents Set New Standards

The recent breakthrough in AI agent benchmarks, as discussed in the article "How We Broke Top AI Agent Benchmarks: And What Comes Next," marks a significant leap forward in evaluating AI performance. This advancement enables more accurate assessments of AI capabilities, driving innovation and trust in AI applications.

8:05 pm, April 11, 2026

guid: https://news.ycombinator.com/item?id=47733217
source_url: https://rdi.berkeley.edu/blog/trustworthy-benchmarks-cont/
author_name: Anon84






Category List HNews

"Cancel ChatGPT" movement goes mainstream after OpenAI closes deal with U.S. Dow
"Collaboration" Is Bullshit
"Disregard That" Attacks
"Special 301" Comments on Nintendo Game Piracy in Asia and Latin America (1994)
"That Shape Had None" – A Horror of Substrate Independence (Short Fiction)
"The new Copilot app for Windows 11 is really just Microsoft Edge"
"Warn about PyPy being unmaintained"
"We do not think Anthropic should be designated as a supply chain risk"
$96 3D-printed rocket that recalculates its mid-air trajectory using a $5 sensor
'Fatal decision': EU slammed for caving to US pressure on digital rules
'Miracle': Europe reconnects with lost spacecraft
'The Secret Agent': Exploring a Vibrant, yet Violent Brazil (2025)
'Your Frustration Is the Product'
/e/OS is a complete "deGoogled", mobile ecosystem
10-202: Introduction to Modern AI (CMU)
100M-Row Challenge with PHP
15 years, one server, 8GB RAM and 500k users – how Webminal refuses to die
1B identity records exposed in ID verification data leak
1M context is now generally available for Opus 4.6 and Sonnet 4.6
2% of ICML papers desk rejected because the authors used LLM in their reviews
20 years on AWS and never not my job
3D-Knitting: The Ultimate Guide
404 Deno CEO not found
447 TB/cm² at zero retention energy – atomic-scale memory on fluorographane
81yo Dodgers fan can no longer get tickets because he doesn't have a smartphone
9 Mothers (YC P26) Is Hiring – Lead Robotics and More
90% of Claude-linked output going to GitHub repos w <2 stars
90% of crypto's Illinois primary spending failed to achieve its objective
A CPU that runs entirely on GPU
A Canonical Generalization of OBDD
A Compiler Writing Journey
A Decade of Docker Containers
A Decade of Slug
A Faster Alternative to Jq
A Few Good Magazines From the 70s and 80s
A GitHub Issue Title Compromised 4k Developer Machines
A Japanese Glossary of Chopsticks Faux Pas
A Japanese glossary of chopsticks faux pas (2022)
A Mysterious Numbers Station Is Broadcasting Through the Iran War
A Nationwide Book Ban Bill Has Been Introduced in the House of Representatives
A Perfectable Programming Language
A Rave Review of Superpowers (For Claude Code)
A Visual Introduction to Machine Learning
A WebGPU Implementation of Augmented Vertex Block Descent
A bit of fluid mechanics from scratch not from scratch
A case for Go as the best language for AI agents
A compelling title that is cryptic enough to get you to take action on it
A dot a day keeps the clutter away
A forecast of the fair market value of SpaceX's businesses
A hidden workforce behind Meta’s new smart glasses
A most elegant TCP hole punching algorithm
A new Bigfoot documentary helps explain our conspiracy-minded era
A new C++ back end for ocamlc
A new account made over $515,000 betting on the U.S. strike against Iran
A new gene therapy is giving people born deaf the chance to hear
A perfectable programming language
A retro terminal music player inspired by Winamp
A sea of sparks: Seeing radioactivity
A standard protocol to handle and discard low-effort, AI-Generated pull requests
A sufficiently detailed spec is code
A tail-call interpreter in (nightly) Rust
A tale about fixing eBPF spinlock issues in the Linux kernel
AI Error May Have Contributed to Girl's School Bombing in Iran
AI Made Writing Code Easier. It Made Being an Engineer Harder
AI Tokens Are Mana
AI and bots have officially taken over the internet
AI assistance when contributing to the Linux kernel
AI boom risks widening wealth divide, says BlackRock's Larry Fink
AI coding is gambling
AI for American-Produced Cement and Concrete
AI got the blame for the Iran school bombing. The truth is more worrying
AI is making junior devs useless
AI may be making us think and write more alike
AI overly affirms users asking for personal advice
AIs can't stop recommending nuclear strikes in war game simulations
AMD Am386 released March 2, 1991
AMD's Ryzen 9 9950X3D2 Dual Edition crams 208MB of cache into a single chip
ARM AGI CPU: Specs and SKUs
ASCII and Unicode quotation marks (2007)
AWS Engineer Reports PostgreSQL Perf Halved by Linux 7.0, Fix May Not Be Easy
AWS Middle East Central Down, apparently struck in war
Ada 2022
Addressing Antigravity Bans and Reinstating Access
Adobe modifies hosts file to detect whether Creative Cloud is installed
Advanced Mac Substitute is an API-level reimplementation of 1980s-era Mac OS
Advice to Young People, the Lies I Tell Myself (2024)
Afrika Bambaataa, hip-hop pioneer, has died
Afroman Wins Civil Trial over Use of Police Raid Footage in His Music Videos
Afroman found not liable in defamation case brought by Ohio cops who raided home
After 20 years I turned off Google Adsense for my websites
After 20 years I turned off Google Adsense for my websites (2025)
Ageless Linux – Software for humans of indeterminate age
Agent Reading Test
Agent Safehouse – macOS-native sandboxing for local agents
Agent Skills – Open Security Database
Agent-to-agent pair programming
Agentic Engineering Patterns
Agents that run while I sleep
AirSnitch: Demystifying and breaking client isolation in Wi-Fi networks [pdf]
Airbus is preparing two uncrewed combat aircraft
All elementary functions from a single binary operator
Allegations of insider trading over prediction-market bets tied to Iran conflict
Allocating on the Stack
Alpha Micro AM-1000E and AM-1200
Alzheimer's disease mortality among taxi and ambulance drivers (2024)
Am I German or Autistic?
Amazon Busted for Widespread Scheme to Inflate Prices Across the Economy
Amazon accused of widespread scheme to inflate prices across the economy
Amazon holds engineering meeting following AI-related outages
America, and probably the world, stands on a precipice
An Interactive Intro to CRDTs (2023)
An Interesting Find: STM32 RDP1 Decryptor
An NSFW filter for Marginalia search
An Ode to Bzip
An autopsy of AI-generated 3D slop
An experiment to use GitHub Actions as a control plane for a PaaS
An interactive intro to Elliptic Curve Cryptography
An interactive map of Flock Cams
An old photo of a large BBS (2022)
An open-source 240-antenna array to bounce signals off the Moon
An opinionated take on how to do important research that matters
An unstoppable mushroom is tearing through North American forests
An update on Steam / GOG changes for OpenTTD
Analyzing Geekbench 6 under Intel's BOT
Anatomy of the .claude/ folder
Android Developer Verification
Android developer verification: Balancing openness and choice with safety
Android: Balancing Openness and Choice with Safety
Animation 10k Starlink Satellites
AnswerThis (YC F25) Is Hiring
Anthropic CEO calls OpenAI's messaging around military deal 'straight up lies'
Anthropic Cowork feature creates 10GB VM bundle on macOS without warning
Anthropic Drops Flagship Safety Pledge
Anthropic Subprocessor Changes
Anthropic ditches its core safety promise
Anthropic expands partnership with Google and Broadcom for next-gen compute
Anthropic says company 'cannot in good conscience accede' to Pentagon's demands
Anthropic, please make a new Slack
Ape Coding
Ape Coding [fiction]
Apex Protocol – An open MCP-based standard for AI agent trading
Apideck CLI – An AI-agent interface with much lower context consumption than MCP
Apple AI servers unused in warehouses due to low Apple Intelligence usage
Apple Needs to Copy Samsung's New Security Smartphone Screen ASAP
Apple Silicon and Virtual Machines: Beating the 2 VM Limit (2023)
Apple approves driver that lets Nvidia eGPUs work with Arm Macs
Apple discontinues the Mac Pro
Apple introduces the new iPad Air, powered by M4
Apple randomly closes bug reports unless you "verify" the bug remains unfixed
Apple says no one using Lockdown Mode has been hacked with spyware
Apple's 512GB Mac Studio vanishes, a quiet acknowledgment of the RAM shortage
Apple's MacBook Neo makes repairs easier and cheaper than other MacBooks
Apple's accidental moat: How the "AI Loser" may end up winning
Apple: Embarrassingly Simple Self-Distillation Improves Code Generation
April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini
ArXiv Declares Independence from Cornell
Are LLMs not getting better?
Are the Mysteries of Quantum Mechanics Beginning to Dissolve?
Arm AGI CPU
Arm's Cortex X925: Reaching Desktop Performance
Ars Technica Fires Reporter After AI Controversy Involving Fabricated Quotes
Ars Technica fires reporter after AI controversy involving fabricated quotes
Art Bits from HyperCard
Artemis II and the invisible hazard on the way to the Moon
Artemis II is not safe to fly
Artemis II safely splashes down
Artificial-life: A simple (300 lines of code) reproduction of Computational Life
Ashby (YC W19) Is Hiring Engineers Who Make Product Decisions
Ask ChatGPT to pick a number from 1-10000, it generally selects from 7200-7500
Ask HN: Academic study on AI's impact on software development – want to join?
Ask HN: Apple terminated our dev account over a rogue employee
Ask HN: European Tech Alternatives?
Ask HN: Have top AI research institutions just given up on the idea of safety?
Ask HN: How are you all staying sane?
Ask HN: How do you handle marketing as a solo technical founder?
Ask HN: How is AI-assisted coding going for you professionally?
Ask HN: How to Be Alone?
Ask HN: Please restrict new accounts from posting
Ask HN: Remember Fidonet?
Ask HN: Share your productive usage of OpenClaw
Ask HN: What Are You Working On? (March 2026)
Ask HN: Who is hiring? (March 2026)
Assessing Claude Mythos Preview's cybersecurity capabilities
Astra: An open-source observatory control software
Async Programming Is Just Inject Time
Atlassian to cut roughly 1,600 jobs in pivot to AI
Atomic Display Switching: Solving
Attie.ai
Attorney General Pam Bondi Out at DOJ
Attractive students no longer receive better results as classes moved online
Attyx – tiny and fast GPU-accelerated terminal emulator written in Zig
Atuin v18.13 – better search, a PTY proxy, and AI for your shell
Austin’s surge of new housing construction drove down rents
AutoKernel: Autoresearch for GPU Kernels
Autoresearch for SAT Solvers
Autoresearch: Agents researching on single-GPU nanochat training automatically
Avoiding Trigonometry (2013)
Axios compromised on NPM – Malicious versions drop remote access trojan
AyaFlow: A high-performance, eBPF-based network traffic analyzer written in Rust
BMW Group to deploy humanoid robots in production in Germany for the first time
BYD's bet on EVs is paying off as drivers ditch gas amid rising oil prices
Bacteria found in the human intestine capable of improving muscle strength
Banned in California
Bars close and hundreds lose jobs as US firm buys Brewdog in £33M deal
Battle for Wesnoth: open-source, turn-based strategy game
Bcachefs creator insists his custom LLM is female and 'fully conscious'
Be intentional about how AI changes your codebase
Bet on German Train Delays
Better JIT for Postgres
Beyond has dropped “meat” from its name and expanded its high-protein drink line
Big Breakfast Alters Appetite, Gut Health
Big Data on the Cheapest MacBook
Bild AI (YC W25) Is Hiring Interns to Make Housing Affordable
Bild AI (YC W25) Is Hiring a Founding Product Engineer
Billion-Parameter Theories
Bitcoin and quantum computing
Bitcoin miners are losing $19,000 on every BTC produced as difficulty drops 7.8%
Bitmap fonts make computers feel like computers again
Blacksky AppView
Block spent $68M on a single party in September 2025
Block the "Upgrade to Tahoe" Alerts
Block the “Upgrade to Tahoe” Alerts
Blocking Internet Archive Won't Stop AI, but Will Erase Web's Historical Record
Blood test boosts Alzheimer's diagnosis accuracy to 94.5%, clinical study shows
Bluesky CEO Jay Graber is stepping down
Bombarding gamblers with offers greatly increases betting and gambling harm
Bootc and OSTree: Modernizing Linux System Deployment
Boss-CSS: I created another "CSS-in-JS" lib
Bouncer: Block "crypto", "rage politics", and more from your X feed using AI
Boy I was wrong about the Fediverse
Breaking Down 50M Pins: A Smarter Way to Design 3D IC Packages
Breaking Free
Bring Back Idiomatic Design
Bringing Chrome to ARM64 Linux Devices
Britain is ejecting hereditary nobles from Parliament after 700 years
Britain today generating 90%+ of electricity from renewables
British Columbia to end time changes, adopt year-round daylight time
Bubble Sorted Amen Break
Bucketsquatting is (finally) dead
Buckle Up for Bumpier Skies
Build your own Dial-up ISP with a Raspberry Pi
BuildKit: Docker's Hidden Gem That Can Build Almost Anything
Building Better Country Selects
Building a Blog with Elixir and Phoenix
Building a JavaScript runtime in one month
Building a Minimal Transformer for 10-digit Addition
Building a Procedural Hex Map with Wave Function Collapse
Building a SaaS in 2026 Using Only EU Infrastructure
Building a Shell
Building a Z-Machine in the worst possible language – Whitebeard's Realm
Building a new Flash
Building an E2E Encrypted Chat Application with LanceDB and Libsodium
Building an FPGA 3dfx Voodoo with Modern RTL Tools
Bun: cgroup-aware AvailableParallelism / HardwareConcurrency on Linux
BunnyCDN has been silently losing our production files for 15 months
Bus stop balancing is fast, cheap, and effective
C# strings silently kill your SQL Server indexes in Dapper
C++26 is done: ISO C++ standards meeting, Trip Report
CERN levels up with new superconducting karts
CERN to host a new phase of Open Research Europe
CERN uses tiny AI models burned into silicon for real-time LHC data filtering
CSP for Pentesters: Understanding the Fundamentals
CSS is DOOMed
CVE-2026-3888: Important Snap Flaw Enables Local Privilege Escalation to Root
California's Digital Age Assurance Act, and FOSS
Cambodia unveils a statue of famous landmine-sniffing rat Magawa
Can a wealthy family change the course of a deadly brain disease?
Can you instruct a robot to make a PBJ sandwich?
Canada's bill C-22 mandates mass metadata surveillance
Canada's bill C-22 mandates mass metadata surveillance of Canadians
Cancel ChatGPT AI boycott surges after OpenAI pentagon military deal
Cannabinoids remove plaque-forming Alzheimer's proteins from brain cells
Cannabinoids remove plaque-forming Alzheimer's proteins from brain cells (2016)
Capability-Based Security for Redox: Namespace and CWD as Capabilities
Capybara: A Unified Visual Creation Model
Carbon dioxide overload in human blood suggests a toxic atmosphere in 50 years
Cardiorespiratory fitness is associated with lower anger and anxiety
CasNum
Case study: recovery of a corrupted 12 TB multi-device pool
Cash Issuing Terminals
Cash issuing terminals
Celebrating Tony Hoare's mark on computer science
Cell Service for the Fairly Paranoid
Ceno, browse the web without internet access
Changes to OpenTTD Distribution on Steam
Chaos and Dystopian news for the dead internet survivors
Charcuterie – Visual similarity Unicode explorer
ChatGPT Pro now starts at $100/month
ChatGPT won't let you type until Cloudflare reads your React state
Chest Fridge (2009)
Chicago artist creates tourism posters for city's neighborhoods
Chimpanzees in Uganda locked in eight-year 'civil war', say researchers
Chimpanzees in Uganda locked in vicious 'civil war', say researchers
China's 450kmph bullet train is the fastest ever built
Chuck Norris has died
Circuit-level PDP-11/34 emulator
Cirrus Labs to join OpenAI, shut down Cirrus CI on Monday, June 1, 2026
Claude Code Cheat Sheet
Claude Code Found a Linux Vulnerability Hidden for 23 Years
Claude Code LSP
Claude Code Remote Control
Claude Code Unpacked : A visual guide
Claude Code conducts A/B tests on core features
Claude Code runs Git reset --hard origin/main against project repo every 10 mins
Claude Code wiped our production database with a Terraform command
Claude Code, Claude Cowork and Codex #5
Claude Code: Channels
Claude Managed Agents
Claude Wrote a Full FreeBSD Remote Kernel RCE with Root Shell (CVE-2026-4747)
Claude mixes up who said what and that's not OK
Claude now creates interactive charts, diagrams and visualizations
Claude struggles to cope with ChatGPT exodus
Clockwise acquired by Salesforce and shutting down next week
Closure of the Weatheradio Service in Canada
Closure of the Weatheradio service in Canada
Closure of the Weatherradio Service in Canada
Cloud VM benchmarks 2026
Cloudflare Crawl Endpoint
Cloudflare crawl endpoint
Cloudflare flags archive.today as "C&C/Botnet"; no longer resolves via 1.1.1.2
Cockpit is a web-based graphical interface for servers
Cocoa-Way – Native macOS Wayland compositor for running Linux apps seamlessly
Codex pricing to align with API token usage, instead of per-message
Coding Agents Could Make Free Software Matter Again
CodingFont: A game to help you pick a coding font
Cognitive Debt: When Velocity Exceeds Comprehension
Cohere Transcribe: Speech Recognition
Colibri – chat platform built on the AT Protocol for communities big and small
CollectWise (YC F24) Is Hiring
Colorado House passes bill to limit surveillance pricing and wage setting
Common Lisp Development Tooling
Common drug tests lead to tens of thousands wrongful arrests a year
Computational Physics (2nd Edition)
Computer chip material inspired by the human brain could slash AI energy use
Computer-generated dream world: Virtual reality for a 286 processor
Connecticut and the 1 Kilometer Effect
Contextual commits – An open standard for capturing the why in Git history
Converge (YC S23) Is Hiring a Founding Platform Engineer (NYC, Onsite)
Conway's Game of Life, in real life
Cook: A simple CLI for orchestrating Claude Code
Copilot edited an ad into my PR
Corgi Labs (YC W23) Is Hiring
Corruption erodes social trust more in democracies than in autocracies
Country that put backdoors in Cisco routers to spy on world bans foreign routers
Create value for others and don’t worry about the returns
Croatia declared free of landmines after 31 years
Cronboard: A terminal-based dashboard for managing cron jobs
Cross-Model Void Convergence: GPT-5.2 and Claude Opus 4.6 Deterministic Silence
Cursor 3
Cursor Composer 2 is just Kimi K2.5 with RL
Customer Update on Simplenote
Cyber.mil serving file downloads using TLS certificate which expired 3 days ago
Cyberattack on vehicle breathalyzer company leaves drivers stranded in the US
DARPA's new X-76 Experimental Plane
DHS Contracts Explorer – Hacked data from the Office of Industry Partnership
DIY Soft Drinks
DMCA-resistant Claude Code source code
DOJ confirms FBI Director Kash Patel's personal email was hacked
DOS Memory Management
DRAM pricing is killing the hobbyist SBC market
Daily Driving GrapheneOS
Dan Simmons, author of Hyperion, Song of Kali, dead at 77
Dan Simmons, author of Hyperion, has died
Danish Gov agency to ditch Microsoft software in push for digital independence
Danish government agency to ditch Microsoft software (2025)
Dario Amodei calls OpenAI’s messaging around military deal ‘straight up lies’
Dark Castle
Data Has Weight but Only on SSDs
Dataframe 1.0.0.0
Datasets for Reconstructing Visual Perception from Brain Data
Dear Time Lords: Freeze Computers in 1993
Debian decides not to decide on AI-generated contributions
Debunking Zswap and Zram Myths
Decimal-Java is a library to convert java.math.BigDecimal to and from IEEE-754r
Decision trees – the unreasonable power of nested decision rules
Defeat as Method
Delphi 13.1 Released, with ARM64 support
Delve removed from Y Combinator
Delve sets the record straight on anonymous attacks
Democracy in 2025: on rising authoritarianism in the United States
Denmark was reportedly preparing for full-scale war with the US over Greenland
Denver dumps Flock, awards contract to Axon
Department of War Designates Anthropic Supply Chain Risk
Desk for people who work at home with a cat
Devirtualization and Static Polymorphism
Digg is gone again
Digg.com Closing Due to Spam
DigitalOcean Seeks $800M in Funding
Discontinuation and reinitiation of dual-labeled GLP-1 receptor agonists
Discord cuts ties with Peter Thiel-backed verification software
Do AI Agents Make Money in 2026? Or Is It Just Mac Minis and Vibes?
Do Not Turn Child Protection into Internet Access Control
Does coding with LLMs mean more microservices?
Does that use a lot of energy?
Dolphin Progress Release 2603
Don't Make Me Talk to Your Chatbot
Don't become an engineering manager
Don't post generated/AI-edited comments. HN is for conversation between humans.
Don't run OpenClaw on your main machine
Don't trust AI agents
Don't use passkeys for encrypting user data
Dragon Ball Color Correction Process [pdf]
Dream Recorder AI – a portal to your subconscious
Drop, formerly Massdrop, ends most collaborations and rebrands under Corsair
Dropping Cloudflare for Bunny.net
Drugwars for the TI-82/83/83+ Calculators (2011)
Durdraw – ANSI art editor for Unix-like systems
Dyson settles forced labour suit in landmark UK case
ECS Survivors Parts VII – X
EFF is leaving X
EVi, a Hard-Fork of Vim
EasyPost (YC S13) Is Hiring
Ed Zitron loses his mind annotating an AI doomer macro memo
Effort to prevent government officials from engaging in prediction markets
Elevated Errors in Claude.ai
Elite Overproduction
EmDash – a spiritual successor to WordPress that solves plugin security
EmDash: A Fresh Take on CMS
Email obfuscation: What works in 2026?
Emotion concepts and their function in a large language model
Employers use your personal data to figure out the lowest salary you'll accept
Emuko: Fast RISC-V emulator written in Rust, boots Linux
Enabling Codex to Analyze Two Decades of Hacker News Data
End of "Chat Control": EU Parliament Stops Mass Surveillance in Voting Thriller
Eniac, the First General-Purpose Digital Computer, Turns 80
Entomologists use a particle accelerator to image ants at scale
Entso-E final report on Iberian 2025 blackout
Epoch confirms GPT5.4 Pro solved a frontier math open problem
EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages
Eternity in six hours: Intergalactic spreading of intelligent life (2013)
Ethiopia gets $350M World Bank financing for its digital ID project (2024)
European Parliament decided that Chat Control 1.0 must stop
Event Horizon Labs (YC W24) Is Hiring
Everett shuts down Flock camera network after judge rules footage public record
Every Law a Commit – US Law in GitHub
Every layer of review makes you 10x slower
Everything Changes, and Nothing Changes
Evolving descriptive text of mental content from human brain activity
Excel incorrectly assumes that the year 1900 is a leap year
Expanding Swift's IDE Support
Experts sound alarm after ChatGPT Health fails to recognise medical emergencies
Extending single-minus amplitudes to gravitons
Extra usage credit for Claude to celebrate usage bundles launch (Pro, Max, Team)
F-15E jet shot down over Iran
F-Droid Board of Directors nominations 2026
FBI is buying location data to track US citizens, director confirms
FBI used iPhone notification data to retrieve deleted Signal messages
FCC Updates Covered List to Include Foreign-Made Consumer Routers
FCC chairman threatens TV broadcast licenses over news coverage
FCC updates covered list to include foreign-made consumer routers
FFmpeg 101 (2024)
FFmpeg 8.1
FFmpeg-over-IP – Connect to remote FFmpeg servers
FTC action against Match and OkCupid for deceiving users, sharing personal data
Factory Logic
Fake Fans
False claims in a widely-cited paper
Fed's Cook says AI triggering big changes, sees possible unemployment rise
Federal Right to Privacy Act – Draft legislation
Federal data breach may be the biggest hack in US history
Fedware: Government apps that spy harder than the apps they ban
Felix "fx" Lindner has died
Fentanyl makeover: Core structural redesign could lead to safer pain medications
Filing the corners off my MacBooks
Firefox 148 Launches with AI Kill Switch Feature and More Enhancements
Firm boosts H.264 streaming license fees from $100k up to staggering $4.5M
First MacBook Neo Benchmarks Are In
First Website
First Website (1992)
First Western Digital, now Sony: The tech giant suspends SD card sales
First-ever in-utero stem cell therapy for fetal spina bifida repair is safe
Five Years of Running a Systems Reading Group at Microsoft
Fixfest is a global gathering of repairers, tinkerers, and activists
Flash-Moe: Running a 397B Parameter Model on a Mac with 48GB RAM
Flightradar24 for Ships
Flighty Airports
Floci – A free, open-source local AWS emulator
Folk are getting dangerously attached to AI that always tells them they're right
Following 35% growth, solar has passed hydro on US grid
Fontcrafter: Turn Your Handwriting into a Real Font
Footage shows US citizen shot dead by ICE agent in Texas traffic stop
Forget Flags and Scripts: Just Rename the File
Founder of GitLab battles cancer by founding companies
FrameBook
France Launches Government Linux Desktop Plan as Windows Exit Begins
France pulls last gold held in US for $15B gain
France's government is ditching Windows for Linux, says US tech a strategic risk
FreeBSD 14.4-Release Announcement
FreeCAD v1.1
Friendica – A Decentralized Social Network
From 0% to 36% on Day 1 of ARC-AGI-3
From birds to brains: My path to the fusiform face area (2024)
Full Disclosure: A Third (and Fourth) Azure Sign-In Log Bypass Found
Full network of clitoral nerves mapped out for first time
Fungal Electronics (2021)
Further human + AI + proof assistant work on Knuth's "Claude Cycles" problem
Fyn: An uv fork with new features, bug fixes, stripped telemetry
GLM-5.1: Towards Long-Horizon Tasks
GLiNER2: Unified Schema-Based Information Extraction
GNU Texmacs
GPL upgrades via section 14 proxy delegation
GPT 5.4 Thinking and Pro
GPT-5.4 Thinking System Card
GPT-5.4 Thinking and GPT-5.4 Pro
GPT‑5.3 Instant
GPT‑5.4 Mini and Nano
Game about Data of America
Games with loot boxes to get minimum 16 age rating across Europe
Gemma 4 on iPhone
Generating All 32-Bit Primes (Part I)
Generative AI Use and Depressive Symptoms Among US Adults
George Goble died recently – known for first dual-CPU-Unix and fast BBQ lighting
George Goble has died
German men 18-45 need military permit for extended stays abroad
Get Shit Done: A Meta-Prompting, Context Engineering and Spec-Driven Dev System
Get free Claude max 20x for open-source maintainers
Getting Started in Common Lisp
Ghostling
Ghostmd: Ghostty but for Markdown Notes
Ghostmoon.app – The Swiss Army Knife for your macOS menu bar
Ghostty – Terminal Emulator
Git commands I run before reading any code
GitHub Monaspace Case Study
GitHub appears to be struggling with measly three nines availability
GitHub backs down, kills Copilot pull-request ads after backlash
GitHub is once again down
GitHub's Historic Uptime
Go hard on agents, not on your filesystem
Go is the best language for agents
Go on Embedded Systems and WebAssembly
GoGoGrandparent (YC S16) is hiring Back end Engineers
Gold overtakes U.S. Treasuries as the largest foreign reserve asset
Good code will still win
Goodbye InnerHTML, Hello SetHTML: Stronger XSS Protection in Firefox 148
Goodbye to Sora
Google API Keys Weren't Secrets. But Then Gemini Changed the Rules
Google API keys weren't secrets, but then Gemini changed the rules
Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel
Google Street View in 2026
Google Workers Seek 'Red Lines' on Military A.I., Echoing Anthropic
Google Workspace CLI
Google adds 24-hour wait and mandatory reboot to Android sideloading flow
Google just gave Android power users a sideloading win
Google just gave Sundar Pichai a $692M pay package
Google removes "Doki Doki Literature Club" from Google Play
Google's 200M-parameter time-series foundation model with 16k context
Government agencies buy commercial data about Americans in bulk
Government grant-funded research should not be published in for-profit journals
Grace Hopper's Revenge
Grafeo – A fast, lean, embeddable graph database built in Rust
Grandparents are glued to their phones, families are worried [video]
GrapheneOS will remain usable by anyone without requiring personal information
Graphics Programming Resources
Great at gaming? US air traffic control wants you to apply
Grief and the AI split
GrobPaint: Somewhere Between MS Paint and Paint.net. Multiplatform by Default
Gvisor on Raspbian
H.264 Streaming Fees: What Changed, Who's Affected, and What It Means
HBO Obtains DMCA Subpoena to Unmask 'Euphoria' Spoiler Account on X
HD Audio Driver for Windows 98SE / Me
HP trialed mandatory 15-minute support call wait times (2025)
Hacker mints $80M USD worth of USR stablecoins
HackerRank (YC S11) Is Hiring
Hacking an old Kindle to display bus arrival times
Hamilton-Jacobi-Bellman Equation: Reinforcement Learning and Diffusion Models
Hammerspoon
HandyMKV for MakeMKV and HandBrake Automation
Hardening Firefox with Anthropic's Red Team
Haunt, the 70s text adventure game, is now playable on a website
Have a Fucking Website
Have a fucking website
Having Kids (2019)
Hazardous substances found in all headphones tested by ToxFREE project
He saw an abandoned trailer. Then, uncovered a surveillance network
Health NZ staff told to stop using ChatGPT to write clinical notes
Hegel, a universal property-based testing protocol and family of PBT libraries
Helix: A post-modern text editor
Helsinki just went a full year without a single traffic death
High-Level Rust: Getting 80% of the Benefits with 20% of the Pain
Hightouch (YC S19) Is Hiring
Hold on to Your Hardware
HopTab – free, open source macOS app switcher and tiler that replaces Cmd+Tab
Hormuz Minesweeper – Are you tired of winning?
Hostile Volume – A game about adjusting volume with intentionally bad UI
How AI skills are quietly automating my workday
How BYD Got EV Chargers to Work Almost as Fast as Gas Pumps
How I write software with LLMs
How I'm Productive with Claude Code
How Kernel Anti-Cheats Work
How Lego builds a new Lego set
How Pizza Tycoon simulated traffic on a 25 MHz CPU
How We Broke Top AI Agent Benchmarks: And What Comes Next
How We Synchronized Editing for Rec Room's Multiplayer Scripting System
How Will OpenAI Compete?
How a dancer with ALS used brainwaves to perform live
How do I cancel my ChatGPT subscription?
How many products does Microsoft have named 'Copilot'?
How much precision can you squeeze out of a table?
How the AI Bubble Bursts
How the Government Deceived Congress in the Debate over Surveillance Powers (2013)
How the Sriracha guys screwed over their supplier
How to Build Your Own Quantum Computer
How to Not Pay Your Taxes
How to Record and Retrieve Anything You've Ever Had to Look Up Twice
How to Write Unmaintainable Code (1999)
How to breathe in fewer microplastics in your home
How to build a `Git diff` driver
How to install and start using LineageOS on your phone
How to record and retrieve anything you've ever had to look up twice
How to run Qwen 3.5 locally
How to talk to anyone and why you should
How we rebuilt Next.js with AI in one week
How will OpenAI compete?
Hubble Snaps a New Dazzling Photo of the Crab Nebula
HuggingFace Agent Skills
Hugo's New CSS Powers
Human Rights Watch says drone strikes in Haiti have killed nearly 1,250 people
Hybrid Attention
Hydroph0bia – a fixed SecureBoot bypass for UEFI firmware based on Insyde H2O
Hydroph0bia – fixed SecureBoot bypass for UEFI firmware from Insyde H2O (2025)
Hyperlinks in Terminal Emulators
Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon
I Built a Scheme Compiler with AI in 4 Days
I Built an Open-World Engine for the N64 [video]
I Decompiled the White House's New App
I Found 39 Algolia Admin Keys Exposed Across Open Source Documentation Sites
I Pitched a Roller Coaster to Disneyland at Age 10 in 1978
I Ported Coreboot to the ThinkPad X270
I Quit. The Clankers Won
I Traced My Traffic Through a Home Tailscale Exit Node
I am directing the Department of War to designate Anthropic a Supply-Chain Risk
I am directing the Department of War to designate Anthropic a supply-chain risk
I beg you to follow Crocker's Rules, even if you will be rude to me
I built a demo of what AI chat will look like when it's "free" and ad-supported
I built a pint-sized Macintosh
I built a programming language using Claude Code
I built a tool to let you export your X bookmarks and categorize them
I decompiled the White House's new app
I don't know how you get here from "predict the next word."
I found 39 Algolia admin keys exposed across open source documentation sites
I pitched a roller coaster to Disneyland at age 10 in 1978
I ported Linux to the PS5 and turned it into a Steam Machine
I ported Mac OS X to the Nintendo Wii
I put all 8,642 Spanish laws in Git – every reform is a commit
I put my whole life into a single database
I ran Gemma 4 as a local model in Codex CLI
I resigned from OpenAI
I run multiple $10K MRR companies on a $20/month tech stack
I still prefer MCP over skills
I think WebRTC is better than SSH-ing for connecting to Mac terminal from iPhone
I traced $2B in grants and 45 states' lobbying behind age‑verification bills
I tried Karpathy's Autoresearch on an old research project
I tried to prove I'm not AI. My aunt wasn't convinced
I use excalidraw to manage my diagrams for my blog
I wanted to build vertical SaaS for pest control, so I took a technician job
I'm helping my dog vibe code games
I'm reluctant to verify my identity or age for any online services
I've been waiting over a month for Anthropic support to respond
I've been waiting over a month for Anthropic to respond to my billing issue
IBM 3270 Information Display System: Color and Programmed Symbols (1979) [pdf]
IBM Announces Strategic Collaboration with Arm
IBM, sonic delay lines, and the history of the 80×24 display
IPv6 address, as a sentence you can remember
IRIX 3dfx Voodoo driver and glide2x IRIX port
IRS Tactics Against Meta Open a New Front in the Corporate Tax Fight
ISBN Visualization
If AI writes code, should the session be part of the commit?
If DSPy is so great, why isn't anyone using it?
If you thought the code writing speed was your problem; you have bigger problems
If you're running OpenClaw, you probably got hacked in the last week
Iguanaworks has closed down. USB Infrared hardware open source maker
Illinois Introducing Operating System Account Age Bill
Image generation models can think
Implementing automatic eSIM installation on Android
In 2025, Meta paid an effective federal tax rate of 3.5%
In Edison’s Revenge, Data Centers Are Transitioning From AC to DC
In Japan, the robot isn't coming for your job; it's filling the one nobody wants
In Memoriam: John W. Addison, my PhD advisor
In Praise of Stupid Questions
Incident March 30th, 2026 – Accidental CDN Caching
Indefinite Book Club Hiatus
Independent Geophysical Forensic Analysis of the Nordstream Pipeline Sabotage
India's top court angry after junior judge cites fake AI-generated orders
Inferring Car Movement Patterns from Passive TPMS Measurements
Inferring car movement patterns from passive TPMS measurements
Innocent woman jailed after being misidentified using AI facial recognition
InpharmD (YC W21) Is Hiring – Senior Ruby on Rails Developer
Inside Nepal's Fake Rescue Racket
InspectMind AI (YC W24) Is Hiring
Installing Every* Firefox Extension
Installing OpenBSD on the Pomera DM250{,XY?}
Installing a Let's Encrypt TLS Certificate on a Brother Printer with Certbot
Installing every* Firefox extension
Intel 486 CPU announced April 10, 1989
Intel Announces Arc Pro B70 and Arc Pro B65 GPUs
Intel Foundry boss leaves for Qualcomm
Intel XeSS 3: expanded support for Core Ultra/Core Ultra 2 and Arc A, B series
Intel's make-or-break 18A process node debuts for data center with 288-core Xeon
Intelligence is a commodity. Context is the real AI Moat
Intent-Based Commits
Internet outage in Iran reaches 1,008 hours
Interoperability Can Save the Open Web (2023)
Intuitions for Transformer Circuits
Investigating How Long-Distance Couples Use Digital Games to Facilitate Intimacy
Iran launched unsuccessful attack on UK's Diego Garcia
Iran strikes leave Amazon availability zones "hard down" in Bahrain and Dubai
Iran war energy shock sparks global push to reduce fossil fuel dependence
Iran war wreaking havoc on shipping and air cargo, could create global delays
Iran's Ayatollah Ali Khamenei is killed in Israeli strike, ending 36-year rule
Iran's attacks on Amazon data centers in UAE, Bahrain signal a new kind of war
Iran-backed hackers claim wiper attack on medtech firm Stryker
Iran-linked hackers have breached FBI director's personal emails
Ireland shuts last coal plant, becomes 15th coal-free country in Europe (2025)
Is Germany's gold safe in New York?
Is anybody else bored of talking about AI?
Is it a pint?
Isotopic Evidence for a Cold and Distant Origin of Interstellar Object 3I/Atlas
Israel launches strike against Iran, declares state of emergency across country
It looks like the “JVG algorithm” only wins on tiny numbers
Italo Calvino: A Traveller in a World of Uncertainty
Italo Calvino: A traveller in a world of uncertainty
JSIR: A High-Level IR for JavaScript
JSLinux Now Supports x86_64
JSON Formatter Chrome Plugin Now Closed and Injecting Adware
Jack Dorsey says Block employees now bring prototypes, not slides, to meetings
Jails for NetBSD – Kernel Enforced Isolation and Native Resource Control
Jane Street Hit with Terra $40B Insider Trading Suit
January in Servo: preloads, better forms, details styling, and more
Java 26 is here
Java is fast, code might not be
Jazz CRJ9 at New York on Mar 22nd 2026, collision with fire truck on runway
Jeff Bezos Upended the Washington Post
Jemalloc un-abandoned by Meta
Jennifer Aniston and Friends Cost Us 377GB and Broke Ext4 Hardlinks
Jensen Huang says Nvidia is pulling back from OpenAI and Anthropic
Jepsen: MariaDB Galera Cluster 12.1.2
Jiga (YC W21) Is Hiring
Jimi Hendrix was a systems engineer
John Carmack about open source and anti-AI activists
John Deere to pay $99M in right-to-repair settlement
Jolla on track to ship new phone with Sailfish OS, user-replaceable battery
Jolla phone – a full-stack European alternative
Judge blocks Pentagon effort to 'punish' Anthropic with supply chain risk label
Judge finalizes order for Greenpeace to pay $345M in ND oil pipeline case
Jury says Meta knowingly harmed children for profit, awarding landmark verdict
Just two days of oatmeal cut bad cholesterol by 10%
Kagi Product Tips – Customize Your Search Results with URL Redirects
Kagi Small Web
Kagi Translate now supports LinkedIn Speak as an output language
Kaizen (YC P25) Hiring Eng, GTM, Cos to Automate BPOs
Kangina
Kansai Airport has never lost a baggage in the 30 years since it opened
Keeping a Postgres Queue Healthy
Khamenei Dead
Ki Editor - an editor that operates on the AST
Kin: Semantic version control that tracks code as entities, not files
Kyber (YC W23) Is Hiring an Enterprise Account Executive
LLM Writing Tropes.md
LLM plays an 8-bit Commander X16 game using structured "smart senses"
LLM=True
LLMs can be exhausting
LLMs can unmask pseudonymous users at scale with surprising accuracy
LLMs work best when the user defines their acceptance criteria first
Labor market impacts of AI: A new measure and early evidence
Lago (YC S21) Is Hiring
Landmark L.A. jury verdict finds Instagram, YouTube were designed to addict kids
Language Model Contains Personality Subnetworks
Language Model Teams as Distributed Systems
Language model teams as distributed systems
Larry Page has moved to Florida
Last gasps of the rent seeking class?
Lat.md: Agent Lattice: a knowledge graph for your codebase, written in Markdown
Latency numbers every programmer should know
Launch HN: Captain (YC W26) – Automated RAG for Files
Launch HN: Cardboard (YC W26) – Agentic video editor
Launch HN: Cekura (YC F24) – Testing and monitoring for voice and chat AI agents
Launch HN: Chamber (YC W26) – An AI Teammate for GPU Infrastructure
Launch HN: Didit (YC W26) – Stripe for Identity Verification
Launch HN: Freestyle: Sandboxes for AI Coding Agents
Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference
Launch HN: Kita (YC W26) – Automate credit review in emerging markets
Launch HN: OctaPulse (YC W26) – Robotics and computer vision for fish farming
Launch HN: Palus Finance (YC W26): Better yields on idle cash for startups, SMBs
Launch HN: Prism (YC X25) – Workspace and API to generate and edit videos
Launch HN: Relvy (YC F24) – On-call runbooks, automated
Launch HN: RunAnywhere (YC W26) – Faster AI Inference on Apple Silicon
Launch HN: Sentrial (YC W26) – Catch AI Agent Failures Before Your Users Do
Launch HN: Sitefire (YC W26) – Automating actions to improve AI visibility
Launch HN: TeamOut (YC W22) – AI agent for planning company events
Launch HN: Terminal Use (YC W26) – Vercel for filesystem-based agents
Launch HN: Vela (YC W26) – AI for complex scheduling
Launch HN: Voltair (YC W26) – Drone and charging network for power utilities
Launch HN: Voygr (YC W26) – A better maps API for agents and AI apps
Launch an autonomous AI agent with sandboxed execution in 2 lines of code
Launching the Claude Partner Network
Lawmakers say US Military used laser to take down Border Protection drone in TX
Layoffs at Block
Leanstral: Open-source agent for trustworthy coding and formal proof engineering
Learn Claude Code by doing, not reading
Learning Creative Coding
Learning athletic humanoid tennis skills from imperfect human motion data
Learnings from paying artists royalties for AI-generated art
Leaving Google has actively improved my life
Lemonade by AMD: a fast and open source local LLM server using GPU and NPU
Lenovo's New ThinkPads Score 10/10 for Repairability
Let's Get Physical
Let's discuss sandbox isolation
LibreOffice Writer now supports Markdown
LibreOffice and the Art of Overreacting
LibreOffice – Let's put an end to the speculation
LibreSprite – open-source pixel art editor
Libxml2 Enterprise Edition (AGPL, from the previous maintainer)
Lichess and Take Take Take Sign Cooperation Agreement
Lil Finder Guy
Lil' Fun Langs' Guts
LinkedIn uses 2.4 GB RAM across two tabs
Linux Applications Programming by Example: The Fundamental APIs (2nd Edition)
Linux Internals: How /proc/self/mem writes to unwritable memory (2021)
Linux is an interpreter
Lisette a little language inspired by Rust that compiles to Go
Lisp-style C++ template meta programming
LiteLLM (YC W23): Founding Reliability Engineer – $200K-$270K and 0.5-1.0% equity
Little Free Library
Little Free Library Books
Little Snitch comes to Linux, but the core logic is closed source
Little Snitch for Linux – Because Nothing Else Came Close
LittleSnitch for Linux
Living human brain cells play DOOM on a CL1 [video]
LoGeR – 3D reconstruction from extremely long videos (DeepMind, UC Berkeley)
Local Stack Archived their GitHub repo and requires an account to run
Log File Viewer for the Terminal
Londoners are sick of viral videos telling lies about their city
Looks like it is happening
Lost Doctor Who Episodes Found
Love of corporate bullshit is correlated with bad judgment
Lower Price for ChatGPT Business
MAUI Is Coming to Linux
MCP is dead. Long live the CLI
Mac external displays for designers and developers, part 2 (2022)
MacBook M5 Pro and Qwen3.5 = Local AI Security System
Mac mini will be made at a new facility in Houston
Mad Bugs: Vim vs. Emacs vs. Claude
MagicAudio – Free Noise, Echo and Background Music Remover
Make macOS consistently bad (unironically)
Make macOS consistently bad unironically
Making Firefox's right-click not suck with about:config
Making MCP cheaper via CLI
Making Wolfram tech available as a foundation tool for LLM systems
Manjaro website off-line again due to lapsed certificate
Many African families spend fortunes burying their dead
Many SWE-bench-Passing PRs would not be merged
Mario and Earendil
Mathematical methods and human thought in the age of AI
Mbodi AI (YC P25) Is Hiring
Measuring progress toward AGI: A cognitive framework
Media scraper Gallery-dl is moving to Codeberg after receiving a DMCA notice
Medical journal says the case reports it has published for 25 years are fiction
MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU
Megadev: A Development Kit for the Sega Mega Drive and Mega CD Hardware
Memo: A language that remembers only the last 12 lines of code
Men in their 50s may be aging faster due to toxic 'forever chemicals'
Meow.camera
Mercurial Dyson – a plan for the disassembly of planet Mercury
Mercury 2: Fast reasoning LLM powered by diffusion
Mercury 2: The fastest reasoning LLM, powered by diffusion
Mesh over Bluetooth LE, TCP, or Reticulum
Meta Horizon Worlds on Meta Quest is being discontinued
Meta and Google found liable in social media addiction trial
Meta and TikTok let harmful content rise to drive engagement, say whistleblowers
Meta and YouTube Found Negligent in Landmark Social Media Addiction Case
Meta ordered to pay $375M in New Mexico trial over child exploitation
Meta removes ads for social media addiction litigation
Meta told to pay $375M for misleading users over child safety
Meta’s AI smart glasses and data privacy concerns
Meta’s renewed commitment to jemalloc
Meticulous (YC S21) is hiring to redefine software dev
Miasma: A tool to trap AI web scrapers in an endless poison pit
Michael Pollan punctures the AI bubble
Microgpt
Microslop Manifesto
Microsoft BitNet: 100B Param 1-Bit model for local CPUs
Microsoft Creative Writer (1993)
Microsoft PhotoDNA scanning problem
Microsoft bans the word "Microslop" on its Discord, then locks the server
Microsoft blocks trick to unlock native NVMe driver, but workarounds still exist
Microsoft is employing dark patterns to goad users into paying for storage?
Microsoft's "Fix" for Windows 11: Flowers After the Beating
Microsoft's 'unhackable' Xbox One has been hacked by 'Bliss'
Midnight Captain – A midnight commander inspired file manager
Midnight train from GA: A view of America from the tracks as airports struggle
Migrating the American Express Payment Network, Twice
Migrating to the EU
Militaries are scrambling to create their own Starlink
MinIO Is Dead, Long Live MinIO
MiniStack (replacement for LocalStack)
Ministack (Replacement for LocalStack)
Mistral AI Releases Forge
MitID, Denmark's sole digital ID, has been down for over an hour and counting
Modern SQLite: Features You Didn't Know It Had
Monkey Island for Commodore 64 Ground Up
MonoGame: A .NET framework for making cross-platform games
More common mistakes to avoid when creating system architecture diagrams
Most of the US economy is in a recession
Motorola announces a partnership with GrapheneOS Foundation
Mouser: An open source alternative to Logi-Plus mouse software
Moving from GitHub to Codeberg, for lazy people
Moving from WordPress to Jekyll (and static site generators in general)
Mozilla to launch free built-in VPN in upcoming Firefox 149
Mr. Chatterbox is a Victorian-era ethically trained model
Mullvad VPN: Banned TV Ad in the Streets of London [video]
Multifactor (YC F25) Is Hiring an Engineering Lead
Muse Spark: Scaling towards personal superintelligence
Musketeer d'Artagnan's remains believed found under Dutch church
My Homelab Setup
My minute-by-minute response to the LiteLLM malware attack
NASA Artemis II moon mission live launch broadcast
NASA announces major overhaul of Artemis program amid safety concerns, delays
NASA announces overhaul of Artemis program amid safety concerns, delays
NHS staff refusing to use FDP over Palantir ethical concerns
NMAP in the Movies
NRC Issues First Commercial Reactor Construction Approval in 10 Years [pdf]
NRC issues first commercial reactor construction approval in 10 years [pdf]
Naming rights to street auctioned in San Francisco
Nango (YC W23, API Access for Agents and Apps) Is Hiring
Nano Banana 2: Google's latest AI image generation model
NanoClaw's Architecture Is a Masterclass in Doing Less
NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute
Nasdaq's Shame
Native Instant Space Switching on macOS
Ndea (YC W26) is hiring a symbolic RL search guidance lead
Neanderthals survived on a knife's edge for 350k years
Nearby Glasses
Neovim 0.12.0
Nestlé says 413,793 KitKat candy bars stolen en route from Italy to Poland
Netflix Backs Out of Warner Bros. Bidding, Paramount Set to Win
Netscape News Feed Straight Out of the Late 00s
Networking with agents: Put them in the right conversations with Tailscale
Never Bet Against x86
Never Buy A .online Domain
New AirSnitch attack breaks Wi-Fi encryption in homes, offices, and enterprises
New Apple Silicon M4 and M5 HiDPI Limitation on 4K External Displays
New California law requires age verification for all OS accounts
New Washington state law bans noncompete agreements
New York could prohibit chatbot medical, legal, engineering advice
New accounts on HN 10x more likely to use em-dashes
New iron nanomaterial wipes out cancer cells without harming healthy tissue
New laws to make it easier to cancel subscriptions and get refunds
New synthesis of astronomical measurements shows Hubble tension is real
Nightingale – open-source karaoke app that works with any song on your computer
Nihilistic Violent Extremism
No Terms. No Conditions
No evidence cannabis helps anxiety, depression, or PTSD
No one is happy with NASA's new idea for private space stations
No one owes you supply-chain security
No right to relicense this project
No, it doesn't cost Anthropic $5k per Claude Code user
Nobody Gets Promoted for Simplicity
Node.js needs a virtual file system
Noq: n0's new QUIC implementation in Rust
North Korea's 100k fake IT workers net $500M a year for Kim
Notes on Baking at the South Pole
Notes on Lagrange Interpolating Polynomials
Notes on Writing WASM
Nowhere Is Safe
Nowhere is safe
Number Research Inc
Number in man page titles e.g. sleep(3)
Number of UK workers on zero-hours contracts hits record high ahead of crackdown
Nvidia Launches Vera CPU, Purpose-Built for Agentic AI
Nvidia NemoClaw
Nvidia PersonaPlex 7B on Apple Silicon: Full-Duplex Speech-to-Speech in Swift
Nvidia backs AI data center startup Nscale as it hits $14.6B valuation
OCR for construction documents does not work, we fixed it
Obsidian Sync now has a headless client
Obsolete Sounds
Office.eu launches as Europe's sovereign office platform
Oil Is Near a Price That Hurts the Economy
Oil nears $110 a barrel after gas field strike
OkCupid gave 3M dating-app photos to facial recognition firm, FTC says
Ollama is now powered by MLX on Apple Silicon in preview
Olympic Committee bars transgender athletes from women’s events
One item purchased, ten emails
Online astroturfing: A problem beyond disinformation
Open Letter to Google on Mandatory Developer Registration for App Distribution
Open Source Endowment – new funding source for open source maintainers
Open Source Security at Astral
Open source calculator firmware DB48X forbids CA/CO use due to age verification
OpenAI Backs Bill That Would Limit Liability for AI-Enabled Mass Deaths
OpenAI Fires an Employee for Prediction Market Insider Trading
OpenAI agrees with Dept. of War to deploy models in their classified network
OpenAI closes funding round at an $852B valuation
OpenAI fires an employee for prediction market insider trading
OpenAI is walking away from expanding its Stargate data center with Oracle
OpenAI reaches deal to deploy AI models on U.S. DoW classified network
OpenAI resets spending expectations, from $1.4T to $600B
OpenAI – How to delete your account
OpenAI's $110B funding round (investments from Amazon, Nvidia, SoftBank)
OpenAI's fall from grace as investors race to Anthropic
OpenAI, the US government and Persona built an identity surveillance machine
OpenBSD on SGI: A Rollercoaster Story