
Topic analysis

Agent Reading Test

A benchmark that tests how well AI coding agents can read web content. Point your agent at the test, get a score, compare across platforms.

AI coding agents (Claude Code, Cursor, GitHub Copilot, and others) read documentation websites as part of their workflows. But most agents hit silent failure modes: content gets truncated, CSS buries the real text, client-side rendering delivers empty shells, and tabbed content serializes into walls of text where only the first variant is visible. This benchmark surfaces those failure modes.

Each test page is designed around a specific problem documented in the Agent-Friendly Documentation Spec. The pages embed canary tokens at strategic positions. But instead of asking agents to hunt for tokens (which games relevance filters), the test gives the agent realistic documentation tasks. Only after the agent completes all tasks does it learn about the canary tokens and report which ones it encountered. You paste the results into a scoring form.

The test pages:

- A 150K-character page with canary tokens at 10K, 40K, 75K, 100K, and 130K. Maps exactly where your agent's truncation limit kicks in (see the sketch after this list).
- 80K of inline CSS before the real content. Tests whether agents distinguish CSS noise from documentation.
- A client-side rendered page whose content only appears after JavaScript executes. Most agents see an empty shell.
- 8 language variants in tabs, with canary tokens in tabs 1, 4, and 8. Tests how far into serialized tab content the agent reads.
- A page that returns HTTP 200 with a "page not found" message. Tests whether the agent recognizes it as an error page.
- Markdown with an unclosed code fence, so everything after it becomes "code." Tests markdown parsing awareness.
- Different canary tokens in the HTML and markdown versions. Tests whether your agent requests the better format.
- A 301 redirect to a different hostname. Most agents won't follow it (a security measure); the canary is on the other side.
- Three cloud platforms with identical "Step 1/2/3" headers. Tests whether agents can determine which section is which.
- Real content buried after 50% navigation chrome. Tests whether agents read past the sidebar serialization.

The test has a maximum score of 20 points. Each canary token found earns 1 point, and correct answers to qualitative questions earn 1 point each. The answer key has the full breakdown. A perfect score is unlikely for any current agent. The tests are calibrated so that each failure mode will realistically affect at least some agents. A typical score range for current agents is probably 14-18 out of 20, depending on the platform's web fetch pipeline.

Agent Reading Test is a companion project to the Agent-Friendly Documentation Spec, which defines 22 checks across 8 categories evaluating how well documentation sites serve AI agent consumers. The spec is grounded in empirical observation of real agent workflows. This benchmark flips the perspective: instead of testing the documentation site, it tests the agent. The same failure modes apply, but here we're measuring which agents handle them gracefully and which don't.

Source code: github.com/agent-ecosystem/agent-reading-test

Created by Dachary Carey • Licensed under CC BY 4.0
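To make the canary mechanism concrete, here is a minimal sketch of how the truncation check on the 150K-character page could be scored after a fetch. The URL, token strings, and offsets below are hypothetical placeholders; the benchmark's actual pages, token values, and scoring form differ.

```python
import urllib.request

# Hypothetical values for illustration; the real test page and token strings differ.
PAGE_URL = "https://example.com/agent-reading-test/long-page"
CANARIES = {
    "CANARY-10K": 10_000,
    "CANARY-40K": 40_000,
    "CANARY-75K": 75_000,
    "CANARY-100K": 100_000,
    "CANARY-130K": 130_000,
}


def truncation_report(fetched_text: str) -> int:
    """Print which canary tokens survived the fetch and return how many were found.

    The deepest missing token approximates where the fetch pipeline truncated the page.
    """
    found = 0
    for token, offset in sorted(CANARIES.items(), key=lambda kv: kv[1]):
        present = token in fetched_text
        found += present
        print(f"{token} (embedded near {offset:,} chars): {'found' if present else 'MISSING'}")
    return found


if __name__ == "__main__":
    with urllib.request.urlopen(PAGE_URL) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    score = truncation_report(body)
    print(f"Canary points from this page: {score} of {len(CANARIES)}")
```

In the actual benchmark the agent reports the tokens itself after finishing its tasks rather than running a script like this; the sketch only shows why missing deeper tokens map to a specific truncation limit.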

Heat score: 1
Sources: 1
Platforms: 1
Relations: 8
First seen: Apr 7, 2026, 2:56 AM
Last updated: Apr 7, 2026, 4:00 AM

Why this topic matters

Agent Reading Test is currently shaped by signals from 1 source platform. This page organizes AI analysis summaries, 1 timeline event, and 8 relationship edges so search engines and AI systems can understand the topic's factual basis and propagation arc.

Keywords

10 tags
benchmark, tests, how, well, coding, agents, can, read, web, content

Source evidence

1 evidence item

Timeline

Agent Reading Test

Apr 7, 2026, 2:56 AM

Related topics

Research-Driven Agents: When an agent reads before it codes

AI coding agent, code optimization, CPU inference, llama.cpp, performance improvement, research phase, cloud VMs, flash attention
Relation score: 0.90

Moving from WordPress to Jekyll (and static site generators in general)

static site generator, migration, CMS, SEO, developer tools, content optimization, web development
Relation score: 0.80

Show HN: Hippo, biologically inspired memory for AI agents

biologically, inspired, memory, agents, secret, good, isn, remembering, more
Relation score: 0.75

The cult of vibe coding is insane

cult, vibe, coding, insane, had, leak, source, code, people, have
Relation score: 0.70

Show HN: Modo – I built an open-source alternative to Kiro, Cursor, and Windsurf

built, open, source, alternative, plans, codes, what, adds
Relation score: 0.60

Show HN: Modo – I built an open-source alternative to Kiro, Cursor, and Windsurf

built, open, source, alternative, plans, codes, what, adds
Relation score: 0.80

Show HN: Hippo, biologically inspired memory for AI agents

biologically, inspired, memory, agents, secret, good, isn, remembering, more
Relation score: 0.80

The cult of vibe coding is insane

cult, vibe, coding, insane, had, leak, source, code, people, have
Relation score: 0.70