
Markdown is the universal language of AI agents. It is also, increasingly, the reason nobody reads what they produce.
That tension sits at the center of a debate that has been building quietly among AI engineers and is now breaking into the open. In early May, Andrej Karpathy posted a recommendation: ask your LLM to structure its response as HTML, then view it in a browser. Around the same time, Thariq Shihipar, an engineering lead on the Claude Code team at Anthropic, published a detailed argument, "The Unreasonable Effectiveness of HTML," explaining why he had stopped using markdown altogether for AI-generated outputs.
Their reasoning is not about aesthetics. It is about a structural mismatch between what AI agents can produce and what humans can actually absorb.
How markdown became the default
Markdown won the AI output format race for three practical reasons: it is cheap, it is readable by machines, and it is easy to edit by hand.
On cost, the gap is significant. Converting HTML to markdown reduces token usage by roughly 68% for clean content and up to 87% for real-world web pages. Cloudflare launched a "Markdown for Agents" feature specifically to strip HTML down to markdown before feeding it to AI systems, cutting inference costs dramatically.
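The size of that gap is easy to check for any given page. A minimal sketch, assuming the third-party markdownify and tiktoken packages; the exact savings depend on how heavy the markup is:

```python
# Compare token counts for raw HTML versus its markdown conversion.
# Assumes the third-party packages `markdownify` and `tiktoken`; the
# cl100k_base encoding is used by many recent OpenAI models.
import tiktoken
from markdownify import markdownify

def token_count(text: str) -> int:
    return len(tiktoken.get_encoding("cl100k_base").encode(text))

def compare_formats(html: str) -> dict:
    md = markdownify(html)  # strip tags, keep headings, lists, and links
    html_tokens, md_tokens = token_count(html), token_count(md)
    return {
        "html_tokens": html_tokens,
        "markdown_tokens": md_tokens,
        "fraction_saved": 1 - md_tokens / html_tokens,
    }

# compare_formats(page_source) on a heavily styled page typically shows a
# large reduction; on an already-minimal page, the gap shrinks.
```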
On machine comprehension, markdown actually outperforms HTML. In GPT-based table extraction benchmarks, markdown representations achieved 60.7% accuracy compared to 53.6% for HTML tables. RAG pipelines see up to 35% accuracy improvement when ingesting markdown over raw HTML.
And on editability, markdown is hard to beat. You can open a .md file in any text editor, make changes, and commit them to version control with clean, readable diffs. HTML diffs are noisy and hard to review.
These advantages are real. They explain why every AI coding tool, from Cursor to GitHub Copilot to Claude Code, defaults to markdown for plans, specs, and documentation. But they also reveal the assumption baked into the format: that the human on the other end is going to read and edit the file manually.
That assumption is breaking down.
The reading problem
Harvard Business Review published a study in March 2026 that coined the term "AI Brain Fry." Workers with high AI oversight reported 19% greater information overload, 14% more mental effort, and 33% more decision fatigue compared to those with low AI oversight. A separate Fortune analysis found that time spent on email doubled after AI tool adoption, while focused work sessions fell by 9%.
The problem is not that AI writes badly. The problem is that AI writes too much, and markdown does nothing to help humans process the volume.
Thariq Shihipar put it bluntly: "I tend to not actually read more than a 100-line markdown file, and I certainly am not able to get anyone else in my organization to read it." This matches what most teams experience. AI agents produce 200-line implementation plans, detailed specs, and multi-page reports. The output is technically correct and structurally sound. And most of it goes unread.
The neuroscience supports this. Roughly 30% of the human cerebral cortex is dedicated to visual processing, according to Felleman and Van Essen's foundational cortical mapping research. Hearing gets 3%. Touch gets 8%. Vision is, as Karpathy put it, "the 10-lane superhighway of information into the brain." Markdown barely uses it. Bold text, headers, and bullet points are the entirety of markdown's visual toolkit.
Where HTML changes the equation
HTML does not make AI agents smarter. It makes their output consumable.
The difference is information density. HTML can represent tabular data, styled layouts, SVG diagrams, interactive elements with JavaScript, spatial relationships with absolute positioning, and embedded code snippets that actually run. Markdown forces AI agents to approximate all of this with ASCII art and Unicode characters.
Karpathy framed this as a progression that mirrors how computing interfaces have always evolved: raw text, then markdown, then HTML, and eventually interactive neural video. The pattern holds across the history of computing, from command lines to GUIs to touchscreens. Each step traded efficiency for comprehension.
Thariq's examples make the practical case. He uses HTML for implementation plans with embedded mockups and code snippets. For code review artifacts that render actual diffs with inline annotations, color-coded by severity. For interactive prototypes where sliders let you tune parameters and a "copy as JSON" button exports the result back into the coding session. For research reports with SVG flowcharts and tabbed navigation.
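As a toy illustration of the slider-and-export pattern (not Thariq's actual artifact), an agent can emit a single self-contained page with one tunable parameter and a button that copies the result back out as JSON. The file name and parameter below are invented:

```python
# Toy version of the slider-and-export pattern: a self-contained HTML
# artifact with one tunable parameter and a "copy as JSON" button.
# The file name and parameter name are illustrative only.
from pathlib import Path

PAGE = """<!doctype html>
<html><body>
  <label>threshold
    <input id="t" type="range" min="0" max="100" value="50"
           oninput="v.textContent = this.value">
  </label>
  <span id="v">50</span>
  <button onclick="navigator.clipboard.writeText(JSON.stringify(
      {threshold: Number(document.getElementById('t').value)}))">
    Copy as JSON
  </button>
</body></html>"""

Path("prototype.html").write_text(PAGE, encoding="utf-8")
# Open prototype.html in a browser, tune the slider, then paste the
# copied JSON back into the coding session.
```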
The vibe coding movement is accelerating this. As developers increasingly describe what they want in natural language and let agents write the code, the output format matters more than ever. You still need to verify what the agent built, and a rendered HTML preview communicates that far more efficiently than scrolling through raw code in a terminal.
The shareability advantage is just as significant. Markdown files require a renderer or an attachment. HTML opens natively in any browser. When you can share an agent's output as a URL, the likelihood of stakeholder engagement goes up substantially.
Enterprise platforms have already decided
While the developer community debates, enterprise AI platforms have been quietly building rich output systems for years.
Salesforce Agentforce processes over 4 million sessions across 133,000+ agents using Adaptive Response Formats, a system that converts LLM text responses into structured UI components like carousels, rich choice buttons, and media cards. Their engineering team documented an interesting problem during development: early versions "over-formatted" responses, turning simple yes/no answers into full UI components. The lesson was that rich output needs to match the complexity of the information.
Microsoft's Copilot Studio uses Adaptive Cards, a platform-agnostic format for rich interactive content. ServiceNow's Now Assist displays agent results as actionable cards with source links and step-by-step progress tracking.
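Adaptive Cards are declarative JSON that the host application renders, which keeps the agent out of the presentation layer entirely. A minimal card following the public adaptivecards.io schema, with invented content, built here as a Python dict:

```python
# A minimal Adaptive Card payload following the public adaptivecards.io
# schema. The content is invented; an agent platform would populate it and
# hand the JSON to the host application (e.g., Teams) to render.
import json

card = {
    "type": "AdaptiveCard",
    "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
    "version": "1.5",
    "body": [
        {"type": "TextBlock", "text": "Deployment check complete", "weight": "Bolder"},
        {"type": "FactSet", "facts": [
            {"title": "Services scanned", "value": "14"},
            {"title": "Failing health checks", "value": "2"},
        ]},
    ],
    "actions": [
        {"type": "Action.OpenUrl", "title": "Open full report",
         "url": "https://example.com/report"},
    ],
}

print(json.dumps(card, indent=2))
```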
Google went further with A2UI, an open protocol where agents request pre-approved UI components rather than generating raw HTML. The distinction matters for security: instead of trusting agents to write safe HTML, A2UI lets agents declare what they want to show and the platform handles rendering.
All three major AI labs have invested in rich output for their consumer products too. Anthropic's Claude Artifacts has generated "tens of millions" of interactive HTML outputs. OpenAI added HTML and React rendering to ChatGPT Canvas. These are not experiments. They are production features with massive adoption.
The signal from every major platform is the same. When agents talk to humans, text alone is not enough. The platforms that orchestrate agents across multiple models and workflows need output formats that match the complexity of what those agents produce.
The tradeoffs are real
HTML is not a free upgrade. The costs are concrete.
Token usage is the most obvious. Clean HTML costs 2 to 3 times more tokens than equivalent markdown. Real-world HTML with CSS and JavaScript can balloon to 8 to 10 times more. With context windows now stretching past one million tokens, this matters less than it did in 2023, but it still adds up at scale.
Security is a harder problem. Raw HTML output from AI agents can contain JavaScript, which opens the door to cross-site scripting and injection attacks. Google's A2UI protocol exists specifically because enterprise security teams cannot accept agents writing arbitrary HTML that runs in production environments.
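Short of adopting a protocol like A2UI, the common mitigation is to sanitize agent-generated HTML against an allowlist before it ever reaches a browser. A minimal sketch, assuming the third-party bleach package; the allowlist below is illustrative, not a vetted security policy:

```python
# Sanitize agent-generated HTML against an allowlist before rendering it.
# Assumes the third-party `bleach` package; the allowed tags and attributes
# here are illustrative, not a vetted security policy.
import bleach

ALLOWED_TAGS = ["p", "h1", "h2", "h3", "ul", "ol", "li", "table", "thead",
                "tbody", "tr", "th", "td", "pre", "code", "strong", "em", "a"]
ALLOWED_ATTRS = {"a": ["href", "title"]}

def sanitize_agent_html(raw: str) -> str:
    # Drops <script> elements, inline event handlers, and any tag or
    # attribute that is not explicitly allowed.
    return bleach.clean(raw, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRS,
                        strip=True)
```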
Version control suffers too. HTML diffs are noisy, full of closing tags and attribute changes that obscure the actual content change. This is one of the biggest downsides Thariq himself acknowledged.
And there is a counter-argument worth taking seriously. Kurtis Redux published "The Unreasonable Ineffectiveness of HTML" in direct response, arguing that the switch "chases visual gloss at the expense of source readability, security, ecosystem compatibility, and reviewability." For codebases where agents collaborate with humans through shared files, markdown's simplicity remains a genuine advantage.
Which format makes better agents?
The answer depends on who the agent is talking to.
For agent-to-agent communication and machine processing, markdown wins clearly. It is cheaper, more accurately parsed, and easier to version. When an AI agent produces output that another system will consume, markdown's constraints are strengths.
For agent-to-human communication, HTML wins just as clearly. When the goal is for a person to understand, evaluate, and act on what an agent produced, visual clarity and information density outweigh token efficiency. The 19% increase in information overload that HBR documented is not going to be solved by writing better markdown. It will be solved by presenting information in formats the brain can actually process.
The best agent platforms will support both. They will use markdown and structured data internally for agent reasoning, memory, and inter-agent communication. And they will render rich, visual, interactive outputs for the humans who need to review, approve, and act on agent work. The platform layer is where this translation happens, converting agent reasoning into human-readable results without forcing the agent to write presentational code.
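In practice that translation layer can be thin: keep the agent's markdown as the machine-facing source of truth and render HTML only at the human review boundary. A minimal sketch, assuming the third-party Python markdown package; the page template and file names are illustrative:

```python
# Keep markdown as the agent-facing source of truth; render HTML only at
# the human review boundary. Assumes the third-party `markdown` package;
# the page template and file names are illustrative.
import markdown
from pathlib import Path

TEMPLATE = """<!doctype html>
<html><head><meta charset="utf-8">
<style>body {{ max-width: 48rem; margin: 2rem auto; font-family: sans-serif; }}</style>
</head><body>{body}</body></html>"""

def render_for_review(plan_md: str, out_path: str = "review.html") -> str:
    body = markdown.markdown(plan_md, extensions=["tables", "fenced_code"])
    Path(out_path).write_text(TEMPLATE.format(body=body), encoding="utf-8")
    return out_path

# Agents keep editing the .md file and diffing cleanly in version control;
# reviewers open the rendered page instead of the raw text.
```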
Karpathy's progression, text to markdown to HTML to interactive neural video, is not a prediction about a distant future. The first three steps are happening now. The question for enterprises is whether their agent infrastructure is keeping up, or whether they are still asking people to read 200-line markdown files that nobody finishes.





