Christian Heilmann

WebMCP – a much needed way to make agents play with rather than against the web

Monday, February 16th, 2026 at 6:06 pm

WebMCP is an exciting W3C proposal that just landed in Chrome Canary for you to try out. The idea is that you can use a few HTML attributes on a form or register JavaScript tool methods to give agents direct access to your content. This gives us as content providers and web developers an active way to point agents to what they came for, rather than dealing with tons of traffic from scripts that haphazardly and clumsily try to emulate a real visitor.

Agents vs. the web

The current relationship between agentic AI and the web is predatory, wasteful and fraught with error. What is happening is that agents scrape web sites, take screenshots and scan those, or keep trying to fill out form fields and click buttons to get to content that was meant for real, human visitors. Under the hood, agents use the browser automation we created for testing, both of browsers and of web apps. But instead of going through a defined test suite with knowledge of the structure of the web app, agents brute-force their way through. This is exactly what we’ve been hardening the web against because of malware, spammers and content thieves, and companies like Cloudflare make a good living providing the tools for that. Publishing on the web is full of hazards. Just publish a free-form input that stores entries in a database and boy, will you have to deal with a mess within seconds. Spam and malware bots are at the ready to find any vulnerability to post their content to your site, and XSS protection is the biggest game of whack-a-mole I hate having to play.

Agents vs. user wallets

For the users of agents this means that they burn through tokens much quicker, as the agent grabs web content that is bloated, slow to parse and often needs several authentication steps. WebMCP can improve this as it allows content providers to show agents where the content to index is and what to put into form fields to reach the content they came for. Or – even better – it gives agents programmatic access to trigger functionality and get content, instead of filling out a form, running a site-wide search and filtering the results afterwards.
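To make this tangible, here is a minimal sketch of what registering such a tool could look like. WebMCP is an early proposal and its API surface is still in flux, so the object and method names below are assumptions based on the general direction (a named, described tool with a parameter schema and a callback) rather than the final API:

// Hypothetical sketch – WebMCP is an early proposal, so the exact names
// (window.agent, provideContext, inputSchema, execute) may well differ.
if ('agent' in window && typeof window.agent.provideContext === 'function') {
  window.agent.provideContext({
    tools: [
      {
        name: 'search-articles',
        description: 'Search the blog archive and return matching articles',
        // JSON-Schema-style description of the expected input
        inputSchema: {
          type: 'object',
          properties: {
            query: { type: 'string', description: 'Search term' }
          },
          required: ['query']
        },
        // The agent calls this directly instead of driving the search form
        async execute({ query }) {
          // Placeholder endpoint – wire this up to your existing search logic
          const response = await fetch('/search.json?q=' + encodeURIComponent(query));
          const results = await response.json();
          return results.map(({ title, url }) => ({ title, url }));
        }
      }
    ]
  });
}

The point is not the exact syntax but the shift: the site tells the agent what it can do and in which format, instead of the agent guessing its way through the rendered page.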

Agents are now first-class citizens

In essence, this standard and its implementation in Chrome mean that agents have become first-class citizens of the world wide web – a future we as publishers have to deal with. The good thing is that the web is pretty much ready for this, as we’ve done it before for search engine bots, syndication services and many other automated travellers of the information superhighway.

The web was designed to be machine readable!

The thing that annoys me about this is that we are re-inventing the wheel over and over again. When the web came around, it was an incredibly simple and beautiful publishing mechanism. HTML described the content and all you needed to do was put a file on a server:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title></title>
    <meta name="description" content="…">
    <meta name="keywords" content="…">
    <meta name="author" content="…">
</head>
<body>
    <header></header>
    <main>
        <article></article>
    </main>
    <footer></footer>
</body>
</html>

The header, main and footer elements not only help assistive technologies understand the structure of the page, they also help search engines and agents find the content they are looking for. A clear description and keywords help search engines understand what the page is about and index it correctly. A good title makes it easy for users to see what the page is about. Together, the meta tags and the structure of the page make it easy for agents to find the content they need and to understand how to interact with it.

The web was designed to be machine readable and to allow for easy indexing and syndication. We had meta tags for search engines, we had sitemaps, we had RSS feeds and APIs. We had all the tools we needed to make our content discoverable and accessible to machines. But instead of using those tools, we have been building more and more complex web pages that are designed for human consumption and then trying to scrape them with agents. This is not only inefficient but also disrespectful to the web and its creators.

Semantic HTML would be a great thing for agents!

But the agent creators aren’t to blame for this; they are just trying to get the content they need to provide their users with the best possible experience. The problem is that we as content providers have given up on semantic HTML and machine readability in favour of flashy designs and complex interactions that are meant to impress human visitors but are a nightmare for agents to parse and understand. They are often a nightmare for human visitors as well, but that’s a different topic. I’ve been advocating for semantic HTML for decades, as I just love that it means my content comes with a description and interaction for free. But for decades now we have been fighting a new breed of developers who see the web as a compilation target for their JavaScript and not as a publishing platform. Why bother with semantic HTML when you can just throw a div on the page and style it to look like anything you want? Why bother with meta tags when you can just stuff your content with keywords and hope for the best?
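To make the difference concrete, compare a div styled to look like a button with the real thing – the native element describes itself to assistive technology, search engines and agents for free:

<!-- tells machines nothing: it is just a styled box with a click handler -->
<div class="btn btn--primary" onclick="…">Search</div>

<!-- describes itself: announced as a button, keyboard accessible, submits the form -->
<button type="submit">Search</button>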

Meta information like description, keywords and author is still there, but it is often ignored or misused. The same goes for sitemaps and RSS feeds. We have been so focused on making our content look good for humans (and act and look like native apps) that we have neglected the machine readability of our content. And we have been focusing hard on making our content look good for search engines, which is a different kind of machine than agents. The meta description, title and keywords had a short life span of usefulness, as search engines quickly learned to ignore them and rely on the actual content of the page, because the meta content was so often misleading or stuffed with keywords. Instead of using these built-in mechanisms of the web, we added tons of extra information to the HTML head for Twitter, Facebook and many other services, some of which are dead by now and just add to the overall landfill of forgotten bespoke HTML solutions. Maybe this is a good time to read up on meta tags and on alternative display modes of your content connected via LINK elements, beyond the 34 CSS files and 20 fonts.
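For reference, pointing machines at a structured, alternative representation of your content has been a one-liner per format for ages – these link types are well established (the URLs here are placeholders):

<link rel="alternate" type="application/rss+xml" title="Blog feed (RSS)" href="/feed/">
<link rel="alternate" type="application/atom+xml" title="Blog feed (Atom)" href="/atom/">
<link rel="canonical" href="https://example.com/webmcp-article/">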

Will WebMCP get adoption or will we take another loop around the conversion tree?

The question is: will we use this opportunity to make the web better for everyone, or will we continue to build bloated and inefficient web pages that are designed for human consumption or – worse – optimised for developer convenience? Will providers of agent services embrace this standard, or will they discard it as a nice-to-have and keep brute-forcing their way through the web, or find other ways to make the web cheaper for agents to read? Cloudflare just introduced Markdown for Agents – a service that turns your already rendered HTML with its thousands of DIVs and unreadable class names into structured Markdown. Markdown, a non-standardised format that just caused a scary security issue in Windows Notepad.

Alternative content has been a staple of Web 2.0

We have had the tools for quite a while; many content providers offer feeds and APIs you and your agent can play with. Did you know, for example, that WordPress has a built-in REST API that gives you access to all the content of a WordPress site? You can use it to get the content you need without having to scrape the web page. Terence Eden wrote a great article about how to use the WordPress REST API to get content, with the lovely title Stop crawling my HTML you dickheads – use the API!.
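As a quick illustration – assuming a standard WordPress install with the REST API enabled and example.com standing in for the real site – getting the latest posts that match a search term is a single HTTP request:

// Query the WordPress REST API instead of scraping the rendered page.
// example.com is a placeholder for any WordPress site with the API enabled.
const response = await fetch(
  'https://example.com/wp-json/wp/v2/posts?search=webmcp&per_page=5'
);
const posts = await response.json();

for (const post of posts) {
  // titles and excerpts come back as rendered HTML strings
  console.log(post.title.rendered, '→', post.link);
}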

Findability has always been the issue with this. Remember the incredibly simple and powerful idea of Microformats? They were a way to add semantic meaning to your content by using a few agreed-upon CSS class names, making it more machine readable and accessible without changing the way it looked for human visitors. But they never really took off, because they were not supported by search engines or surfaced to end users in browsers. A great idea, ahead of its time, that never caught on.
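For anyone who missed that era, this is roughly what an hCard looked like – ordinary markup plus a handful of agreed-upon class names that parsers could pick up (name, URL and organisation here are placeholders):

<p class="vcard">
  <a class="url fn" href="https://example.com">Example Author</a>,
  <span class="role">web developer</span> at
  <span class="org">Example Corp</span>
</p>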

I am on team WebMCP, are you?

With WebMCP, we have the opportunity to go back to the roots of the web and make our content truly machine readable and accessible. We can use the new attributes and methods to point agents to the content we want them to index and to give them the information they need to understand it. This is a chance to make the web a better place for both humans and machines, and to create a more symbiotic relationship between the two: agents can find and use the content they need more efficiently, and publishers get more control over how their content is accessed and used. This is excellent news for the future of the web and for the future of AI, and I can’t wait to see how it evolves.

