Christian Heilmann

A more complicated web

Tuesday, January 15th, 2019 at 5:03 pm

One of the amazing things about the web used to be its simplicity. It was not too hard to become your own publisher on it. You either used one of the now defunct services like GeoCities, Xoom, Apple Web Pages, Google Pages and so on… Or you got a server, learned about HTML and CSS and a dash of JavaScript and created your own site. Training materials were online and largely free and open.

The more important thing to me was that there was a sense of adventure and exploration. Many of us took our first steps as web developers by changing colours on a GeoCities or NeoPets web site. We looked at the source code. We used what we had and made it work – no matter how convoluted. That way we discovered what we now consider terrible ideas, like layout tables or inline styles. There was no guide to follow – it was the thrill of beating the system and making it do something it wasn’t meant to. It was our cleverness that got us there – not picking from a huge range of choices and finding one that does the job. I loved the times when online magazines about web design talked about CSS techniques like sliding doors and what image replacement technique to use and not “which is the best framework to get started” or “which browser is the fastest this month”.

A read-write web

The big success of the web was that everybody could take part and the barriers to entry were low. It was a read-write web: you learned the trade by using the medium. This was the big breakthrough. You didn’t learn sound production by listening to the radio. You didn’t learn how to make movies by watching TV. Old school media needed many experts to work together to produce the final product. On the web, things seemed much easier. And being able to peek under the hood with a view-source was a great opportunity.

This is still the idea of the indie web, and there are many great ways to be your own publisher. And – maybe even more importantly – to own your publishing platform and control how your content gets to the end users. I consider this incredibly important but I am torn about what happens in that area.

I’m disappointed that we allowed self-publishing on the web to become a niche experience again. But the more problematic part to me is that outside the indie web movement there is a general call to go back to when the web was simpler and to fight the siren song of Facebook by running our own blogs. First of all, fighting Facebook is fighting the most finely honed Skinner box and peer pressure machinery out there. Secondly, it is not as simple to run your own web site these days as it used to be.

The problem that I see though is that there is a romanticised view of the realities of the web today. In the following few paragraphs I will point out a few things that broke along the way to the dream of an open web that is simple to contribute to. These are based on 20 years of experience in this field, working as a web developer, server admin, in security and on browsers and standards.

I don’t want them to discourage anyone from taking part in the web. But I am tired of the message that “everything was simpler back in the days” and that “we should go back to that”. Running a web site means you take on responsibility for your users and – to a degree – the open web. Any system is only as strong as its weakest link.

The gamed web

The web isn’t a cool geek playground any longer. It is a vital part of everyday life. And decades of trying to find a way to monetise something open and decentralised took their toll. When I look back at when I started publishing on the web there was a genuine sense of “build it and they will come”. Or, to be more precise, “write it and they will come” – as good content, structured in a clear way, was the big winner. To a degree, it still is, but the question is who will come.

Put an email link on the web and you will get 95% spam, 3% people trying to sell you their content services and 2% genuine requests. Have a comment option on your web product and things are worse. You will either have to share your content with a third party doing spam protection for you or drown in it. A huge part of web traffic these days is bots and scripts. Which is a downside of a simple system designed to be open.

Good content still gets you found. But it also invites a lot of people to quote, steal or find some other way to associate their – often terrible – products with it. It is damn easy to set up a web product full of scraped content with lots of link optimisation. Lazy SEO consultants have been doing it for years.

Take this blog. I state in no uncertain terms that it is my work and that I don’t publish third party content. Yet I get about 50 emails a week from people offering me their articles, infographics or videos to publish for a link back. I have even been approached by companies in direct competition with the product I work on, offering me money for each download of theirs.

Fact is that when you publish on your own site, you inherit a whole community of people you don’t want and you need to deal with them. You need to factor this time in.

The abused web

What we consider a way to express ourselves on the web – our personal web site – is a welcome opportunity for attackers. You may think that your little home on the web isn’t interesting to attackers. It probably isn’t. But it can be recruited as a part of a botnet or to store illegal and malicious content for re-distribution.

Publish any form, or any non-paranoid display of user-entered or URL data, and you will get lots of hacking attempts. So we need to be constantly vigilant about this. It may look like nothing when a security tool shows a JavaScript alert on your page, but it isn’t. To an attacker this means they can access your server and store whatever they want, scan for more credentials and create their own users. Unless you have access to the server logs, you often don’t notice unauthorised use. Often with shared virtual hosting, you don’t have that access. And even if you do but lack the tools or knowledge it can be months before you realise someone is abusing your server. I did.
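
To make the "paranoid display" part concrete, here is a minimal sketch in Python of escaping user-entered data before it ends up in a page. The function name render_comment and the sample values are made up for illustration. Escaping is only one layer of defence, not a complete answer to the attacks described above.

```python
from html import escape


def render_comment(author: str, text: str) -> str:
    """Return an HTML fragment with both user-supplied values escaped."""
    # escape() replaces &, < and >, and by default also " and ',
    # so the browser displays the input as text instead of parsing it as markup.
    return f"<p><strong>{escape(author)}</strong>: {escape(text)}</p>"


if __name__ == "__main__":
    # A typical probe: if this ran as script, the page would be open to abuse.
    print(render_comment("Mallory", "<script>alert(1)</script>"))
    # prints <p><strong>Mallory</strong>: &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The same rule applies to anything read from the URL: treat it as untrusted text until it has been escaped for the place it is written into.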

Any chance to publish content is a possible attack vector. If you want to hear a real horror story about this, check out what Remy Sharp went through over the years with JSBin.

To put this in other words:

If it is easy for you to quickly FTP some content to your web product, it is easy for everybody.

Which brings me to the last part of our open web world.

A new level of technical complexity

Again, I don’t want to discourage people from taking part in the open web and I am 100% behind the message that we need to own our content. But I also want to make sure that when we tell people to do that, we also tell them about the responsibilities and dangers.

The web of old had a few attack vectors but now the game has changed. Our goal as web standards and browser makers shifted some time ago. It wasn’t only about offering and displaying web content. It was to match what native apps offered. This was a necessity to keep the web alive in a world of mobile devices. It had to answer the different challenges of mobile connectivity. That way we made the web a lot more complicated. We have databases, offline functionality and storage, workers and can use and create binary code in the browser. In CSS we have layout tools that aren’t abuse of position and float. We can generate and manipulate images with gradients, drop shadows and filters. We can generate sound and access cameras and sensors. It is a wonderful time to be a web developer.

One big change in this new functionality of the web was the extensible web manifesto. In it we rightfully demanded more transparency and access to the low-level functionality of browsers. We didn’t want “magical functionality” on the web that did things. We wanted more detailed access to how browsers work and how they show the things we defined in our markup. Thus we created a much more complex web. More access means more responsibility. And more responsibility demands more insight and knowledge.

Lately I got a few bug reports about scripts I wrote to work with HTML5 canvas. People complained that Chrome reported that tainted canvas data was not available. It turns out that people had downloaded my script and used it in a local file in the browser. Almost every newer API in the browser requires the page to be served over HTTP, or better still over HTTPS, which on your own machine means running a local server. This is now a given – and it means new developers need to step up and we need to train them accordingly.
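
Running a local server does not have to be a big job. As a rough sketch, assuming Python 3.7 or newer is installed, the standard library can serve a folder over HTTP on your own machine. The names make_local_server and serve, and the port 8000, are arbitrary choices for this example.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_local_server(directory: str = ".", port: int = 8000) -> ThreadingHTTPServer:
    """Create a static file server for `directory`, reachable only from this machine."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Binding to 127.0.0.1 keeps the server off the network.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve(directory: str = ".", port: int = 8000) -> None:
    """Serve `directory` until interrupted with Ctrl+C."""
    server = make_local_server(directory, port)
    print(f"Serving {directory} on http://127.0.0.1:{server.server_address[1]}/")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling serve(".") in the folder that holds the page and opening the printed address loads the files over HTTP instead of as a local file. This is meant for development on your own machine only, not for hosting a live site.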

So, to me, there is no such thing as going back to the good old web where everything was simple. It never was. What we need now to counter the siren call of closed garden publishers is to make it easier to publish on the web, to control your data and to protect that of your users. This isn’t a technical problem – it is one of user interfaces, services and tools that make the new complexity of the web manageable. I’m tired of complaints about people using frameworks when there is a simpler alternative. I am tired of the argument of “too much JavaScript”.

Every feature of an interface isn’t only an opportunity but also a choice. And it costs some effort to tune it out for as long as you don’t need it. When we introduce new people to the web these days we often overwhelm them with an overload of choice. Freedom of choice should be a gift, not a burden.

Publishing on Medium, Facebook and LinkedIn is simple. It also comes with a pre-filtered audience and tools to control abuse. Self-publishing is better – no question about it. But as of now, it is harder to do. It seems simple enough, but can get problematic soon. We have enough un-maintained, open-to-attack resources out there. All these started with the best intentions in mind but ran out of steam soon enough.

Own your content. Own your platform. But take your time to understand the risk. Learn how to be a good landlord for your words and thoughts by keeping their home in check.

This is where tooling comes in. Teaching new publishers on the web using an editor that lints and creates local servers for you is a great idea. Showing them tools that check their sites for interoperability, security and accessibility issues with explanations is a good idea. Getting people started with GitHub to host their projects and finding a way to generate a static page from them is a good idea. I don’t want to see people using a file name as version control any longer and having no history of their work. Sure, they have the right to make life harder for themselves, but isn’t this about publishing content?
