Christian Heilmann

Quick tip: Getting all links from any web site into a spreadsheet using browser developer tools

Thursday, August 24th, 2023 at 3:15 pm

As part of taking over the editorial job of the WeAreDevelopers newsletter, I needed to get all the links from older editions and import them into a spreadsheet. Eventually I will write a scraping script, but there is a much simpler way to get all the links from a web site into a spreadsheet: browser developer tools. You can follow how that’s done in this video:

The step-by-step instructions:

  • Go to the web site you want the links from
  • Open Developer Tools (press F12)
  • Go to the Console tool
  • Enter (or copy and paste) the following: console.table($$('a'),['innerHTML','href'])
  • Highlight the table with your mouse
  • Copy and paste into your spreadsheet, making sure to only copy the values

This works in Microsoft Edge, Google Chrome, Firefox but for some reason not in Safari

Another issue is that long links and link texts get shortened with a … in Chrome and Edge, to work around that you need to shorten the links first. In my case, I removed the `utm_source=` tracking info. To this end, I had to write another short script in the console to run first.

Using $$('a').forEach(l => {l.href = l.href.replace(/?utm_source=.*$/,'')}) replaces all the utm_source tracking info on each link before calling the console.table.

Share on Mastodon (needs instance)

Share on Twitter

Newsletter

Check out the Dev Digest Newsletter I write every week for WeAreDevelopers. Latest issues:

160: Graphs and RAGs explained and VS Code extension hacks Graphs and RAG explained, how AI is reshaping UI and work, how to efficiently use Cursor, VS Code extensions security issues.
159: AI pipelines, 10x faster TypeScript, How to interview How to use LLMs to help you write code and how much electricity does that use? Is your API secure? 10x faster TypeScript thanks to Go!
158: 🕹️ Super Mario AI 🔑 API keys in LLMs 🤙🏾 Vibe Coding Why is AI playing Super Mario? How is hallucinating the least of our worries and what are rules for developing Safety Critical Code?
157: CUDA in Python, Gemini Code Assist and back-dooring LLMs We met with a CUDA expert from NVIDIA about the future of hardware, we look at how AI fails and how to play pong on 140 browser tabs.
156: Enterprise dead, all about Bluesky and React moves on! Learn about Bluesky as a platform, how to build a React App and how to speed up SQL. And play an impossible game in the browser.

My other work: