Christian Heilmann

Quick tip: using flatMap() to extract data from a huge set without any loop

Friday, September 6th, 2024 at 12:47 pm

A capybara wearing a flat cap and holding a pint with the name Flat Cap crossed out and .flatMap() instead.

I just created a massive dataset of all the AI-generated metadata for the videos of the WeAreDevelopers World Congress, and I wanted to extract only the tags.

The dataset is a huge array with each item containing a description, a generated title, an array of tags and the original title, like this:

{
  "description": "The talk begins with an introduction to Twilio…",
  "generatedtitle": "Enhancing Developer Experience: Strategies and Importance",
  "tags": ["Twilio", "DeveloperExperience", "CognitiveJourney"],
  "title": "Diving into Developer Experience"
}

What I wanted was an alphabetical list of all the tags in the whole dataset, and collecting them is a one-liner if you use flatMap():

data.flatMap(d => d.tags);
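If you wonder why this works: flatMap() runs the callback on every item, like map(), and then flattens the result by one level. Here is a minimal sketch of the difference. The `sample` array and its contents are made up for illustration and only mimic the shape of the real dataset.

```javascript
// Hypothetical sample data, shaped like the dataset described above.
const sample = [
  { title: "Talk A", tags: ["Twilio", "DeveloperExperience"] },
  { title: "Talk B", tags: ["AI", "Twilio"] }
];

// map() alone returns one array of tags per talk (an array of arrays).
const nested = sample.map(d => d.tags);

// flatMap() maps and then flattens one level, giving a single list of tags.
const allTags = sample.flatMap(d => d.tags);

console.log(nested);  // [["Twilio", "DeveloperExperience"], ["AI", "Twilio"]]
console.log(allTags); // ["Twilio", "DeveloperExperience", "AI", "Twilio"]
```

In other words, `sample.flatMap(d => d.tags)` gives the same result as `sample.map(d => d.tags).flat()`.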

You can sort them alphabetically with sort():

data.flatMap(d => d.tags).sort();

And you can de-dupe the data and get only the unique tags by wrapping the result in a Set:

new Set(data.flatMap(d => d.tags).sort());
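Two things are worth knowing here. A Set is not an array, so if you want to keep working with array methods you can spread it back into one. And sort() without a comparator orders strings by their UTF-16 code units, which puts all uppercase letters before lowercase ones. The following is a rough sketch of both points, not part of the original codepen; the `talks` array is hypothetical sample data and the "en" locale is an assumption.

```javascript
// Hypothetical sample data with mixed-case tags.
const talks = [
  { tags: ["react", "AI"] },
  { tags: ["Twilio", "AI"] }
];

// Spread the Set into an array to get the unique tags as a plain array.
const uniqueTags = [...new Set(talks.flatMap(d => d.tags))];

// Default sort() compares UTF-16 code units: uppercase sorts before lowercase.
const defaultOrder = [...uniqueTags].sort();

// localeCompare() with base sensitivity ignores case when comparing.
const alphabetical = [...uniqueTags].sort((a, b) =>
  a.localeCompare(b, "en", { sensitivity: "base" })
);

console.log(defaultOrder); // ["AI", "Twilio", "react"]
console.log(alphabetical); // ["AI", "react", "Twilio"]
```

Because a Set keeps its insertion order, sorting before de-duping (as in the one-liner above) and sorting afterwards give the same order of tags.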

You can try this in this codepen.
