Below you will find pages that utilize the taxonomy term “web scraping”
Post
Topic modelling on one of Germany's most popular Corona podcasts
Since the early days of the Corona pandemic, the German public broadcaster NDR Info has aired a popular podcast with the virologist Christian Drosten.
In addition to the podcast itself, the transcript of each episode is published on the broadcaster’s website. This makes it easy to do some text processing and analysis on the podcast’s content.
First I wrote a script to scrape the transcripts of all podcast episodes and transform them into a dataframe with the following columns: episode title, date, link to the transcript, episode number, interviewer, speaker of the transcript section, and the transcript section itself.
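To give an idea of the transform step, here is a minimal sketch in Python. It is not the original script. It assumes that the transcript HTML puts every speaker turn in a `<p>` element that starts with the speaker's name followed by a colon, which may not match the real page layout. The names `ParagraphCollector`, `transcript_rows`, and the keys of the `episode` dictionary are made up for illustration.

```python
import re
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text content of every <p> element in a page."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while we are outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.paragraphs.append(text)
            self._buffer = None


# Assumption: a speaker turn looks like "Firstname Lastname: text".
SPEAKER = re.compile(r"^([A-ZÄÖÜ][\w.\- ]{1,40}):\s*(.*)$", re.S)


def transcript_rows(html, episode):
    """Return one dict per transcript section, ready for a dataframe.

    `episode` holds the per-episode columns (title, date, link, ...).
    Paragraphs without a speaker prefix are appended to the previous section.
    """
    collector = ParagraphCollector()
    collector.feed(html)

    rows = []
    for paragraph in collector.paragraphs:
        match = SPEAKER.match(paragraph)
        if match:
            rows.append({
                **episode,
                "speaker": match.group(1).strip(),
                "text": match.group(2).strip(),
            })
        elif rows:
            rows[-1]["text"] += " " + paragraph
    return rows
```

The resulting list of dictionaries can be passed directly to `pandas.DataFrame(rows)` if pandas is available. Fetching the pages themselves is left out here, because it depends on the structure of the broadcaster's website.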
Post
Who is going to finish the Köln Marathon?
Inspired by this question on Quora, I wanted to dig into marathon results, in particular the “DNFs” (participants who started the marathon but didn’t make it to the finish line).
I would have liked to do my little analysis on the results from the Berlin Marathon, but unfortunately I couldn’t find any results that include the DNFs (if you know where to find them, please let me know 😄).