Distant Reading Case Study III

Analysing reviews of True Crime podcasts

True crime is one of the most popular media genres of the digital age. Its popularity can be attributed to a captivating blend of mystery, suspense, real-world drama and, in many cases, troubling insights into subcultures, politics, and police work. It taps into our fascination with the darker aspects of human nature, providing a window into the minds of both criminals and investigators. Analysing true crime podcast reviews via distant reading offers a unique opportunity to better understand why listeners tune in and what they like about their favourite shows, but also what they think should be improved. By employing computational tools to sift through countless reviews, we can identify recurring themes and patterns that we might overlook in a close reading approach.

For your first distant reading experiments, we would like you to work with a relatively small dataset of reviews of the successful American true crime podcast Serial. All the reviews selected for the dataset have 1- or 2-star ratings, so the listeners who left them evidently did not like the show all that much. The dataset contains only the review texts and gives no indication of who posted them or when. To find that out, you can consult the full dataset in CSV format.

Ingest your data into Voyant Tools

Go to the Voyant Tools website and simply paste the URL of the data (as indicated in the link above) into the "add text" field. Press the blue "reveal" button and start exploring the dataset!

Tasks to perform in Voyant

High-level analysis with word cloud and frequencies table

  • Look at the word cloud and the corresponding frequencies table. Which words are the most prominent? What different topics or themes can you identify in the data?
  • Who are the "protagonists" mentioned in the data? Are any people mentioned by name and who are they? Show hosts, politicians, police officers?
  • What places (e.g. cities) are mentioned in the data? Why do you think that could be the case?
  • What surprises you? What information is hard to contextualise?
  • Write down words that express emotions / people's feelings towards the show and its hosts. Put them into the trends tool. How did these emotions develop over time?
  • Try to find groups of words for comparative analysis in "trends". Each group of words has to be homogeneous. You can, for example, look at different place names or trace the trends for different adjectives (e.g. "good", "bad", "exciting", "boring"). It does not make sense to combine terms from different word groups.
  • What words are most prominently associated with the words "police" and "cop"? What does this say about the listeners' opinions?
  • Try other words such as "criminals", "journalism" or "investigators". What are the results? What do they tell us about the podcast and perhaps also about more general debates in the United States?
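
If you would like to see what happens behind the word cloud, the trends tool and the word associations, the following is a minimal Python sketch of the same ideas: counting word frequencies, tracing a term across equal-sized segments of the corpus, and counting the words that appear near a keyword. It is not how Voyant itself is implemented. The three reviews are invented placeholders (not taken from the dataset), the stop word list is deliberately tiny, and the function names and the window size of 3 are assumptions made for illustration.

```python
import re
from collections import Counter

# Invented placeholder reviews; replace them with the real review texts.
reviews = [
    "The police work was boring and the host was annoying.",
    "Boring season. The police story went nowhere.",
    "I liked the first season, but the host talks too much now.",
]

# A tiny stop word list for illustration only.
STOPWORDS = {"the", "and", "was", "i", "but", "too", "a", "of", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each non-stop word occurs in all texts."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t not in STOPWORDS)
    return counts


def trend(texts, term, segments=3):
    """Relative frequency of `term` in consecutive, equal-sized segments."""
    tokens = [t for text in texts for t in tokenize(text)]
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    result = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        result.append(chunk.count(term) / len(chunk))
    return result


def collocates(texts, keyword, window=3):
    """Count non-stop words within `window` tokens of `keyword`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(n for n in left + right if n not in STOPWORDS)
    return counts


print(word_frequencies(reviews).most_common(5))
print(trend(reviews, "boring"))
print(collocates(reviews, "police").most_common(5))
```

Note that the segments here follow the order of the texts in the corpus, so a "trend" only reflects development over time if the reviews are sorted chronologically.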

Reading keywords in context

  • Put the words that you used to find co-occurrences into the "context" tool to see the full sentences in which they are used. Does this give you any additional insights?
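
The idea behind a keyword-in-context view can be sketched in a few lines of Python: find every occurrence of a keyword and show a fixed number of characters to its left and right. This is only an illustration under stated assumptions, not Voyant's own code. The reviews are invented placeholders, and the function name `keyword_in_context` and the default width of 30 characters are hypothetical choices.

```python
import re

# Invented placeholder reviews; replace them with the real review texts.
reviews = [
    "The police work was boring and the host was annoying.",
    "Boring season. The police story went nowhere.",
]


def keyword_in_context(texts, keyword, width=30):
    """Return (left, match, right) tuples for each whole-word occurrence."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for text in texts:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            rows.append((left, m.group(0), right))
    return rows


# Right-align the left context so the keywords line up in one column.
for left, match, right in keyword_in_context(reviews, "police"):
    print(f"{left:>30} [{match}] {right}")
```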

Drawing general conclusions

  • What conclusions can you draw from all the individual distant reading results?
  • What is (un)expected?
  • What would have been difficult to find via close reading?
  • What research questions would be interesting to explore based on this data set?