February 12, 2011 at 11:42 am

Crowdfunded Datamine To Explore Whether There’s a Rap For That

Tahir Hemphill has reached his targeted $7,500 in funding on the crowdfunding site Kickstarter — enough to begin datamining rap lyrics in earnest to answer questions including who was first to rap about champagne, who coined the term “hater,” which sneakers are the most hip-hop, and so on.

Soon, science should have the answers. In other words, #dataftw.

“The Hip-Hop Word Count (HHWC) [will be] a searchable ethnographic database built from the lyrics of over 40,000 hip-hop songs from 1979 to present day. The database [will become] the heart of an online analysis tool that generates textual and quantified reports on searched phrases, syntax, memes and socio-political ideas…

How can analyzing lyrics teach us about our culture? The Hip-Hop Word Count [will lock] in a time and geographic location for every metaphor, simile, cultural reference, phrase, meme and socio-political idea used in the corpus of hip-hop.

The Hip-Hop Word Count [will convert] this data into explorable visualisations which help us to comprehend this vast set of cultural data. This data can be used to chart the migration of ideas and builds a geography of language and [be] the engine for a teaching curriculum.”
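
For a rough sense of what a "who said it first" query could look like once every lyric carries a year and a place, here is a minimal Python sketch. It is not the HHWC's actual schema or code: the `Lyric` record, the `first_mention` helper, and all of the sample entries are hypothetical placeholders made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lyric:
    """One lyric record tagged with a time and a place (hypothetical schema)."""
    artist: str
    year: int
    city: str
    text: str


def first_mention(corpus: list[Lyric], term: str) -> Optional[Lyric]:
    """Return the earliest record whose text contains the term, ignoring case."""
    needle = term.lower()
    hits = [rec for rec in corpus if needle in rec.text.lower()]
    return min(hits, key=lambda rec: rec.year) if hits else None


# Made-up placeholder records, not real songs or real lyrics.
corpus = [
    Lyric("Artist A", 1994, "City X", "placeholder line about champagne"),
    Lyric("Artist B", 1987, "City Y", "another placeholder, Champagne again"),
    Lyric("Artist C", 1982, "City Z", "placeholder line about sneakers"),
]

hit = first_mention(corpus, "champagne")
if hit is not None:
    print(hit.artist, hit.year, hit.city)
```

A plain substring match like this would also catch the term inside longer words, so a real tool would presumably need proper tokenization and cleaned data, which is where the project's data cleaners come in.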

So far, 342 people have contributed $8,2560 to the project, which the Brooklyn-based founder says will go to the programmers, design consultants, data cleaners and hosting company that would make this happen.

(Thanks, Lydia)