17. Buzzwords of History

Jon M. Kleinberg, Computer Science, developed a method for a computer to find the topics that dominate a discussion at a particular time by scanning large collections of documents for sudden, rapid bursts of words. Kleinberg scanned presidential State of the Union addresses, from Washington in 1790 to Bush in 2002, and created a list of words that reflects historical trends. The list includes words such as British, militias, Spain, slavery, emancipation, coinage, interstate, depression, atoms, communism, jobs, children, Medicare, America, and century. Kleinberg conceived the idea as he dealt with his flood of incoming email: in messages from other computer scientists, keywords related to hot topics would burst; in messages from students, the word prelim burst shortly before each midterm exam. Searching for these words provided ways to categorize messages. The underlying observation is that when an important topic comes up for discussion, keywords related to the topic show a sudden increase in frequency. He devised a search algorithm to look for "burstiness," measuring the number of times words appear and the rate of increase in those numbers over time. His technique could have many data mining applications, including searching the Web or studying trends in society as reflected in Web pages.
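
The idea of counting words and tracking how fast their counts rise can be illustrated with a short Python sketch. This is a deliberately simplified heuristic, not Kleinberg's actual algorithm, whose details are not given here. It assumes the documents are already grouped into time periods, and the function name burst_scores, the min_count cutoff, and the smoothing term are all hypothetical choices made for illustration.

```python
from collections import Counter


def burst_scores(docs_by_period, min_count=2):
    """Score words by how sharply their relative frequency rises.

    docs_by_period maps a sortable period label (e.g. a week number)
    to a list of document strings. Returns {period: [(word, score), ...]}
    with the highest-scoring words first.

    Simplified illustration only; not the method described in the text.
    """
    results = {}
    prev_rates = {}  # word -> relative frequency in the previous period
    for period in sorted(docs_by_period):
        counts = Counter()
        for text in docs_by_period[period]:
            counts.update(text.lower().split())
        total = sum(counts.values()) or 1  # avoid division by zero
        rates = {word: n / total for word, n in counts.items()}

        scored = []
        for word, n in counts.items():
            if n < min_count:
                continue  # ignore words too rare to call a burst
            # Ratio of the current rate to the previous rate, smoothed by
            # 1/total so that brand-new words do not divide by zero.
            previous = prev_rates.get(word, 0.0)
            scored.append((word, rates[word] / (previous + 1.0 / total)))

        # Highest score first; ties broken alphabetically.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        results[period] = scored
        prev_rates = rates
    return results


if __name__ == "__main__":
    # Made-up messages, echoing the "prelim" example above.
    emails = {
        1: ["homework due friday", "lecture notes posted"],
        2: ["prelim room change", "prelim review session",
            "prelim covers lecture notes"],
    }
    for period, words in burst_scores(emails).items():
        print(period, words[:3])
```

In this toy input, the word prelim is absent in the first period and frequent in the second, so it is the only word that clears the cutoff and is reported as bursting. A realistic version would also need to handle punctuation, very common words, and noise across many periods.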

