This small project produces a very rough estimate of the mood of the world by analysing the headlines of several news websites every day. In total, 36 websites are scanned, divided into eight arbitrary regions: Africa, Asia, Australia, Europe, Latin America, Middle East, Russia and USA. A very basic sentiment analysis is performed by comparing the extracted text with a sentiment lexicon of rated words.
This is how the mood evolved in different regions of the world from August to October 2014:
These are the steps to obtain the graphic above, all of them performed in Python:
- daily retrieval of text from the news websites,
- parsing of the HTML with regular expressions,
- calculation of a mood score by comparing the cleaned text with the AFINN word list [1],
- storage in a database using MySQLdb,
- processing of the data with pandas,
- interactive plotting of the results with Bokeh.
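As a rough illustration of the parsing and scoring steps, here is a minimal sketch, not the project's actual code. It assumes headlines sit inside `<h2>` tags (real sites differ), uses a tiny made-up lexicon in place of the full AFINN list (which rates words from -5 to +5), and assumes the score is the sum of word ratings divided by the number of words. The names `extract_headlines`, `mood_score` and `TOY_LEXICON` are hypothetical.

```python
import re

# Toy stand-in for the AFINN word list (word -> integer rating).
# These values are illustrative only, not the real AFINN ratings.
TOY_LEXICON = {"good": 3, "win": 4, "peace": 2, "crisis": -3, "war": -2}

# Assumption: headlines are wrapped in <h2>...</h2>; adapt per website.
HEADLINE_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")   # any remaining inline tag
WORD_RE = re.compile(r"[a-z']+")  # lowercase word tokens


def extract_headlines(html):
    """Return the plain text of every <h2> element found in the HTML."""
    headlines = []
    for raw in HEADLINE_RE.findall(html):
        text = TAG_RE.sub(" ", raw)    # drop nested tags such as <b>
        text = " ".join(text.split())  # collapse whitespace
        if text:
            headlines.append(text)
    return headlines


def mood_score(text, lexicon=TOY_LEXICON):
    """Average rating per word; words missing from the lexicon count as 0."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    return sum(lexicon.get(word, 0) for word in words) / len(words)


sample_html = (
    '<h2 class="t">Peace talks <b>win</b> support</h2>'
    "<h2>Crisis deepens as war spreads</h2>"
)
for headline in extract_headlines(sample_html):
    print(headline, mood_score(headline))
```

Dividing by the word count keeps long and short headlines comparable; summing the ratings without normalising would be an equally plausible choice.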
All the code is available on GitHub.
Visit TheMoodOfTheWorld.weebly.com for more plots.