Jordi Casanellas
  • Data Science Blog
  • Dataguda
  • Astrophysics
    • Research
    • Teaching
    • Videos
    • Press
  • Contact

Tracking the mood of the world

14/9/2015

This small project produces a very rough estimate of the mood of the world by analysing, every day, the headlines of several news websites. In total, 36 websites are scanned, grouped into 8 arbitrarily chosen regions: Africa, Asia, Australia, Europe, Latin America, Middle East, Russia and USA. A very basic sentiment analysis is done by comparing the extracted text with a sentiment lexicon of rated words.
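
To make the lexicon comparison concrete, here is a minimal sketch of how such a score could be computed. The tiny word list, the function name `mood_score` and the choice of averaging the ratings are assumptions made for illustration; the project's actual lexicon and scoring rule may differ.

```python
import re

# Hypothetical mini-lexicon in the style of a rated word list
# (one integer rating per word). Illustrative entries only.
TOY_LEXICON = {
    "good": 3,
    "win": 4,
    "peace": 2,
    "crisis": -3,
    "war": -2,
    "kill": -3,
}


def mood_score(text, lexicon=TOY_LEXICON):
    """Return the mean rating of the lexicon words found in text.

    Words that are not in the lexicon are ignored. If no rated word
    is found, the score is 0.0 (an assumed neutral default).
    """
    words = re.findall(r"[a-z']+", text.lower())
    rated = [lexicon[w] for w in words if w in lexicon]
    return sum(rated) / len(rated) if rated else 0.0
```

Averaging instead of summing keeps the score comparable between websites that publish different numbers of headlines, which is one possible reason to prefer it here.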

This is how the mood evolved in different regions of the world from August to October 2014:
These are the steps used to obtain the graphic above, all of them performed in Python:
  1. daily retrieval of the text from the news websites,
  2. parsing of the HTML with regular expressions,
  3. calculation of a mood score by comparing the processed text with the AFINN word list [1],
  4. storage in a database using MySQLdb,
  5. processing of the data with pandas,
  6. interactive plotting of the results with Bokeh.
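
Steps 2 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the headlines are assumed to sit inside `<h2>` tags, the table layout and the function names are hypothetical, and the standard-library `sqlite3` module stands in for MySQLdb so that the example runs without a database server. The scores are assumed to have been computed already.

```python
import re
import sqlite3

# Assumed markup: headlines sit inside <h2> elements. Real news sites differ,
# so each site would need its own pattern.
HEADLINE_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def extract_headlines(html):
    """Return the text of every <h2> element, with nested tags removed."""
    headlines = []
    for raw in HEADLINE_RE.findall(html):
        # Drop inner tags such as <a>, then collapse runs of whitespace.
        text = " ".join(TAG_RE.sub(" ", raw).split())
        if text:
            headlines.append(text)
    return headlines


def store_scores(conn, rows):
    """Insert (day, region, score) rows into a hypothetical 'mood' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mood (day TEXT, region TEXT, score REAL)"
    )
    conn.executemany("INSERT INTO mood VALUES (?, ?, ?)", rows)
    conn.commit()


def daily_mean_by_region(conn):
    """Return a dict mapping (day, region) to the mean stored score."""
    cur = conn.execute(
        "SELECT day, region, AVG(score) FROM mood GROUP BY day, region"
    )
    return {(day, region): avg for day, region, avg in cur}
```

For example, `extract_headlines('<h2><a href="#">Markets fall</a></h2>')` returns `['Markets fall']`. Regular expressions are fragile on real-world HTML, so a pattern like this has to be checked against each site's pages.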

All the code is available on GitHub.
Visit TheMoodOfTheWorld.weebly.com for more plots.

    Jordi

    Data Scientist.
    Here you'll find some examples of data analysis, visualizations, machine learning and related topics.
