U_Cite: American politician network analysis based on Quotebank

President Donald Trump and former Vice President Joe Biden during the first presidential debate at Case Western University and Cleveland Clinic, in Cleveland, Ohio. (Photo: Patrick Semansky/AP)
Final project for CS-401 Applied Data Analysis
Students: Chuanfang Ning, Irvin Mero Zambrano, Thomas Castiglione, Guoyuan Liu
Lecturer: Prof. Robert West

Quotebank is a dataset of 235 million unique, speaker-attributed quotations that were extracted from 196 million English news articles crawled from over 377 thousand English web domains. The project analyzes the mentions found in Quotebank quotations between 2015 and 2020 to reveal the bipolar political landscape of America. Key points of the project:

  • A data cleaning and preprocessing pipeline for the original Quotebank quotations, the Wikidata dump, and the Partisan Audience Bias Scores dataset.
  • A political mention analysis pipeline covering topic, sentiment, and bias analysis.
  • A political network analysis pipeline covering network construction, community and centrality analysis, and edge and node feature detection.
  • A visualisation pipeline for the analyses above, with interactive network graphs.
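
To make the network construction and centrality steps listed above more concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the toy quotations, the politician list, the exact-name matching rule, and the helper names `build_mention_edges` and `in_degree_centrality` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: (speaker, quotation) pairs standing in for Quotebank rows.
quotes = [
    ("Donald Trump", "Joe Biden has been wrong on this for decades."),
    ("Joe Biden", "Donald Trump failed to act."),
    ("Bernie Sanders", "I will work with Joe Biden to defeat Donald Trump."),
    ("Joe Biden", "We need to rebuild the middle class."),
]

# Hypothetical politician list; a real pipeline would take names from Wikidata.
politicians = ["Donald Trump", "Joe Biden", "Bernie Sanders"]


def build_mention_edges(quotes, politicians):
    """Count directed edges (speaker -> mentioned politician).

    Assumption: a mention is a case-insensitive match of the full name.
    """
    edges = Counter()
    for speaker, text in quotes:
        lowered = text.lower()
        for name in politicians:
            if name != speaker and name.lower() in lowered:
                edges[(speaker, name)] += 1
    return edges


def in_degree_centrality(edges, nodes):
    """Fraction of the other nodes that mention each node at least once."""
    incoming = {node: set() for node in nodes}
    for source, target in edges:
        incoming[target].add(source)
    denominator = max(len(nodes) - 1, 1)
    return {node: len(sources) / denominator for node, sources in incoming.items()}


edges = build_mention_edges(quotes, politicians)
centrality = in_degree_centrality(edges, politicians)

for (source, target), weight in sorted(edges.items()):
    print(f"{source} -> {target}: {weight}")
for node, score in sorted(centrality.items()):
    print(f"{node}: {score:.2f}")
```

The edge weights count how often one politician mentions another, and the centrality score is the share of other politicians who mention a given node. A graph library could replace these helpers once the edge list grows beyond a toy example.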

More details can be found in our code and on our website.

Chuanfang Ning
MSc Student in Robotics @ EPFL

Versatility makes Vision. Practice makes Proficiency.