Data

The foundation of our sentiment analysis.

For the sake of transparency, all data processed during our project can be accessed in our GitHub repository. The table below gives an overview of the languages we considered and the number of tweets we were able to acquire in each of them for the period 27 May to 2 June. The number of tweets we actually analysed differs from the number acquired because of technical problems with Sentilo.

Language    Abbreviation  Term                      Acquired  Analysed
Bulgarian   bg            Европейски избори         25        20
Croatian    hr            Europski izbori           99        29
Czech       cs            Evropské volby            126       109
Danish      da            EU-valget                 72        58
Dutch       nl            Europese verkiezingen     1967      623
English     en            European elections        28787     4708
Estonian    et            Euroopa valimised         9         0
Finnish     fi            EU-vaalit                 44        10
French      fr            Élections européennes     22991     1827
German      de            Europawahl                16730     10784
Greek       el            Ευρωεκλογές               5381      3523
Hungarian   hu            Európai választások       14        3
Irish       ga            Na toghcháin Eorpacha     3         0
Italian     it            Elezioni europee          12099     3191
Latvian     lv            Eiropas vēlēšanas         74        15
Lithuanian  lt            EP rinkimai               6         3
Maltese     mt            L-elezzjonijiet Ewropej   1         0
Polish      pl            Wybory europejskie        196       46
Portuguese  pt            Eleições europeias        1432      491
Romanian    ro            Alegerile europene        37        17
Slovak      sk            Európske voľby            4         3
Slovenian   sl            Evropske volitve          94        34
Spanish     es            Elecciones europeas       7185      2048
Swedish     sv            EU-valet                  1596      562

Twitter Data.


The Twitter API returns a wide range of data for each tweet: a long list of attributes describing the user, the text of the tweet, its source, its language (lang), the place associated with the tweet, how many times it has been retweeted, quoted and liked by other users, and entities such as hashtags, URLs, user mentions, media and symbols.

    We decided to keep the following information:

  • date of creation of the tweet (created_at)
  • language of the tweet as identified by Twitter (lang)
  • name of the user (screen_name)
  • place where the tweet has been posted from (location)
  • text contained in the tweet (full_text)
  • links (urls)
  • hashtags used in the tweet (tags)
  • other users mentioned in the tweet with @ (mentions)
  • number of times the tweet has been retweeted (retweet_count)
  • number of favourites (favorite_count)
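The selection above can be sketched in Python as follows. This is only an illustration, not the project's actual code: it assumes the tweet arrives as a dictionary shaped like the Twitter API v1.1 tweet object in extended mode, and that location is read from the user's profile. The function name select_fields and the sample tweet are invented for the example.

```python
from typing import Any, Dict


def select_fields(tweet: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the attributes listed above (hypothetical helper)."""
    user = tweet.get("user", {})
    entities = tweet.get("entities", {})
    return {
        "created_at": tweet.get("created_at"),
        "lang": tweet.get("lang"),
        "screen_name": user.get("screen_name"),
        # Assumption: the location comes from the user's profile.
        "location": user.get("location"),
        # Fall back to the truncated text if full_text is absent.
        "full_text": tweet.get("full_text", tweet.get("text")),
        "urls": [u.get("expanded_url") for u in entities.get("urls", [])],
        "tags": [h.get("text") for h in entities.get("hashtags", [])],
        "mentions": [m.get("screen_name")
                     for m in entities.get("user_mentions", [])],
        "retweet_count": tweet.get("retweet_count", 0),
        "favorite_count": tweet.get("favorite_count", 0),
    }


# Invented sample tweet, for illustration only.
sample_tweet = {
    "created_at": "Mon May 27 10:15:00 +0000 2019",
    "lang": "en",
    "full_text": "Results are in! #EUelections2019 @Europarl_EN",
    "user": {"screen_name": "example_user", "location": "Bologna"},
    "entities": {
        "hashtags": [{"text": "EUelections2019"}],
        "user_mentions": [{"screen_name": "Europarl_EN"}],
        "urls": [{"expanded_url": "https://example.org/results"}],
    },
    "retweet_count": 3,
    "favorite_count": 7,
}

record = select_fields(sample_tweet)
```

Everything else the API returns is simply dropped, which keeps the stored records small and uniform across languages.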

    Additional information we added with the preprocessing:

  • text of the tweet stripped of hashtags and mentions, so that Sentilo can analyse it correctly (parsed_text)
  • emojis contained in the tweet, used to enrich the final graph (emoji)
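A minimal sketch of this preprocessing step, under stated assumptions: hashtags and mentions are matched with a simple regular expression, and emojis are detected with an approximate set of Unicode ranges that does not cover every emoji (flags and multi-character sequences are missed). The function name preprocess and both patterns are our own illustration, not the project's code.

```python
import re

# "@name" or "#tag"; \w is Unicode-aware, so non-Latin hashtags match too.
MENTION_OR_HASHTAG = re.compile(r"[@#]\w+")

# Approximate emoji ranges (assumption): pictographs and common symbols.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def preprocess(full_text: str) -> dict:
    """Return the two fields added during preprocessing."""
    emoji = EMOJI.findall(full_text)
    without_tags = MENTION_OR_HASHTAG.sub("", full_text)
    # Collapse the whitespace left behind by the removed tokens.
    parsed_text = " ".join(without_tags.split())
    return {"parsed_text": parsed_text, "emoji": emoji}


result = preprocess(
    "Great turnout today \U0001F600 #EUelections2019 thanks @Europarl_EN"
)
```

Here the emojis stay in parsed_text, since only hashtags and mentions are described as removed; whether Sentilo should also receive text without emojis or URLs is a separate design choice.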

Finally, each tweet was enriched with the average positive and/or average negative score assigned by Sentilo, so that we could run our sentiment analysis through SPARQL queries and represent the results in our graph.
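The enrichment and a per-language aggregation of the kind a SPARQL query could compute can be sketched as below. All names here are hypothetical: the keys avg_pos and avg_neg, the helper functions and the score values are invented for illustration, and the real format of Sentilo's output is not shown in this section.

```python
from collections import defaultdict
from statistics import mean


def enrich(record: dict, scores: dict) -> dict:
    """Copy the record and attach whichever scores are available."""
    enriched = dict(record)
    for key in ("avg_pos", "avg_neg"):  # hypothetical field names
        if key in scores:
            enriched[key] = scores[key]
    return enriched


def mean_score_by_language(records: list, key: str) -> dict:
    """Average one score per language, skipping tweets that lack it."""
    by_lang = defaultdict(list)
    for r in records:
        if key in r:
            by_lang[r["lang"]].append(r[key])
    return {lang: mean(values) for lang, values in by_lang.items()}


# Invented records and scores, for illustration only.
tweets = [
    enrich({"lang": "en"}, {"avg_pos": 0.5}),
    enrich({"lang": "en"}, {"avg_pos": 0.25, "avg_neg": 0.5}),
    enrich({"lang": "de"}, {"avg_neg": 0.75}),
]

positive = mean_score_by_language(tweets, "avg_pos")
negative = mean_score_by_language(tweets, "avg_neg")
```

Treating each score as optional mirrors the "and/or" above: a tweet may carry a positive score, a negative score, or both.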

© Copyright 2019 Severin Josef Burg, Eleonora Peruch