For the sake of transparency, all data processed during our project can be accessed in our GitHub repository. The table below gives an overview of the languages we considered and the number of tweets we were able to acquire for the period from 27 May to 2 June. The number of tweets we actually analysed is lower because of technical problems with Sentilo.
Language | Abbreviation | Search term | Tweets acquired | Tweets analysed |
---|---|---|---|---|
Bulgarian | bg | Европейски избори | 25 | 20 |
Croatian | hr | Europski izbori | 99 | 29 |
Czech | cs | Evropské volby | 126 | 109 |
Danish | da | EU-valget | 72 | 58 |
Dutch | nl | Europese verkiezingen | 1967 | 623 |
English | en | European elections | 28787 | 4708 |
Estonian | et | Euroopa valimised | 9 | 0 |
Finnish | fi | EU-vaalit | 44 | 10 |
French | fr | Élections européennes | 22991 | 1827 |
German | de | Europawahl | 16730 | 10784 |
Greek | el | Ευρωεκλογές | 5381 | 3523 |
Hungarian | hu | Európai választások | 14 | 3 |
Irish | ga | Na toghcháin Eorpacha | 3 | 0 |
Italian | it | Elezioni europee | 12099 | 3191 |
Latvian | lv | Eiropas vēlēšanas | 74 | 15 |
Lithuanian | lt | EP rinkimai | 6 | 3 |
Maltese | mt | L-elezzjonijiet Ewropej | 1 | 0 |
Polish | pl | Wybory europejskie | 196 | 46 |
Portuguese | pt | Eleições europeias | 1432 | 491 |
Romanian | ro | Alegerile europene | 37 | 17 |
Slovak | sk | Európske voľby | 4 | 3 |
Slovenian | sl | Evropske volitve | 94 | 34 |
Spanish | es | Elecciones europeas | 7185 | 2048 |
Swedish | sv | EU-valet | 1596 | 562 |
The Twitter API returns a range of data for each tweet: a long list of attributes associated with the user, the text of the tweet, its source, its language (lang), the place associated with the tweet, the number of times it has been retweeted, quoted and liked by other users, and entities such as hashtags, URLs, user mentions, media and symbols.
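To make the shape of this payload more concrete, the following minimal sketch pulls the attributes mentioned above out of a single tweet object. It assumes the field names of the standard v1.1 tweet JSON (`retweet_count`, `favorite_count`, `entities.hashtags` and so on); the exact API version we queried is not stated here, and the function name `summarise_tweet` and the sample tweet are purely illustrative.

```python
from typing import Any, Dict

def summarise_tweet(tweet: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the attributes described above from one tweet object.

    Field names follow the v1.1 tweet JSON layout (an assumption).
    Missing fields fall back to None, 0 or an empty list.
    """
    entities = tweet.get("entities") or {}
    place = tweet.get("place") or {}   # "place" is null for most tweets
    user = tweet.get("user") or {}
    return {
        "user": user.get("screen_name"),
        "text": tweet.get("full_text", tweet.get("text")),
        "source": tweet.get("source"),
        "lang": tweet.get("lang"),
        "place": place.get("full_name"),
        "retweets": tweet.get("retweet_count", 0),
        "quotes": tweet.get("quote_count", 0),
        "likes": tweet.get("favorite_count", 0),
        "hashtags": [h.get("text") for h in entities.get("hashtags", [])],
        "urls": [u.get("expanded_url") for u in entities.get("urls", [])],
        "mentions": [m.get("screen_name") for m in entities.get("user_mentions", [])],
    }

# Hypothetical tweet, invented for illustration only.
sample = {
    "text": "Counting down to the European elections #EP2019",
    "lang": "en",
    "source": "Twitter Web Client",
    "place": None,
    "retweet_count": 3,
    "favorite_count": 5,
    "user": {"screen_name": "example_user"},
    "entities": {
        "hashtags": [{"text": "EP2019"}],
        "urls": [],
        "user_mentions": [],
    },
}

summary = summarise_tweet(sample)
```

Using `.get` with defaults keeps the sketch tolerant of tweets that lack optional fields such as a place or a quote count.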
We decided to keep the following information:
Additional information we added with the preprocessing:
Finally, each tweet was enriched with the average positive and/or average negative score assigned by Sentilo, so that we could perform our sentiment analysis through SPARQL queries and represent the sentiment in our graph.
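As a rough illustration of this enrichment step, the sketch below attaches an average positive and an average negative score to a tweet record. It assumes that the sentiment tool yields a list of signed scores per tweet (positive values for positive sentiment, negative values for negative sentiment); the actual output format of Sentilo is not described here, and the names `enrich_with_scores`, `avg_positive` and `avg_negative` are hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List

def enrich_with_scores(tweet: Dict[str, Any], scores: List[float]) -> Dict[str, Any]:
    """Return a copy of the tweet with average positive/negative scores.

    `scores` is assumed to be a list of signed sentiment scores for the
    tweet. A tweet may end up with only one of the two averages, or with
    neither, in which case the missing value is None.
    """
    positives = [s for s in scores if s > 0]
    negatives = [s for s in scores if s < 0]
    enriched = dict(tweet)  # leave the original record untouched
    enriched["avg_positive"] = mean(positives) if positives else None
    enriched["avg_negative"] = mean(negatives) if negatives else None
    return enriched

# Hypothetical scores, invented for illustration only.
both = enrich_with_scores({"lang": "en"}, [0.5, 0.25, -0.75])
only_positive = enrich_with_scores({"lang": "de"}, [0.5])
```

Keeping the two averages separate, rather than collapsing them into one net score, mirrors the "positive and/or negative" wording above: a tweet with mixed sentiment keeps both values.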