Queries

Sparkling SPARQL results.

Our data comes to life when we query the graph with Fuseki. Having created information out of data, the final step is to infer knowledge. This page demonstrates ways to exploit the knowledge graph with SPARQL. There are thirteen example queries: ten extract specific data from the dataset, and the other three combine data to visualise phenomena in the European Election debate on Twitter. Every query starts from an initial question.

The three figures contain the average positive and negative sentiment for each language [Fig. 1], the development of the overall sentiment and the general participation in the discussion [Fig. 2], and the ambiguity of tweets with positive and negative sentiment [Fig. 3]. The raw data for those three figures and all other queries can be found in our GitHub repository.
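
As a rough illustration of how query results might be consumed outside Fuseki, the sketch below flattens a response in the standard SPARQL 1.1 Query Results JSON format into plain Python dictionaries. The function name `bindings_to_rows` and the sample payload are made up for illustration and are not taken from the project's repository.

```python
import json

def bindings_to_rows(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON result document into one dict per solution."""
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from a binding, so they are skipped.
        rows.append({v: binding[v]["value"] for v in variables if v in binding})
    return rows

# Hypothetical response for a query that selects ?lang and ?tweet_count.
SAMPLE = """
{
  "head": {"vars": ["lang", "tweet_count"]},
  "results": {"bindings": [
    {"lang": {"type": "uri", "value": "http://www.europinion.com/2019/de/"},
     "tweet_count": {"type": "literal",
                     "datatype": "http://www.w3.org/2001/XMLSchema#integer",
                     "value": "3"}}
  ]}
}
"""

rows = bindings_to_rows(SAMPLE)
print(rows)
```

All values arrive as strings in this format, so numeric columns still need an explicit conversion before plotting.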

Average Pos/Neg Sentiment per Language

The following figure is based on the question: What is the average positive and average negative score for each language? The question stems from our interest in whether the public opinion expressed in tweets allows us to infer how a milieu perceived the political event. As the data shows, there are slight differences in the expressed opinion, so a stronger like or dislike of the election outcome can be assumed for some languages. However, caution is required, because cultures differ widely in how they view the public expression of strong emotions. Public outspokenness does not necessarily correlate directly with perception.

What is the average positive and average negative score for each language?

						     
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang
  (COUNT(DISTINCT ?tweet) AS ?tweet_count)
  (SUM(?avg_score) AS ?tot_score)
  (?tot_score/?tweet_count AS ?final_score)

  WHERE {
    {<2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.
     ?tweet eur:hasAvgNegative ?n_score.
      FILTER NOT EXISTS {?tweet eur:hasAvgPositive ?p_score}
      BIND((?n_score) AS ?avg_score)
     }
  UNION
  {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
    ?tweet eur:hasAvgPositive ?p_score.
      FILTER NOT EXISTS {?tweet eur:hasAvgNegative ?n_score}
      BIND((?p_score) AS ?avg_score)
  }
  UNION
    {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
    ?tweet eur:hasAvgPositive ?p_score.
    ?tweet eur:hasAvgNegative ?n_score.
      BIND((?n_score+?p_score)/2 as ?avg_score)
  }
  }

  GROUP BY (?lang)

For displaying the result in a graph, the data had to be cleaned. See the script on GitHub.
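
The scoring rule behind the query above can be sketched in plain Python: a tweet with only one kind of score keeps that score, a tweet with both gets their mean, and the per-language value is the mean over all scored tweets. This is a minimal sketch on made-up toy data; the function names are hypothetical and it is not the cleaning script from the repository.

```python
from collections import defaultdict
from typing import Optional

def tweet_score(pos: Optional[float], neg: Optional[float]) -> Optional[float]:
    """Mirror the three UNION branches: one score alone, or the mean of both."""
    if pos is None and neg is None:
        return None  # such a tweet matches no branch of the query
    if pos is None:
        return neg
    if neg is None:
        return pos
    return (pos + neg) / 2

def average_per_language(tweets):
    """tweets: iterable of (lang, pos, neg); returns {lang: mean score}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for lang, pos, neg in tweets:
        score = tweet_score(pos, neg)
        if score is None:
            continue  # unscored tweets are not counted, as in the query
        totals[lang] += score
        counts[lang] += 1
    return {lang: totals[lang] / counts[lang] for lang in totals}

# Made-up toy data, not taken from the dataset.
toy = [
    ("de", 0.5, None),
    ("de", 0.5, -0.25),
    ("en", None, -0.5),
    ("en", None, None),
]
result = average_per_language(toy)
print(result)
```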

Development of Sentiment and Involvement

The second visualisation focuses on how the debate on Twitter changed over time. The guiding question for the query was: How did emotions and interest change over time after the elections? Our hypothesis was that interest would decline rapidly and that the discussion would become more neutral. The data shows that interest did indeed decline: already on the second day, only half as many tweets were sent, and after the third day the number stagnated at around 1,000 tweets. The average sentiment was always slightly positive and became more neutral over time, as predicted.

How did emotions and interest change over time after the elections?

						     
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?date
  (COUNT(DISTINCT ?tweet) AS ?tweet_count)
  (SUM(?avg_score) AS ?tot_avg_score)
  (?tot_avg_score/?tweet_count AS ?daily_avg_score)

  WHERE {
    {?tweet eur:hasDate ?date.
     ?tweet eur:hasAvgNegative ?n_score.
     FILTER NOT EXISTS {?tweet eur:hasAvgPositive ?p_score}
     BIND((?n_score) AS ?avg_score)
     }
  UNION
    {?tweet eur:hasDate ?date.
     ?tweet eur:hasAvgPositive ?p_score.
     FILTER NOT EXISTS {?tweet eur:hasAvgNegative ?n_score}
     BIND((?p_score) AS ?avg_score)
  }
  UNION
    {?tweet eur:hasDate ?date.
     ?tweet eur:hasAvgPositive ?p_score.
     ?tweet eur:hasAvgNegative ?n_score.
     BIND((?n_score+?p_score)/2 as ?avg_score)
  }
  }

  GROUP BY (?date)
  ORDER BY ASC (?date)
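
To make the "half as many tweets on the second day" reading concrete, the following sketch computes each day's tweet volume relative to the first day. It assumes that dates are ISO-formatted strings, which is not confirmed by the source, and it uses made-up toy dates; `relative_volume` is a hypothetical helper.

```python
from collections import Counter

def relative_volume(dates):
    """Return {date: tweets that day / tweets on the first day}."""
    per_day = Counter(dates)
    first_day = min(per_day)  # ISO dates sort chronologically as strings
    base = per_day[first_day]
    return {day: per_day[day] / base for day in sorted(per_day)}

# Made-up toy data: one date string per tweet.
toy_dates = ["2019-05-27"] * 4 + ["2019-05-28"] * 2 + ["2019-05-29"]
ratios = relative_volume(toy_dates)
print(ratios)
```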

Ambiguity of Tweets

The last graph shows to what degree tweets are two-sided or one-sided with regard to their sentiment. The query followed the descriptive question: What is the average negative and average positive score for each tweet? We wanted to understand whether statements in tweets are usually either strongly negative or strongly positive, or whether they are typically relatively neutral. The result shows a tendency for tweets to be either strongly one-sided or two-sided with low intensity. Strongly two-sided tweets do exist but are outliers. This phenomenon might be due to the limited length of tweets, which leaves little space to express multiple positions.

What is the average negative and average positive score for each individual tweet?

						     
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?tweet ?lang ?p_score ?n_score
  WHERE {
    {<2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.
     ?tweet eur:hasAvgNegative ?n_score.
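      # Fresh variable: filters apply to the whole group, so reusing ?p_score
      # would let the BIND below leak into this check
      FILTER NOT EXISTS {?tweet eur:hasAvgPositive ?any_p}
      BIND(0 AS ?p_score)  # numeric zero instead of the string "0"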
     }
  UNION
  {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
    ?tweet eur:hasAvgPositive ?p_score.
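      # Fresh variable: filters apply to the whole group, so reusing ?n_score
      # would let the BIND below leak into this check
      FILTER NOT EXISTS {?tweet eur:hasAvgNegative ?any_n}
      BIND(0 AS ?n_score)  # numeric zero instead of the string "0"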
  }
    UNION
    {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
    ?tweet eur:hasAvgPositive ?p_score.
    ?tweet eur:hasAvgNegative ?n_score.
    }
  }

For displaying the result in a graph, the data had to be cleaned. See the script on GitHub.
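
The distinction between one-sided and two-sided tweets can be sketched as follows. Missing scores are replaced by zero, as the query does with BIND, and a tweet counts as two-sided when both scores are non-zero. The labels and the helper names `fill_missing` and `sidedness` are assumptions made for illustration; the project may use a different definition.

```python
from typing import Optional, Tuple

def fill_missing(pos: Optional[float], neg: Optional[float]) -> Tuple[float, float]:
    """Replace a missing score with 0 so every tweet has both values."""
    return (0.0 if pos is None else pos, 0.0 if neg is None else neg)

def sidedness(pos: Optional[float], neg: Optional[float]) -> str:
    """Label a tweet by how many of its two scores are non-zero."""
    p, n = fill_missing(pos, neg)
    if p != 0 and n != 0:
        return "two-sided"
    if p != 0 or n != 0:
        return "one-sided"
    return "neutral"

# Made-up example scores.
print(sidedness(0.8, None))   # only a positive score
print(sidedness(0.2, -0.1))   # both scores present
```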

Let's ask some more questions.

CQ1

Which tweets have "de" identified as language by Twitter?

	    
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?tweet
  WHERE {
  ?tweet eur:hasLang "de".
  }

CQ2

Which tweets have the language set to "en"?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?tweet
  WHERE {
    <2019/> eur:lang <2019/en/>.
    <2019/en/> eur:hasTweet ?tweet.
  }

CQ3

How many tweets were posted in "bg"?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT (COUNT(?tweet) AS ?total_tweet)
  WHERE {
    <2019/> eur:lang <2019/bg/>.
    <2019/bg/> eur:hasTweet ?tweet.
  }

CQ4

Which languages have more than 10000 tweets?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang (COUNT(?tweet) AS ?tweet_count)
  WHERE {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
  }
  GROUP BY (?lang)
  # Repeat the aggregate: the SELECT alias is not in scope in HAVING on every engine
  HAVING (COUNT(?tweet) > 10000)

CQ5

In which of the languages identified by Twitter did people tweet the most?

	    
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang (COUNT (?tweet) AS ?tweet_count)
  WHERE {
   ?tweet eur:hasLang ?lang.
  }
  GROUP BY (?lang)
  ORDER BY DESC(?tweet_count)
  LIMIT 1

CQ6

Which language among the ones that received more than 100 tweets has used the highest number of emojis?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang
  (COUNT(?tweet) AS ?tweet_count)
  (COUNT(?e_tweet) AS ?e_tweet_count)
  (ROUND(?e_tweet_count*100/?tweet_count) AS ?percentage)
  WHERE {
    {<2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.}
  UNION
  {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?e_tweet.
    ?e_tweet eur:hasEmoji ?emoji.
  }
  }
  GROUP BY (?lang)
  # Repeat the aggregate: the SELECT alias is not in scope in HAVING on every engine
  HAVING (COUNT(?tweet) > 100)
  ORDER BY DESC (?percentage)
  LIMIT 1

CQ7

Which tweet with 'lang' set to 'da' has the highest average negative score?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?tweet ?avg_score
  WHERE {
    {<2019/> eur:lang <2019/da/>.
     <2019/da/> eur:hasTweet ?tweet.
     ?tweet eur:hasAvgNegative ?avg_score.
     FILTER NOT EXISTS {?tweet eur:hasAvgPositive ?p_score}
    }
    UNION
    {
    <2019/> eur:lang <2019/da/>.
    <2019/da/> eur:hasTweet ?tweet.
    ?tweet eur:hasAvgPositive ?p_score.
    ?tweet eur:hasAvgNegative ?n_score.
    BIND((?n_score+?p_score)/2 as ?avg_score)
    FILTER(?avg_score <0)
    }
  }
  ORDER BY ASC (?avg_score)
  LIMIT 1

CQ8

Which tweet has the highest average negative score?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?tweet ?avg_score
  WHERE {
    {?tweet eur:hasAvgNegative ?avg_score.
     FILTER NOT EXISTS {?tweet eur:hasAvgPositive ?p_score}
    }
    UNION
    {?tweet eur:hasAvgPositive ?p_score.
     ?tweet eur:hasAvgNegative ?n_score.
     BIND((?n_score+?p_score)/2 as ?avg_score)
     FILTER(?avg_score <0)
    }
  }
  ORDER BY ASC (?avg_score)
  LIMIT 1

CQ9

Which language expressed the highest percentage of average positive tweets?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang
  (COUNT(DISTINCT ?tweet) AS ?tweet_count)
  (COUNT(DISTINCT ?p_tweet) AS ?positive_tweets)
  (ROUND(?positive_tweets*100/?tweet_count) AS ?percentage)
  WHERE {
    {<2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.}
  UNION
  {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
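    ?tweet eur:hasAvgPositive ?avg_score.
    # Only tweets without a negative score; mixed tweets are handled in the next branch
    FILTER NOT EXISTS {?tweet eur:hasAvgNegative ?any_n}
    BIND((?tweet) AS ?p_tweet)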
  }
  UNION
  {  <2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.
     ?tweet eur:hasAvgPositive ?p_score.
     ?tweet eur:hasAvgNegative ?n_score.
     BIND((?n_score+?p_score)/2 as ?avg_score)
     BIND((?tweet) AS ?p_tweet)
     FILTER(?avg_score > 0)
  }
  }
  GROUP BY (?lang)
  ORDER BY DESC (?percentage)
  LIMIT 1

CQ10

What is the percentage (rounded to the nearest integer) of average positive tweets for each language?

	    
  BASE <http://www.europinion.com/>
  PREFIX eur: <http://www.europinion.com/>

  SELECT ?lang
  (COUNT(DISTINCT ?tweet) AS ?tweet_count)
  (COUNT(DISTINCT ?p_tweet) AS ?positive_tweets)
  (ROUND(?positive_tweets*100/?tweet_count) AS ?percentage)

  WHERE {
    {<2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.}
  UNION
  {
    <2019/> eur:lang ?lang.
    ?lang eur:hasTweet ?tweet.
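    ?tweet eur:hasAvgPositive ?score.
    # Only tweets without a negative score; mixed tweets are handled in the next branch
    FILTER NOT EXISTS {?tweet eur:hasAvgNegative ?any_n}
    BIND((?tweet) AS ?p_tweet)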
  }
    UNION
  {  <2019/> eur:lang ?lang.
     ?lang eur:hasTweet ?tweet.
     ?tweet eur:hasAvgPositive ?p_score.
     ?tweet eur:hasAvgNegative ?n_score.
     BIND((?n_score+?p_score)/2 as ?avg_score)
     BIND((?tweet) AS ?p_tweet)
     FILTER(?avg_score > 0)
  }
  }
  GROUP BY (?lang)
  ORDER BY DESC (?percentage)
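
The percentage in the last two queries can be illustrated in Python under one reading of "average positive": a tweet counts as positive if it has only a positive score, or if the mean of its positive and negative scores is above zero. This is a sketch on made-up toy data with hypothetical function names. Note that Python's `round` rounds halves to the nearest even integer, whereas SPARQL's ROUND rounds halves upwards, so results can differ when the value ends in exactly .5.

```python
from collections import defaultdict
from typing import Optional

def is_positive(pos: Optional[float], neg: Optional[float]) -> bool:
    """Positive if only a positive score exists, or the mean of both is > 0."""
    if pos is None:
        return False
    if neg is None:
        return True
    return (pos + neg) / 2 > 0

def positive_percentage(tweets):
    """tweets: iterable of (lang, pos, neg); returns {lang: rounded percent}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for lang, pos, neg in tweets:
        totals[lang] += 1
        if is_positive(pos, neg):
            positives[lang] += 1
    return {lang: round(100 * positives[lang] / totals[lang]) for lang in totals}

# Made-up toy data, not taken from the dataset.
toy = [
    ("de", 0.5, None),    # positive only
    ("de", 0.5, -0.25),   # mean 0.125, counts as positive
    ("de", None, -0.5),   # negative only
    ("en", 0.25, -0.75),  # mean -0.25, not positive
    ("en", 0.5, None),
    ("en", None, -0.5),
    ("en", None, None),   # no sentiment score at all
]
percentages = positive_percentage(toy)
print(percentages)
```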