360° about the hashtag #ZapataDimisión
A week ago Antonio Delgado suggested I write about the hashtag #ZapataDimisión, but I was too busy reviewing a vital paper for my thesis. I collected the data but had to postpone the analysis until now. The #ZapataDimisión case could mark a before and after in the use of social media in Spain.
Lately Twitter has become a battlefield in which users polarize over any event. When a controversy becomes a TT (Trending Topic) it gains visibility beyond the timeline, sometimes leaving Twitter to become a news item in traditional media. For this reason TT’s have enormous potential for spread, and we always ask ourselves whether a given one was spontaneous or manipulated.
A TT can be compared to a fire whose cause, natural or arson, remains unknown until its origin and the factors that favoured its spread are analysed. In the same way, we analysed the TT #ZapataDimision, looking into the hashtag’s origin, the first propagations, how it spread, which users were spread the most and what role the media played.
Origin and spread
The first tweet detected with the hashtag #zapatadimision was published by José Luis Vázquez at 17:17 on 13 June.
— José Luis Vázquez (@jlvazmar) June 13, 2015
This hashtag was used from the start in various forms, uppercase and lowercase, accented or not. This suggests a spontaneous origin rather than an organized campaign. In addition, growth was organic, not the result of synchronized posting. The spread began to rise when some users with many followers retweeted or posted tweets, such as: hermanntertsch (19:01, 19:23, 22:10, 22:11, 22:44, 22:45), pedroj_ramirez (20:25, 14th 0:16), DavidSummersHG (20:53), mrm8488 (22:13), kastillo62 (23:10), alfonso_ussia (23:23), isanseba (23:31, 23:53, 14th 0:50, 0:51, 1:02, 1:21), hugokeppler (23:40, 14th 0:40), oneto_p (23:54), JesusEncinar (14th 0:57), josecdiez (14th 1:00), JuanfraEscudero (1:14), CubaSinFrontera (14th 2:23). At 00:02:12 on 14 June it was the 5th TT in Madrid.
On the 14th at 8 am the hashtag was reactivated; these users may have contributed to its spread through RT’s or tweets: josecdiez (9:07, 10:39, 11:18), hermanntertsch (9:21, 9:55, 9:58), pedroj_ramirez (9:22, 9:23, 9:38), JesusEncinar (10:30, 10:40), martinvars (10:32, 10:43), fredhermel (10:47), jatirado (10:53, 10:55, 10:58). At 10:16 it became a TT in Spain in 6th position, reaching first place at 11:47.
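Because the hashtag circulated in several spellings (uppercase, lowercase, with or without accents), counting it requires treating those variants as one. The sketch below is one way to do that; the function name `normalise_hashtag` is hypothetical and this is not necessarily how the data for this post was grouped.

```python
import unicodedata

def normalise_hashtag(tag: str) -> str:
    """Strip accents and case so spelling variants of a hashtag compare equal."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", tag)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()

# Spelling variants of the kind described in the post.
variants = ["#ZapataDimisión", "#zapatadimision", "#ZAPATADIMISION", "#ZapataDimision"]
unique_forms = {normalise_hashtag(v) for v in variants}
print(unique_forms)
```

All four variants collapse to a single form, so they can be counted as the same hashtag.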
The following interactive image contrasts the global spread with the polarized propagation. The RT relation was used to determine the polarity of the users who participated. Users tend to retweet what they agree with, so some users are more closely interrelated than others and form communities. With this method (see the methodology at the foot of the post) users were classified as pro-Zapata or anti-Zapata and their tweets were segregated into two separate datasets to follow their evolution over time. These two groups published 94.74% of the tweets; the remaining 5.26% came from uncertain users (who retweeted both polarities) and from people who did not retweet. The accuracy of this classification was checked manually on a random sample of 300 tweets from each dataset. Precision was 94.33% for anti-Zapata tweets and 92% for pro-Zapata tweets.
We can see that tweet growth was organic, with diffusion rising in small jumps possibly induced by users with many followers who retweeted or tweeted. On 13 June the anti-Zapata users were the majority (the lines for global and anti-Zapata tweets overlap). On 14 June pro-Zapata tweets appeared when the hashtag became a TT in Spain. Except at 18:00, anti-Zapata tweets outnumbered pro-Zapata tweets.
We can deduce that the origin was spontaneous and that the spread was favoured by users with many followers who had a liberal-conservative political leaning or were sympathizers of the Jewish community.
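As a rough illustration of the RT-based idea, the sketch below labels each retweeter by the sides of the accounts they retweeted. It is a simplification and not the community detection used for this post, which was done on the RT graph with Gephi. It assumes the retweeted accounts already carry a side; the names `classify_retweeters` and `author_side` and the toy data are hypothetical.

```python
from collections import defaultdict

def classify_retweeters(retweets, author_side):
    """Label each retweeter from the sides of the accounts they retweeted.

    retweets: iterable of (retweeter, retweeted_author) pairs.
    author_side: dict mapping an already-labelled author to "anti" or "pro"
                 (assumed known beforehand, e.g. from community detection).
    """
    sides_seen = defaultdict(set)
    for retweeter, author in retweets:
        side = author_side.get(author)
        if side is not None:
            sides_seen[retweeter].add(side)

    labels = {}
    for retweeter, sides in sides_seen.items():
        # One side only: assume agreement. Both sides: uncertain.
        labels[retweeter] = next(iter(sides)) if len(sides) == 1 else "uncertain"
    return labels

# Toy data, invented for illustration only.
author_side = {"hub_a": "anti", "hub_b": "pro"}
retweets = [
    ("u1", "hub_a"), ("u1", "hub_a"),
    ("u2", "hub_b"),
    ("u3", "hub_a"), ("u3", "hub_b"),
]
labels = classify_retweeters(retweets, author_side)
print(labels)
```

Users who never retweet a labelled account get no label at all, which corresponds to the "did not retweet" remainder mentioned above.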
Much Ado About Nothing?
One way to understand the true dimension of a TT is to measure how many tweets were original and how many were RT’s, and how RT’s were distributed per user. Participation on social media follows a power-law distribution in which a minority is the source of most of the information spread. Broadly, this lets us identify the critical mass that produced most of the messages.
The result did not surprise us; it is similar to other cases. Of the 130,954 tweets overall, 104,580 were RT’s (79.86%), so only 20.14% was original content. Of the 80,867 anti-Zapata tweets, 65,527 (81.03%) were RT’s, hence 18.97% were original tweets.
To find the distribution of anti-Zapata RT’s, users were classified according to their activity (classification details are given in the methodology below). The speakers, 508 users, received 52,415 RT’s, so 2.65% of users obtained 79.99% of the RT’s against Zapata.
The following interactive graphic shows the classification of anti-Zapata users based on the impact of their tweets. Clicking on a user group lists its users at the bottom, ordered by the number of RT’s received.
Likewise, the following interactive graphic shows the RT activity of anti-Zapata users. We found 2,830 users who retweeted 37,641 messages, so 14.74% of users retweeted 57.44% of messages.
The hard core of those who moved tweets was about 3,500 users. We can conclude that there was more bark than bite.
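The concentration figures can be rechecked from the counts reported in this post. The sketch below recomputes them and shows how the share held by the most retweeted users could be measured; the helper names `share` and `top_share` and the toy RT counts are hypothetical.

```python
def share(part, total):
    """Percentage of total represented by part, rounded to two decimals."""
    return round(100 * part / total, 2)

def top_share(rts_received, top_n):
    """Percentage of all RTs received by the top_n most retweeted users."""
    ranked = sorted(rts_received, reverse=True)
    return share(sum(ranked[:top_n]), sum(ranked))

# Counts reported in this post.
print(share(104_580, 130_954))  # RTs among all tweets
print(share(65_527, 80_867))    # RTs among anti-Zapata tweets
print(share(508, 19_205))       # speakers among anti-Zapata users
print(share(52_415, 65_527))    # anti-Zapata RTs received by speakers

# Toy distribution (invented): one user receives most of the RTs.
print(top_share([80, 10, 5, 3, 2], top_n=1))
```

With real data, `top_share` would take the list of RT’s received per user, which makes the power-law shape visible as a small `top_n` capturing most of the total.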
The role of the speakers and the media
To see how the speakers and the media influenced the spread, we created three graphics aligned in time. The first shows the evolution over time of total, anti-Zapata and pro-Zapata tweets. The second shows the distribution of the twenty most widespread tweets, both anti- and pro-Zapata (a colour code groups them: blue for anti-Zapata, purple for pro-Zapata and green for neutral; hovering over the graphic displays the text of the tweet and the number of RT’s it had at that time). The third shows the spread of the tweets from media and journalists.
During 13 June the diffusion pattern of the second and third graphs was similar, because almost all the messages from pedroj_ramirez and isanseba were in the top 20 and no other journalists were publishing tweets. On 14 June, the propagation from journalists lost weight against other influential users.
We could conclude that the messages from pedroj_ramirez and isanseba helped to make the hashtag a TT on the 13th, but that on the 14th other users intervened.
Pro or contra Zapata?
The polarization of users is reflected in this RT graph. On the left, in a single community, are those who supported Zapata; on the right are those who attacked him, made up of several groups sympathetic to the Jewish community and people of liberal-conservative ideology.
Methodology
- The data was obtained with the Twitter REST API search method, requesting tweets that matched the query “#ZapataDimision OR #dimisióndeZapataySoto”
- 130,954 tweets from 37,979 different Twitter users were captured from 06/13/2015 17:17:37 to 06/15/2015 14:39:09
- To determine the polarity of tweets, semantic analysis was not used (I’m not an expert in this technology); instead we used the RT relation. Users tend to RT messages they agree with, so some users are more interrelated than others and communities are formed. The polarity was obtained from the RT graph using Gephi. Several communities were detected, and their users were classified as anti-Zapata (19,205 users), pro-Zapata (13,782 users) or uncertain, meaning they retweeted both polarities (1,378 users)
- Once users were identified as anti- or pro-Zapata, their tweets were segregated in order to analyse them separately. 80,867 tweets were classified as anti-Zapata (61.75%), 43,193 as pro-Zapata (32.98%), and 5.26% were uncertain or from users who did not retweet. The accuracy of this classification was checked manually on a random sample of 300 tweets from each dataset. Precision was 94.33% for anti-Zapata tweets and 92% for pro-Zapata tweets
- Anti-Zapata users were classified according to their activity and impact into these categories:
  - Speaker: the number of RT’s received was more than four times the number of tweets published. There were three groups: high speakers, the users with the most impact, who together accounted for 20% of RT’s; medium speakers, the next most retweeted users, who obtained 30% of RT’s; and low speakers, the rest
  - Networker: high activity, an above-average number of RT’s received, and a balance between RT’s sent and received
  - Retweeter: high activity and more RT’s than own tweets
  - Replier: most of their tweets are replies. As replies do not appear on the timeline, they usually don’t receive RT’s
  - Monologist: above-average activity and a low number of RT’s received
  - Normal: other users who do not follow any of these specific behaviour patterns
- The spread of the 20 most widespread tweets and of the tweets from media and journalists was analysed to see how they favoured the overall expansion
- The RT graph was drawn with Gephi
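
The speaker rule and its three tiers described in the list above can be sketched as follows. This is one possible reading of the description, not the code used for the analysis: the function names, the input layout, and the choice to measure the 20% and 30% cut-offs against the RT total passed in are all assumptions.

```python
def is_speaker(tweets_published, rts_received):
    """Speaker rule: RTs received exceed four times the tweets published."""
    return rts_received > 4 * tweets_published

def speaker_tiers(users, total_rts, high_cut=0.20, medium_cut=0.30):
    """Split speakers into high / medium / low tiers by cumulative RTs received.

    users: dict name -> (tweets_published, rts_received).
    total_rts: RT total against which the cut-offs are measured (assumption).
    """
    speakers = sorted(
        (name for name, (tweets, rts) in users.items() if is_speaker(tweets, rts)),
        key=lambda name: users[name][1],
        reverse=True,
    )
    tiers, cumulative = {}, 0
    for name in speakers:
        # The tier depends on the RTs accumulated before adding this user.
        if cumulative < high_cut * total_rts:
            tiers[name] = "high"
        elif cumulative < (high_cut + medium_cut) * total_rts:
            tiers[name] = "medium"
        else:
            tiers[name] = "low"
        cumulative += users[name][1]
    return tiers

# Toy data, invented for illustration: name -> (tweets published, RTs received).
users = {"a": (2, 30), "b": (1, 25), "c": (3, 20), "d": (1, 15), "e": (10, 10)}
tiers = speaker_tiers(users, total_rts=100)
print(tiers)
```

The other categories (networker, retweeter, replier, monologist) are left out because their thresholds ("high activity", "above average") are not given numerically in the description.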