An exploratory look at 3,039,804 #elxn42 tweets

Ruest, Nick; Milligan, Ian

dc.contributor.author	Ruest, Nick
dc.contributor.author	Milligan, Ian
dc.date.accessioned	2016-04-14T11:56:05Z
dc.date.available	2016-04-14T11:56:05Z
dc.date.issued	2016-04-14
dc.description.abstract	This presentation examines the tools, approaches, collaboration, and findings of the Web Archives for Historical Research Group around the capture and analysis of Twitter for the 2015 Canadian Federal Election. While Twitter is not a representative sample of broader society (Pew Research notes that it skews young, college-educated, and affluent, with household income above $50,000), Twitter still represents an exponential increase in the amount of information generated, retained, and preserved from non-elite people. Therefore, when historians study the 2015 federal election, Twitter will be a prime source. On August 3, 2015, the team initiated both a search API and a stream API collection with twarc using the hashtag #elxn42. Data collection ceased on November 5, 2015, the day after Justin Trudeau was sworn in as the 42nd Prime Minister of Canada. We collected for a total of 102 days, 13 hours, and 50 minutes. To analyze the data set, we took advantage of a number of utilities that are available within twarc and twarc-report, as well as jq, Mathematica, and Apache Spark Notebook. In accordance with the Twitter ToS, we also hosted the tweet IDs in an institutional repository. Our analytics included: breaking tweet text down by day to track change over time; client analysis, allowing us to see how the scale of mobile devices affected medium interactions; URL analysis, comparing both to Archive-It collections and the Wayback Availability API to add to our understanding of crawl completeness; and image analysis, using an archive of extracted images. Our presentation introduces our collecting work and the analysis we have done, and provides a framework for other collecting institutions to do similar work with our off-the-shelf open-source tools. We hope that national libraries and other institutions will find our model useful as they consider how to archive ongoing events using Twitter.	en_US
dc.description.sponsorship	International Internet Preservation Consortium (IIPC)
dc.identifier.uri	http://hdl.handle.net/10315/31087
dc.language.iso	en
dc.rights	Attribution-ShareAlike 2.5 Canada	*
dc.rights.uri	http://creativecommons.org/licenses/by-sa/2.5/ca/	*
dc.subject	social media	en
dc.subject	web archives	en
dc.subject	text analysis	en
dc.subject	iipc	en
dc.subject	#elxn42	en
dc.subject	twitter	en
dc.subject	json	en
dc.title	An exploratory look at 3,039,804 #elxn42 tweets	en
dc.type	Presentation
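
The per-day and client breakdowns mentioned in the abstract could be sketched roughly as follows. This is not the authors' code (the abstract names twarc, twarc-report, jq, Mathematica, and Apache Spark Notebook); it is a minimal standard-library illustration that assumes line-delimited tweet JSON in the Twitter v1.1 layout, with a `created_at` timestamp and an HTML-wrapped `source` field. The function name `summarize` is hypothetical.

```python
import json
import re
from collections import Counter
from datetime import datetime

# Assumption: v1.1-style timestamps, e.g. "Mon Oct 19 23:59:59 +0000 2015".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize(lines):
    """Count tweets per calendar day and per posting client.

    `lines` is any iterable of strings, one JSON-encoded tweet per line.
    Returns two Counters: (per_day, per_client).
    """
    per_day = Counter()
    per_client = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        per_day[created.date().isoformat()] += 1
        # The source field is assumed to be an HTML anchor; keep only its text.
        client = re.sub(r"<[^>]+>", "", tweet.get("source", "")) or "unknown"
        per_client[client] += 1
    return per_day, per_client
```

Passing an open file handle for a line-delimited tweet file to `summarize` yields the two counters; `per_day.most_common()` then lists the busiest days first.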

Files

Original bundle

Name: An exploratory look at 3,039,804 #elxn42 tweets.pdf
Size: 3.69 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.83 KB
Format: Item-specific license agreed upon to submission

Collections

YUL research and professional contributions