Web Archives Analysis at Scale with the Archives Unleashed Cloud

Title: Web Archives Analysis at Scale with the Archives Unleashed Cloud
Author: Ruest, Nick
Milligan, Ian
Abstract: Web archives, repositories of born-digital information dating back to the Internet Archive and national libraries in the mid-1990s, are rich resources covering topics of interest to humanities and social sciences scholars. Imagine a political historian studying elections, a historian studying youth culture in the late 1990s, or a scholar of the military or policy exploring how wars were reflected online. Yet while we have been collecting this information for over two decades, access has lagged: most scholars are limited to working with web archives one page at a time through portals such as the Wayback Machine. With the rise of the digital humanities, the computational social sciences, and web science more generally, scholars increasingly have the ability and desire to work with data at scale. In this presentation, we introduce the Archives Unleashed Cloud, currently supported through a grant from The Andrew W. Mellon Foundation. This service facilitates (a) the transfer of web archival data to the Cloud; (b) its analysis and transformation into standard scholarly derivatives; and (c) the building of a community around it via in-person events and learning guides. Our presentation introduces the Cloud and its motivation, discusses its technical underpinnings, and then explores our current sustainability plan to keep the Archives Unleashed Cloud running after our foundation funding ends in 2020.
Sponsor: This work is primarily supported by the Andrew W. Mellon Foundation. Additional funding has come from the U.S. National Science Foundation, Columbia University Library's Mellon-funded Web Archiving Incentive Award, the Natural Sciences and Engineering Research Council of Canada, the Social Sciences and Humanities Research Council of Canada, and the Ontario Ministry of Research and Innovation's Early Researcher Award program.
Subject: web archives
web archive analysis
sustainability
cloud computing
Type: Presentation
Rights: Attribution-ShareAlike 2.5 Canada
http://creativecommons.org/licenses/by-sa/2.5/ca/
URI: http://hdl.handle.net/10315/36119
Date: 2019-04-08


Except where otherwise noted, this item's license is described as Attribution-ShareAlike 2.5 Canada