From archive to analysis: accessing web archives at scale through a cloud-based interface
(Springer Nature, 2021-01-06)
This paper introduces the Archives Unleashed Cloud, a web-based interface for working with web archives at scale. Current access paradigms, largely driven by the scope and scale of web archives, generally involve using the ...
Building Community at Distance: A Datathon during COVID-19
(Digital Library Perspectives, 2020-08-04)
This paper aims to use the experience of an in-person event that was forced to go virtual in the wake of COVID-19 as an entryway into a discussion on the broader implications around transitioning events online. It gives ...
Content-Based Exploration of Archival Images Using Neural Networks
(ACM/IEEE, 2020-08)
We present DAIRE (Deep Archival Image Retrieval Engine), an image exploration tool based on latent representations derived from neural networks, which allows scholars to "query" using an image of interest to rapidly find ...
The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives
(ACM/IEEE, 2020-08)
The Archives Unleashed project aims to improve scholarly access to web archives through a multi-pronged strategy involving tool creation, process modeling, and community building -- all proceeding concurrently in mutually ...
Building Community and Tools for Analyzing Web Archives through Datathons
(2019)
Starting in March 2016, the Archives Unleashed team and our collaborators have brought together social scientists, humanists, archivists, librarians, computer scientists, and other stakeholders to explore web archives as ...
The Cost of a WARC: Analyzing Web Archives in the Cloud
(2019)
The value of web archives to support scholarship in the humanities and social sciences is slowly being realized by the increasing availability of scalable tools and platforms. The cost of providing scholarly access is a ...
The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration
(2019)
This paper introduces the Archives Unleashed Notebook, which is designed to work with derivative datasets from the Archives Unleashed Cloud, a platform for analyzing web archives. These datasets contain common starting ...
Fostering Community Engagement through Datathon Events: The Archives Unleashed Experience
(Alliance of Digital Humanities Organizations, 2021-03)
This article explores the impact that a series of Archives Unleashed datathon events have had on community engagement both within the web archiving field, and more specifically, on the professional practices of attendees. ...
ABCDEF - The 6 key features behind scalable, multi-tenant web archive processing with ARCH: Archive, Big Data, Concurrent, Distributed, Efficient, Flexible
(ACM, 2022-06-20)
Over the past quarter-century, web archive collection has emerged as a user-friendly process thanks to cloud-hosted solutions such as the Internet Archive’s Archive-It subscription service. Despite advancements in collecting ...