Arch-It!

dc.contributor.author: Holzmann, Helge
dc.contributor.author: Ruest, Nick
dc.contributor.author: Bailey, Jefferson
dc.contributor.author: Dempsey, Alex
dc.contributor.author: Fritz, Samantha
dc.contributor.author: Milligan, Ian
dc.contributor.author: Willis, Kody
dc.date.accessioned: 2022-11-10T13:18:13Z
dc.date.available: 2022-11-10T13:18:13Z
dc.date.issued: 2022-06-24
dc.description.abstract: Over the past quarter-century, web archive collection has emerged as a user-friendly process thanks to cloud-hosted solutions such as the Internet Archive’s Archive-It subscription service. Despite advancements in collecting web archive content, no equivalent has been found by way of a user-friendly cloud-hosted analysis system. Web archive processing and research require significant hardware resources and cumbersome tools that interdisciplinary researchers find difficult to work with. In this paper, we present ARCH (Archives Research Compute Hub), an interactive interface, closely connected with Archive-It, engineered to provide analytical actions, specifically generating datasets and in-browser visualizations. It efficiently streamlines research workflows while eliminating the burden of computing requirements. Building off past work by both the Internet Archive (Archive-It Research Services) and the Archives Unleashed Project (the Archives Unleashed Cloud), this merged platform achieves a scalable processing pipeline for web archive research. [en_US]
dc.description.sponsorship: This research was supported by the Andrew W. Mellon Foundation’s Public Knowledge program, the Social Sciences and Humanities Research Council of Canada, as well as Start Smart Labs, the University of Waterloo, and York University. [en_US]
dc.identifier.uri: http://hdl.handle.net/10315/39924
dc.language.iso: en [en_US]
dc.relation.ispartofseries: Web Archiving and Digital Libraries;2022
dc.rights: Attribution 4.0 International [*]
dc.rights.journal: https://fox.cs.vt.edu/wadl2022.html [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: web archives [en_US]
dc.subject: Data analytics [en_US]
dc.subject: Distributed systems [en_US]
dc.subject: Information retrieval [en_US]
dc.title: Arch-It! [en_US]
dc.type: Presentation [en_US]
dc.type: Technical Report [en_US]

Files

Original bundle
Name: Arch-It.pdf
Size: 721.95 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.83 KB
Format: Item-specific license agreed upon to submission