At Saturday’s hackathon, we will give an overview of the datasets available, have short talks from affiliated projects and services, and point to tools and methods for analyzing the hackathon’s data. We plan for a loose, informal event. for the event and publicly accessible online:
Obama Administration White House social media from 2009-current, including Twitter, Tumblr, Vine, Facebook, and (possibly) YouTube
Comprehensive web archive data of current White House websites: whitehouse.gov, petitions.whitehouse.gov, letsmove.gov and other .gov websites
The End of Term Web Archives, a large-scale buy sales lead collaborative effort to preserve the federal government web ( .gov/.mil) at presidential transitions, including web data from 2008, 2012, and our current 2016 project
Special sub-collections of government data, such as every powerpoint in the Internet Archive’s web archive from the .mil web domain
Extensive archives of of social media data related to the 2016 election including data from candidates, pundits, and media
Full text transcripts of Trump candidate speeches
Python notebooks, cluster computing tools, and pointers to methods for playing with data at scale.
Much of this data was collected in partnership with other libraries and with the support of external funders. We thank, foremost, the current White House Office of Digital Strategy staff for their advocacy for open access and working with us and others to make their social media open to the public. We also thank our End of Term Web Archive partners and related community efforts helping preserve the .gov web, as well as the funders that have supported many of the collecting and engineering efforts that makes all this data publicly accessible, including the Institute of Museum and Library Services, Altiscale, the Knight Foundation, the Democracy Fund, the Kahle-Austin Foundation, and others.