Building Social Media Collections
Presentation to University Library Committee
Brian Dietz & Jason Ronallo
NCSU Libraries
Preponderance of Social Media
2.2 Billion Active Social Media Users
30% global usage
http://wearesocial.net/blog/2015/08/global-statshot-august-2015/
In 2015 in the US
65% of adults use at least one social media platform
http://www.pewinternet.org/2015/08/19/the-demographics-of-social-media-users/
Percentage of Adult Internet Users, by Platform
- 72% Facebook
- 28% Instagram
- 23% Twitter
Percentage of Young Adult Internet Users (18-29), by Platform
- 82% Facebook
- 55% Instagram
- 32% Twitter
Researchers are Taking Note
In a meta-analysis of studies using data from Twitter, at least seventeen different disciplines were represented in 382 studies spread over six years.
Michael Zimmer and Nicholas John Proferes, “A Topology of Twitter Research: Disciplines, Methods, and Ethics,” Aslib Journal of Information Management 66, no. 3 (2014): 250–61.
Twitter Research Data Grants
- Foodborne Gastrointestinal Illness, Harvard Medical School / Boston Children’s Hospital
- Disaster Information Analysis, NICT (Japan)
- Cancer Early Detection Campaigns on Twitter, University of Twente (Netherlands)
- Happiness of Cities, UC San Diego
- Modelling Urban Flooding in Jakarta, University of Wollongong (Australia)
- Sports Team Performance, University of East London
But Why Archive Social Media Data?
Discourse Relevant to Archival Collections
How could we not want to preserve a vast record of everyday life and thoughts from tens of millions of people, however mundane?
Dan Cohen, Digital Ephemera and the Calculus of Importance.
Perceived Value of Social Media Data Among Archival Researchers
Serious discourse occurs on social media?
45% agreed
22% strongly agreed
(67% combined)
Value in using social media data in research?
34% agreed
37% strongly agreed
(71% combined)
Official Records
@NCState
Everyday Experience
#ThinkAndDo
Significant Events
#WomenInSTEM
Greater Representation in the Archival Record
- Increase diversity of voices in the historical record
- Build more representative collections
My #HuntLibrary
Crowdsourced storytelling
Multiple Access Layers
Battles, Voting, Moderation
Archival Component
“New Voices and Fresh Perspectives”
2014-15 LSTA EZ Innovation Grant
- Administered by the NC State Library
- Collaboration between SCRC and DLI
- Guidance from Copyright & Digital Scholarship Center
- Significant contributions from student assistants
Project Goals
- Establish groundwork for a social media collecting program at NCSU Libraries
- Develop free, web-based documentary toolkit
- Develop open, easily deployable collecting environment
Collecting Program
Not collecting all of Twitter and Instagram!
Focus on Current SCRC Collecting Strengths
- NCSU History
- Architecture, Landscape Architecture, Design
- Animal Rights and Welfare
- Entomology
- History of Computing and Simulation
- Plant and Forestry Genetics and Genomics
- Textiles
- Veterinary Medicine
- Zoological Health
Identifying Content
- Targeted accounts
- Hashtags
- Keywords
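A minimal sketch of how the three selectors above (accounts, hashtags, keywords) could be applied to a harvested post. This is an illustration only, not the project's actual harvesting code: the post dictionary shape and the names HarvestCriteria and match_reasons are hypothetical, and the sample post is invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def _norm(name: str) -> str:
    """Lowercase a handle or hashtag and drop a leading @ or #."""
    return name.lstrip("@#").lower()


@dataclass
class HarvestCriteria:
    """Hypothetical container for the three kinds of selectors."""
    accounts: set[str] = field(default_factory=set)
    hashtags: set[str] = field(default_factory=set)
    keywords: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Normalize once so matching is case-insensitive.
        self.accounts = {_norm(a) for a in self.accounts}
        self.hashtags = {_norm(h) for h in self.hashtags}
        self.keywords = {k.lower() for k in self.keywords}


def match_reasons(post: dict, criteria: HarvestCriteria) -> list[str]:
    """Return which selectors a post satisfies (empty list: not collected)."""
    reasons = []
    if _norm(post.get("account", "")) in criteria.accounts:
        reasons.append("account")
    if {_norm(t) for t in post.get("hashtags", [])} & criteria.hashtags:
        reasons.append("hashtag")
    text = post.get("text", "").lower()
    if any(k in text for k in criteria.keywords):
        reasons.append("keyword")
    return reasons


# Selectors taken from the slides; the post itself is made up (assumed shape).
criteria = HarvestCriteria(
    accounts={"@NCState"},
    hashtags={"#ThinkAndDo", "#WomenInSTEM"},
    keywords={"Hunt Library"},
)
sample_post = {
    "account": "NCState",
    "text": "Studying at the Hunt Library today #ThinkAndDo",
    "hashtags": ["ThinkAndDo"],
}
print(match_reasons(sample_post, criteria))
```

Recording the reason a post matched, rather than a simple yes or no, keeps a trace of why each item entered the collection.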
Account-based harvests
- Alumni Assoc.
- AfAm Cultural Center
- Arts NC STATE
- Coll + Depts
- DASA
- GLBT Center
- Grad School
- Libraries
- NCSU
- NC Mod Houses
- OIT
- Park Scholarship
- Technician
- Welcome Week
- WKNC
- 450+ more
Hashtag-based harvests
- Agriculture Awareness Week
- Art2Wear
- Chapel Hill Shooting
- Free Expression Tunnel
- Hofmann Forest
- Homecoming
- Hoops 4 Hope
- Hunt Library
- Krispy Kreme Challenge
- NCState16 - NCState20
- NCStateOnCampus
- Pack Athletics
- Packapalooza
- Pan Afrikan Festival
- Shack-a-thon
- Talley Grand Opening
- Think And Do
- Women In STEM
This Data Is University History
Documentary Toolkit
To help other institutions kickstart their own collecting initiatives
- Environmental scan
- Research value
- Legal and ethical analysis
- Documentation
- Surveys
Contributions to the Profession
Open Source
https://github.com/NCSU-Libraries/
Lentil
Along with email, social media will probably provide the main source of information for researchers studying our current time. However, our institution just does not have the resources right now to collect and store the social media of other people or organizations.
NCSU Social Media Archives Toolkit survey of North Carolina Cultural Heritage Organizations
Social Media Combine
Virtualized social media harvesting environment
https://github.com/NCSU-Libraries/Social-Media-Combine
Repurposing Virtualization
Future Plans
- Continued collecting
- Access
- Best practices
- Outreach and campus partners
- Challenges of social media archiving
Thanks!
Brian Dietz bjdietz@ncsu.edu
Jason Ronallo jnronall@ncsu.edu
Associated content?
- Linked web pages
- Replies
- Videos and other media
- Retweeting account info
- Engagement metrics
Availability and access
- What is the "whole" dataset if it is constantly being revised?
- How do we redistribute unstable data?
- How can research results be reproduced?