Building Social Media Collections
					Presentation to Management Council
					
						Brian Dietz & Jason Ronallo
					
				
				
				
					Preponderance of Social Media
				
				
					2.2 Billion Active Social Media Users
					30% global usage
					http://wearesocial.net/blog/2015/08/global-statshot-august-2015/
				
					
						In 2015 in the US
						65% of adults use at least one social media platform
						http://www.pewinternet.org/2015/08/19/the-demographics-of-social-media-users/
					
					
						Percentage of Adult Internet Users, by Platform
						
- 72% Facebook
- 28% Instagram
- 23% Twitter

Percentage of Young Adult Internet Users (18-29), by Platform
						
- 82% Facebook
- 55% Instagram
- 32% Twitter

Researchers are Taking Note
					
					
						In a meta-analysis of studies using data from Twitter, at least seventeen different disciplines were represented in 382 studies spread over six years.
						Michael Zimmer and Nicholas John Proferes, “A Topology of Twitter Research: Disciplines, Methods, and Ethics,” Aslib Journal of Information Management 66, no. 3 (2014): 250–61.
					
					
						Twitter Research Data Grants
						
- Foodborne Gastrointestinal Illness (US)
- Disaster Information Analysis (Japan)
- Cancer Early Detection Campaigns (Netherlands)
- Modelling Urban Flooding in Jakarta (Australia)

But Why Archive Social Media Data?
					
					
						Discourse Relevant to Archival Collections
						How could we not want to preserve a vast record of everyday life and thoughts from tens of millions of people, however mundane?
						Dan Cohen, Digital Ephemera and the Calculus of Importance.
					
					
						Perceived Value of Social Media Data Among SCRC Researchers
					
					
					Serious discourse occurs on social media?
					45% agreed
					22% strongly agreed
					(67% combined)
				
				
					Value in using social media data in research?
					34% agreed
					37% strongly agreed
					(71% combined)
				
					
						What Does This Content Represent?
					
					
						Official Records
						 
						@NCState
					
					
						Everyday Experience
						 
						#ThinkAndDo
					
					
						Significant Events
						 
						#OurThreeWinners
					
					
						Greater Representation in the Archival Record
						
- Increase diversity of voices in historic record
- Build more representative collections

Engagement With New Communities & Deeper Engagement With Existing Communities
					
					
						My #HuntLibrary
						 
					
					
						My #HuntLibrary
						
- Crowdsourced storytelling
- Multiple Access Layers
- Battles, Voting, Moderation
- Award-winning

Archival Component
						 
						That's pretty legit! Appreciate the props #huntlibrary!
					
					
						My #HuntLibrary User Study
						75% listed contributing to the archive as a main motivator for participating.
					
					
						“Even Better Than Winning an iPad!”
					
					
						“New Voices and Fresh Perspectives”
					
					
						2014-15 LSTA EZ Innovation Grant
						
- Administered by the NC State Library
- Collaboration between SCRC and DLI
- Guidance from Copyright & Digital Scholarship Center
- Significant contributions from student assistants

Project Goals
						
- Establish groundwork for a social media collecting program at NCSU Libraries
- Develop free, web-based documentary toolkit
- Develop open, easily deployable collecting environment

Not collecting all of Twitter and Instagram!
						 
					
					
						Historians of the English Civil War are deeply thankful that Humphrey Bartholomew had the presence of mind to save 50,000 pamphlets (once considered throwaway pieces of hack writing) from the seventeenth century and give them to a library at Oxford.
						Dan Cohen, Digital Ephemera and the Calculus of Importance.
					
					
						SCRC Collecting Strengths
						Largely focused on NC State History
						
					
					
						Identifying Content
						
							- Targeted accounts
- Hashtags
- Keywords
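
The three ways of identifying content listed above can be sketched as a simple scope check. This is a minimal illustration only, not how Lentil or Social Feed Manager actually select content: the `post` dictionary shape, the `CRITERIA` structure, and the `matches` function are hypothetical, and the keyword entry is an assumed example.

```python
import re

# Hypothetical collecting criteria. The account and hashtags appear in the
# slides; the keyword is an illustrative assumption.
CRITERIA = {
    "accounts": {"ncstate"},
    "hashtags": {"thinkanddo", "huntlibrary", "packapalooza"},
    "keywords": {"krispy kreme challenge"},
}


def matches(post, criteria=CRITERIA):
    """Return the reasons a post falls within collecting scope.

    `post` is assumed to be a dict with "account" and "text" keys.
    """
    reasons = []
    text = post["text"].lower()

    # Targeted accounts: compare the handle without its leading "@".
    if post["account"].lower().lstrip("@") in criteria["accounts"]:
        reasons.append("account")

    # Hashtags: extract every #tag from the text, case-insensitively.
    tags = {t.lower() for t in re.findall(r"#(\w+)", post["text"])}
    if tags & criteria["hashtags"]:
        reasons.append("hashtag")

    # Keywords: plain substring match on the lowercased text.
    if any(keyword in text for keyword in criteria["keywords"]):
        reasons.append("keyword")

    return reasons


if __name__ == "__main__":
    print(matches({"account": "@NCState", "text": "Welcome back! #ThinkAndDo"}))
```

An empty list means the post is out of scope; a post can match on more than one criterion, which may be worth recording alongside the harvested item.
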
Account-based Twitter Harvests
					
					
						@NCState
						 
					
					
						Colleges and Departments
						 
					
					
						DASA
						 
					
					
						Student Organizations
						 
					
					
						And About 460 Other Accounts
					
						
					
					
						Hashtag-based Instagram and Twitter Harvests
						
					
					
						NCSU16 - NCSU20
						 
					
					
						NCStateOnCampus
						 
					
					
						Packapalooza
						 
					
					
						Homecoming
						 
					
					
						Krispy Kreme Challenge
						 
					
					
						This Data Tells Part of the University's History
					
					
						Documentary Toolkit
						To help other institutions kickstart their own collecting initiatives
						
- Environmental scan
- Research value
- Legal and ethical analysis
- Documentation
- Surveys

Contributions to the Profession
  Open Source
  https://github.com/NCSU-Libraries/
	Lentil
	 
  Technical Requirements for Social Media Archiving Tools
  
    - Social Feed Manager
    
- Lentil
    
- Linux:
      sudo apt-get install git apache2 python-dev python-virtualenv postgresql libxml2-dev libxslt1-dev libpq-dev libapache2-mod-wsgi supervisor
Along with email, social media will probably provide the main source of information for researchers studying our current time. However, our institution just does not have the resources right now to collect and store the social media of other people or organizations.
	NCSU Social Media Archives Toolkit survey of North Carolina Cultural Heritage Organizations
	Social Media Combine
	Virtualized social media harvesting environment
	https://github.com/NCSU-Libraries/Social-Media-Combine
	Server Virtualization
  (not desktop virtualization)
  Virtual Machines/Virtual Servers
   
  Virtual Machine on Your Laptop
   
  Repurposing Virtualization
  Future Plans
  
- Continued collecting
- Access
- Best practices
- Outreach and campus partners
- Challenges of social media archiving
- Campus collaborators

Thanks!
  Brian Dietz bjdietz@ncsu.edu
  Jason Ronallo jnronall@ncsu.edu
	Associated content?
	
- Linked web pages
- Replies
- Videos and other media
- Retweeting account info
- Engagement metrics

Availability and access
	
		- What is the "whole" dataset if it is constantly being revised?
- How do we redistribute unstable data?
- How can research results be reproduced?