Jason Ronallo
NCSU Libraries
Interim Head of Digital Library Initiatives
@ronallo
"It's hot in here." translated to "Il fait chaud ici." "It's not the heat it's the humanity." translated to "La chaleur humaine!"
From Brigadoon (1954)
Audio description is narration of the visual content of a video.
Audio description is an art. How do you fit the description into the pauses in the regular audio content?
Works today with text-to-speech through a Chrome browser plugin.
<video>
<source src="video.webm">
<track src="track.vtt">
</video>
WebVTT is...
WEBVTT

00:00.000 --> 00:03.000
This is the first cue.

00:03.000 --> 00:07.000
This is the second cue and
breaks over two lines.

00:07.000 --> 00:10.000
Line breaks are kept though
additional line breaks
may be added to make the text fit
the available space for the cue.
$('#no-controls-controls .play').on('click', function () {
  $('video#no-controls')[0].play();
});
President John F. Kennedy Speech at conference on milk and nutrition
Look in the links for the open source tools to do all of this!
*Or use scene detection to create the thumbnails.
Use any text or data for the WebVTT cue payload.
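When the payload is JSON, as with the `title` and `image` keys used in this deck, a script can read it back as each cue becomes active. The following is a rough sketch, not a definitive implementation: `parseCuePayload` and `watchMetadataTrack` are hypothetical helper names, and the `track` argument is assumed to behave like a browser `TextTrack` (it has `addEventListener`, `activeCues`, and `mode`).

```javascript
// Parse a cue's text as JSON; return null when the payload is not valid JSON.
function parseCuePayload(cueText) {
  try {
    return JSON.parse(cueText);
  } catch (err) {
    return null;
  }
}

// Call onData(payload, cue) for every active cue each time the cues change.
// Assumption: `track` is a TextTrack-like object supplied by the caller.
function watchMetadataTrack(track, onData) {
  // "hidden" keeps the cues loaded and firing events without rendering them.
  track.mode = "hidden";
  track.addEventListener("cuechange", function () {
    const active = track.activeCues || [];
    for (let i = 0; i < active.length; i++) {
      const data = parseCuePayload(active[i].text);
      if (data !== null) onData(data, active[i]);
    }
  });
}
```

In a page this might be wired up as `watchMetadataTrack(document.querySelector('video').textTracks[0], function (data) { /* show data.image */ })`, assuming the first text track is the one carrying the JSON cues.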
WEBVTT

00:01.000 --> 00:03.000
{"title": "First image",
"image": "http://example.com/images/first.png"}

00:05.000 --> 00:07.000
{"title": "2nd image",
"image": "http://example.com/images/2nd.png"}

00:08.000 --> 00:10.000
{"title": "Image C",
"image": "http://example.com/images/c.png"}
WEBVTT

00:00:00.000 --> 00:00:05.000
http://example.co/sprite.jpg#xywh=0,0,150,100

00:00:05.000 --> 00:00:10.000
http://example.co/sprite.jpg#xywh=150,0,150,100

00:00:10.000 --> 00:00:15.000
http://example.co/sprite.jpg#xywh=300,0,150,100

00:00:15.000 --> 00:00:20.000
http://example.co/sprite.jpg#xywh=450,0,150,100
Get part of an image.
http://example.co/sprite.jpg#xywh=450,0,150,100
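The `#xywh=x,y,width,height` fragment names a rectangle within the sprite image. Browsers generally do not crop an image for you based on this fragment, so a player script has to interpret it. Here is one possible sketch that splits the URL and turns the region into CSS background properties for a thumbnail element. `parseSpriteUrl` and `spriteStyle` are illustrative names, not functions from the plugins mentioned in this deck, and only pixel units are handled.

```javascript
// Split a sprite URL with a spatial media fragment into its parts.
// Returns null if the URL has no pixel-based #xywh= fragment.
function parseSpriteUrl(url) {
  const match = /^(.*)#xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(url.trim());
  if (!match) return null;
  return {
    src: match[1],
    x: Number(match[2]),
    y: Number(match[3]),
    w: Number(match[4]),
    h: Number(match[5])
  };
}

// Build style properties that show only the named region of the sprite.
// The background is shifted up and left so the region sits at the origin.
function spriteStyle(region) {
  return {
    width: region.w + "px",
    height: region.h + "px",
    backgroundImage: "url(" + region.src + ")",
    backgroundPosition: "-" + region.x + "px -" + region.y + "px"
  };
}
```

The returned object can be applied to an element with something like `Object.assign(thumbnail.style, spriteStyle(region))`, where `thumbnail` is whatever element the player uses for the preview.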
MediaElement.js time rail thumbnails plugin. https://github.com/jronallo/mep-feature-time-rail-thumbnails
video-sprites https://github.com/jronallo/video-sprites
WEBVTT

00:00.000 --> 00:01.600 line:40%
<00:00.100><c>He</c> <00:00.200><c>drinks</c>
<00:00.400><c>a</c> <00:00.900><c>whiskey</c>
<00:01.000><c>drink</c>

00:01.601 --> 00:02.699 line:40%
<00:01.602><c>He</c> <00:01.700><c>drinks</c>
<00:02.000><c>a</c> <00:02.100><c>vodka</c>
<00:02.300><c>drink</c>

00:02.700 --> 00:03.770 line:40%
<00:02.800><c>He</c> <00:03.000><c>drinks</c>
<00:03.200><c>a</c> <00:03.300><c>lager</c>
<00:03.600><c>drink</c>
video#tubthumping::cue { font-size: 3em; }
video#tubthumping::cue(:past) { color: red }
video#tubthumping::cue(:future) { color: gray }
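Writing the inline timestamps by hand gets tedious, so karaoke-style cues like the ones above could be generated from a list of word timings. This is a small sketch under stated assumptions: times are given as integer milliseconds, `formatTimestamp` and `karaokeCue` are hypothetical helpers, and the word text is assumed to need no escaping (no `&` or `<`).

```javascript
// Format integer milliseconds as a WebVTT timestamp: "mm:ss.ttt",
// or "hh:mm:ss.ttt" once the time reaches an hour.
function formatTimestamp(ms) {
  const pad = (n, width) => String(n).padStart(width, "0");
  const totalSeconds = Math.floor(ms / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const stamp = pad(minutes, 2) + ":" + pad(seconds, 2) + "." + pad(ms % 1000, 3);
  return hours > 0 ? pad(hours, 2) + ":" + stamp : stamp;
}

// Build one cue: a timing line (with optional cue settings) followed by
// the words, each preceded by the timestamp at which it becomes "past".
function karaokeCue(startMs, endMs, words, settings) {
  const timing = formatTimestamp(startMs) + " --> " + formatTimestamp(endMs) +
    (settings ? " " + settings : "");
  const text = words
    .map((word) => "<" + formatTimestamp(word.ms) + "><c>" + word.text + "</c>")
    .join(" ");
  return timing + "\n" + text;
}
```

Each inline timestamp should fall between the cue's start and end times. The `::cue(:past)` and `::cue(:future)` rules then color the words before and after the current playback position.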
To Silvia Pfeiffer for being editor of the WebVTT specification and answering some of my questions.
To the Association of Moving Image Archivists for sponsoring the longer presentation this lightning talk is based on.
This work by Jason Ronallo is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Permissions beyond the scope of this license may be available at http://ronallo.com.
Alternative or augmented access to video for those with a range of disabilities.
v.0.1