Associate Head, Digital Library Initiatives
North Carolina State University Libraries
This is basically how search engines work. The problem is that, for the most part, they have to do natural language processing to pull out semantics, and that is really hard to do. Even the smart minds at big companies like Google can only do so much.
<span itemscope itemtype="http://schema.org/Person"> <a itemprop="url" href="http://twitter.com/ronallo"> <span itemprop="name">Jason Ronallo</span> </a> is the <span itemprop="jobTitle"> Associate Head of Digital Library Initiatives</span> at <span itemprop="affiliation" itemscope itemtype="http://schema.org/Library"> <span itemprop="name"> <a itemprop="url" href="http://lib.ncsu.edu">NCSU Libraries</a></span> </span>. </span>
But here's what the markup looks like. A few attributes have been added to some simple HTML to give the data more structure.
That's all that embedded semantic markup is. It provides the syntax (some extra markup) to structure data in your HTML pages. Think of it as hidden annotations.
Since your eyes are more often on the web site, it can be better than trying to keep your data in sync with some external XML serialization. We often have very rich metadata for the resources we describe in our databases. In the past, schemes to expose this metadata through HTML and the Web led to a lot of dumbing down. Using embedded semantic markup like Microdata or RDFa Lite allows us to expose richer metadata with more structure through HTML.
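To make the "hidden annotations" idea concrete, here is a rough sketch of how a consumer such as a crawler might read the Microdata back out of the sample markup. It uses only `html.parser` from the Python standard library. The names `MicrodataSketch` and `extract_items` are made up for this illustration, and the sketch assumes well-formed markup and ignores parts of Microdata such as `itemref`, `itemid`, and multi-token `itemprop` values. It is not how Google or any particular search engine actually does it.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose property value comes from an attribute rather than their text.
VALUE_ATTRS = {"a": "href", "area": "href", "link": "href",
               "img": "src", "meta": "content"}


class MicrodataSketch(HTMLParser):
    """Hypothetical, simplified Microdata reader (illustration only)."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items found in the page
        self._open = []    # one frame per currently open element
        self._scopes = []  # currently open items, innermost last

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        owner = self._scopes[-1] if self._scopes else None
        frame = {"item": None, "prop": None, "owner": owner, "text": []}
        if "itemscope" in attrs and tag not in VOID_TAGS:
            # A new item starts here; nest it if it is also a property.
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if prop and owner is not None:
                owner["properties"].setdefault(prop, []).append(item)
            else:
                self.items.append(item)
            self._scopes.append(item)
            frame["item"] = item
        elif prop and owner is not None:
            if tag in VALUE_ATTRS:
                # URL-like properties take their value from an attribute.
                value = attrs.get(VALUE_ATTRS[tag])
                owner["properties"].setdefault(prop, []).append(value)
            else:
                # Text-valued property: collect text until the tag closes.
                frame["prop"] = prop
        if tag not in VOID_TAGS:
            self._open.append(frame)

    def handle_data(self, data):
        # Text counts toward every open text-valued property around it.
        for frame in self._open:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        frame = self._open.pop()
        if frame["prop"]:
            value = " ".join("".join(frame["text"]).split())
            props = frame["owner"]["properties"]
            props.setdefault(frame["prop"], []).append(value)
        if frame["item"] is not None:
            self._scopes.pop()


def extract_items(html):
    """Return the top-level Microdata items found in an HTML string."""
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.items


# The sample markup from this talk.
SAMPLE = """
<span itemscope itemtype="http://schema.org/Person">
  <a itemprop="url" href="http://twitter.com/ronallo">
    <span itemprop="name">Jason Ronallo</span> </a> is the
  <span itemprop="jobTitle"> Associate Head of Digital Library Initiatives</span> at
  <span itemprop="affiliation" itemscope itemtype="http://schema.org/Library">
    <span itemprop="name">
      <a itemprop="url" href="http://lib.ncsu.edu">NCSU Libraries</a></span>
  </span>. </span>
"""

if __name__ == "__main__":
    for item in extract_items(SAMPLE):
        print(item["type"], item["properties"])
```

Run against the sample, this yields a single `Person` item whose `affiliation` property is itself a nested `Library` item. The sentence a human reads and the structured record a machine reads come from the same HTML.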
* Numbers are from the last time I checked, early in 2013.
Third result in Google video search for "bug sprays and pets."
So the main benefit we get out of all this right now is what Google calls Rich Snippets.
This search result has a video thumbnail, the duration, and a bit from the transcript of the video as the description. Rich Snippets are really the only thing that Google has said it will use this data for.
You can see how having this extra information can make a particular search result stand out and be more likely to be clicked on. So it improves discoverability. But how else could this benefit libraries and archives if all of this gets pushed further?