The Promise of Semantic Web
Last week, I attended a session on the topic of Semantic Web, presented by Nova Spivack, CEO of Radar Networks. I've been following this topic off and on for a few years, but I've always wondered how real it is (remember the AI efforts of the '80s?). After listening to this talk, I'm convinced that at least parts of this technology are on the way, although full-fledged machine-automated reasoning remains as elusive as ever. The highlights of the session are covered below.
What is it good for?
Mr. Spivack defined Semantic Web as a specific set of W3C open standards for working with knowledge. The main idea is to use technologies based on these standards for adding machine-understandable structured data to the web, with the overall goal of enabling automated reasoning algorithms. Note that semantic web advocates do not insist that the ontologies be the same across the web (unlike, say, the microformat approach) - different sites can use different schemas, as long as they are published and can be mapped to one another.
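To make the schema-mapping idea concrete, here is a minimal sketch in plain Python. The two sites, their predicate names (`a:fullName`, `b:name`, and so on) and the mapping table are all invented for illustration. Real deployments would typically express such a mapping in OWL rather than in a dictionary.

```python
# Two hypothetical sites describe the same person with different vocabularies.
SITE_A = [("alice", "a:fullName", "Alice Smith")]
SITE_B = [
    ("alice", "b:name", "Alice Smith"),
    ("alice", "b:employer", "Acme"),
]

# Assumed published mapping from site B's predicates to site A's equivalents.
B_TO_A = {"b:name": "a:fullName", "b:employer": "a:worksFor"}


def translate(triples, mapping):
    """Rewrite each predicate via the mapping; unmapped predicates pass through."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


# After translation both data sets share one vocabulary, so duplicates collapse.
merged = set(SITE_A) | set(translate(SITE_B, B_TO_A))
print(sorted(merged))
```

Once the predicates are translated, the name stated by both sites appears only once in the merged data, and the employer known only to site B sits alongside it.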
Benefits of the Semantic Web include the following:
- Richer content
- More precise search and navigation
- Increased productivity and better collaboration
- Integrating data and applications
- Machine-automated reasoning and AI
Most exciting, the availability of semantic information could facilitate much richer web search paradigms, e.g. parametric search and associative search. Semantic web technologies hold the promise of automated reasoning, letting the engine make inferences based on structured data and the links between data.
How does it work?
Semantic Web depends on the following core standards:
- RDF - Resource Description Framework - Stores data as "triples" (subject, predicate, object)
- OWL - Web Ontology Language - Defines systems of concepts called "ontologies"
- SPARQL - SPARQL Protocol and RDF Query Language - Queries RDF data
- SWRL - Semantic Web Rule Language - Defines rules (a W3C Member Submission rather than a full Recommendation)
- GRDDL - Gleaning Resource Descriptions from Dialects of Languages - Transforms XML/XHTML data into RDF
- (Microformats) - do they really belong in this list?
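As a rough illustration of what querying triples looks like, the sketch below stores a few made-up triples in a Python list and matches them against a pattern with wildcards. This is not SPARQL itself, only the same basic idea in miniature. The data and the `match` helper are assumptions for the example, not part of any standard.

```python
# Hypothetical data: each triple is (subject, predicate, object).
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("alice", "knows", "bob"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Roughly the question "who works for acme?", which SPARQL would phrase
# along the lines of: SELECT ?who WHERE { ?who :worksFor :acme }
employees = [s for s, _, _ in match(TRIPLES, predicate="worksFor", obj="acme")]
print(employees)
```

Leaving a position as `None` plays the role of a query variable: the pattern fixes what is known and the matcher fills in the rest.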
Today's web pages present information essentially as text (XHTML); links between pages are simply links, with no semantics attached to them. In a semantic web, however, data is modeled as a network of nodes connected by semantic (meaningful) links. Specifically, information is usually modeled as a set of triples, each consisting of a Subject, a Predicate and an Object. The nodes themselves can be arranged in a hierarchy.
Modeling data in this way lets software draw simple inferences, deriving facts that were never explicitly stated in the information provided.
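A small sketch of such an inference, assuming a toy vocabulary with just two predicates (`is_a` and `subclass_of`) and invented data. It applies one rule over and over until nothing new appears; real reasoners built on OWL support far richer rules than this.

```python
# Hypothetical facts: one instance plus a small class hierarchy.
TRIPLES = {
    ("fido", "is_a", "dog"),
    ("dog", "subclass_of", "mammal"),
    ("mammal", "subclass_of", "animal"),
}


def infer_types(triples):
    """Apply the rule (x is_a C) and (C subclass_of D) => (x is_a D) to a fixed point."""
    facts = set(triples)
    while True:
        new = {
            (x, "is_a", d)
            for x, p1, c in facts if p1 == "is_a"
            for c2, p2, d in facts if p2 == "subclass_of" and c2 == c
        } - facts
        if not new:
            return facts
        facts |= new


inferred = infer_types(TRIPLES)
# Facts that were derived rather than stated in the input.
print(sorted(inferred - TRIPLES))
```

Nothing in the input says that fido is an animal, yet the triple ("fido", "is_a", "animal") ends up in the result because the rule walks up the hierarchy.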
Challenges
In his presentation, Mr. Spivack identified the following barriers to adoption:
- A lack of tools
- Scaling challenges (what if you want to store a trillion+ triples?)
- Vision issues (how can we define a practical vision that targets the low-hanging fruit?)
- Inadequate content (not enough semantic data available)
- No killer apps
- Market education
Although all of these are clearly important and need to be solved, I see the core problem as the lack of a practical, popular "killer app". If a specific application were to catch the imagination of a large section of the population, I have no doubt that the rest of the problems, including technical issues such as tools and scaling, could be solved.
[For more information about Semantic Web technologies, you should check out Nova Spivack's blog: Minding the Planet.]