It’s been hard not to notice the influx of tech blog posts over the past week or so covering all things Semantic Web. I’ve not had a decent chance to talk about the Semantic Web since I wrote my thesis last year, so I thought I’d take this as a good opportunity to do so and collect some of the best of those links in the process.

It’s staggering how much the Semantic idea has grown since I wrote my dissertation. In it I discussed mainly the social, technical and theoretical concepts of the Semantic Web – when I wrote it there was little else around to write about. There were no public start-ups or beta platforms. The community was small and extended little outside a designated group at the W3C and a handful of tech bloggers, and their efforts, too, were mostly spent on translating Tim Berners-Lee’s very technical, comparatively abstract web dream for the mainstream reader.

At that point the W3C were rapidly beginning to develop technologies like SPARQL and OWL, while others, such as RDF and Microformats, were under varied debate. Not having a background in hard computer sciences, my interest was more in exploring the ubiquity and connectivity of a Semantic Web, and in investigating what forces we would need to drive the paradigm – how our ideas of the Web were already changing as we came to grips with terms like (the then brand-new) ‘Web 2.0’ and embraced the Social Web phenomenon.

Nova Spivack is a leading voice: CEO of Radar Networks and founder of the Semantic start-up Twine, an information storage and knowledge sharing service. He writes frequently at his Minding the Planet blog, full of optimism and colourful metaphor. He recently gave a talk at the Bonnier GRID ’08 conference in Stockholm – basically ‘the TED conference of Scandinavia’ – about what he terms the future of the Web, ‘the Semantic Web and the Global Brain’. He worries me by using many of the buzzing memes and science fiction references that I think actually harm those trying to invest in his optimism and adopt the Semantic idea – an all-knowing, understanding, artificially intelligent Web just sounds too good. Still, it’s exciting how imminent he predicts the true impact of the Semantic Web to be.

It’s even more exciting to read popular ‘mainstream’ – or at least more general – tech blogs giving more and more coverage to the Semantic Web, and to see the technology and its abstract concepts becoming commonplace topics of everyday interest.

ReadWriteWeb, ever popular for polls and predictions, favours Semantic technologies in various recent top-ten-style looks at future web trends (and more here), and it too sees the oncoming breakthrough as more immediate. Richard MacManus puts Semantic apps at number one on his hit-list of Web Predictions for 2008.

But while predictions continue to be made, the true killer app remains elusive. Some do extremely well, though. Freebase, essentially a semantic Wikipedia, went public around the end of 2007 but hasn’t gained the popularity I thought it would have by now. True Knowledge’s natural language search is still in beta, though the platform I’ve tested so far is as impressive as their promotional video.

But then, kinda out of the blue for me, came two Yahoo! developments. The first is SearchMonkey: not exclusively a Semantic Web app, it is a search platform that promotes semantic data standards by making use of Microformats and embedded RDF as searchable metadata – definitely read the FAQ. The second came at the recent Web 3.0 Conference and Expo, where Yahoo! announced the consumer release of Yahoo! Open Strategy (Y!OS), ‘blowin’ the doors wide open’ to the ‘open source, hacker attitude’. Basically, they are ‘rewiring’ Yahoo! to make all data service-wide openly available to developers and consumers alike, granting the opportunity for complete data portability, with the intent to extend even further in the future.
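
To make ‘Microformats as searchable metadata’ a little more concrete, here is a minimal sketch of how a crawler might pull structured fields out of ordinary HTML. It is only an illustration, not how SearchMonkey itself works: the `HCardParser` class, the `extract_hcard` helper and the sample markup are all made up for this example, and it only looks for the `fn` (name) and `org` (organisation) class names used by the hCard microformat.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Hypothetical helper: collect text from elements marked with hCard class names."""

    PROPERTIES = {"fn", "org"}          # hCard property names we look for
    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # never have a closing tag

    def __init__(self):
        super().__init__()
        self.found = {}
        self._stack = []  # one entry per open element: a property name or None

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only text directly inside a marked element is recorded.
        prop = self._stack[-1] if self._stack else None
        if prop and data.strip():
            self.found.setdefault(prop, data.strip())


def extract_hcard(html):
    """Return a dict of the hCard properties found in the given HTML string."""
    parser = HCardParser()
    parser.feed(html)
    return parser.found


# Sample markup (an assumption for illustration, not taken from a real page).
SAMPLE = """
<div class="vcard">
  <span class="fn">Nova Spivack</span>,
  <span class="org">Radar Networks</span>
</div>
"""

print(extract_hcard(SAMPLE))
```

The point is that the page author has already labelled the data, so the engine only has to read the labels rather than guess at meaning from keywords.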

I concluded my thesis by suggesting a new drive would be necessary for consumers Web-wide to understand and willingly adopt the Semantic Web change. At that point the Web Science Research Initiative (WSRI) had just been launched, a joint venture between MIT and the University of Southampton to teach the literal academic science of the web, and it was unclear whether this would be enough. Molly Holzschlag wrote a recent article at A List Apart, believing the ‘Ivory tower’ perception of the W3C to be discouraging to everyone independent of the organisation – that they’ve no real outreach. I agree, but I think efforts like the newly founded World Wide Web Foundation are a direct result of their awareness of that, and that they will fulfil their principal objective and speed technological advancement more than she may expect. I hope they do, at least.

For even more reading (and listening) material, subscribe to the Nodalities blog and podcasts. There’s a good interview with David Provost I recommend, discussing many of the things I’ve spoken about here, as well as his recently published report on the Semantic Web industry as a whole.

It’s called ‘On the cusp’. :)


  1. I see you’ve mentioned Yahoo! has a new search platform, why aren’t Google doing anything similar? Or are they? I’d imagine their financial clout could be pretty useful…

  2. Hi Barry,

    I know Google previously used Semantic technology to power their AdSense and AdWords services, but I think the processing was far too slow to function for their main search platform at the time. I think more than anything it would be a ridiculously huge overhaul to attempt to reconstruct Google for semantic search, same as Yahoo! – that’s why they launched SearchMonkey separately.

    Google’s search algorithm grew from the complete opposite of semantic understanding – to match keywords and patterns, honed to almost ‘guess’ what the user intends. It’s affected the fundamental way we enter search terms, so it seems strange to ask literal questions of engines like True Knowledge or Hakia. But it’s those kinds of new platforms that really rival Google, with natural language processing at least. That said, I’ve no doubt there’s some semantic technology under the hood at Google – tagging pages, I’m sure, dealing with ambiguity and synonyms, etc. – even if the index isn’t wholly semantic.

    Even before these search engines achieve the White Rabbit of complete and universal natural language understanding, they’re challenging the core concepts of search by performing sentence analysis over keyword analysis at the very least – Google are surely going to have to respond in some way.

  3. It’s interesting. Google must have seen this coming, but as their search platform is so fundamentally different, do you think they’re ignoring it on a larger scale because it’s quite a departure from their existing technologies? It’s looking now as though they should have invested more steadily and heavily in this over the last few years or so, wouldn’t you say?

  4. Hopefully the onset will be too great to ignore, but yes – it’d almost be a complete reinvention. It’s more likely they’ll attempt to bolt on semantic technologies in and around the existing infrastructure in an attempt to do what the semantic search engines can do on top of what they already do so well.

    The other argument to consider is the imposition of introducing Semantic technologies – in the same way any new technology is introduced. The RWW article from my previous comment quotes Tim Berners-Lee well. Where SearchMonkey and some other engines look for semantic data in web pages, Hakia, for example, doesn’t rely on Web authors having put that data there in the first place – it analyses natural language.

    It’s still keenly debated whether the ‘top down’ or ‘bottom up’ approach to the Semantic Web is best – whether to code semantics at the data level or develop systems to mine/extract semantics from already existing data – or whether it’s even such a strict dichotomy. Either way, I think Google are definitely at one end and search engines like Hakia, True Knowledge and Powerset are at the other.

