Monthly Archives: January 2009

As part of our agency rebranding, we’ve all been tasked with finding a suitable image for the reverse side of our business cards. Apparently it should represent our image and personality, be quirky, but important to us.

Avoiding copyright infringement means I can’t use any bad-ass images of Superman, and we can’t have any people we ‘know’ – assuming this includes famous people, that scratches out Kirk, Kara Thrace, Tim Berners-Lee etc. – even his Semantic Web stack is too square to use. (Put this on a t-shirt for me and I’ll be your friend forever).

Anyway, after trawling Flickr for anything half decent under a Creative Commons license I’ve narrowed it down to five images – bit geeky, quite unimaginative, cliché retro.

TAC-2 controller

"Yeah" by Sameli

The background is as good as the joystick itself, TMNT ftw.

Atari 2600 games

"Day 323/366" by Great Beyond

Brilliant. We could use patterns instead of photos if we like – this is almost both.

Commodore CBM

"Cutting Edge Technology, 1981" by Superbomba

Could I pass this off as me?

Girl coder!

Original by Dave & Bry

Older still, maybe a bit obvious.

Then I started looking for trash – I love photography of pretty much anything abandoned or broken. Flickr has a great pool of Abandoned Swimming pools.

Mac and Toaster

"Macintosh Plus + Toaster" by Eric__I_E

They belong together!

Abandoned Monitors

"Four Toxic Computer Monitors" by Tonx

Possibly the strongest contender. Looks like they’re holding their cables ready to cross the road. Not overly techy either?

What do you think?

Suggestions/recommendations/votes welcome – need to decide before Friday!

While I’m on the subject of data portability, I thought I’d talk about DataPortability.

A loose analogy: Consider the definition of the Semantic Web – a conceptual framework combining standardised semantic applications on the web. Similarly, the DataPortability project aims to define and implement a set of recommendations of open standards to enable (entire and complete) end-to-end portability of data.

Both ‘capitalised’ terms denote distinct, considered models – composed of specific selections of the technologies that together embody their respective namesakes.

Not that DataPortability really has anything to do with the Semantic Web, other than the shared idyllic standardisation and ‘boundless’ interoperation of data and services online.

In essence, the project is a volunteer-based workgroup – as transparent and ‘frictionless’ a movement as the borderless experience they promote. Their vision describes the web as a place where people can move easily between network services, reusing the data they provide, controlling their own privacy and respecting the privacy of others (read in full here).

They wish to see an end to every problem I described in my last post – the social network fatigue, the fragmentation and walled-garden silo landscape of current web platforms – and also promote the combination of open source technologies and protocols (including OpenID and OAuth) for web-wide benefit, not only with regard to social networking.

The following video, quite simply but accurately, describes the already too familiar picture:

So what technologies are we talking about?

Although our Semantic friends RDF, SIOC and FOAF are present, the rest is much more familiar territory. The line-up includes RSS, OPML, again OAuth, OpenID and Microformats. These are existing open standards, though – not technologies still in development awaiting a W3C recommendation, like some of the Semantic Web projections.

There’s some other very cool stuff I’d like to go into in more detail later. Definitely APML, for example – Attention Profiling Markup Language – an XML-based format that encapsulates a summary of your interests, your informed ‘attention data’.
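To make ‘attention data’ a little more concrete, here’s a minimal sketch of building an APML-style profile document in Python. Fair warning: the element names and attributes below are my approximation of the APML draft spec, and the interests are made up – treat it as the shape of the idea, not a spec-exact implementation.

```python
# Sketch of an APML-style attention profile. Element names approximate the
# APML draft spec (APML/Body/Profile/ImplicitData/Concepts/Concept) and may
# not match it exactly; the interest scores are invented for illustration.
import xml.etree.ElementTree as ET

def build_attention_profile(interests):
    """interests: dict mapping a concept key to an attention score (0.0-1.0)."""
    apml = ET.Element("APML", version="0.6")
    body = ET.SubElement(apml, "Body", defaultprofile="Home")
    profile = ET.SubElement(body, "Profile", name="Home")
    implicit = ET.SubElement(profile, "ImplicitData")
    concepts = ET.SubElement(implicit, "Concepts")
    for key, value in interests.items():
        # Each Concept carries a key and a weighting of your attention to it.
        ET.SubElement(concepts, "Concept", key=key, value=str(value))
    return ET.tostring(apml, encoding="unicode")

print(build_attention_profile({"semantic web": 0.9, "retro computing": 0.65}))
```

The point of the format is exactly this portability: any service you use could export a summary like the one above, and the next service could import it instead of profiling you from scratch.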

As well as identifying the components that make up their blueprint – recognising how their goals can be achieved, which (and I know I keep coming back to this) is one of the largest causes of doubt over the Semantic Web, where the speculative combination of some of the technologies is almost unimaginable – the DataPortability project also documents best practices for why you should participate in the initiative, specifically tailored to how the pieces come together for you as a developer, consumer, service provider etc.

DataPortability is about empowering users, aiming to grant a ‘free-flowing web’ within your control.

How are they doing this? Are they likely to succeed? They’ve already got some huge names on board – Google, Facebook, Flickr, Twitter, Digg, LinkedIn, Plaxo, Netvibes – the list goes on. This is really happening.

Find out more at

Hopefully the last of the posts that I should have written last year – a while back, when I wrote about Facebook Connect and Google Friend Connect, I mentioned three open source data projects: OpenID, OpenSocial and OAuth.

I only mentioned them briefly, thinking they deserved attention separate from that topic – they’ll play a key part in the progression of social media technology, but the three are part of a bigger issue: that of data portability – one perhaps more relevant to my current Semantic Web conversation.

While the three have been separately developed over the past three (or so) years, their popularity and general implementation are becoming ever more widespread. In combination, they offer powerful potential for leveraging data and interoperability between systems, and ultimately offer standardised methods and protocols through which data ‘portability’ becomes possible.

In very, (very) short:

  • OpenSocial (wiki) is a set of common APIs for web-based social network applications.
  • OpenID (wiki) is a decentralised user identification standard, allowing users to log onto many services with the same digital identity.
  • OAuth (wiki) is a protocol to simplify and standardise secure API authorisation and authentication for desktop, mobile and web applications.

There’s a ton of reading fired from each of those links.

But more than anything, I very strongly recommend watching the following presentation by Joseph Smarr of Plaxo, taken from Google’s I/O conference last year:

Google I/O 2008 – OpenSocial, OpenID, and OAuth: Oh, My!

He covers each of these open source building blocks in detail, collectively considering them as a palatable set of options for developers in creating social media platforms. He presents the compelling engagement they can offer social websites, how they fit together in a holistic way so developers aren’t constantly building from scratch and how he envisions the social web evolving.

He critiques today’s platforms as essentially broken, highlighting the fragmentation of social media sites – their rapid growth forced each platform to be built separately, from scratch and therefore differently, so that each sits in its own silo, headed in a different direction. The very nature of social network infrastructure and architecture is still very nascent.

We are at breaking point – social media sites still assume that every new user has never been on a social network site before. We’ve all experienced having to register and re-register, upload profile information, find friends and then confirm friends – it’s not scaling any more.

Not only has it got to the point that we as consumers are experiencing social network fatigue, but users are also, understandably, opting out of joining newer networks altogether, pre-empting the nauseous motions they’ll have to repeat.

It’s very easily digestible – not at all deeply technical until the Q&A section. Do watch!

Not to be outdone by Google’s efforts this week, Ask have also expanded their search technology to return specific ‘direct’ answers to searches, where possible, by means of semantic language processing.

Fortunately far more public than Google, Ask announced on their blog yesterday that they’ve been developing proprietary semantic analysis technology since October of last year, in their efforts to advance the next generation of search tools.

DADS(SM) (Direct Answers from Databases), DAFS(SM) (Direct Answers from Search), and AnswerFarm(SM) technologies, which are breaking new ground in the areas of semantic, web text, and answer farm search technologies. Specifically, the increasing availability of structured data in the form of databases and XML feeds has fueled advances in our proprietary DADS technology. With DADS, we no longer rely on text-matching simple keywords, but rather we parse users’ queries and then we form database queries which return answers from the structured data in real time. Front and center. Our aspiration is to instantly deliver the correct answer no matter how you phrased your query.
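The DADS idea – parse the query, hit structured data, return the answer front and centre – can be caricatured in a few lines. To be clear, this is my toy illustration of the concept, not Ask’s technology; the patterns and TV schedule below are entirely invented:

```python
# Toy caricature of 'direct answers from databases': recognise a query
# pattern, then answer from structured data instead of keyword-matching
# documents. The schedule data here is invented purely for illustration.
import re

tv_schedule = {
    ("football", "weekend"): "Liverpool v Everton, Sat 12:45, Sky Sports",
    ("movies", "now"): "Ghostbusters, 21:00, Channel 4",
}

def direct_answer(query):
    # Parse the user's query into a structured lookup, DADS-style.
    match = re.search(r"(football|movies) on tv (this weekend|now)", query.lower())
    if not match:
        return None  # no pattern recognised: fall back to ordinary search
    topic, when = match.group(1), match.group(2).replace("this ", "")
    return tv_schedule.get((topic, when))

print(direct_answer("Movies on TV now"))  # Ghostbusters, 21:00, Channel 4
```

The real system presumably does vastly more sophisticated parsing, but the shape is the same: a query becomes a database lookup rather than a text match.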

The results aren’t returned as explicitly as Google’s, mainly due to the number of adverts above the page fold, but they work. Try searching for ‘Football on TV this weekend‘ or ‘Movies on TV now‘ and you’ll see the results in accordingly custom-formatted sections.

Unfortunately the results are still only returned in HTML, so again – the term ‘semantics’ here describes the form of processing Ask are doing behind the scenes, rather than depicting this as their first outright foray into the Semantic Web (capital S).

This, though, is proprietary technology, and presumably it’ll stay that way. So I’m unsure whether to celebrate their realisation of the importance of semantics (in search at least), or – given their more ‘closed source’ ethos – to consider this almost against the idea of the Semantic Web (portability, sharing, transparency), as they hold these advances close to their chest to gain an edge over their competitors, understandably causing others to do the same in future.

Quite out of the blue, and without notification of its launch as far as I’ve been able to find, Google seem to be exposing semantic data in their global search results.

Try searching for ‘What is the capital city of England?’ or ‘Who is Bill Clinton’s wife?’ and you’ll see sourced direct answers returned at the top of your search results.

It’s hard to tell if these direct results are actually semantic expressions or just presented to appear that way – in the expected Semantic triple of subject-predicate-object. The listed sources definitely don’t structure their information as semantic expressions, so perhaps quite an amount of logic and natural language processing is being done on Google’s part to process non- or semi-structured data.
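For anyone who hasn’t met the triple model before, this is all it is – and it’s why direct answers fall out of it so naturally. A tiny sketch (the facts below just mirror the example searches above):

```python
# Subject-predicate-object triples: the shape a truly semantic answer
# store would take. Facts chosen to mirror the example searches above.
triples = [
    ("England", "capital", "London"),
    ("Bill Clinton", "spouse", "Hillary Clinton"),
]

def answer(subject, predicate):
    # A 'direct answer' is just the objects matching subject + predicate.
    return [o for s, p, o in triples if s == subject and p == predicate]

print(answer("England", "capital"))  # ['London']
```

If sources published data in this form, Google would only need lookup, not language processing – the interesting part is that they appear to be deriving something triple-shaped from pages that don’t.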

I’ve tried to find out before what Google have been up to concerning semantic technology, but found little. The coverage over at ReadWriteWeb reports that neither they nor their Semantic Web contacts had heard or seen anything about this before, but the community feedback suggests there have been variations of this for some time – including a three-year-old Google program called ‘Direct Answers’ – but none of the coverage of that program offers the kind of examples we’re seeing here.

Marshall Kirkpatrick points to a blog post by Matt Cutts, Google search algorithm engineer, but it seems to be a dead link now. Trawling through Google’s caches, though, he manages to find him quoted:

Many of the data points are being pulled in from the structured part of Wikipedia entries, which is interesting. Other sources are wide ranging, from a license plate website to Jason Calacanis’s Mahalo.

If Google are constructing semantic data from semi-structured or non-structured source data, then there’s undoubtedly some quite powerful semantic processing technology in place. I highly doubt this will be the final product of their development with such technologies – it’s simply the first we’ve noticed, which is most likely why it’s slipped under most people’s radar.

The inaccuracy is also an issue. Try searching ‘Who is Bob Dylan’s wife?’ – and you’ll see Sara Lownds (his ex-wife) returned. Seeing these direct answers reminds me of True Knowledge.

Even their example questions, though, are far more complex – for example, ‘Who owns Universal Studios?‘, ‘Is the Chrysler building taller than the Eiffel Tower?‘, ‘What is the half life of plutonium 239?‘.

More importantly, if it doesn’t know the answer, it won’t ‘guess’ – it’ll tell you it doesn’t know and ask you to deconstruct your query, in order to expand its knowledge base so it can find the answer later.

As Marshall says, this is all speculation based on limited observation – and low visibility of Google’s development. Hopefully there’ll be more soon!

You can’t start a fire without a spark.