Walled Gardens, Semantic Data and the Open Web: an Interview with Steven Pemberton

Monday, 2 November 2009 | Jospoortvliet

During the NLUUG end-of-year conference "The Open Web" in Ede, Netherlands, we interviewed keynote speaker Steven Pemberton. Steven Pemberton is a researcher at the Center for Math and Information Technology in Amsterdam and has been involved with the web since its first incarnation - he vividly remembers the day the connection from Europe to the US was doubled to 128 Kbit.

Steve, can you introduce yourself to the DOT audience?

Well, I've been involved with the web since the very beginning - I was among the first 25 non-military users of the Internet. I gave several workshops during the first web conference, among others about client-side scripting, and have been involved in various W3C commissions, working on standards like HTML4, XHTML, CSS, XForms and RDFa.

As a researcher I have of course been researching and preparing for the future. We try to maintain a long-term vision, routinely looking ahead 10 years, sometimes more. I feel the effects of the internet on our current lives have not even started to flesh out - change can only go as fast as people can incorporate it into their lives.

So you expect much more invasive changes compared to what we have seen already?

Oh, yes. I expect a second enlightenment. The first printing press resulted, over the course of several hundred years, in the first enlightenment. The web will enable the second one. Paradigm shifts come exponentially faster, approaching a singularity beyond which we can't see.

Singularity on Wikipedia

What did you talk about during your keynote?

Openness, mostly. There are issues in various areas right now - our data isn't ours. Open standards and net neutrality are key for accessibility, both for people with, say, mobile devices or non-standard operating systems and web browsers, and for the partially sighted - something we'll probably all have to live with one day.

Currently we are moving to what I call meta standards. CSS would be an example - it lets you layer stuff on top of other things. From the point of view of openness, this allows content to be re-purposed because it is separate from the implementation. This is very important in device independence and accessibility. SVG would be another example - you have a nice graphical interface, yet the page below it is open for, e.g., search engine indexing. We're also moving to a more declarative approach. Take XForms. It does not say how a radio button needs to look; it just describes what it is supposed to do (select something from a list). So the same XForms-based form works on a desktop and a cell phone, but also over a voice browser, and I've even seen a demonstration over an IM network.

Steven during the keynote

The biggest danger the web currently faces is what I call walled gardens. When talking about Web 2.0 you often see references to Metcalfe's law: the value of a network is proportional to the square of the number of nodes. Take a network, split it in two and the total value halves, even though you have access to both halves. That's why it's good there is one email network; that's why it is bad there are many independent IM networks. That's why it's good there is one WWW. Web 2.0 is about users adding value to websites (Wikipedia, Flickr, Facebook). But this currently leads to a new kind of lock-in based on commitment. It takes a lot of work to bring all your data into a Web 2.0 site, so you don't want to move and have to do it again somewhere else. Currently there is no standard way of getting data out. It makes it hard to choose a site - say you add all your genealogy data to one site, then discover a very interesting tree on another site. What do you do, do it all over again? And what if the site closes down? Or your account? Take a recent Facebook incident: an account got closed because the user tried to download the email addresses of his/her friends into Outlook! And imagine losing your Google mail account with years of email, agenda and other data.
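The "split it in two and the value halves" arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the proportionality constant is assumed to be 1, and the network size of 1000 nodes is a made-up number.

```python
def network_value(nodes: int) -> int:
    """Metcalfe's law, with the proportionality constant assumed to be 1."""
    return nodes ** 2


total_nodes = 1000  # hypothetical network size, chosen only for illustration

# One network containing every node.
whole = network_value(total_nodes)

# The same nodes divided over two isolated networks of equal size.
split = 2 * network_value(total_nodes // 2)

print(whole, split, split / whole)
```

Each half is worth a quarter of the original, so the two halves together are worth half of the single network.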

So what is the solution? What you want is some sort of personal site with all your own data. Then the data can be connected to other data by search engines. For that we need a way for a site to explain to search engines what some piece of data represents (e.g. genealogy, photo & place & data). One of the technologies which can do this is RDFa; it's like a CSS of meaning. This improves search and user experience; it improves services; and aggregators can create value by joining data.
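To give a rough idea of what "explaining to search engines what a piece of data represents" can look like, here is a small Python sketch. The markup in SNIPPET is a made-up example using RDFa-style `about`, `property` and `content` attributes with Dublin Core terms; the path /photos/42 and the class name PropertyCollector are hypothetical. The collector is deliberately naive: it is not a conforming RDFa processor, and it ignores the subject (`about`) and prefix resolution entirely.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: a photo annotated with a title and a date.
SNIPPET = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="/photos/42">
  <span property="dc:title">Family reunion</span>
  <span property="dc:date" content="2009-08-15">15 August 2009</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Naive collector of (property, value) pairs; not a full RDFa parser."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("property")
        if name is None:
            return
        if "content" in attributes:
            # A machine-readable value overrides the human-readable text.
            self.pairs.append((name, attributes["content"]))
        else:
            self._pending = name

    def handle_data(self, data):
        # Use the first non-blank text after a property without "content".
        if self._pending is not None and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


collector = PropertyCollector()
collector.feed(SNIPPET)
print(collector.pairs)
```

The point of the sketch is that the same page serves both audiences: a person reads "15 August 2009", while a program can pick up the date in a form it can compare and join with data from other sites.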

So what happens in the future?

Well, I expect the walled gardens to go. Or, at least I hope so. Net neutrality is a prerequisite for this - it creates and helps to maintain an open market, where closing your garden has a negative effect. But there are plenty of dangers. Take Google. I applaud their motto of 'do no evil', and certainly they do attempt to follow up on it. But what if you look forward 10 years? 10 years ago, Yahoo was the big gorilla in the room, and now - it's almost gone. Nobody would have thought it could happen but it did. What if Google is bought by a third party, 10 years down the line? What happens to our data?

The Free Software community rarely looks forward that long. Three years is a long time for us - 10 years is an eternity. Is that a problem?

Yes, it is. I see how Free Software mostly copies what the proprietary competition is doing; it doesn't innovate enough. For that I think we have to move on to a next-generation infrastructure, and most importantly a next-generation programming language. And Free Software needs to take it upon itself to develop that language.

Math Example

In the middle ages, 1545 to be exact, the mathematician Cardan wrote in his Ars Magna (in Latin, thus my thanks to Lambert Meertens for this translation):

Raise the third part of the coefficient of the unknown to the cube, to which you add the square of half the coefficient of the equation, and take the root of the sum, namely the square one, and this you will copy, and to one {copy} you add the half of the coefficient that you have just multiplied by itself, from another {copy} you subtract the same half, and you will have the Binomium with its Apotome, next, when the cube root of the Apotome is subtracted from the cube root of its Binomium, the remainder that is left from this is the determined value of the unknown.

What he was trying to say was that one root of x^3 + px = q is calculated so:

d = (p/3)^3 + (q/2)^2
c = sqrt(d)
b = c + (q/2)
a = c - (q/2)
x = cuberoot(b) - cuberoot(a)

a calculation that any reasonably trained schoolchild can even prove nowadays.

(taken from Steven's website from a talk at ApacheCon, 2007)
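As an editorial illustration (not part of Steven's talk), the recipe in the math example can be run directly. The sketch below follows the same steps and variable names d, c, b, a and x; the function names are made up here, and it only covers the case where d is not negative, so that the square root is real.

```python
import math


def real_cube_root(value: float) -> float:
    """Real cube root that also works for negative numbers."""
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


def cardano_root(p: float, q: float) -> float:
    """One real root of x^3 + p*x = q, following the steps in the text."""
    d = (p / 3) ** 3 + (q / 2) ** 2
    if d < 0:
        # Assumption of this sketch: only the real-square-root case is handled.
        raise ValueError("d is negative; this sketch does not cover that case")
    c = math.sqrt(d)
    b = c + q / 2   # the "Binomium"
    a = c - q / 2   # the "Apotome"
    return real_cube_root(b) - real_cube_root(a)


# Example: x^3 + 6x = 20, which has the root x = 2 (8 + 12 = 20).
x = cardano_root(6, 20)
print(x)
```

Up to floating-point rounding, the printed value is 2.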

What would that language be?

Well, it needs to focus on doing more with less. Current desktops use C and C++ - fine for now, but with technologies like those employed by XForms you can do 10 to 1000 times more with the same work. We were working on interpreted languages in the 80's and 90's and everybody said we were crazy. After all, computers were barely capable of running those. But we understood that in 10 years, computers would be so much faster it wouldn't be a problem. And doing all the computing for a webpage on the server simply does not scale - it MUST move to the client.

So wouldn't Python and Ruby help in this move to higher-level languages?

Well, it's a start, but I'm looking more at a 10000-fold improvement. In 10 years computers will again be many orders of magnitude faster than now - current languages are complete overkill, a waste of time.

If you look at the math example, you see it's not just shorter but also allows for new functionality, transformations and insight you didn't have before. We need to realize how the world has changed to see how it will change. In 1960, you leased a PC from IBM and got a bunch of programmers for free. These days, it's almost the other way around. Programmers are the most scarce resource, hardware is cheap. Free Software in particular should realize this - any project has a lack of hands, any time.

What desktop software are you familiar with?

Well, I use Ubuntu at home, and something Linux-y at work, I don't know what. I'm very much a command line person, having used Unix since version 6, somewhere in the 70s. And in research I'm very much an infrastructure guy. So I'm not well versed in the KDE and GNOME world.

A final tip or comment for the readers?

Technology needs to be focused on what you want to do. Take a file system. Suppose file systems were not abstracted away by the kernel, as they are now. The file system would be like a toolkit - applications would be tied to it. Imagine how much more limited that would be in terms of exchange of data. But our GUI toolkits, our user interface, ARE tied to an application, an OS, a device, a form factor. So interoperability needs to move deeper into the operating system. That's what I've been researching since the 80's, and what needs to be done.

Comments:

take away - jospoortvliet - 2009-11-02

The talk I had with Steven was very interesting - while reading the article you'll probably find many of his ideas floating around in KDE as well. Independence of screen size, input devices and platform - we're doing that with our libraries, and Plasma in particular. We're allowing the use of any higher-level language for scripting using Kross, and have many different bindings to allow for writing full applications. RDFa - the CSS of meaning - Nepomuk anyone? And we're fighting the walled gardens with Silk and the Social Desktop initiative. Still, it would be interesting to hear his ideas on what we are doing - after all, he's been thinking about these things before we started hacking ;-) I am also interested in getting the 1000 times improvement in productivity he talks about, and I wonder if we need a new language for that...