Following the news that the US Congress might cut the funding for various flagship data sites like Data.gov, the following debate has sprung up on a little Q&A site - "What would you change about Data.gov to get more people to care?"
I can't log in to the discussion because their OAuth is busted, but I wanted to comment because I think this question is peculiarly helpful to the entire open data movement: it's helpful because it's exactly the wrong question. I wish someone had asked this particular wrong question a few years ago.
Open data policy matters because it reduces barriers to people with bright ideas from creating goods and services that make the world a bit better, either socially or economically. It really is as simple as that.
Data.gov and all its domestic and international spinoffs succeed only insofar as they help frustrated innovators or researchers get what they need quickly and easily, so that people in the future don't have to break the law by 'stealing' their own parliamentary transcripts data, as the first TheyWorkForYou volunteers had to.
I think the notion that large numbers of people should ever be expected to come to sites like Data.gov, and that these sites should offer mass market, easily accessible content, is quite wrong (although I have sympathy for the political pressures that might have been at work when the funding was originally granted). Sites like Data.gov should be entirely honed to serve the needs of a small number of frustrated data seekers, whether from business, journalistic, research or social enterprise backgrounds.
At this point I can see some people might accuse me of being some sort of an elitist, but my motivations for writing this post are entirely opposite - I do want this data to benefit as many people as possible, and I think that this is the only way. I think public data sites should serve the hard core data-seeker audience most intently because those seekers are the people who will build the services that will get the data used by millions of people. They're the BBC election night graph producer, or the next Julian Todd.
There is no more need for a Data.gov to be a big, shiny, well trafficked site than there is for a page on Public Key Encryption to be shiny, friendly and well trafficked - what counts is that the right people (who in that case work for our banks, and our governments) do visit those pages sometimes and get the things they need. When their needs are met, we all benefit - we get secure banking and private email as a result.
The open data community should shake off its guilt about not producing data for direct consumption by end users - power station managers don't feel guilty about not producing iPods. Instead we should be proud of and fiercely focussed on enabling the next generation of entrepreneurs and story tellers to do their mass-market magic.
Tom
I've been making this argument for at least a year in my dealings with various central and local govt bureaucrats, but this is the most concise and best articulated statement of it I've come across, and in future I'll just send people a link to this post. Excellent.
Chris
Posted by: Opencorporates.wordpress.com | April 05, 2011 at 05:27 PM
To use a rather non-PC allegory...
Raw data is like oil.
It is dirty, smelly, inconvenient and has very limited uses in its natural state. It is hard to get at and requires some prowess and brute force.
I'd guess only about 1 in 5000 of the population is involved in getting oil out of the ground.
Once refined as petrol, diesel, aviation fuel and plastics then we all use it and largely take for granted that this minority is involved in actually getting the black sticky stuff out of the ground on our behalf.
Raw data is one of the (clean) fuels of the future digital economy, and we should be recruiting both the equivalent of oil workers to get it and industrialists to make it into something palatable and useful, as well as actively looking for new oil fields to exploit.
Just as the early 20th century oil pioneers could neither foresee Concorde nor plastic knives, we cannot know what future Android/iPhone/Computers will evolve to exploit this vast resource.
Posted by: Paulgeraghty | April 05, 2011 at 08:37 PM
First off, the OAuth is no longer busted!
Second, I initially felt much the same way as you. I happen to think that digitally archiving and publishing government data that is already "public domain" should be such an obvious job of the government that it shouldn't even be questioned. However, the problem is that money is being pulled from these programs, because they don't seem important to policy-makers or the people who've elected them. Regardless of their civic importance, these programs need to figure out ways to appeal to a wider swath of people, not necessarily by having a flashy website, but at least generating more obviously important apps or projects built on their data. Otherwise, how will they get funded?
Posted by: Kraykray | April 06, 2011 at 12:57 AM
I fully support your views. Having worked for gov't statistics for years, I find the concept of engaging a mass audience with data entirely ludicrous. IMO public data sites should focus on being machine- and librarian-friendly rather than going after the mythical concerned citizen.
Posted by: twitter.com/jcukier | April 06, 2011 at 02:50 PM