These days I find myself giving advice on an occasional basis to the governments of four different countries, including the UK. I like doing this, and it makes for a refreshing counterbalance to the nitty gritty of running mySociety - spreadsheets, UX design, contracts, discussions about features - all that stuff.
Recently, however, I've started to think about what citizens themselves need to do to help make their governments keen on open data, or how to sustain any interest the government of their country might be showing.
So, last week I took the opportunity of speaking at the excellent DataCamp in London to put my thoughts into a short talk. I'm sure there'll be a video online soon, but here is a slightly more polished version of the speech I read from on the day. I hope you find it thought-provoking.
"We live in quite extraordinary times. We live at a moment where many of the most important politicians in our country and in some others overseas are actually eager to stand up and say that open data is an important priority for them, and for their nations. Just pause and think about it for a second: politicians! Talking about data! At all!
Even more remarkable than the fact that they are keen to talk about it is the fact that they are keen to change policy to deliver more open data – a commodity whose impacts are by definition complex and unpredictable. This is not the simple politics of supplying an extra hospital bed or locking up another criminal – this is about taking a calculated bet that a series of seemingly esoteric measures will have big impacts, driven mainly by the remarkable characteristics of the Internet. This is a long way from the simple crowd-pleasing pleasures of bread and circuses.
We should celebrate the fact that the political classes are paying attention to open data. And we should celebrate the fact that we are starting to get information that many of us in this room have been clamouring for for years. But we should also realise that the current situation is extraordinary, and if we don't work together to manage it quite carefully, it could crash from extraordinary to ordinary with considerable speed.
And that's why the title of my talk today – a talk which is addressed to each and every one of you in the audience - is Open Data: How Not To Cock It Up.
What do I mean by cocking up open data? I mean making mistakes that result in the flow of data we think is so valuable either drying up, or never starting in the first place. And when I say mistakes, I mean mistakes that we can make – those of us in this room right now, not the politicians.
The first way we can avoid cocking up Open Data is to ensure that we always advocate for it in the same way that scientists advocate for Blue Sky science research – we must argue for it as a numbers game – a calculated risk that is worth taking. So, we say loud and clear: we do not expect that the majority of government datasets will contain massive wells of untapped value, just as we don't expect that most university research will yield a new penicillin, or an atom bomb. But we do believe that it would be a terrible, criminally stupid bet to assume that everything lurking in the hidden mines of government data is valueless, especially when so much remains locked up or hidden. It would be like sitting looking at the internet in 1996, and betting against the creation of Google – a mega enterprise founded on data that nobody thought had any value.
The second way we can avoid a stroll down cock-up boulevard is to cease tinkering around with demos and hacks and pretty-but-useless visualisations and bloody finish something useful. FixMyStreet is no Facebook, but it is a real service that makes a real difference to a significant number of people who would be really sad if it went away. There are too few projects like this because it is so much easier to make a nice maps mashup, or a cool graph. But it is a steady stream of this sort of service, both commercial and not for profit, that will achieve the goal of making the cessation of open data releases unthinkable.
Third, we must go out of our way to gather information on what value is generated by people using data, even if just anecdotal. Did you know that 55% of TheyWorkForYou's users tell us that they leave the site with a more favourable impression of their MPs than they have when they arrive? Data.gov.uk could take a lead here by putting up a big, friendly page after every download that says “Please tell us how you use the data, so we can justify releasing more in the future”. Or, perhaps more menacingly: “Nice Dataset you got there – be a real shame if anything untoward were to happen to it”.
However, the process of recording and shouting the value of public data is fundamentally in our hands. For the foreseeable future we should adopt the approach of someone who is trying to woo a rather unconvinced lover. Their friends – risk averse public servants and lobbyists – keep telling them that they should dump us and get on with wooing someone more worthwhile (the deficit, or China), and that this open data toy-boy is just a shiny distraction with good cheekbones. We need to keep showing up with flowers and kind deeds (i.e. success stories) to show that we are worth not kicking out of bed. The academic research community could do a huge job here, focussing their efforts in the near term on better understanding what sort of value open data is starting to create.
Method Number Four for avoiding cockups is to keep an eye out for where public servants might be about to make terrible mistakes that they cheerfully intend to stick the badge labelled 'open data' on. By 'terrible mistakes' I mean the possibility that someone somewhere will release data that is obviously seriously privacy infringing, and which should never have gone anywhere near the public domain (like so many people involved in open data, I am actually much more worried about data privacy than the average Joe). We should probably maintain a list of the top five datasets in our country that we don't want to see released! This may seem silly, but one really bad error and the name of our whole movement could become mud, and it wouldn't be our fault at all.
And finally, cock-up avoidance technique number 5, which in my mind is the most important, but which I accept many of you might find controversial. This mistake is the mistake of insisting that Government really should be in the business of publishing everything non-private it can.
"What heresy is this?" I hear you cry, "Aren't you in favour of as much open data as possible?" My answer is simple: No, not at the moment - I don't think any government anywhere is really up to it yet. In fact, right now, I think it is a rather dangerous idea.
It may surprise you to hear that my vision of a perfect (but realistic) government is one that would release nothing, not a jot of data, not a single row or column... until someone asked for it. When they did ask, my perfect government would then instantly publish that data in a brilliant, helpful format, regularly updated, and running on a lovely webservice that fulfils every data-masher's dreams.
Why not publish everything as soon as we can though? Surely that would make more sense? I say no, for two admittedly counter-intuitive reasons:
First - if we focus on asking our governments to publish as much as they can, it will spread too thinly the finite amount of good will, money and political capital which exists and is available to help us achieve change. I think the change we need most strongly right now is reform to give you in the audience stronger rights to the data that many of you are still clamouring for, but whatever you think is the most important change, you need to realise that spreading your efforts too thin is likely to see it fail.
Second, I don't actually think it is worth replacing a hopeless old 70s IT system with a shiny modern web service if the combined population of the country you're in doesn't seem to care about what's in said system. There are, in short, better things to spend our money on than upgrading systems which contain data that nobody appears interested in.
So there we are. These are five suggestions I have for helping make the current move to open data as successful as it can be. I hope they're pretty international, and of relevance to you all. But my main message is this - the open data movement has some momentum internationally, but it will only sustain it if those of us who use data realise we've got to take certain steps to make the case for why it matters, and if we temper our idealism a little in the name of effectiveness."
Very clear and very relevant analysis. I think the focus on giving people the data they want (and ask for) is absolutely right - much better than the current game of "I'm publishing more data sets than you!".
Posted by: Paul Johnston | November 26, 2010 at 09:42 AM
Good analysis; if anything I think the second method is a tad understated and there is a significant linked omission.
The zeroes and 1s of data can mean, usually do mean, very little without a context. With context comes value, to citizens, agencies and others. It is often only when data is linked back to the policies, programmes, responsibilities etc. that anything other than axe-grinding and knee-jerk media frenzy can result. Trouble is those things are not being released as yet and if they ever are, explicit relationships with specific open data sets ought to be embedded to give a jumping off point for improved analysis (of course other data are likely to have been impacted by a given policy or programme objective, rather like cumulative impacts of toxins in the environment). Combine this with the issue of time or persistence over time... using a close to home example: report a pothole, contractor throws wet tar in, it's fixed, first shower of rain it's there again and so the cycle goes on - what do the stats on road repairs show? Probably that the council is quick at responding and not that the low cost contractor is rubbish or milking the contract.
So, concur broadly, but the real rewards (and a sound platform for maintaining and developing an open data regime) are in the net social and economic benefits that can come over time if the data can be subject to genuinely informative contextual insight.
Posted by: James C | November 26, 2010 at 10:35 AM
Thanks for this - very useful advice. Particularly the focus on useful applications rather than attractive visualisations. I agree with Paul; quality over quantity.
Posted by: Gail | November 26, 2010 at 10:58 AM
Blogged on this today in context of risks to open data because of wikileaks http://www.barryjorman.com
Posted by: BarryJOGorman | November 30, 2010 at 06:41 PM
You made your point number five rather well:
"do not insist that the gov't must publish any and all extant, non-private data."
We need meaningful data. Generally speaking, we have our hands full merely with that, and with scrubbing or remapping it to derive something useful.
You wrote a good post here. As Barry said, it becomes particularly interesting in light of the wikileaks events, which had not yet occurred when you posted this. Good insights...
Thank you.
Posted by: Curious Ellie | December 01, 2010 at 04:29 AM
Barry makes an interesting connection between open data and wikileaks in his blog post (actually at http://barryjogorman.com/). He discusses how the perceived dangers of wikileaks might influence the inclination to open up troves of data on the part of those in government who collect it and control its dissemination.
Posted by: Ambjörn Elder | December 01, 2010 at 06:34 PM