Friday, March 14, 2014

Give Us Data We Can Use

By providing $3 million in funding for the new Open Data Institute in the last budget, the federal government offered support, albeit modest, for the idea of putting existing data to use. Vast amounts of data exist, and the big data movement grew largely out of that realization, but much of the data is useless for actual consumption in its current form.

The Open Data Institute is managed by the Canadian Digital Media Network, whose managing director is Kevin Tuer. In a recent interview, Mr. Tuer pointed out that although the amount of data is vast, much of it is not useful because there are no standards wrapped around it. Without standards, the data is difficult to consume. See the article here.

He noted in particular that a lot of the data is in the form of PDF files and, as he says, "What do you do with that?" PDF files are great for formatting, but they are meant to be read by people, not consumed as data. That's a big difference. In this age of fast-moving analytics and vast amounts of data, the last thing people need is something else to read. What they need is data presented in a form that can be used, preferably on an automated basis, so that exceptions and highlights can be investigated and acted upon.

A case in point is the corporate data being filed on the SEDAR website. It's all in PDF form and is worse than useless. Many other countries have such information presented in XBRL format, an internationally accepted set of standards that enables analysts and others to aggregate and analyze the data efficiently. With PDF, you can't do that. Let's hope that the SEDAR system improves its data presentation and catches up with the rest of the world soon.
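To make the difference concrete, here is a rough Python sketch of what "usable on an automated basis" might look like when filings arrive as structured, tagged data rather than PDF. It is only an illustration under stated assumptions: the company names, figures, namespace, and tag names (`Revenue`, `NetIncome`) are invented, and the XML is a heavily simplified stand-in for XBRL, not a real XBRL instance or taxonomy.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, invented for this illustration only.
NS = {"ex": "http://example.com/taxonomy"}

# Invented, heavily simplified XBRL-style filings. Real XBRL instances
# also carry contexts, units, and a formal taxonomy.
FILINGS = {
    "Company A": """<xbrl xmlns:ex="http://example.com/taxonomy">
        <ex:Revenue>1200</ex:Revenue>
        <ex:NetIncome>150</ex:NetIncome>
    </xbrl>""",
    "Company B": """<xbrl xmlns:ex="http://example.com/taxonomy">
        <ex:Revenue>800</ex:Revenue>
        <ex:NetIncome>20</ex:NetIncome>
    </xbrl>""",
}


def read_facts(xml_text):
    """Pull the tagged figures out of one filing as numbers."""
    root = ET.fromstring(xml_text)
    facts = {}
    for name in ("Revenue", "NetIncome"):
        element = root.find(f"ex:{name}", NS)
        facts[name] = float(element.text)
    return facts


def flag_low_margin(filings, threshold=0.05):
    """Return the companies whose net margin falls below the threshold."""
    flagged = []
    for company, xml_text in filings.items():
        facts = read_facts(xml_text)
        margin = facts["NetIncome"] / facts["Revenue"]
        if margin < threshold:
            flagged.append(company)
    return flagged


# Aggregate across filings, then surface the exceptions worth a closer look.
total_revenue = sum(read_facts(x)["Revenue"] for x in FILINGS.values())
flagged = flag_low_margin(FILINGS)
print("Total revenue:", total_revenue)
print("Low-margin companies:", flagged)
```

Because every figure carries a tag, the same few lines work across any number of filings. With PDF, each of those numbers would first have to be found and retyped or scraped, which is exactly the obstacle described above.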

