An introduction to open data, big data, data standards and code lists
There is plenty written on the web about big data and open data, so here is a short overview of these topics, with some useful links to further reading. There is also more depth on data standards and code lists. It is worth bearing in mind that the Big Data revolution is relatively new, and so the terminology has not yet become fixed.
Big data is the term that describes the gathering, analysing and interpreting of huge volumes of data – so much data that it requires significant computer processing power. Often this data comes from more than one source, and is brought together to discover new insights that make a difference. For instance, if you analysed detailed weather for a specific location together with road traffic accident information, you could identify accident hotspots when the weather’s bad. That insight could save a cyclist’s life.
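As a rough sketch of that idea, the snippet below joins accident records to weather records by date and location, then counts the accidents that happened in bad weather. All of the records, the location names, and the `bad_weather_hotspots` function are made up for illustration; real datasets would be far larger and messier.

```python
from collections import Counter

# Made-up records for illustration only (not real data).
weather = {
    ("2024-01-05", "Junction A"): "heavy rain",
    ("2024-01-05", "Junction B"): "clear",
    ("2024-01-06", "Junction A"): "heavy rain",
}
accidents = [
    {"date": "2024-01-05", "location": "Junction A"},
    {"date": "2024-01-06", "location": "Junction A"},
    {"date": "2024-01-05", "location": "Junction B"},
]

# Assumption: which conditions count as "bad" weather.
BAD_WEATHER = {"heavy rain", "snow", "fog"}

def bad_weather_hotspots(accidents, weather):
    """Count accidents per location that coincided with bad weather."""
    counts = Counter()
    for acc in accidents:
        # Join the two sources on (date, location).
        conditions = weather.get((acc["date"], acc["location"]))
        if conditions in BAD_WEATHER:
            counts[acc["location"]] += 1
    # Locations with the most bad-weather accidents come first.
    return counts.most_common()

hotspots = bad_weather_hotspots(accidents, weather)
print(hotspots)
```

The join only works because both sources describe dates and locations in the same way, which is where data standards come in.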
Here’s what Wikipedia has to say about Big Data.
Open data, as described by the UK government, is data that meets the following criteria:
- accessible, ideally via the internet, at no more than the cost of reproduction, without limitations based on the user identity or intent
- in a digital, machine-readable format for interoperation with other data, and
- free of restriction on use or redistribution in its licensing conditions.
And open government data is public sector information that has been made available to the public as open data.
Examples of open government data include the exam performance of the UK’s schools, colleges and universities; hygiene and death rates of hospitals; mapping information and so on. The UK has a new Open Data Institute which is encouraging effective use of open data to generate new jobs and better public services.
Data Standards, or reference data standards
Just as standards are beneficial to everyday life – the inch, the metre, the second and even the size of a lightbulb fitting – standards are essential to successfully analysing big data. The key is having access to the relevant data standards and code lists.
We like Malcolm Chisholm’s definition of data standards, or reference data, as: any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise.
A list of data standards is known as a code list, and is sometimes also called a “code table”, “lookup table” or set of “domain values”.
An example is the list of codes used to describe airports. The code that the International Air Transport Association (IATA) uses for Paris-Orly airport is ORY, but the code used by the International Civil Aviation Organization (ICAO) is LFPO. Same place, different code.
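A minimal sketch of how two such code lists might be related is a simple lookup table. Only the Paris-Orly pair (ORY and LFPO) comes from the text above; the table layout and the function names `to_icao` and `to_iata` are hypothetical.

```python
# Hypothetical crosswalk between two airport code lists.
# Only the Paris-Orly entry is taken from the article.
IATA_TO_ICAO = {
    "ORY": "LFPO",  # Paris-Orly
}

# Reverse lookup built from the same table, so the two stay in step.
ICAO_TO_IATA = {icao: iata for iata, icao in IATA_TO_ICAO.items()}

def to_icao(iata_code):
    """Return the ICAO code for an IATA code, or None if it is not listed."""
    return IATA_TO_ICAO.get(iata_code.strip().upper())

def to_iata(icao_code):
    """Return the IATA code for an ICAO code, or None if it is not listed."""
    return ICAO_TO_IATA.get(icao_code.strip().upper())

print(to_icao("ory"))
print(to_iata("LFPO"))
```

Returning `None` for an unknown code, rather than guessing, makes gaps in the code list visible to whoever maintains it.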
Confusion around data standards and code lists is one of the big brakes holding back the understanding and adoption of Big Data and Open Data.
You do not need to be worried by this, and you do not have to create a new database to get ‘one view’. Instead, you can use reference data standards and code lists in a smart way so you can map and interpret data on different systems – quickly and easily. Listpoint helps you do this by making the code lists visible and providing tools for you to test your processes and standards.
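To illustrate the general idea (this is not Listpoint's API), the sketch below leaves each system's records where they are and uses a shared code list to interpret them together. The record layout, the system names, the `paris-orly` key, and the `group_by_airport` function are all assumptions; only the ORY and LFPO codes come from the text.

```python
from collections import defaultdict

# Hypothetical mapping from each system's local code to a shared key.
# Only ORY and LFPO come from the article.
CODE_MAP = {
    ("iata", "ORY"): "paris-orly",
    ("icao", "LFPO"): "paris-orly",
}

def group_by_airport(records):
    """Group records from different systems under a shared airport key.

    Records whose code is missing from the code list are collected
    under None so they can be reviewed rather than silently dropped.
    """
    grouped = defaultdict(list)
    for rec in records:
        key = CODE_MAP.get((rec["scheme"], rec["code"]))
        grouped[key].append(rec["source"])
    return dict(grouped)

# Made-up records from three imaginary systems.
records = [
    {"source": "booking-system", "scheme": "iata", "code": "ORY"},
    {"source": "flight-plans", "scheme": "icao", "code": "LFPO"},
    {"source": "legacy-feed", "scheme": "iata", "code": "???"},
]
result = group_by_airport(records)
print(result)
```

No new database is created here: the code list acts as the translation layer, and the unmatched record shows where the list needs improving.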
And by being part of the community of users, you can work with the power of the crowd to keep improving code lists and data standards so they become ever more useful.
90% of the world’s data has been generated in the last two years and it’s growing at such a rate that enforcing centralised uniformity, or having one repository for information, is just not possible.
The only choice is to make the standards that define the data very visible and accessible, and help the community of users continually improve them.
Data standards might be very simple but they are very powerful because they help you interpret data held on different systems, but only if data standards and code lists are open and accessible to the community of users.
So where does Listpoint come in?
As a web-based service, Listpoint helps you build apps, maintain data standards, build interoperability, and map and interpret your data better.
Designed as the preferred open platform for all developers, Listpoint helps you find, map and join together code lists, as well as collaborate and coordinate, in order to make sense of competing standards. Listpoint works across the many diverse software applications, industries and country standards that exist.
Listpoint harnesses the collective intelligence of its customers, is backed by a unique collection of data and supports applications of all sizes. Our community of data publishers and integrators use these freely available code lists in order to build apps, maintain data standards or to build interoperability between multiple data sources.
Using award-winning tools, you can organise and validate all your code lists in one useful, trusted and secure environment, instantly improving the quality, usability and integrity of this critical piece of the data jigsaw.
This is your open platform… Listpoint simply provides you with the toolset to find, manage, maintain, create and get better use of all code list standards.