Data Standards

WHY DO WE NEED DATA STANDARDS?


Standards make things easier to use. For example, suppose you need a AAA battery for your flashlight. Because all AAA batteries are produced to the same standard, they are all the same size, so you don't need to worry about finding a specific make or brand: any AAA battery will work in your flashlight.


The Bureau of Land Management notes that "Standards provide data integrity, accuracy and consistency, clarify ambiguous meanings, minimize redundant data, and document business rules." Using data standards allows the agency to move from "project-based" data files to "enterprise" data files - and vice versa. In other words, the data becomes usable by more than just the project or person that created it, because you know the data will be in an expected format and you know what the data represents.


If different groups use different data standards, combining data from multiple sources is difficult, if not impossible. To return to the flashlight example: if there were no standard for AAA batteries, we couldn't use just any AAA battery. We'd have to find one specific to our make and model of flashlight, and we'd have to keep many sets of AAA batteries in the house, one for each item, instead of one set that works in all applicable cases.


The same is true of data. If you were trying to integrate datasets from different sources, each of which used a different format for its date variable, the task would be much harder, since you would have to convert the dates into a common format before you could integrate the data. If everyone agreed on a standard for dates, you wouldn't need this extra step.
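
To make the extra step concrete, here is a minimal Python sketch of the date conversion described above. The three source names and their date formats are hypothetical, chosen only for illustration, and ISO 8601 (YYYY-MM-DD) is assumed as the common target format; a real integration would use whatever formats its sources actually document.

```python
from datetime import datetime

# Hypothetical sources, each recording dates in its own format.
SOURCE_FORMATS = {
    "source_a": "%m/%d/%Y",  # e.g. 03/15/2021
    "source_b": "%d.%m.%Y",  # e.g. 15.03.2021
    "source_c": "%Y%m%d",    # e.g. 20210315
}


def to_iso_date(raw, source):
    """Convert a source-specific date string to ISO 8601 (YYYY-MM-DD)."""
    parsed = datetime.strptime(raw.strip(), SOURCE_FORMATS[source])
    return parsed.date().isoformat()


# The same calendar day, as each source would record it.
records = [
    ("source_a", "03/15/2021"),
    ("source_b", "15.03.2021"),
    ("source_c", "20210315"),
]

normalized = [to_iso_date(raw, source) for source, raw in records]
print(normalized)  # ['2021-03-15', '2021-03-15', '2021-03-15']
```

Note that the conversion only works because the format of each source is known in advance. If every source had followed one date standard from the start, neither the lookup table nor the conversion function would be needed.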


A structured data element name gives us:

  • An informative name
  • A description and definition
  • The ability to assign unique, consistent names
  • The ability to identify the natural relationships of data
  • The ability to identify all of the uses of a data element
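
As a rough illustration of the list above, the following Python sketch models a small data dictionary. The class names, field names, and example elements (`survey_date`, `survey_site_id`, and the dataset names) are all hypothetical and are not taken from any agency standard; the sketch only shows how a name, a definition, uniqueness, relationships, and uses might be recorded together.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    name: str                                       # informative, unique name
    definition: str                                 # description and definition
    related_to: list = field(default_factory=list)  # names of related elements
    used_in: list = field(default_factory=list)     # datasets using the element


class DataDictionary:
    """Hypothetical registry that keeps data element names unique."""

    def __init__(self):
        self._elements = {}

    def register(self, element):
        # Refuse a second element with the same name, so names stay unique.
        if element.name in self._elements:
            raise ValueError(f"duplicate data element name: {element.name}")
        self._elements[element.name] = element

    def related(self, name):
        return list(self._elements[name].related_to)

    def uses_of(self, name):
        return list(self._elements[name].used_in)


dictionary = DataDictionary()
dictionary.register(DataElement(
    name="survey_date",
    definition="Calendar date on which the survey was conducted (YYYY-MM-DD).",
    related_to=["survey_site_id"],
    used_in=["vegetation_plots", "wildlife_counts"],
))

print(dictionary.related("survey_date"))  # ['survey_site_id']
print(dictionary.uses_of("survey_date"))  # ['vegetation_plots', 'wildlife_counts']
```

Registering a second element named `survey_date` would raise a `ValueError`, which is one simple way to enforce unique, consistent names.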