Who Has Time to Cleanse Date/Time Data?
Apr 23

Who Has Time to Cleanse Date/Time Data?

Text data that needs to be matched with pre-defined data sets is often seen as most challenging. But what about innocuous date and time fields – surely they can’t be difficult? Oh yes they can! Here are four tips to avoid many potholes and maintain data accuracy.

Tip 1 – Be consistent

There are two distinct types of time-related data – points in time and durations. Points in time, such as dates of birth may simply require the date. However, sometimes a time element is very important e.g. when a sample was taken or frozen. Always use the same data type for storage – “Datetime” is best.

Durations are more challenging. You need to be clear about the accuracy you need. Do you need a number of hours, minutes or seconds or even higher accuracy? Once you have your definition you must stick strictly to it for that field. You can then record the number of units – the lowest unit e.g. seconds, as an integer.

Tip 2 – How are empty values interpreted?

Free LIMS Data Migration Guide

The rules around what “empty” means can be vague for date fields and this is the most common issue. However, making such a field mandatory doesn’t always work. If the user doesn’t know the correct value at the time the record is entered, what should they enter? 01-01-1900? Today’s date? Whatever they choose is guaranteed to be wrong. Only make entry mandatory if the user at that point will always know the correct value.

When you decide how to interpret empty values for a particular field document that rule. Users must be consistent.

Tip 3 – Regional settings

If you’ve inherited a set of data entered by users in a different country then look out for regional layouts and interpretations. When exchanging data between the UK and USA, is “01/04/2021” April 1st or January 4th? Be clear!

Tip 4 – Perform relative checks

This kind of issue can be the most difficult to spot. Sometimes known as “sanity checks” – validate date fields against each other on the same record. If you’re looking at tissue sample data then compare the Date Taken with the Frozen Date and other related dates to ensure they make sense. I’ve seen a data set indicating the samples were taken several years before the donors were born!

A final thought about cleansing date/time data

Avoiding easy-to-miss systemic issues can smooth the way to basing clinical and business decisions on reliable date/time data and viewing related data on accurate timelines. And then you can make time for that much-needed cuppa!

Related Topics

About The Author

Gary Rooksby has over 25 years’ experience implementing and evolving corporate systems including manufacturing and quality systems for a range of major clients such as the MOD. For the last 18 years Gary has specialised in Sample Management Software with emphasis on process optimisation and data management. Gary works in partnership with clients and draws on his wealth of experience to help institutes and their teams to maximise the benefits from new and upgraded systems. Business needs are constantly evolving, and Gary loves the changing challenges. Gary always focuses on delivering value to the users, whether that is financial, scientific or simply easing workloads. He believes that the system is never an end in itself; it is a tool to help the users achieve their goals and that principle is always at the heart of any system or data designs.