It’s Time to Spring Clean Your Data
It’s the time of year for the tedious chores that everyone dreads. But what if that chore actually benefitted you and helped you make good choices?
Below are three tips on how to spring clean your data so that they are meaningful and useful to you.
1. Make a key for data categories and column headers.
Our number one struggle when looking at someone else’s data is understanding what the data even are. Column titles are usually abbreviations with no code or key provided, and although playing detective can be fun, this is not where you want to be making assumptions.
What does it mean??
Solution: Create a Key tab within the same document to define your acronyms and abbreviations.
Oh! Okay, got it!
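If your data ever leave the spreadsheet, the same idea can be checked with a few lines of code. Below is a minimal sketch in Python, assuming the Key tab has been copied into a dictionary. The abbreviations and definitions in `KEY` are made up for illustration, and `undefined_headers` is a hypothetical helper name, not part of any tool.

```python
# Hypothetical key: abbreviation -> plain-language definition.
# These entries are illustrative only; use the ones from your own Key tab.
KEY = {
    "DOB": "Date of birth",
    "ZIP": "Home ZIP code",
}


def undefined_headers(headers, key=KEY):
    """Return the column headers that have no entry in the key."""
    return [header for header in headers if header.strip() not in key]


# Any header listed here still needs a definition in the key.
missing = undefined_headers(["DOB", "ZIP", "TOD"])
print(missing)
```

Anything the check returns is a column that a newcomer would have to guess at, so it is worth adding to the key before sharing the file.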
2. Enter data consistently.
If data are not entered consistently, doing simple calculations becomes complex and tedious.
Hm, what is the racial breakdown of our participants? Is it 3 African Americans or 4 Black/African Americans?
Solution: Create drop-down lists. If you know the data entered into a column have a finite set of options, you can create a drop-down list to ensure the entries are consistent.
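For data that were already entered without a drop-down list, the variants can be mapped back to one agreed label before counting. The sketch below is one possible way to do that in Python. The mapping in `CANONICAL_LABELS` is an assumption for illustration, and deciding which variants truly mean the same thing is a judgment call for the people who know the data.

```python
from collections import Counter

# Hypothetical mapping from variants found in the data to one agreed label.
# Keys are lowercase with single spaces; extend it with your own variants.
CANONICAL_LABELS = {
    "african american": "Black/African American",
    "black": "Black/African American",
    "black/african american": "Black/African American",
}


def normalize_label(raw):
    """Trim, lowercase, and collapse spaces, then look up the agreed label."""
    cleaned = " ".join(raw.strip().lower().split())
    # Unknown values are returned trimmed but otherwise untouched.
    return CANONICAL_LABELS.get(cleaned, raw.strip())


def count_categories(values):
    """Count entries per category after normalizing each one."""
    return Counter(normalize_label(value) for value in values)


entries = [
    "African American", "african american ", "African American",
    "Black/African American", "Black/African American",
    "Black/African American", "Black/African American",
]
counts = count_categories(entries)
print(counts)
```

With the variants merged, the three "African American" entries and the four "Black/African American" entries are counted as one group of seven.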
3. Look for oddballs.
For data that can’t be limited to a drop-down list, scan the columns and look for oddities, like people with birthdays in 1901 or test scores above the maximum possible point value. These are often just typos, but they can seriously throw off your data analysis.
Hm, these seem off. How do I check what the correct data are?
Solution: Create a codebook describing data points before you start entering the data. For example, document that the TOD range is 0-60. Then, use time and common sense to work through any oddballs. Go back to the data source and data coder when necessary.
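Once the codebook exists, the scan for oddballs can be partly automated. Here is a minimal Python sketch, assuming each row is a dictionary of numeric values. The TOD range of 0-60 comes from the example above; the `birth_year` column name and its range are assumptions added for illustration.

```python
# Codebook: column name -> (lowest allowed value, highest allowed value).
CODEBOOK = {
    "TOD": (0, 60),              # range from the example in the text
    "birth_year": (1920, 2025),  # assumed range, for illustration only
}


def find_oddballs(rows, codebook=CODEBOOK):
    """Return (row number, column, value) for each value outside its range."""
    oddballs = []
    for row_number, row in enumerate(rows, start=1):
        for column, (low, high) in codebook.items():
            value = row.get(column)
            if value is None:
                continue  # blank cells are skipped here, not flagged
            if not (low <= value <= high):
                oddballs.append((row_number, column, value))
    return oddballs


rows = [
    {"TOD": 45, "birth_year": 1988},
    {"TOD": 72, "birth_year": 1901},
]
flagged = find_oddballs(rows)
print(flagged)
```

A check like this only points at suspicious values. Deciding what the correct value is still means going back to the data source or the data coder.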
Bonus tip! Track who is entering the data and when.
When you have questions about the oddball data you find, you want to trace the data back to their source. If you add two columns, one for the name of the person entering the data and one for the data entry date, this becomes a whole lot easier.
Great, I just need to check with Jane and John!
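With those two columns in place, questionable rows can be grouped by the person who entered them. The Python sketch below shows one way to do that. The column names `participant_id`, `entered_by`, and `entry_date` and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_by_coder(flagged_rows):
    """Group flagged rows by who entered them, keeping the ID and entry date."""
    by_coder = defaultdict(list)
    for row in flagged_rows:
        by_coder[row["entered_by"]].append(
            (row["participant_id"], row["entry_date"])
        )
    return dict(by_coder)


# Hypothetical flagged rows; the column names are assumptions.
flagged_rows = [
    {"participant_id": 12, "entered_by": "Jane", "entry_date": "2024-03-04"},
    {"participant_id": 31, "entered_by": "John", "entry_date": "2024-03-05"},
    {"participant_id": 40, "entered_by": "Jane", "entry_date": "2024-03-06"},
]
follow_ups = group_by_coder(flagged_rows)
print(follow_ups)
```

The result is a short follow-up list per person, so each coder gets one set of questions instead of a scattered series of them.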
Need assistance? Thoughtwell can help you clean up your data!