As I explained in the last Excel post, you can check for blank and invalid data in Excel several ways.
In this post, I will focus on the insights and issues encountered by sorting each column from A to Z and then Z to A.
Sounds pretty simple, but I’m willing to bet you will be surprised to learn a thing or two…
For a list of the reasons why you must validate data before analyzing it, see Why You Must Validate Data.
During a recent visit to a library near you, I was trying to find a book via the online card catalog.
[I remember when card catalogs were on actual cards, in drawers, like the one pictured. Yikes!]
I was trying to find a book by someone who runs an analytics blog that I frequent, but I couldn’t remember the guy’s last name.
Several of my friends passed the CISSP exam recently, and told me that it isn’t as technical as I told them it would be.
They said it was more of a security manager certification.
Windows 10 has a new feature called Wifi Sense that allows you to share wifi network access with others without sharing the wifi passkey – kinda.
I don’t see any sense in using it; too risky, and rather unnecessary.
You can check for blank and invalid data in Excel several ways.
Depending on the size of the file and your preferences, you can either scroll through the dropdown list, sort each column from A to Z and then Z to A, or apply a filter.
Sometimes, you need to use a combination of these methods.
It’s important to know how these methods treat data differently and to be aware of their limitations.
Before you analyze data, you must first validate it.
Otherwise, your analysis may not be accurate, and you may miss some important insights or errors.
This post is part of the Excel: Basic Data Analytic series.
Before analyzing your data, you need to check the following:
- Duplicate transactions do not exist.
- Required fields/key fields do not contain blanks, spaces, zeroes, unprintable characters, or other invalid data.
- Date fields contain real dates, and the range of dates is appropriate.
- Amount fields don’t contain inappropriate zero, positive, or negative amounts, and the range of values is appropriate.
- Each field is stored in the correct format. This prevents data from being converted on the fly into something else unexpectantly (e.g., user ID JUL15 becomes 15-Jul).
Running the desktop version of ACL in a virtual machine* (VM) has so many advantages, but I haven’t heard anyone else doing it.
Consider the following advantages, listed in order of importance (to me):