Before you analyze data, you must first validate it.
Otherwise, your analysis may not be accurate, and you may miss some important insights or errors.
This post is part of the Excel: Basic Data Analytic series.
Before analyzing your data, you need to check the following:
- Duplicate transactions do not exist.
- Required fields/key fields do not contain blanks, spaces, zeroes, unprintable characters, or other invalid data.
- Date fields contain real dates, and the range of dates is appropriate.
- Amount fields don’t contain inappropriate zero, positive, or negative amounts, and the range of values is appropriate.
- Each field is stored in the correct format. This prevents data from being converted on the fly into something else unexpectedly (e.g., user ID JUL15 becomes 15-Jul).
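If you ever need to run the same checks outside Excel, the checklist above can be sketched in a few lines of Python. This is only a minimal illustration, not a complete validator: the sample rows, the field names (`id`, `user`, `date`, `amount`), the date format, and the rule that amounts must be positive are all assumptions made for the example. Every value is kept as text, so a user ID like JUL15 is never converted into a date.

```python
from datetime import date, datetime

# Hypothetical sample rows; the field names are assumptions for illustration.
# All values stay as text so nothing is converted on the fly.
rows = [
    {"id": "T001", "user": "JUL15", "date": "2023-01-15", "amount": "120.50"},
    {"id": "T002", "user": "  ", "date": "2023-02-30", "amount": "0"},
    {"id": "T001", "user": "JUL15", "date": "2023-01-15", "amount": "120.50"},
    {"id": "T003", "user": "AUG02", "date": "2031-05-01", "amount": "-40.00"},
]


def validate(rows, first_day, last_day):
    """Return a list of (row_number, problem) tuples."""
    problems = []
    seen = set()
    for n, row in enumerate(rows, start=1):
        # Duplicate check: the whole row has been seen before.
        key = tuple(row.values())
        if key in seen:
            problems.append((n, "duplicate transaction"))
        seen.add(key)

        # Required fields: no blanks, spaces only, or unprintable characters.
        for field in ("id", "user"):
            value = row[field]
            if not value.strip() or not value.isprintable():
                problems.append((n, f"blank or unprintable {field}"))

        # Dates must be real dates and fall inside the expected range.
        try:
            d = datetime.strptime(row["date"], "%Y-%m-%d").date()
            if not first_day <= d <= last_day:
                problems.append((n, "date out of range"))
        except ValueError:
            problems.append((n, "not a real date"))

        # Amounts must be numeric; here zero and negative values are flagged
        # (an assumption, since some data sets allow them).
        try:
            if float(row["amount"]) <= 0:
                problems.append((n, "zero or negative amount"))
        except ValueError:
            problems.append((n, "amount is not a number"))
    return problems


problems = validate(rows, date(2023, 1, 1), date(2023, 12, 31))
for n, problem in problems:
    print(n, problem)
```

Row 2 is flagged three times (blank user, impossible date, zero amount), row 3 as a duplicate of row 1, and row 4 for a date outside the expected year and a negative amount.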
When you need to determine whether several fields in 2 Excel documents (or tabs) match, all you need to do is combine the fields in each document into one value and then compare the 2 values using vlookup.
You could do this many ways, but if you’re new to Excel formulas, I think this way is easier to configure and understand. I’m assuming you’re familiar with the basics of Excel and vlookup already.
If you are not familiar with vlookup, you might want to review this first, as my post does not teach you vlookup, just another way to use it.
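The same combine-then-look-up idea can be sketched outside Excel as well. The Python below is not the vlookup approach itself, just the equivalent logic: it joins several fields into one value per row and then checks whether that value exists in the other sheet. The sheet contents, the column names, and the `|` delimiter are assumptions made for the example.

```python
# Hypothetical rows from two sheets; the column names are assumptions.
sheet_a = [
    {"invoice": "1001", "vendor": "ACME", "amount": "250.00"},
    {"invoice": "1002", "vendor": "BOLT", "amount": "75.10"},
    {"invoice": "1003", "vendor": "CORE", "amount": "19.99"},
]
sheet_b = [
    {"invoice": "1001", "vendor": "ACME", "amount": "250.00"},
    {"invoice": "1002", "vendor": "BOLT", "amount": "75.00"},
]

FIELDS = ("invoice", "vendor", "amount")


def combined_key(row):
    # Join the fields with a delimiter so that "12" + "3" cannot be
    # mistaken for "1" + "23".
    return "|".join(row[f].strip() for f in FIELDS)


# Build the lookup values from the second sheet, then test each row of the first.
keys_b = {combined_key(r) for r in sheet_b}
unmatched = [r["invoice"] for r in sheet_a if combined_key(r) not in keys_b]
print(unmatched)
```

Invoice 1002 is reported because its amount differs between the two sheets, and 1003 because it is missing from the second sheet entirely. Using a delimiter when combining fields is worth doing in Excel too, for the same reason noted in the comment.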
PSPad is a great text editor and search tool, so by default, it’s a great audit tool, and it’s free. It can also handle a million lines of text, literally. Are you interested yet? It is also one of the best file diff/compare tools I’ve ever seen.
PSPad works with text files, such as those ending in TXT or CSV, or any text-based file (like an ini file). It works with DOC files too.
I’ll explain how to do the following with PSPad:
- Search a file (find all lines containing X)
- List all occurrences/matches of a search term
- Export a list of occurrences
- Compare 2 documents (diff)
- Download & install PSPad
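For readers who want to script the search and compare tasks listed above rather than use PSPad's interface, here is a rough equivalent using only Python's standard library. It has nothing to do with how PSPad works internally; the sample text, the file labels `old.txt` and `new.txt`, and the helper name `find_lines` are made up for the illustration.

```python
import difflib

# Hypothetical file contents, held in strings to keep the example self-contained.
old_text = "alpha\nbeta error\ngamma\n"
new_text = "alpha\nbeta fixed\ngamma\ndelta error\n"


def find_lines(text, term):
    """Return (line_number, line) for every line containing term."""
    return [
        (n, line)
        for n, line in enumerate(text.splitlines(), start=1)
        if term in line
    ]


# Search: list every occurrence of a term, with its line number.
matches = find_lines(new_text, "error")
for n, line in matches:
    print(f"{n}: {line}")

# Compare: produce a unified diff of the two versions.
diff = list(
    difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="old.txt",
        tofile="new.txt",
        lineterm="",
    )
)
print("\n".join(diff))
```

The list of matches can be written out to a text or CSV file if you need to keep it as audit evidence.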