Before you analyze data, you must first validate it.
Otherwise, your analysis may not be accurate, and you may miss some important insights or errors.
This post is part of the Excel: Basic Data Analytic series.
Before analyzing your data, you need to check the following:
- Duplicate transactions do not exist.
- Required fields/key fields do not contain blanks, spaces, zeroes, unprintable characters, or other invalid data.
- Date fields contain real dates, and the range of dates is appropriate.
- Amount fields don’t contain inappropriate zero, positive, or negative amounts, and the range of values is appropriate.
- Each field is stored in the correct format. This prevents data from being converted on the fly into something else unexpectantly (e.g., user ID JUL15 becomes 15-Jul).
Excel’s Text to Columns function allows you to separate pieces of data in a single column into multiple columns.
This function helps when key data is buried in a field with other information and you need to extract the key data into a separate column before you can analyze it.
For example, you obtain a list of email addresses, and all you want are the user IDs. Or you get a list of servers, and the server name is server.domain.com, and you need just the “server” name. Or you need to separate LastName, First Name into separate columns. That’s where Text to Columns saves the day.
This article is the fourth post in the Excel basic data analytic series.
To identify unique values in an Excel table, follow the steps below.
This article is the third post in the Excel basic data analytic series, which starts here.
The steps for identifying unique values are similar to identifying duplicates. The first difference shows up in step 3 below.
While the previous post in this series described how to remove duplicate values in Excel, this post describes how to identify duplicates.
The remove duplicates function doesn’t tell you which values are duplicates, it just removes them. Sometimes you need a list of the duplicates so you can review them in detail or include them in your workpapers.
So we’ll look at how to create a list of duplicates across all values/columns and in specific columns.
To remove duplicate values in Excel, follow the steps below.
This is the first post in a series of basic data analytic procedures using Excel. If you work with data regularly, these procedures will help you understand your data better and analyze it faster.
I started this series because I am asked how to do these tasks, sometimes by experienced contractors and auditors.
The profile article of the new ALC CEO, Laura Schultz, indicates a new direction at the company, but I’m not sure what that direction is. Here’s why:
1. ACL tweeted that Schultz is “fiercely determined” (see below), and in the profile, she talks about being “hell-bent” and “extreme” and taking vacations that involve “starving” and “afraid”. This is not your grandmother’s CEO, and maybe that’s the point. Either way, it doesn’t give me any comfort.