This function helps when key data is buried in a field with other information and you need to extract the key data into a separate column before you can analyze it.
For example, you obtain a list of email addresses, and all you want are the user IDs. Or you get a list of servers, and the server name is server.domain.com, and you need just the “server” name. Or you need to separate LastName, First Name into separate columns. That’s where Text to Columns saves the day.
This article is the fourth post in the Excel basic data analytic series.
These procedures are written for Excel 2013, but the same steps should also work in Excel 2010, but the screenshots might be a little different.
Extracting Partial Data from a Field
To isolate the user ID from email addresses:
- Make a copy of the column you want to extract data from (which I refer to as the ‘data column’), and paste it as the last column on the right.
You want to do this for two reasons: a) when you execute the Text to Columns function, it overwrites several columns to the right of the data column. b) I like to preserve an original copy of the data column to show that I didn’t change the data during extraction..
In Example 1 below, I copied column A to column C.
- Highlight the data column (column C). Next, on the Data tab, click the Text to Columns.
See Example 1 below.
- Ensure that Delimited is selected, and click Next.
See Example 2 below.
- Select Other and type @ as the delimiter. Click Next.
The Data Preview box shows you how the data will be extracted based on the delimiter you entered. The delimiter is the character(s) that separate the value you want to extract (JON316 in the Data Preview area) from the rest of the data (itauditsecurity.com). In this case, the delimiter is the @ symbol.
See Example 3 below.
- Select the proper format. In this example, select Text, and click Finish.
Notice in the Data preview box, the first column’s format is Text and the second column’s format is General.
See Example 4 below.
- The result is shown in Example 5 below. The user IDs are now in a separate column and ready for analysis.
In step 4 above, you can select multiple delimiters. For example, if the data was JON316..itauditsecurity.com, the delimited would be 2 periods (..). (stuff like that DOES happen, but usually with data other than email addresses). In this case, you would enter a period (.) in the Other box and check the Treat consecutive delimiters as one box as shown in Example 5a below.
Beware of ‘Date’ Values
In step 5 above, if you leave the Column Data Format field at the default format (General), any user names that are similar to calendar dates will be changed to dates. For example, JAN01 becomes 1-Jan, which you do not want. See Example 6 below.
Always ensure you pick the correct format. Unless the data contains dates, Text format is usually the safest choice.
Formatting all the Columns
In step 5 above, you can change the format of each column. For example, you can change the format of the second column to Text also (you don’t need to, I’m just showing you that you can). To do so, click in that column (the column turns black) and choose the appropriate format.
See Example 7 below.
Assume you want to isolate the server name from the domain name (see the data in column B in Example 1 above). Since the server name and domain name are separated by a period, that’s the delimiter in this example.
See Example 8 below.
See all the posts for these series at Excel: Basic Data Analytic series.