The Text to columns function in Excel is useful. If you have a bunch of data in a single column, it can be used to split it into separate columns. Found on the Data ribbon, you can either specify delimiters (for example, you might specify a space as a delimiter to split a UK postcode into its two halves), or you can click on the positions at which the data should be split based on each field being a fixed width.
The trouble with the fixed-width method is that Excel has no way of saving the file layout for later use. So if you have a need to repeatedly split some data in exactly the same positions, you have to click each column splitting point each time you do it. Very frustrating.
I have a need to do this quite regularly. I receive text files on a regular basis, all sharing the same fixed-width layout. To split them into their constituent fields, using the Text to columns function over and over would drive me crazy and waste an inordinate amount of time.
Instead, I've created a shell spreadsheet containing two sheets:
- Raw: into which I can paste the raw data, all in column A, starting in row 4
- Split: the split data, starting in column A
At the top of the Split sheet, I have the column titles (row 3), field lengths (row 2) and field start positions (row 1). The field lengths came straight from the file layout; the start positions are calculated, using the following trivial formula in B1:
Each cell in the top data row (row 4) of the Split sheet contains a formula to bring back that column's data for the corresponding row. For example cell A4 contains the following formula:
The dollars before the 1 and the 2 ensure that when you copy the formula down, it still refers to the start position and field length. Leaving the column free to roam means that it always looks up against the relevant column.
Now, instead of having to define the file layout each time I have to analyse an incoming file, I copy the text file into the Raw sheet, flick across to the Split sheet and copy that top row of formulae down to match the corresponding number of rows in the Raw sheet.
It's memory-intensive. But Excel copes just fine with over 20,000 rows of data and 100 columns. Which is all I really need it for.