Excel. The multi-tool in every BA’s tool-belt. Or if it’s not, it should be.
From capturing user stories, calculating cost basis, data analytics, metrics tracking, management dashboards, and it’s default use as the table-creation-tool du-jour; it does it all. It may not be the BEST tool for any of those but it’s the tool you can count on almost everyone in an office environment having and being at least somewhat familiar with.
It’s also the tool that most people I know turn to when you get one or more data dumps of some sort and you need to try and make sense of the information, or just make it more accessible for further analysis.
In this post I am going to discuss a specific problem I encountered with a data extract in the form of an Excel file. That problem was that:
- Out of about 25 data elements in the file, only about 10 were in distinct columns
- The rest of the data elements were concatenated together into one string in a single column
- There were nearly a thousand rows of data and I would be getting a new file every month, meaning a manual solution was not desirable
- The discreet data values in the string were not always consistently present (e.g. a value might be present in one row but not another)
- The data values in the strings were not always in the same order
So in this post I am going to go through the logic and eventual Excel functions that were created to solve this problem. Hopefully you find it useful.