Raw data used in Excel is generally laid out as a table of columns with any number of rows, and a large raw data set sometimes requires cleaning and/or manipulation before it can be used further.
Where this is the case, Excel can be used to cleanse the raw data, creating a clean, user-friendly and less complicated data set. This can be useful in a number of circumstances.
The front-end functionality built into Excel is well suited to a simple ‘Data Cleansing’ process and to ‘moving’ data into a more usable layout. If the data is too complex for the front-end functionality, the following can be utilised:
A further, less well-known option is the ‘Excel Fuzzy Lookup’ add-in. It was developed by Microsoft Research and performs fuzzy matching of textual data in Excel.
It can be used to match tables of data where duplicates exist but may contain spelling mistakes, abbreviations and/or missing data. The ‘Excel Fuzzy Lookup’ add-in is extremely useful when dealing with very large data sets where manual checking and/or cleansing is not practical.
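To illustrate the general idea of fuzzy matching, here is a minimal Python sketch. It is not the add-in's own algorithm: it swaps in the standard library's `difflib.SequenceMatcher` as the similarity measure. The helper names (`normalise`, `similarity`, `fuzzy_match`), the 0.8 threshold and the sample company names are all assumptions made up for this example.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a score between 0 and 1; 1 means identical after normalising."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(left, right, threshold=0.8):
    """Pair each value in `left` with its closest value in `right`.

    The threshold is an illustrative assumption; a pair scoring below it
    is reported as unmatched (None) rather than forced together.
    """
    results = []
    for name in left:
        best = max(right, key=lambda candidate: similarity(name, candidate))
        score = similarity(name, best)
        results.append((name, best if score >= threshold else None, round(score, 2)))
    return results


# Hypothetical sample data: two lists naming the same companies inconsistently.
customers = ["Acme Ltd", "Globex Corporation", "Initech"]
crm_names = ["ACME Ltd.", "Globex Corpration", "Umbrella Group"]

matches = fuzzy_match(customers, crm_names)
for original, matched, score in matches:
    print(original, "->", matched, score)
```

In this sketch the first two names are paired despite the differences in case, punctuation and spelling, while the third is left unmatched because no candidate is similar enough. Comparing every value against every other value becomes slow on very large data sets, which is one reason a purpose-built tool is preferable there.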