Data Scrubbing
April 29, 2015

Data scrubbing is the process of identifying and eliminating noise, outliers, and mistakes in your data. It is required for accurate modeling.

Using the records and spectra views, fix any metadata errors and remove or otherwise address the outliers in the charts. Use the processed and normalized filters to help you find the outliers in the spectra. In the screen below, the outliers are highlighted.
[Screenshot: Data Scrubbing, Spotting Outliers]
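If you want to pre-screen exported spectra outside the application, the idea of spotting outliers in normalized spectra can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the application's filters work: the function names are hypothetical, the normalization is a standard normal variate (each spectrum shifted to zero mean and scaled to unit spread), and spectra are flagged with the commonly used modified z-score (constant 0.6745, cutoff 3.5) applied to each spectrum's distance from the median spectrum.

```python
from statistics import mean, median, pstdev


def normalize(spectrum):
    """Standard normal variate: zero mean and unit spread per spectrum.

    Assumption: this stands in for a 'normalized' view; the real filter
    may use a different normalization.
    """
    m = mean(spectrum)
    s = pstdev(spectrum)
    if s == 0:
        return [0.0 for _ in spectrum]
    return [(v - m) / s for v in spectrum]


def flag_outlier_spectra(spectra, threshold=3.5):
    """Return the indices of spectra that sit unusually far from the rest.

    All spectra must have the same number of points (same wavelengths).
    """
    normalized = [normalize(s) for s in spectra]
    # Median value at each wavelength gives a robust "typical" spectrum.
    center = [median(column) for column in zip(*normalized)]
    # Euclidean distance of every normalized spectrum from that center.
    distances = [
        sum((a - b) ** 2 for a, b in zip(s, center)) ** 0.5
        for s in normalized
    ]
    med = median(distances)
    mad = median([abs(d - med) for d in distances])
    if mad == 0:
        return []  # distances are too uniform to rank
    # One-sided modified z-score: only unusually distant spectra are flagged.
    return [
        i for i, d in enumerate(distances)
        if 0.6745 * (d - med) / mad > threshold
    ]
```

Because each spectrum is normalized first, a spectrum that differs only in overall intensity is not flagged; only a spectrum whose shape differs from the others stands out. Flagged records should still be reviewed by eye before they are removed.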

Data scrubbing and model creation form an iterative process.
To build the best models, keep scrubbing until the outliers and anomalies are removed from your collection.
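The scrub, model, and re-check cycle can be illustrated with a small Python sketch. It rests on loud assumptions: a straight-line least-squares fit stands in for the real model, the function names are hypothetical, and samples are dropped automatically when their residual has a modified z-score above 3.5, whereas in practice you would inspect each flagged sample before removing it.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the real model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def scrub_and_fit(xs, ys, threshold=3.5, max_rounds=10):
    """Alternate between fitting and scrubbing until no outliers remain."""
    xs, ys = list(xs), list(ys)
    for _ in range(max_rounds):
        slope, intercept = fit_line(xs, ys)
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
        med = median(residuals)
        mad = median([abs(r - med) for r in residuals])
        if mad == 0:
            break  # residuals are too uniform to rank
        keep = [0.6745 * abs(r - med) / mad <= threshold for r in residuals]
        if all(keep):
            break  # nothing left to scrub
        xs = [x for x, k in zip(xs, keep) if k]
        ys = [y for y, k in zip(ys, keep) if k]
    # Final model built on the scrubbed data.
    slope, intercept = fit_line(xs, ys)
    return slope, intercept, xs, ys
```

The `max_rounds` cap is a deliberate guard: repeated automatic removal can start discarding valid samples, so the loop stops after a fixed number of passes even if something is still flagged.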

Once you have sufficiently scrubbed your data, it is time to create the model.
