What are the key points for a good data collection?

 

  • Data. As much as possible, and as varied as possible. This point is critical for the success of your data collections. By data, we mean both spectra and metadata (attributes). The spectra should be scanned properly, and the metadata should come from a reliable source (such as an external lab or a different measurement device).
  • Choosing the right preprocessing when analyzing your data and building your models. SCiO Lab provides a few predefined preprocessing types (Processed and Normalized). Processed is mainly useful for estimation models, where we assume the Beer-Lambert law holds. Normalized is mainly useful for classification models. When the optical path varies between your samples, you may need to use both Processed and Normalized.
  • Using the Expert mode to create your own preprocessing method might also be useful for certain models.
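
To illustrate why combining the two preprocessing types can help when the optical path varies, here is a minimal Python sketch. It does not reproduce SCiO Lab's actual Processed or Normalized algorithms, which are not specified here. As an assumption, it stands in a Beer-Lambert log transform (absorbance = -log10 of reflectance) for the "processed" step and a standard normal variate (SNV) scaling for the "normalized" step. The function names and the synthetic spectra are hypothetical.

```python
import math
from statistics import mean, pstdev


def to_absorbance(reflectance):
    """Assumed 'processed'-style step: Beer-Lambert log transform, A = -log10(R)."""
    return [-math.log10(r) for r in reflectance]


def snv(spectrum):
    """Assumed 'normalized'-style step: centre and scale a single spectrum (SNV)."""
    m = mean(spectrum)
    sd = pstdev(spectrum)
    return [(v - m) / sd for v in spectrum]


# Synthetic absorbance profile of one material (made-up values).
base_absorbance = [0.10, 0.25, 0.40, 0.30, 0.15]

# Under Beer-Lambert, absorbance scales linearly with optical path length,
# so the same material scanned at two path lengths gives scaled spectra.
short_path = [10 ** (-a * 1.0) for a in base_absorbance]
long_path = [10 ** (-a * 1.5) for a in base_absorbance]

a_short = to_absorbance(short_path)
a_long = to_absorbance(long_path)

# After the log transform alone, the two scans still differ by the path factor.
differ_after_log = not all(
    math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a_short, a_long)
)

# After the log transform followed by SNV, the path factor cancels out.
match_after_both = all(
    math.isclose(x, y, abs_tol=1e-9) for x, y in zip(snv(a_short), snv(a_long))
)

print(differ_after_log, match_after_both)
```

In this toy case the path length acts as a multiplicative factor on absorbance, which is why a per-spectrum scaling step removes it. Real samples also show offsets and scattering effects, so treat this only as intuition for when to try both options.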