
Duplicate observations frequently arise during data collection, such as when we combine datasets from multiple sources. They can also appear when we scrape data or receive data from different clients, different departments, and so on.
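As a minimal sketch of removing such duplicates with pandas (the DataFrame and its columns here are invented for illustration, not taken from the article):

```python
import pandas as pd

# Hypothetical records left over after combining two sources.
df = pd.DataFrame({
    "listing_id": [101, 102, 102, 103],
    "price":      [250000, 310000, 310000, 189000],
})

# Drop rows that are exact duplicates across all columns.
deduped = df.drop_duplicates()

# Or treat any rows sharing a listing_id as duplicates, keeping the first.
deduped_by_id = df.drop_duplicates(subset="listing_id", keep="first")
```

Deduplicating on a key column (rather than the full row) is the safer choice when the same record may have been re-entered with small differences in other fields.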

Irrelevant observations come into the picture when the data does not actually fit the specific problem you have at hand. For example, if you are building a model for single-family homes in a specific region, you probably do not want observations for apartments in that dataset. It is also a good idea to review the charts from your exploratory analysis, especially for categorical features, to see whether any classes should not be there. Checking for these error elements before feature engineering will save you a lot of time and headaches down the road.
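Continuing the single-family-home example, the filtering step might look like this (the `property_type` column and its labels are assumptions for the sketch):

```python
import pandas as pd

# Illustrative dataset; column name and classes are assumed.
df = pd.DataFrame({
    "property_type": ["single_family", "apartment", "single_family", "condo"],
    "price":         [250000, 120000, 310000, 95000],
})

# Inspect the classes first -- this mirrors reviewing the exploratory charts.
print(df["property_type"].value_counts())

# Keep only the observations relevant to the single-family-home model.
homes = df[df["property_type"] == "single_family"].copy()
```

The `value_counts()` check before filtering is the important habit: it surfaces classes you did not expect to be in the data at all.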

Fixing all the structural errors

The next bucket of data cleaning involves fixing structural errors in your datasets. These are errors that arise during measurement, data transfer, or other poor housekeeping practices. At this stage, check for inconsistent capitalization, typos, and other entry errors. Structural errors mostly concern categorical features, which you can inspect directly. Sometimes they are simple spelling errors; other times they are compound errors. You should also look for mislabeled classes, that is, labels recorded as separate classes that should really be treated as one.
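A short sketch of fixing these structural errors on a categorical column (the `roof` column, its values, and the typo mapping are all invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    # Inconsistent capitalization, stray whitespace, a typo ("asphelt"),
    # and two spellings of the same class.
    "roof": ["asphalt", " Asphalt", "ASPHALT", "asphelt", "Shingle", "shingle"],
})

# Normalize whitespace and case.
cleaned = df["roof"].str.strip().str.lower()

# Map typos and mislabeled variants onto canonical classes (assumed mapping).
canonical = {"asphelt": "asphalt"}
cleaned = cleaned.replace(canonical)

print(cleaned.value_counts())
```

Running `value_counts()` after each normalization pass shows whether any near-duplicate classes remain to be merged.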
