What is Text Cleaning?
Text Cleaning (or data cleaning) is the manipulation of existing text into some other text, whether to change the old text, remove unwanted text or produce new information. For example, changing existing text might be something as simple as converting it to uppercase. Removing unwanted text might mean stripping out excess spaces. New information might mean turning a postcode into a city, county or country to fill in gaps or correct mistyped text.
In most Text Cleaning applications you can see a Preview of the new text before accepting the change.
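The three kinds of cleaning described above can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the tiny POSTCODE_TO_CITY table are assumptions made for the example, and a real application would rely on a full postcode database.

```python
import re
from typing import Optional

# Hypothetical lookup table keyed by the outward part of a UK-style postcode.
# A real application would use a complete postcode database instead.
POSTCODE_TO_CITY = {"SW1A": "London", "M1": "Manchester"}


def to_upper(text: str) -> str:
    """Change existing text: convert it to uppercase."""
    return text.upper()


def collapse_spaces(text: str) -> str:
    """Remove unwanted text: trim the ends and squeeze runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def city_from_postcode(postcode: str) -> Optional[str]:
    """Produce new information: look up the city for a postcode, if known."""
    parts = postcode.strip().upper().split()
    outward = parts[0] if parts else ""
    return POSTCODE_TO_CITY.get(outward)
```

For instance, collapse_spaces("  too   many    spaces ") gives "too many spaces", and city_from_postcode returns None when the postcode is not in the table, so a gap is left unfilled and never guessed.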
Why do we need Text Cleaning?
The main reasons we need Text Cleaning are:
- Accuracy. We don't want our letters to be sent to 'Mr Mary Smith'.
- Completeness. If we are missing the county in our address but have the postcode, why not add it in?
- Consistency. If the text is in Title Case in most cases, why not make it so in all cases?
- Uniformity. If we need our numbers to three decimal places, then let's make them all three decimal places!
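As a rough illustration of the consistency and uniformity points, the sketch below puts text into Title Case and renders numbers to three decimal places. The function names are made up for this example, and it assumes the numbers arrive as plain text.

```python
import string


def to_title_case(text: str) -> str:
    # string.capwords capitalises the first letter of each word and
    # lower-cases the rest, so "mary SMITH" becomes "Mary Smith".
    return string.capwords(text)


def to_three_decimals(value: str) -> str:
    # Parse the text as a number and re-render it with exactly three
    # digits after the decimal point.
    return f"{float(value):.3f}"
```

Applied to a whole column, to_three_decimals would turn "2" into "2.000" and "0.5" into "0.500", so every value ends up in the same shape.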
How does Text Cleaning work?
Depending on the application and the data source:
- Paste in the text from a spreadsheet or word processor.
- Choose the type of text cleaning you want to perform.
- Clean the text so it shows you a Preview of the changed text.
- Either copy the text to the clipboard, update the results back into the database, export the results, or resubmit the text for further cleaning.
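The steps above could be sketched as follows. This is not how any particular application works; the CLEANERS registry and the preview and accept functions are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry of the cleaning types a user can choose from.
CLEANERS: Dict[str, Callable[[str], str]] = {
    "uppercase": str.upper,
    "trim spaces": lambda s: " ".join(s.split()),
}


def preview(lines: List[str], cleaner_name: str) -> List[Tuple[str, str]]:
    """Return (original, cleaned) pairs without modifying the input."""
    cleaner = CLEANERS[cleaner_name]
    return [(line, cleaner(line)) for line in lines]


def accept(previewed: List[Tuple[str, str]]) -> List[str]:
    """Keep the cleaned side of each pair, ready to copy, export or clean again."""
    return [cleaned for _, cleaned in previewed]


# Pasted text, one entry per line.
pasted = ["  mr   john  smith ", "ms  jane doe"]

# Choose a cleaning type and look at the preview before accepting it.
pairs = preview(pasted, "trim spaces")
result = accept(pairs)

# Resubmit the accepted result for further cleaning.
again = accept(preview(result, "uppercase"))
```

Keeping the preview as pairs of original and cleaned text means nothing is changed until the result is accepted, which mirrors the Preview step described above.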