Paxata - an excellent tool for cleaning text data
Pros
- Visualizes distributions in large data sets effectively, which lets the user quickly spot outliers and treat them appropriately
- Provides recommendations for merging datasets based on matching column values
- The cluster and edit feature is, in my opinion, its most powerful one; it reduces cardinality in text columns
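To illustrate what reducing cardinality in a text column means, here is a minimal Python sketch of key-collision ("fingerprint") clustering, a common general technique for grouping near-duplicate spellings. This is not Paxata's actual algorithm, which I don't know; the function names `fingerprint`, `cluster_values` and `merge_clusters` and the sample company names are made up for the example.

```python
import re
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-duplicate spellings collide."""
    lowered = value.strip().lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)   # drop punctuation
    tokens = sorted(set(no_punct.split()))        # ignore word order and repeats
    return " ".join(tokens)


def cluster_values(values):
    """Group the raw values by their fingerprint key."""
    clusters = defaultdict(list)
    for v in values:
        clusters[fingerprint(v)].append(v)
    return dict(clusters)


def merge_clusters(values):
    """Replace every value with the most frequent spelling in its cluster."""
    canonical = {
        key: Counter(members).most_common(1)[0][0]
        for key, members in cluster_values(values).items()
    }
    return [canonical[fingerprint(v)] for v in values]


# Hypothetical column with four distinct spellings of two companies
column = ["Acme Inc.", "acme inc", "Inc. Acme", "Acme Inc.", "Globex"]
merged = merge_clusters(column)
print(len(set(column)), "distinct values before,", len(set(merged)), "after")
```

In a real tool the user reviews each suggested cluster before merging; the sketch merges everything automatically only to keep the example short.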
Cons
- Doesn't provide recommendations on how to impute values
- It lags quite often
- We can't tell at first glance whether a column has errors or quality issues
Return on Investment
- It saves time spent cleaning data
- It reduces the need for a large number of data engineers/stewards and hence has a positive impact on the business's return
Alternatives Considered
Talend Data Preparation
Other Software Used
Alteryx Analytics, Talend Data Quality, Tableau Desktop
