A leading business data provider to the financial services sector aimed to deliver precise investment management insights. The company's database contained approximately 27,000 public companies, each with a comprehensive set of 240+ attributes covering contact information, personnel data, stock information, financial data, etc. The data had to be meticulously curated to ensure 100% accuracy and refreshed periodically to keep it from going out of date.
Xtract.io provided an accurate, practical way to fetch the required information automatically and in real time using Freda, our advanced yet affordable data quality platform.
We had to deal with many attributes, each with its own set of validation rules. We also had to customize web monitoring procedures to watch various data sources and track changes in each data variable.
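As a rough illustration of what per-attribute validation rules and change tracking can look like, here is a minimal Python sketch. It is not Freda's implementation: the attribute names (`ticker`, `employee_count`, `website`), the rules themselves, and the helper functions are all hypothetical, chosen only to show the idea.

```python
import hashlib
import re
from typing import Callable, Dict, List

# Hypothetical rules: each attribute maps to a predicate over its raw string value.
RULES: Dict[str, Callable[[str], bool]] = {
    "ticker": lambda v: re.fullmatch(r"[A-Z]{1,5}", v) is not None,
    "employee_count": lambda v: v.isdigit() and int(v) > 0,
    "website": lambda v: re.fullmatch(r"https?://\S+", v) is not None,
}


def validate(record: Dict[str, str]) -> List[str]:
    """Return the names of attributes in the record that fail their rule."""
    return [
        name
        for name, rule in RULES.items()
        if name in record and not rule(record[name])
    ]


def fingerprint(value: str) -> str:
    """Hash a normalized value so two crawls can be compared cheaply."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous: str, current: str) -> bool:
    """Report whether a monitored value differs between two crawls."""
    return fingerprint(previous) != fingerprint(current)
```

In this sketch, `validate` flags the attributes that need attention, and `has_changed` ignores differences in case and surrounding whitespace so that only meaningful changes at a source trigger a refresh.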
Addressed the client's challenge using a combination of automation and human-in-the-loop manual curation.
Classified and sorted the data sources based on priority, reliability, and frequency of updates.
Freda used different crawlers for different data sources to collect the majority of the required attributes.
Sources such as stock exchanges were crawled by fully automated scrapers, whereas AI-ML-based bots processed websites.
Data such as financials and profit-and-loss figures were crawled by specialized bots for interpretation.
Validated the attributes through MOJO, our proprietary data validation platform, to ensure high accuracy.
Double-validated the data with a team of 70+ in-house technical experts, who checked it for validity and accuracy using the source URL as a reference.
The errors corrected by our experts were returned as feedback to the AI-ML system, allowing the logic to be fine-tuned.
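The source classification step described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the actual Freda logic: the `Source` fields, the scoring scales, and the tie-breaking order (priority first, then reliability, then update frequency) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    name: str
    priority: int           # hypothetical scale: 1 is the highest priority
    reliability: float      # hypothetical score between 0.0 and 1.0
    updates_per_month: int  # how often the source publishes new data


def crawl_order(sources: List[Source]) -> List[Source]:
    """Sort by priority, then by higher reliability, then by more frequent updates."""
    return sorted(
        sources,
        key=lambda s: (s.priority, -s.reliability, -s.updates_per_month),
    )


sources = [
    Source("company website", priority=2, reliability=0.80, updates_per_month=4),
    Source("stock exchange", priority=1, reliability=0.99, updates_per_month=30),
    Source("news aggregator", priority=2, reliability=0.80, updates_per_month=60),
]

for source in crawl_order(sources):
    print(source.name)
```

Ranking sources this way lets the most trusted and most frequently updated ones be crawled first, while lower-ranked sources can be refreshed less often.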
We refreshed the customer data regularly and achieved data accuracy of greater than 98%. Freda's feedback mechanism allowed the system to learn over time and achieve greater than 90% accuracy on critical attributes. Our cutting-edge technologies and technical experts reduced the processing time by 30% and processing cost by 25%.