Health Provider Database

As a Data Scientist at Veda, I co-authored a white paper validating the accuracy of a newly launching health provider data product. After building a sample dataset in the traditional way, I demonstrated the product's data quality through research on efficient data processing methods and diverse data quality assessment frameworks, and I defined and calculated the key performance metrics.

Auto Web Scraping

Collected data for 8,000 providers across 380 zip codes. Implemented an automated web scraper with Selenium, cutting data collection time by 88% while maintaining accuracy and comprehensiveness.
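
A minimal sketch of how a per-zip-code collection loop like this could be organized, not the scraper actually used on the project. The URL pattern, the CSS selector, and the names `ProviderRecord`, `fetch_names`, and `collect` are all hypothetical. The Selenium call is kept behind a small function so the collection logic can be exercised without a browser.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ProviderRecord:
    zip_code: str
    name: str


def fetch_names(driver, url: str, selector: str) -> list[str]:
    """Load one results page and return the text of each matching element.

    `driver` is a Selenium WebDriver; `url` and `selector` are placeholders
    for whatever the target site actually uses.
    """
    # Imported here so the rest of the sketch runs without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get(url)
    return [el.text for el in driver.find_elements(By.CSS_SELECTOR, selector)]


def collect(
    zip_codes: Iterable[str],
    fetch: Callable[[str], Iterable[str]],
) -> list[ProviderRecord]:
    """Call `fetch` once per zip code, tidy the names, and skip repeats."""
    seen: set[tuple[str, str]] = set()
    records: list[ProviderRecord] = []
    for zip_code in zip_codes:
        for raw in fetch(zip_code):
            name = " ".join(raw.split())  # collapse stray whitespace
            if name and (zip_code, name) not in seen:
                seen.add((zip_code, name))
                records.append(ProviderRecord(zip_code, name))
    return records


# Hypothetical usage with a real browser (URL and selector are assumptions):
#
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   try:
#       records = collect(
#           ["10001", "10002"],
#           lambda z: fetch_names(
#               driver, f"https://example.com/search?zip={z}", ".provider-name"
#           ),
#       )
#   finally:
#       driver.quit()
```

Passing the fetch step in as a function keeps the loop testable with canned data and makes it easy to add waits or retries around the browser call later.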

Entity Mapping and Data Processing

Devised a data cleaning process to refine entity mapping, achieving an 80% reduction in plurality.
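
As an illustration only, the sketch below shows one way a cleaning step can collapse several raw records into a single entity. It assumes "plurality" refers to multiple raw records pointing at the same provider, and it measures the reduction as the drop in distinct entity keys after name normalization; both are assumptions, as are the title and credential lists, the sample records, and the function names.

```python
import re
from collections import defaultdict

# Illustrative lists only; real data would need a fuller set, plus care
# around surnames that collide with credential abbreviations.
TITLES = {"dr"}
CREDENTIALS = {"md", "np", "rn"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and strip leading titles and trailing credentials."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[0] in TITLES:
        tokens.pop(0)
    while tokens and tokens[-1] in CREDENTIALS:
        tokens.pop()
    return " ".join(tokens)


def map_entities(records):
    """Group (name, zip_code) records under a (normalized name, zip_code) key."""
    groups = defaultdict(list)
    for name, zip_code in records:
        groups[(normalize(name), zip_code)].append(name)
    return dict(groups)


def key_reduction(records, groups) -> float:
    """Relative drop in distinct entity keys, from raw records to cleaned groups."""
    raw_keys = len(set(records))
    return 1 - len(groups) / raw_keys if raw_keys else 0.0


# Made-up sample records for demonstration.
sample = [
    ("Dr. Jane Smith, MD", "10001"),
    ("Jane Smith MD", "10001"),
    ("JANE SMITH", "10001"),
    ("John Doe, NP", "10002"),
    ("John Doe", "10002"),
]
entities = map_entities(sample)
reduction = key_reduction(sample, entities)
```

Keying on both the normalized name and the zip code is a deliberately conservative choice: two providers who share a name in different zip codes stay separate entities.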

Deliverable

Highlighted product quality through in-depth research on efficient data processing methods, data quality assessment frameworks, and the federal regulations on health information exchange.