Let's start with a definition.
"Harmonization is to create the possibility of combining data from more heterogeneous data sources, both internal and external, into an integrated, consistent, and interactive Data Story, in a way that is of no concern or struggle to the user. Collaborative decisions is the sharing of pertinent data and insights, in real-time, while keeping context, for fast, data-driven decisions." -- Clear Story
In simpler words, Data Harmonization = Data Inferencing + Data Profiling +
We have 15 data sources (structured, columnar-schema databases). Each data source has about 40 fields. The data describes people and their related professional attributes.
We want to create a single unified view of these 15 data sources. However, a lot of data is missing: a field that is populated in one source is often empty when we look for the same field in another source....
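To make the problem concrete, here is a minimal sketch of one way to build such a unified view: fill each field from the first source that actually has a value for it. It assumes every source shares a common key for a person, which may not hold for the real data. The names `person_id`, `employer`, `title`, `unify`, and `is_missing` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing (an assumption about the data)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def unify(sources: List[List[Record]], key: str) -> Dict[Any, Record]:
    """Merge records that share the same key.

    Sources are visited in order, so an earlier source wins when two
    sources both have a value for the same field.
    """
    unified: Dict[Any, Record] = {}
    for records in sources:
        for record in records:
            if is_missing(record.get(key)):
                continue  # a record without the key cannot be matched
            merged = unified.setdefault(record[key], {})
            for field, value in record.items():
                # Only fill a field that is still empty in the unified view.
                if is_missing(merged.get(field)) and not is_missing(value):
                    merged[field] = value
    return unified


# Two toy sources standing in for the 15 real ones (hypothetical data).
source_a = [{"person_id": 1, "name": "Ada", "employer": "", "title": "Engineer"}]
source_b = [{"person_id": 1, "name": "Ada", "employer": "Acme", "title": None}]

view = unify([source_a, source_b], key="person_id")
print(view[1])
```

Here `employer` is blank in the first source and filled from the second, while `title` comes from the first. If the sources have no reliable shared key, the records would first need to be matched some other way, which this sketch does not attempt.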