From the course: Data Versioning, Lineage, and Quality Monitoring for AI

Data lineage vs. data provenance vs. data governance

- [Instructor] Data lineage is closely associated with data provenance and data governance. Let's take a moment to understand what exactly these terms refer to, how they're similar, how they're different, and where they overlap.

First, a quick definition. Data lineage is the process of tracking and visualizing the flow of data from its origin, through various transformations, to its final destination. Data provenance is the detailed documentation of where the data came from: its origin and ownership. Data governance, on the other hand, is a set of practices and policies that ensure the secure and ethical use of data within an organization.

Let's first compare data lineage and data provenance. Data lineage tracks the entire data journey from source to consumption, including transformations and dependencies. Data provenance, on the other hand, focuses on the origin, history, and ownership of data, ensuring authenticity and traceability. You can see that data lineage is a much…
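To make the lineage and provenance definitions concrete, here is a minimal Python sketch. It is only an illustration and is not taken from the course: the class names `ProvenanceRecord` and `LineageGraph`, their fields, and the dataset names are all hypothetical. The provenance record captures origin and ownership for one dataset, while the lineage graph captures the transformations that connect datasets from source to consumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvenanceRecord:
    """Origin and ownership of one dataset (hypothetical fields)."""
    dataset: str
    origin: str
    owner: str


@dataclass
class LineageGraph:
    """Maps each source dataset to its (transformation, target) steps."""
    edges: dict = field(default_factory=dict)

    def add_step(self, source, transformation, target):
        # Record that `target` is produced from `source` by `transformation`.
        self.edges.setdefault(source, []).append((transformation, target))

    def downstream(self, dataset):
        """Return every dataset reachable from `dataset`, in visit order."""
        seen, stack = [], [dataset]
        while stack:
            current = stack.pop()
            for _, target in self.edges.get(current, []):
                if target not in seen:
                    seen.append(target)
                    stack.append(target)
        return seen


# Hypothetical example data, for illustration only.
provenance = ProvenanceRecord("raw_orders", "orders_db export", "sales-team")
lineage = LineageGraph()
lineage.add_step("raw_orders", "deduplicate", "clean_orders")
lineage.add_step("clean_orders", "aggregate_by_day", "daily_revenue")

print(provenance.owner)
print(lineage.downstream("raw_orders"))
```

In this sketch, the provenance record answers "where did this dataset come from and who owns it?", while `downstream` answers "what depends on this dataset?", which is the kind of question lineage tracking is meant to support.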