Data lineage is the process of tracing how data moves from its origin through each stage of transformation over time until it reaches its final destination. It provides a comprehensive view of the data's source, its transformations, and its final destination within the data pipeline. … Continue reading What is data lineage and why is it relevant?
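To make the idea concrete, here is a minimal sketch in Python of a value that carries its own lineage: the origin, each transformation applied, and the final destination. The class `TrackedData`, its methods, and the names `orders.csv` and `warehouse.orders` are all hypothetical, chosen only for illustration; real lineage tools record far more metadata than this.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TrackedData:
    """A value bundled with a record of how it was produced (hypothetical helper)."""
    value: list
    origin: str
    steps: List[str] = field(default_factory=list)
    destination: str = ""

    def transform(self, name: str, fn: Callable[[list], list]) -> "TrackedData":
        # Apply the transformation and append its name to the lineage.
        return TrackedData(fn(self.value), self.origin, self.steps + [name], self.destination)

    def deliver(self, destination: str) -> "TrackedData":
        # Record where the data finally lands.
        return TrackedData(self.value, self.origin, self.steps, destination)

    def lineage(self) -> List[str]:
        # Origin, then every transformation in order, then the destination.
        return [self.origin, *self.steps, self.destination]


raw = TrackedData([3, None, 1, 2], origin="orders.csv")
clean = raw.transform("drop_nulls", lambda xs: [x for x in xs if x is not None])
final = clean.transform("sort", sorted).deliver("warehouse.orders")

print(final.lineage())
# ['orders.csv', 'drop_nulls', 'sort', 'warehouse.orders']
```

Each step returns a new object instead of mutating the old one, so the lineage of an intermediate result such as `clean` stays intact and can be inspected on its own.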
Category: Data
Mount volumes to persist data in local & initialize database in Docker
When a Docker container is deleted, relaunching it from the image starts a fresh container without any of the changes made in the previously running container. This happens because when we create a new container from an image, we add a new writable layer on top of the underlying stack of layers present in … Continue reading Mount volumes to persist data in local & initialize database in Docker
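The layering behaviour described above can be illustrated with a toy model in Python. This is not the Docker API; `Image`, `Container`, and the paths used are hypothetical names for the sketch. It assumes only two things: each container gets its own writable layer on top of the read-only image layers, and a mounted volume is storage that lives outside the container.

```python
class Image:
    """Read-only stack of layers; each layer maps a path to its content."""
    def __init__(self, layers):
        self.layers = tuple(dict(layer) for layer in layers)


class Container:
    """Toy container: a fresh writable layer plus optional mounted volumes."""
    def __init__(self, image, volumes=None):
        self.image = image
        self.writable = {}            # new, empty writable layer for every container
        self.volumes = volumes or {}  # mount prefix -> dict that outlives the container

    def _volume_for(self, path):
        # Return the volume whose mount prefix covers this path, if any.
        for prefix, store in self.volumes.items():
            if path.startswith(prefix):
                return store
        return None

    def write(self, path, content):
        store = self._volume_for(path)
        target = store if store is not None else self.writable
        target[path] = content

    def read(self, path):
        store = self._volume_for(path)
        if store is not None:
            return store.get(path)
        if path in self.writable:
            return self.writable[path]
        # Fall back to the image layers, topmost layer first.
        for layer in reversed(self.image.layers):
            if path in layer:
                return layer[path]
        return None


image = Image([{"/etc/app.conf": "default"}])
host_dir = {}  # stands in for a directory on the host

first = Container(image, volumes={"/data/": host_dir})
first.write("/etc/app.conf", "edited")   # goes to the writable layer
first.write("/data/db.sqlite", "rows")   # goes to the volume
del first                                # the writable layer is discarded

second = Container(image, volumes={"/data/": host_dir})
print(second.read("/etc/app.conf"))      # default
print(second.read("/data/db.sqlite"))    # rows
```

The edit to `/etc/app.conf` is lost with the first container, while the file written under the mounted path survives because it was never stored in the container's writable layer.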
What is a Data Mesh ??
Data mesh is an architectural paradigm for how you think about and organize data and its services, much as microservices is an architectural pattern for designing and building applications. A microservices-based architecture helps solve the challenges that prevent an organization from being agile and responding quickly to change, such as delays in introducing new features, long … Continue reading What is a Data Mesh ??
Data warehouse vs Data Mart Vs Data Lake
Data warehouse: An aggregation of data collected from multiple sources into a single central repository that unifies data quality and format. Highly curated data that serves as the central version of the truth. Meant to store structured data. Mostly used for BI, analytics, data mining, artificial intelligence (AI), and machine learning. Example use cases: big data integration, NLP, … Continue reading Data warehouse vs Data Mart Vs Data Lake