Intermix.io [http://intermix.io] presents a special 1-hour online
training that introduces you to strategies and best practices for
adding data validation tests and data quality monitoring to the tables
in your data lake and data warehouse.
WHY THIS TRAINING?
The data in
your data warehouse is mission critical. Reports are being used to
make crucial company decisions. Your reputation depends on the
accuracy of the data being reported, so it is critical that you have
confidence in your data pipeline execution.
Your data pipelines are complex and change often. Minor changes to
DAGs and tasks can introduce regressions. These may have unintended
impacts on tables and data flow that are not discovered until much
later, after the data has already been used in reports.
Failures and DAG outages also occur. You need confidence that once
failures are fixed, data is ‘flowing’ again and things are back to
normal. Finally, complex ETL demands accuracy when data is
mission critical. How do you know that your ETL is copying all rows
accurately? That your joins are not dropping any data?
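As a rough illustration of these two questions, here is a minimal sketch of a row-count check and a dropped-join check. It is not taken from the training material: the table names (staging_orders, warehouse_orders, customers), the helper functions, and the use of an in-memory SQLite database are all assumptions made for the example.

```python
import sqlite3


def row_count(conn, table):
    # Table names cannot be bound as SQL parameters, so only pass trusted names.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def check_copy_complete(conn, source, target):
    """Return True when the target table has as many rows as the source."""
    return row_count(conn, source) == row_count(conn, target)


def unmatched_keys(conn, left, right, key):
    """Return key values in `left` with no match in `right`.

    These are the rows an inner join between the two tables would drop.
    """
    query = (
        f"SELECT l.{key} FROM {left} AS l "
        f"LEFT JOIN {right} AS r ON l.{key} = r.{key} "
        f"WHERE r.{key} IS NULL ORDER BY l.{key}"
    )
    return [row[0] for row in conn.execute(query)]


def build_demo():
    # Hypothetical tables, used only to exercise the checks above.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE staging_orders (order_id INTEGER, customer_id INTEGER);
        CREATE TABLE warehouse_orders (order_id INTEGER, customer_id INTEGER);
        CREATE TABLE customers (customer_id INTEGER);
        INSERT INTO staging_orders VALUES (1, 10), (2, 11), (3, 12);
        INSERT INTO warehouse_orders SELECT * FROM staging_orders;
        INSERT INTO customers VALUES (10), (11);
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print("copy complete:", check_copy_complete(conn, "staging_orders", "warehouse_orders"))
    print("orders without a customer:", unmatched_keys(conn, "warehouse_orders", "customers", "customer_id"))
```

In a real pipeline, checks like these would typically run as a task after each load and raise an alert when a count mismatch or unmatched key appears.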
This training gives you practical, real-world examples of tests that
can be added to any data pipeline to provide you with the confidence
that things are working as expected.
WHAT WILL I LEARN?
The class will include hands-on activities and provide pseudo-code
examples of tests that can be run against your tables and data models.
You will learn about the different classes of tests, how to set them
up, and the important metrics to monitor.
HOW WILL THIS HELP MY COMPANY AND ME?
Accurate business decisions, confidence in your data quality,
and no more guesswork or surprises about it. Sounds great,
right? Once you’re freed from worrying and fighting fires, you can
refocus on being creative with your data and have the confidence
to make changes.
15/06/2018 Last update