Domain-Specific Visual Language for Data Engineering Quality


Data engineering pipelines process large amounts of information, and ensuring that the quality and integrity of the data is maintained throughout is critical for technical, business, and social reasons. Conventional data quality assurance approaches require a large amount of fine-grained testing code, which is laborious, easy to get out of sync, and inscrutable to non-technical stakeholders. An executable higher-level visual approach to expressing quality requirements can serve as a shared representation of these constraints and their implications for all parties, eliminating repetition while increasing accessibility and maintainability. We present a visual programming language for expressing data quality requirements within a pipeline declaratively, structured as a diagram of compositional data flow, transformation, and validation steps.


Alexis De Meo, Michael Homer

Published in

ACM SIGPLAN International Workshop on Programming Abstractions and Interactive Notations, Tools, and Environments (PAINT), 2022

The final copy of this publication is available from the publisher.


this page
Michael Homer — 2022 729e28c0