We are hiring a hands-on Databricks Data Quality Engineer to rebuild five Data Quality Scorecards using SAP ECC data already available in the Lakehouse. You will design and validate profiling logic, build rule-based data quality checks in PySpark, generate field-level and row-level results, and publish business-facing scorecards in Power BI. You will also define reusable templates, naming conventions, and repeatable processes to support future scorecard expansion (47 more scorecards) and help transition the organization away from Informatica IDQ.
Responsibilities
- Rebuild Data Quality scorecards in Databricks
- Develop profiling logic (nulls, distincts, pattern checks)
- Build PySpark-based Data Quality rules and row- and column-level metrics
- Create curated DQ datasets for Power BI scorecards
- Establish reusable DQ rule templates and standardized development patterns
- Work with SAP ECC data models
- Support and mentor a junior developer (Mexico-based) on rule logic and development standards
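To illustrate the kind of rule-based, metadata-driven DQ logic the responsibilities above describe, here is a minimal standalone sketch. The rule shapes, field names (MATNR, WERKS), and helper functions are illustrative assumptions, not the team's actual framework; in the real build these checks would be compiled into PySpark column expressions over Delta tables rather than evaluated row by row in plain Python.

```python
import re

# Hypothetical metadata-driven rule definitions: each rule names a field,
# a check type, and an optional parameter. MATNR (material number) and
# WERKS (plant) are common SAP ECC fields, used here only as examples.
RULES = [
    {"field": "MATNR", "check": "not_null"},
    {"field": "MATNR", "check": "pattern", "param": r"^\d{8}$"},
    {"field": "WERKS", "check": "not_null"},
]

def evaluate(rule, row):
    # Return True if this row passes the rule.
    value = row.get(rule["field"])
    if rule["check"] == "not_null":
        return value is not None and value != ""
    if rule["check"] == "pattern":
        return value is not None and re.match(rule["param"], str(value)) is not None
    raise ValueError(f"unknown check: {rule['check']}")

def score(rows):
    # Field-level pass rate per rule: the kind of metric a scorecard
    # would roll up into a curated dataset for Power BI.
    results = {}
    for rule in RULES:
        key = (rule["field"], rule["check"])
        passed = sum(evaluate(rule, r) for r in rows)
        results[key] = passed / len(rows)
    return results
```

Keeping rules as data rather than hard-coded logic is what makes the template reusable across the 47 planned scorecards: adding a scorecard means adding rule rows, not new code.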
Qualifications
- Strong Databricks engineering experience (PySpark, SQL, Delta Lake)
- Hands-on experience building Data Quality rules, frameworks, or scorecards
- Experience in profiling large datasets and implementing metadata-driven DQ logic
- Ability to mentor, review code, and explain concepts clearly
- Excellent communication skills in English
- Familiarity with SAP ECC tables and key fields (preferred)
- Experience with Unity Catalog or Purview (nice to have)
- Exposure to Lakehouse Monitoring or DQX accelerators (bonus)

If you are passionate about Data Quality, strong in Databricks / PySpark, and enjoy building reusable DQ capabilities, please apply today for immediate consideration!