Databricks Data Engineer
Fully Remote Contract
We’re looking for a hands-on Databricks Data Engineer with strong experience building scalable data pipelines using Spark, PySpark, SQL, and Delta Lake. This role focuses on ingesting data from multiple sources, transforming it for analytics, and publishing high-quality datasets and visualizations.
Responsibilities:
- Build and optimize ETL/ELT pipelines in Databricks using PySpark, Spark SQL, and Delta Lake.
- Ingest, clean, and transform data from diverse sources (APIs, SQL databases, cloud storage, SAP/legacy systems, streaming).
- Develop reusable pipeline frameworks, data validation logic, and performance-tuned transformations.
- Design curated datasets and deliver insights through Power BI dashboards.
- Implement best practices for lakehouse development, orchestration, and version control.
- Troubleshoot pipeline performance issues and ensure data accuracy, reliability, and quality.
Required Skills:
- Strong hands-on Databricks experience (Spark, PySpark, Delta Lake).
- Advanced SQL for large-scale data processing.
- Experience integrating data from multiple structured and unstructured sources.
- Solid understanding of distributed computing, performance tuning, and debugging Spark jobs.
- Power BI (reports, models, DAX preferred).
- Experience with CI/CD pipelines in an Azure environment.
Nice to Have:
- Experience with data quality frameworks, Lakehouse monitoring, or DQX.
- Knowledge of Airflow, ADF, IoT, Kafka, or similar tools.
- Experience with SAP data.