Senior Python Data Engineer
About the Project
Responsibilities
- Design, build, and maintain high-performance data processing pipelines using Python libraries (Pandas, Polars).
- Develop and expose RESTful APIs using FastAPI or similar frameworks.
- Consume and process normalized Parquet files from multiple upstream sources to generate dynamic Excel reports.
- Contribute to a spec-driven development workflow (using GitHub Copilot, Claude, etc.) to scaffold and generate API and data pipeline code.
- Optimize report generation logic for speed and scalability, currently targeting sub-20-second response times.
- Integrate with messaging and storage mechanisms (e.g., Service Bus, Storage Accounts).
- Collaborate on infrastructure-as-code automation using Bicep (or similar IaC tools).
- Participate in design discussions for future migration to Snowflake and/or a data lake architecture.
- Contribute to CI/CD pipelines using GitHub Actions.
Required Skills and Experience
- Strong proficiency in Python for data processing (must have expertise in Pandas; nice to have: Polars, openpyxl).
- Experience building backend services or APIs using frameworks like FastAPI.
- Solid understanding of data modeling principles (Star Schema) and handling normalized datasets.
- Familiarity with enterprise messaging patterns and data integration from various sources (API-based and file-based).
- Experience working with GitHub and CI/CD pipelines (GitHub Actions or similar).
- Infrastructure-as-Code experience with Bicep or comparable tools (Terraform, AWS CDK).
- Comfort with spec-driven development and leveraging AI tools like GitHub Copilot for scaffolding.