Job Title: Lead DevOps Engineer
Location: Remote
Experience Level: 10+ years
About the Role
We are seeking a highly experienced Lead DevOps Engineer to spearhead our cloud infrastructure and DevOps initiatives. In this role, you will lead a small but growing team of engineers and drive the strategic direction of our DevOps practices. The ideal candidate has a proven track record of building reliable, scalable, and secure platforms, coupled with the leadership skills to mentor engineers and align infrastructure with business goals.
Key Responsibilities
Leadership & Strategy
Lead, mentor, and grow a DevOps engineering team, fostering a culture of automation, reliability, and continuous improvement.
Define best practices, standards, and architectural direction for DevOps across the organization.
Partner with engineering, security, and product teams to ensure infrastructure supports business needs.
Cloud & Infrastructure
Design, implement, and manage large-scale cloud infrastructure (AWS, Azure, or GCP).
Architect and maintain infrastructure as code (IaC) using tools like Terraform, Pulumi, or CloudFormation.
Establish and enforce high-availability and disaster recovery strategies.
Automation & CI/CD
Build and optimize CI/CD pipelines for efficient, secure, and reliable software delivery.
Automate operational tasks, deployments, monitoring, and scaling.
Ensure fast feedback loops and minimal downtime through advanced release strategies (blue/green, canary).
Reliability & Observability
Implement and manage monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK/EFK, Datadog, New Relic).
Define and track service-level objectives (SLOs) and error budgets to improve reliability.
Perform root cause analysis and lead postmortems to prevent recurrence.
Security & Compliance
Embed security practices into infrastructure and CI/CD pipelines ("shift-left" security).
Ensure compliance with industry standards and regulations (ISO 27001, SOC 2, HIPAA, GDPR, etc.).
Implement secrets management, access controls, and vulnerability scanning.
Collaboration & Mentorship
Provide technical guidance, code reviews, and hands-on support to team members.
Collaborate cross-functionally with developers, QA, and operations teams.
Advocate for DevOps culture, evangelizing best practices throughout the organization.
Required Qualifications
10+ years of experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering.
5+ years of leadership experience (team lead, manager, or architect role).
Expert-level proficiency in at least one major cloud provider (AWS, Azure, or GCP).
Strong hands-on experience with:
IaC: Terraform, Pulumi, or CloudFormation
CI/CD: Jenkins, GitHub Actions, GitLab CI, ArgoCD, Spinnaker, etc.
Containers & Orchestration: Docker, Kubernetes, Helm
Observability Tools: Prometheus, Grafana, ELK/EFK, Datadog, Splunk, New Relic
Security Tools: HashiCorp Vault, AWS IAM, OPA, Prisma Cloud, Aqua Security
Proven track record of designing and maintaining large-scale, highly available, secure systems.
Strong background in Linux/Unix systems administration and networking fundamentals.
Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
Excellent communication, leadership, and collaboration skills.
Nice to Have
Experience with hybrid or multi-cloud environments.
Familiarity with service meshes (Istio, Linkerd) and API gateways.
Background in cost optimization and FinOps practices.
Contributions to open-source DevOps or cloud-native projects.