About the Position

We're looking for a seasoned Site Reliability Engineer to collaborate with our dynamic team in Brazil. You'll work closely with product teams to tackle complex infrastructure challenges, acting as both a hands-on technical expert and a mentor. You'll help design, build, and operate cloud platforms while driving reliability, performance, and security across the engineering organization.

Main Responsibilities:

- Technical Leadership & Mentorship: Guide and support engineers, foster technical excellence through code reviews and design discussions, and help unblock critical challenges.
- Documentation & Standards: Create and maintain runbooks, standards, and best-practice guides to strengthen operational capabilities.
- Infrastructure as Code & CI/CD: Automate provisioning and deployments using Terraform and best-practice pipelines (GitHub Actions, ArgoCD, etc.).
- Reliability Engineering: Define SLIs/SLOs, manage error budgets, and build dashboards and alerts to proactively monitor system health.
- Security & Compliance: Enforce least-privilege IAM policies, automate vulnerability scans, and maintain audit logging.
- Monitoring & Observability: Instrument services with metrics, logs, and distributed tracing to enable rapid troubleshooting.
- Cost Optimization: Implement tagging strategies, right-size resources, and use data-driven insights to manage cloud spend effectively.

Required Skills:

- Experience managing production-critical systems with deep knowledge of SRE and DevOps principles.
- Proven experience mentoring engineers and leading technical projects.
- Strong proficiency with AWS and cloud-native best practices.
- Hands-on experience with Kubernetes (EKS or GKE) and large-scale container orchestration.
- Expertise with Terraform for infrastructure provisioning.
- Experience managing and debugging Redis and Postgres databases.
- Solid understanding of VPCs, VPNs, load balancers, and cloud networking.
- Proficiency with Git, branching strategies, and CI/CD integrations.
- Strong grasp of web and network protocols (HTTP, REST, TLS, DNS, etc.).

Nice to Have:

- Bachelor's degree in Computer Science, Engineering, or related field.
- Experience breaking down large, ambiguous projects into actionable tasks.
- Familiarity with ArgoCD, GitHub Actions, Jenkins, or similar CI/CD tools.
- Working knowledge of Python, Golang, and Helm.
- Experience running scalable Node.js microservices.
- Knowledge of cloud infrastructure security and Terragrunt best practices.
- Background in production readiness and high-growth engineering environments.
Architect • Fortaleza, Ceará, Brasil