Overview
As a Mid SRE at DEUNA, you'll ensure the reliability, scalability, and performance of our AWS-based platform by integrating observability, automation, and SRE best practices across the software lifecycle.
You will work closely with development teams to improve uptime, provide observability tooling, and ensure we scale efficiently and securely.

Key Responsibilities
- Design, define, and maintain observability and monitoring for our AWS infrastructure
- Define and track SLIs, SLOs, and SLAs for critical systems
- Improve system uptime, latency, and fault tolerance across the platform
- Provide internal libraries and toolsets to developers for diagnostics and debugging
- Manage scaling, performance, and resilience efforts related to system reliability
- Collaborate with technical teams on capacity planning, load testing, and scaling policies
- Improve production operations by defining and evolving deployment strategies and conducting disaster recovery (DR) testing

Technical Skills
- Expertise with Prometheus, Grafana, OpenTelemetry, AWS CloudWatch, or other observability tools
- Experience designing dashboards, alerts, and log aggregation pipelines
- Deep understanding of AWS services: ECS, Lambda, RDS, CodePipeline
- Strong proficiency in the Go programming language
- Skilled at defining SLIs, SLOs, and error budgets, and at improving Mean Time to Recovery (MTTR)
- Experience conducting failure drills (e.g., Chaos Monkey, Gremlin) to ensure system resilience

Soft Skills
- Excellent communication and collaboration skills
- Adaptability to thrive in dynamic, fast-paced environments
- Strong time management and task prioritization
- Proficiency in English

What you will find when you join DEUNA
- A multicultural team distributed throughout LATAM
- Dynamism, agility, and constant innovation
- Being part of a high-impact solution for an entire region
- The best tools and technology to operate
- Being part of the startup culture
- We are in full expansion!

Benefits
- Vacations and additional PTO
- Remote work from anywhere
- Economic support for health insurance, internet, and cell phone line
- We all own DEUNA: we offer stock options
- Learning and development platform
- Multidisciplinary, diverse, and dynamic team
- Growth and career path

Be part of a dynamic team that's creating the next-generation payments platform. Join us at DEUNA.

Details
- Seniority level: Not Applicable
- Employment type: Full-time
- Job function: Engineering and Information Technology
- Industries: Software Development
Site Reliability Engineer • São Paulo, Brasil