Role Summary
The SRE Technical Member will:
Deliver engineering, operational, and administrative support for the application and its technology landscape.
Address reliability and operational challenges such as application failures, production issues, infrastructure performance (disk, memory), monitoring, and security.
Serve as a mid-level subject matter expert, integrating with multiple teams to develop and evolve SRE practices for Azure-based environments.
Participate in production support activities, including deployments, upgrades, and critical issue resolution.
This role is central to designing, implementing, and maintaining monitoring, alerting, and reporting solutions across servers, containers, databases, and cloud infrastructure components.
Key Responsibilities
Collaborate with Central SRE, DevOps, and InfoSec teams on new projects, platform builds, and deployments.
Contribute to the design, implementation, and operation of large-scale, Azure-based platforms.
Apply industry best practices in monitoring, alerting, reporting, and cloud architecture.
Participate in infrastructure, application, and security planning, focusing on scalability, redundancy, and data preservation.
Support high-availability topologies with development teams.
Produce documentation and weekly operational status reports, detailing project progress and key metrics.
Provide engineering and support for technical infrastructure, cloud, databases, and application performance.
Manage incident response, change management, and user permissions following SRE best practices (Google SRE model).
Maintain close collaboration between Application, Central SRE, DevOps, InfoSec, and business units.
Assist in configuring and onboarding new applications into the Azure DevOps (ADO) platform.
Core Technical Skills
Strong understanding of SRE fundamentals: monitoring, alerting, reporting, performance, availability, and incident response.
Hands-on experience with CI/CD tools (Git, Azure Pipelines, Ansible, etc.).
Infrastructure as Code (IaC) design, scripting, and setup.
Deep knowledge of Azure Web Services: installation, configuration, and management.
Experience administering Microsoft applications (.NET, C#, Angular) with a focus on automation, optimization, and security.
Proficiency in Cosmos DB and MS SQL operational tasks.
Excellent troubleshooting, root-cause analysis, and problem-solving skills.
Experience with disaster recovery, scalability testing, and capacity planning.
Qualifications
Bachelor's degree in a technical discipline (Computer Science, Engineering, or related field).
5+ years of industry experience in SRE, DevOps, or related technical operations roles.
Proven experience in cloud infrastructure, automation, and application reliability engineering within large-scale, enterprise environments.
Site Reliability Engineer • Recife, Pernambuco, Brasil