Role Summary
The SRE Technical Member will:
Deliver engineering, operational, and administrative support for the application and its technology landscape.
Address reliability and operational challenges such as application failures, production issues, infrastructure performance (disk, memory), monitoring, and security.
Serve as a mid-level subject matter expert, integrating with multiple teams to develop and evolve SRE practices for Azure-based environments.
Participate in production support activities, including deployments, upgrades, and critical issue resolution.

This role is central to designing, implementing, and maintaining monitoring, alerting, and reporting solutions across servers, containers, databases, and cloud infrastructure components.

Key Responsibilities
Collaborate with Central SRE, DevOps, and InfoSec teams on new projects, platform builds, and deployments.
Contribute to the design, implementation, and operation of large-scale, Azure-based platforms.
Apply industry best practices in monitoring, alerting, reporting, and cloud architecture.
Participate in infrastructure, application, and security planning, focusing on scalability, redundancy, and data preservation.
Work with development teams to support high-availability topologies.
Produce documentation and weekly operational status reports detailing project progress and key metrics.
Provide engineering and support for technical infrastructure, cloud, databases, and application performance.
Manage incident response, change management, and user permissions following SRE best practices (Google SRE model).
Maintain close collaboration between Application, Central SRE, DevOps, InfoSec, and business units.
Assist in configuring and onboarding new applications into the Azure DevOps (ADO) platform.

Core Technical Skills
Strong understanding of SRE fundamentals: monitoring, alerting, reporting, performance, availability, and incident response.
Hands-on experience with CI/CD tools (Git, Azure Pipelines, Ansible, etc.).
Infrastructure as Code (IaC) design, scripting, and setup.
Deep knowledge of Azure Web Services: installation, configuration, and management.
Experience administering Microsoft applications (.NET, C#, Angular) with focus on automation, optimization, and security.
Proficiency in Cosmos DB and MS SQL operational tasks.
Excellent troubleshooting, root-cause analysis, and problem-solving skills.
Experience with disaster recovery, scalability testing, and capacity planning.

Qualifications
Bachelor’s degree in a technical discipline (Computer Science, Engineering, or related field).
5+ years of industry experience in SRE, DevOps, or related technical operations roles.
Proven experience in cloud infrastructure, automation, and application reliability engineering within large-scale, enterprise environments.

Site Reliability Engineer • Novo Hamburgo, Brasil