Context
We are a fast-growing startup that has achieved 8x revenue growth in the last 18 months and a 10x usage increase over the same period.
As we continue this upward trajectory, we are expanding our team to ensure our products remain reliable, intuitive, and delightful for our ever-growing user base.
Mission
As a DevOps / Infrastructure team member, your primary mission will be to help ensure that our infrastructure is scalable, reliable, and cost-effective, supporting the company's rapid growth and evolving needs.
You will play a critical role in both day-to-day operations and long-term strategic planning, helping shape our platform's future.
Key Responsibilities :
Infrastructure Scaling and Stability : Contribute to the continuous scaling of our infrastructure to handle increasing loads while maintaining stability and performance.
This includes designing and implementing robust and scalable architectures, automating deployment processes, and optimizing resource allocation.
Performance Optimization : You will continuously monitor and analyze system performance, identifying bottlenecks and areas for improvement.
You will also work on optimizing our infrastructure and applications to ensure low-latency, high-throughput operation, particularly for video and audio processing.
Cost Management : Implement strategies to optimize infrastructure costs without compromising performance.
This includes rightsizing resources, automating scaling policies, and leveraging cloud provider cost-saving mechanisms.
Enhancing Observability : Improve our monitoring and observability capabilities, ensuring we have comprehensive visibility into the health and performance of our infrastructure.
You will develop and maintain dashboards, alerts, and logs that provide actionable insights for the entire engineering team.
Collaboration and Support : Work closely with developers and other stakeholders to ensure seamless integration between infrastructure and applications.
Provide guidance and support on infrastructure-related topics, fostering a culture of shared responsibility for the system's reliability.
Security and Compliance : Contribute to the security and compliance of our infrastructure by implementing best practices and staying up-to-date with industry standards.
Innovation and Continuous Improvement : Stay abreast of industry trends and technologies and proactively suggest and implement improvements.
You will have the opportunity to experiment with new tools and methodologies, driving innovation within the team.
Who We Are Looking For
We seek a highly skilled and experienced DevOps / Infrastructure Engineer who is passionate about building and maintaining scalable, reliable, and efficient infrastructure.
You should have a strong background in managing complex environments across cloud providers and bare-metal servers and be comfortable working in high-performance environments, proactively identifying and resolving potential issues before they impact the system.
Key Qualifications :
Kubernetes Mastery : Proven experience designing, deploying, and managing Kubernetes clusters in production environments.
Multi-Cloud and Bare Metal Expertise : Deep understanding of at least one primary cloud provider (AWS, GCP, or Azure) and experience managing infrastructure on bare-metal servers.
You should be familiar with each environment's unique challenges and opportunities and capable of leveraging their strengths effectively.
Infrastructure at Scale : Demonstrated ability to architect and manage infrastructure that can scale horizontally and vertically, supporting a growing user base and increasing traffic.
Experience with distributed systems, load balancing, and automated scaling strategies is essential.
Cost Optimization & Resource Management : Strong focus on optimizing infrastructure to balance performance and cost.
You should be proficient in tools and techniques for monitoring resource usage, identifying inefficiencies, and implementing cost-saving measures.
Programming and Automation : Proficiency in scripting and automation (e.g., Bash, Python).
Monitoring and Observability : You should have a deep understanding of best practices for monitoring, logging, and observability.
Experience with tools such as Prometheus, Grafana, ELK stack, or similar systems is crucial.
You should also be able to design systems that provide comprehensive insights into infrastructure health and performance.
Problem-Solving Mindset : You should thrive on solving complex problems and constantly look for ways to improve system reliability, performance, and efficiency.
Strong analytical skills and a methodical approach to troubleshooting are essential.
Collaboration and Communication : Excellent written and verbal communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders.
You should be comfortable collaborating across teams to ensure that infrastructure decisions support broader company goals.
Bonus Qualifications :
Experience with Media Processing : Familiarity with video and audio processing, streaming, or related technologies is a strong advantage.
This includes understanding the performance implications and resource requirements of media workloads.
Familiarity with the Node.js environment and applications at scale is a strong advantage.
Familiarity with infrastructure-as-code and configuration management tools like Terraform, Ansible, or Puppet is a plus.
Engineer • Bauru, São Paulo, Brasil