Veeam

Staff Site Reliability Engineer

Posted 3 Days Ago
In-Office
Pune, Maharashtra
Senior level
As a Staff Site Reliability Engineer at Veeam, you'll lead SRE initiatives, mentor engineers, collaborate on resilient architecture, and drive observability practices.
The summary above was generated by AI

Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world’s biggest brands. The future of data resilience is here - go fearlessly forward with us.

We are looking for a Staff Site Reliability Engineer. In this role, you will serve as a hands-on technical leader within the SRE team, guiding senior engineers, influencing product development teams, and ensuring the systems we operate are built to be reliable, scalable, and observable from the ground up.

You will drive strategic initiatives, mentor others in the practice of SRE, and help define architectural best practices across our platform. This role is pivotal in aligning teams, enforcing high standards, and scaling SRE principles globally within Veeam.

Your tasks will include:
Reliability Engineering & Resilience:
  • Act as a technical authority in your area, mentoring senior engineers and guiding design choices that improve service reliability and resilience
  • Lead the definition and enforcement of SLIs, SLOs, and error budgets; drive adherence across engineering teams
  • Collaborate with Staff peers across teams to align strategy and champion shared reliability standards and goals
  • Partner with development and product teams to proactively design for failure, build resilient architecture, and operationalize reliability from the start
Observability & Operational Excellence:
  • Drive company-wide adoption of observability best practices and tooling
  • Ensure metrics, logs, and traces provide deep, actionable insights across systems
  • Lead complex incident responses, postmortems, and systemic reliability improvements
  • Promote and enforce a blameless culture of learning and continuous improvement
Engineering at Scale:
  • Lead initiatives in infrastructure as code, deployment automation, and resilience testing
  • Influence the development and adoption of chaos engineering practices and release validation frameworks
  • Partner with platform and security teams to ensure production readiness
Collaboration & Culture:
  • Work closely with your peer Staff Engineers to plan, align, and deliver against reliability goals
  • Provide architectural guidance and advocate for engineering rigor and consistency
  • Represent the SRE team in technical leadership forums and product planning discussions
What we expect from you:
  • 8+ years of experience in a Software Engineering or SRE role, including technical leadership
  • Demonstrated experience mentoring and guiding senior engineers
  • Deep expertise in building distributed systems on public cloud (Azure preferred)
  • Strong programming skills (e.g., JavaScript, Go, TypeScript, Java, or C#)
  • Hands-on experience with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry)
  • Mastery of infrastructure automation tools (Terraform, Pulumi) and container orchestration (Kubernetes)
  • Ability to communicate clearly across geographies and disciplines
Will be an advantage:
  • Experience leading SRE initiatives across multiple product teams
  • Background in chaos engineering, incident learning, or performance and load testing
  • Familiarity with global compliance standards (ISO, SOC 2, GDPR, FedRAMP, CMMC)
We offer:
  • Family Medical Insurance
  • Annual flexible spending allowance for health and well-being
  • Life insurance
  • Personal accident insurance
  • Employee Assistance Program
  • A comprehensive leave package, including parental leave
  • Meal Benefit Pass
  • Transportation Allowance
  • Daycare/Child care Allowance
  • Veeam Care Days – additional 24 hours for your volunteering activities
  • Professional training and education, including courses and workshops, internal meetups, and unlimited access to our online learning platforms (Percipio, Athena, O’Reilly) and mentoring through our MentorLab program

Please note: If the applicant is permanently located outside India, Veeam reserves the right to decline the application.


Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.  

The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. 

By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.
By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from consideration for employment or, if discovered after employment begins, termination of employment.

Top Skills

Azure
C#
Go
Grafana
Java
JavaScript
Kubernetes
OpenTelemetry
Prometheus
Pulumi
Terraform
TypeScript
