Backblaze

Sr. Manager, Operations (NOC + SRE)

Reposted 10 Days Ago
Remote
Hiring Remotely in India
Senior level
The Sr. Manager, Operations leads the 24x7 NOC and SRE teams, ensuring service continuity, incident resolution, and operational excellence while integrating SRE practices and improving service quality.

Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets, unburden administrators, and unleash innovators. Together with our partners, we’re helping customers break free from the restrictive, overpriced legacy solutions that hold them back, and blaze forward with the full power of the open cloud in their hands.

Founded in 2007, we scaled the business with less than $3 million in outside funding until 2021, when we did a traditional IPO on the Nasdaq stock exchange. Today, Backblaze generates over $100M in revenue and is the leading specialized storage cloud, managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries, including businesses, developers, IT professionals, and individuals.
But while there is a lot to celebrate in our past, there is almost as much opportunity ahead of us. We are seeking a Sr. Manager, Operations (NOC + SRE) to join our team!

What You’ll Do: 

The Senior Manager, Operations is responsible for leading the 24x7 Network Operations Center (NOC) and Site Reliability Engineering (SRE) teams in the Backblaze India office. This role ensures service continuity, system reliability, observability, and operational excellence across a diverse client infrastructure portfolio. It is critical to delivering high-availability services through rapid incident resolution, automation, and measurable performance improvements.

Key Responsibilities

Operational Management

  • Lead and develop a 24x7 NOC team to monitor, triage, and resolve incidents across customer environments (network, server, cloud, and security systems).
  • Oversee daily operations including alert response, incident escalation, service reporting, and SLA adherence.
  • Manage shift schedules, on-call rotations, escalation policies, and team performance reviews.

Site Reliability Engineering (SRE) Integration

  • Champion and implement SRE practices such as SLIs/SLOs, error budgets, reliability scorecards, and toil reduction strategies.
  • Drive automation and tool development to reduce manual work and improve response times.
  • Establish observability practices using metrics, logs, traces, and health checks for proactive issue identification.
  • Collaborate with Engineering to embed reliability, scalability, and fault tolerance into client solutions.

Service Quality & Improvement

  • Conduct root cause analysis (RCA) and lead post-incident reviews (PIRs) to prevent recurrence and drive continuous improvement.
  • Own the monitoring, incident, and change management frameworks based on ITIL and DevOps best practices.
  • Define and track key performance indicators (KPIs) such as uptime, MTTR, first contact resolution, SLO compliance, and automation coverage.
  • Ensure accurate and timely client communication during service-impacting events.

Client and Stakeholder Engagement

  • Partner with Engineering, Service Delivery, and Account Management teams to support operational onboarding and ongoing service support.
  • Serve as a technical and operational escalation point for high-priority issues and executive briefings.
  • Support pre-sales activities by providing input on operational readiness and service reliability.

The Right Fit:

  • Must be located in Bangalore.
  • Bachelor’s degree in Computer Science, Engineering, or related field—or equivalent hands-on experience.
  • 8+ years of experience in IT operations or infrastructure support, with at least 3 years in a leadership role within an MSP or SaaS environment.
  • Proven experience managing NOC operations and applying SRE practices to improve system availability and reduce manual operations.
  • Strong knowledge of networking (BGP, VPN, SD-WAN), server infrastructure (Linux), public cloud platforms, and automation frameworks.
  • Experience with monitoring and incident management tools such as Zabbix, Prometheus, Grafana, Jira, and FireHydrant.
  • ITIL Foundation certification and/or demonstrated experience with Incident, Problem, and Change Management processes.

Preferred:

  • ITIL Foundation certification or higher.
  • Experience with remote infrastructure management.
  • Exposure to compliance standards (SOC 2, HIPAA, etc.).
  • Knowledge of automation, scripting, or orchestration technologies.

Work Environment

  • Operates within a 24x7 delivery model, with rotating on-call responsibilities and potential support for critical incident response outside business hours.
  • Remote flexibility, with occasional travel to the corporate office, client sites, or datacenters.

At this point, we hope you're feeling excited about the job description you're reading. Even if you don't meet every requirement, we still encourage you to apply. Learning, developing, and growing are key parts of our culture. We're eager to meet people who believe in our mission and can contribute to our team in various ways. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.

At Backblaze, we value being fair and good to our customers, partners, and employees. That’s why diversity, equity, and inclusion are at the core of our values. We are committed to fostering a workforce where all employees feel a sense of belonging regardless of race, ethnicity, nationality, gender, sexual orientation, age, religion, socio-economic status, ability, veteran status, and education. We believe that our dedication to cultivating a diverse workspace not only allows us to better serve our customers in over 175 countries, but further reinforces our commitment to doing the right thing. We are proud to be an Equal Opportunity Employer.

To understand more about the data we collect and process as part of your application, please view our Backblaze Employee Privacy Notice.


Top Skills

FireHydrant
Grafana
Jira
Linux
Prometheus
Zabbix


