Datavail

Technical Support Manager - Cloud SRE

Posted Yesterday
Hybrid
Mumbai, Maharashtra
Expert/Leader
The Technical Support Manager will lead SRE operations in cloud environments, manage support teams, ensure SLAs, handle high-priority incidents, and drive reliability through SRE practices.
The summary above was generated by AI

Job Title: Technical Support Manager SRE (Cloud Managed Services)

Education: Any Graduate

Experience: 12+ years

Location: Mumbai

 

Job Description: 

 

Role Overview:

We are seeking an experienced SRE Support Manager to lead multi-cloud managed services support operations across Amazon Web Services, Microsoft Azure, and Google Cloud environments. This role will be responsible for ensuring platform reliability, operational excellence, SLA governance, and customer satisfaction while managing Level 1 and Level 2 SRE engineers and collaborating with Level 3 engineering teams.

The ideal candidate combines strong people leadership, customer management, and cloud operations expertise with a deep understanding of Site Reliability Engineering practices, including SLIs, SLOs, SLAs, error budgets, observability, automation, and incident management.

 

Experience Required:

12+ years of overall experience, with 3+ years in a team leadership, support management, or SRE management role.

 

Key Responsibilities:

 

Team Leadership & Support Operations:

  • Lead, mentor, and develop Level 1 and Level 2 SRE Support Engineers. 

  • Manage 24x7 support coverage, shift planning, workforce utilization, and operational readiness. 

  • Establish clear escalation matrices and support ownership models. 

  • Drive upskilling across cloud technologies, troubleshooting, and SRE practices.

 

Customer & Service Delivery Management:

  • Manage support delivery for multiple enterprise managed services customers. 

  • Understand customer expectations, business priorities, and critical workloads. 

  • Act as the senior escalation point for high-priority incidents and service concerns.

  • Ensure proactive communication during outages, incidents, and service requests. 

 

Reliability Engineering & SRE Governance:

  • Define and monitor Service Level Indicators (SLIs) for availability, latency, error rates, throughput, and ticket responsiveness. 

  • Establish and govern Service Level Objectives (SLOs) aligned to customer needs. 

  • Manage Error Budgets and balance reliability with speed of change. 

  • Improve operational reliability through automation, standardization, and continuous improvement. 

  • Reduce toil and repetitive manual support tasks.

 

Incident / Problem / Change Management:

  • Lead major incident management bridges and restoration activities. 

  • Coordinate with Level 3 teams, cloud vendors, and customer stakeholders. 

  • Drive Root Cause Analysis (RCA) and corrective and preventive actions.

  • Ensure controlled execution of change management, patching, releases, and maintenance. 

 

SLA / KPI / Reporting:

  • Track contractual SLAs, operational KPIs, MTTR, MTTD, ticket aging, and backlog health. 

  • Publish weekly/monthly service review dashboards. 

  • Highlight risks, recurring issues, and improvement opportunities. 

  • Ensure audit readiness and governance compliance. 

 

Multi-Cloud Platform Management:

  • Oversee customer workloads on:

      • Amazon Web Services - EC2, RDS, EKS, Lambda, IAM, VPC, CloudWatch

      • Microsoft Azure - Azure VM, AKS, Azure SQL, VNets, Monitor, Defender

      • Google Cloud - Compute Engine, GKE, Cloud SQL, IAM, Operations Suite

 

Required Technical Skills:

 

Cloud & Infrastructure

  • Strong hands-on experience with one or more cloud platforms: Amazon Web Services / Microsoft Azure / Google Cloud

  • Good understanding of compute, storage, networking, IAM, backup, DR, and security controls. 

  • Experience with Linux and/or Windows server administration. 

  • Knowledge of containers and orchestration platforms such as Kubernetes / Docker. 

 

SRE & Reliability Engineering

  • Strong knowledge of SRE principles and best practices. 

  • Experience designing and tracking SLI, SLO, and SLA frameworks.

  • Practical understanding of Error Budget policy management. 

  • Expertise in incident response, on-call operations, postmortems, and resilience engineering. 

  • Familiarity with capacity planning, availability engineering, and performance optimization. 

 

Monitoring / Observability

  • Hands-on experience with:

      • Amazon CloudWatch

      • Azure Monitor

      • Google Cloud Operations Suite

      • Datadog

      • Grafana

      • Prometheus

 

Automation / DevOps

  • Experience with scripting: Python / Bash / PowerShell. 

  • Infrastructure as Code using Terraform or similar. 

  • CI/CD exposure using GitHub Actions, Jenkins, or similar tools. 

 

Leadership Skills

  • Proven experience managing technical support or SRE operations teams. 

  • Strong customer-facing communication skills. 

  • Ability to manage escalations under pressure. 

  • Strong decision-making and stakeholder management skills. 

 

Preferred Qualifications

  • ITIL Foundation / ITSM knowledge. 

  • AWS / Azure / GCP certifications. 

  • Experience in Managed Services / MSP environment. 

  • Experience leading 24x7 global support teams. 

 

Success Metrics

  • SLA / SLO attainment 

  • Error budget compliance 

  • MTTR reduction 

  • Service availability improvement 

  • Customer satisfaction (CSAT) 

  • Ticket backlog health 

  • Automation delivered 

  • Team productivity and retention 

 

About Us

Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leading technologies. For more than 17 years, Datavail has worked with thousands of companies spanning different industries and sizes, and is an AWS Advanced Tier Consulting Partner, a Microsoft Solutions Partner for Data & AI and Digital & App Innovation (Azure), an Oracle Partner, and a MySQL Partner.

About the Team
Datavail’s Team of Cloud Experts Can Save You Time and Money
Our Cloud experts can overcome every obstacle in helping clients manage everything from databases, analytics, reporting, migrations, and upgrades to monitoring and overall data management.
You can free up your IT resources to focus on growing your business rather than fighting fires. Our Cloud experts can guide you through strategic initiatives or support routine database management.
Cloud Managed Services
Datavail’s business focuses on helping you use your data to drive business results through cost-saving services. The success of your business depends on how well you understand and manage your data. Our managed cloud services give you the power to unleash your organization’s potential. We provide comprehensive and technically advanced support for Cloud Operations to ensure that your infrastructure is safe, secure, and managed with the utmost level of care.
Our delivery performance in data management leads the industry. We offer highly trained Cloud administrators via a 24×7, always on, always available, global delivery model.
The combination of a proven delivery model and top-notch experience ensures that Datavail will remain the on-demand Cloud experts you desire. Datavail’s flexible, client-focused services always add value to your organization.
