Mitratech

SRE Manager

Reposted 3 Days Ago

Remote

Hiring Remotely in India

Senior level

The SRE Manager will lead DevOps and SRE teams, manage cloud infrastructure, optimize CI/CD pipelines, ensure system reliability, and promote operational best practices.

The summary above was generated by AI

At Mitratech, we are a team of technocrats focused on building world-class products that simplify operations in the Legal, Risk, Compliance, and HR functions. We are a close-knit, globally dispersed team that thrives in an ecosystem that supports individual excellence and takes pride in its diverse and inclusive work culture centered around great people practices, learning opportunities, and having fun! Our culture is the ideal blend of entrepreneurial spirit and enterprise investment, enabling the chance to move at a rapid pace with some of the most complex, leading-edge technologies available.

For over 35 years, the experts at Mitratech have focused on solving complex needs. Today, we serve 20,000 client companies of all sizes globally, including 30% of the Fortune 500, with over 500,000 users in over 160 countries.

As we continue to grow, we’re always looking for resourceful, enthusiastic people with fresh perspectives. Join our global team and see what makes Mitratech a truly exceptional place to work!

Job Overview:

We are looking for an experienced and passionate DevOps & SRE Manager to lead our DevOps and Site Reliability Engineering teams. The ideal candidate will be responsible for building and maintaining scalable, reliable, and high-performing infrastructure and operational processes. As a DevOps & SRE Manager, you will play a key role in ensuring our development, deployment, and operational practices align with industry standards while fostering a culture of automation and continuous improvement.

Key Responsibilities:

Leadership & Team Management:

Lead, mentor, and develop a team of DevOps engineers and SREs to drive innovation and operational excellence.
Build a collaborative and inclusive team culture focused on delivering high-quality services.
Establish and track goals for your team to align with business objectives.

Infrastructure Automation & Scalability:

Design, implement, and manage highly available and scalable cloud infrastructure (AWS, Azure, or OCI). OCI experience is preferred.
Oversee the implementation of Infrastructure as Code (IaC) tools (e.g., Terraform, Bicep, Ansible) to automate provisioning and configuration.
Identify and address bottlenecks in deployment pipelines and infrastructure performance.

Site Reliability Engineering:

Lead efforts to define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
Drive incident management processes to quickly detect, mitigate, and resolve issues, and ensure post-mortem analyses are carried out for continuous improvement.
Optimize and enhance monitoring, logging, and alerting systems (e.g., Datadog, Splunk, Prometheus, Grafana, ELK stack).

Continuous Integration and Continuous Deployment (CI/CD):

Establish and refine CI/CD pipelines to ensure smooth software releases with minimal or zero downtime.
Collaborate with development teams to implement DevOps best practices and ensure code quality, security, and performance.

Security & Compliance:

Implement and oversee security best practices in DevOps and operational workflows, including secrets management, vulnerability scans, and automated patching.
Ensure compliance with relevant regulations and standards (e.g., SOC 2, ISO 27001).

Collaboration & Communication:

Work cross-functionally with product, engineering, and operations teams to ensure alignment on goals and priorities.
Provide regular updates to stakeholders on system health, incidents, and improvement initiatives.

Cost Optimization:

Analyze cloud and infrastructure costs, identify opportunities for savings, and implement cost optimization strategies.
Manage budgets and vendor relationships for tools and services used by the team.

Qualifications:

Education:

Bachelor’s degree in Computer Science, Engineering, or a related field. A Master’s degree is a plus.

Experience:

Proven experience managing DevOps or SRE teams in fast-paced environments.
Hands-on expertise in cloud platforms (AWS, Azure, OCI, GCP) and containerization technologies (Docker, Kubernetes).
Deep understanding of the software development lifecycle (SDLC) and Agile practices.
Track record of driving operational efficiency, incident resolution, and automation.

Technical Skills:

Expertise in CI/CD tools (e.g., Jenkins, CircleCI, Azure DevOps).
Experience operating on Kubernetes platforms such as AKS, EKS, or similar.
Experience using managed languages such as Python, Go, C#, Java, or similar.
Experience designing tooling to simplify the operational management of SaaS/PaaS systems.
Experience with monitoring and observability tools (e.g., Prometheus, Splunk, New Relic, Datadog, ELK Stack).
Strong knowledge of infrastructure-as-code tools (e.g., Terraform, Bicep, CloudFormation).

Soft Skills:

Excellent leadership and people management abilities.
Strong problem-solving skills and attention to detail.
Exceptional communication skills to collaborate across teams and with stakeholders.

We are an equal-opportunity employer that values diversity at all levels. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, age, sexual orientation, gender identity, disability, or veteran status.

Top Skills

Ansible

AWS

Azure

Azure DevOps

Bicep

CircleCI

Datadog

Docker

ELK Stack

Grafana

Java

Jenkins

Kubernetes

OCI

Prometheus

Python

Splunk

Terraform
