Qualys

Director, Site Reliability Engineer

Sorry, this job was removed at 08:09 a.m. (IST) on Wednesday, May 21, 2025
Pune, Maharashtra

Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!

Qualys’ site reliability engineering (SRE) team supports all Qualys products across all our production environments, including our 11 global multi-tenant platforms and over 90 on-premises setups. Effective incident management is a big part of our SRE efforts to minimize the disruption of an incident and restore normal business operations as quickly as possible.

We are seeking a highly motivated and talented Director, Site Reliability Engineering to lead our SRE team, which works on a 24/7 rotation. In this role, you will lead a group that responds proactively to alerts and is accountable for the efficiency and effectiveness of service delivery over the life cycle of an incident, for deploying applications in production, for automating those deployments, and for keeping the production environments stable.

We are looking for an individual who believes in SRE principles, has a software engineering mindset, and wants to be part of an organization that is transforming itself to be more agile and nimble operationally.

Responsibilities

Ensure effective performance and 24x7 availability of all production systems.

Apply a strong understanding of industry best practices for Site Reliability Engineering and ops automation.

Proactively implement and improve the automation of application tasks.

Apply knowledge of system performance, testing, and programming to monitor, measure, and optimize system and application performance.

Work with other SRE leaders to set the enterprise strategy for designing and developing resiliency in the application code.

Work closely with Product Management, and partner with Sales and architect teams.

Track record of success in delivering quality products from concept to launch.

Monitor alerts coming out of all Qualys platforms, and coordinate with Operations/SRE/DBRE/Engineering teams as necessary to take preventive or corrective action to resolve any incidents, with the goal of minimizing MTTR.

Put in place and manage an effective on-call rotation within the team.

Work with engineering teams to set up proper monitoring and alerting thresholds across all Qualys services and applications so that the SRE team can focus on the key areas needed to stabilize the platforms.

Accountability for platform uptime SLAs.

Desired Skills

15 or more years of experience working in application support or Site Reliability Engineering.

Experience in a leadership role on a development or engineering team.

Strong prior production operations experience leading a first responder incident management team for a high-traffic platform.

Experience with CI/CD pipelines to automate the software delivery process.

Knowledge of cloud platform products and services; strong skills in developing cloud solutions and deploying applications on cloud platforms.

Solid exposure to monitoring tools such as Prometheus, ELK, Kibana, AppDynamics, Splunk, Grafana, etc.

Strong experience using Kubernetes, Jenkins, and Terraform templates.

Strong experience with application capacity sizing.

Good experience in configuring and managing on-call and alerting platforms such as PagerDuty.

Comfortable working in a dynamic environment, with the ability to coordinate multiple tasks simultaneously.

Strong verbal and written communication skills are essential, as is the ability to work in a disciplined manner and to remain composed under pressure.

Obtain and exhibit expert knowledge of Qualys’ infrastructure, monitoring, and its products and services.

Coordinate with the incident management team to produce weekly reports and dashboards for various products that clearly show, backed by data, any areas that need improvement.

Must have a strong passion for continuous improvement.

Qualys Pune, Mahārāshtra, IND Office

Survey No. 20, 10th to 16th Floor, Tower B Panchshil Business Park, Balewadi, Pune, Maharashtra, India, 411045

Qualys Shivaji Nagar, Maharashtra, IND Office

Survey No. 20, 10th to 16th Floor, Tower B Panchshil Business Park, Shivaji Nagar, 411005, India
