Senior Site Reliability Engineer

Wolters Kluwer

Reposted 7 Days Ago

In-Office

Pune, Maharashtra, IND

Senior level

As a Senior Site Reliability Engineer, you'll enhance platform reliability by designing monitoring solutions, automating risk reduction, and analyzing production incidents, while collaborating with various engineering teams.

The summary above was generated by AI

The role

As a Senior Site Reliability Engineer, you work close to both the code and production environments. You design and build solutions that make our platform measurable, predictable, and resilient. You ensure a high level of user satisfaction through preventative maintenance, effective troubleshooting, and the rapid resolution of complex production issues. Ensuring robust performance in a high‑stakes environment is a key responsibility of this role.

Rather than “running” systems, you engineer reliability into them. You help define and implement SLIs and SLOs, build meaningful monitoring and alerting, and automate away operational risk. You collaborate with product and platform teams to ensure reliability is treated as a core software quality.

This is not a classic DevOps or operations role. We are looking for an experienced engineer who is comfortable changing application code, designing observability as part of system architecture, and driving long-term improvements in how we build and operate software.

You will work primarily from our Pune office, while collaborating closely with the rest of the SRE team in our Dutch office. In Pune, you will also act as the first point of contact for production-related issues within our engineering organization.

You’ll join a growing Site Reliability Engineering team that is still evolving, offering significant opportunity to influence technical direction, standards, and ways of working.

What will you do?

Engineer reliability into a large-scale Azure SaaS platform
Design, implement, and continuously improve monitoring, alerting, and observability solutions
Define and improve SLIs, SLOs and error budgets together with engineering teams
Build automation to reduce operational risk and eliminate manual toil
Analyse incidents end-to-end and translate learnings into structural improvements
Perform deep debugging and optimization of production issues across application code, services and infrastructure
Improve how teams use metrics, logs and traces to understand system behaviour
Collaborate closely with software engineers, platform engineers and support teams
Contribute to incident response when needed, with a strong focus on learning and prevention
Support deployment strategies and execution
Provide advanced technical support to help resolve user issues

Your skills & experience

5+ years of experience as a Site Reliability Engineer
Strong experience with monitoring, alerting and observability in production environments
Experience with Datadog, Grafana, Log Analytics and/or Prometheus
Proven ability to design and work with SLIs, SLOs and reliability metrics
Hands-on coding experience in production environments (preferably C#/.NET, but not required)
Experience building automation to improve system reliability and reduce toil
Experience working with Microsoft Azure (preferred) or another major public cloud provider such as AWS or GCP
Comfortable working with live production systems and customer data
Understanding of performance optimization techniques
Excellent communication skills, including direct interaction with users
Strong cross‑functional collaboration skills

Nice to have

Experience with distributed systems or large SaaS platforms
SQL knowledge and understanding of relational databases
Experience working in regulated or compliance-sensitive environments

Soft skills

You think like an engineer when things go wrong: curious, analytical and focused on long-term improvement. You care about why systems fail and how to prevent that failure from happening again. You question existing reliability practices, alerts and assumptions, and use data, incidents and metrics to drive measurable improvements. You enjoy collaborating across teams and see reliability as a shared engineering responsibility.

Why this role at Wolters Kluwer?

You work on a mission-critical SaaS platform with real customer impact
You influence how reliability and observability are engineered into software
You help shape SRE and reliability practices in a growing, evolving team
You get ownership, trust and space to improve how we build and operate software
You work in an open, pragmatic, international and engineering-driven culture

Our Interview Practices

To maintain a fair and genuine hiring process, we kindly ask that all candidates participate in interviews without the assistance of AI tools or external prompts. Our interview process is designed to assess your individual skills, experiences, and communication style. We value authenticity and want to ensure we’re getting to know you, not a digital assistant. To help maintain this integrity, we ask candidates to remove virtual backgrounds, and we include in-person interviews in our hiring process. Please note that use of AI-generated responses or third-party support during interviews will be grounds for disqualification from the recruitment process.

Applicants may be required to appear onsite at a Wolters Kluwer office as part of the recruitment process.
