Wolters Kluwer

Senior Site Reliability Engineer

Posted 7 Hours Ago
In-Office
Pune, Maharashtra, IND
Senior level
As a Senior Site Reliability Engineer, you'll enhance platform reliability by designing monitoring solutions, automating risk reduction, and analyzing production incidents, while collaborating with various engineering teams.
The summary above was generated by AI

The role

As a Senior Site Reliability Engineer, you work close to both the code and production environments. You design and build solutions that make our platform measurable, predictable, and resilient. You ensure a high level of user satisfaction through preventative maintenance, effective troubleshooting, and the rapid resolution of complex production issues. Ensuring robust performance in a high‑stakes environment is a key responsibility of this role.

Rather than “running” systems, you engineer reliability into them. You help define and implement SLIs and SLOs, build meaningful monitoring and alerting, and automate away operational risk. You collaborate with product and platform teams to ensure reliability is treated as a core software quality.

This is not a classic DevOps or operations role. We are looking for an experienced engineer who is comfortable changing application code, designing observability as part of system architecture, and driving long-term improvements in how we build and operate software.

You will work primarily from our Pune office, while collaborating closely with the rest of the SRE team in our Dutch office. In Pune, you will also act as the first point of contact for production-related issues within our engineering organization.

You’ll join a growing Site Reliability Engineering team that is still evolving, offering significant opportunity to influence technical direction, standards, and ways of working.

What will you do?

  • Engineer reliability into a large-scale Azure SaaS platform
  • Design, implement, and continuously improve monitoring, alerting, and observability solutions
  • Define and improve SLIs, SLOs and error budgets together with engineering teams
  • Build automation to reduce operational risk and eliminate manual toil
  • Analyse incidents end-to-end and translate learnings into structural improvements
  • Perform deep debugging and optimization of production issues across application code, services and infrastructure
  • Improve how teams use metrics, logs and traces to understand system behaviour
  • Collaborate closely with software engineers, platform engineers and support teams
  • Contribute to incident response when needed, with a strong focus on learning and prevention
  • Support deployment strategies and execution
  • Provide advanced technical support to help resolve user issues

Your skills & experience

  • 5+ years of experience as a Site Reliability Engineer
  • Strong experience with monitoring, alerting and observability in production environments
  • Experience with Datadog, Grafana, Log Analytics and/or Prometheus
  • Proven ability to design and work with SLIs, SLOs and reliability metrics
  • Hands-on coding experience in production environments (preferably C#/.NET, but not required)
  • Experience building automation to improve system reliability and reduce toil
  • Experience working with Microsoft Azure (preferred) or another major public cloud provider such as AWS or GCP
  • Comfortable working with live production systems and customer data
  • Understanding of performance optimization techniques
  • Excellent communication skills, including direct interaction with users
  • Strong cross‑functional collaboration skills

Nice to have

  • Experience with distributed systems or large SaaS platforms
  • SQL knowledge and understanding of relational databases
  • Experience working in regulated or compliance-sensitive environments

Soft skills

You think like an engineer when things go wrong: curious, analytical and focused on long-term improvement. You care about why systems fail and how to prevent that failure from happening again. You question existing reliability practices, alerts and assumptions, and use data, incidents and metrics to drive measurable improvements. You enjoy collaborating across teams and see reliability as a shared engineering responsibility.

Why this role at Wolters Kluwer?

  • You work on a mission-critical SaaS platform with real customer impact
  • You influence how reliability and observability are engineered into software
  • You help shape SRE and reliability practices in a growing, evolving team
  • You get ownership, trust and space to improve how we build and operate software
  • You work in an open, pragmatic, international and engineering-driven culture

Our Interview Practices

To maintain a fair and genuine hiring process, we kindly ask that all candidates participate in interviews without the assistance of AI tools or external prompts. Our interview process is designed to assess your individual skills, experiences, and communication style. We value authenticity and want to ensure we’re getting to know you, not a digital assistant. To help maintain this integrity, we ask candidates to remove virtual backgrounds, and we include in-person interviews in our hiring process. Please note that use of AI-generated responses or third-party support during interviews will be grounds for disqualification from the recruitment process.

Applicants may be required to appear onsite at a Wolters Kluwer office as part of the recruitment process.

Top Skills

.NET
Azure
C#
Datadog
Grafana
Log Analytics
Prometheus
