Wolters Kluwer

Senior Site Reliability Engineer

Reposted 7 Days Ago
In-Office
Pune, Maharashtra, IND
Senior level
As a Senior Site Reliability Engineer, you'll enhance platform reliability by designing monitoring solutions, automating risk reduction, and analyzing production incidents, while collaborating with various engineering teams.
The summary above was generated by AI

The role

As a Senior Site Reliability Engineer, you work close to both the code and production environments. You design and build solutions that make our platform measurable, predictable, and resilient. You ensure a high level of user satisfaction through preventative maintenance, effective troubleshooting, and the rapid resolution of complex production issues. Ensuring robust performance in a high‑stakes environment is a key responsibility of this role.

Rather than “running” systems, you engineer reliability into them. You help define and implement SLIs and SLOs, build meaningful monitoring and alerting, and automate away operational risk. You collaborate with product and platform teams to ensure reliability is treated as a core software quality.

This is not a classic DevOps or operations role. We are looking for an experienced engineer who is comfortable changing application code, designing observability as part of system architecture, and driving long-term improvements in how we build and operate software.

You will work primarily from our Pune office, while collaborating closely with the rest of the SRE team in our Dutch office. In Pune, you will also act as the first point of contact for production-related issues within our engineering organization.

You’ll join a growing Site Reliability Engineering team that is still evolving, offering significant opportunity to influence technical direction, standards, and ways of working.

What will you do?

  • Engineer reliability into a large-scale Azure SaaS platform
  • Design, implement, and continuously improve monitoring, alerting, and observability solutions
  • Define and improve SLIs, SLOs and error budgets together with engineering teams
  • Build automation to reduce operational risk and eliminate manual toil
  • Analyse incidents end-to-end and translate learnings into structural improvements
  • Perform deep debugging and optimization of production issues across application code, services and infrastructure
  • Improve how teams use metrics, logs and traces to understand system behaviour
  • Collaborate closely with software engineers, platform engineers and support teams
  • Contribute to incident response when needed, with a strong focus on learning and prevention
  • Support deployment strategies and execution
  • Provide advanced technical support to help resolve user issues

Your skills & experience

  • 5+ years of experience as a Site Reliability Engineer
  • Strong experience with monitoring, alerting and observability in production environments
  • Experience with Datadog, Grafana, Log Analytics and/or Prometheus
  • Proven ability to design and work with SLIs, SLOs and reliability metrics
  • Hands-on coding experience in production environments (preferably C#/.NET, but not required)
  • Experience building automation to improve system reliability and reduce toil
  • Experience working with Microsoft Azure (preferred) or another major public cloud provider such as AWS or GCP
  • Comfortable working with live production systems and customer data
  • Understanding of performance optimization techniques
  • Excellent communication skills, including direct interaction with users
  • Strong cross‑functional collaboration skills

Nice to have

  • Experience with distributed systems or large SaaS platforms
  • SQL knowledge and understanding of relational databases
  • Experience working in regulated or compliance-sensitive environments

Soft skills

You think like an engineer when things go wrong: curious, analytical and focused on long-term improvement. You care about why systems fail and how to prevent that failure from happening again. You question existing reliability practices, alerts and assumptions, and use data, incidents and metrics to drive measurable improvements. You enjoy collaborating across teams and see reliability as a shared engineering responsibility.

Why this role at Wolters Kluwer?

  • You work on a mission-critical SaaS platform with real customer impact
  • You influence how reliability and observability are engineered into software
  • You help shape SRE and reliability practices in a growing, evolving team
  • You get ownership, trust and space to improve how we build and operate software
  • You work in an open, pragmatic, international and engineering-driven culture

Our Interview Practices

To maintain a fair and genuine hiring process, we kindly ask that all candidates participate in interviews without the assistance of AI tools or external prompts. Our interview process is designed to assess your individual skills, experiences, and communication style. We value authenticity and want to ensure we’re getting to know you, not a digital assistant. To help maintain this integrity, we ask candidates to remove virtual backgrounds, and we include in-person interviews in our hiring process. Please note that use of AI-generated responses or third-party support during interviews will be grounds for disqualification from the recruitment process.

Applicants may be required to appear onsite at a Wolters Kluwer office as part of the recruitment process.
