Zendesk

Information Technology Lead

In-Office
Pune, Maharashtra
Mid level
As an Observability & Monitoring Engineer, you'll enhance system reliability through telemetry, dashboards, automation, and collaboration with incident management, focusing on actionable insights and cost optimization.
The summary above was generated by AI
Job Description

Job Title: Observability & Monitoring Engineer
Location: India
Department: Employee Services Technology & Operations (ESTO) – ITSM & Service Operations

Love making complex systems feel simple and reliable? We’re looking for an Observability & Monitoring Engineer who is equal parts builder and detective—someone who instruments services end-to-end, shines a light on blind spots, and turns noise into actionable signals. You’ll help us evolve a modern RunOps capability that improves reliability, reduces toil, and elevates the employee experience across Zendesk.

About Zendesk

At Zendesk, we believe outstanding customer and employee experiences start with great service and resilient platforms. We lead with empathy, innovate with purpose, and celebrate diversity and inclusion in everything we do. Join our global team and help us build an observability practice that others want to copy.

The Role

You will design and operate the telemetry backbone for our internal platforms and business-critical applications. This role spans metrics, logs, traces, synthetics, RUM, and event correlation—instrumenting services, building dashboards, tuning alerts, and partnering with Incident/Problem/Change to drive measurable reliability outcomes.

What You’ll Do
  • Design the observability stack: Define and implement standards for metrics, logs, traces, and profiling (e.g., OpenTelemetry collectors, exporters, and context propagation).

  • Instrument what matters: Establish golden signals, SLIs/SLOs, and health checks for priority services; automate baselining and anomaly detection.

  • Build actionable visibility: Create executive and on-call views (dashboards, service health, dependency maps) for Apps, Network, Collaboration tools, HRIS, and integrations.

  • Engineer signal > noise: Develop alerting policy as code; reduce false positives; implement suppression, deduplication, and auto-remediation runbooks.

  • Partner in operations: Work hand-in-hand with Incident & Problem Management to accelerate triage, cut MTTR, and drive durable RCAs and prevention actions.

  • Integrate the ecosystem: Connect observability to CI/CD, feature flags, incident tooling, CMDB/service catalog, and collaboration channels (Slack/Zoom).

  • Champion reliability culture: Coach product and platform teams on instrumentation patterns, trace context, and SLO thinking; contribute reusable modules/templates.

  • Continuously improve: Lead telemetry hygiene initiatives, cost/usage optimization of monitoring platforms, and performance tuning across tiers.

  • Security & compliance: Ensure monitoring data is handled per policy; implement role-based access and guardrails for sensitive logs/metrics.
     

What You Bring
  • Experience: 4–8 years in Observability/SRE/Platform/Monitoring roles supporting SaaS or enterprise applications.

  • Telemetry tools: Hands-on experience with monitoring and logging tools.

  • Tracing & metrics: Strong grasp of distributed tracing, RED/USE/golden signals, SLI/SLO/SLA, and error budgets.

  • Automation & code: Proficient in common languages such as Python.

  • Cloud & platforms: Experience with AWS.

  • ITSM fluency: Comfortable operating within Incident/Problem/Change frameworks; adept at runbooks, RCAs, and post-incident reviews.

  • Data mindset: SQL or log query languages; can translate telemetry into insights and narratives.

  • Soft skills: Clear communicator, collaborative partner, bias to action, and calm during outages.
     

Nice to Have
  • Service maps/dependency modeling, synthetic/RUM design, APM transaction tuning, log schema governance.

  • Experience integrating observability with CMDB/service catalog and feature flag systems.

  • Certifications (e.g., AWS, Datadog).
     

Working Model
  • Hybrid role in India collaborating with global teams; core hours aligned to IST with occasional off-hours participation for major incidents or change windows.

  • Part of an on-call rotation with follow-the-sun support.

Please note that Zendesk can only hire candidates who are physically located and plan to work from Karnataka or Maharashtra. Please refer to the location posted on the requisition for where this role is based.

Hybrid: In this role, our hybrid experience is designed at the team level to give you a rich onsite experience packed with connection, collaboration, learning, and celebration, while also giving you flexibility to work remotely for part of the week. You must attend our local office for part of the week. The specific in-office schedule is to be determined by the hiring manager.

The intelligent heart of customer experience

Zendesk software was built to bring a sense of calm to the chaotic world of customer service. Today we power billions of conversations with brands you know and love.

Zendesk believes in offering our people a fulfilling and inclusive experience. Our hybrid way of working enables us to purposefully come together in person, at one of our many Zendesk offices around the world, to connect, collaborate, and learn, whilst also giving our people the flexibility to work remotely for part of the week.

As part of our commitment to fairness and transparency, we inform all applicants that artificial intelligence (AI) or automated decision systems may be used to screen or evaluate applications for this position, in accordance with Company guidelines and applicable law.

Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here.

Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre-employment testing, or otherwise participate in the employee selection process, please send an e-mail to [email protected] with your specific accommodation request.

Top Skills

AWS
OpenTelemetry
Python
SQL