Talkdesk

Staff Site Reliability Engineer

Posted 11 Days Ago
In-Office
Bengaluru, Karnataka
Senior level
The role involves ensuring high availability of systems, leading design and implementation of infrastructure, driving incident response, and improving developer productivity through automation.
The summary above was generated by AI

At Talkdesk, we are courageous innovators focused on redefining the customer experience, making the impossible possible for companies globally. We champion an inclusive and diverse culture representative of the communities in which we live and serve. And, we give back to our community by volunteering our time, supporting non-profits, and minimizing our global footprint. Each day, thousands of employees, customers, and partners all over the world trust Talkdesk to deliver a better way to great experiences.

We are recognized as a cloud contact center leader by many of the most influential research organizations, including Gartner and Forrester. With $498 million in total funding, a valuation of more than $10 billion, and a ranking of #16 on the Forbes Cloud 100 list, now is the time to be part of the Talkdesk legacy and help accelerate our success in a new decade of transformational growth.

At Talkdesk, we embrace FAST, our fundamental operating principles that define who we are as an organization. These principles drive us to make the impossible possible. FAST: Focus + Accountability + Speed = Talkdesker.

  • Focus: Focus time, energy, and attention on what is most impactful for the business, and be thoughtful about how and when to partner with others.
  • Accountability: Hold self and others accountable to meet commitments and drive results. Accept responsibility for successes and failures.
  • Speed: Execute with agility and urgency. Act promptly, decisively, and without delay. Make good and timely decisions that keep the organization moving forward.
  • Talkdesker: YOU!

We are looking for an engineer who will focus on Developer Experience and help us design, build, and maintain high-performance, scalable, and reliable services. Because Talkdesk provides a contact center service, we play a critical role in our customers’ business operations and therefore need to provide a highly available, fault-tolerant service.

We believe in a DevOps philosophy in which every engineering team at Talkdesk is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture. Our mission is to improve developers’ experience by giving them the tools to manage the entire software lifecycle and to be self-sufficient. To help with this, we are building our own internal PaaS using the latest technologies such as Kubernetes, Prometheus, Kotlin, and others. This platform is an important pillar of Talkdesk’s engineering effort and helps us deliver better, faster, and more reliable solutions for our customers.


Responsibilities:

  • Ensure high availability, performance, and scalability of mission-critical systems and services.
  • Lead the design and implementation of resilient and fault-tolerant infrastructure.
  • Drive incident response, root cause analysis, and postmortem culture. Mentor others in incident practices.
  • Write and maintain operational documentation, runbooks, and architecture diagrams.
  • Drive and promote protocols on production readiness and operational excellence.
  • Own and evolve infrastructure automation using Terraform or similar tools to remove as much human intervention as possible.
  • Help automate infrastructure provisioning and other engineering processes by developing automations on top of an engineering platform built with GitHub Actions.
  • Build internal platforms, tools, and frameworks to improve developer productivity and service reliability.
  • Work closely with software engineers, platform teams, and product managers to align on company goals.
  • Coach and upskill other engineering team members.
  • Plan for growth of Talkdesk’s infrastructure.

Skills and Qualifications

  • 8–12+ years in SRE, DevOps, or related infrastructure-focused roles.
  • Understand large-scale complex systems from a reliability perspective.
  • Design, implement and maintain processes and tools.
  • Passion for producing clean, standards-compliant, secure code.
  • A developer mindset, applied to infrastructure.
  • Strong experience with Linux/Unix systems.
  • Deep experience with Kubernetes.
  • Deep experience with tools like Terraform, Ansible, Helm.
  • Strong scripting skills for automating tasks, using a language such as Python, Bash, or another scripting language.
  • Experience with at least one relational and one non-relational database (e.g., PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch).
  • Ability to identify time-consuming and error-prone manual tasks and then build or leverage tooling to automate them.
  • Ability to identify root causes of instability in a large-scale distributed system across stacks.
  • Experience leading high-severity incident responses and postmortems.

Nice to haves / Pluses

  • Experience with cloud platforms such as Amazon Web Services (AWS), Google Cloud, or Microsoft Azure.
  • Experience supporting scalable databases such as PostgreSQL or MongoDB in production.
  • Understanding of cost

Work Environment and Physical Requirements:

Primarily office-environment work, extended periods of sitting or standing, computer-based work. Limited lifting, and equipment usage limited to computer-related equipment (keyboards, mouse, etc.)

The Talkdesk story hinges on empathy and acceptance. It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance. To that end, we will hire, promote, work along, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.

Top Skills

Amazon AWS
Ansible
Bash
Elasticsearch
GCP
Helm
Kotlin
Kubernetes
Azure
MongoDB
MySQL
Postgres
Prometheus
Python
Redis
Terraform


