SailPoint

Sr Observability Engineer (SRE)

Hybrid
Pune, Maharashtra
Senior level
Seeking a Senior Site Reliability Engineer to ensure reliability, scalability, and performance of systems, collaborating closely with developers and managing incidents.
The summary above was generated by AI

We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our [Team Name] software development team. This is an embedded role, meaning you will be a full member of the development team, working closely with software engineers, infrastructure platform services, engineering managers, and other stakeholders to ensure the reliability, scalability, and performance of teams’ services. You will be responsible for leveraging the infrastructure, tooling, and processes that support our applications in dev and production, as well as participating in on-call rotations. This role offers a unique opportunity to directly influence the design and architecture of our systems from a reliability and performance perspective.

Responsibilities:

Work with developers and service owners at the intersection of development and operations to solve performance issues and ensure system scalability.

  • Reliability Engineering: Design, develop, and implement solutions to improve the reliability, availability, performance, and scalability of our systems. Work with technical leaders and infrastructure platform services to develop alerts and dashboards.
  • Operational Excellence: Own and improve key operational metrics (SLIs, SLOs, error budgets, monitoring and alerting) for team-related services and drive continuous improvement through post-incident reviews and blameless postmortems of non-functional issues. Create and maintain dashboards, and conduct ongoing reviews to identify and address gaps. Improve operational processes and team practices, working with technical leaders and the NOC team.
  • Monitoring and Alerting: Develop and maintain comprehensive monitoring and alerting to proactively identify and resolve issues.
  • Capacity Planning: Collaborate with technical leads, DevOps/SRE and infra teams to forecast capacity needs and ensure sufficient resources are available to support growth.
  • Performance Optimization: Collaborate with performance SMEs to identify and address production performance bottlenecks through profiling, tuning, and optimization of services and infrastructure.
  • Automation: Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.
  • Collaboration: Work closely with Software, Performance and Test Engineers to influence system design and architecture for operability and reliability.
  • Documentation: Create and maintain clear and concise documentation for systems, processes, runbooks, and procedures.
  • On-Call: Participate in on-call rotation.
  • Incident Management: Lead incident response efforts, ensuring timely resolution and effective communication. Conduct in-depth incident analysis and help drive completion of post-incident action items.
  • Troubleshooting: Apply strong diagnostic and problem-solving skills to analyze complex systems and data.

Qualifications:

  • Bachelor’s degree in computer science, a related field, or equivalent practical experience.
  • 5+ years of proven SRE experience.
  • Strong understanding of SRE principles and practices.
  • Experience with cloud platforms (AWS, GCP, or Azure).
  • Proficiency in at least one scripting language (e.g., Python, Bash, Go).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana).
  • Coding experience beyond simple scripts in a programming language such as Go, Java, or Python, to help build reliability engineering solutions.
  • Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
  • Understanding of network protocols and security best practices.
  • Familiarity with DevOps culture and practices and experience with CI/CD toolchains.
  • Experience with incident response processes and configuration management tools (e.g., PagerDuty, Git).
  • Strong problem-solving and troubleshooting skills.
  • Excellent communication and collaboration skills.   
  • Ability to work independently and as part of a team to achieve the SRE agenda.

Preferred Qualifications:

  • Technology experience with: Kafka, relational databases, performance tuning (JVM, Go).
  • Experience with Grafana k6 (continuous performance testing tool).
  • Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Ansible).

What success looks like in the role
Within the first 30 days you will:

  • Onboard into your new role, get familiar with our product offering and technology, proactively meet peers and stakeholders, set up your test and development environment.
  • Seek to deeply understand business problems or common engineering challenges and propose software architecture designs to solve them elegantly by abstracting useful common patterns.

By 90 days:

  • Proactively collaborate on, discuss, debate and refine ideas, problem statements, and software designs with different (sometimes many) stakeholders, architects and members of your team.
  • Take a committed approach to prototyping and co-implementing systems alongside less experienced engineers on your team—there’s no room for ivory towers here.

By 6 months:

  • Share support of critical team systems by participating in the on-call rotation, learning the characteristics of currently running systems, and participating in improvements.
  • Occasionally serve as a debugging and implementation expert during escalations of systems issues that have evaded the ability of less experienced engineers to solve in a timely manner.
  • Collaborate with Support Management and the Engineering Manager to resolve escalations quickly.

SailPoint is an equal opportunity employer and we welcome all qualified candidates to apply to join our team.  All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other category protected by applicable law.  

Alternative methods of applying for employment are available to individuals unable to submit an application through this site because of a disability.  Contact [email protected] or mail to 11120 Four Points Dr, Suite 100, Austin, TX 78726, to discuss reasonable accommodations.

Top Skills

Ansible
AWS
Azure
Bash
CloudFormation
Docker
GCP
Go
Grafana
Kubernetes
Prometheus
Python
Terraform

SailPoint Pune, Mahārāshtra, IND Office

Lohia Jain Arcade, Sr. No. 106/107, Near Chatursringi Temple, Senapati Bapat Road, Pune, Maharashtra, India, 411016
