JPMorganChase

Lead SRE

Posted 16 Days Ago
Hybrid
Bengaluru, Karnataka
Senior level
The Lead SRE will enhance application reliability and performance, oversee automated CI/CD, and lead incident response while collaborating with development teams and managing cloud infrastructures.
The summary above was generated by AI

Job Description
Take on a critical role in shaping the future of a globally recognized firm, with direct and significant impact in a space built for top achievers in site reliability.
As a Site Reliability Engineer III at JPMorgan Chase within the Corporate Technology - Capital Management team, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving on existing solutions. You will be a significant contributor to your team, sharing your knowledge of the end-to-end operations, availability, reliability, and scalability of your application or platform.
Job responsibilities

  • Drive continuous improvement of reliability, monitoring, and alerting for mission-critical microservices, while reducing toil through automation and creating reliable infrastructure and tooling to expedite feature development.
  • Develop and implement metrics for microservices, define user journeys, SLOs, and error budgets, and configure dashboards and alerts, facilitating blameless post-mortems to ensure permanent incident closure.
  • Engage with development teams throughout the software lifecycle to enhance reliability and scale, design self-healing and resiliency patterns, and implement infrastructure, configuration, and network as code.
  • Collaborate with software engineers and teams to design and implement deployment approaches using automated CI/CD pipelines, supporting the adoption of site reliability engineering best practices.
  • Demonstrate and champion site reliability culture and practices, leading initiatives to improve application and platform reliability and stability using data-driven analytics.
  • Collaborate with team members to identify service level indicators and establish reasonable service level objectives and error budgets with stakeholders, proactively resolving issues before customer impact.
  • Act as the main point of contact during major incidents, demonstrating technical expertise to quickly identify and solve issues, while documenting and sharing knowledge within the organization.


Required qualifications, capabilities, and skills

  • Formal training or certification in site reliability concepts and 5+ years of applied experience in public cloud platforms such as AWS, Azure, or GCP.
  • Proficiency in at least one programming language, such as Python, Go, or Java/Spring Boot, with expertise in designing, coding, testing, and delivering software.
  • Experience with Kubernetes and cloud computing, preferably AWS, and hands-on experience with relational databases like Oracle or MySQL.
  • Proficiency in one or more technology domains, with the ability to solve complex and mission-critical problems within a business or across the firm.
  • Excellent debugging and troubleshooting skills, with experience in common SRE toolchains like Grafana, Prometheus, ELK Stack, Kibana, and Jaeger.
  • Experience with continuous integration and continuous delivery tools such as Jenkins, GitLab, or Terraform, and observability tools like Dynatrace, Datadog, New Relic, CloudWatch, or Splunk.
  • Familiarity with ETL tools like Databricks and experience with container and container orchestration technologies such as ECS, Kubernetes, and Docker.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, and toil reduction, with the ability to implement these practices within an application or platform.
  • Ability to identify and solve problems related to complex data structures and algorithms, and experience with troubleshooting common networking technologies and issues.
  • Drive to self-educate and evaluate new technology, with the ability to teach new programming languages to team members.
  • Ability to contribute to large and collaborative teams, proactively recognize roadblocks, and demonstrate interest in learning technology that facilitates innovation, while expanding and collaborating across different levels and stakeholder groups.


About Us
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
About the Team
Our professionals in our Corporate Functions cover a diverse range of areas from finance and risk to human resources and marketing. Our corporate teams are an essential part of our company, ensuring that we're setting our businesses, clients, customers and employees up for success.

Top Skills

AWS
Azure
CloudWatch
Databricks
Datadog
Docker
Dynatrace
ECS
ELK Stack
GCP
GitLab
Go
Grafana
Jaeger
Java
Jenkins
Kibana
Kubernetes
MySQL
New Relic
Oracle
Prometheus
Python
Splunk
Terraform
