HighLevel

Lead Site Reliability Engineer

Posted Yesterday
Remote
Hiring Remotely in Delhi, Connaught Place, New Delhi, Delhi
Senior level
The Lead Site Reliability Engineer will ensure the availability, performance, and scalability of our systems, collaborating with development and operations teams to enhance reliability and observability, automate processes, and drive cost optimization efforts.
The summary above was generated by AI

About HighLevel:

HighLevel is a cloud-based, all-in-one white-label marketing and sales platform that empowers marketing agencies, entrepreneurs, and businesses to elevate their digital presence and drive growth. With a focus on streamlining marketing efforts and providing comprehensive solutions, HighLevel helps businesses of all sizes achieve their marketing goals. We currently have ~1200 employees across 15 countries, working remotely as well as in our headquarters, which is located in Dallas, Texas. Our goal as an employer is to maintain a strong company culture, foster creativity and collaboration, and encourage a healthy work-life balance for our employees wherever they call home.


Our Website - https://www.gohighlevel.com/

YouTube Channel - https://www.youtube.com/channel/UCXFiV4qDX5ipE-DQcsm1j4g

Blog Post - https://blog.gohighlevel.com/general-atlantic-joins-highlevel/


Our Customers:

HighLevel serves a diverse customer base, including over 60K agencies & entrepreneurs and 500K businesses globally. Our customers range from small and medium-sized businesses to enterprises, spanning various industries and sectors.


Scale at HighLevel:

We operate at scale, managing over 40 billion API hits and 120 billion events monthly, with more than 500 microservices in production. Our systems handle 200+ terabytes of application data and 6 petabytes of storage.


About the Role:

We are looking for a Lead Site Reliability Engineer to join our team and help ensure the availability, performance, and scalability of our critical systems. You will work closely with development and operations teams to automate processes, enhance system reliability, and improve observability.

Requirements:

  • Experience: 5+ years in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Cloud Expertise: Hands-on experience with GCP and AWS
  • Infrastructure as Code (IaC): Terraform, Helm, or equivalent tools
  • Containerisation & Orchestration: Docker, Kubernetes (GKE)
  • Observability: Experience with Prometheus, Grafana, ELK, OpenTelemetry, or similar monitoring/logging tools
  • Programming/Scripting: Proficiency in Python, Bash, or Shell scripting. Basic understanding of parsing API responses and manipulating JSON
  • CI/CD Pipelines: Hands-on experience with Jenkins, GitHub Actions, ArgoCD, or similar tools
  • Incident Management: Experience with on-call rotations, SLOs, SLIs, SLAs, Escalation Policies, and incident resolution
  • Databases: Experience monitoring MongoDB, Redis, ES, queue-based systems, etc.

Responsibilities:

  • Develop and improve observability using monitoring, logging, tracing, and alerting tools (Prometheus, Grafana, ELK, OpenTelemetry, etc.)
  • Optimize system performance, troubleshoot incidents, and conduct post-mortems/RCA to prevent future issues
  • Collaborate with developers to enhance application reliability, scalability, and performance
  • Drive cost optimisation efforts in cloud environments
  • Monitor multiple databases (MongoDB, Redis, ES, queue-based systems, etc.)
  • Provide technical leadership and mentorship to SRE team members, fostering a culture of continuous learning and knowledge sharing in site reliability practices

EEO Statement:

At HighLevel, we value diversity. In fact, we understand it makes our organisation stronger. We are committed to inclusive hiring/promotion practices that evaluate skill sets, abilities, and qualifications without regard to any characteristic unrelated to performing the job at the highest level. Our objective is to foster an environment where really talented employees from all walks of life can be their true and whole selves, cherished and welcomed for their differences while providing excellent service to our clients and learning from one another along the way! Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions


