Octus

Site Reliability Engineer

Posted 2 Days Ago
Hybrid
Bogotá, Bogotá, D.C.
Senior level
The Site Reliability Engineer will build and maintain scalable services, automate processes, improve system reliability, and collaborate with engineering teams on production issues.
The summary above was generated by AI

Octus

Octus is a leading global provider of credit intelligence, data, and analytics. Since 2013, tens of thousands of professionals across hedge fund, investment banking, management consulting, and law firm verticals have come to rely on Octus to make better, faster, and more confident decisions in pace with the fast-moving credit markets.
For more information, visit: https://octus.com/

Working at Octus

Octus hires growth-minded innovators and trailblazers across the globe to drive our business and culture. Our core values – Action Oriented, Customer First Mindset, Effective Team Players, and Driven to Excel – define an organizational ethos that’s as high-performing as it is human. Among other perks, Octus employees enjoy competitive health benefits, matched 401k and pension plans, PTO, generous parental leave, gym subsidies, educational reimbursements for career development, recognition programs, pet-friendly offices (US only), and much more. 
Role

We’re looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure across our cloud environment. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You’ll be the expert on the environments in which you operate infrastructure, helping partner teams build and configure their software to run reliably within them.
We strongly believe in engineering teams being responsible for the operations of their services in production. In this role, you’ll work closely with engineers to advocate for and participate in sensible, scalable systems design, and share responsibility with them in diagnosing, resolving, and preventing production issues.

What you'll do:

  • Identify, assess, and mitigate risks associated with our systems, applications, and infrastructure.
  • Proactively recognize sources of instability in distributed systems and analyze how complex systems fail from a reliability and resilience perspective.
  • Improve our applications' availability, reliability, and observability, and reduce outages to a minimum.
  • Implement disaster recovery (DR) strategies, including backup and recovery techniques that minimize downtime for different applications.
  • Automate and codify our tooling, processes, and infrastructure to speed up development and make them repeatable and error-proof.
  • Deep dive into issues and outages to establish root causes and communicate them to your business partners.
  • Write and maintain thorough documentation to share with your teammates around the world, allowing them all to function as a cohesive unit.
  • Participate in a 24/7 weekly on-call rotation with members of your team to troubleshoot incidents in a complex distributed systems environment.
  • Create meaningful metrics and alerting for service health monitoring.

Skills and knowledge you should possess:

  • Bachelor's degree in Computer Science or a related field, or equivalent experience
  • 5+ years of experience in SRE, DevOps, or systems engineering
  • Proficient in command-line interface (CLI) operations, scripting (Python or Bash), and Linux system administration
  • Extensive experience working with Infrastructure as code technologies, preferably Terraform
  • Extensive experience working with major cloud providers, preferably AWS
  • Significant experience working with observability and telemetry tools (Datadog, AWS CloudWatch, New Relic, Prometheus, Grafana, etc.)
  • Professional experience working with at least one general-purpose programming language (Python, PHP, Go, C#, etc.)
  • Experience building CI/CD workflows with tools like Jenkins, CircleCI, GitHub Actions, or AWS CodePipeline
  • Fundamental understanding of Internet networking protocols: TCP/IP, TLS, DNS, HTTP, SMTP

Bonus points (nice skills to have):

  • Database Systems Fundamentals (MySQL/Postgres) and administering them at scale including schema and query optimization
  • Familiarity with event-driven systems and messaging infrastructure (Kafka, RabbitMQ, AWS Kinesis, etc.)
  • Experience working with containerized and serverless platforms such as Docker, AWS ECS, Kubernetes, and AWS Lambda
  • Experience working with web servers such as Nginx, Apache, Tomcat etc.
  • Application security, infrastructure security, and SOC 2 compliance experience

Equal Employment Opportunity

Octus is committed to providing equal employment opportunities to all employees and applicants for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, pregnancy, veteran status, or any other legally protected status. We strive to create an inclusive and diverse work environment where all individuals are valued, respected, and treated fairly. We believe that diversity enriches our workplace and enhances our ability to innovate and succeed.

Top Skills

AWS
AWS CloudWatch
AWS CodePipeline
Bash
CircleCI
Datadog
DNS
GitHub Actions
Grafana
HTTP
Jenkins
Linux
New Relic
Prometheus
Python
SMTP
TCP/IP
Terraform
TLS
