Pattern

Senior Site Reliability Engineer-SAAS

Posted 4 Days Ago
Pune, Maharashtra
Senior level
The Senior Site Reliability Engineer at Pattern will design, build, and maintain scalable and reliable systems for SaaS products using AWS. Responsibilities include managing CI/CD pipelines, monitoring systems, responding to incidents, and implementing automation while collaborating with development teams to enhance system reliability and performance.
The summary above was generated by AI

Job Description:

Our SRE role spans software, systems, and operations engineering. If your passion is building stable, scalable systems for a growing set of innovative products, as well as helping to reduce the friction for deploying these products for our engineering team, Pattern is the place for you. Come help us build a best-in-class platform for amazing growth.

This position requires a minimum of 3 years of working experience on SaaS products.

Key Responsibilities:
● Infrastructure and Automation

  • Design, build, and manage scalable and reliable infrastructure in AWS (Postgres, Redis, Docker, Queues, Kinesis Streams, S3, etc.).
  • Develop Python or shell scripts for automation, reducing operational toil.
  • Implement and maintain CI/CD pipelines for efficient build and deployment processes using GitHub Actions.

● Monitoring and Incident Response

  • Establish robust monitoring and alerting systems using observability methods, logs, and APM tools.
  • Participate in on-call rotations (aligned with US business hours) to respond to incidents, troubleshoot problems, and ensure system reliability.
  • Perform root cause analysis on production issues and implement preventative measures to mitigate future incidents.

● Cloud Administration

  • Manage AWS resources, including Lambda functions, SQS, SNS, IAM, RDS, etc.
  • Perform Snowflake administration and set up backup policies for various databases.

● Reliability Engineering

  • Define Service Level Indicators (SLIs) and measure Service Level Objectives (SLOs) to maintain high system reliability.
  • Utilize Infrastructure as Code (IaC) tools like Terraform for managing and provisioning infrastructure.

● Collaboration and Empowerment

  • Collaborate with development teams to design scalable and reliable systems.
  • Empower development teams to deliver value quickly and accurately.
  • Document system architectures, procedures, runbooks, and best practices.
  • Assist developers in creating automation scripts and workflows to streamline operational tasks and deployments.

● Innovative Infrastructure Solutions

  • Spearhead the exploration of innovative infrastructure solutions and technologies aligned with industry best practices.
  • Embrace a research-based approach to continuously improve system reliability, scalability, and performance.
  • Encourage a culture of experimentation to test and implement novel ideas for system optimization.

Required Qualifications

● Bachelor’s degree in a technical field or relevant work experience
● 6+ years of experience in engineering, development, DevOps/SRE fields
● 3+ years of experience deploying and managing systems using Amazon Web Services
● 3+ years of experience with Software as a Service (SaaS) applications
● Proven “doer” attitude with the ability to self-start, take a project to completion, and demonstrate project ownership
● Familiarity with container orchestration tools like Kubernetes, Fargate, etc.
● Familiarity with Infrastructure as Code tooling like Terraform, CloudFormation, Ansible, Puppet
● Experience working with CI/CD automated deployments using tools like GitHub Actions, Jenkins, CircleCI
● Experience working with observability tools like Datadog, New Relic, Dynatrace, Grafana, Prometheus, etc.
● Experience with Linux server management, bash scripting, SSH keys, SSL/TLS certificates, MFA, cron, and log files
● Deep understanding of AWS networking (VPCs, subnets, security groups, route tables, internet gateways, NAT gateways, NACLs), IAM policies, DNS, Route 53, and domain management
● Strong problem-solving and troubleshooting skills
● Attention to detail: thoroughness in accomplishing tasks, ensuring accuracy and quality in all aspects of work
● Excellent communication and collaboration abilities
● Desire to help take Pattern to the next level through exploration and innovation

Preferred Qualifications

● Experience in deploying applications on ECS, Fargate with ELB/ALB and Auto Scaling Groups.
● Experience in deploying serverless applications with Lambda, API Gateway, Cognito, CloudFront.
● Experience in deploying applications built using JavaScript, Ruby, Go, Python.
● Experience with Infrastructure as Code (IaC) using Terraform.
● Experience with database administration for Snowflake, Postgres.
● AWS Certification would be a plus.
● A focus on adopting security best practices while building great tools.

What We're About

● Data Fanatics: Our edge is always found in the data
● Partner Obsessed: We are obsessed with partner success
● Team of Doers: We have a bias for action
● Game Changers: We encourage innovation

Pattern is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Top Skills

Python
