As an SRE II, ensure the reliability, performance, and scalability of data-intensive applications while collaborating with engineering teams and optimizing system infrastructure.
Who we are
Mindtickle is the market-leading revenue productivity platform that combines on-the-job learning and deal execution to get more revenue per rep. Mindtickle is recognized as a market leader by top industry analysts and is ranked by G2 as the #1 sales onboarding and training product. We’re honoured to be recognized as a Leader in the first-ever Forrester Wave™: Revenue Enablement Platforms, Q3 2024!
Job Snapshot
As an SRE II, you will play a key role in ensuring the reliability, performance, and scalability of our mission-critical systems. You will work closely with engineering teams to design, implement, and maintain infrastructure that supports high-volume, data-intensive applications. Your expertise in monitoring, troubleshooting, and automation will drive operational excellence across our distributed environment.
What’s in it for you?
- Maintain and improve the reliability, availability, and performance of high-volume, data-intensive applications.
- Design, implement, and enhance monitoring, logging, and alerting solutions at scale.
- Collaborate with development teams to optimize system architecture and reliability.
- Manage and troubleshoot distributed systems in a Linux-based production environment.
- Leverage AWS cloud services to scale infrastructure efficiently.
- Utilize Kubernetes for container orchestration, ensuring optimal resource utilization and deployment strategies.
- Implement CI/CD pipelines using GitLab to automate deployments and operational tasks.
- Use infrastructure as code (IaC) tools such as Terraform and CloudFormation for provisioning and managing cloud resources.
- Implement observability best practices using Grafana, Prometheus, Thanos, and Loki.
- Perform root cause analysis (RCA) and proactively address performance bottlenecks and system failures.
- Ensure security best practices and compliance across all infrastructure components.
We’d love to hear from you, if you:
- Have 3+ years of experience in Site Reliability Engineering or related fields.
- Possess strong Linux fundamentals with a deep understanding of system internals.
- Have expertise in troubleshooting and problem-solving in distributed environments.
- Have hands-on experience with logging and monitoring solutions at scale.
- Are proficient in at least one programming language (preferably Python).
- Have strong experience with AWS services and Kubernetes.
- Have exposure to CI/CD pipelines, preferably using GitLab CI/CD.
- Have experience with infrastructure as code (Terraform, CloudFormation).
- Are familiar with observability tools such as Grafana, Prometheus, Thanos, and Loki.
Preferred Qualifications
- Experience in performance tuning and capacity planning.
- Knowledge of incident management and post-mortem analysis processes.
- Familiarity with security best practices in cloud environments.
- Experience in automating operational tasks using scripting and configuration management tools.
Our culture & accolades
As an organization, it’s our priority to create a highly engaging and rewarding workplace. We offer tons of awesome perks and many opportunities for growth.
Our culture reflects our employees' globally diverse backgrounds, our commitment to our customers and to each other, and a passion for excellence. We live up to our values, DAB: Delight your customers, Act as a Founder, and Better Together.
Mindtickle is proud to be an Equal Opportunity Employer.
All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, national origin, disability, protected veteran status, or any other characteristic protected by law.
Your Right to Work - In compliance with applicable laws, all persons hired will be required to verify identity and eligibility to work in the respective work locations and to complete the required employment eligibility verification document form upon hire.
Top Skills
AWS
CloudFormation
GitLab
Grafana
Kubernetes
Loki
Prometheus
Python
Terraform
Thanos
Mindtickle Pune, Mahārāshtra, IND Office
Pune Banglore Highway Pashan Exit, Baner, Pune, Maharashtra, India, 411045