Acoustic is seeking a skilled and seasoned Senior Site Reliability Engineer to join our SRE team. We believe that the ideal candidate will bring innovative ideas and implement preventative measures to minimize downtime. This position is perfect for someone enthusiastic about technology and eager to contribute to the growth and success of our organization.
Key Responsibilities
- Lead major incident calls and provide solutions to the team.
- Collaborate with our SRE teams to provide early detection and response.
- Provide automated solutions for our application problems.
- Collaborate with our Engineering team to understand our products and features.
- Participate in team on-call rotation.
Requirements
- 5-8 years of experience
- Strong communication skills
- Coding proficiency in one or more of the following languages, with the ability to quickly learn new languages:
  - Go, JS, Python
- Strong automation experience in AWS (preferred) or other cloud providers
- Experience with at least one automation tool, such as:
  - Puppet, Chef, Ansible, or Terraform
- Strong Java application debugging and troubleshooting skills, such as analyzing thread and heap dumps and performance tuning
- Experience working with distributed systems and at least one of the following databases:
  - Oracle, DB2, MySQL, PostgreSQL, MSSQL
- In-depth knowledge of monitoring, log aggregation, and observability (o11y) tools, such as:
  - Datadog, New Relic, LGTM, OpenTelemetry, and other open source tools
- Experience deploying and managing:
  - Kubernetes, Kafka, OpenSearch, HBase, MongoDB, or their variants
- Experience with a message queueing stack, such as:
  - ActiveMQ, RabbitMQ, or their variants
- Experience with CI/CD pipelines and at least one of the following tools:
  - Artifactory, GitHub, Jenkins, CloudBees, Octopus, or similar tools
- Ability to document work for the benefit of the team
Nice to have
- Experience with Snowflake and Looker