The Software Engineer - SRE will build and maintain highly reliable infrastructure while focusing on automation, observability, and operational excellence for the Albert platform.
About Albert Invent
Albert Invent is a cutting-edge AI-driven software company headquartered in Oakland, California, on a mission to empower scientists and innovators in chemistry and materials science to invent the future faster. Every day, scientists in 30+ countries use Albert to accelerate R&D with AI trained like a chemist, bringing better products to market, faster.
SOFTWARE ENGINEER - SRE JOB DESCRIPTION
The Software Engineer – SRE will be responsible for building and maintaining highly reliable, scalable, and secure infrastructure that powers the Albert platform. This role focuses on automation, observability, and operational excellence to ensure seamless deployment, performance, and reliability of core platform services.
- Act as a passionate representative of the Albert product and brand.
- Work closely with Product Engineering and other stakeholders to plan and deliver core platform capabilities that enable scalability, reliability, and developer productivity.
- Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of all the microservices.
- Own the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
- Serve as the authority on end-to-end performance and operability.
- Demonstrate a clear understanding of automation and orchestration principles.
- Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations.
- Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
- 2+ years of software engineering experience, with at least 1 year in an SRE role focused on automation.
- Solid experience with IaC (Infrastructure as Code), preferably using Terraform.
- Solid expertise in Python or Node.js and in designing RESTful APIs and microservices architecture.
- Solid expertise in cloud infrastructure (AWS) and platform technologies, including microservices, APIs, and distributed systems.
- Hands-on experience with an observability stack, including centralized log management, metrics, and tracing.
- Familiarity with CI/CD tools like CircleCI and performance testing using k6.
- A desire to bring more automation and standards to an Engineering organization.
- A desire to build high-performance APIs with low latency (< 200 ms).
- Ability to work in a fast-paced environment and learn from peers and leaders.
- Ability to lead technically, mentor other engineers, and help facilitate the growth of the team through active participation in recruiting and related activities.
- Experience with Kubernetes and container orchestration.
- Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry, Datadog, etc.).
- Experience building internal developer platforms (IDPs) or reusable frameworks for engineering teams.
- Exposure to ML infrastructure or data engineering workflows.
- Experience working in compliance-heavy environments (SOC 2, HIPAA, etc.).
- Joining Albert Invent means becoming part of a mission-driven, fast-growing global team at the intersection of AI, data, and advanced materials science.
- You will collaborate with world-class scientists and technologists to redefine how new materials are discovered, developed, and brought to market.
- The culture is built on curiosity, collaboration, and ownership, with a strong focus on learning and impact.
- You will enjoy the opportunity to work on cutting-edge AI tools that accelerate real-world R&D and solve global challenges, from sustainability to advanced manufacturing, while growing your career in a high-energy environment.
Top Skills
AI
AWS
CI/CD
Datadog
Grafana
Kubernetes
Node.js
OpenTelemetry
Prometheus
Python
Terraform



