Juniper Square

Lead Site Reliability Engineer - India

Posted Yesterday
Remote
Hiring Remotely in India
Senior level
Lead the technical direction for infrastructure systems, ensuring reliability and scalability while managing incident response and project delivery in collaboration with teams.
The summary above was generated by AI
About Juniper Square

Our mission is to unlock the full potential of private markets. Privately owned assets like commercial real estate, private equity, and venture capital make up half of our financial ecosystem yet remain inaccessible to most people. We are digitizing these markets and, as a result, bringing efficiency, transparency, and access to one of the most productive corners of our financial ecosystem. If you care about making the world a better place by making markets work better through technology, all while contributing as a member of a values-driven organization, we want to hear from you.

Juniper Square offers employees a variety of ways to work, ranging from a fully remote experience to working full-time in one of our physical offices. We invest heavily in digital-first operations, allowing our teams to collaborate effectively across 27 U.S. states, 2 Canadian provinces, India, Luxembourg, and England. We also have physical offices in San Francisco, New York City, Mumbai, and Bangalore for employees who prefer to work in an office some or all of the time.

What You Will Do

Technical Leadership & Architecture

  • Own and drive the technical direction for your team's infrastructure systems, making architectural decisions that balance reliability, scalability, and cost.

  • Design systems of moderate to high complexity using distributed systems best practices; anticipate future use cases and minimize technical debt.

  • Conduct architectural reviews and advance design patterns across the organization.

  • Identify and implement improvements to existing software architecture; define and expand design patterns to solve common platform problems.

  • Define and enforce security best practices across team-owned systems; proactively surface gaps to senior leadership.

Reliability & Operational Excellence

  • Own the reliability posture of team-owned services — establish SLOs, monitor SLAs, and hold the team accountable to them.

  • Lead incident response for complex, multi-service issues; systematically debug, identify root causes, and ensure issues do not recur.

  • Establish standards for logging, monitoring, and operationalization across all team-owned systems.

  • Foresee potential operational issues and implement preventative measures to safeguard the customer experience.

  • Participate in and help lead the on-call rotation; ensure production systems are appropriately instrumented.

Project & Delivery Ownership

  • Act as DRI (Directly Responsible Individual) for medium-to-large SRE projects spanning months and involving cross-team collaboration.

  • Partner with Engineering Managers and Product Managers to scope roadmap initiatives, break down work into actionable increments, and commit to delivery plans.

  • Negotiate scope effectively when required, ensuring adjustments align with customer needs and project goals.

  • Proactively identify and resolve project risks — dependencies, architectural drift, and staffing blockers — before they impact delivery.

What We Are Looking For

Required Experience

  • 7-10 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering in a production cloud environment.

  • 5+ years of hands-on experience with AWS cloud services across compute, networking, storage, and security.

  • 5+ years managing Linux-oriented production environments at scale.

  • 5+ years using Infrastructure-as-Code (Terraform, CDK, CloudFormation) and/or GitOps best practices.

  • 3+ years operating and troubleshooting production Kubernetes environments.

  • 3+ years applying AWS Well-Architected Framework principles across reliability, security, performance, and cost pillars.

  • 3+ years in cloud security best practices including IAM, secrets management, network security, and compliance.

  • 3+ years working with PostgreSQL in production: performance tuning, replication, backup, and recovery.

  • Demonstrated track record of leading multi-person technical projects from scoping through delivery.

Technical Skills

  • Strong general programming skills; comfort writing automation scripts and tooling in Python, Go, or similar.

  • Deep knowledge of observability tooling — metrics, logging, distributed tracing — and how to use them to drive reliability.

  • Solid understanding of data retention, backup, and recovery processes across cloud-native systems.

  • Experience with CI/CD pipelines, release management, and deployment automation.

  • Familiarity with service mesh, API gateway patterns, and microservices architectures.

AI Fluency

  • Experience using AI-assisted workflows across the SDLC, with an emphasis on production reliability, operability, and maintainability of large-scale systems (design, deployment, monitoring, incident response).

  • Hands-on experience integrating LLMs or AI systems into production environments, with a focus on reliability, latency, observability, and failure handling (e.g., automated triage, incident copilots, runbook automation).

  • Familiarity with agent-based or workflow automation systems applied to operational use cases such as alert triage, remediation loops, system diagnostics, or automated runbook execution.

  • Demonstrated ability to apply AI tools to improve system reliability, reduce MTTR, automate operational workflows, and enhance observability and alerting systems.

  • Working knowledge of LLMs, embeddings, RAG, and their operational constraints in production systems (latency, cost, drift, safety, and observability).

  • Ability to identify opportunities where AI can meaningfully improve system reliability, on-call efficiency, incident response, and infrastructure automation.

Nice to have (SRE):

  • Experience handling model degradation, fallback strategies, and cost anomalies.

Leadership & Collaboration

  • Proven ability to lead technical discussions, drive alignment across engineering and product, and communicate decisions clearly to stakeholders.

  • Experience mentoring junior and mid-level engineers in both technical skills and professional development.

  • Able to operate independently with minimal supervision; comfortable making final technical decisions as DRI.

  • Strong communication skills in English — written and verbal — with experience influencing cross-functional partners.

Why Juniper Square

  • High-impact role at the intersection of cloud infrastructure and financial technology: your work directly underpins products managing hundreds of billions in AUM.

  • Significant growth potential: opportunity to help shape the SRE practice and prepare the platform for exponential scale.

  • A promising technology roadmap spanning capacity planning, Kubernetes migrations, and service-oriented architecture modernization.

  • Collaborative, engineering-driven culture that values quality, curiosity, and ownership.

  • Competitive compensation and benefits package.
