Arrow Electronics, Inc.

Sr Staff Site Reliability Engineer (SRE)

Posted Yesterday
In-Office or Remote
2 Locations
Senior level
Senior technical leader defining SRE strategy across multi-cloud (AWS/GCP) infrastructure. Establish reliability standards, SLIs/SLOs, observability, CI/CD guardrails, and deployment safety. Drive architecture and production-readiness reviews, incident response, and cross-team collaboration to ensure large-scale, multi-region platform availability supporting millions of IoT devices.
The summary above was generated by AI
Position: Sr Staff Site Reliability Engineer (SRE)

Job Description:

We are seeking a Sr Staff Site Reliability Engineer, on a long-term basis and working during US hours, who brings deep software engineering roots alongside SRE expertise. This individual will help shape and scale the reliability of our global cloud platform, bringing the full-stack perspective of someone who has built and shipped software and now drives reliability from the inside out.

The Role

This is a Senior Staff-level technical leadership role with organization-wide influence. You will define and drive reliability strategy across our multi-cloud infrastructure (AWS and GCP), establish architectural standards, and ensure our backend systems operate with exceptional availability, scalability, and resilience.

You will also collaborate with strategic partners and engineering teams to position our organization as a cloud-integrated service, leading technical discussions and ensuring secure, reliable integrations.

This is a long-term position for someone who thrives at the intersection of software development and reliability engineering. The ideal candidate has hands-on development experience, understands the complete software delivery lifecycle, and brings an end-to-end systems perspective — from code commit to production operation.

What You’ll Do
  • Define and drive the organization’s SRE strategy across engineering teams.
  • Establish reliability standards, architectural guardrails, and production readiness frameworks.
  • Initiate, participate in, and review architectural changes — leveraging development experience to ensure reliability and operability are built in, not bolted on.
  • Apply SDLC knowledge to reliability decisions — engage early in design and architecture reviews to embed reliability, testability, and operability as first-class requirements.
  • Proactively identify system-wide gaps — continuously assess the platform for reliability blind spots, missing observability, or architectural debt, and drive initiatives to close them without waiting to be asked.
  • Bridge development and SRE teams — translate between engineering intent and operational reality, serving as a technical liaison who can read code, review PRs, and contribute to service-level design decisions.
  • Design and maintain highly available, multi-region, multi-cloud systems.
  • Ensure platform reliability supporting millions of IoT devices globally.
  • Guide engineering teams in building fault-tolerant, scalable microservices and monolithic systems.
  • Define and enforce SLIs, SLOs, and error budgets.
  • Lead architecture reviews and production readiness reviews.
  • Partner with strategic teams to deliver our organization as a cloud-integrated service and support partner integrations.
  • Improve and streamline production release processes.
  • Implement safe deployment strategies (canary, blue/green, progressive delivery).
  • Build CI/CD guardrails to reduce deployment risk and improve reliability.
  • Develop and mature observability strategies across infrastructure and services.
  • Lead high-severity incident response, facilitate blameless postmortems, and drive systemic improvements to prevent recurring issues.
What You Bring
  • 10+ years of combined software engineering and SRE/infrastructure experience, with a clear progression from development into reliability or platform engineering.
  • Deep understanding of the complete Software Development Lifecycle (SDLC) — enabling well-informed reliability and design decisions across all phases of software delivery.
  • Strong software development background — with hands-on experience building and shipping production software — enabling effective design collaboration, code-level review, and reliability-driven architectural input.
  • End-to-end system comprehension — ability to reason about the full stack from device/client behavior through API layer, backend services, data stores, and infrastructure, connecting the dots across teams and domains.
  • Self-directed gap identification — demonstrated initiative in spotting reliability, scalability, or process gaps and driving improvements without needing explicit direction.
  • Collaborative cross-team communication — proven ability to work across engineering, product, and operations teams; comfortable influencing without authority and presenting technical decisions to both technical and non-technical stakeholders.
  • Proven experience operating large-scale distributed systems in production.
  • Strong hands-on expertise with AWS and GCP cloud platforms.
  • Deep experience with Kubernetes in production environments.
  • Advanced knowledge of Terraform, including modular design and infrastructure governance.
  • Strong understanding of distributed systems, networking, and system reliability principles.
  • Experience supporting Java-based monolithic systems and microservices architectures.
  • Proficiency in Python for automation and tooling.
  • Experience with modern observability stacks (Prometheus, Grafana, Datadog, OpenTelemetry, etc.).
  • Strong debugging, incident response, and root cause analysis skills.
  • Security knowledge in transport and identity — working knowledge of SSL/TLS certificate lifecycle management, mutual TLS (mTLS) for service-to-service authentication, cipher suite selection and hardening, and TLS version enforcement across microservices and infrastructure boundaries.
  • Excellent written and verbal communication skills, with experience coordinating across distributed engineering teams, facilitating technical discussions, and driving alignment on reliability decisions.

Qualifications

  • This position requires working a flexible rotating shift: IST evening (3 pm to midnight) or IST night (10 pm to 7 am).

  • Bachelor’s degree in computer science or software engineering.

Location: IN-GJ-Ahmedabad, India-Ognaj (eInfochips)

Time Type: Full time

Job Category: Engineering Services
