CommandLink

DevOps Engineer

Posted Yesterday
Remote
Hiring Remotely in India
Senior level
As a DevOps Engineer, you will manage platform reliability, oversee Kubernetes clusters, operate Kafka and OpenSearch, optimize Azure/GCP/AWS environments, and implement Infrastructure as Code practices while enhancing observability and security compliance.
The summary above was generated by AI

About Command|Link


Command|Link is a global SaaS platform providing network, voice, and IT security solutions. We help corporations consolidate their core infrastructure with a single vendor and layer a proprietary single-pane-of-glass platform on top. Command|Link has revolutionized the IT industry by tackling the problems our competitors create. In recognition of our unprecedented innovation and dedication, Command|Link has been named SD-WAN Product of the Year, ITSM Visionary Spotlight, UCaaS Product of the Year, NaaS Product of the Year, Supplier of the Year, and AT&T Strategic Growth Partner. Command|Link has built the only IT platform for scale that solves ISP vendor sprawl and IT headaches. We make it easy for our customers to get more done, maximize uptime, and improve the bottom line.




This is a 100% remote position

About your new role:

As our Founding DevOps Engineer, you will own the reliability of a high-throughput distributed platform processing network telemetry, voice, and security data for a global customer base. Your mandate: keep the platform fast, available, and scalable as CommandLink grows — enabling fast, iterative deployments without sacrificing uptime.


You'll work hands-on across VMs, firewalls, Kubernetes clusters, Kafka and Flink pipelines, OpenSearch, and Azure infrastructure — designing systems to fail gracefully and recover automatically, not just monitoring them. You'll bring strong platform judgment to decisions that directly impact customer uptime, data latency, and our ability to scale new product lines without rearchitecting from scratch.


Working closely with Engineering and Product leaders, you'll embed reliability into how we build. That means driving SLO definition, incident response, and postmortems, as well as building the automation that makes on-call sustainable long-term.


You'll also lead a genuine greenfield initiative: moving our infrastructure to a fully code-defined Infrastructure as Code model, bringing consistency, repeatability, and engineering rigor to how we provision, manage, and evolve the platform.


Key Responsibilities:

  • Own platform reliability end-to-end: define and enforce SLOs/SLIs, build alerting strategies, lead incident response, and drive blameless postmortems
  • Kubernetes cluster operations: manage HA multi-node and cloud clusters in production, handle rolling upgrades, resource quotas, autoscaling, network policies, and pod disruption budgets
  • Distributed data infrastructure: operate and scale Kafka clusters, Flink streaming jobs, and OpenSearch clusters under sustained high-throughput workloads, including rebalancing, partition management, index lifecycle policies, and shard tuning
  • Temporal workflow platform: maintain and scale Temporal server deployments; work with engineering to design workflows for durability and backpressure
  • Azure/AWS/GCP infrastructure: manage and optimize cloud environments, including Kubernetes, networking, monitoring, vaults, and IAM; contribute to the IaC codebase (Terraform or Bicep)
  • CI/CD and deployment pipelines: improve build, release, and deployment pipelines to enable safe, fast, and automated delivery across environments
  • Observability: build and maintain a comprehensive observability stack of metrics, logs, traces, and dashboards that gives engineers actionable signals rather than noise
  • Security and compliance: work with the security team to harden infrastructure, enforce least-privilege policies, and support compliance requirements
  • Capacity planning: proactively model growth, identify bottlenecks before they become incidents, and lead scaling initiatives for critical components
  • Additional responsibilities: take on other responsibilities and projects as needed to support the success of the team and organization


What you'll need for success:

Essential:

  • 6+ years in a Site Reliability Engineering, DevOps, or Platform Engineering role in a production environment
  • Deep, hands-on Kubernetes experience: cluster administration, HA configurations, networking (CNI, ingress, service mesh), and storage, not just application deployment
  • Proven experience operating Apache Kafka at scale: topic management, consumer group tuning, broker operations, and monitoring lag
  • Experience with Apache Flink or equivalent stream processing frameworks in production
  • OpenSearch / Elasticsearch cluster operations: index management, scaling strategies, performance tuning, and snapshot management
  • Azure/AWS/GCP cloud platform expertise: AKS, virtual networking, managed identities, monitoring, and cost management
  • Solid understanding of distributed systems theory: CAP theorem, consensus protocols, failure modes, backpressure, and circuit breaking
  • Infrastructure as Code mindset — Terraform, Helm, or equivalent
  • Temporal workflow engine: deployment, operation, and scaling (or strong experience with an equivalent durable execution platform such as Cadence or Conductor)
  • Strong scripting and automation skills (Bash, PHP, Python, or Go)
  • Experience designing and operating high-availability architectures across multiple availability zones or regions

Nice to Have:

  • Experience with Vector (from Datadog) for log and metric collection and routing pipelines
  • Datadog for APM, infrastructure monitoring, log management, or dashboards
  • Experience with service meshes (Istio, Linkerd, or Cilium)
  • Familiarity with chaos engineering practices (Chaos Monkey, LitmusChaos, or similar)
  • Contributions to open source infrastructure tooling
  • Experience working in or with network/telco SaaS products
  • Knowledge of eBPF-based networking or observability tools


Why you'll love life at Command|Link

Join us at CommandLink, where you'll have the opportunity to shape the future of business communication. We value the innovative spirit and seek individuals ready to bring their unique vision and expertise to a team that values bold ideas and strategic thinking. Are you ready to make an impact?

  • Room to grow at a high-growth company
  • An environment that celebrates ideas and innovation
  • Your work will have a tangible impact
  • Flexible time off
  • Fun events at cool locations
  • Employee referral bonuses to encourage the addition of great new people to the team


At CommandLink, we’re committed to creating a fair, consistent, and efficient hiring experience. As part of our process, we use AI-assisted tools to help review and analyze applications. These tools support our recruiting team by identifying qualifications and experience that align with the requirements of each role.


AI tools are used only to assist in the evaluation process — they do not make final hiring decisions. Every application is reviewed by a member of our recruiting or hiring team before any decisions are made.


Top Skills

AWS
Azure
Bash
Flink
GCP
Go
Kafka
Kubernetes
OpenSearch
PHP
Python
Terraform
