As an Infrastructure Support Engineer, you will monitor, manage, and maintain data centers while troubleshooting issues, coordinating with teams, and improving operational processes.
Available Locations: Bengaluru
About The Role
In this role, you will focus on monitoring, managing and maintaining the Cloudflare global network. You'll work closely with Cloudflare's SRE (Site Reliability Engineering) team, Network Engineering team, Network Deployment Engineering team and with various vendors and partners (including hardware vendors, datacenter and network providers, and ISPs) to maintain and improve our global infrastructure. This role ensures the maximum uptime, performance, and security of critical systems and services. This is a highly visible position that requires a deep technical understanding of datacenter infrastructure and physical networking, along with basic experience in data analysis.
To be successful in this position, you should have excellent technical and communication skills and be able to navigate a range of challenges and constraints (e.g. schedule adherence, time zones, and cultures). You will have the opportunity to maintain a faster, safer Internet for our millions of users and the billions of web surfers that visit their sites each month.
Who you are
You are detail-oriented, eager to learn, and excited to grow your career in a fast-paced, global infrastructure environment. You bring foundational knowledge of data center environments, networking, and Linux systems, and are motivated to support mission-critical operations. You're comfortable following established processes, working across time zones, and collaborating with global teams. You will work with partners to support infrastructure in a number of remote locations. You will have experience managing operational environments and be used to developing new approaches to improve efficiency or operational stability.
What You'll Do
- Monitor for network and data center issues, working with Infrastructure, Network, and SRE teams to support the day-to-day health of data center operations.
- Identify and respond to incidents, outages and performance issues to ensure data center and network availability through proactive support and remote coordination.
- Perform first-level troubleshooting by following SOPs, and help coordinate and track tasks with remote hands/contractors (e.g. hardware support, cabling checks).
- Conduct root cause analysis for recurring issues and recommend preventive measures.
- Create and maintain documentation related to SOPs and participate in the development and refinement of monitoring best practices.
- Support and reconfigure network infrastructure where required.
- Use tools like JIRA to update task status and progress reports.
- Provide feedback to internal teams to support internal tools and external vendor partnerships.
Required Experience
- English language proficiency (written and verbal) is mandatory
- Over 2 years of experience in a technical support, IT operations, or data center environment (internship or junior role experience acceptable)
- Exposure to basic networking concepts (cabling, ports, troubleshooting). Experience with Juniper, Cisco and DWDM network equipment
- Familiarity with Linux-based systems and command-line tools
- Experience working with or coordinating third-party contractors (e.g. remote hands, field engineers)
- Familiarity with work required to stand up infrastructure in remote colocation facilities
- Experience running and improving operational processes.
- Familiarity with day-to-day tasks common to Data Center Operations (e.g. decommissioning and power)
- Comfortable handling basic program management responsibilities (prioritization, planning, scheduling, status reporting) using tools such as JIRA
- Incident management
Other Responsibilities May Include
- Assist in improving documentation and procedures for remote site operations
- Participate in on-call rotations or incident response support
- Collaborate with global teammates across time zones and cultures
- Assist with the definition, documentation and implementation of consistent processes across all regions
- Limited travel may be required for team offsites
Examples of desirable skills, knowledge and experience
- Bachelor's degree; technical background in engineering, computer science, or MIS
- Direct experience executing on complex data center/infrastructure projects
- Previous experience installing / maintaining data center (and other IT) infrastructure and DCIM tools
- Experience running and improving operational processes in a rapidly changing environment
- Strong verbal and written communication skills, problem-solving skills, attention to detail, and interpersonal skills
- Must be proactive with proven ability to learn fast and execute on multiple tasks simultaneously
- Ability to work with MS Excel and Google spreadsheets
- Comfortable handling multiple responsibilities (prioritization, planning, scheduling, status reporting) using tools such as JIRA
- Must be a team player
Bonus Points
- Multi-lingual; experience working with infrastructure in multiple countries
- Comfortable with remote "lights-out" and out-of-band access to data center resources
- Linux certifications (RHCSA etc.)
- Network certifications (CCNA, JNCIA or higher)
- Experience with configuration management systems such as SaltStack, Chef, Puppet or Ansible
- Scripting or software development experience in Bash, Python or Go
- Familiarity with load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Apache
- Experience working at a large-scale SaaS vendor
Top Skills
Ansible
Apache
Bash
Chef
Cisco
DWDM
Go
HAProxy
JIRA
Juniper
Linux
Nginx
Puppet
Python
SaltStack
Varnish