What We'll Bring:
We are seeking a highly skilled and motivated SRE Application Support Lead / Sr. Lead to join our 24x7 support team. This role is critical to ensuring the stability, performance, and reliability of mission-critical applications deployed across modern platforms including Docker, Kubernetes, and cloud environments. The ideal candidate will possess strong technical expertise, leadership capabilities, and a proactive mindset to drive operational excellence.
What You'll Bring:
Key Responsibilities
Team Leadership & Management
Lead and mentor a team of SRE/Application Support Engineers.
Assign tasks, set goals, and ensure smooth day-to-day operations.
Foster a culture of ownership, accountability, and continuous improvement.
Incident & Problem Management
Own and manage critical incidents end-to-end.
Perform root cause analysis and drive permanent resolutions.
Collaborate with cross-functional teams and vendors for quick recovery.
Monitoring & Observability
Use tools such as Splunk, Grafana, AppDynamics, and Spotfire to monitor application health.
Set up proactive alerting and dashboards for performance tracking.
Automation & Tooling
Develop scripts (Shell, Python) to automate routine tasks.
Build and maintain internal tools to improve support efficiency.
Cloud & DevOps Integration
Support applications deployed in Docker, Kubernetes, and cloud platforms.
Collaborate with DevOps teams for CI/CD pipeline support and release validations.
Change & Release Management
Perform pre- and post-release validations.
Ensure production stability during deployments.
Documentation & Knowledge Management
Maintain runbooks, SOPs, and knowledge base articles.
Keep onboarding materials and troubleshooting guides up to date.
Stakeholder Communication
Provide timely updates to leadership and business teams.
Present metrics, incident summaries, and improvement plans.
SRE Mindset
Apply SRE principles to improve reliability, scalability, and performance of supported applications through proactive monitoring and automation.
Focus on reducing toil by automating repetitive tasks and improving operational efficiency.
Participate in blameless postmortems and contribute to continuous improvement initiatives based on incident learnings.
Drive observability enhancements by integrating metrics, logs, and traces into monitoring dashboards.
Collaborate with engineering teams to define and measure SLIs/SLOs, ensuring alignment with business availability goals.
Required Skills:
Strong Incident Management (IM) expertise: Proven ability to lead and coordinate high-severity incidents, including real-time triaging, root cause identification, and resolution tracking.
Bridge Call Management: Experience in initiating and leading bridge calls, ensuring timely updates, stakeholder alignment, and effective resolution.
Stakeholder Communication & Coordination: Ability to interact with cross-functional teams, vendors, and leadership during incidents and planned changes.
Monitoring & Observability Tools: Proficient in Splunk, Grafana, AppDynamics, Spotfire, and other monitoring platforms.
Technical Proficiency: Strong hands-on experience in Linux, SQL, Shell scripting, and Python (preferred).
Cloud & Containerization: Exposure to cloud platforms (AWS, Azure, GCP), Docker, and Kubernetes.
Automation & Tooling: Experience in automating support tasks and building internal tools to improve operational efficiency.
Change & Problem Management: Familiarity with ITIL processes, including change, incident, and problem management.
Certifications: ITIL, AWS, Azure, Kubernetes, or other relevant technical/process certifications are a plus.
Excellent Communication Skills: Strong verbal and written communication for effective collaboration and reporting.
Team Leadership: Experience in managing and mentoring support teams, driving performance, and ensuring 24x7 operational readiness.
Impact You'll Make:
- Lead 24x7 SRE/Application Support operations, ensuring high availability and performance of critical applications.
- Drive Incident Management processes including triage, resolution, and post-incident reviews.
- Initiate and lead bridge calls during high-severity incidents, ensuring timely updates and coordination across teams.
- Act as the primary point of contact for stakeholder communication during incidents and planned changes.
- Oversee monitoring and observability using tools like Splunk, Grafana, AppDynamics, and Spotfire.
- Support applications deployed in Docker, Kubernetes, and cloud platforms (AWS/Azure/GCP).
- Lead automation initiatives using Shell scripting and Python to improve operational efficiency.
- Collaborate with DevOps and Engineering teams for CI/CD and release management.
- Ensure compliance with ITIL processes (Incident, Problem, Change Management).
- Maintain documentation including runbooks, SOPs, and knowledge base articles.
- Tools & Technologies: Linux, SQL, Docker, Kubernetes, Splunk, Grafana, AppDynamics, Spotfire, Shell, Python
- Certifications Preferred: ITIL, AWS/Azure/GCP, Kubernetes, DevOps
- Work Mode: Hybrid (as per team policy)
- Shift Type: Rotational (24x7 coverage)
TransUnion Job Title
Sr Lead, Applications Support
TransUnion Pune, Mahārāshtra, IND Office
6th Floor, Tower B, Panchshil Business Park, Viman Nagar, Pune, India, 411014

