The role involves providing L3 technical support, leading incident resolution, managing system performance, and mentoring junior teams while collaborating across departments.
Job Description:
Job Title: Production Support Specialist (SL3)
Corporate Title: Associate Vice President
Location: Pune, India
Role Description
- We are seeking a highly skilled and experienced Senior Production Support Specialist to join our dynamic Production Operations team. This critical role is responsible for providing expert-level (L3) technical support, troubleshooting, and incident resolution for our complex, strategic Global Banking platform, which supports backend functions and infrastructure in a fast-paced environment.
- The ideal candidate should possess a strong technical background across various domains, a keen problem-solving attitude, excellent analytical skills, and the ability to operate autonomously while collaborating effectively with development, infrastructure, and business teams. This role demands proactive identification of issues, root cause analysis, and the implementation of permanent solutions to ensure optimal system performance and reliability.
What we’ll offer you
As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:
- Best-in-class leave policy
- Gender-neutral parental leave
- 100% reimbursement under childcare assistance benefit (gender neutral)
- Sponsorship for industry-relevant certifications and education
- Employee Assistance Program for you and your family members
- Comprehensive hospitalization insurance for you and your dependents
- Accident and term life insurance
- Complimentary health screening for employees aged 35 and above
Your key responsibilities
- Incident Management (L3 Support):
- Serve as the primary escalation point for complex production incidents, providing expert-level diagnosis and resolution for critical issues that cannot be resolved by L1/L2 teams.
- Lead incident resolution efforts, coordinating with multiple teams (Dev, QA, Infra, Network) to restore service rapidly and minimize business impact.
- Perform in-depth root cause analysis (RCA) for major incidents, identifying underlying technical problems and proposing long-term preventative measures.
- Participate in a 24/7 on-call rotation to provide support for critical production systems.
- Problem Management:
- Identify recurring issues and systemic problems, working collaboratively with development teams to implement permanent fixes and architectural improvements.
- Proactively monitor system health, performance, and trends to identify potential issues before they impact users.
- System Health & Performance:
- Utilize monitoring tools (e.g., Splunk, Grafana, ELK, Prometheus, Datadog) to analyze system performance, identify bottlenecks, and ensure optimal resource utilization.
- Develop and refine monitoring alerts and dashboards to provide early warnings for potential issues.
- Optimize application performance and stability through configuration tuning, code analysis, and infrastructure recommendations.
- Technical Expertise & Mentorship:
- Maintain deep technical expertise in the relevant technologies (e.g., Java/Spring, .NET, Python, SQL, NoSQL, Kafka, AWS/Azure/GCP).
- Act as a subject matter expert (SME) and provide technical guidance and training to L1/L2 support teams.
- Document troubleshooting procedures, runbooks, and knowledge articles to enhance the team's capabilities.
- Automation & Tooling:
- Develop and implement automation scripts and tools (e.g., Python, Bash, PowerShell) to streamline operational tasks, reduce manual effort, and improve efficiency.
- Contribute to the continuous improvement of our production support toolkit and processes.
- Deployment & Release Support:
- Support application deployments, environment refreshes, and production releases, ensuring stability and verifying post-deployment health.
- Conduct pre-release checks and post-release validation to minimize risks.
- Collaboration & Communication:
- Communicate effectively with technical and non-technical stakeholders during incidents, providing clear and concise updates.
- Collaborate closely with development teams to understand new features, provide feedback on supportability, and ensure smooth handovers.
- Participate in design and architecture reviews to represent operational requirements and provide input on system resilience and maintainability.
Your skills and experience
- Education:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field; or equivalent practical experience.
- Experience:
- 5+ years of hands-on experience in a Production Support, Site Reliability Engineering (SRE), or DevOps role, with at least 2-3 years at an L3 level.
- Proven experience supporting complex, high-transaction, mission-critical applications in a follow-the-sun model.
- Technical Proficiency (Demonstrated expert-level knowledge in several of the following areas):
- Operating Systems: Linux (RHEL, CentOS, Ubuntu) and/or Windows Server administration.
- Databases: Strong proficiency in SQL (e.g., PostgreSQL, MySQL, MS SQL Server) including complex query writing, performance tuning, and troubleshooting. Experience with NoSQL databases (e.g., MongoDB, Cassandra, Redis) is a plus.
- Programming/Scripting: Proficiency in at least one scripting language (Python, Bash, PowerShell) for automation and data analysis.
- Application Servers/Web Servers: Experience with technologies like Tomcat, JBoss, WebLogic, Nginx, Apache HTTP Server, IIS.
- Cloud Platforms: Hands-on experience with at least one major cloud provider (AWS, Azure, GCP), including understanding of cloud services (EC2, S3, RDS, Lambda, AKS, GKE, etc.).
- Monitoring & Logging Tools: Extensive experience with tools such as Splunk, ELK Stack (Elasticsearch, Logstash, Kibana), Grafana, Prometheus, Datadog, AppDynamics, Dynatrace.
- Networking Fundamentals: Solid understanding of TCP/IP, DNS, Load Balancers (e.g., F5, Nginx, AWS ELB/ALB), Firewalls.
- Messaging Queues: Experience with Kafka, RabbitMQ, ActiveMQ, or similar.
- Containerization/Orchestration: Experience with Docker and Kubernetes is highly desirable.
- Problem-Solving & Analytical Skills:
- Exceptional analytical and diagnostic skills with the ability to quickly triage, isolate, and resolve complex technical issues under pressure.
- Strong ability to perform thorough root cause analysis and implement effective preventative measures.
- Soft Skills:
- Excellent written and verbal communication skills, with the ability to articulate complex technical issues to both technical and non-technical audiences.
- Strong interpersonal skills, with the ability to build relationships and collaborate effectively across teams.
- High degree of initiative, proactivity, and self-motivation.
- Ability to manage multiple priorities and work independently with minimal supervision.
- A strong sense of ownership and accountability.
- ITIL/Service Management:
- Familiarity with ITIL principles (Incident, Problem, Change Management) is a plus.
How we’ll support you
- Training and development to help you excel in your career.
- Coaching and support from experts in your team.
- A culture of continuous learning to aid progression.
- A range of flexible benefits that you can tailor to suit your needs.
About us and our teams
https://www.db.com/company/company.html
We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.
Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.
We welcome applications from all people and promote a positive, fair and inclusive work environment.
Top Skills
ActiveMQ
Apache HTTP Server
AWS
AWS ELB/ALB
Azure
Bash
Cassandra
Datadog
DNS
Docker
ELK Stack
F5
GCP
Grafana
JBoss
Kafka
Kubernetes
Linux
MongoDB
MS SQL Server
MySQL
Nginx
Postgres
PowerShell
Prometheus
Python
RabbitMQ
Redis
Splunk
SQL
TCP/IP
Tomcat
WebLogic
Windows Server