Job Title: Sr Data Engineer (AWS)
About Us
Capco, a Wipro company, is a global technology and management consulting firm. We have been awarded Consultancy of the Year in the British Bank Award and ranked among the Top 100 Best Companies for Women in India 2022 by Avtar & Seramount. With a presence in 32 cities across the globe, we support 100+ clients across the banking, financial and energy sectors, and we are recognized for our deep transformation execution and delivery.
WHY JOIN CAPCO?
You will work on engaging projects with the largest international and local banks, insurance companies, payment service providers and other key players in the industry, on projects that will transform the financial services industry.
MAKE AN IMPACT
We bring innovative thinking, delivery excellence and thought leadership to help our clients transform their business. Together with our clients and industry partners, we deliver disruptive work that is changing the energy and financial services industries.
#BEYOURSELFATWORK
Capco has a tolerant, open culture that values diversity, inclusivity, and creativity.
CAREER ADVANCEMENT
With no forced hierarchy at Capco, everyone has the opportunity to grow as we grow, taking their career into their own hands.
DIVERSITY & INCLUSION
We believe that diversity of people and perspective gives us a competitive advantage.
Job Description:
Role: Data Engineer with AWS
Location: Bangalore / Chennai / Gurgaon
Job Description – Data Engineer
We are looking for a Data Engineer with strong experience in building and operationalizing data pipelines, ETL workflows, and analytics platforms using PySpark, Apache Airflow, and AWS data services.
Key Responsibilities
Build scalable ETL/ELT pipelines using PySpark on distributed processing frameworks
Orchestrate workflows using Apache Airflow (DAG design, scheduling, monitoring); a minimal DAG sketch follows this responsibilities list
Develop data ingestion and transformation jobs using AWS Glue
Manage secure, compliant data access using AWS Lake Formation
Maintain and optimize AWS Glue Data Catalog for metadata, schema, and table management
Work with analytics teams to publish datasets for BI and dashboards
Build and support visualizations using Amazon QuickSight
Ensure data quality, performance, and reliability across all pipelines
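For illustration only, the sketch below shows a minimal Apache Airflow 2.x DAG of the kind this role would own: two ordered tasks, a daily schedule and simple retry settings. The DAG id, task names, callables and settings are hypothetical placeholders and not part of this job description; the Spark/Glue calls are stubbed out.

# Illustrative sketch only: a minimal Airflow 2.x DAG with two ordered tasks.
# All names, paths and settings below are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_s3(**context):
    # Placeholder: land raw source data in S3 (e.g. via boto3 or a Glue job).
    pass


def transform_with_spark(**context):
    # Placeholder: submit the PySpark/Glue transformation step.
    pass


with DAG(
    dag_id="daily_orders_etl",          # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",         # run once per day
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    tags=["etl", "aws"],
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    transform = PythonOperator(task_id="transform_with_spark", python_callable=transform_with_spark)

    extract >> transform                # extract first, then transform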
Skills & Experience
Strong hands-on experience with PySpark for large-scale data processing; a small PySpark sketch follows this list
Deep knowledge of Airflow DAGs, operators, sensors, and CI/CD integration
Expertise in AWS Glue (ETL jobs, crawlers, Glue Studio, Glue Job Bookmarks)
Experience with Lake Formation permissions, governance, and data lakes
Familiarity with Glue Data Catalog for metadata management
Ability to build dashboards in Amazon QuickSight
Understanding of data modeling, partitioning, and performance optimization
Experience with S3, Athena, Redshift, or EMR
Knowledge of Python-based automation and testing
Exposure to cloud-native DevOps (IaC, Terraform/CloudFormation)
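For illustration only, a small PySpark job of the sort described above is sketched below: it reads raw data from S3, applies basic cleansing, and writes date-partitioned Parquet so that Athena or Glue Data Catalog queries can prune partitions. Bucket names, paths and column names are hypothetical placeholders.

# Illustrative sketch only: bucket names, paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_daily_transform").getOrCreate()

# Read raw JSON landed in S3 by an upstream ingestion step.
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Basic cleansing plus a derived column used for partitioning.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_timestamp"))
)

# Write date-partitioned Parquet so downstream Athena/Glue queries prune by date.
(
    cleaned.write.mode("overwrite")
           .partitionBy("order_date")
           .parquet("s3://example-curated-bucket/orders/")
)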

