Designs, builds, and operates scalable AWS-based batch and streaming data pipelines and platforms (Lakehouse and warehouse). Owns architecture, performance, security, and cost optimization; implements transformations (SQL/dbt, PySpark), Kafka ingestion, Airflow orchestration, and CI/CD/Terraform. Partners with product and platform teams, mentors engineers, and drives data modeling, schema evolution, and operational excellence.
Job Purpose and Impact
The Senior Data Engineer designs, builds, and operates scalable, reliable data products and platforms that power analytics, reporting, and downstream applications. This role owns end‑to‑end delivery of batch and streaming data pipelines on a modern AWS‑based cloud data platform, applying strong engineering patterns to ensure performance, security, observability, and cost efficiency.
With minimal supervision, the role partners closely with product, analytics, and platform teams to translate business requirements into robust technical solutions across a Lakehouse (Iceberg) and approved warehousing platforms (e.g., Snowflake). The Senior Data Engineer also mentors other engineers, drives code quality, and raises the engineering bar across the organization.
Key Accountabilities
Data & Analytical Solutions
- Designs and delivers scalable data products using standard cloud and data engineering architectures.
- Owns technical decisions (batch vs. streaming, Lakehouse vs. warehouse) and ensures solutions meet reliability, security, governance, latency, and cost requirements.
- Reviews designs and contributes reusable components, templates, and standards.
Data Pipelines
- Builds and operates end‑to‑end batch and streaming pipelines.
- Implements transformations using SQL/dbt and PySpark as needed.
- Integrates real‑time or event‑driven ingestion using Kafka.
- Orchestrates workflows with Airflow; establishes SLAs/SLOs and CI/CD‑based deployments.
Data Systems & Architecture
- Optimizes data architectures for performance, scalability, and cost.
- Applies best practices for Iceberg table design, incremental processing, and query optimization across Hive, Impala, Snowflake, and RDBMS.
- Diagnoses systemic issues and drives remediation with platform teams.
Data Infrastructure (AWS)
- Leads technical readiness across dev/test/prod environments.
- Works hands‑on with AWS services including S3, Glue, Lambda, IAM, and SageMaker.
- Partners with governance and platform teams on access control, tagging, and operational support.
Data Modeling & Formats
- Leads modeling across RAW, CURATED, and SERVING layers.
- Applies dimensional or normalized models for correctness, performance, and usability.
- Implements efficient formats (Parquet + Iceberg) with clear schema evolution strategies.
DevOps & CI/CD
- Designs and improves Git‑based CI/CD pipelines and infrastructure‑as‑code using Terraform.
- Ensures quality gates, auditability, and compliance with governance requirements.
Stakeholder & Engineering Leadership
- Partners with product, analytics, and platform teams to align on requirements, data contracts, and SLAs.
- Communicates complex technical topics clearly and leads technical discussions.
- Coaches engineers and raises engineering standards through reviews and documentation.
AI‑First & Product Mindset
- Uses GenAI‑assisted development responsibly to accelerate delivery.
- Builds products, not just pipelines, focusing on usability, adoption, reliability, and lifecycle ownership.
- Designs systems end‑to‑end and continuously optimizes cost‑performance trade‑offs using metrics.
Qualifications
- 8+ years of total experience, including 6+ years in data engineering
- Strong expertise in AWS‑based data engineering and scalable cloud architectures
- Proven experience building end‑to‑end batch and streaming pipelines, including Kafka
- Advanced proficiency in SQL, Hive, Impala, and PostgreSQL / RDBMS
- Strong programming skills in Python and PySpark
- Hands‑on experience with AWS Glue, Lambda, S3, IAM, and SageMaker
- Experience with Snowflake and modern data warehousing
- Expertise in CI/CD, Terraform, and DevOps practices
- Proficiency in Airflow for workflow orchestration
- Experience with Power BI for data visualization and reporting
- Strong foundation in data modeling, performance optimization, and large‑scale data systems