We are seeking a Data Engineer with 2 to 4 years of experience to design, build, and maintain scalable data pipelines using Databricks and cloud‑based data platforms. The ideal candidate will have hands‑on experience with Databricks Lakehouse architecture, building reliable ETL/ELT pipelines, and enabling analytics and data science use cases across the organization.
Your role will also include overseeing, supervising, and reviewing tasks performed by team members to ensure effective execution of work; managing end-to-end processes and projects for both internal and external clients with responsibility for timely and accurate delivery; issuing clear instructions and guidance to team members on assigned tasks; and mentoring and guiding junior colleagues to support their skill development, professional growth, and overall success.
Responsibilities
- Design, develop, and maintain data pipelines using Databricks (PySpark / Spark SQL)
- Implement and optimize ETL/ELT workflows using Databricks jobs, notebooks, and workflows
- Build and manage Delta Lake tables, ensuring data reliability, performance, and ACID compliance
- Develop and optimize data models for analytics, BI, and downstream consumption
- Work with batch and streaming data processing using Spark Structured Streaming (where applicable)
- Collaborate with data scientists, analysts, and product teams to deliver trusted datasets
- Ensure data quality, validation, and monitoring across pipelines
- Optimize Spark jobs for cost and performance (partitioning, caching, tuning)
- Follow best practices for code versioning, documentation, and deployment
- Support production workloads and assist with troubleshooting data issues
Requirements
- 2–4 years of professional experience as a Data Engineer
- Strong hands‑on experience with the Databricks platform
- Proficiency in Python (PySpark) and Spark SQL
- Solid experience with Delta Lake, including MERGE operations and time travel
- Strong SQL skills for data transformation and analysis
- Experience with cloud data storage (AWS S3 / Azure Data Lake / GCP Cloud Storage)
- Understanding of data warehousing and lakehouse concepts
- Experience with ETL orchestration tools (Databricks Workflows, Airflow, Azure Data Factory, etc.)
- Familiarity with Git and version control practices
Good to Have
- Experience with streaming technologies (Kafka, Event Hubs, Kinesis)
- Exposure to dbt, Unity Catalog, or Databricks governance features
- Knowledge of cloud security, IAM, and cost optimization
- Experience supporting BI tools (Power BI, Tableau, Looker)
- Understanding of data science or ML workflows on Databricks
- Experience working in Agile/Scrum teams


