About ProCogia:
At the core of our culture is a commitment to equality throughout the company. Our diverse backgrounds and perspectives allow us to create innovative and effective data solutions for our clients.
Our Core Values: Trust, Growth, Innovation, Excellence, and Ownership
Location: India (Remote)
Working Hours: 12pm to 9pm IST
Job Description:
We are seeking a Senior Data Engineer with strong expertise in SQL, Python, and PySpark, along with hands-on experience in AWS tools such as CDK, Glue, Lambda, and Kinesis. This role will focus on building scalable, event-driven data pipelines and real-time streaming solutions within a cloud-native environment. The ideal candidate brings a deep understanding of software development best practices, CI/CD, and application resiliency. You will collaborate with cross-functional teams to ensure high-quality, reliable, and secure data infrastructure.
Key Responsibilities:
- Apply experience across AWS cloud services to build scalable data pipelines and data engineering workflows.
- Design and develop robust data pipelines using SQL, Python, and PySpark.
- Build and manage scalable, cloud-native solutions leveraging AWS services such as CDK, Glue, S3, Lambda, EventBridge, Kinesis, CloudWatch, IAM, SNS, and SQS, as well as Kafka.
- Implement real-time data streaming and event-driven architectures to support low-latency data processing.
- Adhere to software development lifecycle (SDLC) standards and best practices, ensuring high code quality and reliability.
- Collaborate using GitLab for version control, code reviews, and CI/CD pipeline management.
- Apply Test Driven Development (TDD) principles and maintain continuous integration and deployment workflows.
- Design and implement fault-tolerant data pipelines with built-in retries, monitoring, and support for high availability in distributed cloud environments.
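To illustrate the fault-tolerance responsibility above, here is a minimal sketch of a retry-with-exponential-backoff wrapper for a pipeline step, using only the Python standard library. All names here (`with_retries`, `load_batch`) are hypothetical examples, not part of any ProCogia codebase; a production version would typically also emit metrics to CloudWatch.

```python
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def with_retries(max_attempts=3, base_delay=0.1):
    """Retry a flaky pipeline step with exponential backoff, logging each failure."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    if attempt == max_attempts:
                        log.error("%s failed after %d attempts", fn.__name__, attempt)
                        raise
                    log.warning("%s failed (attempt %d/%d): %s",
                                fn.__name__, attempt, max_attempts, exc)
                    # Back off 0.1s, 0.2s, 0.4s, ... before retrying.
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator

@with_retries(max_attempts=3)
def load_batch(records):
    # Hypothetical step: a real version might write to S3 or put records on Kinesis.
    if not records:
        raise ValueError("empty batch")
    return len(records)
```

The decorator keeps retry policy separate from business logic, so the same wrapper can guard any transient-failure-prone step (an S3 write, a Kinesis put) without duplicating the backoff code.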
Required Skills:
- 5+ years of experience in data engineering with cloud platforms, with 2+ years of hands-on experience in the AWS ecosystem.
- Proficiency in Python, SQL, and PySpark for building scalable data pipelines.
- Strong experience with AWS services including Glue, Lambda, CDK, S3, Kinesis, EventBridge, and IAM.
- Hands-on experience with GitLab CI/CD pipelines, including artifact scanning, deployment automation, and API integration.
- Solid understanding of real-time streaming, event-driven architectures, and distributed data systems (e.g., Kafka, Kinesis).
- Familiarity with Test-Driven Development (TDD), Continuous Integration/Continuous Deployment (CI/CD), and SDLC best practices.
- Ability to design fault-tolerant pipelines with logging, retries, and monitoring for high availability.
- Strong attention to detail and experience working in consulting or client-facing environments.
Preferred Qualifications:
- Experience with Apache Iceberg or other open table formats (e.g., Delta Lake, Hudi).
- Exposure to infrastructure as code (IaC) using AWS CDK in TypeScript or Python.
- Understanding of cost optimization and data governance in AWS environments.
What We Offer:
- Competitive compensation and benefits
- Opportunity to work on cutting-edge projects with enterprise clients
- Collaborative, growth-oriented team culture
- Access to cloud certifications, training, and hands-on learning
ProCogia is proud to be an equal-opportunity employer. We are committed to creating a diverse and inclusive workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or any other legally protected status.