The Staff Software Engineer will design, develop, and maintain data ingestion pipelines, ensuring data quality and performance while integrating various data sources.
The Staff Software Engineer, Data Ingestion will be a critical individual contributor responsible for designing collection strategies and for developing and maintaining robust, scalable data pipelines. This role is at the heart of our data ecosystem, delivering new analytical software solutions that provide access to timely, accurate, and complete data for insights, products, and operational efficiency.
Key Responsibilities
- Design, develop, and maintain high-performance, fault-tolerant data ingestion pipelines using Python.
- Integrate with diverse data sources (databases, APIs, streaming platforms, cloud storage, etc.).
- Implement data transformation and cleansing logic during ingestion to ensure data quality.
- Monitor and troubleshoot data ingestion pipelines, identifying and resolving issues promptly.
- Collaborate with database engineers to optimize data models for fast consumption.
- Evaluate and propose new technologies or frameworks to improve ingestion efficiency and reliability.
- Develop and implement self-healing mechanisms for data pipelines to ensure continuity.
- Define and uphold SLAs and SLOs for data freshness, completeness, and availability.
- Participate in the on-call rotation as needed for critical data pipeline issues.
Required Skills
- 6+ years of experience in the software development industry, with a computer science background.
- Extensive Python Expertise: Extensive experience in developing robust, production-grade applications with Python.
- Data Collection & Integration: Proven experience collecting data from various sources (REST APIs, OAuth, GraphQL, Kafka, S3, SFTP, etc.).
- Distributed Systems & Scalability: Strong understanding of distributed systems concepts, designing for scale, performance optimization, and fault tolerance.
- Cloud Platforms: Experience with major cloud providers (AWS or GCP) and their data-related services (e.g., S3, EC2, Lambda, SQS, Kafka, Cloud Storage, GKE).
- Database Fundamentals: Solid understanding of relational databases (SQL, schema design, indexing, query optimization). OLAP database experience (e.g., Hadoop) is a plus.
- Monitoring & Alerting: Experience with monitoring tools (e.g., Prometheus, Grafana) and setting up effective alerts.
- Version Control: Proficiency with Git.
- Containerization (Plus): Experience with Docker and Kubernetes.
- Streaming Technologies (Plus): Experience with real-time data processing using Kafka, Flink, or Spark Streaming.
Top Skills
AWS
Docker
Flink
GCP
Grafana
Kafka
Kubernetes
Prometheus
Python
Spark Streaming
SQL