Design, build, and operate AI-ready data platforms and scalable pipelines (batch, streaming, real-time) to support model training, inference, RAG, semantic search, and enterprise AI. Implement lakehouse/lake/warehouse architectures, data governance, security controls, DataOps/CI-CD, observability, and feature/embedding pipelines. Collaborate with AI/ML teams to enable trusted, performant, and compliant data products.
Requisition Number: 2369782
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
We're looking for a hands-on Senior Data Engineer - AI Platforms to build and scale AI-ready data platforms that power AI/ML, Generative AI, Agentic AI, analytics, and intelligent enterprise applications. This role focuses on engineering modern data platforms, data products, semantic foundations, and scalable data pipelines that enable AI systems to consume trusted, governed, and context-rich data.
The ideal candidate brings solid expertise in data engineering, distributed processing, modern cloud data platforms, and AI-centric data foundations. You will work closely with Data Architects, AI/ML Engineers, Applied Scientists, and Platform Engineers to deliver data platforms that support model training, inference, RAG, semantic retrieval, and enterprise AI applications.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health, which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and to enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
Primary Responsibilities:
AI Data Platform Engineering
- Build and enhance AI-ready data platforms supporting AI/ML, Generative AI, Agentic AI, analytics, and operational workloads
- Develop scalable data pipelines spanning:
  - Data ingestion
  - Data transformation
  - Data processing
  - Data serving
  - Data consumption
- Implement modern data architectures using:
  - Lakehouse
  - Data Lake
  - Data Warehouse
  - Medallion Architecture (Bronze, Silver, Gold)
- Support data platforms that enable model training, inference, feature engineering, RAG, and enterprise AI applications
Data Engineering & Processing
- Develop high-performance pipelines supporting structured, semi-structured, and unstructured data
- Build batch, streaming, and real-time processing solutions using modern distributed data technologies
- Implement scalable data processing frameworks utilizing:
  - Apache Spark
  - PySpark
  - Kafka
  - Cloud-native data services
- Optimize data storage, partitioning, indexing, and query performance for scalability and cost efficiency
- Implement resilient data processing patterns including checkpointing, retries, recovery mechanisms, and data validation
AI Data Foundations
- Build and maintain AI-ready datasets, feature pipelines, and data products
- Develop embedding generation pipelines and vectorized data preparation workflows
- Support semantic search, retrieval, and RAG use cases through efficient data engineering practices
- Enable AI data readiness through:
  - Data quality management
  - Feature engineering
  - Data enrichment
  - Metadata management
  - Semantic indexing
- Contribute to building semantic data layers that provide business context and improve AI consumption of enterprise data
Data Governance & Security
- Implement data governance standards covering:
  - Metadata management
  - Data lineage
  - Data quality
  - Data cataloging
  - Data stewardship
- Support compliance with HIPAA, GDPR, PII protection, and enterprise governance standards
- Implement secure data access controls using:
  - RBAC
  - Encryption
  - Data masking
  - Auditing
- Ensure data platforms meet security, privacy, and regulatory requirements
Platform Reliability & Operational Excellence
- Implement monitoring, logging, lineage tracking, alerting, and operational dashboards for data platforms
- Support platform scalability, reliability, performance, and operational efficiency
- Contribute to DataOps practices including:
  - CI/CD for data pipelines
  - Automated testing
  - Deployment automation
  - Data observability
- Troubleshoot production issues and support continuous improvement initiatives
Collaboration & Innovation
- Partner with AI/ML Engineers, Data Scientists, Applied Scientists, Architects, and Platform Teams to deliver AI-ready data solutions
- Contribute to reusable engineering frameworks, shared services, and platform accelerators
- Support adoption of emerging technologies across AI data platforms, semantic retrieval, and modern data ecosystems
- Participate in architecture discussions and contribute to enterprise engineering standards
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or a related field
- 8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
- Experience building and operating enterprise-scale data pipelines and data platforms
- Experience implementing modern data architectures including Data Lakes, Lakehouse, Data Warehouses, and Medallion Architecture
- Experience developing data pipelines supporting AI/ML and analytics workloads
- Experience working with structured, semi-structured, and unstructured datasets
- Experience with metadata management, data quality, lineage, and governance practices
- Experience implementing CI/CD, automated testing, and DataOps practices
- Solid experience with:
  - Databricks
  - Snowflake
  - Apache Spark
  - PySpark
  - SQL
  - Python
- Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
- Understanding of security, privacy, and compliance requirements for enterprise data platforms
- Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
- Proven analytical, communication, problem-solving, and collaboration skills
Preferred Qualifications:
- Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
- Experience building AI-ready data platforms that support AI/ML, Generative AI, Agentic AI, and Retrieval-Augmented Generation (RAG) workloads
- Experience developing feature stores, embedding pipelines, semantic indexing solutions, and AI data products
- Experience with vector databases, semantic retrieval platforms, and enterprise search solutions
- Experience implementing batch, streaming, and event-driven architectures using Kafka and related technologies
- Experience working with cloud platforms including Azure, AWS, or GCP
- Experience contributing to reusable data frameworks, platform accelerators, and shared engineering services
- Experience within healthcare, financial services, insurance, banking, or other regulated industries
- Experience mentoring junior engineers and contributing to engineering best practices
- Solid understanding of DataOps, data observability, automated data quality, and platform engineering practices
- Familiarity with semantic layers, knowledge graphs, ontology-driven models, and context-aware data architectures
Technical Stack
- Data Platforms: Databricks, BigQuery, Snowflake, Azure Synapse, Delta Lake
- Processing: Apache Spark, PySpark, Spark SQL
- Streaming: Kafka, Spark Streaming, Event Hub, Kinesis
- Storage: S3, ADLS, GCS, Parquet, ORC
- Databases: PostgreSQL, MySQL, SQL Server, Cosmos DB, NoSQL
- AI Data Layer: Pinecone, ChromaDB, FAISS, embeddings, semantic search, RAG pipelines
- Orchestration: Airflow, Azure Data Factory, dbt
- Programming: Python, SQL
- Cloud: AWS, Azure, GCP
- DevOps: Docker, Kubernetes, CI/CD (Jenkins, GitHub Actions)
- Observability & Governance: Monitoring, logging, lineage, data catalogs
- Security & Compliance: RBAC, IAM, encryption, masking
- Integration: REST APIs, microservices
Optum Pune, Maharashtra, IND Office