Apiphany

Associate Data Scientist

Posted 25 Days Ago
Remote
Hiring Remotely in India
Junior
Prepare, clean, and validate structured and unstructured data for LLM-driven systems; build training datasets, support RAG and NL→SQL pipelines, perform data quality checks, and assist in data pipelines/APIs and model evaluation.
The summary above was generated by AI
Role Overview

We are seeking an Associate Data Scientist to support AI/ML engineering efforts by preparing, validating, and structuring data for LLM-driven systems. This is a hands-on role focused on real-world data processing, pipeline support, and model evaluation.

Key Responsibilities
  • Process and clean structured and unstructured data for AI/ML pipelines.

  • Prepare training-ready datasets for LLM fine-tuning and evaluation workflows.

  • Support RAG and NL→SQL systems through data preparation and validation.

  • Perform data quality checks and ensure completeness and consistency.

  • Assist in building and maintaining data pipelines and APIs (e.g., FastAPI).

  • Collaborate with engineering teams to troubleshoot and optimize data workflows.

Required Skills
  • 2+ years of experience in data processing or data-focused roles.

  • Strong Python skills with experience in data libraries (Pandas, NumPy, Scikit-learn).

  • Experience supporting LLM workflows (fine-tuning, prompt engineering, evaluation).

  • Familiarity with structured (SQL) and unstructured text data.

  • Understanding of data preparation for AI/ML systems.

Nice to Have
  • Exposure to RAG pipelines, embeddings, or evaluation metrics.

  • Experience with ML frameworks (PyTorch/TensorFlow) and Docker-based workflows.

  • Experience with CI/CD pipelines for ML systems.

  • Familiarity with vector databases (e.g., Chroma) and reranking techniques.

  • Research exposure to Transformer-based architectures.

Note: This position is open to candidates residing in India only.

Top Skills

FastAPI
LLMs
NumPy
Pandas
Python
Scikit-learn
SQL

