Checkmate (itsacheckmate.com)

Prompt Engineer - Data Science & Quality Analysis - India

Posted 2 Days Ago

Remote

Hiring Remotely in India

Mid level

The job involves designing and evaluating prompts for AI systems, conducting data-driven analysis, leading a team, and collaborating across various departments to optimize AI performance.

The summary above was generated by AI

Description

Checkmate is building advanced Voice AI systems for some of the largest restaurant and retail brands in the US, including several in the top 10. Unlike many companies still in the prototype phase, our AI solutions are live in production with real customers, achieving over 80% accuracy. This is a $1 billion market opportunity, and we’re scaling to 3,000+ stores by the end of this year.

Join us at this pivotal moment to shape AI products used daily by thousands of staff and customers, driving measurable impact at scale.

Requirements

Prompt Design & Evaluation - Develop, test, and refine prompts for tasks such as text generation, question answering, data classification, and structured data extraction to optimize Voice AI performance.
Data-Driven Analysis & Quality Measurement - Design evaluation frameworks and analyze prompt outputs using quantitative metrics, human-in-the-loop evaluation, and user feedback to identify improvement opportunities.
Experimentation & Iteration - Conduct experiments to test prompt variations, measure their business and operational impact, and iterate to enhance accuracy, consistency, and safety.
Regression Testing & Compliance - Build principled regression test suites using tools like Langfuse and Galileo to ensure prompts remain compliant and high-performing as models and use cases evolve.
Collaboration Across Teams - Work closely with data science, product, legal, engineering, and operations teams to align prompt designs with business goals, operational workflows, and compliance requirements.
Model Adaptation & Strategy - Develop prompts across multiple LLMs (GPT, LLaMA, Gemini, and Checkmate’s fine-tuned models), understanding model differences to optimize outputs effectively.
Team Leadership & Mentorship - Lead a team of analysts focused on prompt evaluation and data quality analysis, guiding prioritization, experimentation, and reporting. Collaborate with ops teams for seamless deployment and feedback loops.
Research & Continuous Learning - Stay up to date on emerging prompting techniques, LLM behaviors, evaluation frameworks, and AI safety practices to keep Checkmate’s AI solutions best-in-class.

Minimum Qualifications

Strong analytical and data science skills, with hands-on experience in Python (pandas, NumPy, scikit-learn)
Experience designing and conducting experiments and evaluations in applied AI or NLP contexts
Proficiency in SQL and working with relational databases (e.g. MySQL, PostgreSQL, Oracle, MS SQL)
Good understanding of data processing, quality measurement, and testing fundamentals
Experience leading analyst or operations teams, with strong prioritization, mentorship, and collaboration skills
Strong problem-solving mindset with a drive to explore, optimize, and automate workflows
Excellent communication skills for presenting insights to technical and non-technical stakeholders
Bachelor’s degree in Data Science, Computer Science, Statistics, Engineering, or a related field
Flexibility to work US hours until at least 6 p.m. ET, with a reliable remote work setup

Preferred Qualifications

Experience with LLM evaluation and prompt engineering workflows
Familiarity with tools like Langfuse and Galileo for prompt evaluation and analysis
Knowledge of cloud platforms (AWS, GCP, Azure) and data pipeline tools
Familiarity with machine learning concepts and NLP workflows
Master’s or PhD in Data Science, Statistics, Computer Science, or a related field

Top Skills

AWS

Azure

Galileo

GCP

Langfuse

MS SQL

MySQL

Oracle

Postgres

Python

SQL
