Photon Jobs

QA Lead (Automation + Performance) - Dallas, TX

Photon

Reposted 21 Hours Ago

In-Office or Remote

Hiring Remotely in United States

Expert/Leader

Lead QA Automation for agentic AI products by designing eval pipelines, golden datasets, and automated tests for tool-use, hallucination detection, latency/token monitoring, and regression across models and prompts. Integrate performance testing into CI/CD and collaborate with AI engineers to convert requirements into measurable automated evaluations.

The summary above was generated by AI

We are seeking a QA Automation Lead who is ready to move beyond traditional "Pass/Fail" testing. In this role, you will design and build automation frameworks specifically for Agentic AI products. You will focus on evaluating the performance of autonomous agents, ensuring they follow logical reasoning paths, call the correct tools, and provide accurate, safe outputs.

Your mission is to build the "evaluations" (Evals) that define what high-quality AI behavior looks like, moving the needle from unpredictable experiments to production-grade software.

Key Responsibilities

Non-Deterministic Testing: Develop automation strategies for probabilistic outputs, using model-based evaluation to "test the tester."
Building "Eval" Pipelines: Create and maintain "Golden Datasets" to benchmark agent performance across different versions of prompts and models.
Tool-Use Validation: Build automated tests to verify that agents call the correct functions/APIs with the right parameters in complex multi-step workflows.
Regression Testing for Prompts: Monitor how subtle changes in prompt engineering or model updates (e.g., moving from GPT-4 to Claude 3.5) affect the product’s reliability.
Latency & Token Monitoring: Integrate performance testing into the CI/CD pipeline to track agent reasoning time and cost-efficiency.
Hallucination Detection: Develop automated checks to identify and report AI hallucinations, bias, or "jailbreak" attempts.
Collaboration: Work closely with AI Engineers to translate "vague" business requirements into measurable, automated test cases.

Required Skills & Qualifications

Experience: 10+ years in QA Automation, with a recent focus on AI/ML or LLM-based applications.
Python Proficiency: Expert-level Python skills (the industry standard for AI testing) and experience with testing frameworks like Pytest.
AI Testing Tools: Familiarity with AI evaluation frameworks such as LangSmith, DeepEval, RAGAS, or Promptfoo.
API & Backend Testing: Deep experience with Playwright, Selenium, or Cypress for UI testing, combined with a heavy focus on API-level testing and database validation.
Statistical Mindset: Understanding that AI testing often requires "scoring" (e.g., 85% accuracy) rather than a simple binary pass/fail.
Data Skills: Ability to work with SQL and JSON to validate data retrieved by agents during RAG (Retrieval-Augmented Generation) processes.

Preferred Qualifications

Experience testing Multi-Agent Systems (where one agent tests another).
Knowledge of Prompt Engineering and how it influences software behavior.
Background in Investment Banking or Fintech (if applicable), to understand the data accuracy that high-stakes domains demand.

Compensation, Benefits and Duration

Minimum Compensation: USD 38,000
Maximum Compensation: USD 133,000
Compensation is based on the candidate's actual experience and qualifications. The above is a reasonable, good-faith estimate for the role.
Medical, vision, and dental benefits, a 401(k) retirement plan, variable pay/incentives, paid time off, and paid holidays are available for full-time employees.
This position is not available for independent contractors.
No applications will be considered if received more than 120 days after the date of this post.
