Capco

Tester AI QA (Polish is Mandatory) (She/He/They)

Remote or Hybrid

Hiring Remotely in Poland

Entry level

QA / AI QA Tester responsible for designing test suites and evaluation datasets for LLM/agent systems, performing response quality, PII/compliance, guardrail, performance, integration and UX testing, analyzing telemetry, documenting results, and supporting AI/domain teams on defects and data drift.

CAPCO POLAND

Location: Warsaw, Poland

Preferred work model: 3 days per week from the office

At Capco Poland, we’re not just another consultancy - we’re the spark behind digital transformation in the financial world. As a global leader in technology and management consulting, we thrive on helping clients tackle the toughest challenges across banking, payments, capital markets, wealth, and asset management.

Our secret?

A culture that’s fast, flexible, and fiercely entrepreneurial. We move quickly, think creatively, and always put our people first.

We’re passionate about growth - both for our clients and ourselves - and that means attracting the very best talent to join us on this exciting journey.

We’re proud to be:
• Trailblazers in banking, payments, capital markets, wealth, and asset management
• Champions of an agile, nimble, and innovative work environment
• Dedicated to building a team of top-notch professionals who share our drive and vision

ROLE OVERVIEW

We’re looking for a detail-oriented QA / AI QA Tester with experience in testing LLM- or agent-based systems (or strong QA experience with a focus on AI), who brings a structured approach to designing test cases and evaluation datasets, understands AI quality metrics, and is passionate about improving the reliability, stability, and overall quality of enterprise AI solutions.

Fluency in Polish is mandatory.

KEY RESPONSIBILITIES:

Design and maintain business test suites (functional, scenario-based, regression) for the Master Agent and domain agents.
Build evaluation datasets (PL/EN, domain-specific), including positive/negative queries, edge cases, and out-of-scope scenarios.
Perform response quality evaluation using metrics such as:
  - Accuracy
  - Top-k recall
  - Groundedness
  - Hallucination rate
  - Refusal policy compliance
Conduct PII and compliance testing: validation of masking, anonymization, and sensitive data handling.
Test guardrails, including:
  - Undesired output handling
  - Prompt security testing
  - “I don’t know” policy enforcement
Perform performance and resilience testing: latency, SLA compliance, pipeline stability.
Validate conversational UX (conversation flow, intent recognition, fallback handling, language detection).
Test integrations with:
  - Copilot Studio
  - Azure AI Search
  - Azure OpenAI / Foundry
  - Document Intelligence
  - SharePoint Online
Analyze logs and telemetry (App Insights, Log Analytics) and identify anomalies.
Document test results and recommendations, and ensure traceability of test cases.
Support AI and domain teams in diagnosing defects, data drift, and quality regression.
Participate in periodic knowledge quality reviews and verify compliance with KM governance rules.

KEY TECHNOLOGIES USED BY THE TEAM:

Copilot Studio (knowledge agents)
Azure AI Search, Azure OpenAI / Foundry
Document Intelligence (OCR, table extraction)
SharePoint Online (knowledge sources)
App Insights + Log Analytics (telemetry)
Python (pandas, requests)
GitHub Actions (CI/CD)
BigQuery / Looker (analytics)

SKILLS & EXPERIENCES TO GET THE JOB DONE:

Experience in testing LLM-based or agent-based systems, or classical QA experience with a strong interest in transitioning to AI QA.
Ability to design business scenarios, test cases, and evaluation datasets.
Basic Python skills (pandas, REST APIs, simple evaluation scripts).
Familiarity with Copilot Studio and integration with domain agents.
Basic knowledge of Azure AI Search, SharePoint Online, and Document Intelligence (ability to interpret OCR/DI outputs).
Understanding of automated evaluation methods (LLM scoring, auxiliary models, benchmark evaluation).

Nice to have:

Experience with multicloud testing (GCP BigQuery/Looker, Azure, optionally Fabric).
Experience with Document Intelligence in the context of OCR and table extraction quality assessment.
Experience working with GitHub Actions (CI) and automated testing pipelines.
Basic understanding of the Model Context Protocol (MCP) in agent-based systems.
Experience in data drift analysis and automated evaluation frameworks.

IMPORTANT

Fluent Polish (spoken and written) – mandatory.
Good command of English for documentation and collaboration.
Availability to work in a hybrid model: 3 days per week on-site at the Warsaw office, with the remaining days remote.

ONLINE RECRUITMENT PROCESS STEPS

Screening call with the Recruiter
Hiring Manager Technical Interview
Client Interview
Feedback/Offer

We offer a flexible collaboration model based on a B2B contract, with the opportunity to work on diverse projects.

Top Skills

Copilot Studio, Azure AI Search, Azure OpenAI, Foundry, Document Intelligence, SharePoint Online, Application Insights, Log Analytics, Python, pandas, requests, REST APIs, GitHub Actions, BigQuery, Looker, OCR
