Capco

Data Architect (Polish is Mandatory) (She/He/They)

Posted 7 Hours Ago
Remote or Hybrid
Hiring Remotely in Poland
Senior level
Design and govern the Knowledge Management data layer: information models, taxonomies, ingestion pipelines, PII and access controls, Azure AI Search indexing, and integrations with SharePoint, GCP BigQuery/Looker and other data sources to ensure high data quality for AI agents.
The summary above was generated by AI

CAPCO POLAND

DATA ARCHITECT

Location: Warsaw, Poland

Preferred work model: 3 days per week from the office

 

At Capco Poland, we’re not just another consultancy - we’re the spark behind digital transformation in the financial world. As a global leader in technology and management consulting, we thrive on helping clients tackle the toughest challenges across banking, payments, capital markets, wealth, and asset management.
Our secret?
A culture that’s fast, flexible, and fiercely entrepreneurial. We move quickly, think creatively, and always put our people first.
We’re passionate about growth - both for our clients and ourselves - and that means attracting the very best talent to join us on this exciting journey.
We’re proud to be:
• Trailblazers in banking, payments, capital markets, wealth, and asset management
• Champions of an agile, nimble, and innovative work environment
• Dedicated to building a team of top-notch professionals who share our drive and vision
 

ROLE OVERVIEW


We're hiring a Data Architect who will be responsible for designing, developing, and overseeing the data layer within the Knowledge Management (KM) ecosystem. This includes documents, metadata, taxonomies, indexes, PII policies, and ingestion processes.

The mission of the role is to ensure the completeness, consistency, security, and consistently high quality of the data used by the KM Master Agent and domain agents.

The Data Architect owns the information model and defines data integration standards across SharePoint Online, Azure AI Search / Knowledge Bases, and GCP (BigQuery/Looker), ensuring a well-structured and reliable data foundation for the entire KM platform.

Fluency in Polish is mandatory.

RESPONSIBILITIES


  • Designing information models for unstructured documents, including content, metadata, and related artifacts.
  • Defining taxonomies and dictionaries: keywords, ontologies, business categories, and knowledge cataloging rules.
  • Designing index and Knowledge Base strategies in Azure AI Search, including chunking, filtering, scoring, and scope policies.
  • Defining data retention and lifecycle policies: storage, archiving, anonymization, deletion, and versioning.
  • Designing Data Contracts for AI agents to ensure consistency between the Master Agent and domain agents.
  • Building ingestion pipelines (automated and ad-hoc): processing PL/EN documents, OCR, multimodal inputs, table extraction, confidence scoring, and metadata enrichment.
  • Ensuring PII compliance: masking, anonymization, and exclusion policies, including deciding whether each is applied at the ingestion or the retrieval stage.
  • Designing authorization models: document/fragment-level access control integrated with Entra ID (roles, user/group scopes).
  • Managing data sources: integration with SharePoint Online, Azure, GCP BigQuery/Looker, and potentially graph databases for document relationships.
  • Monitoring data quality: metadata completeness, language variants, OCR accuracy, missing permissions, and anomaly detection.
  • Designing repositories for processing artifacts (e.g., OCR tables, confidence scores, extracted insights) and defining their indexing rules.

SKILLS & EXPERIENCE TO GET THE JOB DONE


  • Strong experience in data and metadata modeling, including unstructured documents and analytical datasets.
  • Very good knowledge of SharePoint Online / M365 (data structures, metadata, integrations).
  • Hands-on experience with Azure AI Search / Knowledge Bases / Dataverse: index creation, scoring profiles, filtering, and semantic search.
  • Experience with GCP BigQuery / Looker (or equivalent such as Microsoft Fabric or Synapse): data analysis, aggregations, semantic models.
  • Experience with PII, banking regulations, compliance, access control, and data masking.
  • Experience with Graph API / REST API integrations, including metadata retrieval and updates.
  • Hands-on experience with OCR, Document Intelligence, or Vision AI solutions.
  • Knowledge of ingestion standards, chunking strategies, data extraction, and ETL/ELT pipelines.
  • Experience working with multilingual document environments (PL/EN).
  • Understanding of RBAC/ABAC models, Entra ID, and user/group-based access scopes.
  • Knowledge of data quality and Data Governance practices (profiling, lineage, data contracts).
  • Ability to design data architectures for multi-agent AI systems within a Knowledge Management environment.

Nice to have:


  • Experience with graph databases (Neo4j, Cosmos DB Graph, Amazon Neptune).
  • Experience with DataHub, Apache Atlas, or Microsoft Purview.
  • Experience using GitHub Actions / CI/CD for metadata schemas and data model deployments.
  • Knowledge of MCP for data tools and agent-to-data pipelines.
  • Experience with Azure Functions, Durable Functions, or Logic Apps.
  • Experience working with Lakehouse architectures (Microsoft Fabric, Databricks).

IMPORTANT

  • Fluent Polish (spoken and written) is mandatory.
  • Good command of English for documentation and collaboration.
  • Availability for hybrid work: 3 days per week in the office, the rest remote.

ONLINE RECRUITMENT PROCESS STEPS

  • Screening call with the Recruiter
  • Hiring Manager Technical Interview
  • Client stage
  • Feedback/Offer

We offer a flexible collaboration model based on a B2B contract, with the opportunity to work on diverse projects.

Top Skills

SharePoint Online, Microsoft 365, Azure AI Search, Azure Knowledge Bases, Dataverse, Google BigQuery, Looker, Microsoft Fabric, Azure Synapse, Graph API, REST API, OCR, Document Intelligence, Vision AI, Entra ID, RBAC, ABAC, ETL, ELT, Semantic Search
