As a Data Architect, you will design modern data architectures on the Databricks platform, oversee data migration initiatives, implement CI/CD practices, and ensure compliance with industry standards.
Overview
We are seeking an experienced Data Architect with extensive expertise in designing and implementing modern data architectures. This role requires a strong grounding in software engineering principles, hands-on coding ability, and experience building data engineering frameworks. The ideal candidate will have a proven track record of implementing Databricks-based solutions in the healthcare industry, with expertise in data catalog implementation and governance frameworks.
About the Role
As a Data Architect, you will be responsible for designing and implementing scalable, secure, and efficient data architectures on the Databricks platform. You will lead the technical design of data migration initiatives from legacy systems to a modern Lakehouse architecture, ensuring alignment with business requirements, industry best practices, and regulatory compliance.
Key Responsibilities
- Design and implement modern data architectures using the Databricks Lakehouse platform
- Lead the technical design of Data Warehouse/Data Lake migration initiatives from legacy systems
- Develop data engineering frameworks and reusable components to accelerate delivery
- Establish CI/CD pipelines and infrastructure-as-code practices for data solutions
- Implement data catalog solutions and governance frameworks
- Create technical specifications and architecture documentation
- Provide technical leadership to data engineering teams
- Collaborate with cross-functional teams to ensure alignment of data solutions
- Evaluate and recommend technologies, tools, and approaches for data initiatives
- Ensure data architectures meet security, compliance, and performance requirements
- Mentor junior team members on data architecture best practices
- Stay current with emerging technologies and industry trends
Qualifications
- Extensive experience in data architecture design and implementation
- Strong software engineering background with expertise in Python or Scala
- Proven experience building data engineering frameworks and reusable components
- Experience implementing CI/CD pipelines for data solutions
- Expertise in infrastructure-as-code and automation
- Experience implementing data catalog solutions and governance frameworks
- Deep understanding of the Databricks platform and Lakehouse architecture
- Experience migrating workloads from legacy systems to modern data platforms
- Strong knowledge of healthcare data requirements and regulations
- Experience with cloud platforms (AWS, Azure, GCP) and their data services
- Bachelor's degree in Computer Science, Information Systems, or a related field; advanced degree preferred
Technical Skills
- Programming languages: Python and/or Scala (required)
- Data processing frameworks: Apache Spark, Delta Lake
- CI/CD tools: Jenkins, GitHub Actions, Azure DevOps
- Infrastructure-as-code (optional): Terraform, CloudFormation, Pulumi
- Data catalog tools: Databricks Unity Catalog, Collibra, Alation
- Data governance frameworks and methodologies
- Data modeling and design patterns
- API design and development
- Cloud platforms: AWS, Azure, GCP
- Container technologies: Docker, Kubernetes
- Version control systems: Git
- SQL and NoSQL databases
- Data quality and testing frameworks
Healthcare Industry Knowledge (optional)
- Healthcare data standards (HL7, FHIR, etc.)
- Clinical and operational data models
- Healthcare interoperability requirements
- Healthcare analytics use cases