Job Description Summary
Designs, develops, tests, debugs, and implements complex operating system components, software tools, and utilities with full competency. Coordinates with users to determine requirements. Reviews systems under development and related documentation. Makes complex modifications to existing software to fit specialized needs and configurations, and maintains program libraries and technical documentation. May coordinate activities of the project team and assist in monitoring project schedules and costs.
ESSENTIAL DUTIES AND RESPONSIBILITIES
- Lead and manage the configuration, maintenance, and support of a portfolio of AI models and related products.
- Manage model delivery to the production deployment team and coordinate production deployments of models.
- Analyze complex data requirements, perform exploratory data analysis, and design solutions that meet business needs.
- Work with the development team to analyze data profiles, transformations, quality, and security, building and enhancing data pipelines while maintaining proper quality and controls around data sets.
- Work closely with cross-functional teams, including business analysts, data engineers, and domain experts.
- Understand business requirements and translate them into technical solutions.
- Understand and review business use cases for Data Lake pipelines, including ingestion, transformation, and storage in the Lakehouse.
- Present architecture and solutions to executive-level stakeholders.
MINIMUM QUALIFICATIONS
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Minimum of 5 years' experience building data pipelines for both structured and unstructured data.
- At least 2 years' experience in Azure data pipeline development.
- 3 or more years' experience with Hadoop, Azure Databricks, Stream Analytics, Event Hubs, Kafka, and Flink preferred.
- Strong proficiency in Python and SQL.
- Experience with big data technologies (Spark, Hadoop, Kafka).
- Familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn).
- Knowledge of model serving technologies (TensorFlow Serving, MLflow, Kubeflow) a plus.
- Experience with one of the major cloud platforms (Azure preferred) and its data services; understanding of ML services preferred.
- Understanding of containerization and orchestration (Docker, Kubernetes).
- Experience with data versioning and ML experiment tracking a plus.
- Knowledge of distributed computing principles.
- Familiarity with DevOps practices and CI/CD pipelines.
PREFERRED QUALIFICATIONS
- Bachelor's degree in Computer Science or equivalent work experience.
- Experience with Agile/Scrum methodology.
- Experience in the tax and accounting domain a plus.
- Azure Data Scientist certification a plus.
Applicants may be required to appear onsite at a Wolters Kluwer office as part of the recruitment process.