Citi
Data Engineering Lead (Hadoop, Scala, Spark) - VP-C13 - PUNE
The Data Engineering Lead is a strategic professional who stays abreast of developments within their own field and contributes to directional strategy by considering their application in their own job and the business. Recognized technical authority for an area within the business. Requires basic commercial awareness. There are typically multiple people within the business that provide the same level of subject matter expertise. Developed communication and diplomacy skills are required in order to guide, influence and convince others, in particular colleagues in other areas and occasional external customers. Significant impact on the area through complex deliverables. Provides advice and counsel related to the technology or operations of the business. Work impacts an entire area, which eventually affects the overall performance and effectiveness of the sub-function/job family.
Responsibilities:
- Strategic Leadership: Define and execute the data engineering roadmap for Global Wealth Data, aligning with overall business objectives and technology strategy. This includes understanding the data needs of portfolio managers, investment advisors, and other stakeholders in the wealth management ecosystem.
- Team Management: Lead, mentor, and develop a high-performing, globally distributed team of data engineers, fostering a culture of collaboration, innovation, and continuous improvement.
- Architecture and Design: Oversee the design and implementation of robust and scalable data pipelines, data warehouses, and data lakes, ensuring data quality, integrity, and availability for global wealth data. This includes designing solutions for handling large volumes of structured and unstructured data from various sources.
- Technology Selection and Implementation: Evaluate and select appropriate technologies and tools for data engineering, staying abreast of industry best practices and emerging trends specific to wealth management data.
- Performance Optimization: Continuously monitor and optimize data pipelines and infrastructure for performance, scalability, and cost-effectiveness, ensuring optimal access to global wealth data.
- Collaboration: Partner with business stakeholders, data scientists, portfolio managers, and other technology teams to understand data needs and deliver effective solutions that support investment strategies and client reporting.
- Data Governance: Implement and enforce data governance policies and procedures to ensure data quality, security, and compliance with relevant regulations, particularly around sensitive financial data.
Qualifications:
- 10-15 years of hands-on experience in Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, Unix Scripting and other Big data frameworks.
- 4+ years of experience with relational SQL and NoSQL databases: Oracle, MongoDB, HBase
- Strong proficiency in Python, Spark Java, Scala, and SQL, with knowledge of core Spark concepts (RDDs, DataFrames, Spark Streaming, etc.)
- Data Integration, Migration & Large Scale ETL experience (Common ETL platforms such as PySpark/DataStage/AbInitio etc.) - ETL design & build, handling, reconciliation and normalization
- Data Modeling experience (OLAP, OLTP, Logical/Physical Modeling, Normalization, knowledge on performance tuning)
- Experienced in working with large and multiple datasets and data warehouses
- Experience building and optimizing ‘big data’ data pipelines, architectures, and datasets.
- Strong analytic skills and experience working with unstructured datasets
- Ability to effectively use complex analytical, interpretive, and problem-solving techniques
- Experience with Confluent Kafka, Red Hat jBPM, CI/CD build pipelines and toolchain: Git, Bitbucket, Jira
- Experience with external cloud platforms such as OpenShift, AWS, and GCP
- Experience with container technologies (Docker, Pivotal Cloud Foundry) and supporting frameworks (Kubernetes, OpenShift, Mesos)
- Experienced in integrating search solutions with middleware and distributed messaging (Kafka)
- Highly effective interpersonal and communication skills with tech/non-tech stakeholders.
- Experienced in the software development life cycle, with good problem-solving skills.
- Excellent problem-solving skills and strong mathematical and analytical mindset
- Ability to work in a fast-paced financial environment
Education:
- Bachelor’s/University degree or equivalent experience in computer science, engineering, or similar domain
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Data Analytics
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.