Design, build, optimize, and maintain high-performance Spark-based data pipelines using Scala/Java and Hive on Hadoop/CDP. Own the full project lifecycle, enforce coding best practices, troubleshoot Spark/Hive/YARN performance, and collaborate with stakeholders to deliver scalable data solutions.
We need a Senior Data Engineer with 10+ years of experience who is proficient in Spark, Scala/Java, and Hive and has extensive hands-on development experience in the big data ecosystem.
Key Responsibilities:
- Design, implement, and optimize highly performant data pipelines using Spark, Scala/Java, and Hive on platforms like Cloudera Data Platform (CDP) or other Hadoop ecosystems.
- Take complete ownership of complex data engineering projects within the big data ecosystem, covering the entire lifecycle from initial design and development to deployment and ongoing maintenance.
- Develop robust and efficient Hive queries for extensive data analysis and reporting.
- Champion and enforce best practices and coding standards for new and existing data flows to ensure they are robust, scalable, secure, and maintainable using Spark, Scala/Java, and Hive within the big data ecosystem.
- Diagnose, troubleshoot, and resolve complex issues related to Spark, Scala/Java, and Hive applications and YARN resource management, implementing performance optimization solutions.
- Proactively collaborate with stakeholders to develop solutions, with full commitment and accountability.
Technical Skills & Experience:
- Proven hands-on development expertise with Apache Spark.
- Strong programming proficiency in Scala and/or Java.
- In-depth knowledge and practical experience with Hive, including query optimization and data analysis.
- Experience with data platforms such as Cloudera Data Platform (CDP) is highly desirable.
Education:
- Bachelor's or Master's degree, other university degree, or equivalent experience.


