Design, develop, and maintain scalable data pipelines (PySpark/Apache Spark); process large-scale structured and unstructured data; build real-time analytics for cloud and edge devices; performance-tune and debug Spark jobs; collaborate with data scientists, data engineers, and firmware teams; apply distributed computing and data warehousing/lake concepts.
Location: This position will be based in Bangalore/Mumbai, India
Role and Responsibilities:
- Design, develop, and maintain scalable data pipelines using PySpark and Apache Spark.
- Process and analyze large-scale structured and unstructured datasets in distributed environments.
- Build real-time analytics for cloud and edge devices.
- Solve challenging data and architectural problems using cutting-edge technology.
- Collaborate cross-functionally with data science, data engineering, and firmware controls teams.
Skills and Experience:
- Strong Java/Scala programming and debugging ability and a clear understanding of design patterns; Python is a bonus
- Understanding of the internals of Kafka, Spark, Flink, Hadoop, HBase, etc. (hands-on experience with one or more preferred)
- Demonstrated experience implementing data wrangling, transformation, and processing solutions on large datasets
- Experience in performance tuning and debugging Spark jobs
- Good understanding of distributed computing principles
- Knowledge of cloud computing platforms such as AWS, GCP, or Azure is beneficial
- Exposure to data lake and data warehousing concepts, SQL, and NoSQL databases
- Experience with REST APIs and gRPC is good to have
- Ability to adapt quickly to new technologies, concepts, approaches, and environments
- Problem-solving and analytical skills
- Must have a learning attitude and improvement mindset
Qualifications:
- M.Tech/M.S. with an emphasis in computational or decision sciences preferred
- 3+ years of relevant experience


