Fractal

Data Engineer

5 Locations
Mid level
The Data Engineer will design and implement scalable big data solutions, conduct data analysis, test machine learning models, and maintain cloud-based applications.

It's fun to work in a company where people truly BELIEVE in what they are doing!

We're committed to bringing passion and customer focus to the business.

Our Big Data Engineers have expertise in building horizontally scalable applications using distributed technologies such as NoSQL databases, Hadoop, and Spark, and we execute projects on both on-premise and cloud-based systems. Our AI Engineers and MLOps Engineers work on scaling AI systems and on building end-to-end productionized MLOps pipelines.

RESPONSIBILITIES:

  • Our Big Data capability team needs hands-on developers who can produce beautiful & functional code to solve complex analytics problems. If you are an exceptional developer with an aptitude for learning and implementing new technologies, and you love pushing boundaries to solve complex business problems innovatively, we would like to talk with you.
  • You would be responsible for evaluating, developing, maintaining, and testing big data solutions for advanced analytics projects.
  • The role would involve big data pre-processing & reporting workflows, including collecting, parsing, managing, analyzing, and visualizing large sets of data to turn information into business insights.
  • The role would also involve testing various machine learning models on Big Data and deploying learned models for ongoing scoring and prediction. An appreciation of the mechanics of complex machine learning algorithms would be a strong advantage.

QUALIFICATIONS:

  • Demonstrable experience designing technological solutions to complex data problems, and developing & testing modular, reusable, efficient, and scalable code to implement those solutions.
  • Ideally, this would include work on the following technologies:
  • Expert-level proficiency in at least one of Java, C++, or Python (preferred); Scala knowledge is a strong advantage.
  • Strong understanding of and experience with distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN, MapReduce, and HDFS) and associated technologies such as Hive, Sqoop, Avro, Flume, Oozie, and ZooKeeper.
  • Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage.
  • Working knowledge of cloud computing platforms (AWS, especially the EMR, EC2, S3, and SWF services, and the AWS CLI).
  • Experience working within a Linux environment and using command-line tools, including shell/Python scripting to automate common tasks.
  • Ability to work in a team in an agile setting, familiarity with JIRA, and a clear understanding of how Git works.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!


Top Skills

Apache Hadoop
Spark
Avro
AWS
C++
EC2
EMR
Flume
Hive
Java
Linux
Oozie
Python
S3
Scala
Sqoop
SWF
ZooKeeper

