Nanonets

Senior Deep Learning Engineer

Reposted Yesterday

Remote

Hiring Remotely in India

Senior level

Nanonets seeks a Senior Deep Learning Engineer with expertise in DL/ML to develop and optimize architectures, focusing on NLP and CV applications.

The summary above was generated by AI

Location: Bangalore (Hybrid) | $40M+ Funded | Building State-of-the-Art AI

Nanonets is transforming the way businesses work. Our AI platform takes the manual, messy, time-consuming work that bogs down industries like finance, healthcare, supply chain, and more, and turns it into seamless, automated processes. What once took hours of human effort now takes seconds with Nanonets. Our client footprint spans 34% of the Fortune 500, enabling businesses across various industries to unlock the potential of AI in automating their business processes.

More than 10,000 businesses trust Nanonets because we don’t just promise efficiency; we deliver it with unmatched accuracy and seamless integrations.

Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders.

Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside Elevation Capital and Y Combinator, we're scaling our deep learning capabilities to serve enterprise clients including Toyota, Boston Scientific, and Bill.com. You'll work on genuinely challenging problems at the intersection of computer vision, NLP, and generative AI.

What You'll Build

Core Technical Challenges:

Train & Fine-tune SOTA Architectures: Adapt and optimize transformer-based models, vision-language models, and custom architectures for document understanding at scale
Production ML Infrastructure: Design high-performance serving systems handling millions of requests daily using frameworks like TorchServe, Triton Inference Server, and vLLM
Agentic AI Systems: Build reasoning-capable OCR that goes beyond extraction – models that understand context, chain operations, and provide confidence-grounded outputs
Optimization at Scale: Implement quantization, distillation, and hardware acceleration techniques to achieve fast inference while maintaining accuracy
Multi-modal Innovation: Tackle alignment challenges between vision and language models, reduce hallucinations, and improve cross-modal understanding using techniques like RLHF and PEFT

Engineering Responsibilities:

Design distributed training pipelines for models with billions of parameters using PyTorch FSDP/DeepSpeed
Build comprehensive evaluation frameworks benchmarking against GPT-4V, Claude, and specialized document AI models
Implement A/B testing infrastructure for gradual model rollouts in production
Create reproducible training pipelines with experiment tracking
Optimize inference costs through dynamic batching, model pruning, and selective computation

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.

Technical Requirements

Must-Have:

3+ years of hands-on deep learning experience with production deployments
Strong PyTorch expertise – ability to implement custom architectures, loss functions, and training loops from scratch
Experience with distributed training and large-scale model optimization
Proven track record of taking models from research to production
Solid understanding of transformer architectures, attention mechanisms, and modern training techniques
B.E./B.Tech from top-tier engineering colleges

Highly Valued:

Experience with model serving frameworks (TorchServe, Triton, Ray Serve, vLLM)
Knowledge of efficient inference techniques (ONNX, TensorRT, quantization)
Contributions to open-source ML projects
Experience with vision-language models and document understanding
Familiarity with LLM fine-tuning techniques (LoRA, QLoRA, PEFT)

Why This Role is Exceptional

Proven Impact: Our models are approaching 1 million downloads – your work will have global reach
Real Scale: Your models will process millions of documents daily for Fortune 500 companies
Well-Funded Innovation: $40M+ in funding means significant GPU resources and freedom to experiment
Open Source Leadership: Publish your work and contribute to models already trusted by nearly a million developers
Research-Driven Culture: Regular paper reading sessions and collaboration with the research community
Rapid Growth: Strong financial backing and Series B momentum mean ambitious projects and fast career progression

Our Recent Achievements

Nanonets-OCR model: ~1 million downloads on Hugging Face – one of the most adopted document AI models globally
Launched industry-first Automation Benchmark defining new standards for AI reliability
Published research recognized by leading AI researchers
Built agentic OCR systems that reason and adapt, not just extract
Secured $40M+ in total funding from Accel, Elevation Capital, and Y Combinator

Top Skills

Caffe

Jax

Keras

Python

PyTorch

TensorFlow

Theano

Torch
