Lead training and fine-tuning workflows for Diffusion Transformer models, focusing on both image and video generation capabilities.
We are looking for an Engineer to build the training infrastructure, data pipelines, and inference optimization systems for state-of-the-art Diffusion Transformer (DiT) models. This role focuses on scaling the fine-tuning and deployment of models like Qwen, Wan, and LTX-2.
Key Responsibilities
- Training Infrastructure: Design and maintain scalable pipelines for training and fine-tuning Diffusion Transformer models on large-scale GPU clusters.
- Model Optimization: Optimize the inference performance of Wan, LTX-2, and Qwen (Vision) using quantization, pruning, and hardware-aware tuning (e.g., TensorRT, FlashAttention).
- Data Engineering: Develop efficient ingestion and preprocessing pipelines for high-resolution image and video datasets used in generative tasks.
- Capability Expansion: Implement engineering workflows that allow researchers to rapidly fine-tune and expand the capabilities of open-weights diffusion models.
- Production Deployment: Transition experimental fine-tuned models into reliable, low-latency production services.
- Resource Management: Optimize distributed training jobs (FSDP, DeepSpeed) to maximize GPU utilization and minimize costs.
Required Qualifications
- Minimum of 2 years of experience in Machine Learning Engineering with a focus on generative models.
- Core Tech: Strong proficiency in PyTorch, JAX, and distributed training frameworks.
- Model Expertise: Hands-on experience deploying or fine-tuning Diffusion Transformers (DiT) and specifically Qwen (Image), Wan, or LTX-2.
- Architecture: Deep understanding of Transformer-based diffusion backbones and flow matching, as opposed to legacy CNN/RNN-based architectures.
- Tooling: Proficiency in Python and modern ML ecosystem tools (e.g., Hugging Face, Diffusers, FFmpeg for video processing).
- Compute: Experience debugging and optimizing workloads in multi-node GPU environments.
Preferred Qualifications
- Inference Optimization: Experience with techniques like KV-caching, compile-time optimizations, or kernel fusion for transformers.
- MLOps: Familiarity with experiment tracking (W&B) and model versioning tools in a generative media context.
- Streaming: Experience handling real-time video generation or streaming inference pipelines.
- Open Source: Contributions to libraries like diffusers or active experimentation with the latest open-source DiT implementations.
Top Skills
DeepSpeed
Diffusion Transformers
FSDP
JAX
LTX-2
PyTorch
Qwen
Wan