As a Principal AI Engineer, you will define system architecture for AI platforms, ensuring scalability and security while mentoring teams and making architectural decisions.
Principal AI Engineer - Agentic AI, System Architecture & Data Science
Experience: 13+ years
About the Team
The AI Center of Excellence team includes Data Scientists and AI Engineers who work together to conduct research, build prototypes, design features, and deliver production-grade AI systems at scale. Our mission is to leverage the best available technology, including advanced ML, LLMs, and agentic AI systems, to protect our customers' attack surfaces.
We partner deeply with Detection and Response teams, including our MDR service, to embed AI into real-world security workflows. Our work builds on more than 20 years of threat intelligence, deep domain expertise, and a growing patent portfolio. We operate in ambiguous problem spaces and value technical rigor, strong ownership, and principled decision-making.
As a Principal Engineer, you will define and own the system architecture for AI platforms and services across the organization.
The technologies we use include
- Python for large-scale data science, modeling, and experimentation
- Jupyter notebooks (local & remote)
- Classical ML using scikit-learn
- Deep learning for NLP and sequence-based security problems
- Anomaly detection and behavioral modeling
- LLM / GenAI toolchains: Hugging Face, Transformers, LangChain, LangGraph
- Agentic AI systems: multi-agent orchestration, tool-calling, reasoning, memory, evaluation
- RAG pipelines using vector databases
- AWS cloud ecosystem: Bedrock, SageMaker, Lambda, EKS, S3, DynamoDB, Redshift, Kinesis
- CI/CD for ML & LLM systems (GitHub Actions, Jenkins)
- Model registry, versioning, drift detection, and retraining frameworks
- Observability & operations: CloudWatch, Prometheus, Grafana, PagerDuty
- Infrastructure as Code using Terraform
About the Role
Rapid7 is seeking a Principal AI Engineer - Data Science who brings deep system design and architectural leadership to our AI organization.
This role sits at the intersection of data science, large-scale distributed systems, agentic AI, and cloud-native architecture. You will be responsible not just for building models, but for designing the end-to-end systems that make AI reliable, scalable, secure, and operable in production.
This role is ideal for someone who:
- has designed complex, distributed AI systems end to end,
- understands how data, models, infrastructure, and services interact, and
- can make long-term architectural decisions under real-world constraints.
In this role, you will
- Own the system architecture for AI, ML, LLM, and agentic AI platforms across multiple teams
- Design end-to-end AI system architectures, including:
  - data ingestion and streaming pipelines
  - feature stores and offline/online data paths
  - model training, fine-tuning, and evaluation
  - inference services, APIs, and microservices
  - monitoring, alerting, and incident response workflows
- Define reference architectures and design patterns for:
  - LLM orchestration and agentic workflows
  - RAG systems and vector retrieval
  - secure and scalable inference
- Lead architectural reviews and act as the final technical authority on AI system design decisions
- Make trade-offs across accuracy, latency, cost, scalability, security, and reliability
- Establish architectural standards for:
  - model registry and lifecycle management
  - drift detection and retraining
  - LLM evaluation, guardrails, and governance
- Ensure AI systems comply with cloud security best practices (IAM, KMS, VPC, secrets)
- Serve as the escalation point for complex production incidents
- Mentor Staff and Senior engineers on system design and architectural thinking
- Influence product roadmaps and long-term AI platform investments
The skills you'll bring include
Core (Required)
- 13+ years of experience in Data Science, ML Engineering, or Applied AI
- Proven experience designing and architecting large-scale AI systems
- Strong background in:
  - data acquisition, cleaning, enrichment, and transformation
  - feature engineering for structured and unstructured data
  - supervised and unsupervised ML
  - deep learning (NLP, CNNs, RNNs, sequence models)
- Experience with model explainability (SHAP, LIME)
- Hands-on experience with security-focused ML models:
  - malware detection
  - behavior-based malware models
  - user behavioral analytics
- Exceptional ability to reason at the system and architecture level
Agentic AI & LLM Systems (Strongly Required)
- Deep hands-on experience with:
  - LLM orchestration (LangChain, LangGraph)
  - agentic and multi-agent architectures
  - RAG pipelines and vector databases
  - prompt engineering at scale
  - LLM evaluation frameworks (Promptfoo, HELM)
  - fine-tuning approaches (LoRA, PEFT)
- Designing robust guardrails, governance, and evaluation frameworks for LLM systems
- Understanding of failure modes and risks in autonomous and agentic AI systems
System Design, Cloud & MLOps (Strongly Required)
- Strong experience designing distributed systems and microservice architectures
- Expertise in:
  - model registry and versioning (MLflow, SageMaker)
  - drift detection and automated retraining
  - monitoring and observability (CloudWatch, Prometheus, Grafana)
  - incident management and on-call leadership (PagerDuty)
- Deep AWS experience:
  - Bedrock, SageMaker, Lambda, EKS
  - data storage systems (S3, DynamoDB, Redshift, Kinesis)
  - cloud security (IAM, KMS, Secrets Manager, VPCs)
- Working knowledge of:
  - Docker and Kubernetes
  - CI/CD pipelines for ML/LLM workloads
  - Infrastructure as Code using Terraform
Experience with the following would be advantageous
- Architecting internal AI platforms used by multiple product teams
- Defining AI governance, risk, and compliance frameworks
- Operating AI systems in high-scale or regulated environments
Important clarification
A Principal AI Engineer at Rapid7:
- owns architecture, not just code
- sets standards others follow
- is accountable for the long-term technical health of AI systems across teams
If a candidate has not designed and defended system architectures for production AI platforms, they are not a fit for this role, regardless of individual modeling strength.
About Rapid7
At Rapid7, we are on a mission to create a secure digital world for our customers, our industry, and our communities. We do this by embracing tenacity, passion, and collaboration to challenge what's possible and drive extraordinary impact.
Here, we're building a dynamic workplace where everyone can have the career experience of a lifetime. We challenge ourselves to grow to our full potential. We learn from our missteps and celebrate our victories. We come to work every day to push boundaries in cybersecurity and keep our 10,000 global customers ahead of whatever's next.
Join us and bring your unique experiences and perspectives to tackle some of the world's biggest security challenges.
About Rapid7
At Rapid7, our vision is to create a secure digital world for our customers, our industry, and our communities. We do this by harnessing our collective expertise and passion to challenge what's possible and drive extraordinary impact. We're building a dynamic and collaborative workplace where new ideas are welcome.
Protecting 11,000+ customers against bad actors and threats means we're continuing to push the envelope just like we've been doing for the past 20 years. If you're ready to solve some of the toughest challenges in cybersecurity, we're ready to help you take command of your career. Join us.