COGNNA

Senior Data Engineer

Posted 2 Days Ago
Remote
2 Locations
Senior level

As a Senior Data Engineer, you will be the architect of our security data ecosystem. Your primary mission is to design and build high-performance data lake architectures and real-time streaming pipelines that serve as the foundation for COGNNA's Agentic AI initiatives. You will ensure that our AI models have access to fresh, high-quality security telemetry through sophisticated ingestion patterns.

Key Responsibilities

1. Data Lake & Storage Architecture

  • Architectural Design: Design and implement multi-tier Data Lakehouse architectures to support both structured security logs and unstructured AI training data.
  • Storage Optimization: Define lifecycle management, partitioning, and clustering strategies to ensure high-performance querying while optimizing for cloud storage costs.
  • Schema Evolution: Manage complex schema evolution for security telemetry, ensuring compatibility with downstream AI/ML feature engineering.

2. Real-Time & Streaming Processing

  • Streaming Ingestion: Build and manage low-latency, high-throughput ingestion pipelines capable of processing millions of security events per second in real time.
  • Unified Processing: Design unified batch and stream processing architectures to ensure consistency across historical analysis and real-time threat detection.
  • Event-Driven Workflows: Implement event-driven patterns to trigger AI agent reasoning based on incoming live data streams.

3. AI/ML Enablement & Feature Engineering

  • Vector Data Foundations: Architect the data infrastructure required to support semantic search applications and variants of retrieval-augmented generation (RAG) architectures for our generative AI models.
  • Feature Management: Design and maintain a centralized repository for ML features, ensuring consistent data is used for both model training and real-time inference.
  • AI Pipeline Orchestration: Build automated workflows to handle data preparation, model evaluation, and deployment within our cloud AI ecosystem.

4. DataOps & Systems Design

  • Infrastructure as Code: Utilize declarative tools (e.g., Terraform) to manage the entire lifecycle of our cloud data resources and AI endpoints.
  • Quality & Observability: Implement automated data quality frameworks and real-time monitoring to detect "data drift" or pipeline failures before they impact AI model performance.

Requirements

  • Experience & Education: 5+ years in Data Engineering or Backend Engineering, focused on large-scale distributed systems. B.S. or M.S. in Computer Science or a related technical field.
  • Cloud Architecture: Deep architectural mastery of the Google Cloud Platform ecosystem, specifically regarding managed analytical warehouses, serverless compute, and identity/access management. Proven track record of deploying enterprise-scale Data Lakehouses from scratch.
  • Real-Time Mastery: Expertise in building production-grade distributed messaging and stream processing engines (e.g., managed Apache Beam/Flink environments) capable of handling high-velocity telemetry.
  • AI Enablement: Strong understanding of how data architecture impacts AI performance. Experience building embedding pipelines, feature stores, and automated workflows for model training and evaluation.
  • Software Fundamentals: Expert-level Python and advanced SQL. Proficiency in high-performance languages like Go or Scala is highly desirable.
  • Operational Excellence: Advanced knowledge of CI/CD, containerization on Kubernetes, and managing cloud infrastructure through code to ensure reproducible environments.

Preferred Qualifications

  • Experience with dbt for modern analytics engineering.
  • Understanding of cybersecurity data standards (OCSF/ECS).
  • Previous experience in an AI-first startup or a high-growth security tech company.

Benefits

💰 Competitive Package – Salary + equity options + performance incentives
🧘 Flexible & Remote – Work from anywhere with an outcomes-first culture
🤝 Team of Experts – Work with designers, engineers, and security pros solving real-world problems
🚀 Growth-Focused – Your ideas ship, your voice counts, your growth matters
🌍 Global Impact – Build products that protect critical systems and data

Top Skills

Apache Beam
Apache Flink
dbt
Go
Google Cloud Platform
Kubernetes
Python
Scala
SQL
Terraform
