The Lead Data Engineer will design and maintain pipelines for event data ingestion and validation, ensuring operational reliability and consistency for analytics.
About HighLevel:
HighLevel is an AI-powered business operating system that gives agencies, entrepreneurs and SMBs the infrastructure to build, automate and scale. Today, HighLevel supports SMBs across 150+ countries, fueling community-driven growth rooted in real customer outcomes.
To date, businesses operating on HighLevel have generated over $7 billion in ecosystem value, demonstrating the impact of shared infrastructure at scale. By centralizing conversations, automation and intelligence into one system, we help businesses move faster, reduce complexity and execute efficiently.
Behind the platform, HighLevel powers more than 4 billion API hits and 2.5 billion message events daily. With 250 terabytes of distributed data, 250+ microservices and over 1 million domain names supported, our architecture is built for performance, resilience and long-term scalability.
Our People
With over 2,000 team members across 10+ countries, HighLevel operates as a global, remote-first organization built for speed and ownership. We value initiative, clarity and execution, creating space for ambitious people to build systems that support millions of businesses worldwide. Here, innovation thrives, ideas are celebrated and people come first, no matter where they call home.
Our Impact
Every month, HighLevel enables more than 1.5 billion messages, 200 million leads and 20 million conversations for the more than 1 million businesses we support. Behind those numbers are real people building independence, expanding opportunity and creating measurable impact. We’re proud to be a part of that.
Learn more about us on our YouTube Channel or Blog Posts
About the Role:
We are looking for a Lead Data Engineer to own the event ingestion and identity layer that connects product instrumentation to downstream analytical systems.
This role focuses on the operational reliability and correctness of event and identity data as it moves through the data platform. You will design and operate pipelines, schema validation, and replay workflows that ensure product events remain consistent and safe to use for analytics and customer-facing reporting.
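To illustrate the kind of schema validation and contract enforcement this role owns, here is a minimal, stdlib-only Python sketch; the event shape and field names are hypothetical examples, not HighLevel's actual schema:

```python
# Minimal event-contract check: every event must carry the required
# fields with the expected types before it is admitted downstream.
# The contract below is a hypothetical example for illustration only.

EVENT_CONTRACT = {
    "event_name": str,
    "tenant_id": str,
    "anonymous_id": str,
    "timestamp": float,
}

def validate_event(event: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for field, expected_type in EVENT_CONTRACT.items():
        if field not in event:
            errors.append(f"missing required field: {field}")
        elif not isinstance(event[field], expected_type):
            errors.append(
                f"field {field!r} has type {type(event[field]).__name__}, "
                f"expected {expected_type.__name__}"
            )
    return errors
```

In practice, checks like this usually live behind a schema registry with versioned, backward-compatible contracts rather than a hard-coded dict, so producers and consumers can evolve independently.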
You will work closely with product engineering teams on instrumentation patterns, with the CDP team on event contracts and definitions, and with platform teams to ensure event infrastructure and analytical systems scale reliably. This role builds the foundational event and identity datasets required for reliable downstream modeling. Behavioral models, canonical entities, and business analytics datasets are owned by the analytics engineering team.
Responsibilities:
- Define event schemas, required fields, and compatibility rules in collaboration with the CDP team
- Implement automated validation and contract enforcement to prevent breaking schema changes
- Maintain versioning and compatibility guarantees for event producers and downstream consumers
- Build and maintain pipelines that ingest, validate, and process high-volume product events
- Ensure event streams are deduplicated, ordered correctly, and safe for downstream consumption
- Partner with platform teams to ensure ingestion pipelines scale with product growth
- Define and maintain identity stitching logic across anonymous and authenticated users
- Handle identity merges, splits, and corrections while preserving tenant boundaries
- Ensure identity resolution remains explainable, deterministic, and safe for downstream datasets
- Design workflows that allow event datasets and identity graphs to be replayed or rebuilt safely
- Build tooling for historical corrections, schema evolution, and dataset reprocessing
- Ensure downstream models can be rebuilt without manual intervention when definitions evolve
- Provide guidance and tooling that help product teams emit events consistently
- Maintain validation checks and schema enforcement that catch instrumentation issues early
- Collaborate with engineering teams to evolve instrumentation safely over time
- Ensure deletion and suppression requests propagate correctly through event and identity pipelines
- Partner with governance and security teams to support policy requirements
- Define requirements and interfaces for event infrastructure and downstream analytical systems
- Work with platform teams to ensure pipelines remain reliable, scalable, and observable
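The identity-stitching responsibilities above can be sketched as a deterministic union-find over observed links between anonymous and authenticated identifiers. This is an illustrative simplification, assuming string identifiers, not a production design:

```python
# Deterministic identity-stitching sketch: anonymous and authenticated
# identifiers observed together are merged into one canonical identity.
# The identifier formats ("anon:*", "user:*") are hypothetical.

class IdentityGraph:
    def __init__(self):
        self._parent: dict[str, str] = {}

    def _find(self, node: str) -> str:
        # Path-halving find; unseen nodes start as their own root.
        self._parent.setdefault(node, node)
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers belong to the same person."""
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            # Deterministic merge: the lexicographically smaller root
            # wins, so replaying the same links rebuilds the same graph
            # regardless of the order they arrive in.
            keep, drop = sorted((ra, rb))
            self._parent[drop] = keep

    def canonical_id(self, node: str) -> str:
        return self._find(node)
```

For example, linking `anon:42` and `anon:43` to the same `user:alice` resolves both anonymous IDs to one canonical identity, and the lexicographic tie-break keeps rebuilds reproducible under replay, which is the property the replay and correction workflows above depend on.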
Requirements:
- 7+ years of experience in data engineering, platform engineering, or product data roles
- Strong experience building and operating event ingestion or streaming pipelines
- Experience implementing schema validation, data contracts, or event governance frameworks
- Strong SQL and Python, with experience building data processing or validation tooling
- Familiarity with identity resolution, entity resolution, or customer identity systems
- Experience operating analytical data systems or large-scale event datasets
EEO Statement:
The company is an Equal Opportunity Employer. As an employer subject to affirmative action regulations, we invite you to voluntarily provide the following demographic information. This information is used solely for compliance with government record-keeping, reporting, and other legal requirements. Providing this information is voluntary and refusal to do so will not affect your application status. This data will be kept separate from your application and will not be used in the hiring decision.
#LI-Remote #LI-NJ1