Senior Engineer - Ingestion & Streaming Frameworks
Job Description
Datavant, based in New York, NY, and operating onsite, is seeking a Senior Engineer to design and build ingestion and streaming frameworks within the Data and Machine Learning Platform. This role combines platform engineering with hands-on data ingestion work and AI-enabled workflows, with a strong emphasis on security, observability, and cross-functional collaboration. The position pays USD 150,000 to 190,000 per year and offers opportunities to contribute to self-service tooling and guardrails that streamline onboarding of new data sources.
Responsibilities
- Design, build, and operate ingestion frameworks that pull data from operational databases, vendor APIs, document streams, and third-party feeds into Snowflake, Iceberg, and Databricks
- Own and evolve the ingestion stack (AWS DMS, MWAA / Airflow, Fivetran, and internal tooling) and create patterns for API sources that lack managed connectors
- Develop self-service tooling so product engineers can onboard new sources without deep infrastructure expertise
- Write and review Terraform for ingestion infrastructure, covering AWS networking, IAM, compute, and data services
- Collaborate with product, data, and analytics teams to choose the appropriate ingestion pattern (CDC, batch, API, streaming) and implement it end to end
- Lead production troubleshooting and incident response, turning incidents into durable platform improvements
- Elevate engineering quality, observability, cost discipline, and security in all deliverables
- Mentor mid-career engineers and guide peers through code reviews, pairing, and design feedback
Requirements
- 6+ years in data engineering, platform engineering, or data-focused software engineering
- 3+ years of hands-on AWS with strong networking (VPC, subnets, routing, PrivateLink, security groups), IAM (roles, policies, permission boundaries), and related data services
- 2+ years writing production Terraform or equivalent IaC, including owning modules, managing state, and deploying changes safely
- 1+ years building self-service tooling, internal platforms, or paved-path frameworks used by other engineers
- Strong SQL skills, with the ability to reason about how data is stored in warehouses and data lakes
- Production experience with Snowflake (or an equivalent cloud data warehouse) and a workflow orchestrator (Airflow / MWAA preferred)
- Hands-on experience with at least one ingestion approach: CDC tooling (DMS, Debezium), managed connectors (Fivetran, Airbyte), or custom pipelines for API sources
- Solid CI/CD discipline in GitHub or equivalent: branching, code reviews, automated checks, repeatable deployments
- AI-native working style with daily use of tools like Claude Code, Cursor, Copilot, or equivalent
- Working knowledge of Python; depth of mastery is not required
- Clear written and verbal communication skills, especially in async, remote settings
Technologies
- Snowflake, Iceberg, Databricks
- AWS, AWS DMS, MWAA, Airflow
- Fivetran, Debezium, Airbyte
- Terraform, Python, SQL
- Claude Code, Cursor, Copilot
- Kubernetes, Azure, SCIM, SailPoint, Spark
What helps you stand out
- Direct production experience with Iceberg or another open table format, especially bridging Snowflake and Databricks
- Hands-on Databricks or Spark experience
- Kubernetes experience
- Snowflake certifications
- Azure experience in addition to AWS, reflecting diverse client needs
- Deep experience integrating data systems with managed identity platforms, particularly via SCIM (SailPoint a plus)
- Background in regulated industries such as healthcare or finance
- Past roles as DBA, SRE, or DRE operating production data systems under pressure