Arbital Health is hiring a Senior Data Engineer to join its Software Development team. Arbital Health is a rapidly growing healthcare technology and actuarial leader that centralizes, measures, and adjudicates value-based care contracts at scale. Key skills: React, Next.js, TypeScript, Python, AWS.
We’ve built a production data pipeline that ingests, enriches, aggregates, and summarizes healthcare financial data so that our AI and web tools can use it easily. As we continue to scale in both data size and complexity, we are looking for a senior data engineer to help us enhance and scale this core part of our platform.
Is this role right for you?
This role may be a good fit if you are excited about building a new healthcare data and analytics platform to support Value-Based Care (VBC) that will help reduce the cost of healthcare in the US, and the following match your skills, experience, and interests:
Programming in Python, Spark or other big data technologies
Development and deployment of a data-intensive product on AWS and Databricks
AI-native development with Cursor/Claude/Copilot
Scale Arbital's healthcare data pipelines and lakehouse on AWS and Databricks, and own the underlying architecture
Implement and scale actuarially sound healthcare financial calculations in Spark
Build and maintain orchestration (Airflow) and CI/CD so enrichment and aggregation workflows are reliable, observable, and reproducible
Own data quality, integrity, privacy, security, and HIPAA compliance through automated testing and quality-control procedures
Collaborate with actuarial and delivery teams that primarily work in Python and R
Partner with data scientists to deploy and monitor machine learning models in production
Lead technical design reviews and contribute to platform-wide architecture decisions
Establish data observability, lineage, and SLAs, and tune Spark/Databricks jobs for performance and cost
Raise the engineering bar through code review, mentorship, and setting data-engineering standards across the team
5+ years building data-intensive SaaS platforms (L5: 8+ years with technical leadership)
Deep, hands-on expertise with Spark and distributed data processing
Strong SQL and data modeling / warehouse design (dimensional modeling, Delta / Lakehouse)
Proven track record scaling a product to an enterprise level
Experience with orchestration (Airflow), IaC (Terraform), and CI/CD for data
Experience with data-quality / testing frameworks such as dbt tests or Great Expectations
Ability to quickly understand complex modeling workflows and the business need driving them
Ships high-caliber, well-tested code with strong attention to detail
Experience with healthcare data (claims, eligibility) and handling PHI / PII under HIPAA
Thrives under minimal supervision in a rapidly changing, ambiguous start-up environment
Our team works in a hybrid model from the San Francisco Bay Area. We will prioritize candidates who are able to work 2 days per week from our office, and we will consider highly qualified remote candidates who can travel quarterly to the San Francisco office.
Startup experience is highly preferred
Extensive experience with Airflow, Databricks, Python, and AWS or GCP
Streaming / near-real-time data such as Structured Streaming or Kafka
Databricks certification (Data Engineer Associate or Professional)
Tools we use include
AI Tools: Cursor, Claude, Gemini, Databricks Genie
Core Tools: Python, R, SQL, Next.js, React, TypeScript, Tailwind CSS
Infrastructure: AWS, Databricks, Airflow, Terraform
Version Control: GitHub
Team Planning: Jira, Confluence