We are seeking a Senior Data Engineer to join our Data Engineering team and lead our Data Infrastructure group! In this role, you will design, build, and optimize scalable data systems that empower decision-making and innovation at Zocdoc. You’ll collaborate with cross-functional teams to manage data pipelines, strengthen governance, and improve the tools and processes that help data producers and consumers work together effectively. You’ll focus on the underlying infrastructure that supports analytics, experimentation, and machine learning, ensuring it runs efficiently, securely, and cost-effectively to improve the healthcare experience for patients and providers.
Responsibilities:
- Designing and maintaining scalable data pipelines for ingestion, transformation, and delivery across multiple data sources.
- Collaborating with Analytics Engineers and Product teams to curate datasets and establish data contracts that improve transparency and reliability.
- Developing and managing modern data architectures, such as lakehouses and medallion layers, using tools like Databricks, Delta Lake, or Iceberg.
- Optimizing Snowflake usage and performance, ensuring data quality and cost efficiency.
- Supporting and scaling orchestration platforms (like Dagster), metadata systems (like Unity Catalog or Collibra), and monitoring tools (like Datadog).
- Collaborating with data engineering, analytics engineering, and security teams to deliver stable and efficient infrastructure for diverse workflows.
- Building tools, alerting systems, and documentation that ensure reliable operation and developer self-service across our data stack.
Requirements:
- 5+ years of experience in data engineering or platform/infrastructure roles, with a focus on scaling tools and systems.
- Expertise in Python or Scala, and strong proficiency in SQL for data modeling and optimization.
- Deep experience with data warehouse technologies like Snowflake, including clustering, performance tuning, query profiling, and access management.
- Experience with data lake and lakehouse architectures such as Databricks, Delta Lake, Iceberg, or Apache Hudi, and query engines like Athena or Presto.
- Proven ability to design and implement scalable ETL pipelines using technologies like dbt for transformation and Databricks for large-scale processing.
- Familiarity with managing infrastructure-as-code, job orchestration (Dagster, Airflow), and CI/CD workflows.
- A proactive mindset and strong problem-solving skills, especially when troubleshooting complex infrastructure issues.
- Excellent collaboration and communication skills to support cross-functional teams and data consumers.
Nice to have:
- Experience implementing row-level security and data masking for PHI/PII use cases.
- Exposure to governance tools (e.g., Collibra, Amplitude, Looker admin, Unity Catalog).
- Familiarity with AWS services, especially for storage and compute cost optimization.
Benefits:
- Unlimited PTO
- 100% paid employee health benefit options
- Employer-funded 401(k) match
- Corporate wellness programs with Headspace and Peloton
- Parental leave
- Cell phone reimbursement
- Commuter benefits
- Catered lunch every day, along with snacks (when back in office)
How to apply:
Interested in this position? Please submit your resume and cover letter through the application portal.