Senior Data Engineer
Job Description
This onsite Senior Data Engineer role in Addison, Texas, offers a competitive compensation range and a clear path to influence analytics and reporting across Allworth Financial. You will help modernize legacy SQL processes into scalable, maintainable cloud data pipelines, strengthen data governance, and guide the next stage of the data platform's evolution. A key part of the role is mentoring teammates and raising the team's overall engineering maturity.
What you will do
- Design, build, and optimize data pipelines using PySpark, Delta Lake, SQL, and cloud-based data engineering tools.
- Improve data pipeline reliability, observability, logging, error handling, and restartability.
- Review existing notebooks, SQL scripts, data models, and orchestration workflows for maintainability and performance.
- Guide best practices for Azure Synapse, Spark, Delta Lake, data lake storage, and related cloud data services.
- Identify architectural gaps, technical debt, and modernization risks.
- Help design and implement data quality checks, reconciliation processes, and validation frameworks.
- Support development of canonical IDs, master data patterns, and entity resolution processes.
- Assist with data cataloging, lineage, metadata management, and governance practices.
- Partner with business stakeholders to understand reporting, analytics, and data product requirements.
- Help structure scalable data models for BI tools such as ThoughtSpot, Power BI, or similar platforms.
- Mentor data team members and contribute to raising the overall engineering maturity of the team.
- Provide guidance on security, performance, cost, governance, orchestration, testing, and long-term maintainability.
Requirements
- 5+ years of professional experience in data engineering, analytics engineering, data architecture, or a closely related role.
- Strong SQL skills, including stored procedures, joins, window functions, CTEs, merge logic, and performance tuning.
- Hands-on experience with PySpark or other distributed data processing frameworks.
- Experience designing and maintaining ETL/ELT pipelines in a cloud environment.
- Experience with data lake or lakehouse architecture, including Delta Lake or similar table formats.
- Strong understanding of data modeling concepts, including dimensional modeling, staging layers, curated layers, and semantic/reporting models.
- Experience with pipeline orchestration, scheduling, dependency management, and failure recovery.
- Ability to troubleshoot data quality issues, pipeline failures, schema drift, performance bottlenecks, and source system inconsistencies.
- Familiarity with modern data governance concepts, including cataloging, lineage, ownership, access control, data definitions, and data quality monitoring.
- Ability to communicate clearly with both technical and non-technical stakeholders.
- Experience reviewing existing systems and recommending practical, incremental improvements.
- Experience with Azure Synapse Analytics, Azure Data Lake Storage, Microsoft Fabric, Azure Data Factory, or related Azure data services.
Technologies
- PySpark
- Delta Lake
- SQL
- Azure Synapse Analytics
- Apache Spark
- Azure Data Lake Storage
- Microsoft Fabric
- Azure Data Factory
- ThoughtSpot
- Power BI