Data Engineer, PXT Central Science
Job Description
Based onsite in Seattle, WA, this Data Engineer position sits with Amazon's PXT Central Science team within the People Experience and Technology organization. The role focuses on building robust data pipelines, productionizing analytical models, and delivering analytics that support employee sentiment and business outcomes.
Responsibilities
- Data Pipeline Development: Design and maintain scalable data pipelines using native AWS services such as Glue, EMR, and Lambda; implement comprehensive monitoring and error handling for data workflows; continually optimize for performance, reliability, and cost efficiency.
- Model Productionization & API Development: Create APIs and data-serving layers to productionize science models for downstream consumption; build batch and real-time inference pipelines.
- Data Integration & Quality: Develop scalable feature extraction and processing frameworks for diverse data types; implement robust data quality and validation checks; design flexible schemas to support evolving requirements.
- Cross-team Collaboration: Partner with economics, data science, and software engineering teams to translate analytical requirements into production-ready solutions; participate in technical design reviews and architecture discussions.
- Analytics & Infrastructure: Maintain layered data systems used by economists and scientists; develop automated reporting solutions; operate across multiple interconnected AWS accounts with security best practices.
Requirements
- Knowledge of professional software engineering and full software development lifecycle best practices, including coding standards, architectures, code reviews, source control, continuous deployment, testing, and operational excellence.
- At least 3 years of data engineering experience.
- Experience with at least one modern programming or scripting language such as Python, Java, Scala, or Node.js.
- Experience with data modeling, warehousing, and building ETL pipelines.
- Experience with AWS technologies including Redshift, S3, Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions.
- Experience with non-relational databases or data stores, including object storage, document or key-value stores, graph databases, or column-family databases.
- Bachelor's degree or foreign equivalent in computer science, engineering, mathematics, or a related field.
Technologies
- Python, Java, Scala, Node.js
- AWS Glue, EMR, Lambda, Redshift, S3, Kinesis, Firehose, IAM
- Hadoop, Hive, Spark
Benefits
- Health insurance
- 401(k) matching
- Paid time off
- Parental leave
- Sign-on payments
- Restricted stock units (RSUs)
- Flexible Spending Accounts
- Adoption and Surrogacy Reimbursement
- Employee Assistance Program (EAP)
- Mental Health Support
About the Team
The Central Science Team within Amazon’s People Experience and Technology organization (PXTCS) uses economics, behavioral science, statistics, machine learning, and generative AI to proactively identify mechanisms and process improvements that benefit both Amazon and the value of work for Amazonians. It is an interdisciplinary group that blends science, engineering, and user experience to deliver solutions with measurable impact.