Transflo is seeking a Data Scientist / Data Analytics Engineer to design, build, and operationalize analytics solutions across transportation and logistics, delivering both predictive and point-in-time insights on AWS.

Responsibilities

Design, train, validate, and deploy predictive models across regression, classification, time-series forecasting, survival analysis, clustering, anomaly detection, and gradient-boosted or deep learning approaches as appropriate.
Lead model selection, hyperparameter tuning, cross-validation, and performance evaluation using business-aligned metrics such as precision/recall trade-offs, MAPE, RMSE, lift, and calibration.
Develop data products in transportation domains including operational metrics, fraud signals, pricing analytics, and industry trends.
Establish model monitoring, drift detection, retraining cadence, and explainability practices (SHAP, feature importance, partial dependence) to maintain production reliability.
Produce point-in-time analytics, KPI scorecards, and exception reporting to inform daily decisions across dispatch, fleet, customer success, finance, and product teams.
Partner with business stakeholders to translate questions into well-scoped analyses and deliver defensible insights with documented assumptions and data lineage.
Build and maintain reusable analytical datasets, semantic layers, and certified metrics to ensure a single source of truth.
Design and maintain data pipelines (batch and streaming) on AWS using Redshift, S3, Glue, Lambda, Step Functions, Kinesis / MSK, EMR, Athena, and SageMaker.
Apply medallion architecture (bronze, silver, gold) to progressively refine raw operational data into analytics-ready and ML-ready datasets.
Use STARR (star-schema / dimensional) modeling to create performant, business-friendly data models in Redshift and the warehouse layer.
Drive data selection, curation, profiling, and quality enforcement with source-of-truth datasets, lineage documentation, and data contracts.
Collaborate with data engineering and platform teams on CI/CD for data and ML assets, infrastructure as code, and cost-aware AWS design.
Take customer-facing analytics features from concept to implementation in partnership with product, design, and engineering.
Contribute to product discovery through interviews, opportunity sizing, prototyping, and rapid iteration on analytics concepts.
Own the analytical correctness of customer-facing metrics, models, and visualizations, including edge cases and explanations for non-technical users.
Define success metrics for shipped analytics features and drive iterative improvements post-launch.
Translate complex analyses into clear narratives and visuals for technical and non-technical audiences, including executives and customers.
Partner across product, engineering, operations, and commercial teams to embed analytics into workflows and customer-facing products.
Mentor analysts and engineers on statistical rigor, modeling practices, and modern data architecture.

Requirements

Bachelor’s degree in Statistics, Mathematics, or Supply Chain Management; Computer Science is acceptable. Master’s degree preferred but not required.
Professional experience in transportation, trucking, freight, logistics, or broader supply chain with working knowledge of loads, stops, shipments, ELD/telematics, TMS, dispatch, and billing data.
Proven track record taking customer-facing analytics products from idea through launch, including discovery, scoping, metric and model design, and production support with real customers. Candidates should be prepared to discuss at least one end-to-end example.
Strong ability to build advanced analytical models end-to-end: problem framing, data selection, feature engineering, model training/validation, and deployment.
Hands-on experience with AWS PaaS and analytics tooling, including Redshift and services such as S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, and SageMaker.
Proficiency in SQL (advanced window functions, performance tuning on Redshift or similar warehouses) and at least one analytics language (Python preferred) with libraries like pandas, scikit-learn, statsmodels, XGBoost/LightGBM, and PyTorch or TensorFlow as appropriate.
Experience designing and operating production data pipelines with clear orchestration, idempotency, observability, and data quality practices.
Solid grounding in statistics including hypothesis testing, experimental design, regression, time-series, and uncertainty quantification.

Technologies

AWS, Redshift, S3, Glue, Lambda, Step Functions, Kinesis, MSK, EMR, Athena, SageMaker
Python, pandas, scikit-learn, statsmodels, XGBoost, LightGBM, PyTorch, TensorFlow
Jupyter, SQL, QuickSight, Power BI, Looker
Airflow, Git, CI/CD, Terraform, CloudFormation
Medallion architecture, STARR (Star schema / dimensional) modeling

Preferred Qualifications

Master’s degree in Statistics, Mathematics, Operations Research, Supply Chain, Computer Science, or a related field.
Experience implementing medallion architecture in cloud data lakehouse or warehouse environments.
Experience designing STARR / star-schema dimensional models for analytics consumption.
Experience with streaming and event-driven data (Kinesis, Kafka/MSK) for near real-time analytics on transportation events.
Experience deploying and monitoring ML models in production using SageMaker, MLflow, or equivalent MLOps tooling.
Familiarity with BI tools and semantic layer concepts (QuickSight, Power BI, Looker).
Exposure to optimization and operations research techniques applied to transportation problems.
Experience with ELD/HOS data, telematics feeds, geospatial data, TMS/dispatch data, brokerage data, and understanding of transportation backoffice operations.

Core Competencies

Analytical rigor and the ability to defend methodology and assumptions
Business pragmatism and value-driven problem solving
Product mindset with focus on end-user experience
Engineering discipline and reproducible, testable code
Stakeholder partnership and clear communication of trade-offs
Curiosity, ownership, and root-cause problem-solving

Representative Tech Environment

Cloud & Data Platform: AWS (Redshift, S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, SageMaker)
Modeling & Analysis: Python (pandas, scikit-learn, statsmodels, XGBoost/LightGBM, PyTorch/TensorFlow), SQL, Jupyter
Data Architecture: Medallion (bronze/silver/gold), STARR dimensional models, data contracts, lineage tooling
Orchestration & DevOps: Airflow / Step Functions, Git, CI/CD, Terraform or CloudFormation
Visualization: QuickSight, Power BI, or Looker

