Job Description

Transflo is seeking a Data Scientist / Data Analytics Engineer to design, build, and operationalize analytics solutions across transportation and logistics, delivering both predictive and point-in-time insights on AWS.

Responsibilities

  • Design, train, validate, and deploy predictive models across regression, classification, time-series forecasting, survival analysis, clustering, anomaly detection, and gradient-boosted or deep learning approaches as appropriate.
  • Lead model selection, hyperparameter tuning, cross-validation, and performance evaluation using business-aligned metrics such as precision/recall trade-offs, MAPE, RMSE, lift, and calibration.
  • Develop data products in transportation domains including operational metrics, fraud signals, pricing analytics, and industry trends.
  • Establish model monitoring, drift detection, retraining cadence, and explainability practices (SHAP, feature importance, partial dependence) to maintain production reliability.
  • Produce point-in-time analytics, KPI scorecards, and exception reporting to inform daily decisions across dispatch, fleet, customer success, finance, and product teams.
  • Partner with business stakeholders to translate questions into well-scoped analyses and deliver defensible insights with documented assumptions and data lineage.
  • Build and maintain reusable analytical datasets, semantic layers, and certified metrics to ensure a single source of truth.
  • Design and maintain data pipelines (batch and streaming) on AWS using Redshift, S3, Glue, Lambda, Step Functions, Kinesis / MSK, EMR, Athena, and SageMaker.
  • Apply medallion architecture (bronze, silver, gold) to progressively refine raw operational data into analytics-ready and ML-ready datasets.
  • Use STARR (star schema / dimensional) modeling to create performant, business-friendly data models in Redshift and the warehouse layer.
  • Drive data selection, curation, profiling, and quality enforcement with source-of-truth datasets, lineage documentation, and data contracts.
  • Collaborate with data engineering and platform teams on CI/CD for data and ML assets, infrastructure as code, and cost-aware AWS design.
  • Take customer-facing analytics features from concept to implementation in partnership with product, design, and engineering.
  • Contribute to product discovery through interviews, opportunity sizing, prototyping, and rapid iteration on analytics concepts.
  • Own the analytical correctness of customer-facing metrics, models, and visualizations, including edge cases and explanations for non-technical users.
  • Define success metrics for shipped analytics features and drive iterative improvements post-launch.
  • Translate complex analyses into clear narratives and visuals for technical and non-technical audiences, including executives and customers.
  • Partner across product, engineering, operations, and commercial teams to embed analytics into workflows and customer-facing products.
  • Mentor analysts and engineers on statistical rigor, modeling practices, and modern data architecture.

Requirements

  • Bachelor’s degree in Statistics, Mathematics, or Supply Chain Management; Computer Science is acceptable. Master’s degree preferred but not required.
  • Professional experience in transportation, trucking, freight, logistics, or broader supply chain with working knowledge of loads, stops, shipments, ELD/telematics, TMS, dispatch, and billing data.
  • Proven track record of taking customer-facing analytics products from idea through launch, including discovery, scoping, metric and model design, and production support with real customers. Candidates should be prepared to discuss at least one end-to-end example.
  • Strong ability to build advanced analytical models end-to-end: problem framing, data selection, feature engineering, model training/validation, and deployment.
  • Hands-on experience with AWS PaaS and analytics tooling, including Redshift and services such as S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, and SageMaker.
  • Proficiency in SQL (advanced window functions, performance tuning on Redshift or similar warehouses) and at least one analytics language (Python preferred) with libraries like pandas, scikit-learn, statsmodels, XGBoost/LightGBM, and PyTorch or TensorFlow as appropriate.
  • Experience designing and operating production data pipelines with clear orchestration, idempotency, observability, and data quality practices.
  • Solid grounding in statistics including hypothesis testing, experimental design, regression, time-series, and uncertainty quantification.

Technologies

  • AWS, Redshift, S3, Glue, Lambda, Step Functions, Kinesis, MSK, EMR, Athena, SageMaker
  • Python, pandas, scikit-learn, statsmodels, XGBoost, LightGBM, PyTorch, TensorFlow
  • Jupyter, SQL, QuickSight, Power BI, Looker
  • Airflow, Git, CI/CD, Terraform, CloudFormation
  • Medallion architecture, STARR (Star schema / dimensional) modeling

Preferred Qualifications

  • Master’s degree in Statistics, Mathematics, Operations Research, Supply Chain, Computer Science, or a related field.
  • Experience implementing medallion architecture in cloud data lakehouse or warehouse environments.
  • Experience designing STARR / star-schema dimensional models for analytics consumption.
  • Experience with streaming and event-driven data (Kinesis, Kafka/MSK) for near real-time analytics on transportation events.
  • Experience deploying and monitoring ML models in production using SageMaker, MLflow, or equivalent MLOps tooling.
  • Familiarity with BI tools and semantic layer concepts (QuickSight, Power BI, Looker).
  • Exposure to optimization and operations research techniques applied to transportation problems.
  • Experience with ELD/HOS data, telematics feeds, geospatial data, TMS/dispatch data, and brokerage data, and an understanding of transportation back-office operations.

Core Competencies

  • Analytical rigor and the ability to defend methodology and assumptions
  • Business pragmatism and value-driven problem solving
  • Product mindset with focus on end-user experience
  • Engineering discipline and reproducible, testable code
  • Stakeholder partnership and clear communication of trade-offs
  • Curiosity, ownership, and root-cause problem-solving

Representative Tech Environment

  • Cloud & Data Platform: AWS (Redshift, S3, Glue, Lambda, Step Functions, Athena, Kinesis, EMR, SageMaker)
  • Modeling & Analysis: Python (pandas, scikit-learn, statsmodels, XGBoost/LightGBM, PyTorch/TensorFlow), SQL, Jupyter
  • Data Architecture: Medallion (bronze/silver/gold), STARR dimensional models, data contracts, lineage tooling
  • Orchestration & DevOps: Airflow / Step Functions, Git, CI/CD, Terraform or CloudFormation
  • Visualization: QuickSight, Power BI, or Looker
