Job Description

Tesla's network engineering team in Fremont is expanding its ML capabilities. This onsite role centers on training large language models, processing networking data at scale, and building intelligent diagnostics, with a focus on scalable pipelines, backend services, and deployment workflows.

Responsibilities

  • Design, build, and maintain pipelines for LLM pretraining, supervised fine-tuning (SFT), reward modeling, PPO/RLHF, and model evaluation
  • Develop scalable backend services and distributed training infrastructure
  • Construct data-ingestion and transformation pipelines for networking datasets including device inventory, topology data, ARP/MAC tables, routing state, metrics, telemetry, and logs
  • Normalize and model heterogeneous networking data to support ML workflows and agent inference
  • Develop intelligent agent capabilities, including intent classification, context tracking, troubleshooting logic, and action-routing workflows
  • Collaborate with networking domain experts to translate operational needs into model and platform capabilities
  • Implement CI/CD, model versioning, orchestration, and production-grade deployment pipelines
  • Drive architectural decisions to ensure scalability, modularity, reliability, and performance

Requirements

  • 5+ years of experience in applied ML, ML systems, or backend platform engineering
  • Strong experience with LLM training, including SFT, LoRA/PEFT, RLHF, reward modeling, and PPO
  • Proficiency with PyTorch and distributed training technologies
  • Solid backend engineering skills in Python, including APIs, microservices, data modeling, and containerization
  • Strong understanding of networking fundamentals such as IP addressing, Ethernet, VLANs, routing protocols (BGP, OSPF, IS-IS), and topology concepts
  • Experience working with networking telemetry or operational data (SNMP, flow data, metrics, logs, routing/forwarding tables)
  • Ability to interpret and model complex network states, device relationships, and diagnostic patterns

Technologies

  • PyTorch
  • Python

Benefits

  • Medical plans with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental and vision plans with a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution
  • Healthcare Flexible Spending Account (FSA)
  • Dependent Care Flexible Spending Account (FSA)
  • 401(k) with employer match
  • Employee Stock Purchase Plan (ESPP)
  • Company-paid Basic Life and AD&D insurance
  • Short-term disability insurance
  • Long-term disability insurance
  • Employee Assistance Program
  • Paid time off, including sick leave and vacation
  • Paid holidays
  • Back-up childcare resources
  • Parenting support resources
  • Critical illness insurance
  • Hospital indemnity insurance
  • Accident insurance
  • Theft & legal services
  • Pet insurance
  • Weight loss program
  • Tobacco cessation program
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program

What to Expect

We seek a seasoned ML engineering professional to lead the development of a platform that fuses large-scale LLM training with advanced network data processing and intelligent diagnostic capabilities. The role demands strong machine-learning engineering skills, solid software engineering fundamentals, and a thorough grasp of networking systems.

Expected Compensation

Annual salary ranges from $140,000 to $300,000, plus cash and stock awards and benefits. Pay offered may vary based on location, job-related knowledge, skills, and experience. The total compensation package may include additional elements contingent on the position offered. Details of participation in benefit plans will be provided if an offer of employment is extended.
