EngineerJobs.io

Job Description

The Senior Failure Analysis Engineer is a hands-on position focused on examining customer-returned SSD products to determine root causes and drive resolution. The engineer serves as the primary technical interface between customers and internal teams and leads a team of 1–3 failure analysis engineers.

Responsibilities

  • Lead end-to-end handling of customer-returned SSDs, including investigation, root-cause analysis, documentation, customer communication, and issue closure.
  • Collaborate directly with customers to gather failure symptoms, usage conditions, logs, and reproduction details required for effective investigations.
  • Provide timely technical updates and formal failure analysis reports to customers throughout the investigation process.
  • Perform hands-on electrical, firmware, system-level, and physical failure analysis on SSD products.
  • Debug and analyze failures involving NAND flash, controllers, DRAM, PCBs, power circuitry, and interface-related issues in SSDs.
  • Execute structured root-cause analysis using appropriate failure analysis and problem-solving methodologies.
  • Coordinate corrective actions with cross-functional engineering, manufacturing, quality, and supplier teams to prevent recurrence.
  • Lead technical review meetings and customer-facing discussions for escalated failure analysis issues.
  • Monitor investigation progress, turnaround times, and closure of key failure analysis issues.
  • Maintain comprehensive documentation and failure analysis records to support resolution and knowledge sharing.
  • Provide supervision and technical guidance to the assigned failure analysis team of 1–3 engineers.

Requirements

  • 8+ years of hands-on experience in SSD failure analysis, SSD debugging, or storage product engineering.
  • 3+ years of leadership experience is preferred.
  • Bachelor’s or Master’s degree in electrical engineering, computer engineering, materials science, or a related field.
  • Experience leading technical investigations and supervising or mentoring engineers in a failure analysis or related environment.
  • Strong understanding of SSD architecture, including NAND flash, SSD controllers, NVMe/SATA/PCIe interfaces, and firmware interactions.
  • Experience handling customer-facing failure analysis, field-return investigations, and escalated technical issues.
  • Robust failure analysis and debugging skills with a methodical, investigative approach to identifying root causes in SSD products.

Technologies

  • Oscilloscopes
  • Logic analyzers
  • Protocol analyzers
  • X-ray
  • SEM/FIB
  • Thermal analysis tools
  • Python
  • SQL
  • JMP
  • MATLAB

Role Details

This is a full-time onsite role based in Rancho Santa Margarita, California, with a Monday through Friday schedule and no travel required. Compensation is listed as USD 140,000 to 160,000 per year, and the position is eligible for a bonus. The role may start at a higher level depending on experience and internal alignment.

Preferred Qualifications

  • Experience with enterprise SSD products or hyperscale customer support.
  • Knowledge of NAND reliability mechanisms such as retention, read disturb, endurance wear, and ECC behavior.
  • Familiarity with automotive or industrial reliability standards.
