Senior Failure Analysis Engineer
Job Description
The Senior Failure Analysis Engineer is a hands-on position focused on examining customer-returned SSD products to determine root causes and drive resolution. The role acts as the primary technical interface between customers and internal teams and leads a small team of 1–3 failure analysis engineers.
Responsibilities
- Lead end-to-end handling of customer-returned SSDs, including investigation, root-cause analysis, documentation, customer communication, and issue closure.
- Collaborate directly with customers to gather failure symptoms, usage conditions, logs, and reproduction details required for effective investigations.
- Provide timely technical updates and formal failure analysis reports to customers throughout the investigation process.
- Perform hands-on electrical, firmware, system-level, and physical failure analysis on SSD products.
- Debug and analyze SSD failures involving NAND flash, controllers, DRAM, PCBs, power circuitry, and interfaces.
- Execute structured root-cause analysis using appropriate failure analysis and problem-solving methodologies.
- Coordinate corrective actions with cross-functional engineering, manufacturing, quality, and supplier teams to prevent recurrence.
- Lead technical review meetings and customer-facing discussions for escalated failure analysis issues.
- Monitor investigation progress, turnaround times, and closure of key failure analysis issues.
- Maintain comprehensive documentation and failure analysis records to support resolution and knowledge sharing.
- Provide supervision and technical guidance to the assigned failure analysis team of 1–3 engineers.
Requirements
- 8+ years of hands-on experience in SSD failure analysis, SSD debugging, or storage product engineering.
- 3+ years of leadership experience is preferred.
- Bachelor’s or Master’s degree in electrical engineering, computer engineering, materials science, or a related field.
- Experience leading technical investigations and supervising or mentoring engineers in a failure analysis or related environment.
- Strong understanding of SSD architecture, including NAND flash, SSD controllers, NVMe/SATA/PCIe interfaces, and firmware interactions.
- Experience handling customer-facing failure analysis, field-return investigations, and escalated technical issues.
- Strong failure analysis and debugging skills, with a methodical, investigative approach to identifying root causes in SSD products.
Technologies
- Oscilloscopes
- Logic analyzers
- Protocol analyzers
- X-ray
- SEM/FIB
- Thermal analysis tools
- Python
- SQL
- JMP
- MATLAB
Role Details
This is a full-time, onsite role based in Rancho Santa Margarita, California, with a Monday through Friday schedule and no travel required. Compensation is USD 140,000 to 160,000 per year, and the position is eligible for a bonus. The role may start at a higher level depending on experience and internal alignment.
Preferred Qualifications
- Experience with enterprise SSD products or hyperscale customer support.
- Knowledge of NAND reliability mechanisms such as retention, read disturb, endurance wear, and ECC behavior.
- Familiarity with automotive or industrial reliability standards.