Lead Machine Learning Engineer
Job Description
Capital One is seeking a Lead Machine Learning Engineer to productionize ML applications at scale, influence architectural design, and ensure high availability within an Agile team in New York, NY (onsite).
Responsibilities
- Design, build, and deploy ML models and components that address real-world business needs, collaborating with Product and Data Science teams.
- Guide ML infrastructure decisions through your understanding of modeling techniques, including model choice, data and feature selection, training, hyperparameters, dimensionality, bias/variance, and validation.
- Address complex challenges by writing and testing application code, developing and validating ML models, and automating tests and deployment pipelines.
- Work within a cross-functional Agile team to create and enhance software powering advanced big data and ML workloads.
- Retrain, monitor, and maintain models in production environments to sustain performance and reliability.
- Use or build cloud-based architectures and platforms to deliver optimized ML models at scale.
- Construct efficient data pipelines to feed ML models with quality, timely data.
- Adopt continuous integration and continuous deployment practices, including test automation and monitoring, to ensure successful deployment of ML models and application code.
- Ensure code quality, model governance from a risk perspective, and adherence to Responsible and Explainable AI best practices.
- Work with programming languages such as Golang, Python, Scala, or Java to implement solutions.
Requirements
- Bachelor's Degree required
- Minimum six years of experience designing and building data-intensive solutions with distributed computing (internships not counted)
- At least four years of hands-on programming experience with Python, Scala, or Java
- Minimum two years of experience building, scaling, and optimizing ML systems
Technologies
- Golang
- Python
- Scala
- Java
- scikit-learn
- PyTorch
- Dask
- Spark
- TensorFlow
- AWS
- Azure
- Google Cloud Platform
Benefits
- Performance-based incentive compensation (cash bonuses and/or long-term incentives)
- Health, financial, and other benefits supporting overall well-being
Preferred Qualifications
- Master's or Doctoral degree in computer science, electrical engineering, mathematics, or a related field
- 3+ years building production-ready data pipelines that feed ML models
- 3+ years of hands-on experience with industry-leading ML frameworks such as scikit-learn, PyTorch, Dask, Spark, or TensorFlow
- 2+ years writing performant, resilient, and maintainable code
- 2+ years of experience gathering and preprocessing data for ML models
- 1+ years leading teams delivering ML solutions using established patterns and automation
- Experience deploying ML solutions in public cloud environments (AWS, Azure, or GCP)
- Experience designing, implementing, and scaling complex data pipelines for ML models and evaluating performance
- Demonstrated ML industry impact through conferences, publications, blogs, open source, or patents
- Experience leveraging interactive AI tooling to boost productivity beyond basic code completion