Machine Learning Engineer
Job Description
Robert Half is seeking a Machine Learning Engineer to design, deploy, and operate scalable ML infrastructure in a Databricks-centric environment. The role emphasizes GenAI and LLM systems, MLOps platforms, feature stores, vector search, and retrieval-augmented generation. It is on-site in Los Angeles, with a compensation range of $200,000 to $260,000 per year.
The position focuses on building and maintaining robust ML pipelines and governance, enabling production-grade model deployment, monitoring, and optimization while facilitating advanced document understanding and enterprise search capabilities.
Responsibilities
- Lead the architecture, deployment, and ongoing upkeep of scalable ML infrastructure on Databricks, including MLflow for experiment tracking, a model registry, and endpoints for model serving.
- Oversee the MLOps platform and automated pipelines that deploy, monitor, and maintain models in production.
- Implement robust model versioning, systematic retraining, and artifact management, using Databricks Unity Catalog for governance.
- Design and manage the Databricks Feature Store to ensure consistent feature engineering across training and inference workflows.
- Architect and deploy Retrieval-Augmented Generation (RAG) systems for document Q&A over fund documents, investor letters, and market research.
- Design, deploy, and manage vector database solutions (Databricks Vector Search, Pinecone, or similar) to enable semantic search across enterprise documents.
- Lead fine-tuning and customization of LLMs, including Claude or open-source models, using CIM proprietary data while maintaining privacy and compliance.
- Develop and optimize document processing pipelines (PDF parsing, chunking strategies, embedding generation) for RAG applications.
- Implement prompt engineering best practices and establish LLM evaluation frameworks to ensure output quality, relevance, and factual accuracy.
- Build guardrails and safety measures for GenAI applications, including hallucination detection, output validation, and source attribution.
- Design and implement extensive automation across the ML lifecycle, covering training, testing, validation, and deployment with Databricks Workflows and Asset Bundles.
- Set up robust CI/CD pipelines for both traditional ML models and GenAI applications using GitHub Actions, Azure DevOps, or comparable tools.
- Automate complex data and model workflows leveraging orchestration tools such as Airflow, Prefect, or Databricks Workflows.
Technologies
- Databricks
- MLflow
- Databricks Unity Catalog
- Databricks Feature Store
- Databricks Vector Search
- Pinecone
- Claude
- GitHub Actions
- Azure DevOps
- Airflow
- Prefect
- Databricks Workflows
- Databricks Asset Bundles
- Python
- TensorFlow
Benefits
- Medical insurance
- Vision insurance
- Dental insurance
- Life insurance
- Disability insurance
- 401(k) plan
- Free online training