EngineerJobs.io

Job Description

Bitus Labs seeks a Machine Learning Engineer to develop agent systems and production inference for an online gaming product, with fluent Mandarin Chinese required.

Responsibilities

  • Design, build, and optimize LLM-powered agents, covering planning, tool use, workflow orchestration, and multi-step reasoning
  • Architect memory systems including short-term memory, long-term memory, context management, and session state
  • Build and optimize retrieval-augmented generation (RAG) pipelines for relevance, grounding, freshness, and retrieval quality
  • Design and operate vector store infrastructure such as pgvector, Milvus, Qdrant, and Weaviate
  • Define evaluation methodologies for agents, prompts, and workflows
  • Optimize end-to-end agent quality, latency, reliability, and operating cost
  • Build and operate production inference services that are low-latency, high-concurrency, and highly reliable
  • Serve online-learning models with real-time inference and online parameter or weight updates (for example, contextual bandits and reinforcement learning policies)
  • Deploy and optimize AI inference systems for latency, throughput, reliability, and resource efficiency
  • Analyze and resolve inference serving bottlenecks
  • Support deployment and serving of recommendation, ranking, and reinforcement learning models developed by research scientists
  • Apply lightweight model adaptation techniques (LoRA, QLoRA, PEFT) when appropriate for domain-specific requirements
  • Build and maintain deployment pipelines, observability systems, and tracing infrastructure for agents and serving endpoints
  • Monitor quality regression, performance degradation, and model drift
  • Maintain version control for models, prompts, datasets, and agent configurations
  • Contribute to automated validation, testing, and CI/CD workflows for AI systems
  • Partner with research scientists, backend engineers, and data scientists to integrate AI systems into production products
  • Document systems, best practices, and internal tooling
  • Contribute to engineering standards and operational excellence across AI initiatives

Requirements

  • Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field
  • 3+ years of industry experience in Machine Learning Engineering or related roles
  • Strong software and systems engineering experience building low-latency, reliable production services in Go, Rust, C++, or an equivalent language
  • Experience building or supporting real-time inference systems for recommendation, ranking, contextual bandits, reinforcement learning, or similar adaptive ML applications
  • Strong experience with PyTorch and the Hugging Face ecosystem
  • Experience building production LLM or agent applications (for example LangGraph, LlamaIndex, or equivalent frameworks)
  • Hands-on experience with RAG systems, embeddings, and vector databases
  • Experience evaluating and monitoring LLM or agent systems in production
  • Experience deploying and optimizing production machine learning or LLM systems
  • Understanding of inference runtime behavior, resource utilization, latency optimization, and production serving performance
  • Experience with Docker and Kubernetes
  • Experience with cloud platforms such as AWS, GCP, or Azure
  • Fluent Mandarin Chinese

Technologies

  • Go
  • Rust
  • C++
  • PyTorch
  • Hugging Face
  • LlamaIndex
  • pgvector
  • Milvus
  • Qdrant
  • Weaviate
  • Docker
  • Kubernetes
  • AWS
  • GCP
  • Azure
  • LoRA
  • QLoRA
  • PEFT
  • CUDA
  • OpenAI Triton
  • TFLite
  • CoreML
  • FSDP
  • DeepSpeed
  • Spark
  • Hadoop

Benefits

  • 401(k)
  • 401(k) matching
  • Dental insurance
  • Health insurance
  • Life insurance
  • Paid time off
  • Parental leave
  • Retirement plan
  • Vision insurance

Pay

Salary: USD 130,000 per year

Location details

  • Ability to commute: Irvine, CA 92618 (Required)
  • Ability to relocate: Irvine, CA 92618; must relocate before starting work (Required)
  • Work location: In person
