Job Description:
• In this Senior Staff role, you will set technical direction and lead execution for ML evaluation and the end-to-end data flywheel powering CSxAI products (e.g., assistive agents, issue resolution, and tooling).
• Your work will define how we measure quality, how we turn feedback into learning signals, and how we continuously improve models and products safely and efficiently.
• You will partner closely with product, engineering, design, and operations to build evaluation systems that are trusted, scalable, and actionable, connecting offline metrics to online outcomes.
• Work with large-scale structured and unstructured data; explore, experiment, build, and continuously improve Machine Learning models and pipelines for Airbnb product, business, and operational use cases.
• Work collaboratively with cross-functional partners, including product managers, operations, and data scientists, to identify opportunities for business impact; understand, refine, and prioritize requirements for machine learning; and drive engineering decisions.
• Develop, productionize, and operate Machine Learning models and pipelines hands-on and at scale, covering both batch and real-time use cases.
• Leverage third-party and in-house Machine Learning tools and infrastructure to develop reusable, highly differentiating, and high-performing Machine Learning systems that enable fast model development, low-latency serving, and easy upkeep of model quality.
Requirements:
• Educational Background: PhD in Computer Science, Mathematics, Statistics, or a related technical field (or equivalent practical experience).
• Industry Experience: 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production.
• Leadership Experience: 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams.
• Technical Proficiency:
• Deep expertise in evaluation methodology (offline/online alignment, metric design, human-in-the-loop evaluation, A/B testing, power analysis, regression testing).
• Hands-on experience with GenAI systems, including orchestration, retrieval, tool calling, and memory.
• Experience building data pipelines and quality systems (labeling workflows, dataset curation, versioning, monitoring, and governance).
• Solid ML fundamentals and best practices (model selection, training/serving, monitoring, reliability, and model lifecycle management).
Benefits:
• This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.