Job Description:
• Design and implement a lightweight supervised fine-tuning pipeline using open-source LLMs.
• Create new benchmarks to evaluate frontier models across defined scientific and performance criteria.
• Analyze production models to identify measurable areas for improvement.
• Improve model performance through targeted retraining and hyperparameter search.
• Deploy improved models while maintaining core model characteristics and avoiding regression.
• Build Python tooling to automate training, evaluation, benchmarking, and experimentation workflows.
• Implement structured evaluation methods, including rubric-based scoring and LLM-as-a-judge workflows.
• Document experimental design, benchmark methodology, and performance results with clarity and precision.
• Iterate rapidly in a research-driven environment to increase model quality and reliability.
Requirements:
• Current enrollment in or recent completion of a Master’s or PhD in Computer Science, AI, Machine Learning, Computer Engineering, or a closely related technical field.
• Strong experience working with large language models, including supervised fine-tuning, prompt engineering, or model evaluation.
• Hands-on experience building machine learning pipelines or research infrastructure.
• Experience improving model performance through retraining or hyperparameter tuning.
• Proficiency in Python and comfort working with machine learning frameworks and open-source model ecosystems.
• Familiarity with cloud environments such as AWS or Azure.
• Strong technical problem-solving ability, including use of LLMs as development aids for building and iteration.
• Ability to work independently with minimal supervision.
• Strong written communication skills for summarising research and drafting technical documentation.
• Ability to collaborate effectively in a remote research environment.
Benefits:
• Duration: June to August
• Schedule: Full-time
• Work Type: Remote