tl;dr: high-impact opportunity at LLM response optimization start-up for motivated and talented data scientist. Great fit if you want to be a founder one day. New York-based role.
The Job:
We are looking for a growth-oriented, positive-minded, talented and motivated data scientist to join our founding team. You will be based in New York and report directly into our Co-founder and CTO.
Joining us this early means you’re going to have huge influence over the technical and cultural evolution of the company. You are signing up for a fast-paced working environment with a high bar for quality. We are working on hard problems and we have high expectations for one another.
This is a great fit if you hope to start your own company one day.
Together, we’re going to face the hardest technical challenges this journey has to offer head on.
What you’re signing up for:
- Create innovative AI Agent architectures to solve high-complexity use-cases
- Design and develop innovative systems for evaluation and benchmarking of LLM performance around response quality, cost and latency
- Design, prototype and implement: meta-models to predict LLM performance, optimization algorithms to select the best LLM, and monitoring systems for LLM inputs and outputs
- Work directly with the CEO and CTO as a thought partner
- Work closely with customers to understand their needs and own end-to-end implementation of solutions in collaboration with engineering
You are:
- An experienced data scientist / software developers (3-5 years of professional xp with published research)
- An excellent Python programmer
- Experienced working in large distributed systems
- An expert in training/tuning LLMs with PyTorch or other frameworks (plus for use of data or model parallelism)
- Experienced with the Hugging Face Transformers library
- Experienced evaluating machine learning models for NLP applications
- Familiar with the latest embedding models, transformer architectures, reinforcement learning, advances in training techniques, and multi-dimensional evaluation of generated text