LLM Engineer

AI71

27 days ago

Exp: 5-7 Years

Abu Dhabi, United Arab Emirates

Job Description

Company: AI71

Role: LLM Engineer

Location: Abu Dhabi, UAE

About Us:

AI71 is an applied research team dedicated to creating helpful and responsible AI agents for knowledge workers.

Working closely with our industry partners, our cross-functional teams of AI experts build products grounded in the cutting-edge research of our colleagues from the Technology Innovation Institute (TII).

Job Description:

As a Senior LLM Engineer, you will be responsible for the end-to-end development, optimization, and deployment of large language models. You'll work on challenging problems at the intersection of deep learning, natural language processing, and distributed computing.

What You'll Do:

Analyze large and complex datasets to extract meaningful insights and inform data-driven decision-making
Develop, train, and deploy predictive models to enhance the capabilities of our AI solutions
Collaborate with cross-functional teams to understand business objectives and translate them into actionable data science tasks
Design and implement advanced LLM architectures, including transformer-based models and their variants
Develop novel attention mechanisms and positional encoding schemes
Experiment with model scaling techniques and efficient architectures (e.g., MoE, sparse transformers)
Continuously evaluate and improve existing models based on real-world performance and evolving business needs
Implement and optimize distributed training pipelines for large-scale models
Develop strategies for efficient fine-tuning, including parameter-efficient techniques (e.g., LoRA, prefix tuning)
Apply advanced optimization techniques such as mixed-precision training and gradient accumulation
Optimize models for inference, including quantization and pruning techniques
Implement efficient serving solutions for real-time inference
Develop strategies for model compression and knowledge distillation
Develop task-specific algorithms for applications such as text classification, named entity recognition, and question-answering
Work with MLOps teams to design and maintain training and serving infrastructure

What You'll Bring:

5+ years of experience in deep learning and NLP, with a focus on large language models
Master's or Ph.D. in Data Science, Statistics, Computer Science, or a related field
Expert-level proficiency in Python and at least one deep learning framework (PyTorch, TensorFlow, or JAX)
Strong understanding of transformer architectures, attention mechanisms, and recent advancements in LLMs
Experience with distributed training frameworks (e.g., DeepSpeed, Megatron-LM)
Proficiency in optimizing model performance using techniques like mixed-precision training, gradient checkpointing, and model parallelism
Understanding of NLP algorithms such as tokenization, parsing, and semantic analysis
Experience with sequence-to-sequence models and self-supervised learning techniques
Experience with both SQL and NoSQL databases for managing training data and model artifacts

Why AI71: