AI71

LLM Engineer

Job Description

Company: AI71

Role: LLM Engineer

Location: Abu Dhabi, UAE

About Us:

AI71 is an applied research team dedicated to creating helpful and responsible AI agents for knowledge workers.

Working closely with our industry partners, our cross-functional teams of AI experts build products grounded in the cutting-edge research of our colleagues from the Technology Innovation Institute (TII).

Job Description:

As a Senior LLM Engineer, you will be responsible for the end-to-end development, optimization, and deployment of large language models. You'll work on challenging problems at the intersection of deep learning, natural language processing, and distributed computing.

What You'll Do:

  • Analyze large and complex datasets to extract meaningful insights and inform data-driven decision-making
  • Develop, train, and deploy predictive models to enhance the capabilities of our AI solutions
  • Collaborate with cross-functional teams to understand business objectives and translate them into actionable data science tasks
  • Design and implement advanced LLM architectures, including transformer-based models and their variants
  • Develop novel attention mechanisms and positional encoding schemes
  • Experiment with model scaling techniques and efficient architectures (e.g., MoE, sparse transformers)
  • Continuously evaluate and improve existing models based on real-world performance and evolving business needs
  • Implement and optimize distributed training pipelines for large-scale models
  • Develop strategies for efficient fine-tuning, including parameter-efficient techniques (e.g., LoRA, prefix tuning)
  • Apply advanced optimization techniques such as mixed-precision training and gradient accumulation
  • Optimize models for inference, including quantization and pruning techniques
  • Implement efficient serving solutions for real-time inference
  • Develop strategies for model compression and knowledge distillation
  • Develop task-specific algorithms for applications such as text classification, named entity recognition, and question-answering
  • Work with MLOps teams to design and maintain training and serving infrastructure

What You'll Bring:

  • 5+ years of experience in deep learning and NLP, with a focus on large language models
  • Master's or Ph.D. in Data Science, Statistics, Computer Science, or a related field
  • Expert-level proficiency in Python and at least one deep learning framework (PyTorch, TensorFlow, or JAX)
  • Strong understanding of transformer architectures, attention mechanisms, and recent advancements in LLMs
  • Experience with distributed training frameworks (e.g., DeepSpeed, Megatron-LM)
  • Proficiency in optimizing model performance using techniques like mixed-precision training, gradient checkpointing, and model parallelism
  • Understanding of NLP algorithms such as tokenization, parsing, and semantic analysis
  • Experience with sequence-to-sequence models and self-supervised learning techniques
  • Experience with both SQL and NoSQL databases for managing training data and model artifacts

Why AI71:

  • Proven performance of our large language models
  • Strong traction and adoption from the open-source community
  • Secured proprietary data for building specialized, distinctive models
  • Secured large-scale compute capacity to support our roadmap
  • Signed anchor clients to develop proofs of concept (POCs) and demonstrate our solutions

More Info

Industry: Other

Job Type: Permanent Job

Date Posted: 31/10/2024

Job ID: 98755291

Last Updated: 25-11-2024 08:43:13 PM