On-Device LLM Research Engineer

Mountain View, United States

Full Time | Senior-level / Expert | USD 120K - 192K

WorldLink US

WorldLink is a leading provider of Data & Analytics services with a global reach and 25 years of experience.


Posted 1 month ago

TITLE: On-Device LLM Research Engineer

POSITION TYPE: Full Time (W2)

LOCATION: Mountain View, CA

ABOUT WorldLink:

WorldLink is a rapidly growing information technology company at the forefront of the tech transformation. From custom software development to cloud hosting, from big data to cognitive computing, we help companies harness and leverage today’s most cutting-edge digital technologies to create value and grow.

Collaborative. Respectful. Work hard, play hard. A place to dream and do. These are just a few phrases that describe what life is like at WorldLink. We embrace a culture of experimentation and constantly strive for improvement and learning.

We take pride in our employees and their future with continued growth and career advancement. We put TEAM first. We are a competitive group that likes to win. We're grounded by humility and driven by ambition. We're passionate, and we love tough problems and new challenges. You don't hear a lot of "I don't know how" or "I can't" at WorldLink. If you are passionate about what you do and enjoy having fun while doing it, are tired of rigid and strict work environments, and would like to work in a non-bureaucratic startup culture, WorldLink may be the place for you.

For more information about our craft, visit https://worldlink-us.com.

WHO we’re looking for:

We are looking for an On-Device LLM Research Engineer with expertise in developing LLM-specific applications for mobile, edge, and resource-constrained environments. You will focus on building, fine-tuning, and optimizing large language models (LLMs) for practical, real-world applications within the connected-things ecosystem (mobile, wearable, appliance, connected car, etc.). You will play a key role in ensuring that LLMs are efficiently deployed on devices using frameworks such as LiteRT, ExecuTorch, and MediaPipe LLM, among others. Collaborating with cross-functional teams, you will tackle challenges in mobile AI, NLP, and security, delivering high-performance products that enhance user experiences.

Role and Responsibilities:

Design and develop LLM-driven applications for mobile, wearable, and IoT devices, ensuring optimized performance for on-device deployment.
Research and build efficient LLM architectures, focusing on improving performance and minimizing computational overhead on resource-constrained devices.
Develop and fine-tune LLMs for specific tasks, ensuring high efficiency and low latency for mobile devices while maintaining accuracy and real-time capabilities.
Implement advanced search capabilities in LLM applications by integrating knowledge graphs, vector-based search, and keyword-based hybrid search to deliver more relevant and contextual results.
Implement model distillation, fine-tuning, and compression techniques to create models that run efficiently on mobile and edge devices.
Work with mobile frameworks such as LiteRT, ExecuTorch, MediaPipe LLM, and similar technologies to deploy LLMs on mobile and embedded platforms.
Collaborate with engineering teams to implement and deploy LLM solutions at scale, continuously iterating on performance, robustness, and user experience.
Integrate retrieval-augmented generation (RAG) pipelines where necessary, enhancing LLM-driven applications by leveraging external data sources for more dynamic responses.
Stay updated on emerging trends in LLM technologies and mobile frameworks, integrating the latest research into practical applications.

Required Experience and Education:

3 years of professional engineering experience in machine learning, deep learning, or NLP, with a strong focus on LLM-specific applications.
Bachelor's degree in engineering, computer science or equivalent is required. Master's degree in engineering, computer science, or equivalent is a plus.
Extensive hands-on experience with LLM architectures and their application in real-world products, particularly for on-device or resource-constrained environments.
Expertise in fine-tuning and optimizing LLMs for mobile, wearable, or IoT platforms.
Proven experience with knowledge graph construction, vector search techniques, and hybrid search (combining keyword-based and vector search) for improving search relevance in LLM applications.
Proven experience with LLM model distillation, pruning, and quantization to create highly efficient models for deployment on mobile and embedded systems.

Necessary Skills and Attributes:

Self-motivated individual with the ability to thrive in a team-based or independent environment.
Detail-oriented with strong organization skills.
Ability to work in a fast-paced environment.
Ability to work with limited supervision and to exercise discretion.
Strong problem-solving skills, particularly in the optimization of LLM-driven applications.
Ability to manage multiple projects in a fast-paced environment, focusing on LLM applications that provide real-world impact.
Passion for exploring and applying LLM technologies, search techniques, and knowledge graphs to enhance user experiences through novel and efficient solutions.

Preferred Qualifications:

At least 7 years of academic or industrial experience in NLP and LLM development, with a focus on building and deploying real-world applications.
Ph.D. in artificial intelligence, machine learning, or NLP.
Deep expertise in building LLM-driven applications optimized for mobile devices.
Experience with on-device mobile frameworks such as LiteRT, ExecuTorch, and MediaPipe LLM, with a focus on efficient deployment of LLMs in resource-constrained environments.
Expertise in model compression techniques (e.g., pruning, distillation, quantization) for mobile devices, ensuring minimal latency and optimal performance.
Proven experience with on-device LLM deployment, managing trade-offs between model performance, latency, memory, and power consumption.
Familiarity with retrieval-augmented generation (RAG) and its integration into LLM-driven applications to enhance contextual and knowledge-based responses.
Publication record in top-tier venues in LLM-related fields or NLP.

Physical Demands:

The physical demands described here are representative of those that must be met by the contract employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

While performing the duties of this job, the contract employee is occasionally required to stand, clean, crawl, kneel, sit, sort, hold, squat, stoop, twist the body, walk, use hands to finger, handle, or feel objects, tools, or controls, reach with hands and arms, climb stairs, ladders, or scaffolding, talk or hear, and lift up to 20 pounds. Specific vision abilities required by the job include the ability to distinguish the nature of objects by using the eye. The job requires operating a computer keyboard and viewing a video display terminal between 50% and 95% of work time, including for prolonged periods. It also requires considerable (90%+) work utilizing high visual acuity/detail, numeric/character distinction, and moderate hand/finger dexterity.

The contract employee performs work under time schedules and stress that are normally periodic or cyclical, including time-sensitive deadlines, intellectual challenge, some language barriers, and project management deadlines. The job may require working additional time beyond the normal schedule and periodic travel.

WHAT we’ll bring:

During your interview process, our team can fill you in on all the details of our industry-competitive benefits and career development opportunities. A few highlights include:

Medical Plans
Dental Plans
Vision Plan
Life & Accidental Death & Dismemberment
Short-Term Disability
Long-Term Disability
Critical Illness / Accident / Hospital Indemnity / Identity Theft Protection
401(k)

WHAT you should know:

Our success begins and ends with our people. We embrace diverse perspectives and value unique human experiences. WorldLink is an Equal Employment Opportunity and Affirmative Action employer. All employment at WorldLink is decided on the basis of qualifications, merit, and business needs. We endeavor to continue our footprint as a diverse organization by highlighting opportunities for all people. WorldLink considers applicants for all positions without regard to race, color, religion or belief, sex (including pregnancy and gender identity), age, national origin, political affiliation, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability, or any other characteristic protected by applicable laws. People with disabilities who need assistance with any part of the application process should contact us.

This job description is designed to cover the main responsibilities and duties of the role but is not intended to be a comprehensive list of all duties.

Salary/Pay Range: $60.00-$96.00/hr (depending on experience)


Categories: Engineering Jobs Research Jobs

Tags: Architecture Big Data Computer Science Deep Learning Engineering Industrial LLMs Machine Learning NLP Pipelines RAG Research Security