AI Operation Engineer-ITIL Process
Jersey City, NJ, United States
Job Description
Role: AI Operation Engineer - ITIL Process
Duration: Long Term
Location: Jersey City, NJ and/or Charlotte, NC (1-2 onsite trips per month; remote work allowed)
We're seeking a technical expert with a strong background in AI operations, ITIL processes, and cloud-based technologies to support the launch of new capabilities for a large Financial Services provider. The successful candidate will have a deep understanding of GenAI solutions, machine learning, and data engineering. This role will focus on deploying, operating, maintaining, optimizing, and managing inference services that support our Auto Recommend and Enhanced Search solution.
Key Responsibilities:
Provide Level 3 support for incident management, including issue identification, diagnosis, escalation, resolution, and coordination with key stakeholders and providers
Perform vulnerability management, including risk assessment, CVE scanning, patching, and remediation
Integrate and operate monitoring and alerting systems
Tune and troubleshoot model performance
Manage container image deployment and development
Develop and operate end-to-end deployment processes for model and code deployment
Collaborate with cross-functional teams to ensure smooth operation of AI solutions
Requirements:
15+ years of enterprise consulting experience with a focus on data, machine learning, and GenAI solutions
Proficiency in designing and delivering solutions that leverage GenAI technologies (e.g., LLMs, Foundation Models)
Deep familiarity with relevant concepts and models/technologies (e.g., transformer models, prompt engineering, model fine-tuning)
Experience delivering and scaling complex infrastructure solutions across diverse platforms
Strong knowledge of vLLM, OpenShift AI, Prometheus, Grafana, and Aqua, and of automating pipeline deployment and execution
Proficient in Python and SQL, with experience in Apache Spark, Apache Hadoop, Informatica, and similar data processing tools
Proven experience with building test procedures and ensuring data pipeline quality, reliability, performance, and scalability
Strong communication and customer-facing skills
Ability to work efficiently in collaborative teams using Agile methodologies
University Degree aligned to Data Engineering and/or Data Science
Relevant industry certifications (e.g., Databricks Certified Data Engineer, Microsoft Certifications, NVIDIA Certifications)
Additional Information
All your information will be kept confidential according to EEO guidelines.