ML Engineer (LLM Focus) - Remote Full-Time, 3 Months with Extension Opportunity

Cairo, Cairo Governorate, Egypt

SWATX


The ML Engineer (LLM) leads the development of AI models focused on language tasks. In this project, the engineer designs and fine-tunes large language models (LLMs) to interpret building regulations and translate them into a computer-processable format (the “rule extraction” stage of compliance checking; mdpi.com). The role bridges the gap between unstructured text (codes, standards) and structured BIM data by enabling natural language understanding of rules. It ensures that complex, nested regulatory clauses can be understood by the system and mapped to the BIM model’s metadata for automated queries.

Key Responsibilities:

·       Regulation Text Interpretation: Develop and refine LLM-based pipelines to parse and understand regulatory text (building codes, standards, etc.), extracting key conditions and requirements. For example, implement prompt-based or fine-tuned models (e.g. GPT-4) for one-shot or few-shot extraction of structured rules from text (mdpi.com).

·       Rule Formalization: Convert interpreted regulations into structured representations (e.g. object-property-condition-value tuples or logical statements) that a compliance engine can use (mdpi.com). This involves defining a schema or labels for the components of a rule (such as which element is regulated, which property, the conditional operator, and the required value).

·       IFC Querying via NLP: Enable the system to query the BIM model’s IFC metadata using natural language inputs or the extracted rules. Design methods for the LLM to map rule requirements to specific IFC entities and attributes. For instance, if a rule says “the living room area must be ≥ 10 m²,” ensure the model can identify IfcSpace objects of type LivingRoom and retrieve their area property.

·       Collaboration on Knowledge Models: Work closely with the Computational and BIM engineers to align the LLM outputs with the BIM data schema and ontology. Ensure that terms in regulations correspond to the correct IFC classes or property names (e.g., “ceiling height” mapped to the right IFC attribute).

·       Model Integration: Integrate the LLM into the overall pipeline, potentially using retrieval-augmented generation (RAG) to fetch relevant model information. For example, use the LLM to generate queries or filters for the BIM model database given a regulatory clause.

·       Testing and Refinement: Evaluate the LLM’s interpretations against sample regulations and expected outcomes. Refine prompts, training data, or model parameters to improve accuracy in capturing regulatory intent. Address edge cases where regulations are ambiguous or have nested conditions, ensuring the LLM can handle complex sentence structures (mdpi.com).
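
As an illustration of the rule formalization and IFC querying responsibilities above, the following is a minimal Python sketch of an object-property-condition-value tuple and a check against space records. It is not the project's actual schema or pipeline: the `Rule` class, the `OPERATORS` table, the `check` function, and the dictionary keys (`ifc_class`, `type`, `name`, `area`) are all hypothetical, and the spaces are plain dictionaries standing in for data that would in practice be read from an IFC model.

```python
import operator
from dataclasses import dataclass

# Hypothetical mapping from comparison symbols to Python comparison functions.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}


@dataclass(frozen=True)
class Rule:
    """One extracted rule as an object-property-condition-value tuple."""
    element: str   # regulated element, e.g. an IFC class name
    subtype: str   # narrower type of the element, e.g. "LivingRoom"
    prop: str      # property being constrained
    op: str        # comparison operator, a key of OPERATORS
    value: float   # required threshold
    unit: str      # unit of the threshold (kept for traceability only)


def check(rule, elements):
    """Return the names of matching elements that violate the rule."""
    compare = OPERATORS[rule.op]
    violations = []
    for element in elements:
        # Skip elements the rule does not apply to.
        if element["ifc_class"] != rule.element or element["type"] != rule.subtype:
            continue
        if not compare(element[rule.prop], rule.value):
            violations.append(element["name"])
    return violations


# "The living room area must be >= 10 m2", written as a tuple.
rule = Rule("IfcSpace", "LivingRoom", "area", ">=", 10.0, "m2")

# Made-up sample data standing in for spaces extracted from a BIM model.
spaces = [
    {"ifc_class": "IfcSpace", "type": "LivingRoom", "name": "Living 01", "area": 12.5},
    {"ifc_class": "IfcSpace", "type": "LivingRoom", "name": "Living 02", "area": 8.0},
    {"ifc_class": "IfcSpace", "type": "Bedroom", "name": "Bed 01", "area": 7.0},
]

print(check(rule, spaces))  # ['Living 02']
```

In a setup like this, the LLM's job would be to produce the `Rule` fields from the regulatory sentence, while the comparison itself stays in deterministic code, which keeps units and thresholds easy to audit.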

Required Skills and Qualifications

·       Natural Language Processing (NLP): Strong background in NLP and language models, with experience in transformer-based LLMs (such as GPT, BERT, etc.). Understanding of prompt engineering and fine-tuning techniques for domain-specific language tasks.

·       Regulatory Domain Knowledge: Ability to grasp construction/building regulations or willingness to learn the terminology. Comfortable working with legal/technical text and extracting meaning.

·       Programming & ML Frameworks: Proficiency in Python and ML libraries (TensorFlow/PyTorch, HuggingFace Transformers). Ability to write scripts for data preprocessing (text cleaning, tokenization) and to implement inference pipelines.

·       Data Structures & Querying: Familiarity with JSON, XML, or other structured data formats for outputting rules. Basic understanding of databases or graph structures is a plus, to help formulate how queries to BIM data might be structured.

·       Analytical Skills: Excellent logical reasoning to design rule schemas and verify that the LLM outputs align with real-world logic. Attention to detail in handling units, thresholds, and conditional logic from regulations.

·       Collaboration: Good communication skills to work with engineers from other disciplines (geometry, backend) and ensure the language model’s outputs are usable in their contexts.

Preferred Experience

·       Previous LLM Projects: Experience implementing an LLM in a practical application, such as a question-answer system or text-to-structured-data project. For example, having built a chatbot or an AI assistant that had to interpret user instructions.

·       Domain-Specific NLP: Background in NLP for the architecture, engineering, and construction (AEC) domain or similar regulated domains (finance, healthcare) where compliance language is important.

·       Ontology/Knowledge Graphs: Experience with knowledge representation (ontologies, RDF graphs) especially related to mapping language to data. This could help in aligning terms in regulations with BIM entities (e.g., mapping “room” to IfcSpace).

·       Automated Compliance Checking: Any exposure to automated rule-checking in BIM or related fields, such as familiarity with tools like Solibri, or rule engines (even if not ML-based). Understanding how rules are typically encoded and checked in existing systems can be valuable.

·       Few-Shot Learning and Prompt Tuning: Demonstrated experience in reducing training data needs by using pre-trained models effectively (mdpi.com), for instance by crafting effective prompts for GPT-style models or by using techniques like prompt tuning or fine-tuning with small datasets.


Tags: Architecture BERT Chatbots Engineering Finance GPT GPT-4 HuggingFace JSON LLMs Machine Learning NLP Pipelines Prompt engineering Python PyTorch RDF TensorFlow Testing Transformers XML

Regions: Remote/Anywhere Middle East
Country: Egypt
