LLM Pre-training & Distributed Engineer (AI Infrastructure)
Tasks
- Automate checkpointing
- Implement failure recovery
- Optimize InfiniBand networking and RDMA
- Optimize memory management
- Orchestrate distributed training runs
Perks/Benefits
- N/A
Skills/Tech-stack
3D Parallelism | C++ | CUDA | Data Parallelism | DeepSpeed | GPU Clusters | InfiniBand | Kubernetes | Megatron-LM | Pipeline Parallelism | PyTorch | Python | RDMA | Slurm | Tensor Parallelism
Education
N/A