Sr. Data Engineer

Dallas, TX, United States

Applications have closed

BuzzClan

BuzzClan's AI-driven cloud & data services empower your business. From strategy to implementation, we drive digital transformation through data-driven insights for public & private sectors.

Posted 4 weeks ago

Job Description

Job Title: Python Data Engineer

Location: Dallas, TX

Job Type: Contract

Job Description:

We are seeking a skilled and forward-thinking Python Data Engineer in Dallas, TX. This role will focus on designing and optimizing scalable data infrastructure to support advanced machine learning models, including Generative AI solutions. The ideal candidate will bring strong proficiency in Python 3.11+, experience with modern Azure data services, and a collaborative mindset for working alongside data scientists and ML engineers.

Key Responsibilities:

Design, develop, and optimize robust data pipelines and ETL workflows for processing large-scale structured and unstructured datasets.
Work closely with data scientists and ML engineers to support the development, training, fine-tuning, and inference of Generative AI models (e.g., LLMs).
Build and maintain feature stores, vector databases, and embedding pipelines to support retrieval-augmented generation (RAG) and NLP applications.
Write clean, efficient, and idiomatic Python 3.11+ code for data processing, orchestration, and integration.
Leverage Azure Machine Learning to deploy, monitor, and manage machine learning models in production.
Implement secure and efficient data workflows using Azure Data Factory, and integrate them with other services such as Azure Data Lake, Synapse Analytics, and Azure OpenAI.
Enforce data quality, integrity, security, and governance best practices across systems.
Automate data validation, monitoring, and logging to ensure reliability and scalability of pipelines.

Qualifications

Required Qualifications:

10+ years of overall experience.
Strong proficiency in Python 3.11+, with a deep understanding of idiomatic practices and asynchronous programming.
Proven experience as a Data Engineer working with large-scale data processing systems.
Solid knowledge of data structures, algorithms, and ETL frameworks.
Hands-on experience with Azure Data Factory, Azure Data Lake, Synapse Analytics, and Azure Machine Learning.
Familiarity with Generative AI concepts, such as prompt engineering, vector similarity search, LLM deployment, and tokenization strategies.
Experience supporting data science workflows, including feature engineering, model inference pipelines, and A/B testing.
Understanding of the ML lifecycle from experimentation to production, including version control, experiment tracking, and model monitoring.
Working knowledge of data governance and compliance in regulated environments.

Preferred (Bonus) Skills:

Experience with LLMs (e.g., OpenAI, Hugging Face Transformers) and embedding techniques.
Familiarity with MLOps tools (e.g., MLflow, DVC, Airflow).
Exposure to Azure OpenAI Service or similar foundation model deployment environments.
Proficiency in working with vector databases and similarity-search libraries such as Pinecone, FAISS, or Weaviate.
Knowledge of CI/CD for data pipelines and infrastructure-as-code tools like Terraform.