Data Architect
Canada
Orion Innovation
Orion delivers transformative digital business solutions rooted in digital strategy, experience design, and engineering, enabling our clients to operate with agility at scale.
Orion Innovation is a premier, award-winning, global business and technology services firm. Orion delivers game-changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide range of clients across many industries, including financial services, professional services, telecommunications and media, consumer products, automotive, industrial automation, professional sports and entertainment, life sciences, ecommerce, and education.
Key Responsibilities
- Deliver real-time implementations of the Data Lakehouse solution.
- Provide data modelling and data architecture solutions.
- Design, implement, and maintain Data Lakehouse solutions, integrating structured and unstructured data sources.
- Develop scalable ETL/ELT pipelines using tools like Apache Iceberg, Trino, Apache Spark, Delta Lake, Databricks, or Snowflake.
- Optimize data storage formats and query performance across large datasets.
- Implement security and compliance best practices in data management (role-based access control, data masking, etc.) and ensure compliance with regulations such as CCPA and CASL.
- Build and optimize a distributed search system using Trino to enable fast, SQL-based querying across large-scale, heterogeneous datasets.
- Leverage Apache Iceberg for table format management, ensuring efficient data partitioning, versioning and schema evolution.
- Implement indexing strategies and query optimization techniques to enhance search performance and reliability.
- Architect a unified Data Lakehouse solution that combines the flexibility of a data lake with the structure of a data warehouse.
- Enable real-time and batch processing capabilities for analytics, machine learning, and reporting use cases.
- Ensure data consistency, ACID compliance, and scalability using Iceberg's transactional capabilities.
- Establish data governance frameworks, including metadata management, data lineage, and access control policies.
- Monitor and tune the performance of Trino queries and Iceberg-based storage systems to ensure low latency and high throughput.
- Collaborate with cloud and DevOps teams to support data infrastructure automation and monitoring.
Required Skills & Qualifications
- Real-time implementation experience in creating and deploying a Data Lakehouse.
- Hands-on experience with Apache Iceberg, Trino, Databricks, Delta Lake, or Snowflake.
- Proficiency in Apache Spark, Python/Scala, and SQL.
- Strong working experience in data modelling, data partitioning, and performance tuning.
- Familiarity with data governance, data lineage, and metadata management tools.
- Experience working in Agile/Scrum teams.
- Experience working with structured and semi-structured data stored in object storage systems such as S3 and GCS.
- Experience with Apache Iceberg, SQL, and Python.
- Familiarity with data orchestration tools like Apache Airflow.
- Experience with real-time data processing frameworks (e.g. Kafka, Artemis, Flink).
- Knowledge of machine learning pipelines and MLOps integrations with data lakehouses.
Good to Have:
- Certifications in Cloud platforms (e.g. AWS Certified Data Analytics, or Google Cloud Professional Data Engineer).
Orion is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, citizenship status, disability status, genetic information, protected veteran status, or any other characteristic protected by law.
Candidate Privacy Policy
Orion Systems Integrators, LLC and its subsidiaries and its affiliates (collectively, "Orion," "we" or "us") are committed to protecting your privacy. This Candidate Privacy Policy (orioninc.com) ("Notice") explains:
- What information we collect during our application and recruitment process and why we collect it;
- How we handle that information; and
- How to access and update that information.
Your use of Orion services is governed by any applicable terms in this notice and our general Privacy Policy.