Principal Software Engineer - Data Engineering
USA - San Francisco, CA / USA - Remote
project44
Transform your supply chain with real-time visibility, automation, and AI-powered insights from Movement by project44.
Why project44?
At project44, we revolutionize supply chains with our High-Velocity Supply Chain Platform. As the connective tissue of the supply chain, project44 optimizes global product movement, delivering unparalleled resiliency, sustainability, and value for our customers. We operate the world's most trusted end-to-end visibility platform, tracking over 1 billion shipments annually for 1,300 leading brands across various industries, including manufacturing, automotive, retail, life sciences, food & beverage, and oil, chemical & gas. Our High-Velocity platform eliminates supply chain friction, enabling sophisticated inventory control, exceptional customer experience, and predictive analytics through machine learning and automation.
If you’re eager to be part of a winning team that works together to solve some of the toughest supply chain challenges every day, let’s talk.
About the role:
As a Principal Software Engineer - Data Engineering at project44, you'll work with the latest technologies to streamline Machine Learning & AI operations, build scalable data infrastructure, and democratize data access.
What you'll be doing:
· Work on software architecture and design; leverage and institute best practices from distributed systems, databases, data platforms, infrastructure and platform software, manageability, and observability
· Provide guidance on new technologies and drive continuous improvement in best practices; research, implement, and develop software development tools
· Build systems in a multi-cloud environment - we use AWS & GCP but value experience in other cloud environments such as Azure
· Build complex metrics solutions with data visualization support for actionable business insights.
· Leverage expertise in the latest Gen AI tools and methodologies, such as RAG, vector databases, and embeddings, to architect and build automated data access and interpretation solutions
· Design and develop ETL/ELT pipelines using Python/Java with Snowflake, Postgres, and other data stores; be able to develop and automate a project through its entire lifecycle
· Apply knowledge of data warehouse/data mart design and implementation; be able to carry a project through its entire lifecycle
· Build distributed, reusable, and efficient backend ETLs; implement security and data protection
· Understand repeatable, automated processes for building, testing, documenting, and deploying the application at scale
· Collaborate with insights and data science teams to understand end-user requirements, provide technical solutions, and implement new features and data pipelines
· Establish quality processes to deliver a stable and reliable solution
· Write complex SQL and stored procedures efficiently in Snowflake, Postgres, and BigQuery
· Prepare documentation (data mapping, technical specifications, production support, data dictionaries, test cases, etc.) for all projects
· Coach junior team members and help your team to continuously improve by contributing to tooling, documentation, and development practices
You could be a great fit if you have:
Experience & Education
· 8+ years of experience leading data engineering efforts
· 3+ years of experience with Snowflake and Oracle, plus knowledge of NoSQL databases such as MongoDB
· 3+ years of experience with Python/Java
· 3+ years of experience in an ETL developer role, with deep knowledge of data processing tools such as Airflow and Argo Workflows
· 4+ years of experience with data engineering and operations, including administering production-grade, always-on, high-throughput, complex OLTP RDBMSs
· Experience delivering software solutions in the area of distributed systems
· Experience working with neural networks and Gen AI methodologies
· Strong experience in building data warehouse solutions and Data Modeling
· Strong ETL performance-tuning skills and the ability to analyze and optimize production volumes and batch schedules
· Experience with ETL, GCP, Unix/Linux, Helm Charts as well as Git or other version control systems
· Experience with PII redaction for traditional ETL pipelines, as well as in GenAI solutions.
· Expertise in operational data stores and real-time data integration
· Expert-level skill in modeling, managing, scaling, and performance-tuning high-volume transactional databases
· Bachelor's degree in Computer Science or equivalent experience
Technical Skills
· Strong programming/scripting knowledge in building and maintaining ETL using Java, SQL, Python, Bash, Go
· In-depth, hands-on knowledge of public clouds - GCP (preferred)/AWS, PostgreSQL (version 9.6+), Elasticsearch, MongoDB, MySQL/MariaDB, Snowflake, BigQuery
· Participate in an on-call rotation to mitigate any data pipeline failures
· Strong experience with Kafka or equivalent event/streaming based systems
· Experience with Docker, Kubernetes
· Experience with RAG, vector databases, embeddings, etc.
· Develop and deploy CI/CD pipelines for data engineering
· Experience and knowledge of optimizing database performance and capacity utilization to provide high availability and redundancy
· Proficiency with high volume OLTP Databases and large data warehouse environments
· Ability to work in a fast-paced, rapidly changing environment
· Understanding of Agile and its implementation for Data Warehouse Development
Professional Skills/Competency
· Focus on developing and improving frameworks to support repeatable and scalable solutions
· Demonstrates excellent communication and interpersonal skills; able to communicate clearly and concisely
· Takes initiative to recommend and develop innovative approaches to getting things done
· Is a team player and encourages collaboration
Diversity & Inclusion
At project44, we're designing the future of how the world moves and is connected through trade and global supply chains. As we work to deliver a truly world-class product and experience, we are also intentionally building teams that reflect the unique communities we serve. We’re focused on creating a company where all team members can bring their authentic selves to work every day.
We’re building a company that every one of us at project44 is proud to work for, and our journey of becoming a more diverse, equitable and inclusive organization, where all have a sense of belonging, is shaped through the actions of our leadership, global teams and individual team members. We are resolute in our belief that each team member has an equal responsibility to mold and uphold our culture.
project44 is an equal opportunity employer seeking to enrich our work environment by creating opportunities for individuals of all backgrounds and experiences to thrive. If you share our values and our passion for helping the way the world moves, we’d love to review your application!
For any accommodation needed during the hiring process, please email recruiting@project44.com. Even if you don’t meet 100% of the above job description, you should still seriously consider applying. Studies show that you can still be considered for a role if you meet just 50% of the role’s requirements.
More about project44
Since 2014, project44 has been transforming the way one of the largest, most important global industries does business. As transportation and logistics continues to evolve and customer expectations around delivery become more demanding, industry technology must rise to the occasion. In just a few short years, we’ve created a digital infrastructure that eliminates the inefficiencies caused by dated technology and manual processes. Our Advanced Visibility Platform is used by the world’s leading brands to track shipments, collaborate with supply chain partners, drive operational efficiencies, and create outstanding customer experiences.
#LI-Hybrid