Position Details: Lead Data Engineer
Description:
We are seeking a Lead Data Engineer to own the design, development, and management of data pipelines, orchestration, and cloud operations for all data initiatives. In this role, you will work closely with cross-functional teams to ensure the seamless integration of data across platforms, with a strong focus on both traditional analytics and our AI-driven data initiatives. The ideal candidate will have extensive experience in data engineering, cloud platforms, and machine learning operations, and will bring strong leadership and technical expertise to help scale our data infrastructure.
Key Responsibilities:
- Design and Architecture: Lead the design and architecture of cloud-based data solutions and data pipelines to support a range of analytics and AI applications.
- Cloud Infrastructure: Engineer and manage cloud-based data infrastructure, primarily within Google Cloud Platform (GCP); experience with other cloud platforms (AWS, Azure) is a plus.
- Data Pipelines: Build and manage proprietary data pipelines from first-party and third-party platforms, ensuring data quality, scalability, and efficiency.
- Machine Learning Operations (MLOps): Oversee the ingestion, monitoring, and serving of data required for machine learning operations.
- CI/CD Automation: Implement and manage automated CI/CD pipelines to ensure fast and efficient deployment of data engineering solutions.
- Cross-Functional Collaboration: Work closely with data scientists, software engineers, and product teams to ensure data is easily accessible, accurate, and aligned with business goals.
- Cloud Data Warehousing: Administer and optimize cloud data warehouses, ensuring they are scalable and performant for both real-time and batch processing.
- Vendor Management: Collaborate with third-party software vendors and data providers to manage integrations and data products.
- Leadership & Mentorship: Provide technical leadership and mentorship to junior engineers and foster a collaborative team environment.
- Innovation: Proactively identify new technologies, processes, and opportunities to improve data infrastructure and drive business outcomes.
- Product Lifecycle: Contribute to the full product lifecycle, from ideation to deployment, including taking products from initial concept (zero to one) to scaling and maintaining them (one to ten and beyond).
Requirements:
- 10+ years of experience in Data Engineering and Cloud Engineering, with at least 2 years in a Senior Technical Role.
- Strong experience with cloud architecture and cloud engineering, preferably GCP (Google Cloud Platform). Familiarity with AWS or Azure is a plus.
- Advanced proficiency in SQL and Python for data manipulation, ETL, and automation.
- Extensive experience in Cloud Data Warehousing solutions, such as BigQuery, Redshift, or Snowflake.
- Proven track record in building and managing proprietary data pipelines for both structured and unstructured data.
- Foundational understanding of Machine Learning Operations (MLOps), including managing the ingestion, monitoring, and serving of data for machine learning models.
- Experience in developing and implementing CI/CD pipelines for data products.
- Hands-on experience with the architecture, engineering, and administration of commercial software platforms.
- Ability to manage third-party software vendors, ensuring data product integration aligns with business objectives.
- Strong problem-solving skills with the ability to think creatively and proactively in ambiguous environments.
- Excellent communication and collaboration skills, with a strong ability to work across teams and drive alignment on technical solutions.
- Passion for innovation and willingness to step outside of your comfort zone to solve complex challenges.
- Experience in the ecommerce and digital marketing industry, including the data available from third-party APIs such as clickstream, Shopify, Google Analytics, Meta, TikTok, Google Ads, etc.
Desired Skills & Experience:
- Experience with data orchestration tools (e.g., Apache Airflow).
- Familiarity with containerization (e.g., Docker) and Kubernetes for managing scalable data workloads.
- Background in software engineering practices and agile methodologies.
- Knowledge of data security and compliance best practices in cloud environments.
- Exposure to AI/ML frameworks such as TensorFlow, PyTorch, or scikit-learn.