Principal Senior Data Engineer / Senior AI Data Platform Engineer

Spark / ETL / Cloud / AI Data Platforms

Location: Santo Domingo, Dominican Republic — Hybrid / On-site Flexibility
Employment Type: Full-Time
Industry: AI, Software Engineering, Data Platforms

About the Role

We are looking for a Principal Senior Data Engineer / Senior AI Data Platform Engineer to design, architect, and scale high-performance data platforms that power AI/ML systems, real-time applications, and large-scale data products.

This is not a traditional BI or reporting role. The position focuses on distributed computing, cloud-native data architecture, advanced ETL/ELT pipelines, Apache Spark optimization, and AI-ready data infrastructure.

You will play a key role in building the data foundation for machine learning pipelines, real-time processing, large-volume systems, and scalable AI-driven platforms.

Key Responsibilities

Architect and build scalable data pipelines for AI/ML workloads.
Develop distributed data processing systems using Apache Spark, including batch and streaming.
Optimize large-scale transformations with Python and advanced SQL.
Design and maintain cloud-native data platforms on AWS, Azure, or Google Cloud.
Implement performance tuning, partitioning, parallel processing, and data reliability practices.
Build ingestion pipelines for structured and unstructured data, including logs, events, APIs, and large datasets.
Collaborate with ML Engineers and Software Engineers to support AI models and production data systems.
Ensure scalability, observability, reliability, and performance across the data platform.

Technical Requirements — Must Have

Strong experience with Apache Spark, including core concepts and performance optimization.
Advanced SQL for analytics, optimization, and large-scale data processing.
Strong Python programming skills, focused on data engineering and performance.
Proven experience building ETL / ELT pipelines at scale.
Experience with cloud-native architectures using AWS, Azure, or GCP.
Deep understanding of distributed systems and large-scale data processing.
Experience handling large datasets, ideally from 10M to 1B+ records.

Nice to Have — Highly Valued

Experience supporting ML pipelines, feature engineering, and AI data preparation.
Experience with Spark Streaming, Kafka, or other real-time data pipelines.
Experience with Databricks, Snowflake, BigQuery, or similar modern data platforms.
Knowledge of data lakehouse architectures.
Experience with Docker, Kubernetes, or other containerized data workflows.
Exposure to MLOps workflows and AI production environments.

What We Offer

Competitive USD-based compensation aligned with the AI and engineering market.
Hybrid work flexibility.
Opportunity to build AI-driven systems and scalable data platforms.
High-impact engineering environment, focused on data infrastructure, not BI/reporting.

Application Requirements — Mandatory

To be considered, candidates must submit:

Updated Resume / CV
Updated LinkedIn profile link
Short written response answering: Why should you be considered for this role?

In your written response, please include:

Your experience with Apache Spark and distributed data systems.
The most complex data pipeline or data platform you have built.
Your experience with large-scale data, cloud platforms, or AI-related systems.
Why you are a strong fit for this Principal Senior-level position.

Professional Summary Required

Please include a 3–5 line professional summary describing your experience as a Data Engineer, AI Data Engineer, or Data Platform Engineer, with emphasis on Spark, cloud platforms, large-scale pipelines, and AI/ML data infrastructure.

Technical Keywords: Senior Data Engineer, Principal Data Engineer, AI Data Engineer, Data Platform Engineer, Apache Spark, PySpark, Distributed Systems, Big Data Engineering, Python, Advanced SQL, ETL, ELT, AWS, Azure, GCP, Databricks, Snowflake, BigQuery, Kafka, Spark Streaming, Data Lakehouse, MLOps, Machine Learning Pipelines, Cloud Engineering, Scalable Data Infrastructure, Real-Time Data Processing