As a Senior Data Engineer/Architect, you will play a pivotal role in building and optimizing our data infrastructure. You will architect and develop robust data pipelines and scalable solutions, ensuring our data systems can handle the complexity and scale of our growing business. This is a hands-on role requiring deep technical expertise in ETL processes, query optimization, and data migration, as well as strong analytical and problem-solving abilities. You'll collaborate closely with cross-functional teams to build high-performance data layers, streamline data ingestion and classification, and ensure the integrity and accessibility of our data ecosystem.
At Blackstone, we are looking for a visionary leader who can translate complex data challenges into scalable, high-performing solutions.
Key Responsibilities:
- Data Architecture Design: Architect and design large-scale, end-to-end data solutions that meet the needs of business and technical stakeholders, focusing on scalability, security, and performance
- ETL/ELT Pipeline Development: Lead the design, development, and optimization of efficient ETL/ELT pipelines that extract, transform, and load data from various sources into structured formats, ready for business intelligence and advanced analytics
- Data Source Analysis: Perform detailed analysis of structured and unstructured data sources, providing strategic insights on how best to ingest, process, and classify data to create a high-quality data layer
- Data Layer Development: Design and build a robust, scalable data layer that integrates seamlessly with business analytics platforms and supports both batch and real-time data processing
- Data Ingestion, Cleansing, and Classification: Develop and implement data ingestion strategies for real-time and batch processes, ensuring data is thoroughly cleansed, validated, and classified in alignment with business goals and governance policies
- Query Optimization: Take ownership of optimizing complex SQL queries and enhancing database performance, with a focus on reducing query execution times and improving resource efficiency across large datasets
- Data Migration: Lead and manage large-scale data migration efforts from legacy systems to modern cloud-based platforms, ensuring data accuracy, integrity, and minimal downtime
- Performance Monitoring & Troubleshooting: Proactively monitor the performance of data systems, troubleshoot bottlenecks, and implement solutions that maximize the speed and efficiency of data processing workflows
- Collaboration: Work closely with data analysts, data scientists, and business teams to understand data requirements, build data models, and ensure the data architecture is aligned with both technical and business objectives
- Data Governance & Security: Ensure that all data solutions comply with data governance and security standards, implementing best practices around data classification, data quality, and regulatory compliance (e.g., GDPR, HIPAA)
- Mentorship: Provide technical leadership and mentorship to junior data engineers, fostering a culture of innovation, best practices, and continuous improvement
Requirements:
- Education: Bachelor's or Master's degree in Computer Science, Data Science, Information Technology, or a related field
- 8+ years of experience in data engineering, with a proven track record of architecting and building large-scale ETL/ELT pipelines and data systems
- Extensive experience in analyzing data sources, designing data layers, and managing large-scale data migrations
- Strong hands-on experience in optimizing complex SQL queries and improving the performance of databases handling large datasets
- Deep expertise with cloud platforms (Azure, AWS, GCP), big data technologies (e.g., Hadoop, Spark), and data warehouse solutions (Snowflake, Azure Synapse)
Technical Skills:
- Advanced SQL skills with a focus on query optimization, indexing, partitioning, and performance tuning
- Hands-on experience with ETL/ELT tools (Azure Data Factory, Apache Airflow, SSIS) and designing complex data pipelines
- Proficiency in working with relational databases (SQL Server, PostgreSQL) and NoSQL databases (MongoDB, Azure Cosmos DB)
- Strong knowledge of data governance, data quality management, and regulatory frameworks (GDPR, HIPAA)
- Experience with big data processing frameworks (Databricks, Apache Spark) and integrating them with modern data warehouses
- Expertise in data ingestion from diverse sources (databases, APIs, flat files, streaming data) and transforming raw data into structured formats for analysis
- Strong familiarity with CI/CD pipelines and automation of data workflows using DevOps practices
- Experience with data visualization tools (Power BI, Tableau) and their integration with data platforms
Preferred Skills:
- Cloud certifications in relevant technologies (e.g., Azure Data Engineer, AWS Big Data)
- Hands-on experience with machine learning platforms and frameworks (e.g., Azure Machine Learning, Databricks)
- Familiarity with real-time data processing and streaming technologies (e.g., Azure Stream Analytics, Apache Kafka, Azure Event Hubs)
- Experience with containerization technologies (Docker, Kubernetes) for deploying data workloads
Personal Attributes:
- A data-driven mindset with the ability to translate complex business challenges into scalable technical solutions
- Strong hands-on leadership with a passion for innovation and continuous improvement
- Excellent problem-solving and troubleshooting skills, with a deep understanding of optimizing data performance
- Strong communication skills, with the ability to articulate complex technical solutions to both technical and non-technical stakeholders
- A proactive and self-motivated individual who thrives in a dynamic, fast-paced environment