Job Title: Data Engineer
Job Summary:
We are seeking a skilled Data Engineer to join our team and help design, build, and maintain robust data pipelines and architectures. The ideal candidate has a strong background in data integration, ETL processes, and cloud-based data solutions, enabling our organization to make data-driven decisions.
Responsibilities:
- Data Pipeline Development: Design, implement, and maintain scalable and efficient ETL (Extract, Transform, Load) pipelines to ingest data from various sources into data warehouses or data lakes.
- Data Architecture Design: Develop and optimize data architectures to support data analytics, machine learning, and other data-driven applications.
- Data Quality and Governance: Implement data validation processes, maintain data quality standards, and ensure data governance across all data assets.
- Data Integration: Integrate structured and unstructured data from internal and external sources to create comprehensive data ecosystems.
- Collaboration: Work closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver solutions that meet analytical and business needs.
- Performance Optimization: Optimize database and data processing performance, ensuring data availability and efficient data retrieval for analytics.
- Monitoring and Troubleshooting: Monitor data pipelines, troubleshoot issues, and ensure high availability and resilience in data infrastructure.
- Documentation: Maintain clear and comprehensive documentation for data processes, architectures, and pipelines.
Essential Skills:
- SQL and NoSQL Experience: Proficiency in writing complex SQL queries and experience with NoSQL databases (e.g., MongoDB, Cassandra).
- Programming Skills: Strong programming skills in languages like Python, Java, or Scala for data processing and manipulation.
- ETL Tools: Experience with ETL tools such as Apache NiFi, Talend, Informatica, or cloud-native tools (e.g., AWS Glue, Azure Data Factory).
- Big Data Technologies: Familiarity with big data tools and frameworks like Hadoop, Spark, Kafka, and Flink.
- Cloud Platforms: Experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., Redshift, BigQuery, or Snowflake).
- Data Warehousing: Knowledge of data warehousing concepts and experience with platforms like Amazon Redshift, Snowflake, or Google BigQuery.
- Data Modeling: Understanding of data modeling concepts, including star schema, snowflake schema, and normalization techniques.
- Version Control: Experience with version control systems like Git for collaborative development and deployment.
- Bachelor's Degree in Computer Science, Information Technology, Engineering, or a related field.
Additional Desirable Skills:
- Data Lakes: Understanding of data lake architecture and experience with platforms such as Delta Lake, Amazon S3, or Azure Data Lake.
- Machine Learning Pipelines: Familiarity with setting up data pipelines for machine learning and AI workflows.
- Real-Time Data Processing: Experience with real-time data processing and stream processing tools (e.g., Apache Kafka, Apache Flink).
- Data Visualization: Basic knowledge of data visualization tools like Tableau, Power BI, or Looker to support data exploration needs.
- Analytical Skills: Strong analytical skills to solve complex data challenges and derive insights from large data sets.
- Attention to Detail: High attention to detail to ensure data accuracy and quality.
- Communication Skills: Ability to communicate complex data concepts to both technical and non-technical stakeholders.
- Problem-Solving: Proactive in identifying data issues and implementing effective solutions.
Salary: 18K - 30K, depending on experience
Location: Abu Dhabi
Duration: Full-time, permanent