As a Big Data Engineer, you will have a strong understanding of big data technologies and an exceptional ability to code. You will provide technical leadership, working closely with the wider team to ensure high-quality code is delivered in line with project goals and delivery cycles. You will also work with other teams to deliver rapid prototypes as well as production code, for which you will ensure high accessibility standards are upheld. We expect familiarity with modern frameworks and languages, as well as working practices such as Clean Code, TDD, BDD, continuous integration, continuous delivery, and DevOps.
Key Responsibilities:
Defining and developing services and solutions
- Define, design, and develop services and solutions for large-scale data ingestion, storage, and management, covering sources such as RDBMS, NoSQL databases, log files, and events.
- Define, design, and run robust data pipelines/batch jobs in a production environment.
- Architect highly scalable, highly concurrent, and low-latency systems.
Maintaining, supporting, and enhancing current systems
- Contribute to paying down technical debt and use development approaches that minimize the growth of new technical debt.
- Contribute feedback to improve the quality, readability, and testability of the code base within your team.
- Mentor and train other developers in a non-line management capacity.
- Ensure all software built is robust and scalable.
Collaborating with internal and external stakeholders
- Participate in sprint planning, working with developers and project teams to ensure projects are deployable and monitorable from the outside.
- Work with third-party and other internal providers to support a variety of integrations.
- As part of the team, you may be expected to participate in second-line in-house support and out-of-hours support rotas.
- Proactively advise on best practices.
Processes & Practices
- Agile
- Scrum/Kanban/Lean
- TDD/BDD
- CI/CD
- XP
Essential Skills:
- Follow Clean Code/SOLID principles.
- Adhere to and use TDD/BDD.
- Outstanding ability to develop efficient, readable, highly optimized, maintainable, and clear code.
- Highly proficient in either functional Java or Scala.
- Knowledge of AWS Big Data/Analytics services - S3, EMR, Glue, Redshift, QuickSight, Kinesis.
- Experience of big data environments, including advising the Analytics team on best practices and new technologies.
- Experience of handling large data sets and scaling their handling and storage.
- Experience of storing data in systems such as Hadoop HDFS, S3, Kafka.
- Experience of designing, setting up, and running big data tech stacks such as Hadoop and Spark, and distributed datastores such as Cassandra, DocumentDBs, MongoDB, Kafka.
- In-depth knowledge of the Hadoop technology ecosystem - HDFS, Spark, Impala, HBase, Kafka, Flume, Sqoop, Oozie, Avro, Parquet.
- Experience debugging a complex multi-server service.
- In-depth knowledge and experience of IaaS/PaaS solutions (e.g. AWS infrastructure hosting and managed services).
- Familiarity with network protocols - TCP/IP, HTTP, SSL, etc.
- Knowledge of relational and non-relational database systems
- Understanding of continuous integration and delivery.
- Mocking (any of the following: Mockito, ScalaTest, Spock, Jasmine, Mocha).
- IDE (IntelliJ or Eclipse).
- Build tools (one of SBT, Gradle, Maven).
- An ability to communicate technical concepts to a non-technical audience.
- Working knowledge of Unix-like operating systems such as Linux and/or Mac OS X.
- Knowledge of the Git version control system.
- Ability to quickly research and learn new programming tools and techniques.
Desirable Skills:
- Experience of designing batch processing with Spark and stream processing with either Spark Streaming or Samza.
- Understanding and experience of search data applications/platforms such as Elasticsearch, Splunk, and others.
- Familiarity with microservices architecture.
- Experience mentoring or helping colleagues optimise their code.
- System administration and configuration management skills.
- Experience presenting work to user groups, the business, and peers.
- Experience of designing and maintaining public HTTP APIs.
- Other languages (Python, JavaScript, Clojure, Kotlin, etc.).
- Other NoSQL databases such as Neo4j, Cassandra, Redis, etc.