Overview
The Senior Systems Engineer, Storage Infrastructure, designs, implements, and provides Level 3 expert support for extra-large-scale storage infrastructure, ensuring the highest levels of performance, scalability, and reliability.
Responsibilities
- Co-design and implement petabyte-scale block, object, and file storage solutions as integral components of Cloud and HPC environments, ensuring stability, performance, and compliance with industry standards and best practices.
- Collaborate with architecture and other engineering teams on the evaluation and selection of storage and backup technology components, ensuring solutions follow best practices and are optimized from both functional and non-functional perspectives.
- Perform regular capacity planning exercises to anticipate and accommodate the growing demands on the storage infrastructure, ensuring it meets current and future requirements.
- Co-develop and implement technical changes to enhance the reliability of the storage infrastructure, addressing potential points of failure and ensuring high availability of storage services.
- Explore, analyze, and implement performance optimization strategies for the storage solutions, ensuring optimal resource utilization and performance.
- Participate in the evaluation and integration of advanced storage technologies and methodologies, such as software-defined storage (SDS), to enhance features, performance, and efficiency.
- Design and enhance the observability stack in collaboration with the IaaS operations team, ensuring monitoring coverage and accuracy.
- Provide L3 expert support, including on-call shifts, and act as the final tier of resolution for L2 support teams through problem analysis and communication with vendors' technical support.
- Collaborate with security management teams to ensure that systems are safe and secure against cybersecurity threats.
- Write and maintain relevant documentation ensuring completeness and quality.
- Work closely with process management and operational teams, and contribute to process development by standardizing the collaboration framework and improving collaboration efficiency.
Qualifications
- Bachelor's or master's degree in computer science, engineering, software engineering, or a related technology field.
- 5+ years of experience with deep expertise in designing, implementing, and managing large-scale software-defined storage (SDS) solutions providing block, object or file storage services and backup capabilities.
- Hands-on experience implementing, managing, and optimizing storage systems from leading vendors, including but not limited to HPE, Dell, NetApp, Hitachi, IBM, Pure Storage, and VAST Data.
- Good understanding of the storage protocols that provide block, object, and file storage interfaces, such as iSCSI, S3, NFS, FC/FCoE, and NVMe over TCP.
- Proficient with Linux, the Linux kernel, and the Linux storage stack, and capable of debugging related issues.
- Experience in managing object storage solutions based on SeaweedFS, MinIO, Cloudian HyperStore, Qumulo S3, Scality Ring or Dell ECS is highly desirable.
- Experience with cloud-native backup solutions for OpenStack (e.g., Freezer, Karbor, Trilio, Hystax, Raksha) is a plus.
- Experience in designing and managing clustered/parallel file systems such as Lustre or GPFS is highly desirable.
- Familiarity with containerization technologies (OpenShift, Docker, Kubernetes, etc.) and container storage technologies (Rook, CSI, PVC, etc.).
- Experience with integration of identity management, access management, and authorization solutions (PKI, LDAP, OAuth, OpenID).
- Good knowledge of backup systems, disaster recovery principles, and data protection strategies.
- Hands-on experience with data encryption, security practices, and hardening related to storage and backup systems.
- Solid knowledge of data center network design and related technologies (OSI model, TCP/IP stack, firewalling, routing, VLAN/VXLAN, etc.).
- Hands-on experience with monitoring and observability tools like Zabbix, Nagios, Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana).
- Understanding of CI/CD principles, the Infrastructure as Code (IaC) approach, and software-defined infrastructure solutions.
- Advanced programming and scripting skills in Python and/or Go, as well as Bash.