Title: HPC Systems Engineer
Office Schedule: Fully on-site (Chicago office)
Position: Full-time
Compensation: $250k-450k+
A global high-frequency trading firm is looking to add a High Performance Computing Engineer to its team based in Chicago.
As an HPC Engineer, you'll be part of an elite, lean team responsible for all aspects of the firm's HPC cluster, including monitoring the health and utilization of its large-scale environment. Your work will help drive the firm's research innovation forward and make a huge impact on its day-to-day trading.
What you'll do:
- Work as part of a small team to manage a large compute cluster of Linux servers and related hardware.
- Build and manage high performance storage and network components in our infrastructure.
- Implement and manage monitoring of job performance, system stats, and the general health of our HPC infrastructure.
- Architect upgrades to expand the size, scope, and performance of our cluster and continually integrate the latest technologies.
- Coordinate with other teams for the deployment, operation, and maintenance of our data center footprint.
- Assist in optimizing and tuning grid jobs to efficiently utilize compute resources.
What we're looking for in this role:
- 5+ years of experience in high performance computing, including parallel file systems, batch systems, and interconnects.
- Experience with Linux administration (Ubuntu or Debian preferred).
- Confidence with configuration management tools such as Chef, Puppet, or Ansible.
- Experience building, tuning, and managing large HPC environments with thousands of CPU cores running under Slurm, LSF, or GridEngine.
- Knowledge of and comfort developing in scripting languages like Bash, Python, or Ruby.
- Familiarity with containers and orchestration tools like Docker, Kubernetes, Singularity, etc.
- Competence in networking fundamentals.
- Skill with Linux package management tools like apt/dpkg or yum/rpm.
- At home with compiling and packaging open source software like Redis, Python, or Ruby, including reading and modifying Makefiles.
- Experience with parallel or distributed file systems is a plus.
If the requirements match your past experience, feel free to send me your resume and get in touch at: