Production Engineer / Site Reliability Engineer

6 October 2023

Job Description

Galaxy is a digital asset and blockchain leader helping institutions, startups, and individuals access and navigate the crypto economy.


  • Be on a PagerDuty rotation to respond to availability incidents and provide support for developers and the business.
  • Build, manage, and maintain Galaxy’s cloud infrastructure with Terraform, Kubernetes, Flux/Helm, and other tools.
  • Build and maintain automated configuration management.
  • Help plan the growth trajectory of Galaxy Digital’s infrastructure.
  • Help ensure Galaxy is following industry best practices.
  • Actively participate in incident response in the wake of production issues.
  • Build and assist with CI/CD deployments and application observability.


  • BS degree in CS, Software Engineering, or a related field, or equivalent experience.
  • Implementing “Infrastructure as Code” using Terraform and CI/CD.
  • Load balancing applications, including the use of proxies and CDNs.
  • Monitoring and metrics with Prometheus, Grafana, and OpenSearch, including integrations with Slack/PagerDuty.
  • Disaster Recovery and High Availability strategy.
  • Managing Kubernetes clusters and using Helm CI/CD for deployment.
  • Cloud architecture and design.
  • Coding in Python, Ruby, Go, or other high-level languages.
  • Ansible, Puppet, Chef, or other configuration management tooling.