WebstaurantStore is looking for Site Reliability Engineers. We are the internet’s largest restaurant supplier, and we are growing.

Who we are looking for

We are looking for driven and motivated candidates with a variety of skills and experience. We require that SRE candidates possess an aptitude for solving technical problems, a willingness to learn, a desire to grow, and a desire to work with a team. This position requires prior experience managing on-premises Kubernetes.

Our work is broad enough that you will never master every tool. What we hope you will master instead is the debugging skill needed to support tools in which you are not an expert. If you are familiar with some of our tools and want to learn the rest, please apply!

Our SREs typically start their careers as developers or as systems engineers. Developers who want a wider variety of work and enjoy working with infrastructure make a good fit. Likewise, systems engineers who want to improve infrastructure and reduce repetitive tasks make a good fit.

- Experience deploying and managing on-premises Kubernetes clusters.
- Experience deploying Kubernetes resources with a CI/CD platform such as Argo CD, GitLab CI/CD, or Flux.
- Experience using Helm and Kustomize to manage and template Kubernetes deployments.
- Experience using Kubernetes secrets management platforms such as Sealed Secrets and HashiCorp Vault.
- Experience troubleshooting Kubernetes resources such as pods, nodes, and deployments.
- Experience managing Kubernetes persistent storage using Rook/Ceph, iSCSI, NFS, etc.
- Experience configuring Kubernetes ingress controllers such as HAProxy, NGINX, or Traefik.
- Experience with a service mesh such as Istio or Consul is a plus, but not required.
- Comfortable working with a team to accomplish technical feats.
- Demonstrated ability to learn new technologies.
- Attention to detail.
- Experience with one or more programming/scripting languages. We primarily use Python and Go, but knowledge of these languages is not required.
- Experience with Linux in a work environment.
- Experience troubleshooting across the entire stack: network, server, operating system, and application.
- Some understanding of configuration management. We use Ansible and Terraform. Experience with any configuration management tool is a plus but not required.
- Development and/or IT operations skills and experience relevant to transitioning to, or continuing in, an SRE role.
- Experience with version control. We use Git; if you have never used it, we can train you.

What we do

SREs work to implement, support, and improve the systems that WebstaurantStore relies on to serve our customers and grow our company. We use automation and observability to ensure service uptime, performance, and growth. SREs build out new infrastructure and capabilities, maintain existing infrastructure, and help departments leverage the shared services we build and maintain.

We value experimenting with novel approaches and new technologies, as we are always looking to improve our capabilities. We value sound design principles and encourage review and discussion among the team to ensure that problems and projects are examined from all angles.

Reliable systems are key to keeping our customers satisfied. Reliable systems enable our fellow employees to do their work. Reliable systems allow our SREs to enjoy their nights and weekends. We put a lot of effort into keeping our systems reliable. SREs participate in an on-call rotation, and the effort we put into reliability keeps the on-call volume low.

What we offer

Entrepreneurial spirit is the driving force behind the WebstaurantStore work environment. Making things better for our customers is our goal, every day. Achieving this goal requires taking risks, accepting failure, and learning from that failure.
We offer competitive compensation and a comprehensive benefits package including paid time off, medical/dental insurance, wellness programs, gym membership reimbursement, and a 401k with company match.

This is a remote, work-from-home position. It is a mid-level to senior-level position, depending on skills and experience.

If you are ready for a challenge and have the ambition to succeed in a fast-paced, growing industry, we would love to discuss the SRE position with you! Submit your resume and apply online today.

Remote work qualifications

- Access to a reliable and secure high-speed internet connection. Cable or fiber internet connections (at least 75 Mbps download/10 Mbps upload) are preferred, as satellite connections often cannot support the technologies used to perform day-to-day tasks.
- Access to a home router and modem.
- A dedicated home office space that is free of noise and distractions. The space should have a strong wireless connection or a wired Ethernet connection (a wired connection is preferred, if possible).
- A valid physical address (apartment, suite, etc.). PO boxes are not supported, as a physical address is required for you to receive your computer equipment.
- The desire and ability to work and communicate with other team members via chat, webcam, etc.
- Legal residence in one of the following states: AK, AL, AR, AZ, CT, DE, FL, GA, IA, ID, IN, KS, KY, LA, MD, ME, MI, MN, MO, MS, NC, ND, NH, NM, NV, OH, OK, PA, SC, SD, TN, TX, UT, VA, VT, WI, WV, and WY.

H-1B visa sponsorship is not available. W2 only.
Git kustomize Go kubernetes-helm CI/CD traefik Python Terraform Ceph nfs Nginx Linux fluxcd iscsi consul Kubernetes istio hashicorp-vault haproxy Site Reliability Engineering (SRE) Ansible