Durham, NC or Remote
Adwerx is on the lookout for a Site Reliability Engineer to join our small and talented infrastructure team and help us design, build, and automate the performant, resilient, and highly available systems that our teams and customers rely on. In this role, you’ll help us run a handful of mature (and in some cases brand-new) services in the cloud and apply your skills to keep them resilient, performant, and highly available during the rapid adoption of our products. The infrastructure you’ll build has a large impact on an organization that is focused on software development best practices and standards.
The starting title for this experienced role will be based on tenure/experience/work history.
Our culture
Adwerx is a place where you can thrive in our highly collaborative teams and where everyone is encouraged to contribute ideas across all levels of the organization.
Our engineering charter is centered around humility, respect, and trust. We abide by the mantra “if it’s not in version control, it doesn’t exist”, strive to write documentation our peers will love, and always try to leave things better than we found them. We employ testing and continuous delivery for all our services and empower our developers to iterate and deploy as often as they need.
Infrastructure engineers share an on-call schedule, but our systems are stable and fire drills are rare. We host lunch-and-learns, conduct blameless post-mortems, and regularly recognize our peers with shout-outs and a fun badge program that highlights leaders in specific technical disciplines.
How we work
We apply the Agile/Scrum methodology to run day-to-day projects at Adwerx, and our product development process is heavily inspired by “Shape Up”. In addition, we:
- Utilize a mature CI/CD process and deploy to production many times a day.
- Have production-like QA environments with a culture of writing automated tests.
- Define department SLOs and Engineering KPIs to better understand how we work.
- Relentlessly strive for excellence with not only the products we build but also the health of our codebase and our developer ecosystem.
Technologies we work with
- Our primary application is built with Ruby on Rails. You’ll also encounter or work with Node.js, Go, and Python.
- Our production systems run primarily in Google Cloud Platform, though we also have a small footprint in Amazon Web Services.
- Besides our primary application, some services you will support include our VPN/Tailscale, CI/CD pipelines, Google Kubernetes Engine clusters, MySQL databases, Airflow, RabbitMQ, and Redshift.
- Some tools we use include Terraform, Kubernetes, Datadog, Helm, Nginx, Docker, New Relic, and CircleCI.
In this mission-critical role, you will:
- Design, build, and maintain the core infrastructure for Adwerx
- Create, maintain, and/or iterate on various workloads in Google Kubernetes Engine
- Contribute to the Ruby on Rails monolith to upgrade dependencies, integrate with infrastructure features, or optimize performance
- Maintain reliable network paths and connections between all external and internal services (DNS, VPN, VPC peering)
- Participate in, and run point on, handling production incidents
- Participate in solution design for new features, products, systems, and tooling
- Find new ways to use existing systems to improve scalability and performance for our platform
- Interact with the larger organization to ensure the uptime and reliability of our infrastructure
- Iterate on security standards and review code for secure coding practices
- Partner with engineering teams closely to educate and consult
- Continually monitor application/system performance and costs (SLOs), generate actionable insights, and either implement or advocate for them
- Participate in on-call rotations, along with every member of the engineering team
- Work closely with engineering teams to conduct root cause analyses for production incidents and make plans to remediate or prevent recurrences
- Collaboratively plot the course for, and document, Adwerx infrastructure
- Build a great customer experience for people using your infrastructure
What You’ll Get:
- Competitive salary and potential for equity
- Comprehensive medical, dental, and vision plan options (100% of basic plan premiums paid by company)
- 401(k) plan with a company match of up to 4%
- A collaborative work environment where you’ll learn about and influence every aspect of the business
- The opportunity to work with and learn from talented leaders, developers, marketers, and designers, with room for advancement
- The ability to help define the foundational technology that will power the growth of our business
- Flexible work scheduling