Site Reliability Engineer (K8s, NGINX, RabbitMQ) at Social Discovery Group #vacancy #remote

Social Discovery Group (SDG) is a global technology company that builds apps at the intersection of dating, social, and entertainment. The company’s portfolio includes 70 social discovery platforms with a focus on AI, game mechanics, and video streaming. We actively support and invest in social discovery startups worldwide through our CVC fund.

More than 500 million people in 150 countries enjoy our products, and we strive to have ten times as many.

SDG invests in social discovery technology startups around the world. Our investments include OpenAI, Patreon, Flo, RAW, EVA AI, Clubhouse, Magnet, Woebot, Flure, Astry, Coursera, Academia, Harbour.Space, Auto1, DocSend, AppAnnie, Rapyd, Boom Supersonic, TradingView, K-Health and many others.

We solve the problem of loneliness, isolation, and disconnection with the help of digital reality.

Our digital nomad team of more than 1200 professionals works all over the world. Together, we are solving the prevalent problem of loneliness and shaping Social Life 3.0, a new digital reality where people will be able to fulfil their needs for communication and attention from other people and artificial life forms.

Our teams of digital nomads live and work remotely from Cyprus, Malta, the USA, Thailand, Indonesia, Hong Kong, Japan, Australia, Poland, Israel, Turkey, Latvia and many other countries.

We are currently seeking a skilled Site Reliability Engineer with expertise in Kubernetes, RabbitMQ, and NGINX. As a key member of our support team, you will be responsible for maintaining our key infrastructure services, assisting development teams with technical issues, providing solutions, and ensuring the seamless operation of our production and test environments.

Key responsibilities:

  • Provide support for Linux-based systems, including server installation, configuration, and maintenance
  • Diagnose and resolve Linux-related issues, ensuring system stability
  • Design and implement high-availability configurations for Kubernetes control planes, RabbitMQ clusters, and NGINX load balancers
  • Integrate and manage service mesh technologies, such as Istio or Linkerd, within Kubernetes environments
  • Prepare detailed root cause analysis reports for significant incidents, outlining the steps taken to identify and resolve issues
  • Collaborate with DevOps teams to integrate Kubernetes, RabbitMQ, and NGINX into CI/CD pipelines
  • Assist customers in deploying and configuring service mesh solutions across multi-cloud environments

Qualifications:

  • Strong practical knowledge of Linux systems (Red Hat-like OS), Linux filesystems, and Linux networking
  • Experience in maintaining several Kubernetes clusters in geo-distributed data centers with service mesh implementation
  • Proficiency in managing high-load RabbitMQ message broker clusters, including monitoring and troubleshooting best practices
  • Work experience with NGINX servers, load balancing configurations, and web application security
  • Experience in setting up monitoring systems such as Prometheus, Grafana, and Zabbix

Nice to have:

  • Certification in Linux and Kubernetes
  • Experience in infrastructure automation with Terraform
  • Knowledge of and experience with Cloudflare and Akamai as CDN and WAF
  • Work experience with public cloud providers such as GCP, AWS, and Azure
  • Knowledge of VMware vSphere

What we offer:

  • REMOTE OPPORTUNITY to work full time;
  • 7 wellness days per year (time off) that can be used to deal with household matters or to rest and recover without taking sick leave;
  • Bonuses up to $5,000 for recommending successful applicants for positions in the company;
  • Full payment for professional training, international conferences and meetings;
  • Corporate discount for English lessons;
  • Health benefits. If you are not eligible for corporate medical insurance, the company will compensate you with up to $1,000 gross per year per employee according to the paychecks. This can be spent on purchasing health insurance yourself or on doctor’s fees for yourself and close relatives (spouse, children);
  • Workplace organization. The company provides all employees with an equipped workplace and all the necessary equipment (desk, chair, Wi-Fi, etc.) in our offices or co-working locations. In other locations, the company reimburses workplace costs up to $1,000 gross once every 3 years, according to the paychecks. This money can be spent on renting a co-working space or on equipping a home workplace (desk, chair, Internet, etc.) during those 3 years;
  • Internal gamified gratitude system: receive bonuses from colleagues and exchange them for time off, merch, team building activities, massage certificates, etc.

Sounds good? Join us now!  

