Site Reliability Engineer (K8s, NGINX, RabbitMQ) at Social Discovery Group #vacancy #remote

Social Discovery Group (SDG) is a global technology company that builds apps at the intersection of dating, social, and entertainment. The company’s portfolio includes 70 social discovery platforms with a focus on AI, game mechanics, and video streaming. We actively support and invest in social discovery startups worldwide through our CVC fund.

More than 500 million people in 150 countries enjoy our products, and we strive to have ten times as many.

SDG invests in social discovery technology startups around the world. Our investments include OpenAI, Patreon, Flo, RAW, EVA AI, Clubhouse, Magnet, Woebot, Flure, Astry, Coursera, Academia, Harbour.Space, Auto1, DocSend, AppAnnie, Rapyd, Boom Supersonic, TradingView, K-Health and many others.

We solve the problem of loneliness, isolation, and disconnection with the help of digital reality.

Our digital nomad team of more than 1200 professionals works all over the world. Together, we are solving the prevalent problem of loneliness and shaping Social Life 3.0 — a new digital reality where people will be able to fulfil their needs for communication and attention from other people and artificial life forms.

Our teams of digital nomads live and work remotely from Cyprus, Malta, the USA, Thailand, Indonesia, Hong Kong, Japan, Australia, Poland, Israel, Turkey, Latvia and many other countries.

We are currently seeking a skilled Site Reliability Engineer with expertise in Kubernetes, RabbitMQ, and NGINX. As a key member of our support team, you will be responsible for maintaining our key infrastructure services, assisting development teams with technical issues, providing solutions, and ensuring the seamless operation of our production and test environments.

Key responsibilities:

Provide support for Linux-based systems, including server installation, configuration, and maintenance
Diagnose and resolve Linux-related issues, ensuring system stability
Design and implement high-availability configurations for Kubernetes control planes, RabbitMQ clusters, and NGINX load balancers
Integrate and manage service mesh technologies, such as Istio or Linkerd, within Kubernetes environments
Prepare detailed root cause analysis reports for significant incidents, outlining the steps taken to identify and resolve issues
Collaborate with DevOps teams to integrate Kubernetes, RabbitMQ, and NGINX into CI/CD pipelines
Assist customers in deploying and configuring service mesh solutions across multi-cloud environments

Qualifications:

Strong practical knowledge of Linux systems (Red Hat-like OS), Linux filesystems, and Linux networking
Experience in maintaining several Kubernetes clusters in geo-distributed data centers with service mesh implementation
Proficiency in managing high-load RabbitMQ message broker clusters, including monitoring and troubleshooting best practices
Work experience with NGINX servers, load balancing configurations, and web application security
Experience in setting up monitoring systems such as Prometheus, Grafana, and Zabbix

Nice to have:

Certification in Linux and Kubernetes
Experience in infrastructure automation with Terraform
Knowledge of and experience with Cloudflare and Akamai as CDN and WAF
Work experience with public cloud providers such as GCP, AWS, and Azure
Knowledge of VMware vSphere

What we offer:

REMOTE OPPORTUNITY to work full time;
7 wellness days per year (time off) that can be used to deal with household issues, to lie down and recover without taking sick leave;
Bonuses up to $5,000 for recommending successful applicants for positions in the company;
Full payment for professional training, international conferences and meetings;
Corporate discount for English lessons;
Health benefits. If you are not eligible for corporate medical insurance, the company will compensate you with up to $1,000 gross per year per employee according to the paychecks. This can be spent on purchasing health insurance yourself or on doctor’s fees for yourself and close relatives (spouse, children);
Workplace organization. The company provides all employees with an equipped workplace and all the necessary equipment (table, armchair, wifi, etc.) in our offices or co-working locations. In other locations, the company reimburses workplace costs up to $1,000 gross once every 3 years, according to the paychecks. This money can be spent on renting a co-working space or on equipping your workplace at home (desk, chair, Internet, etc.) during those 3 years;
Internal gamified gratitude system: receive bonuses from colleagues and exchange them for time off, merch, team building activities, massage certificates, etc.

Sounds good? Join us now!

Cloudflare akamai Terraform Amazon Web Services (AWS) RabbitMQ Nginx Azure Red Hat Linux zabbix Prometheus Google Cloud Platform (GCP) istio Kubernetes linkerd vSphere Grafana Site Reliability Engineering (SRE)