ML Engineer (LLMs) @ Brainly Sp. z o.o. #vacancy #remote

WHAT IS REQUIRED

  • 3+ years experience with Deep Learning models in production or a comparable industry career with machine learning, data mining, or statistical modeling.
  • 1+ years experience with Deep Learning models for NLP, language models, or text analytics in production.
  • Practical experience with a modern cloud platform (AWS preferred, or Azure or GCP), including services for storage, data processing, serverless computing, and R&D and ML R&D environments.
  • Experience with the productization of ML pipelines for feature engineering, training, evaluation, and batch inference purposes.
  • Experience deploying ML models to production, monitoring them live, and managing the model lifecycle (e.g., labeling, retraining).
  • Strong Python coding skills, in particular for training and deploying models, and familiarity with related libraries (e.g., NumPy, Boto3, FastAPI, PyTorch, pandas, Poetry, or similar).
  • Machine learning frameworks such as TensorFlow or PyTorch, AWS SageMaker, scikit-learn, and Transformers (Hugging Face).
  • Deep knowledge and understanding of the theoretical foundations of modern machine learning, specifically deep neural networks, in either NLP/LLMs (preferably) or computer vision.
  • Bash and Linux/Unix (e.g., AWS CLI, Docker, scripting, or similar).
  • Cloud services (e.g., IAM, EC2, S3, RDS, Redshift, SageMaker, Athena, Lambda, or GCP and Azure alternatives).
  • Parallel computing (multi-processing, async, GPUs, model sharding).
  • Data analysis and visualization tools such as pandas, plotly, matplotlib, seaborn, or streamlit.
  • Team player attitude and clear communication skills.
  • High level of self-organization.
  • Fluent in English, both written and verbal.

WHAT IS PREFERRED

  • A Bachelor’s degree or above in STEM (science, technology, engineering, mathematics) or a similar field.
  • Hands-on experience with data storage and processing technologies (e.g., relational/non-relational databases, warehouses, cloud storage solutions, and different processing engines).
  • Hands-on experience with large-scale serving of ML models (millions of requests/day).
  • Hands-on experience with computer vision models and algorithms.
  • Hands-on experience with Kubernetes and microservices.
  • Hands-on experience with Infrastructure as Code tools.
  • Familiarity with the basics of data engineering (e.g., SQL and NoSQL, data streaming, Apache Spark, Snowflake).
  • CI/CD (e.g., GitHub Actions, AWS CodePipeline, or similar).
  • Kubernetes (e.g., Deployment, StatefulSet, Ingress, Helm or similar, REST APIs).
  • MLOps stack (e.g., Neptune.ai, SageMaker, or similar tools such as MLflow, Kubeflow, Flyte).
  • IaC frameworks (Terraform, CloudFormation, Pulumi).
  • Modern model serving frameworks (TorchServe, NVIDIA Triton, or Seldon).
  • Familiarity with agile development and lean principles.

The ML Engineer will have an opportunity to turn machine learning artifacts into production systems, participate in implementing state-of-the-art MLOps practices, and improve skills in NLP, Computer Vision, Generative AI, large-scale data processing, and information retrieval.

The ideal candidate is an enthusiast of educational technologies with a background in software development and a skill set that blends cloud infrastructure, machine learning, and Python coding.

As part of the Machine Learning Infra team, the ML Engineer will work closely with other AI roles within the AI Services teams (MLOps engineers, Data Scientists, AI Analysts, AI Operations Specialists) on internal projects, develop modularized MLOps solutions on top of what the Data Engineering and Automation teams provide, and collaborate with other ML teams outside the department to support technology adoption.

In addition, the ML Infra team has a dual role: it owns the MLOps platform used by all ML practitioners at Brainly and acts as the engineering backbone for the AI Services projects.

Are you motivated to learn fast and grow in the areas required to succeed in the job? Are you passionate about automating workflows? Do you follow the culture of DevOps and high-quality software standards? Do you take ownership of problems and challenges from beginning to end? Do you have a positive attitude and a willingness to address challenges and complex problems? If you answered yes to these questions, you might just be the perfect candidate for this role!

  • Turn machine learning artifacts into production systems and services.
  • Implement tools and frameworks that help Data Scientists (or other stakeholders) work more efficiently, simplifying areas such as model training and evaluation, data annotation, and processing.
  • Process large data sets, both within prepared and well-organized data pipelines and in quick and dirty mode, for the sake of quick experimentation.
  • Integrate ML solutions within larger systems (other product features or business processes).
  • Lead innovation and validate company-wide AI opportunities based on state-of-the-art Computer Vision, NLP, and modern LLM services and models.
  • Research and stay current with the latest advancements in AI technology (both models/algorithms and tools/libraries/SaaS/APIs).
  • Build, deploy, automate, maintain, and manage the entire model lifecycle of the data science solutions developed within the AI Services department.
  • Provide engineering capabilities to our internal research projects.
  • Act as a consultant and own the implementation and maintenance of ML-based solutions in production areas that do not have a dedicated AI team assigned (e.g., Trust & Safety, content moderation, or experimental product features).
  • Work closely with production teams to integrate and facilitate the adoption of the tools and standardized solutions developed by the ML infrastructure team.

Requirements: Python, NLP, Deep learning, Transformers, AWS, TensorFlow, PyTorch, AWS SageMaker, Docker, Redshift, EC2, Computer vision, Kubernetes, CI/CD

Additionally: Sport subscription, Training budget, Private healthcare, Dental Care Package, Stock options, AskHenry, Mental Health Helpline.
