At Zyte (formerly Scrapinghub), we eat data for breakfast, and you can eat your breakfast anywhere while you work for Zyte. Founded in 2010, we are a globally distributed team of over 190 Zytans working from over 28 countries. We are on a mission to enable our customers to extract the data they need to continue to innovate and grow their businesses, because we believe that all businesses deserve a smooth pathway to data. For more than a decade, Zyte has led the way in building powerful, easy-to-use tools to collect, format, and deliver web data quickly, dependably, and at scale. The data we extract helps thousands of organizations make smarter business decisions, secure competitive advantage, and drive sustainable growth. Today, over 2,000 companies and 1 million developers rely on our tools and services to get the data they need from the web.

By joining the Zyte team, you will:

  • Become part of a self-motivated, progressive, multicultural, and curious team that excels every day. When you need help, there is always someone who has your back. We are committed to delighting our customers.
  • Have the freedom and flexibility to work remotely.
  • Get the chance to work with cutting-edge technologies and tools. We love to innovate and create new ways of doing things, always striving to do better and be better.

Zyte (formerly Scrapinghub)

Senior Backend Engineer (AI / Big Data)

About the Job

The new SaaS will include our recently released Automatic Extraction, which provides an API for automated e-commerce and article extraction from web pages using Machine Learning. Automatic Extraction is a distributed application written in Java, Scala, and Python; its components communicate via Apache Kafka and HTTP and are orchestrated with Kubernetes.
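To give a flavor of the component work this involves, here is a minimal sketch, in Java, of a Kafka consumer that could sit at one edge of such a pipeline, reading extraction events from a topic. The broker address, group id, topic name, and payload format are illustrative assumptions, not a description of our actual system:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ExtractionResultConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Broker, group id, and topic below are illustrative placeholders.
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "extraction-workers");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("enable.auto.commit", "false"); // commit offsets only after processing succeeds

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("extraction-results"));
                while (true) {
                    // Poll for new events; each record value is assumed to be a JSON payload.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("url=%s payload=%s%n", record.key(), record.value());
                    }
                    consumer.commitSync(); // at-least-once: commit after the batch is handled
                }
            }
        }
    }

Disabling auto-commit and committing offsets only after a batch has been handled is one common way to get at-least-once processing, so that events are not silently dropped if a worker fails mid-batch.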

As a Senior Backend Engineer, you will design and implement distributed systems: a large-scale web crawling platform, integration of Deep Learning-based web data extraction components, queue algorithms, processing of large datasets, a development platform for other company departments, and more. This is going to be a challenging journey for any backend engineer!

Roles & Responsibilities:

  • Work on the core platform: develop and troubleshoot a Kafka-based distributed application, and write and modify components implemented in Java, Scala, and Python.
  • Work on new features, from design through implementation. You will own the complete lifecycle of your features and code.
  • Solve distributed systems problems such as scalability, transparency, failure handling, security, and multi-tenancy.

Requirements:

  • 3+ years of experience building large-scale data processing systems or high-load services.
  • Strong background in algorithms and data structures.
  • Strong track record in at least two of these technologies: Java, Scala, Python, C++. 3+ years of experience with at least one of them.
  • Experience working with Linux and Docker.
  • Good communication skills in English.
  • Computer Science or other engineering degree.

Bonus points for:

  • Kubernetes experience.
  • Apache Kafka experience.
  • Experience building event-driven architectures.
  • Understanding of web browser internals.
  • Good knowledge of at least one RDBMS.
  • Knowledge of today's cloud provider offerings: GCP, Amazon AWS, etc.
  • Web data extraction experience: web crawling, web scraping.
  • Experience with web data processing tasks: finding similar items, mining data streams, link analysis, etc.
  • History of open source contributions.