Data Engineer
Vilnius, Lithuania

Vinted is the world’s biggest second-hand style marketplace. Our aim is to make second-hand the first choice worldwide 🌎. Currently ~20 million people use our platform, and Vinted’s Data Warehouse (DWH) team makes sure our analysts, management and product teams have up-to-date data to make good decisions.

This September, Vinted reached a major milestone: EBITDAM == 0 (EBITDA before Marketing, as we consider marketing an investment), meaning that if we stopped our marketing efforts, we could run the company without external investment. The company is growing fast, and it’s getting difficult for Vinted DWH to keep up with the ever-growing workload.

At the moment, Vinted DWH is a one-man team, and we have decided to double it. You would be working alongside Lech to:

  • Ensure that the company has new metrics ready for analysis every day (nightly, hourly and real-time / streaming job pipelines);
  • Enable analysts to analyse ever-growing amounts of data (currently we have 300+ TB of data in our main cluster);
  • Enable product teams to use the Data Warehouse's heavy lifting in our core product for: statistics for Vinted members about their item visibility; Machine Learning model training and preparation for detecting scammers, spammers and other bad actors; the recommendations engine.

Here's what is currently on the roadmap:

  • Adapt the system to GDPR requirements;
  • Find and set up solutions to make data processing easier for Data Warehouse users. Analysts and backend engineers constantly update and add new jobs to our infrastructure, pushing it to its capacity limits. We ensure that they have the tools and knowledge to fully own the development, maintenance and optimization of their jobs;
  • Simplify our event ingestion pipeline, which processes ~10 billion events every week.

We expect the platform to grow 2-4 times next year. There are huge challenges ahead. If this sounds interesting, you may well be just who we need. We’re looking for an engineer to join Vinted DWH in their work.

We are looking for someone who likes to solve problems related to data. As there are many unsolved problems in the domain, we are always on the lookout for new techniques and technologies; we experiment a lot and use unconventional ways to solve problems. Knowledge of how database systems work is a big plus. Experience with Apache Spark is not mandatory, though very useful. We value pragmatism, big-picture thinking, curiosity and problem-solving skills. We expect you to be familiar with most, and have deep knowledge of at least a few, of the following disciplines: Database Systems, Algorithms, Software Engineering, Systems Architecture, Big Data, Systems Scaling, Systems Performance Tuning, Computer Science.

This is a mid / senior level position.

Perks:

  • Learning 📚 🎫 ✈️ budget (10% of gross salary)
  • 25 working days of holidays
  • We buy all the tech 💻 🎧 ⌨ you need to do your job
  • Daily breakfast and lunch 🍜 🌯 🥕 🍔 🍳 at Vinted Noise restaurant
  • Budgets for ♕ 🍻 🎾 🍩 ⛸ 🎳 🎸 🎲 🎡 teambuildings
  • In-house 🏋 gym
  • Shop 👠 👕 ⌚️ @ Vinted budget

If you’re interested, contact Titas (titas@vinted.com) and he will be in touch with you.

Here are some of the technologies we use: Hadoop HDFS, Apache Spark (ETL, Streaming, ML, ad-hoc analysis), Apache Kafka (message bus for tracking events integration with our core product), Cloudera Impala (for run-time metric aggregation, quick ad-hoc querying), Oozie (for job scheduling). We use Ruby and Scala for everyday work.
