Cracking the Data Engineering Interview

Daily Data Engineering Problem #2

Mike Petridisz
Jan 18, 2024

Good afternoon! Here's your data engineering interview problem for today. This question was asked by Spotify.

Your task is to design a data pipeline that analyzes user listening habits and generates personalized playlist recommendations.

Data Sources:

  • user_activity: Contains user listening data.

  • song_metadata: Contains details about songs.

  • user_preferences: Contains each user's preferred genre.

Task:

  1. Write SQL queries to:

    • Calculate total listening time per user.

    • Identify the most-listened song for each user.

    • Recommend three songs from the user's preferred genre that they haven't listened to, ordered by release year (newest first).

  2. Use the provided sample data to test your queries.
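As a rough sketch of how the three queries might look, the snippet below runs them against an in-memory SQLite database from Python. The post's actual table script is not reproduced here, so every column name (`listen_seconds`, `preferred_genre`, `release_year`, and so on) and every row of data is a hypothetical stand-in, not the provided sample data. It also assumes "most-listened" means highest total listening time rather than play count, and it relies on window functions, which SQLite supports from version 3.25.

```python
import sqlite3

# Hypothetical schema: column names are assumptions made for illustration,
# not the ones from the post's table-creation script.
SCHEMA = """
CREATE TABLE user_activity (user_id INTEGER, song_id INTEGER, listen_seconds INTEGER);
CREATE TABLE song_metadata (song_id INTEGER PRIMARY KEY, title TEXT,
                            genre TEXT, release_year INTEGER);
CREATE TABLE user_preferences (user_id INTEGER PRIMARY KEY, preferred_genre TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up rows, just enough to exercise each query.
conn.executemany("INSERT INTO song_metadata VALUES (?, ?, ?, ?)", [
    (1, "A", "rock", 2019), (2, "B", "rock", 2021), (3, "C", "rock", 2023),
    (4, "D", "rock", 2015), (5, "E", "rock", 2010),
    (6, "F", "jazz", 2020), (7, "G", "jazz", 2022),
])
conn.executemany("INSERT INTO user_preferences VALUES (?, ?)",
                 [(1, "rock"), (2, "jazz")])
conn.executemany("INSERT INTO user_activity VALUES (?, ?, ?)", [
    (1, 1, 200), (1, 1, 180), (1, 6, 300),
    (2, 6, 120), (2, 2, 90), (2, 6, 60),
])

# 1. Total listening time per user.
TOTAL_TIME = """
SELECT user_id, SUM(listen_seconds) AS total_seconds
FROM user_activity
GROUP BY user_id
ORDER BY user_id;
"""

# 2. Most-listened song per user, ranked by total listening time.
#    Ties are broken by the lower song_id so the result is deterministic.
TOP_SONG = """
WITH per_song AS (
    SELECT user_id, song_id, SUM(listen_seconds) AS total_seconds
    FROM user_activity
    GROUP BY user_id, song_id
),
ranked AS (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY user_id ORDER BY total_seconds DESC, song_id
    ) AS rn
    FROM per_song
)
SELECT user_id, song_id, total_seconds
FROM ranked
WHERE rn = 1
ORDER BY user_id;
"""

# 3. Up to three unheard songs from the user's preferred genre, newest first.
RECOMMEND = """
WITH candidates AS (
    SELECT p.user_id, s.song_id, s.release_year,
           ROW_NUMBER() OVER (
               PARTITION BY p.user_id ORDER BY s.release_year DESC, s.song_id
           ) AS rn
    FROM user_preferences AS p
    JOIN song_metadata AS s ON s.genre = p.preferred_genre
    WHERE NOT EXISTS (
        SELECT 1 FROM user_activity AS a
        WHERE a.user_id = p.user_id AND a.song_id = s.song_id
    )
)
SELECT user_id, song_id, release_year
FROM candidates
WHERE rn <= 3
ORDER BY user_id, release_year DESC, song_id;
"""

totals = conn.execute(TOTAL_TIME).fetchall()
top_songs = conn.execute(TOP_SONG).fetchall()
recommendations = conn.execute(RECOMMEND).fetchall()

print(totals)
print(top_songs)
print(recommendations)
```

The `NOT EXISTS` filter is one way to exclude songs a user has already played; a `LEFT JOIN ... WHERE a.song_id IS NULL` anti-join would do the same job. `ROW_NUMBER()` is used for both the top song and the top-three recommendations because it returns a fixed number of rows per user even when values tie; if ties should all be kept, `RANK()` may be the better fit.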

Sample Data

Before running your queries, make sure your SQL environment contains the tables and data exactly as defined in the script below.

SQL Script for Table Creation and Data Insertion:

Access the SQL script here.


Solution & Explanation

© 2025 Miklos Petridisz