Create a privacy preserving version of Shazam using Concrete and/or Concrete ML #79

zaccherinij · 2023-09-28T09:36:12Z

Winners

🥇 1st place: A submission by Iamayushanand
🥈 2nd place: A submission by GoktuEk

Overview

Shazam is a popular application that instantly identifies the music being played from a short recording made with a mobile phone. Although it is more than 20 years old, it is still one of the most used apps for music recognition.

Description

Shazam’s algorithm is well known, as the company published a paper explaining its inner workings in 2003: An Industrial-Strength Audio Search Algorithm, by Avery Li-Chun Wang. In particular, it requires the user to send queries to Shazam’s servers in order to search for matches in its central database. That means that Shazam is able to gather some of its users’ private personal preferences. We believe this could be avoided thanks to fully homomorphic encryption (FHE)! We challenge you to create a music recognition algorithm like Shazam’s with FHE!
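As a rough illustration of the kind of matching described in the 2003 paper, the sketch below pairs spectrogram peaks into hashes and votes on a consistent time offset. It runs entirely in the clear and is not a reference implementation: peak extraction is assumed to have happened already, and the names `fingerprint`, `build_index`, `identify`, `fan_out` and `max_dt` are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def fingerprint(peaks, fan_out=3, max_dt=50):
    """Pair each anchor peak with up to `fan_out` later peaks.

    peaks: list of (time_frame, freq_bin) tuples (assumed already extracted).
    Returns a list of ((f1, f2, dt), anchor_time) entries.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are sorted by time, so later ones are farther away
            if dt == 0:
                continue  # skip peaks in the same time frame
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def build_index(songs):
    """Map every hash to the (song_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for song_id, peaks in songs.items():
        for h, t in fingerprint(peaks):
            index[h].append((song_id, t))
    return index


def identify(index, query_peaks):
    """Vote on (song_id, time offset); a true match piles votes on one offset."""
    votes = Counter()
    for h, t_query in fingerprint(query_peaks):
        for song_id, t_song in index.get(h, ()):
            votes[(song_id, t_song - t_query)] += 1
    if not votes:
        return None, 0  # no hash matched: report the song as unknown
    (song_id, _offset), score = votes.most_common(1)[0]
    return song_id, score


# Toy data: two "songs" and a query cut from song "a", starting 5 frames in.
songs = {
    "a": [(0, 10), (5, 20), (9, 15), (14, 30), (20, 12)],
    "b": [(0, 11), (4, 25), (8, 40), (13, 18), (21, 33)],
}
index = build_index(songs)
query = [(0, 20), (4, 15), (9, 30), (15, 12)]
match = identify(index, query)
```

In a privacy-preserving version, the query-side data would be encrypted and the lookup and voting steps would need FHE-friendly replacements; this sketch only shows the clear-text logic being replaced.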

How to participate?

1️⃣ Register here.
2️⃣ When ready, submit your code here.
🗓️ Submission deadline: December 17, 2023.

What we expect

We expect you to provide:

- A model that can identify a song’s artist and title from a sample of raw audio, at most 20 seconds long, as input.
  - To qualify for the maximum prize, the FHE application should work on raw encrypted audio and give a song identification number as output. Furthermore, the model should report when a song is not known.
  - Partial prizes will be awarded if only some parts of the pipeline are in FHE and others are done in the clear (parts can be: feature extraction, matching algorithm, unknown song rejection, song index identification). The same applies if the application does not report that a song is not known.
- An evaluation of the FHE model’s performance using FHE (only partial prizes are awarded if the algorithm can only partially be run with FHE but works with FHE simulation)
- An evaluation of the floating-point equivalent (non-FHE) model’s performance for comparison
- A tutorial explaining how you built the project
- Clean, documented code, as well as a straightforward README.md file showing how to install the project and how to run and evaluate the models

The evaluation of both models’ performance should use the top-1 and top-3 accuracy metrics for known songs: a query song is considered successfully retrieved if it is the first song returned in the result list (top-1) or if it is among the first 3 songs returned (top-3). These metrics should be measured with 10-fold cross-validation over the chosen dataset. The test set should be made of the training set’s audio files. A separate list of songs that are not known to the application should be kept, and a report on the accuracy with which these songs are reported as unknown should be given.
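The metrics above can be computed with a few lines of plain Python. This is a minimal sketch, assuming each query yields a ranked list of candidate song IDs and that an unknown song is reported as `None`; the function names and the `None` convention are assumptions made for this example, not part of the bounty specification.

```python
def top_k_accuracy(ranked_results, true_ids, k):
    """Fraction of queries whose true song ID is among the first k results."""
    if not true_ids:
        raise ValueError("no queries to evaluate")
    hits = sum(
        true_id in ranked[:k]
        for ranked, true_id in zip(ranked_results, true_ids)
    )
    return hits / len(true_ids)


def unknown_rejection_accuracy(predictions, unknown_label=None):
    """Fraction of unknown-song queries that were reported as unknown."""
    if not predictions:
        raise ValueError("no unknown-song queries to evaluate")
    rejected = sum(p == unknown_label for p in predictions)
    return rejected / len(predictions)


# Toy results for four known-song queries (ranked candidate IDs per query).
ranked = [[3, 1, 2], [5, 4, 0], [7, 8, 9], [2, 6, 1]]
truth = [3, 4, 0, 1]
top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)

# Toy predictions for four queries taken from the separate unknown-song list.
unknown_preds = [None, 12, None, None]
rejection = unknown_rejection_accuracy(unknown_preds)
```

Running the same functions on the outputs of the FHE model and of its floating-point equivalent gives directly comparable numbers.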

In order to obtain the full reward prize, we expect both models' accuracy to be high, even though the FHE model’s score might be a bit lower than its floating-point equivalent due to quantization. Additionally, the FHE execution time for a single sample should be as realistic as possible.
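To see why quantization can cost some accuracy, the sketch below applies generic uniform quantization to a few floating-point values and measures the round-trip error at two bit widths. It is an illustration only and is not meant to reproduce the exact scheme used by Concrete ML; the helper names `quantize`, `dequantize` and `max_error` are hypothetical.

```python
def quantize(values, n_bits):
    """Map floats to integers in [0, 2**n_bits - 1] with a uniform step."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, lo):
    """Map quantized integers back to approximate float values."""
    return [lo + scale * x for x in q]


def max_error(values, n_bits):
    """Largest absolute round-trip error for the given bit width."""
    q, scale, lo = quantize(values, n_bits)
    restored = dequantize(q, scale, lo)
    return max(abs(v - r) for v, r in zip(values, restored))


values = [0.0, 0.1, 0.45, 1.0]
q3, scale3, lo3 = quantize(values, n_bits=3)
err3 = max_error(values, n_bits=3)  # coarse grid: larger error
err8 = max_error(values, n_bits=8)  # finer grid: smaller error
```

Fewer bits generally make encrypted computation cheaper but increase this rounding error, which is the trade-off behind the expected accuracy gap.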

Implementation guide

While it is not mandatory to use one of them, the FMA: A Dataset For Music Analysis repository provides several datasets of MP3-encoded 30-second audio tracks, along with additional files containing their metadata. The smallest one has 8,000 tracks, but a subset of it might be easier to start with. Alternatively, you could propose a similar dataset. The awarded prize will depend on the dataset’s size.

Besides, while Concrete ML should provide the necessary models for achieving this bounty, some modifications in the source code might be required. Additionally, some parts might be done using Concrete Python directly.

Reward

🥇Best submission: up to €10,000.

To be considered the best submission, a solution must be efficient, effective and demonstrate a deep understanding of the core problem. Alongside technical correctness, it should come with clean code, clear explanations and complete documentation.

🥈Second-best submission: up to €3,500.

For a solution to be considered the second-best submission, it should be both efficient and effective. The code should be neat and readable. Its documentation might not be as exhaustive as the best submission's, but it should cover the key aspects of the solution.

🥉Third-best submission: up to €1,500.

The third-best submission is one that effectively tackles the challenge at hand, even if it has room for improvement in terms of efficiency or depth of understanding. Documentation should be present, covering the essential components of the solution.

Reward amounts are decided based on code quality, model accuracy scores and speed performance on an m6i.metal AWS server. When multiple solutions of comparable scope are submitted, they are compared based on the accuracy metrics and computation times.

Submission

Apply directly to this bounty by opening an application here.

Questions?

Do you have a specific question about this bounty? Join the live conversation on the FHE.org discord server here. You can also send us an email at: [email protected]


iamayushanand · 2023-12-14T12:58:31Z

> These metrics should be measured with 10-fold cross validation over the chosen dataset

If I have 10 songs and I train my model on 9 of them, I would likely get "unrecognised" on the 10th song during cross validation because it isn't in the training set. Can you clarify what is meant by the 10 fold cv?

aquint-zama · 2024-02-13T17:25:44Z

Winners

🥇 1st place: A submission by Iamayushanand
🥈 2nd place: A submission by GoktuEk

zaccherinij added the 🎯 Bounty (This bounty is currently open) and 📁 TFHE-rs (library targeted: TFHE-rs) labels Sep 28, 2023

zaccherinij assigned zaccherinij and aquint-zama Sep 28, 2023

zaccherinij changed the title ~~Create a privacy preserving Shazam using FHE and Concrete ML~~ Create a Privacy Preserving Version of Shazam Using FHE and Concrete ML Sep 28, 2023

aquint-zama added this to the Season 4 milestone Sep 28, 2023

zaccherinij added the 📁 Concrete ML (library targeted: Concrete ML) label and removed the 📁 TFHE-rs (library targeted: TFHE-rs) label Oct 2, 2023

zaccherinij changed the title ~~Create a Privacy Preserving Version of Shazam Using FHE and Concrete ML~~ Create a privacy preserving version of Shazam using FHE and Concrete ML Oct 2, 2023

zaccherinij changed the title ~~Create a privacy preserving version of Shazam using FHE and Concrete ML~~ Create a privacy preserving version of Shazam using Concrete ML Oct 3, 2023

aquint-zama pinned this issue Oct 9, 2023

aquint-zama changed the title ~~Create a privacy preserving version of Shazam using Concrete ML~~ Create a privacy preserving version of Shazam using Concrete and/or Concrete ML Nov 2, 2023

zaccherinij closed this as completed Jan 22, 2024

zaccherinij added the 💰 Awarded (This project is now completed and awarded) label and removed the 🎯 Bounty (This bounty is currently open) label Feb 9, 2024

zaccherinij unpinned this issue Feb 9, 2024

zaccherinij added the 🎯 Bounty (This bounty is currently open) label Feb 12, 2024
