Sebastian Michel & Aleksandar Stupar
PICASSO – Soundtrack Recommendation for Images
Most of us have encountered the following situation: we have taken multiple images during a vacation trip and, when back at home, want to present these images to our friends. What is missing is a background music accompaniment that underpins the emotions felt during the vacation. Our approach, named PICASSO, recommends the most appropriate soundtrack, fully automatically, from a user-provided music collection.
There are two worlds to be dealt with here: the world of music and the world of images. Considered separately, both of these worlds are relatively well understood. For each of them, there exist methods that determine which objects are most similar to a given query object. For example, a user can find the 10 images most similar to a given vacation image. Features describing colors, brightness, and edges are used to calculate this similarity. The world of music, however, relies on different features, for example timbre and rhythm, which are crucial for precise search results.
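The similarity search described above can be sketched as a nearest-neighbor lookup over feature vectors. This is only an illustration under stated assumptions: the function name `top_k_similar`, the three-dimensional toy vectors, and the use of Euclidean distance are hypothetical choices, not details of the PICASSO feature extraction.

```python
import math

def top_k_similar(query, collection, k=10):
    """Return the ids of the k items whose feature vectors are closest to `query`.

    `collection` maps an item id to its feature vector (a list of floats).
    Euclidean distance is an assumption made for this sketch.
    """
    ranked = sorted(collection.items(), key=lambda item: math.dist(query, item[1]))
    return [item_id for item_id, _ in ranked[:k]]

# Toy feature vectors (hypothetical values standing in for color, brightness, edges).
images = {
    "beach.jpg": [0.9, 0.8, 0.1],
    "forest.jpg": [0.2, 0.4, 0.7],
    "sunset.jpg": [0.8, 0.6, 0.2],
}

nearest = top_k_similar([0.85, 0.75, 0.1], images, k=2)
print(nearest)
```

The same lookup would apply to music if the vectors described timbre and rhythm instead of visual features.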
Up to now, we have not been able to make a connection between the world of music and the world of images. With PICASSO, we have developed an approach that is able to learn and apply this connection.
The key principle is to look at a very large number of examples of semantically related images and music pieces, aiming to cover a wide range of both media. The idea behind PICASSO is to analyze movies, which offer exactly the data needed for our task: (moving) images with matching soundtrack music, manually assigned by experienced movie directors. We had PICASSO analyze more than 50 movies and thereby obtained a large collection of pairs of the form (image, movie soundtrack).
Figure 1: Process of Generating the PICASSO Database
In the recommendation process, PICASSO retrieves the most similar movie scenes for a given query image. In doing this, we move from the world of pictures to the world of music, using the already collected image-music pairs. Then, we select from a user-provided collection the songs most similar to the soundtracks of the movie scenes we found. For efficiency reasons, we have precomputed the last step of this process, i.e., we can jump directly from a movie scene to the user-provided songs. Figure 1 illustrates this process.
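The two-step lookup described above could look roughly like the following sketch. All names (`recommend_for_image`, `scene_features`, `scene_to_songs`) and the toy data are hypothetical, and summing the precomputed similarities over the retrieved scenes is an assumed scoring rule, since the text does not specify how scores are merged.

```python
import math
from collections import defaultdict

def recommend_for_image(query, scene_features, scene_to_songs, k_scenes=2, n_songs=3):
    """Rank user songs for one query image.

    Step 1: find the k_scenes movie scenes whose image features are closest
            to the query image (Euclidean distance, an assumption).
    Step 2: follow the precomputed scene -> [(song, similarity)] table and
            sum the similarities per song (also an assumption).
    """
    nearest = sorted(scene_features,
                     key=lambda s: math.dist(query, scene_features[s]))[:k_scenes]
    scores = defaultdict(float)
    for scene in nearest:
        for song, similarity in scene_to_songs[scene]:
            scores[song] += similarity
    # Highest score first; ties broken by song name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n_songs]

# Toy data: image features per movie scene, and the precomputed jump from
# each scene to the user's songs.
scene_features = {"scene_a": [0.9, 0.1], "scene_b": [0.8, 0.2], "scene_c": [0.1, 0.9]}
scene_to_songs = {
    "scene_a": [("song_1", 0.9), ("song_2", 0.4)],
    "scene_b": [("song_1", 0.5), ("song_3", 0.7)],
    "scene_c": [("song_4", 0.95)],
}

ranking = recommend_for_image([0.85, 0.15], scene_features, scene_to_songs)
print([song for song, _ in ranking])
```

Because the scene-to-song table is built ahead of time, only the image-side nearest-neighbor search has to run at query time.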
This recommendation procedure can also be applied to a given collection of images (i.e., a slide show). First, multiple soundtracks are recommended for each individual image. Subsequently, these results are combined to recommend a soundtrack that matches all images. When doing so, we need to avoid choosing a soundtrack that is totally unfitting for some images, even if it might be a great fit for others. PICASSO is also applicable to videos, in which case the video is considered as a series of images, for which we recommend a soundtrack.
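One way to combine per-image results so that no image is badly served is to rank songs by their worst score across all images. The sketch below uses this max-min rule purely as an assumption; the text does not state which combination rule PICASSO applies, and the name `recommend_for_slideshow` and the scores are hypothetical.

```python
def recommend_for_slideshow(per_image_scores, n_songs=1):
    """Pick songs whose worst per-image score is as high as possible.

    `per_image_scores` is a list with one dict per image, mapping a song to
    its score for that image. A song missing for an image counts as 0.0.
    """
    songs = set().union(*per_image_scores)
    worst_case = {
        song: min(scores.get(song, 0.0) for scores in per_image_scores)
        for song in songs
    }
    # Highest worst-case score first; ties broken by song name.
    return sorted(worst_case, key=lambda s: (-worst_case[s], s))[:n_songs]

# song_1 fits two images very well but one image poorly;
# song_2 is a moderate fit for every image.
per_image = [
    {"song_1": 0.9, "song_2": 0.6},
    {"song_1": 0.1, "song_2": 0.5},
    {"song_1": 0.95, "song_2": 0.55},
]

best = recommend_for_slideshow(per_image)
print(best)
```

Averaging the scores instead would favor song_1 here, which is exactly the case the paragraph above warns against: a great fit for some images and a poor one for another.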
Figure 2: Screenshot of the iPhone App PicasSound
We have released an application, named PicasSound, which queries our PICASSO database from an iPhone or an Android-based smartphone. PicasSound is available free of charge. It enables users to capture an image or to use an existing one (Figure 2), for which PICASSO recommends a list of songs from those available on the smartphone.