Embedding models and music

A post on the Google Research blog today announced "the open-sourcing of Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data."

Over the summer, I presented work (co-authored with Jaan Altosaar) on the application of word embedding models to a large, admittedly noisy, corpus of classical music. It appears that word embedding models capture something about sequences of chords. Our work, and related recent work with a different dataset, shows that major triads are somewhat evenly distributed through the embedding space, in their circle-of-fifths order.

There’s no obvious way, yet, to evaluate these models of chords. The Embedding Projector provides a convenient way to inspect and explore the structure of the learned embedding space.

You can find a link to an Embedding Projector visualization of the results of an embedding model trained on a small dataset of classical music here.

Switch the labels to “root_common” for more sensible chord labels. Try searching for (major|minor) triad with regexps enabled in the sidebar.
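If you want to load your own embeddings into the standalone Embedding Projector, it accepts tab-separated files: one file with a row of vector components per point, and an optional metadata file with one label per row. A minimal sketch (the chord labels and vector values here are made-up placeholders, standing in for a trained model's vocabulary):

```python
import csv

# Hypothetical embeddings: {chord_label: vector}. In practice these would
# come from a trained model's vocabulary, e.g. gensim's model.wv.
embeddings = {
    "C major triad": [0.1] * 100,
    "G major triad": [0.2] * 100,
    "A minor triad": [0.3] * 100,
}

# vectors.tsv: one embedding per line, components separated by tabs.
with open("vectors.tsv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    for vec in embeddings.values():
        writer.writerow(vec)

# metadata.tsv: one label per line, in the same order as the vectors.
# (A single-column metadata file takes no header row.)
with open("metadata.tsv", "w") as f:
    for label in embeddings:
        f.write(label + "\n")
```

Both files can then be uploaded through the Projector's "Load data" panel.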

Data

  • Chord slices from the Yale-Classical Archives Corpus
  • from works by Bach, Haydn, Mozart, and Handel…
  • represented as binary chroma vectors (indicating which pitch classes are sounding)…
  • ingested by gensim as a collection of tokens…
  • to train a word2vec model (embedding dimensionality = 100; context window = 5; skip-gram model; negative sampling)