Doing things with Doodles

This post is a work in progress and is subject to change.

This March, to celebrate the birthday of the eighteenth-century composer Johann Sebastian Bach, Google added an interactive Doodle to their search engine homepage.

Clicking on a cute depiction of the venerable German cantor revealed a simple music-notation editor that allowed visitors to input a short two-measure tune in staff notation. After a brief wait, three more harmony parts were automatically added to complete a choral setting of the tune for a vocal quartet of soprano, alto, tenor, and bass voices.

Unlike many of the Doodles of the last ten years, which reward little more than idle curiosity or passive engagement, this Doodle dared to be highly specific, and it challenged—maybe even empowered—millions of users to be musical at a scale that the current cohort of trained (and in-training) music education professionals couldn’t dream of.

And it sounded OK. The results aped the style of composition represented by the approximately 300 chorales composed by Bach that were used to “teach” the artificial intelligence (AI) algorithm behind the mini game.

That is, until they didn’t. Music educators and musicians were quick to point out that Google’s algorithm seemed to be making rookie mistakes, errors that all but the rawest beginning student of harmony and counterpoint would blush to commit to paper. Some music theory folk even took to grading the performance of this artificial student. To award it a B seemed generous.

Well-meaning experts chimed in with suggestions for improving the algorithm: give it more music to learn from, allow it to learn for longer, or simply educate it with the “missing” rules.

More lateral thinkers wondered whether the AI would do just as well if it were tasked with reproducing music written by other composers. Are we really ready for an interactive Doodle celebrating John Cage? Others spent hours assuming an adversarial posture, trying to trick or confuse the AI with so-called edge cases: tough melodies and themes from musical idioms foreign to the training data set.

These are familiar strategies for critiquing new algorithms. All of them are worth practicing. But after the fun and games subside, professional music educators can step away from the Doodle and confidently assert the continuing need for their existence. These machines can’t even get the basics right. Our jobs in the music classroom are as secure as they ever were. (Which is to say, not very.)

Responses of this kind, while important, risk overlooking a more interesting story, one that challenges us to make sense of the forbiddingly dense network of assumptions and systems upon which AI, including but not limited to the Bach Doodle, depends.

Those who caviled at the Bach Doodle’s loose understanding of the rules of musical part-writing put up a robust defense of human knowledge of the craft.

But if there is any real threat posed by AI to those of us who think for a living, I doubt it can successfully be countered by appealing to the inherent dignity of human reasoning.

This is because many contemporary artificial intelligence implementations cannot justifiably be said to reason in any common sense of the word. The contest between AI and human is not one between different styles, mechanisms, or timescales of reasoning, that is, a difference of degree; it is a difference of kind.


The silicon revolution has populated our homes, workplaces, and public spaces with legions of computational devices. These devices are often programmed to function as statistical machines, set to make inferences on a second-by-second basis about the probable behaviors of their owners: at what time you are most likely to leave your desk for the subway; who you are most likely to call at 10 p.m. on Saturday nights; at what confidence level you should be forbidden from sitting behind the wheel of your car; and so on.

It’s not that the explanations for these decisions are being withheld; it’s that no reason-based explanation (an explanation that a human would recognize as such) exists for them.

Consider the decisions made by Google’s Bach Doodle. Why, as one might expect a student to explain, did the algorithm choose to harmonize this note with that chord? Why did it match the members of this chord to these voices in that way? The mathematical design of the AI in question makes these questions close to impossible to answer.

Although we might speculatively infer what musical rules it has learned by looking at the examples the AI will tirelessly generate, inside the computer that runs the algorithm you won’t find any explicit, human-interpretable reasoning behind any of the decisions it makes.
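To see what you would find instead, consider a toy stand-in: a tiny network that picks a chord for a melody fragment, all of whose “knowledge” lives in a couple of weight matrices, which is to say, in arrays of floating-point numbers. The dimensions, names, and architecture below are invented for illustration; they have nothing to do with Google’s actual model.

```python
import numpy as np

rng = np.random.default_rng(42)

# All the "knowledge" of this toy harmonizer lives in two weight matrices:
# arrays of floating-point numbers. Dimensions and names are invented.
MELODY_DIM, HIDDEN_DIM, CHORD_DIM = 16, 32, 24
W1 = rng.normal(size=(MELODY_DIM, HIDDEN_DIM))
W2 = rng.normal(size=(HIDDEN_DIM, CHORD_DIM))

def choose_chord(melody_fragment):
    """Pick a chord index for a melody fragment via two matrix multiplies."""
    hidden = np.maximum(melody_fragment @ W1, 0.0)  # ReLU nonlinearity
    return int(np.argmax(hidden @ W2))

# Inspecting the parameters turns up numbers, not reasons: nowhere is there
# a statement like "avoid parallel fifths" that anyone could point to.
print(W1[:2, :4])
print(choose_chord(rng.normal(size=MELODY_DIM)))
```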

Doesn’t sound so intelligent now, does it? In fact, this is precisely what makes the family of algorithms behind the Bach Doodle so smart: they appear to simulate skilled behaviors despite the fact that they do not store any interpretable representation of the knowledge traditionally thought to be required in order to do so.

This is why any defense of the human element in a world of statistical inference ought to be founded on something other than our capacity to reason instrumentally: AI is doing better and better in a variety of domains, all without accounting for its decisions in a reasonable way. While this may be tolerable in seemingly low-stakes applications, as in a musical diversion like the Bach Doodle, will we tolerate it as a mode of decision-making when it comes to finance? Bail hearings? State-run projects under the rubric of collective wellbeing, or national security?

So how much do we know about this AI? In a creative paper written by members of the Doodle team (Anna Huang, Tim Cooijmans, et al., 2017), they describe the fundamental research that powers their AI composer.

Like much research into the problem of automatic composition, theirs frames the task of musical composition as a problem of learning probability distributions over multiple variables. Their contribution adapts a learning technique (called NADE) from the booming field of deep neural networks, as well as analogies from computer vision, to improve the musical plausibility of their results.
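To make that framing concrete, here is a minimal sketch (emphatically not the Doodle’s actual code) of the general “fill in the blanks” recipe: treat a chorale as a grid of (voice, time) cells, hide a random subset, have a model emit a probability distribution over pitches for every hidden cell, and then generate a harmonization by repeatedly re-masking and re-sampling the three lower voices around a fixed melody. The dimensions and the random_mask, toy_model, and harmonize functions are illustrative assumptions only.

```python
import numpy as np

# Illustrative dimensions: four voices (S, A, T, B), 32 time steps,
# 46 candidate pitches per cell. None of these match Google's model.
VOICES, STEPS, PITCHES = 4, 32, 46
rng = np.random.default_rng(0)

def random_mask(keep_prob=0.5):
    """Hide a random subset of (voice, time) cells; True means visible."""
    return rng.random((VOICES, STEPS)) < keep_prob

def toy_model(inputs, visible):
    """Stand-in for the trained network: a distribution over pitches for
    every cell. A real model would condition on `inputs` and `visible`;
    this one returns random distributions just to show the shapes."""
    logits = rng.normal(size=(VOICES, STEPS, PITCHES))
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

def harmonize(melody_roll, sweeps=20):
    """Gibbs-like generation: keep the user's tune fixed in voice 0 and
    repeatedly re-mask and re-sample the other three voices."""
    roll = np.zeros((VOICES, STEPS, PITCHES))
    roll[0] = melody_roll
    for _ in range(sweeps):
        visible = random_mask()
        visible[0] = True                        # never hide the melody
        inputs = np.where(visible[..., None], roll, 0.0)
        probs = toy_model(inputs, visible)
        for v in range(1, VOICES):
            for t in range(STEPS):
                if not visible[v, t]:
                    pitch = rng.choice(PITCHES, p=probs[v, t])
                    roll[v, t] = 0.0
                    roll[v, t, pitch] = 1.0
    return roll

# A random one-hot melody stands in for the user's two-measure tune.
melody = np.eye(PITCHES)[rng.integers(0, PITCHES, size=STEPS)]
print(harmonize(melody).shape)  # (4, 32, 46): a four-voice piano roll
```

Training, on this scheme, amounts to asking the model to get those per-cell distributions right across many randomly masked copies of the real chorales; because any subset of cells can be hidden, the same model can harmonize a melody, fill in a missing inner voice, or complete a fragment.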

Phew.

All this makes it even more impressive that the team behind the Doodle was able to package up a trained algorithm and serve it to hundreds of millions of users across the world, and to do so on Google’s flagship property: the object of scores of optimization experiments every single day, subject to extremely stringent change-management procedures that ensure that the addition of new code does not degrade critical performance metrics.

It is so impressive, in fact, that it is easy to forget that, in practice, Google was able to pull this off because only Google could. Only Google could inject J.S. Bach into the front page of the internet, because only Google controls the front page of the internet.

No single corporation mediates global access to online resources like Google does. Any faintly tech-savvy reader who has taken the time to sherpa a digital novice (whether young or old) through cyberspace knows that Google, or its ubiquitous browser software, Google Chrome, is often taken as a synecdoche for The Internet.

This is why reactions online to the Bach Doodle cannot be fully explained by playful delight, academic territorialism, or the overblown fear of being made redundant by scholarly automation. There is also a collective anxiety at play: a natural response to events that call attention to the diminishing control that we exert over our situation.

This algorithm, seeming to stake a claim to the hard-earned fruits of their musical training, was foisted upon the professional music educators of the Web. An AI whose behavior they couldn’t quite understand, which used statistical techniques many had never heard of, had been neatly packaged and delivered to their screens through a distribution channel that none of them controlled. These feelings have animated recent repeated calls for the breakup of the very largest tech companies, calls articulated by “fringe” communities and interest groups long before they were recuperated as EU anti-trust enforcement priorities and campaign planks of major-party American presidential hopefuls.

This would be a foretaste of what the future brings, were it not already here.