Exploring time-coded comments on YouTube music videos: The past, present, and future of an emerging source for digital musicology

Details

Presented as Eamonn Bell, “Exploring time-coded comments on YouTube music videos: The past, present, and future of an emerging source for digital musicology” at Like, share and subscribe: Youtube, music and cyberculture before and after the new decade. Research Cluster in Music and Cyberculture (CysMus). Faculdade de Ciências Sociais e Humanas - NOVA FCSH, Lisbon, Portugal. (October 1–3 2020)

Abstract

(As published in conference programme)

The potential for the systematic analysis of YouTube comments has been recognised by many researchers in fields including music information retrieval (MIR), sociology, and musical ethnography (Yadati et al. 2014; Thelwall 2018; Born and Haworth 2017). Notably, since 2008 YouTube has automatically detected timecodes in user-generated comments, converting them to “deep” links that skip playback directly to the moment in the video cited (Vliegendhart et al. 2015). Presenting the history, use, and future prospects of these time-coded comments (TCCs) on YouTube, I assess their value as a novel primary source for digital musicologists.

First, I place digitally time-coded commentaries on musical recordings in their historical context. A media archaeology (Parikka 2012) of the TCC shows the practice affiliates to twentieth-century listening guides, experiments with interactive multimedia on LaserDisc during the late 1970s, and mixed-mode CD-ROM content in the 1990s. Second, drawing on a selective dataset of over 25,000 [October 2020: over 180,000] TCCs on over 300 [October 2020: 314] YouTube videos, I sample and compare commenting practices by listeners of Western art (“classical”) music and contemporary popular music. The data show that TCCs can contain ad hoc setlists and track listings for uploads of live recordings and transfers from analog media, respectively. They are used to surface moments marked for the attention of the individual listener; TCCs help users to reason about their musical experiences with direct reference to the sounds they report hearing.

Despite their characteristic brevity, TCCs also afford subtle narrative analyses of the musical content, contextually relevant links to other recordings and media, and internet humor. Finally, I assess the opportunities and challenges of musicological research with TCCs, including the difficulties of working with data collected from privately-held web properties, the risk of amplifying the biases of a self-selecting cohort of online commenters (Schultes, Dorner, and Lehner 2013), and questions of linguistic diversity within such datasets.

Video recording

Please find a recording of the pre-recorded conference paper (20 minutes) here.

Slides

Please find the PDF slides for this presentation here.

Funding acknowledgment

This research was supported by a Government of Ireland Postdoctoral Fellowship, awarded by the Irish Research Council in 2019 to Eamonn Bell for the project “Opening the ‘Red Book’: The digital Audio CD format from the viewpoint between musicology and media studies.” Project no. GOIPD/2019/239.

Presentation outline


Introduction

The promise of YouTube comments is a snapshot of how time-based media co-operate with text to stimulate, elicit, engender, construct, or otherwise afford the experiences that their occasionally anarchic content records. In this talk, I’ll speak about how a large dataset of time-coded comments on YouTube music videos might be used to put a finer point on this relationship.

Previous work

Within music studies, an early example of sustained and careful use of YouTube comments can be found in Áine Mangaoang’s work on the CPDRC prison dancers, which dates back to her 2014 thesis.1 In 2016, Amanda Nell Edgar undertook a discourse analysis of over 5,000 comments and replies underneath recordings of N.W.A.’s “Fuck tha Police” and pointed out that music and entertainment videos open up the possibility for counterhegemonic discourse within the largely corporate frame of the YouTube comment section.2 More recently, Ed Spencer has provided a compelling case for the usefulness of qualitatively coded YouTube comments to get a purchase on the complex relationship between spectral features, somatic response, emotion, and the conspicuous consumption of music represented in comments on recordings of EDM music.3

The scale and systematicity of studies of YouTube comments depend on both the technical expertise and the epistemic cultures of the researchers that conduct them. The prolific practitioner of social media analytics Mike Thelwall has provided an ambivalent assessment for the prospects of more data-driven analysis of YouTube comments.4 Thelwall concludes that “text-based social media analytics are challenging because they are not basic facts, survey question responses or discussions with researchers; instead they use a range of strategies to make limited and sensible exploratory deductions from collections of public web texts.”5 Despite this, or perhaps because of it, YouTube comments have provided fertile ground for a certain kind of algorithmic reterritorialisation by computer scientists, who aim to extract value from loosely-structured user comments on the site.6

Those of us who routinely cite YouTube videos in our research or in our teaching will have at one stage or another used YouTube’s user interface or URL syntax to create links that navigate directly to a given fragment of an online video. These hyperlinks are sometimes called “deep links” because they dive deep into the content inside the HTML document at a given URL. Despite a series of redesigns and significant changes on the backend, the YouTube user experience for comments containing timecoded references has remained (more or less) unchanged since their introduction in 2008.7 YouTube detects portions of text comments that resemble valid timecodes and renders them as deep links, which, when clicked, skip the video playhead directly to the moment in the video cited and start playback.
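To make the mechanism concrete, here is a minimal Python sketch of how timecode-like tokens in a comment might be found and turned into deep links. The regular expression and the function names `find_timecodes` and `deep_link` are my own illustrative assumptions, and `VIDEO_ID` is a placeholder; this approximates the behavior described above and is not YouTube's actual detection rule.

```python
import re

# Assumed pattern: optional hours, then minutes and two-digit seconds
# (e.g. 2:08 or 1:02:33). An approximation, not YouTube's actual rule.
TIMECODE_RE = re.compile(r"(?<![\d:])(?:(\d{1,2}):)?(\d{1,2}):(\d{2})(?![\d:])")


def find_timecodes(comment: str) -> list[int]:
    """Return the offset in seconds of each timecode-like token in a comment."""
    offsets = []
    for hours, minutes, seconds in TIMECODE_RE.findall(comment):
        # The hours group is an empty string when the token has no hours part.
        offsets.append(int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds))
    return offsets


def deep_link(video_id: str, offset_seconds: int) -> str:
    """Build a watch URL that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={offset_seconds}s"


if __name__ == "__main__":
    for offset in find_timecodes("At 2:08 it gives me the vibes"):
        print(deep_link("VIDEO_ID", offset))
```

The sketch does not check that the seconds field is below 60, so a token such as 2:75 would still be converted; whether to reject such tokens is a design choice that depends on the research question.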

In this talk, I’ll be speaking about YouTube comments that contain these time-based deep links; following Vliegendhart et al., I’ll call them “time-coded comments” (or TCCs for short).8 Here’s an example of a typical TCC from the dataset. TCCs like this have already been systematically examined by researchers in music information retrieval and multimedia. Researchers recognized that YouTube TCCs, and deep links more generally, could be used to construct non-linear interfaces to online multimedia. These interfaces would stand in contrast to existing mainstream UIs, like the various YouTube interfaces, that encourage ‘all-the-way’-through engagement and algorithmic sequencing.9

In 2012, Vliegendhart et al. developed the LikeLines player, which analyzed TCCs to provide a “heatmap” user interface allowing users to identify and navigate within a video based on the.10 In a later study, they drew on 3,359 time-coded comments across a standardized set of YouTube videos and developed the Viewer Expressive Reaction Varieties (VERV) taxonomy to classify them into one of six basic types.11 Last year, Yarmand et al. introduced a new coded dataset of 2,517 YouTube comments containing what they called content-based “references”, of which about 2,000 included at least one timecode.12

Like Vliegendhart et al., they used these results to prototype a new user interface for the YouTube platform that enables in-place editing and playback of these referential comments. These recent researches contribute to long-running efforts to design top-down codes for the content of YouTube comments, as well as algorithms that attempt to inform or reproduce new taxonomies and automatically determine the sentiment, valence or popularity of a given comment.13

Studies that focus exclusively on the relationship between TCCs and music specifically are quite rare, however. In a 2014 ISMIR paper, Yadati et al. collected a dataset of 225 timecoded comments on Soundcloud recordings of 100 mainstream EDM tracks and showed how they could be used to speed up the automatic detection of the structurally significant and generically typical “drop.”14 Yadati et al. expanded on this in 2018 to consider what they called “socially significant music events” (namely drops, breaks, and builds) in a dataset of 2,070 TCCs over 500 EDM tracks, again collected from Soundcloud.15 As far as I am aware, analogous research into YouTube TCCs focused exclusively on music has not yet been carried out.

Against the background of this broadly empirical work, my research interest is in these comments as documents of technologies of synchronization between sound recordings and other media, usually text. My personal interest in YouTube then is less sociological or anthropological than it is media-theoretical.16 Because TCCs link user-generated content to specific moments in time-based media, they are a rich source of particular human experiences as they are variously mediated in the process of watching or listening to online content.

In October 2019, I began working on a two-year project to develop a media history of the audio compact disc format. Early on, I learned that one of the many striking innovations of the audio CD format is that it introduced digital timecoding of audio data to a wide audience. The use of time-coded paratexts emerges in the final third of the 20th century as technical media—like the CD—became increasingly mediated by microelectronics: technologies that can “take care” of the passage of time using explicit symbolic representations of locations in time, or timecodes.

The media-archaeological viewpoint I adopt here focuses less on the particular applications or audiences of a given technology for the coordination of time and text. Rather, it seeks to examine the most general articulation of the time-coded comment’s cultural form: as a kind of binding between the time of the sound recording medium and the time of the paratext. This binding is often troped as leading to a greater sense of immersion or immediacy. Through digitalization, the time-code gains greater precision and approaches a model of the passage of real time asymptotically but, as it is implemented in illegible microcircuitry, it is accompanied by the user’s increasing alienation from both the data-storing surface of the recording medium and the processes of timekeeping. YouTube TCCs participate in this tradition. But precisely how they do so is a task for another day.

From the dataset

For the rest of this talk, I’m going to give an overview of commenting behavior in almost 190,000 timecoded comments covering 314 distinct YouTube video uploads.17 Between 1.5% and 2% of all YouTube comments on music videos contain at least one timecode. So, to derive this dataset I collected roughly 11 million comments on 620 videos in two bursts, during a period from about March to May 2020. Some videos did not return any TCCs, while others returned very few; I am excluding videos with fewer than 10 TCCs from the analysis of the data today. My criteria for choosing videos for collection were very informal. I collected comments from two main clusters of videos: one cluster deriving from a 2019 Top 100 playlist and the other centering on playlists of highly-viewed recordings of Beethoven’s music as well as from the results of some informal searches on keywords such as “Beethoven” or “greatest classical music”.

The posting dates of the comments themselves extend beyond the latest video upload date but stop in May 2020 when data collection ended. As a consequence, the vast majority of the comments that I collected were posted between 2018 and 2019; the majority of videos to which they respond were uploaded between 2017 and 2019. Therefore, the pop selection primarily speaks to mainstream pop music culture in and around 2019, as it was received up to 2020. Generally the classical music videos are older, and have considerably fewer views, comments, and TCCs. Approximately 67% of the videos in the dataset are pop music videos. However, the considerably greater average views-per-video means that pop music videos are responsible for 93% of the TCCs in the dataset.

Here are some of the video titles that I collected, just to give you a sample of what’s in the dataset. The TCCs that I collected are generally short; the mean comment length is 43.96 characters.18 93% of the TCCs include exactly one timecode; 5.86% contain exactly two; the remainder (just over 1%) contain three or more. Looking at TCCs with an exceptionally large number of timestamps (much greater than the mode), we can see that TCCs are used to construct ad hoc setlists for videos that contain more than one musical work. This is consistent with YouTube users’ use of timecodes to compile track listings for transfers from analog media or to collate set lists for recordings of live musical performances.19
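Figures like these can be derived by counting timecode-like tokens per comment. What follows is a rough Python sketch under stated assumptions: the regular expression is an approximation of what counts as a timecode, the function names are hypothetical, and the threshold of five timecodes for a "candidate tracklist" is an arbitrary illustrative choice rather than a value used in the talk.

```python
import re
from collections import Counter

# Assumed timecode pattern (m:ss or h:mm:ss); an approximation only.
TIMECODE_RE = re.compile(r"(?<![\d:])(?:\d{1,2}:)?\d{1,2}:\d{2}(?![\d:])")


def timecode_count_shares(comments):
    """Map each timecode count (1, 2, ...) to its share of all TCCs."""
    counts = Counter(len(TIMECODE_RE.findall(c)) for c in comments)
    # Comments without any timecode are not TCCs and are left out.
    total = sum(n for k, n in counts.items() if k > 0)
    if total == 0:
        return {}
    return {k: n / total for k, n in sorted(counts.items()) if k > 0}


def candidate_tracklists(comments, min_timecodes=5):
    """Return comments with many timecodes, which may be setlists or track listings."""
    return [c for c in comments if len(TIMECODE_RE.findall(c)) >= min_timecodes]


if __name__ == "__main__":
    toy = [
        "0:00 love this",
        "1:15 and 2:30",
        "0:00 A 3:10 B 7:45 C 12:01 D 15:30 E",
        "nice video",
        "4:20",
    ]
    print(timecode_count_shares(toy))
    print(candidate_tracklists(toy))
```

A comment flagged this way still needs to be read by a person: a long list of timecodes might equally be a running commentary rather than a track listing.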

Rather than focus on exceptional specimens from the dataset, let’s take 5 TCCs completely at random. These five comments are fairly representative. The first comment indicates the location of the “drops” in a recording of “Winter” from Vivaldi’s Four Seasons concerti. The second TCC tags a striking image from the music video for Ariana Grande and Social House’s “boyfriend” (2019). With two words (“klaus mood”), the one-year-old comment likely connects the character in the music video to a flamboyant denizen of the universe of The Umbrella Academy, which premiered on Netflix in February 2019. The third and fifth TCCs demonstrate the difficulty of interpreting the 12% or so “timecode-only” TCCs with reference to the comment text only. And, finally, the fourth comment performs the service of linking users to the moment the recorded song begins after a lengthy intro in the music video to Lil Dicky’s comic-rap effort “Freaky Friday” (2019). Unfortunately, the poster’s effort was not rewarded with a like on this occasion.

More interestingly, TCCs can help us understand the deep emotional and affectual responses to music videos.20 Following Bannister’s recent work on the varieties of “chills” response to multimedia, we might explore in the dataset where and when users self-report chills, tingles, and goosebumps as well as the bodily responses of warmth, tears, relaxation, and so on, that qualify the distinct varieties of chills response that he identified experimentally.21

TCCs also permit us to reason about the relationship between music and reminiscence, as the TCC on Billie Eilish’s “everything i wanted” (2019) shown here demonstrates.

“At 2:08 it gives me the vibes of when we used to play that weird sonic game as a kid and if he was underwater for too long it would play these sounds and I remember it being close to the”do do do dO” in the background so. And it comepletely makes sense because if she is under water too long she will ya know.. drown.”

Despite their characteristic brevity, TCCs also afford subtle narrative analyses of the music heard. These accounts are sometimes grounded in established interpretations of programmatic musical works and, other times, represent new and surprising hearings, as the comment here on the Overture to Rossini’s William Tell does.

In addition to close hearings, longer-than-average TCCs facilitate close readings of the visual content of music videos, as users attempt to decipher the hidden (or not so hidden) imagery of these videos and use timecodes to give a temporal structure to their observations. This is a favored fan activity, as this comment on the video for Demi Lovato’s “I Love Me” (2020) shows. TCCs also afford highly context-relevant links between media: hypertext becomes intertext. This TCC (one of a number) annotates the Billie Eilish track “goodbye”, showing precisely how the lyrics of the final track of her 2019 album When We All Fall Asleep, Where Do We Go? recapitulate the content of the preceding tracks on the album. This fan work garners the commenter over 2,000 likes.

One of many idiosyncratic uses of the TCC in the dataset is the construction of ASCII-art media players: these are generally copypasta adjusted to reflect the end time of the song. It is tempting to consider these as having a different function to the “regular” TCC, but they still afford user interaction. The hypermediated progress bar and track cueing functions allow users to navigate back to the start of the song, a feature not afforded by the YouTube user interface until playback has ended. They are a visual twist on the highly popular “0:00” comment: a TCC that is intended to demonstrate that a user enjoyed the video enough to immediately “rewind” to the start of the track.22

Interestingly, there are a small number of TCCs that contain invalid timecodes. For example, some TCCs deliberately exceed the total duration of the parent video. These “links to nowhere” should not be excluded from research simply because they cannot easily be synchronized to the time-based media to which they are related.23 These TCCs are only malformed in relation to a particular end: for example, from the point of view of the information retrieval researcher or the online advertising platform engineer. Participants in mainstream social media platforms are not just users, but what Axel Bruns called “produsers”, whose digital exhaust is captured and monetized as behavioral data.24
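One way to act on this in a data pipeline is to flag out-of-range timecodes rather than filter them out. The following Python sketch assumes that timecode offsets and the parent video's duration are already available in seconds; the `TimecodeRef` record and the `annotate_timecodes` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TimecodeRef:
    comment_id: str
    offset_seconds: int
    beyond_end: bool  # True when the timecode points past the end of the video


def annotate_timecodes(comment_id, offsets, duration_seconds):
    """Keep every timecode, marking those that exceed the video's duration."""
    return [
        TimecodeRef(comment_id, offset, offset > duration_seconds)
        for offset in offsets
    ]


if __name__ == "__main__":
    # A hypothetical four-minute video with one in-range and one out-of-range timecode.
    for ref in annotate_timecodes("c1", [30, 9999], duration_seconds=240):
        print(ref)
```

Keeping the flag alongside the comment means that “links to nowhere” remain available for qualitative reading, while analyses that need synchronizable timecodes can still select on `beyond_end`.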

Furthermore, media practices “at the limit” of intelligibility frequently indicate the distinctive characteristics of the medium, and this is no different in the case of time-coded comments. These are prevalent, for example, in YouTube comments that play with content above and below “the fold”. The TCC shown here plays with the norms and affordances of the platform, knowingly misleading the user in exchange for approbation from other users. It neatly encapsulates how the value of YouTube time-coded comments exceeds their fungibility as data points in the landscape of information retrieval. Malformed timecodes can be tactical as well as humorous, precisely because they resist straightforward algorithmic interpretation.

Challenges

To close I want to summarise some of the challenges facing research into music with TCCs. Perhaps the most dangerous is that studies of digitally mediated listening that draw on YouTube data risk overstating the musical preferences of a self-selecting cohort of online commenters. Patricia G. Lange sounded the alarm as early as 2008, observing that “it is a synchronically-laden categorisation to seek a person who posts videos on YouTube, and assume that they were, are, and always will be ‘ordinary.’”25

Lange was thinking primarily of users who post videos; much the same goes for commenters, of course. Furthermore, a survey carried out by Schultes, Dorner, and Lehner reveals some of the antinomies that characterise YouTube comments.26 On the whole, YouTube comments were not viewed as particularly reputable, relevant or essential to the viewing experience by users.27 Despite this, however, they remain popular: these authors replicated research showing that about 12% of viewers will leave comments under a given video, and over half their survey respondents agreed that they often read the first one or two comments after watching a YouTube video.28

They are also far from harmless. In an important recent study, Murthy and Sharma study the videos of the hip-hop group Das Racist to show that the way the YouTube commentsphere is implemented leads to the contagion of racial antagonisms in comment sections inside and across videos at the “meso” level of the social network.29 Racism in the comments section ought not to be characterised as exceptional and sporadic flashpoints of incivility, trolling or flaming; rather, they argue, “online hostility” is “a networked phenomenon.”30

Not unrelatedly, YouTube comment data is almost always multilingual. All the statistics I cite above pertained to comments of all languages: that is, I did not filter out non-English-language comments. It is possible to estimate the relative proportion of languages represented in the dataset of timecoded comments. The data suggests that 85% of the time-coded comments I presented here are written in English; about 4% are written in Spanish, the next most represented language, followed in rank order by Portuguese (1.6%), Russian (1.3%), Japanese (1.3%), and simplified Chinese (less than 0.5%).31 The proportion of non-English comments changes depending on the set of videos in question. This plot shows the proportion of non-English comments over a set of videos broken out by recording artist: the black dot indicates the mean proportion of non-English comments, which is relatively high for Ed Sheeran and Maroon 5, and relatively low for Future and Lil Dicky.32 These data are based on estimates and will require validation in the future, but can nevertheless be used to make suggestive arguments about the relative appeal of different groups of music videos.
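A per-artist summary of this kind could be computed along the following lines. This Python sketch assumes that each comment has already been assigned a language code by some language-identification tool (none is specified here), and that the mean is taken over per-video proportions; both are assumptions of mine. The function name, record layout, and artist labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def non_english_share_by_artist(records):
    """records: iterable of (artist, video_id, language_code) tuples.

    Returns the mean, over each artist's videos, of the per-video
    proportion of comments whose estimated language is not English.
    """
    per_video = defaultdict(lambda: [0, 0])  # (artist, video) -> [non_english, total]
    for artist, video_id, lang in records:
        tally = per_video[(artist, video_id)]
        tally[1] += 1
        if lang != "en":
            tally[0] += 1

    per_artist = defaultdict(list)
    for (artist, _video_id), (non_english, total) in per_video.items():
        per_artist[artist].append(non_english / total)

    return {artist: mean(shares) for artist, shares in per_artist.items()}


if __name__ == "__main__":
    toy = [
        ("Artist A", "v1", "en"), ("Artist A", "v1", "es"),
        ("Artist A", "v2", "en"), ("Artist A", "v2", "en"),
        ("Artist B", "v3", "pt"),
    ]
    print(non_english_share_by_artist(toy))
```

Because the language codes are themselves estimates, the resulting proportions inherit any misclassification in that earlier step.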

This leads nicely into important questions of algorithmic bias in data-driven research. Because language identification at scale is probabilistic, some comments in the data I have shown you have definitely been incorrectly classified. These graphs, then, represent estimates of the underlying distribution of languages used in the platform rather than population-level data. And because the majority of state-of-the-art language modeling research is conducted with respect to English, it is common for computational researchers to restrict their analyses to English TCCs only, as this reduces the likelihood that errors arise in later processing. This is one of many examples where the deeply rooted Anglocentric bias of computational approaches to language can affect the quality of results and applications, a feature not only of the analysis stage but all steps in the data processing pipeline, including collection.

Finally, there are the challenges of working with proprietary data, which is regulated not only by terms of service agreements but a nest of national and international legislation that has tended to favor the rights of the owners of the web property on which the data is made available. YouTube data in particular has varied in its availability, and the processes for obtaining it in bulk are considerably more opaque than they are for other social media platforms.

Researchers interested in the possibilities for this kind of research should work toward developing an open, sustainable, ethical, and legal workflow for collecting and sharing TCC data. This will require a survey not only of the existing data sources and data-collection tools but the creation of new datasets, software, and policies to consolidate and improve the functionality of a set of resources that is underdocumented and, generally, requires non-trivial legal and technical expertise to implement effectively.

Possibilities for the future

However, despite all these challenges, the future for research into music and music videos with TCCs is bright. A raft of technical improvements on the work presented here are possible: collecting more accurate comment-level data using the YouTube API; automatic identification and extraction of temporal ranges; the use of special datasets to link recognized named entities to knowledge bases customised to work in the music domain; improved support for the analysis of the prevalence of emoji in comments; the use of more accurate language-detection models; the use of state-of-the-art natural language processing (NLP) techniques to cluster and retrieve documents based on their semantic similarity.

Another next step will be to establish criteria for collection and preservation of particular subsets of the available YouTube data, given that total preservation by independent researchers is impossible. Once a manageable slice of the TCC commentsphere has been identified and collected, the limits on the questions that can be asked and answered are determined primarily by the data-analysis techniques that can be applied to the data arising.

From most disciplinary perspectives, my talk today did not systematically analyse the TCCs I collected. Fortunately, I did not set out to do so. Rather, I set out to show how to take the techniques and tools for data analysis that are traditionally the province of data extractivists and use them to excavate, crystallise, and reflect upon, for no end in particular, hundreds of thousands of users’ encounters with music videos on YouTube. Their frank engagement with recorded sound has been, and will be, recorded in the technologies of synchronization to which the ever-expanding digital cultural archive attests.

Works cited

Bannister, Scott. “A Survey into the Experience of Musically Induced Chills: Emotions, Situations and Music.” Psychology of Music 48, no. 2 (March 2020): 297–314. https://doi.org/10.1177/0305735618798024.

———. “Distinct Varieties of Aesthetic Chills in Response to Multimedia.” PLOS ONE 14, no. 11 (November 14, 2019): e0224974. https://doi.org/10.1371/journal.pone.0224974.

Colburn, Steven. “Filming Concerts for YouTube: Seeking Recognition in the Pursuit of Cultural Capital.” Popular Music and Society 38, no. 1 (2015): 59–72. http://dx.doi.org/10.1080/03007766.2014.974373.

Edgar, Amanda Nell. “Commenting Straight from the Underground: N.W.A., Police Brutality, and YouTube as a Space for Neoliberal Resistance.” Southern Communication Journal 81, no. 4 (2016): 223–36. https://dx.doi.org/10.1080/1041794X.2016.1200123.

Kincaid, Jason. “YouTube Enables Deep Linking Within Videos.” TechCrunch, October 26, 2008. http://social.techcrunch.com/2008/10/25/youtube-enables-deep-linking-within-videos/.

Lamere, Paul. “The Drop Machine.” Music Machinery - a blog about music technology by Paul Lamere, June 16, 2015. https://musicmachinery.com/2015/06/16/the-drop-machine/.

Lange, Patricia G. “(Mis)Conceptions About YouTube.” In Video Vortex Reader: Responses to YouTube, 87–100. Amsterdam: Institute of Network Cultures, 2008.

Madden, Amy, Ian Ruthven, and David McMenemy. “A Classification Scheme for Content Analyses of YouTube Video Comments.” Journal of Documentation 69, no. 5 (January 1, 2013): 693–714. https://doi.org/10.1108/JD-06-2012-0078.

Mangaoang, Áine. Dangerous Mediations: Pop Music in a Philippine Prison Video. New York: Bloomsbury Academic, 2019.

———. “Dangerous Mediations: YouTube, Pop Music, and Power in a Philippine Prison Video.” 2014. http://repository.liv.ac.uk/2009748/.

McMurray, Peter. “YouTube Music—Haptic or Optic?” Repercussions, 2014, 1–47.

Menotti, Gabriel. “Objets Propagés: The Internet Video as an Audiovisual Format.” In Video Vortex Reader II: Moving Images Beyond YouTube, edited by Geert Lovink and Rachel Somers Miles, 70–80. Amsterdam: Institute of Network Cultures, 2011.

Murthy, Dhiraj, and Sanjay Sharma. “Visualizing YouTube’s Comment Space: Online Hostility as a Networked Phenomena.” New Media & Society 21, no. 1 (January 2019): 191–213. https://doi.org/10.1177/1461444818792393.

Savigny, Julio, and Ayu Purwarianti. “Emotion Classification on Youtube Comments Using Word Embedding.” In 2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA), 1–5, 2017. https://doi.org/10.1109/ICAICTA.2017.8090986.

Schultes, Peter, Verena Dorner, and Franz Lehner. “Leave a Comment! An in-Depth Analysis of User Comments on YouTube.” In Wirtschaftsinformatik Proceedings 2013, 42:659–73. Leipzig, n.d. https://aisel.aisnet.org/wi2013/42.

Severyn, Aliaksei, Alessandro Moschitti, Olga Uryupina, Barbara Plank, and Katja Filippova. “Multi-Lingual Opinion Mining on YouTube.” Information Processing & Management 52, no. 1 (January 2016): 46–60. https://doi.org/10.1016/j.ipm.2015.03.002.

Siersdorfer, Stefan, Sergiu Chelaru, Wolfgang Nejdl, and Jose San Pedro. “How Useful Are Your Comments?: Analyzing and Predicting Youtube Comments and Comment Ratings.” In Proceedings of the 19th International Conference on World Wide Web, 891–900. WWW ’10. New York, NY, USA: ACM, 2010. https://doi.org/10.1145/1772690.1772781.

Spencer, Edward K. “Re-Orientating Spectromorphology and Space-Form Through a Hybrid Acoustemology.” Organised Sound 22, no. 3 (December 2017): 324–35. https://doi.org/10.1017/S1355771817000486.

Thelwall, Mike. “Social Media Analytics for YouTube Comments: Potential and Limitations.” International Journal of Social Research Methodology 21, no. 3 (May 4, 2018): 303–16. https://doi.org/10.1080/13645579.2017.1381821.

Vliegendhart, Raynor, Martha Larson, and Alan Hanjalic. “LikeLines: Collecting Timecode-Level Feedback for Web Videos Through User Interactions.” In Proceedings of the 20th ACM International Conference on Multimedia, 1271–2. MM ’12. Nara, Japan: Association for Computing Machinery, 2012. https://doi.org/10.1145/2393347.2396437.

Vliegendhart, Raynor, Martha Larson, Babak Loni, and Alan Hanjalic. “Exploiting the Deep-Link Commentsphere to Support Non-Linear Video Access.” IEEE Transactions on Multimedia 17, no. 8 (August 2015): 1372–84. https://doi.org/10.1109/TMM.2015.2449086.

Vliegendhart, Raynor, Babak Loni, Martha Larson, and Alan Hanjalic. “How Do We Deep-Link? Leveraging User-Contributed Time-Links for Non-Linear Video Access.” In Proceedings of the 21st ACM International Conference on Multimedia, 517–20. MM ’13. Barcelona, Spain: Association for Computing Machinery, 2013. https://doi.org/10.1145/2502081.2502137.

Yadati, Karthik, Martha Larson, Cynthia C. S. Liem, and Alan Hanjalic. “Detecting Drops in Electronic Dance Music: Content Based Approaches to a Socially Significant Music Event.” In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), 143–48. Taipei, Taiwan, 2014. https://doi.org/10.5281/zenodo.1417081.

———. “Detecting Socially Significant Music Events Using Temporally Noisy Labels.” IEEE Transactions on Multimedia 20, no. 9 (September 2018): 2526–40. https://doi.org/10.1109/TMM.2018.2801719.

Yarmand, Matin, Dongwook Yoon, Samuel Dodson, Ido Roll, and Sidney S. Fels. “"Can You Believe [1:21]?!": Content and Time-Based Reference Patterns in Video Comments.” In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–12. CHI ’19. New York, NY, USA: Association for Computing Machinery, 2019. https://doi.org/10.1145/3290605.3300719.

“YouTube Comment Memes.” Know Your Meme. Accessed September 25, 2020. https://knowyourmeme.com/memes/youtube-comment-memes.


  1. Áine Mangaoang, “Dangerous Mediations: YouTube, Pop Music, and Power in a Philippine Prison Video” (PhD diss., 2014), http://repository.liv.ac.uk/2009748/; Áine Mangaoang, Dangerous Mediations: Pop Music in a Philippine Prison Video (New York: Bloomsbury Academic, 2019).↩︎

  2. Amanda Nell Edgar, “Commenting Straight from the Underground: N.W.A., Police Brutality, and YouTube as a Space for Neoliberal Resistance,” Southern Communication Journal 81, no. 4 (2016): 223–36, https://dx.doi.org/10.1080/1041794X.2016.1200123.↩︎

  3. Edward K. Spencer, “Re-Orientating Spectromorphology and Space-Form Through a Hybrid Acoustemology,” Organised Sound 22, no. 3 (December 2017): 324–35, https://doi.org/10.1017/S1355771817000486.↩︎

  4. Mike Thelwall, “Social Media Analytics for YouTube Comments: Potential and Limitations,” International Journal of Social Research Methodology 21, no. 3 (May 4, 2018): 303–16, https://doi.org/10.1080/13645579.2017.1381821. Thelwall does not indicate the number of comments studied though his published charts indicate that it well surpasses 1 million.↩︎

  5. Thelwall, 10.↩︎

  6. A complete survey of this research is out of scope, but see, for example, Stefan Siersdorfer et al., “How Useful Are Your Comments?: Analyzing and Predicting Youtube Comments and Comment Ratings,” in Proceedings of the 19th International Conference on World Wide Web, WWW ’10 (New York, NY, USA: ACM, 2010), 891–900, https://doi.org/10.1145/1772690.1772781; Julio Savigny and Ayu Purwarianti, “Emotion Classification on Youtube Comments Using Word Embedding,” in 2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA), 2017, 1–5, https://doi.org/10.1109/ICAICTA.2017.8090986; Aliaksei Severyn et al., “Multi-Lingual Opinion Mining on YouTube,” Information Processing & Management 52, no. 1 (January 2016): 46–60, https://doi.org/10.1016/j.ipm.2015.03.002.↩︎

  7. Jason Kincaid, “YouTube Enables Deep Linking Within Videos,” TechCrunch, October 26, 2008, http://social.techcrunch.com/2008/10/25/youtube-enables-deep-linking-within-videos/. For a visual history of the UI/frontend, see the representative sequence of Wayback Machine captures archived here. As for the backend… Forthcoming! But we can see changes in the YouTube API.↩︎

  8. Raynor Vliegendhart et al., “Exploiting the Deep-Link Commentsphere to Support Non-Linear Video Access,” IEEE Transactions on Multimedia 17, no. 8 (August 2015): 1372–84, https://doi.org/10.1109/TMM.2015.2449086.↩︎

  9. For example, the existence in the Chrome Web Store of a userspace browser plugin that supplies YouTube with a “repeat 1” button testifies to the platform’s lack of such a feature. Repeat-1 has been commonplace on digital audio players since the introduction of the CD and remains in place even on Spotify.↩︎

  10. Raynor Vliegendhart, Martha Larson, and Alan Hanjalic, “LikeLines: Collecting Timecode-Level Feedback for Web Videos Through User Interactions,” in Proceedings of the 20th ACM International Conference on Multimedia, MM ’12 (Nara, Japan: Association for Computing Machinery, 2012), 1271–2, https://doi.org/10.1145/2393347.2396437.↩︎

  11. Raynor Vliegendhart et al., “How Do We Deep-Link? Leveraging User-Contributed Time-Links for Non-Linear Video Access,” in Proceedings of the 21st ACM International Conference on Multimedia, MM ’13 (Barcelona, Spain: Association for Computing Machinery, 2013), 517–20, https://doi.org/10.1145/2502081.2502137; Vliegendhart et al., “Exploiting the Deep-Link Commentsphere to Support Non-Linear Video Access.”↩︎

  12. Matin Yarmand et al., “‘Can You Believe [1:21]?!’: Content and Time-Based Reference Patterns in Video Comments,” in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19 (New York, NY, USA: Association for Computing Machinery, 2019), 1–12, https://doi.org/10.1145/3290605.3300719.↩︎

  13. Siersdorfer et al., “How Useful Are Your Comments?”; Peter Schultes, Verena Dorner, and Franz Lehner, “Leave a Comment! An in-Depth Analysis of User Comments on YouTube,” in Wirtschaftsinformatik Proceedings 2013, vol. 42 (11th International Conference on Wirtschaftsinformatik, Leipzig, 2013), 659–73, https://aisel.aisnet.org/wi2013/42; Amy Madden, Ian Ruthven, and David McMenemy, “A Classification Scheme for Content Analyses of YouTube Video Comments,” Journal of Documentation 69, no. 5 (January 1, 2013): 693–714, https://doi.org/10.1108/JD-06-2012-0078.↩︎

  14. Karthik Yadati et al., “Detecting Drops in Electronic Dance Music: Content Based Approaches to a Socially Significant Music Event,” in Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014) (Taipei, Taiwan, 2014), 143–48, https://doi.org/10.5281/zenodo.1417081. The authors called these “timed comments” in this paper and in successive research. On detecting drops more generally using implicit user behavior (in this case, internal data on user time-scrubbing in Spotify), see Paul Lamere, “The Drop Machine,” Music Machinery - a blog about music technology by Paul Lamere, June 16, 2015, https://musicmachinery.com/2015/06/16/the-drop-machine/.↩︎

  15. Karthik Yadati et al., “Detecting Socially Significant Music Events Using Temporally Noisy Labels,” IEEE Transactions on Multimedia 20, no. 9 (September 2018): 2526–40, https://doi.org/10.1109/TMM.2018.2801719.↩︎

  16. Peter McMurray, “YouTube Music—Haptic or Optic?” Repercussions, 2014, 1–47; Gabriel Menotti, “Objets Propagés: The Internet Video as an Audiovisual Format,” in Video Vortex Reader II: Moving Images Beyond YouTube, ed. Geert Lovink and Rachel Somers Miles (Amsterdam: Institute of Network Cultures, 2011), 70–80.↩︎

  17. Apology about this.↩︎

  18. Median 33, SD 79.12.↩︎

  19. Not represented in the collected data, but discernible (for example) here and here. Steven Colburn, “Filming Concerts for YouTube: Seeking Recognition in the Pursuit of Cultural Capital,” Popular Music and Society 38, no. 1 (2015): 59–72, http://dx.doi.org/10.1080/03007766.2014.974373.↩︎

  20. As the presentation of Lamont, Bannister, Coutinho, and Egermann sets out to demonstrate.↩︎

  21. Scott Bannister, “Distinct Varieties of Aesthetic Chills in Response to Multimedia,” PLOS ONE 14, no. 11 (November 14, 2019): e0224974, https://doi.org/10.1371/journal.pone.0224974; Scott Bannister, “A Survey into the Experience of Musically Induced Chills: Emotions, Situations and Music,” Psychology of Music 48, no. 2 (March 2020): 297–314, https://doi.org/10.1177/0305735618798024.↩︎

  22. “YouTube Comment Memes,” Know Your Meme, accessed September 25, 2020, https://knowyourmeme.com/memes/youtube-comment-memes.↩︎

  23. A similar set of concerns arises in modelling and mapping fictitious or ill-defined places in literature, as in digital humanities (DH) work with geographic information systems (GIS).↩︎

  24. CITE↩︎

  25. Patricia G. Lange, “(Mis)Conceptions About YouTube,” in Video Vortex Reader: Responses to YouTube (Amsterdam: Institute of Network Cultures, 2008), 87–100.↩︎

  26. Schultes, Dorner, and Lehner, “Leave a Comment! An in-Depth Analysis of User Comments on YouTube.”↩︎

  27. Schultes, Dorner, and Lehner, 659.↩︎

  28. Schultes, Dorner, and Lehner, 660.↩︎

  29. Dhiraj Murthy and Sanjay Sharma, “Visualizing YouTube’s Comment Space: Online Hostility as a Networked Phenomena,” New Media & Society 21, no. 1 (January 2019): 191–213, https://doi.org/10.1177/1461444818792393.↩︎

  30. Indeed, it is a networked phenomenon that long predates the Web.↩︎

  31. Language was identified using langid, over a subset of ~47.73K TCCs for which langid log_probability < -100 (reasonably confident estimates of language), excluding videos for which fewer than 11 comments were collected.↩︎

  32. See notes.↩︎