Tag: data

Looking Out For Each Other with The Real Music Wages Database

A photo of an old card catalog from a library

I recently visited a sound art class at Vanderbilt University (over Zoom) as a guest artist. Towards the end of our conversation, one of the students asked me about what I was looking forward to in the future of my work and the fields of music and sound art. Rather than the aesthetic answer the student expected (and I could easily see myself giving a year ago), I surprised both of us by unhesitatingly responding that I was looking forward to improved arts workers’ conditions.

As excited as I am about the opera I’m currently writing or thingNY’s upcoming foray into mail art, the immediate effort I see from various communities of artists to create better working conditions and a healthier, more equitable social and economic ecosystem for the arts eclipses any individual art project or aesthetic movement in terms of my optimism for the future. From the boisterous and massive in-person protests to the quiet one-on-one conversations, from the various collective conversations on Zoom to the steady helping hand of mutual aid organizations, across the podcast interviews, Slack channels, op-eds, and, yes, the astounding musical performances and recordings, a culture of community care has been dancing in a rainbow of tempos in all corners of the performing arts world. Into this spirit of sharing knowledge and resources, the New Music Organizing Caucus has created the Real Music Wages Database.

The Real Music Wages Database is an anonymous, crowd-sourced list of real wage transactions reported by musicians. We track how much someone has been paid, who paid them, and how many hours of work it involved. The more entries are added to the spreadsheet, the clearer a true economic snapshot of the new music industry becomes. Inspired by similar crowd-sourced spreadsheets for dancers, baristas, museum workers, and adjunct professors, we created the Real Music Wages Database to help freelance music workers navigate what can be a very confusing financial landscape and give us tools to negotiate wages for ourselves, particularly in situations when we don’t have a union or an agent working on our behalf. The transparency of the database is also meant to be useful for ensembles, composers, producers, and presenters who want to get a better idea of what an industry standard might look like. The database has the potential both to identify organizations that don’t pay their performers enough and to model how much an organization should pay its performers, ultimately encouraging equal pay rates and a living wage for musicians. (Oh man, doesn’t that sound nice?)
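To make the three tracked values concrete, here is a minimal sketch of how entries like these could be summarized into an effective hourly wage. The field names (`payer`, `pay`, `hours`), the helper functions, and the sample figures are all hypothetical, invented for illustration; they are not the database's actual schema or data.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class GigEntry:
    # Hypothetical fields; the real database's schema may differ.
    payer: str
    pay: float    # total amount paid, in dollars
    hours: float  # total hours of work involved


def hourly_rate(entry: GigEntry) -> float:
    """Effective hourly wage for a single reported gig."""
    if entry.hours <= 0:
        raise ValueError("hours must be positive")
    return entry.pay / entry.hours


def median_rate_by_payer(entries: list[GigEntry]) -> dict[str, float]:
    """Median effective hourly wage reported for each payer."""
    rates: dict[str, list[float]] = {}
    for entry in entries:
        rates.setdefault(entry.payer, []).append(hourly_rate(entry))
    return {payer: median(values) for payer, values in rates.items()}


# Invented sample entries, for illustration only.
SAMPLE_ENTRIES = [
    GigEntry("Ensemble A", pay=300.0, hours=10.0),
    GigEntry("Ensemble A", pay=500.0, hours=10.0),
    GigEntry("Festival B", pay=150.0, hours=12.0),
]
```

A median is used here rather than a mean so that one unusually high or low report does not dominate a payer's figure; that is a design choice of this sketch, not something the database itself prescribes.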

Starting out a career in new music and its adjacent musical scenes can be very confusing financially. For me, learning what I should be paid involved years of being paid a vast variety of amounts (or not at all) in ways that even still don’t always reflect the amount of work put in. One gig will pay my rent for two months after a week of work, while another gig will take a month of work to only pay half of my month’s rent. All of us freelancers know that part of our hustle is stitching together a living from a disparate assortment of gigs, each with its own unique equation of give and take. We’re hopeful that the Real Music Wages Database will speed up the knowledge gathering process significantly for young musicians, particularly those who don’t always feel comfortable casually asking their peers what they are being paid, as well as offer transparency for those who have been going at it for a while. In addition, the database can be a resource to other performing arts workers, like dancers and performance artists, who work with some of the same institutions, presenters, and venues that we do, but who historically have had an even harder time making a decent living, and can use the details of our experiences to uplift their own.

The database is limited in the information it gathers. For instance, we don’t ask about the tax status of the gig, or if you were given retirement benefits or health insurance. (Because let’s be real – how often does that happen?) And unlike other databases, we don’t ask about your gender, racial identity, disability status, or sexual orientation. We think tracking that kind of information is important and we fully support the reckoning over equity taking place within the new music world. However, we want to protect our community’s anonymity and felt that such a level of detailed, identifying information could sabotage those efforts.

We also wanted to make adding entries to the database quick and easy. So we decided to only ask for the most essential information and then use a system of tagging so that each person can decide what additional information would be useful for others to know. For instance, tags can indicate if a gig is associated with a specific university or institution that did not directly pay you; whether the gig was with an orchestra or chamber ensemble, for an opera or a wedding; whether it was for a specific series or festival; if it was a recording session; or if it involved an adjacent field, be that dance, theatre, or a religious service. The more tags are used, the more options will be suggested for you as you type into the ‘Tag’ field. This way, people can add as much information as they want and it is up to each individual to decide what they are comfortable sharing.

The database is also focused on gigs where the musician is not a generative artist. We understand how complicated the funding structures for our work as composers, sound artists, and performance creators can be. We decided that to fully measure the intricacies of our creative time for such projects would take a different set of questions. (We also encourage folks to use the NewMusicBox Commissioning Fees Calculator if it will be useful for your situation.) And finally, we especially encourage musicians to input their gigs paid for by funded institutions, particularly non-profit organizations that receive funds from foundations and governmental arts councils. A broader reckoning around funding and transparency in larger non-profit arts institutions is currently taking place and we hope this database can be just one tool in the reforming process.

The New Music Organizing Caucus (NMOC) is a baby of an organization, originally founded by composer-pianist Dorian Wallace and now spearheaded by a small group of dedicated and welcoming activist musicians. Initiated during the activist summer of 2020 that was energized by Black Lives Matter protests in the wake of George Floyd’s murder, NMOC holds monthly Zoom meetings where a community of new music workers come together to, as stated on the NMOC website, “advocate for decent working conditions and fair wages, provide support against discriminatory practices, share skills and knowledge, and fight for diversity, equity and inclusion in our field.” It’s a community of fellow musicians that welcomes anyone who wants to become more involved. I have met many people for the first time in these meetings, which might begin in shy awkwardness and end in refreshing sensations of solidarity. As with any volunteer organization, the more its membership wants to do, the more will happen. So far, we rely on pro bono assistance. For instance, Brian McCorkle designed the website for the database, and Sophia Richardson and Alyssa McCallion designed the logo.

The Real Music Wages Database is the first large project initiated by NMOC with an eye on other ways we can support our community and share resources in the future. The group also works to advocate for the special interests of new music in larger organizations such as the American Federation of Musicians (AFM), the Union of Musicians and Allied Workers (UMAW), and the Music Workers Alliance (MWA), as well as connect members with resources in these larger organizations. Though many of the active members are based in New York City, there are members from all across the United States. And even as in-person events begin again in the coming year, the group plans to continue meeting online so that it can serve and connect a wider geographical range of musicians.

Like many others, the reason I have more time to go to Zoom meetings for volunteer, activist organizations is because I don’t have as much work as I did before the pandemic started. (Also, I was probably working too much before the pandemic started, but that’s another story…) You might be looking at the Real Music Wages Database and thinking you’d love to input gigs when you have them again. When that day comes (and oh it will), I hope you do! In addition, it is tax season. I recently found myself adding entries as I went through my paystubs and expenses from 2020 in preparation for meeting with my tax guy. However strange it seems to me looking back on what felt like an impossibly long year, there were two and a half months of work in 2020 before the lockdown completely transformed every aspect of my life, and that is as obvious in my banking activity as it is in my sleep schedule. For the ultra-ambitious musician with free time, take a moment now to add your gigs from multiple past years. And for the slow-and-steady thoughtful freelancer, thank you for adding your gigs as you get them for many years to come. The Real Music Wages Database is as much of a useful tool as our collective music community nurtures it to be. I’m real thankful to be part of a community of folks that look out for each other.

(In full transparency, the NMOC Real Wages Steering Committee currently consists of Gelsey Bell, Nicholas Connolly, David Friend, Andrew Griffin, Marina Kifferstein, Brian McCorkle, Luisa Muhr, Pablo O’Connell, and Hajnal Pivnick. We can be reached at [email protected]. Want to be more involved? Please join us!)

The logo for the Real Music Wages database

Spreadsheets and Skeptics: a philosophical tale of data and music

Image via TrekCore

On argumente mal l’honnesteté et la beauté d’une action par son utilité
A man but ill proves the honour and beauty of an action by its utility

—Michel de Montaigne, “De l’Utile et de l’Honneste”

What do you do?

How many answers are there to that question? An occupation. A pastime. A technique. A course of action. Or maybe the question itself is a concession: a rhetorical shrug of the shoulders against the possibility of an answer.

Last August, The New York Times Magazine ran an article by Steven Johnson. “The Creative Apocalypse That Wasn’t” painted, amidst some judicious caveats, a hopeful, even rosy picture of the prospects for a musical career post-Napster, post-internet, post-streaming services. It was, in a way, an exemplar of 21st-century explanatory journalism: technologically optimistic, pleasantly contrarian—and data-driven. Very data-driven.

Both the data and the drive were concerned with that same question: what do you do? One of Johnson’s main exhibits was occupational data—that is, counting up the number of people who said that their occupation was “musician” or some equivalent. In Johnson’s analysis, that number was going up, even as digital forms of consumption seemed to be anecdotally squeezing musicians out of the marketplace. Which led to the second “what do you do?”: don’t worry (or, at least, worry less), be happy (or, at least happier).

There were problems with the article. Johnson’s data was selective and, in at least one case (which I’ll get to below), didn’t quite say what he thought it said. And his own conception of what musicians do was somewhat disconnected from the huge variety and combinations of ways musicians make a living. I certainly raised an eyebrow (as did, I would imagine, Frank) when Johnson noted that

The growth of live music isn’t great news for the Brian Wilsons of the world, artists who would prefer to cloister themselves in the studio, endlessly tinkering with the recording process in pursuit of a masterpiece

—seemingly oblivious not only to exactly how many babies he was tossing out with that achingly lovely California-sun-dappled bathwater, but how many other cloisters (schools, practice rooms, composing tables) are crucial to even the most prolifically disposable musical styles.

Plenty of critiques followed Johnson’s article, most of them negative. The Future of Music Coalition led the way, prompting a back-and-forth that mainly shored up the respective trenches. Other observers weighed in. The National Endowment for the Arts Office of Research and Analysis mined some more data, some of it provocative. (The final graph in that report, showing, via Bureau of Economic Analysis data on capital investments, the decline in real investment in new music, is like a flash-card summary of the tyranny of the back catalog.)

I don’t feel the need to sift through all that data again. But I did start thinking about the data itself, the fact of it. Maybe Johnson’s article wasn’t the bellwether for the coming of Big Data to music, but it certainly was part of the flock. Data-driven analysis has seeped into every corner of the musical ecosystem, beyond arguments for (or against) increased opportunities for individual musicians. Streaming services, online retailers, social media communities—all are crunching reams of data and creating reams more, all the time. Our relationship with data has changed profoundly. Even the word itself hints at how much: it turned from plural to singular. (As a linguistic descriptivist, I find meaning in that.) Maybe we should step back, and figure out how to deal with that going forward.

So this will be a philosophical tale about data. As befits a philosophical tale, it will also be a cautionary tale. As befits a cautionary tale, it will include visits from three ghosts. There is, unfortunately, no neat moral at the end. But there will be the start of a framework for answering the question: what do you do?

*

Michel de Montaigne (1533-1592). Engraved by C.E.Wagstaff and published in The Gallery Of Portraits With Memoirs encyclopedia, United Kingdom, 1833.

Two ghosts to start: first, Michel de Montaigne, the 16th-century nobleman and bureaucrat who, in his spare time and a long retirement, pretty much invented the essay, assembling his everyday observations and close-read experiences into a volume that, upon publication, was nearly immediately recognized as a classic of humanist thought. And then, from the succeeding generation, René Descartes, the father of modern Western philosophy, who retreated into his own mind (cogito ergo sum, after all) to search for fundamental truths, and who thought that Montaigne’s way of thinking was intellectually irresponsible to a positively diabolical extent.

The source of Descartes’s discomfiture was Montaigne’s cheerful espousals of a very old philosophy: skepticism, in a version that went well beyond mere Devil’s advocacy (Descartes’s suspicions notwithstanding). In Montaigne’s lifetime, French intellectual life had been marked by a fashion for schools of ancient philosophy that, beyond pursuing insight, offered designs for living: Stoicism, Epicureanism, and Skepticism. The last of these cultivated a habit of questioning everything, admitting nothing, subjecting even the most seemingly obvious statement to a barrage of sabotaging logic and rhetoric. Its most famous exponent, the 2nd-century thinker Sextus Empiricus, worked his way through the liberal and scientific arts, demonstrating how none of them (music included) could even be proven to exist.

It sounds like a game, a mental exercise. It is. Epokhē, the Skeptics called it, a suspension of judgement, a constant refusal to succumb to certainty. Get good enough at it, the Skeptics thought, and you could will yourself into a state of ataraxia, tranquility, mindfulness, open to experience rather than trying to frustratingly box it into categorical truths.

In Montaigne, Skepticism inspired a radical if puckish empathy. One of his more tangential but revealing enthusiasms is for stories about animals behaving in clever or vaguely human ways. Another classical Skeptic, Aenesidemus, formulated a defense of epokhē in the form of a chain of ten tropes; Montaigne seems to have especially taken to heart the first: “Different animals manifest different modes of perception.” If animals have a way of experiencing the world, an inner life, that we have so little access to, how can we possibly say that our way of experiencing the world is the only valid one? In Montaigne’s famous formulation: “When I play with my cat who knows whether I do not make her more sport than she makes me?”

René Descartes (1596-1650). Engraved by W.Holl and published in The Gallery Of Portraits With Memoirs encyclopedia, United Kingdom, 1833.

Skepticism drove Montaigne’s perception outward; it drove Descartes’s inward. “I think, therefore I am” was Descartes’s implicit shot across Montaigne’s ruminative bow, fencing off human reason as exceptional. He started with the same sally as Montaigne (question everything) but, where Montaigne and his classical forebears took that as an everyday attitude, Descartes took it as a prompt to answer everything as well, as he was determined to do. (In her excellent biography of Montaigne, How to Live, Sarah Bakewell puts it like this: “Trying to get away from Skepticism, [Descartes] stretched it to a hitherto unimaginable length, as one might pull a strand of gum stuck to one’s shoe.”)

That first answer, about thinking and being, was Descartes’s base camp. And he immediately questioned it: how do I know this to be true? Well, there was nothing inherent to I think, therefore I am that demonstrated its truth, except for the fact that it was so clearly true to Descartes. And, with that, he began climbing into thinner and thinner air:

I concluded that I could perhaps take, as a general rule, that all the things which we very clearly and distinctly conceive are true.

All the things which we very clearly and distinctly conceive are true.

Whatever happened to “show your work”?

*

In turning back to the data, one might well adopt Montaigne’s motto: Que sais-je? What do I know? And it doesn’t take much effort to reach a Montaigne-like conclusion, a feeling that the cat is playing with us as much as we are playing with the cat. But that’s a trap, too.

For me, the most interesting hole poked in Johnson’s article had to do with some figures Johnson gleaned from the Bureau of Labor Statistics’s Occupational Employment Statistics (OES), which derive from a yearly survey of some 800 occupational categories. Johnson:

According to the O.E.S., in 1999 there were nearly 53,000 Americans who considered their primary occupation to be that of a musician, a music director or a composer; in 2014, more than 60,000 people were employed writing, singing or playing music. That’s a rise of 15 percent, compared with overall job-­market growth during that period of about 6 percent.

That’s a pretty clear trend, no? But the BLS cautions against such year-to-year comparisons of OES data, and with good reason. A New Zealand statistician named Thomas Lumley poked into those figures and found that the 15 percent increase could almost entirely be attributed to an increase in the “Music Directors and Composers” category; beginning sometime around 2009, approximately 15,000 primary and secondary schoolteachers that weren’t previously being counted as music directors suddenly were. Take out that influx, and Johnson’s upswing turns into a decline.
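The arithmetic behind that reversal can be sketched in a few lines. The figures below are the rounded ones quoted above ("nearly 53,000", "more than 60,000", "approximately 15,000"), so the results only approximate the published 15 percent; subtracting the whole influx from the 2014 total is also a simplifying assumption of this sketch, not Lumley's actual method.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# Rounded figures as quoted in the article, not exact OES counts.
MUSICIANS_1999 = 53_000
MUSICIANS_2014 = 60_000
# Approximate number of schoolteachers newly counted as music directors.
RECLASSIFIED = 15_000

reported = percent_change(MUSICIANS_1999, MUSICIANS_2014)
# Assumption: remove the entire influx from the 2014 total.
adjusted = percent_change(MUSICIANS_1999, MUSICIANS_2014 - RECLASSIFIED)

print(f"reported change: {reported:+.1f}%")
print(f"adjusted change: {adjusted:+.1f}%")
```

With these rounded inputs the reported change comes out positive and the adjusted change negative, which is the whole point: a single reclassification is enough to flip the sign of the trend.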

I got curious about that tweak, so I emailed the Bureau of Labor Statistics about it. I was hoping that it was some straightforward change in methodology, one that might say something about how, at least from the standpoint of the state, the dominant idea of a “musician” was evolving. Nope—in their message back, the OES Information Desk chalked it up to the law of unintended consequences:

In particular, in 2010 and 2011, the OES program implemented the revised 2010 version of the federal Standard Occupational Classification (SOC) system. As part of the 2010 SOC revision, the word “band” was added to the occupational description for music directors and composers. This revision was not intended to change the occupation’s content, since “band” was implied to be part of the previous definition for this occupation also. However, the addition of the word “band” and the inclusion of this occupation on the OES survey form sent to elementary and secondary schools may have caused a shift in the number of workers reported as music directors and composers rather than as teachers.

I love this. The addition of one innocuous word to the description managed to extend the fog forward and backward in time. There’s no way to tell how many band directors did get added, didn’t get added, should have been added, should have been in the category already. It brings us, full circle, back to Montaigne: the more you know, the less you know.

At this point, we might respond with a common trope: the data, we would say, is unreliable. But, really, the data is just the data. The BLS asked a question and got an answer; they asked a slightly different question and got a slightly different answer. They’re not pretending that it’s anything other than that; it’s why they specifically warn against making the kind of comparisons that Johnson made. But we, Cartesian children all, can’t resist. Johnson saw the pattern and judged it true. The Future of Music Coalition and Thomas Lumley saw a different pattern, and they did the same thing. Certainly, you can think that one interpretation is more plausible than the other, that one is closer to the truth. I know what I think. (I think it’s the latter.) And yet, at the same time, there’s Montaigne in my head saying, sure, that’s what you think—but what, exactly, do you know?

It’s not the data that’s unreliable; it’s the clarity. And when it comes to trying to figure out music, that’s a bit of a problem.

*

The problem was neatly framed by a third ghost: Louis Althusser (1918-1990), the Marxist philosopher and theorist. Althusser was a troublesome character, philosophically and otherwise. For all his insistence that he was a classical Marxist, his interpretation of Marx was rather unorthodox—and, to other scholars in the field, highly suspicious. He was unstable, going through periods of mental distress; in 1980, he killed his wife, strangling her in their apartment at the École normale supérieure in Paris, escaping prosecution by being judged to have been temporarily insane. (He described the incident with sophistic frankness in a posthumously published memoir, in which he also admitted that he hadn’t actually read all that much Marx.) His writing is pervaded by a kind of brittlely incisive gloom.

His most famous theoretical contribution—his analysis of ideology, from his essay “Ideology and Ideological State Apparatuses (Notes towards an Investigation),” first published in 1970—is a good example of how grim his philosophy could be. Althusser presents ideology as so omnipresent in society and time, without history, pinning people into identities even before birth, as to make one wonder how any ideology could ever be subverted, or superseded, or even simply adjusted. It is almost helplessly deterministic, to the point that its relationship to actually lived life starts to seem not just counterintuitive, but disconnected.

So why bring him up? Because Althusser had a real skill, almost a sixth sense, for identifying points of tension. And the point of tension at which he builds his theory of ideology is exactly the point at which the competing priorities of data-driven analysis and music collide.

One of the big ideas in Althusser’s essay is interpellation: how ideologies call out individuals as subject to those ideologies, and how individuals respond.

[I]deology “acts” or “functions” in such a way that it “recruits” subjects among the individuals (it recruits them all), or “transforms” the individuals into subjects (it transforms them all) by that very precise operation which I have called interpellation or hailing, and which can be imagined along the lines of the most commonplace everyday police (or other) hailing: “Hey, you there!”

Assuming that the theoretical scene I have imagined takes place in the street, the hailed individual will turn round. By this mere one-hundred-and-eighty-degree physical conversion, he becomes a subject. Why? Because he has recognized that the hail was “really” addressed to him, and that “it was really him who was hailed” (and not someone else).

Althusser presents his example as a sequence of events, but actually, “these things happen without any succession,” he writes. “The existence of ideology and the hailing or interpellation of individuals as subjects are one and the same thing.” So this thicket of scare quotes marks off another of Althusser’s inescapable prisons: if an ideology exists, not only will it interpellate you as subject to it, it already has.

Setting aside the turtles-all-the-way-down aspect of Althusser’s idea of ideology, interpellation is a useful way to think about the way we talk about jobs and occupations. The OES data, for instance, interpellates you, the musician, as a musician, but subject to the terms of the ideology behind the collection of OES data. The various ideologies that pervade society—free market ideologies, hangover-Calvinist ideologies, up-by-your-bootstraps-self-sufficient ideologies—are interpellating you all the time. Artists and musicians, especially in less-dominant stylistic modes, run into this all the time: think about a phrase like “doing what you love,” which so often interpellates artists. Yes, we do what we love, which, as subjects of free-market ideology, calls us out as people who shouldn’t expect to make as much money as other people who do what the free market loves. (It’s no wonder that there’s a movement in radical labor circles dedicated to “counter-interpellation,” essentially re-framing and re-naming worker-subjects in terms suited to more worker-friendly ideologies.)

But Althusser goes further. He wants to know why and how interpellation happens. So he takes a look at one of the bigger ideologies out there: Christianity. The Christian religious ideology calls out an individual, “in order to tell you that God exists and that you are answerable to Him.” The ideology is the voice by which God addresses you (through scripture and its interpretation). The ideology tells you who you are, your place in the world, your duties. Do what the ideology tells you and you will be saved. And so on.

“Now this is quite a familiar and banal discourse,” Althusser writes, “but at the same time quite a surprising one.” Why? Because the ideology is addressing individuals, interpellating individual subjects, but only “on the absolute condition that there is a Unique, Absolute, Other Subject, i.e. God.” There are big-S Subjects (ideologies) and little-s subjects (individuals), and it’s the gap between them that makes interpellation work. The big-S Subject interpellates the little-s subject such that, not only is the little-s subject inescapably linked to that identity, but the little-s subject can contemplate the big-S Subject in his or her own image, such that the ideology doesn’t seem imposed, or constructed, but just “the way things are.” Ideology ensures that, in Althusser’s words, “everything really is so, and that on condition that the subjects recognize what they are and behave accordingly, everything will be all right: Amen— ‘So be it’.”

Responding to the Future of Music Coalition’s first round of objections, Johnson left a long comment that included both of these statements:

We made a decision to focus the piece on the artists, not the ecosystem around the artists

and

[W]e wanted to stick with our principle of not relying on individual anecdotes, and report only broader, industry-wide data

—to which one might say, “well, which is it?” But it’s not either-or; it’s Althusser’s little-s subject and big-S Subject working in quintessential lockstep. Johnson wants to make you, the reader, feel better about the plight of individual artists in an era of technological optimism, and he wants to do it by analyzing large-scale, collective statistics. Does that work? Sure—as long as you’re convinced that the statistics reflect back the image of the individual artist. The artist is the subject. Data is the ideology.

*

So what do you do? Ignore the data? That seems extreme. Data-driven analysis might be an ideology, but it’s a rationally based one. And I, for one, like rational belief systems. They tend to be more useful than the alternatives. They tend to discredit a lot of opinions and behaviors that I find offensive, or unfair, or damaging. But even a rational belief system is still a belief, a faith—something the rationality of the system tends to obfuscate. Not only does that make it easy to fall into Descartes’s clarity-equals-truth trap, it’s easy for that seeming truth to subtly shift from one category to another, to jump the tracks.

Take economics, for instance, the most data-driven of social sciences. If you ask exactly what it is economists do, the best answer might be: they try and design mathematical models that return data matching that generated by real-world situations involving money and material goods and decisions and consumer behavior. But that is not quite the same thing as describing the behavior itself—a distinction that a lot of people (economists included) fail to make a lot of the time. And the models assume a level of coherence (rational actors, rational decisions, market efficiency) only sometimes (if ever) found in the actual world.

Descartes might have thought twice about that clarity thing: after all, his first book was a survey of music theory—Musicae compendium, written in 1618, published (posthumously) in 1650. And, on the very first page, Descartes wrote this (as translated by Thomas Harper in 1653):

For songs may bee made dolefull and delightfull at once; nor is it strange that two divers effects should result from this one cause, since thus Elegiographers and Tragoedians please their Auditors so much the more, by how much the more griefe they excite in them.

Music, at its core, is not a rational art. And yet its creation now necessarily happens within systems and societal frameworks evermore marked off, framed, and otherwise governed by the self-proclaimed rationality of Big Data. Sometimes the meeting will be useful; sometimes it will not. But it will always be a meeting of two fundamentally divergent belief systems. It’s not a matter of collecting more data, or better data, or finding a more sophisticated analysis of that data. The best you can hope for is ecumenical cooperation.

Montaigne would have responded to that uncertainty the way he responded to all uncertainties:

Every one is well or ill at ease, according as he so finds himself; not he whom the world believes, but he who believes himself to be so, is content; and in this alone belief gives itself being and reality. Fortune does us neither good nor hurt; she only presents us the matter and the seed, which our soul, more powerful than she, turns and applies as she best pleases

That sort of attitude is easier said than done, even with a knack for epokhē. But it’s the start of a corrective against the anxiety of data, the illusion of and need for exact, singular answers to big questions. Data requires interpretation; so do notes. Analysis is performance; performance is analysis. The application of a musical soul can make as much sense of fortune as the sort of a spreadsheet. Sure, that’s just a belief. But that, it turns out, is what we do.

Classical Music Has Open Data Sets?

Back in October, the Washington Post offered some blog coverage of Suby Raman’s first big data and the arts post. While the dramatic headline made it sound like Raman had written a death knell for opera, what he’d actually done was take reams of detailed data about the Metropolitan Opera’s performances and analyze them for trends. Because he’s a composer, he knew what to look for in the data and what would matter to people. Because he’s a programmer, he knew how to handle the big data set itself.
In that data, he found a lot of really interesting stories to tell–not about the death of opera itself, but about how the Metropolitan Opera has been producing more and more works by dead composers. As the repertoire has solidified, the average age of a composition has gone pretty far up. At New Music USA, we find that less than awesome, and we’re grateful for the clear explanation, complete with charts and graphs and labelled axes. (The labelled axes are incredibly important.)

The first of 10 charts created by Suby Raman based on data culled from Metropolitan Opera performances over the past century.

Some of the criticism lobbed at Raman, mostly in various comment threads, attacked him for looking at only this one opera company instead of the entire field. That position completely misses the reality of what data science in the arts is actually like. Suby Raman didn’t pick the Metropolitan Opera out of a hat, try to say it’s a representative opera company, or argue that it’s more important than any other house. He picked it because they’re the ones who released all their historical data in a neat and usable package. Here’s his own disclaimer:

About the data: data was acquired from the Met Opera Database, in a timeframe from 1905 to present. One “performance” is a night of an identifiable opera performance. Opera performances data was scraped from the HTML and matched up to scraped composer/opera data from Wikipedia. The process of scraping/matching may have introduced some error.
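For the curious, here is a rough sketch of what that kind of matching might look like in Python. It is not Raman's code: the sample rows, the `works` lookup table, and the function name are all made up for illustration, and the only real facts in it are the premiere years of the two operas. It joins scraped performance rows to a table of works, computes the average age of the works performed each year, and keeps a list of titles that failed to match, which is where the "some error" in the disclaimer would creep in.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (performance year, opera title as scraped).
performances = [
    (1910, "La Bohème"),
    (1910, "Aida"),
    (2010, "La Bohème"),
    (2010, "Aida"),
    (2010, "Unknown Title"),  # a title that fails to match
]

# Hypothetical lookup standing in for the scraped composer/opera data.
# Premiere years: Aida (1871), La Bohème (1896).
works = {
    "Aida": {"composer": "Verdi", "premiere": 1871},
    "La Bohème": {"composer": "Puccini", "premiere": 1896},
}


def average_work_age_by_year(performances, works):
    """Return ({year: mean age of works performed}, [unmatched titles])."""
    ages = defaultdict(list)
    unmatched = []
    for year, title in performances:
        work = works.get(title)
        if work is None:
            # Keep track of rows the matching step could not resolve.
            unmatched.append(title)
            continue
        ages[year].append(year - work["premiere"])
    averages = {year: mean(vals) for year, vals in sorted(ages.items())}
    return averages, unmatched


averages, unmatched = average_work_age_by_year(performances, works)
print(averages)
print(unmatched)
```

Reporting the unmatched titles alongside the averages is a deliberate choice here: it makes the size of the matching error visible instead of silently dropping rows.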

Data may be big and getting bigger, but it’s not exactly thick on the ground in the performing arts. There is no IMDB for string quartets, composers, ballets, or even plays (though in theater there are some who are trying to make one). Where clear data sets exist, there’s a lot we can learn from them, and we should definitely encourage our big institutions to make more of their data transparently available to the public. A better headline for that Washington Post piece might have been “Holy crap, even the Metropolitan Opera is opening up their data?”
While the data available for analysis are limited in scope, they are still valuable, and Suby Raman did some great science by being clear about the limitations of his method, and about how more could be done with more information.
Plus he used a lot of animated gifs in that blog post. Who doesn’t love a good animated gif?
If you want more evidence of Raman’s competence to analyze data in the performing arts, check out the disclaimer text from his most recent blog post. This one’s on the gender diversity of major American orchestras:

Note: Dataset is 1,833 unique orchestra performers, taken from current orchestra roster pages. “Laureate” and “Emeritus” musicians were not included. “Top 20 Orchestras” was defined as the top 20 orchestras ranked by base salary. Librarians and other personnel were not included; you guys are fantastic, but this just examines musicians. I will post the dataset to GitHub in a few days, check here for the link.
Because of variances in doubling vs. unique performers, “ancillary” instruments like piccolo, English horn, etc. were included in their more common section (flute, oboe, etc).
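The folding rule in that note is easy to picture in code. The following is a minimal sketch under stated assumptions, not Raman's actual method: the roster rows are invented, the mapping covers only the two examples he names (piccolo and English horn), and `section_counts` is a hypothetical helper. It folds each listed instrument into its more common section and counts each performer once per section, so a player listed under both flute and piccolo is not counted twice.

```python
from collections import Counter

# Mapping from "ancillary" instruments to their more common section.
# Only the two examples from the note are included; a real table would be longer.
SECTION_OF = {"piccolo": "flute", "english horn": "oboe"}

# Hypothetical roster rows: (performer name, instrument as listed).
roster = [
    ("Player A", "Flute"),
    ("Player A", "Piccolo"),  # doubles; should count once under flute
    ("Player B", "Piccolo"),
    ("Player C", "English Horn"),
    ("Player D", "Oboe"),
]


def section_counts(roster):
    """Count unique performers per section after folding ancillary instruments."""
    seen = set()
    for name, instrument in roster:
        key = instrument.strip().lower()
        section = SECTION_OF.get(key, key)  # unmapped instruments keep their own name
        seen.add((name, section))
    return Counter(section for _, section in seen)


counts = section_counts(roster)
print(counts)
```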

He’s brought both a good knowledge of data scraping and a good knowledge of the inner workings of orchestras to the job. I can think of few data scientists with the music background to let them do this. My one quibble, and it is not insignificant, is that there isn’t anything in here about how the gender of each player was determined. Was it by name? By appearance? It probably wasn’t by self-identification, and there do seem to be only two kinds of gender in this analysis, which is its own problem. But the stories Raman is trying to tell here are also clear, also compelling, and also a strong, evidence-based call for change. One particularly good point, revealed by proper analysis like this, is how many principal flutists are men, despite most flutists in the study’s orchestras being women. Wow. This post does only have one animated gif, though.
Did I mention that Raman also writes music?