Big Data Only Gets You So Far, or Why Social Science Is Really Hard
Big data is everywhere, and there’s a new generation of arts researchers and policy mavens trying to use it to understand the arts economy. Given that the arts economy has been constantly exploding for the last fifteen years with no end in sight, I’d say that’s a noble goal. However, there are a lot of limitations to a data-based understanding of anything complex. And given that the arts are famously unquantifiable, it’s no surprise that those limitations are particularly severe in our field.
One ongoing effort is the National Center for Arts Research, based at Southern Methodist University. They’ve worked with a variety of sources to compile aggregate data on non-profit arts institutions, and they’ve recently published their first round of results. While their findings are encouraging for the state of the arts economy and their methodology is strong, their project faces some fundamental challenges. The NCAR’s model is seriously constrained by its outside sources of data, both in scope and in the assumptions those primary researchers carry into the NCAR’s model.
The data the NCAR is using come from sources that study non-profit arts organizations; they don’t have data on individual artists, and they don’t have data on for-profit arts institutions like clubs, record companies, or streaming music services.
It’s very important to keep the limits of this kind of research in mind as you read the results and begin to use them in your own decisions. The research is presented on a captivating interactive website and in tools embedded in the website of the Cultural Data Project, designed to help arts organizations appraise their own organizational health.
While this is an engaging and modern way to present research results (and more projects should disseminate their findings this way), the limits of the research itself are basically invisible to the casual observer. Unless you read the fine print, you can come away thinking you’ve seen a high-accuracy snapshot of the entire arts economy, when in fact you’ve seen a few rough statistics concerning only non-profit arts organizations.
More fundamentally, NCAR’s work is a meta-analysis, and it’s stuck with data collected by other people. A lot of the databases they use rely on voluntary submission and allow organizations to show themselves in the best possible light. Organizations can only answer the questions asked by the original researchers. So the NCAR is stuck with hard distinctions between ticket buyers, donors, staff, and artists, and can only learn about each through their aggregated interactions with non-profit institutions.
The final analysis has no notion of an individual person, who can exist in any or all of those categories. I’m an administrator, but I’m also a composer, and an audience member, and a donor. It changes with the day, who I’m talking to, and where I’m standing.
The goals of the NCAR involve helping non-profit institutions to understand their own organizational health, and so for them this missing element doesn’t really damage their work. But it does show how far they are from giving a picture of the whole arts ecosystem.
When you play with an interactive data visualization, the limitations of the research are largely hidden. For policy makers, financial advisors, board members, and other researchers looking into the life of the arts, that’s a problem.
If your sources of information can’t see individual artists, you’ll never be able to help them.
The work of the NCAR is very strong, a fantastic start to a phenomenally hard project. It will no doubt produce clearer results as the years go by, and it may spur individual researchers to provide data that fill the gaps in the analysis. In five years’ time, they’ll be much closer to a description of the whole arts economy, especially if they start working with more sources of data, like the Future of Music Coalition’s Musician Revenue Streams project.
There are problems with this sort of research, but the model the NCAR has built is an amazing first step. We finally have a framework with which to compare all kinds of non-profit arts organizations, and some standards against which to measure the complex dimensions of organizational health. Those comparisons may be a little fuzzy, and those benchmarks a little vague, but they are far from useless.
Meta-analysis is, by definition, an apples-to-oranges comparison, and the NCAR had a couple hundred kinds of fruit to compare (to stretch the metaphor). It’s a hard job, but they’re off to a great start.