The Institute for Digital Public Infrastructure

@ UMass Amherst

Reimagining the Internet

Kate Crawford, Atlas of AI

November 25, 2020

Scholar and artist Kate Crawford joins the podcast to talk about why we don't just need to imagine how to fix the internet, but how we want to change society. Kate is a co-founder of the AI Now Institute at NYU and author of Atlas of AI, coming April 2021 from Yale University Press. She walks us through the extractive nature of AI, talks about her collaborations with Vladan Joler (recently acquired by MoMA) and Trevor Paglen, and offers a fascinating history of classification.

View her collaboration with Joler, "Anatomy of an AI System", and read the essay she wrote with Trevor Paglen about their work together, "Excavating AI."

Transcript

Ethan Zuckerman:

Hey, everyone. Welcome to Reimagining The Internet. My name is Ethan Zuckerman. We're here today with my friend Kate Crawford. She's an academic, a researcher, an artist, a leading thinker about AI. Today, she's based in New South Wales in Australia, but generally speaking, she's based at NYU. She's the co-founder of AI Now. She's also a senior principal researcher at Microsoft Research. For 2021, she's the inaugural visiting chair in AI and Justice at the École Normale Supérieure in Paris. She's done an amazing range of research and artistic work, helping people think through AI and the many different human systems that it touches. Kate, it's so great to have you here today.

Kate Crawford:

Oh hi, Ethan. It's so lovely to hear your voice.

Ethan Zuckerman:

Well, we are asking some of our favorite people in the technology field a really simple question, but it seems to lead people in interesting directions. And the question is basically this, it's 2020, there's a whole lot wrong with the world. There seems to be a whole lot wrong with the internet, as well. What's wrong with the internet and how would you propose fixing it, Kate?

Kate Crawford:

So one of the things that I know you and I have talked a lot about over the years is really that type of technological solutionism that really was the glow around the internet for such a long time. The idea that this was a network that would transform humanity, that would bring us to our best selves, and that would also open up at a very sort of profound level, new cultural channels, new ways of being, new political opportunities.

And one of the things that I think is really interesting in the way that I think about sort of the last 40 years of thought around those questions is it tends to fall into these really sometimes quite problematic dualisms. And I think to some degree, we have another one of those now. We're in a moment where there's sort of technological utopianism and technological dystopianism. I think both of them are equally problematic because again, they're putting technology, in many cases the internet, at the center. We're so used to telling the story as though the internet is the primary actor when it truly is not. So I think we could really start there. That's certainly one of the things that I think we could do: to begin to tell the story differently. Focusing on the internet as the primary actor, as the thing that is either the solution or the thing we need to fix, might be something we could reconsider, and we could actually try decentering it to tell the story differently.

Ethan Zuckerman:

It's a great way of starting a conversation to challenge the question behind it, and you're absolutely welcome to do it. Let me just unpack for sort of half a moment here, technological solutionism: this idea that if we keep building new innovative technologies, we will solve deep-seated human problems is something that has now become quite unfashionable. We've had critiques from every side sort of pointing out the limitations of a tech-centric view of the world. And you're certainly right that this question of how do we fix the internet has a certain amount of utopianism sort of based in it. You want to talk more about power relationships. How do power relationships show themselves within the work that you're doing around artificial intelligence?

Kate Crawford:

Honestly, AI is about power all the way down. It's one of these systems where, again, I think we can have a tendency to focus on the manifestations of artificial intelligence that we see every day. We might think about things like Siri and Alexa. We might think about autonomous vehicles. We might think about Google search. But what's really going on with AI, I think is a lot deeper, which is that it's really essentially a type of an extractive industry that is profoundly connected to mining and extraction from the resources to hidden forms of labor throughout the supply chain and of course, to forms of data extraction.

So the way that I sort of tend to think about artificial intelligence and power is through this almost exploded diagram. How do we think of the bigger picture of the systems of power in which AI is embedded? And also Ethan, I think we can do that with the internet. So, if there is a really positive contribution we can make to thinking about the internet and how we, quote unquote, fix it, part of the first sort of step would be changing the lens of analysis in terms of what we think the internet is and where it is based. And to start to think about it more as a complex system in which we are really embedding a whole lot of other social institutions, ideas and capacities and affordances. That kind of broadening of the lens, I think, is one of the most powerful steps that we can take to change the way we understand what the internet is and what can be done about it.

Ethan Zuckerman:

Can I get you to talk a little bit about anatomy of an AI system and what were you trying to visualize in that work? Does that connect to this sort of broadening of the scope that you're talking about?

Kate Crawford:

Absolutely. It's a really good example of precisely this sort of turn to thinking about broader infrastructures and power. So, that project started actually at a retreat that I was at with Vladan Joler, the extraordinary sort of artist and information visualizer. And we were talking about voice-enabled artificial intelligence, so things like Siri and Cortana, Alexa, and we thought about how would you actually think about what goes into making one of these systems? And so initially we had a small piece of paper and we were looking at a system like Amazon's Echo. So you have this little sort of cylinder device that sort of sits on your kitchen bench. What are the components inside it? Then how does it work? How does it relay data back into data centers? And then how do they analyze voices in order to respond to voice commands, like, "Alexa, what's the weather today?"

But in doing that, we realized that just telling that narrow slice of the story really kind of occludes what's actually going on to make a system like that work. So we started to get bigger and bigger pieces of paper. Honestly, we started to get these giant pieces of butcher's paper. We had 10 of them at one point, trying to go all the way back into what has to be mined to make one of these things work, to make data centers work. What energy requirements go into it? What about smelting? What about distribution? What about logistics? What about container shipping? And then all of the elements that go into the way in which these gigantic data systems have to work, all the way through, in fact, to the end of life of these systems, where these devices get thrown out and they end up in e-waste tips in places like Pakistan and Ghana.

So we did what you might consider, I guess, a full life cycle analysis of a single Amazon Echo unit: how they get created, the birth, the life, how they get used, and what happens to those data exchanges when we speak to these devices, all the way through to their death, sitting in these giant e-waste tips. And by doing that, it really transformed my thinking, Ethan, I have to say, toward a relationship to technology which first and foremost is to not be invested purely in the part of the system that you hold in your hand or that you engage with, or that is visually designed for you, but to look at these more sort of non-human and human together infrastructures that make them work. And by doing that, I think we can ask a very different question, which is: what are the real costs of these systems? Often these systems feel as though they're free, and of course this deeply relates to internet discourse of the 1990s and early 2000s, the idea that so many things could be free, and freedom in all of the senses of that word. But in fact with artificial intelligence, we have the same problem emerging, that we tend to think of these systems as somehow being costless to us. But in fact, they're extremely costly, and they're costly across these three vectors of the earth, of data and of labor.

Ethan Zuckerman:

But I'm really struck at your interest in sort of extraction and both the extractive industries necessary to produce these technologies, but also this notion of data as something extracted and labor as something extracted. Talk a bit more about AI as an extractive industry. Talk about what you're thinking about as far as extracting labor and data out of people to make these systems possible.

Kate Crawford:

Well, this is really something that I've just completed a book about, in fact, called Atlas of AI, which is coming out early next year. And what I did was to really understand what is called, in the sort of theory circles that study AI, extractivism as a concept, and to really understand how extractivism is one of these kind of core motivating factors around how AI works.

I actually started by going to the places where the sorts of mineralogical components of large-scale computation are mined, so really starting at the earth layer and understanding how energy is extracted, how minerals are extracted, and how much of that is needed to make AI work. It's truly quite staggering. And there are lots of easy figures that we can talk about: the fact that pre-COVID, data centers were already issuing more carbon into the atmosphere than the airline industry in total. Of course, I imagine those figures are probably quite different now under COVID. It's probably considerably more than the airline industry, but we are looking at very real environmental costs from the sorts of large-scale computation that artificial intelligence requires.

And then I started looking at the labor components, and here there are many scholars who have done extraordinary studies. I'm thinking of Mary Gray and Sidd Suri's Ghost Work. I'm thinking of Lilly Irani's work. But really looking at the many types of labor along the pipeline. Now, those scholars have looked at things like crowd work or click work, the people who are really making artificial intelligence systems appear intelligent. So, this is often extremely underpaid work. It's sort of digital piecework, really. And then I start from there and look further at all of the kinds of labor that are needed to make artificial intelligence work. And in many cases, in highly exploitative industries like mining.

And then finally, I really look at data. As you know, to make artificial intelligence work, one of the things that's been really key has been creating large-scale training data sets. Now, where do they come from? Primarily, they come from the internet. They come from scraping all of those pictures that we've been putting up of our families, of our friends, of fun celebrations and holidays. They all go into these big training sets, which are used essentially to train artificial intelligence systems how to see the world, how to recognize faces, objects, et cetera. So the internet itself became a mine to create the way that artificial intelligence sees.

So by doing all of this, what really became clear to me is that artificial intelligence is neither artificial nor intelligent. It is an extremely material technology and one that has profound skews in terms of how it interprets and analyzes the world. So seeing the forms of extraction all along the pipeline was really illuminating for me, to think about how we need to come to understand the way that artificial intelligence works and how we might think about making it work differently.

Ethan Zuckerman:

But Kate, you're now talking about this intriguing idea that maybe we could have a less extractive AI. What does a less extractive AI look like?

Kate Crawford:

Take ImageNet. So ImageNet is probably one of the most well-known training datasets in the world. It was created by researchers at Stanford and Princeton. It contains over 14 million images that were scraped from the internet. It contains pictures of cats, of birds, of trees, of cars, of people, and it has been used essentially in so many cases to really drive computer vision as we know it today. But what hasn't happened so much is really opening up those datasets to look at what's in them.

So, one of the projects I did with the artist Trevor Paglen was to really spend over two years studying just one of the categories in ImageNet, which is the people category, which is where all of the people were put into various classificatory categories, like nurse or doctor or Boy Scout or Sumo wrestler. But there are also stranger categories that sort of move into more complex terrain, like debtor or microeconomist. What does somebody look like if their bank account is in the red? But then the deeper we dug, the more we found really troubling categories, really disturbing terms like kleptomaniac or alcoholic or bad person or loser. And there are people's faces that have been scraped off the internet, put in these categories and labeled in these ways. And of course, some very deeply racist and sexist terms that I won't be using on your podcast.

Ethan Zuckerman:

Just to explain for half a second, ImageNet is descended from a project called WordNet and WordNet is an attempt to create a hierarchical organization of all the words in the English language. And then ImageNet tried to illustrate WordNet, only the nouns, they didn't do verbs yet, but illustrate all the nouns in WordNet with associated images, which is in part how you end up with things like kleptomaniac.

Kate Crawford:

And what's interesting too, is that this is like a family tree that goes further and further back. So WordNet is also based on other training corpuses, like the Brown Corpus. So this is what's so interesting about artificial intelligence: you will see that it's sort of turtles all the way down, in that so many of these training sets actually draw on the training sets that came before them, on different taxonomies, and so these taxonomies get imported over decades and decades. And in many cases, people don't necessarily stop and look really hard at it at all: why do we have this category, or why are we using this word, or where did it come from?

​ So, one of the things that I've been spending a lot of time doing is tracing those genealogies, looking at where these terms, concepts and ideas came from, because in many cases, those are epistemological foundations, those ideas of knowledge building in artificial intelligence are quite shaky. They're built on these very shaky foundations. And when we really look at them, we find that their meanings are very unstable, but that instability in many cases is doing a sort of work. And what concerns me and why when we began our conversation, we were thinking about how artificial intelligence changes how we understand ourselves and understand other people is that those acts of classification, if you will, those moments of labeling and putting people into categories can actually be quite dangerous as well. And that's a much bigger problem than just AI and bias, which I know is something that you and I have talked a lot about and it gets a lot of column inches in the world. This is a really fundamental question around how these tools are actually classifying us, the things that we do and what has value in the world. So that is really the core issue.

​ And I think, to get back to your question, Ethan, around the internet itself, the internet is one part of these sort of infrastructures that are sort of both input infrastructures that are taking lots of data from us. We give them lots of information and insights into ourselves, but also output structures. They're also representing us. And this is where I think AI has a big role to play and where we need a sort of a much bigger understanding of what is going on. And I think we get there by this type of forensic analysis of the systems themselves.

Ethan Zuckerman:

Is it the categories that are the problem, or is it the need to categorize that's the problem? So much of what we're using AI to do right now is to sort the world for us. Is it the sorting that's the problem, or is it the heritage of the categories that we've inherited and haven't examined that's the problem?

Kate Crawford:

I love this question. We've always used systems of classification to try and sort the world for us. I'm thinking of course of Linnaeus and the extraordinary sort of 18th century encyclopedias that tried to capture the entire world. I think that's a long established impulse, certainly since the Enlightenment, but you could go much further back. You could go back to Aristotle, in fact. So I think that sort of classificatory impulse has an extremely long history.

I think what we're really doing here, though, is thinking about what are the politics of classification. And if we assume that somehow these technologies are doing neutral or objective work by simply classifying the world, then we're deluding ourselves. And in actual fact, we're walking blindly into a much bigger problem, which is that the world is infinitely complex. And by making these decisions of how we put it into bounding boxes, we're actually doing a very political act. And by not examining those politics, we actually create, I think, extreme problems for ourselves down the track. So, I think really what we're talking about here is: what are the politics of classification? How can we actually interrogate those politics, and then ask different questions around what technology is supposed to do?

Let me put it to you this way. One of the responses, particularly in critiques of the internet, and similarly in critiques of AI, is: if these things are problems, how do we make better AI, or how do we make a better internet? That's the natural sort of, well, the difficulty here. It takes us back to that extraordinary line from Audre Lorde, that the master's tools will never dismantle the master's house. And this is particularly vivid when we look at artificial intelligence. At the moment, there really are only a handful of companies, you could quibble about which ones in particular, but really somewhere between four and five companies, that own the backbone of artificial intelligence, the large-scale planetary backbones onto which everything backs. I'm thinking here of things like Amazon's AWS, Google's Cloud, Microsoft's Azure. Alibaba has a sort of similar backbone.

Ethan Zuckerman:

Enormous cloud computing resources that are really necessary to build these systems.

Kate Crawford:

Yes. Precisely. We are talking about really a handful of these. Now, whatever you do with artificial intelligence, you are backing onto these gigantic backbones. So you are, again, centralizing power in very few hands. I mean, it's hard to think of a more centralized industry. You really have to go back to the railroads, the early days of the railroads, to see so few hands really deciding the fate of large-scale infrastructures. So that is the large-scale question we have to ask.

The second question we have to ask is about how these systems by their nature reduce, classify and discriminate. Certainly machine learning in particular is a technology that is trained on making discriminations: is this different from that? How do we understand this as being different from another thing? If these are the technologies that we're premised on, why do we think of those as somehow the only way to get to greater forms of justice?

​ And this is why I think, again, de-centering technology as the only path is actually really important here, instead of asking, how do we make AI better, we should be asking, how do we make the world more just? How do we look at our systems of criminal justice, of housing, of education and healthcare, and try to address those core questions rather than centering, how does AI fix those problems for us? So this de-centering, I think, is really powerful because if we put the focus more on these questions of justice, on these questions of what kind of world do we want to live in, we approach these conversations differently. We start to say, let's look at the situations in which we're embedded, rather than the technologies as being the only way.

Ethan Zuckerman:

One of the maybe most controversial AI systems that people have been looking at lately is an expert system, essentially, a set of rules that recommends whether or not someone should receive bail when they've been charged with a crime. And this system has been interrogated by Julia Angwin and her team over at The Markup. There seemed to be some very clear racial biases associated with it. My student, Chelsea Barabas, has written really brilliantly about this larger problem of trying to outsource justice to algorithms. And in many ways, what I hear you essentially saying is that these questions of justice can't be technological questions. They have to be much broader social questions. Point well-taken.

Going back to the opening question that I'm now sort of regretting asking you: how should I open these conversations with people, Kate? Because what I'm really trying to do is get over what to me feels like a tendency, particularly around social media, where it's very easy for us to sit there and say, Facebook's terrible, Twitter is terrible, we don't trust these companies, but it's often hard to progress that dialogue beyond the critique. And the critique's enormously important, but part of what I want people to do is start envisioning systems that work very differently. And to get to that envisioning, in some ways we have to sort of get the critique out, let's acknowledge it, and sort of go forward. What I'm inspired by in this conversation with you is sort of saying: once we really think about the historical roots of AI, the assumptions that we've brought into it, once we do this work of challenging the objectivity, that sort of assertion of objectivity that comes from "the algorithm decided this for us," then where do we go next?

Kate Crawford:

Well, I mean, it's interesting. I'm thinking of that line from the late, great Derrick Bell that Ruha Benjamin just recently reminded me of: that to see the world as it is, we have to imagine it as it could be. So part of what I think is the most important question really is not how you would, in an engineering sense, remake the internet. It's almost: how do we invest in the imaginative capacities of world-building, and where might the internet be part of it and where might it not be part of it? How do we imagine this world differently? How do we imagine the things that we thought Twitter and Facebook were bringing to us, and look with clear eyes at what these systems really do, which is very different to what they're purported to be doing.

​ The question is still a really relevant one, Ethan, but perhaps it's really asking it in not just how would we make the internet better, but what do we expect of the internet? How might we change those expectations and where might we find them in other places? So I think these are still really powerful questions to ask. And I think my provocation to you is really just, what do we put at the center? What is the central object of analysis here and how might we sort of think about the ways in which our imaginations could prompt different sorts of questions and different sorts of centers, if you will?

Ethan Zuckerman:

Kate, I love that shift, and it is a reminder for me that at the end of the day, I'm an engineer. Where I started this work 25 years ago was in trying to figure out how to build online community tools in a very literal sense of actually architecting one of the earliest systems in this space at Tripod.com. I really like the shift that essentially says, what if this isn't an engineering question? What if this is essentially a social question?

Kate Crawford:

And I think that that tendency to put engineering first has been one of the hallmarks of Silicon Valley thinking, as well. And it's done many things, but I think certainly we've reached a point where, to some degree, it's foreclosed our imaginations from asking, perhaps, rather than defining the problem as one of technology, to really think about different kinds of politics. I would tell you these are political questions even more than they are social questions. They're ways of knowing, ways of understanding how societies should be understood, and in that sense, they're inherently democratic questions. And sometimes, engineering is not the most democratic way of being, as you know. It always prioritizes sort of engineering logics and engineers. And certainly, I think we've seen the cost of that. Certainly, we could think about the US election as just one place where these networks have been engineered to spike particular forms of the most extreme content, to drive debates into their most kind of polarized forms. These came from ad-driven logics that were created by well-intentioned engineers who weren't necessarily thinking about political questions or democratic questions first. So I do think that one of the things that would be really powerful is, again, moving away from an engineering-first logic and asking: what are the political logics that we're engaging in, and what's getting prioritized there?

Ethan Zuckerman:

Well, Kate, the next time we talk, I'm going to start with that question. What are the political logics that we should be privileging at this moment in decaying democracies, late capitalism, environmental catastrophe, and looming global pandemic? But on that light note, and also now rethinking how I'm opening these conversations with people, I'm incredibly grateful for all the deep thoughts and frameworks you put on the table here. Thank you so much for being with us.

Kate Crawford:

As ever, it's a pleasure, Ethan. Thank you for inviting me.