Professor Harold Stolper Wants to Use Data for Good
By: Aastha Uprety
In September, I reached out to chat with Harold Stolper, a professor who teaches quantitative methods at Columbia University’s School of International and Public Affairs. He incorporates a “data for good” philosophy into his teaching and work, which he describes as quantitative analysis that connects the dots between individuals, communities, and power structures.
Stolper became a lecturer at SIPA this fall, after working as an adjunct professor for five years. During that time, he was also an economist for the Community Service Society of New York, a nonprofit that advocates for low-income New Yorkers. There, he engaged in research and advocacy on labor rights, transit accessibility, and policing.
We spoke over Zoom on a Friday morning. He greeted me from his home, wearing trendy glasses and a smile. We talked for an hour about his philosophy as an economist, the role of data in social justice advocacy, and the dangers of relying too much on data alone.
The following has been edited and condensed for clarity.
What brought you to teaching?
HS: I graduated from SIPA a while ago, and then I went back to get my PhD in Economics in the same building. My PhD program was taught with a strong emphasis on becoming an academic researcher, not a teacher. In my view, the tools we were learning were valuable ways to understand human behavior, the world, and how people in the world are affected by different issues. And I valued human interaction, community, and togetherness as a part of that process.
I didn’t want to just become a researcher because for me, that would feel like an absence of community. The process of research can at times be a lonely and disconnected undertaking. But teaching, on the other hand, allows me to foster an experience of togetherness. It’s sustaining for me. When you’re teaching students, that’s your only goal — your goal is empowerment, not achievement.
What made you choose to get a PhD in Economics?
Oh boy. I don’t have a great answer for that. I was working in transportation economics, and at some point I decided that okay, if I want to keep going and do more rigorous analysis, a PhD in economics seems like the best way to develop the skills to do that.
I wasn’t like, “I want to live and breathe economics, I want to be an economist.” To me, [economics] was more of a set of tools that could be useful for understanding things about policies, the world, and people — which is something that I learned is not how most people approach the PhD process.
It turns out it was a decision to sort of live and breathe something all the time. That was a struggle for me, and something I was not as committed to from the beginning.
Did you feel like your peers had different goals? Did that make a difference?
Yes. I only have one experience in a PhD program of course, but my understanding is that these programs are often taught to train students to become tenure-track academic researchers. It seems like you’re guided toward this path very strongly, so perhaps there was a bit of a disconnect for me in that.
If you know you want to [go into academia], it’s a very competitive environment. One reason I wasn’t very happy during my time [in the program] is that I had no interest in competing. I was more like, let’s try to learn something together.
But the idea of “together” or “community” — I don’t want to say it didn’t exist, because there might be people who did connect over shared competition or academia, but community looks and feels different for me. It was not a word that I ever used while I was there. [laughs]
In class, how do you engage your students and encourage them to be enthusiastic about what they’re learning?
Most people in Quant II want to be there, but while some are enthusiastic, others seem to think, “this is important, so I should be here.” We focus on the foundations, and as important as they are, they aren’t the most exciting thing. So I always try to apply the methods we’re learning to something in the real world that we care about. For example, this week we’ll examine the question, “does police funding reduce crime?”
If someone said, you’re gonna learn all about multiple regressions and interaction terms and quasi-experimental methods, you should take Quant II, you’d be like, “yeah, cool story bro,” and just ignore it. But if I said we can use quantitative methods to understand differences in policing of different communities, you’d maybe think, “this could be cool.”
Without connecting to the world around us, or things that we’re excited about or passionate about, why bother learning these methods? To me, the point is to put them to use to understand these things we care about.
So how can you put data to use in the real world?
I think it’s important to continually ask how we can use data to inform policy change that results in a more equitable society. And there’s not one right answer.
One approach is to take a policy or program that you think would help achieve certain goals and show evidence of its impact in other comparable settings. So for example, what was the effect of raising the minimum wage [in one city]? What was the effect of police reforms in this area, in this set of cities, for this time period? Demonstrating the impact might help solve a problem or improve outcomes somewhere else. So for example, if you’re fighting to reduce police budgets in a certain city, show what happened when budgets were reduced in other, comparable cities.
At the same time, we don’t want to reduce experiences or communities to a set of numbers. People in overpoliced communities, for example, like Black Americans, don’t need statistics to know that they live in an overpoliced community.
What does that mean, to reduce a community to a set of numbers?
I feel like with any research question, we have to ask what we’re trying to accomplish. Analyzing and presenting data alone doesn’t do anything. It’s what the numbers are telling us that really matters. I care about using data to answer policy questions about how we can remove barriers to opportunity, achieve better outcomes, change policy, and shift discourse.
Who consumes your research and what they use it for can say a lot. If it’s just a group of researchers sitting around a room and confirming what we already know, that’s much less helpful than if activists on the ground can use your statistics and data visualizations to strengthen their arguments, reach new audiences, and change minds.
It’s very easy, especially right now, when talking about social justice, to just document disparities. I think before we start documenting disparities, we should ask what we’re hoping to achieve. There are some disparities that we need to learn more about, like those at the intersection of different identities and regions and other demographic factors.
But if there’s an understanding of existing disparities, like the widespread understanding of racial disparities in this country, then I ask, what’s the point of trying to put statistics and numbers to people’s lived experiences? There might be some answers, but we should think about what they are.
Do you ever see disparities presented without context, leading people to misunderstand a problem?
Yeah, it’s possible to show results in a way that opens the door for misleading interpretations. If you just show broad disparities in outcomes, that doesn’t inform our understanding of what’s contributing to them, and by extension, what we can do as policy solutions.
Let’s think about the gender pay gap. When you see a graphic that shows how much women of different races make in comparison to white men, it reminds us that women earn less than men pretty much across the board, and that’s a problem that we care about. But it doesn’t tell us why that is. Is it because of the occupations they go into, or differences in work experiences or hours? Or is it because of discrimination, that women are paid less for comparable work?
Of course, it’s probably some of both. If you think it’s the first reason, then maybe the solution is to have more women in STEM, or job training programs, or flexibility for mothers, and so on. If the reason is discrimination, then maybe we need stronger anti-discrimination laws.
There are different policy solutions based on what we think the causes are. But when you just see a list of disparities, it doesn’t necessarily inform you of what the solutions are.
At the Community Service Society, I did a lot of work on subway fare evasion enforcement by the NYPD. Through our research, we showed that predominantly Black neighborhoods with comparable crime and poverty rates to predominantly white neighborhoods, on average, still had more fare evasion arrests.
Over the last few years, close to 9 out of 10 individuals arrested for fare evasion were Black or Hispanic. That number kind of speaks for itself, right? What’s the point, then? Why would we want to emphasize that number or push further with the data there?
We hoped that documenting how poverty is policed differently in Black communities would help push the NYPD toward more transparency, which eventually happened. The city council passed a law requiring the NYPD to report statistics on who they were stopping, where, and the demographics of those stopped. That doesn’t mean all of a sudden the NYPD is accountable. But the hope is that we can use data not just to document disparities, but to understand what’s driving them and to push toward improved transparency and policies that can limit the scope of policing, or in this case, racially biased police enforcement.
Can you speak more about the limitations of data and what data doesn’t capture?
Interrogating the data you’re working with and not just taking it for granted is important.
Data itself is generated within systems of oppression. There might be the explicit goal of not reporting data publicly, for example, like some police departments do. Other times, it’s the prevailing norms and existing infrastructure of how we think about a demographic category, such as race or ethnicity. Think about how government data only has certain racial categories, or sometimes you can only check one box. Or it’s the gender binary — maybe there are only two options offered. We can think creatively about ways to gather richer data, but there are even ups and downs to that, since data can be weaponized.
So we should at least start by being very explicit about what voices are missing in the data and thinking about what we need to learn about through other approaches, whether it’s data, qualitative work, or amplifying the voices in communities that are directly affected by issues. So really doing a reckoning with the data you’re working with — although it doesn’t solve the problem, and it’s still something I struggle with.
Data is a tool. We don’t need to try to solve every problem or tell everybody’s story with data. At its best, data for good should contextualize lived experiences and amplify voices that are missing or misrepresented in policy narratives. That’s always the goal — to amplify these voices, not speak for them.
Aastha Uprety MPA ’21 is studying social policy and media & communications at SIPA. You can follow her on Twitter @aasthauprety.