Julia Lane: We have methodological problems that are systemic. And our statistical system isn't up to the task of providing that data.
Dave Bittner: Hello, everyone. And welcome to "Caveat," the CyberWire's law and policy podcast. I'm Dave Bittner. And joining me is my co-host, Ben Yelin from the University of Maryland Center for Health and Homeland Security. Hello, Ben.
Ben Yelin: Hi, Dave.
Dave Bittner: On this week's show, I describe the business concerns of a company who makes their living scraping data. Ben explores the legal importance of being able to analyze source code. And later in the show, Ben speaks with professor Julia Lane. She's a professor at the NYU Wagner Graduate School of Public Service. They're going to be discussing her forthcoming book, "Democratizing Our Data: A Manifesto."
Dave Bittner: While this show covers legal topics and Ben is a lawyer, the views expressed do not constitute legal advice. For official legal advice on any of the topics we cover, please contact your attorney.
Dave Bittner: All right, Ben. Let's jump in here with some stories. Why don't you kick things off for us? What do you have this week?
Ben Yelin: Sure. So my story is actually on the blog of the Electronic Frontier Foundation. EFF, along with the American Civil Liberties Union branch in Pennsylvania, has filed a friend-of-the-court brief in a federal court in Pennsylvania about a really fascinating case related to source code being used by DNA identification software.
Ben Yelin: So just to give a little bit of background here, this is about a gentleman named Lafon Ellis. He was accused of violating a federal law by being in possession of a firearm as somebody who had previously been convicted of a felony. They did not catch him in the act, but they found the firearm in his vehicle. So they brought it to a lab to test for whether his DNA was among the four individuals which were part of this DNA group sample. And the lab results came back inconclusive. So the mixture sample was then sent to Cybergenetics, which owns the probabilistic DNA software company called TrueAllele. TrueAllele ran a bunch of tests on the sample. They adjusted a bunch of settings. They basically finagled all the ways that the sausage is made...
Dave Bittner: (Laughter).
Ben Yelin: ...Introduced a bunch of alternative theories and were able to identify that Mr. Ellis was one of the individuals who had possessed this firearm. So Mr. Ellis was indicted for this federal crime, and he is about to go on trial in a federal district court in Pennsylvania.
Ben Yelin: Now, what's interesting about this is that the defense is seeking the source code for TrueAllele. Obviously, that's very relevant in this case because the defendant wants to know exactly how this company was able to identify him from this DNA mixture. That's extremely relevant information that Mr. Ellis will need for his defense.
Ben Yelin: And that comes from a very profound constitutional principle from the Sixth Amendment that we have the right to confront evidence used against us in a criminal trial. That is foundational to our legal system. So we're not just trusting what the prosecutors are telling us. We actually can have adversarial hearings where we can say, how did you arrive at that information? Who's your source? Is that source reliable? I mean, that's sort of the foundation of our legal system.
Dave Bittner: So if I - for example, if I were to put someone on the stand, if I had a - I don't know - subject matter expert or someone, you know, came up and testified to one thing or another, it is totally within bounds for me to examine that witness and say, how did you come up with these conclusions? And that witness wouldn't be able to say, oh, just, you know, it's - my years of expertise have led me to this conclusion. They would actually have to explain the process.
Ben Yelin: Yeah. Now, you know, I think I always hate to say this because I feel like it's such a cop-out. It would depend on the circumstances.
Dave Bittner: (Laughter).
Ben Yelin: But if it were highly relevant to the case, then yes.
Dave Bittner: Right.
Ben Yelin: The reason you have an expert witness is to shed light that the evidence itself would not reveal. And so if the expert witness is not able to tell a jury or a judge the basis of their expertise, then they're a useless witness, and their testimony will not be offered at a trial.
Dave Bittner: I see.
Ben Yelin: So the source code here is proprietary information from TrueAllele. They don't want it revealed publicly in a court proceeding or in any court filing because it is a trade secret of theirs. They could have competitors come in and be like, ooh, this is a great way to identify DNA in a group DNA mixture. Why don't we steal it and use it for ourselves and make a lot of money? So the prosecutors in this case are, of course, taking the side of TrueAllele, saying, we are not going to reveal their source code because that's a trade secret. And, you know, under court precedent and long recognized legal principles, at least there is the possibility that information like that could be concealed from trial if it doesn't have a lot of probative value.
Ben Yelin: So what the EFF and ACLU are trying to do is to compel the court through this friend of the court amicus brief to reveal that source code because otherwise the defense would not be able to properly question the evidence used against them.
Ben Yelin: And so this creates sort of your classic dilemma here. When we're talking about federal rules of evidence, you know, the most general rule of thumb is - does this piece of evidence have probative value, and would the value of having that evidence included outweigh the risks? You know, with risks, we're generally talking about introducing evidence that's prejudicial or otherwise irrelevant. Here, the risk is that you'd be revealing proprietary trade secrets, and that's a decision that the court is going to have to make.
Ben Yelin: One potential out that they could use is - and this has been used in a number of other cases. They could allow the defense attorneys to view the source code outside of a public courtroom setting - so in judge's chambers. And that might be the most equitable solution here.
Ben Yelin: But I just thought it was a really interesting dilemma. It's really interesting that the Electronic Frontier Foundation and ACLU have shown such interest in this case.
Dave Bittner: Yeah, it's interesting to me. I was sort of going along the lines of what you described there. Could they make use of some third party who could - who had - someone who had some expertise in looking at the source code. And they could say, hey, these are the questions we have about the source code. We're not going to ask you to reveal the source code, but we want your analysis of the source code, and answer these questions for it. Does it do this, this, this and that? You know, perhaps that's a way around it without revealing that secret formula.
Ben Yelin: Yeah. I mean, that's why you could have a member of your defense team be somebody who's familiar with source code. And then that equitable solution would be to allow the defense team in a private setting to review that source code. And, you know, perhaps a member of that team could say, I have identified a problem area here. You know, I'm somebody who's an expert in the mistakes and flaws of these types of tools - the type of tools that TrueAllele uses, and I've identified one of those flaws here. X, Y and Z can present bias, and that's part of the source code in this case.
Ben Yelin: That would not be litigated at trial because if you litigated that at trial, it would reveal trade secrets, and that's something that the court wants to protect against. But that could be done, you know, in judge's chambers, and then it could be introduced in some other way at the trial - in a way that wouldn't reveal any proprietary information but in a way that the jury or the judge, if the judge is the finder of fact, can understand the flaws in the system that led to this positive DNA result.
Ben Yelin: You know, I just think this is a classic case of where you have to weigh principles. I deeply believe in protecting trade secrets because, you know, that's the foundation of our economy. People should have rights in their intellectual creations. I definitely believe in that. But we have this constitutional right to make sure that the evidence used against us hasn't been obtained arbitrarily, falsely or through flawed means. And my instant reaction to this was those interests - the interests of the guy who's probably going to be sentenced to, you know, a long jail sentence should supersede the trade secrets here.
Dave Bittner: So we need to err on that side of things, perhaps.
Ben Yelin: That would be my instinct, you know, I think partially just because of the stakes involved. Granted, there are probably significant economic costs to be borne by TrueAllele if their source code were to be revealed, especially in a public setting. But how does that compare to the cost of somebody going to jail based on a system that is fundamentally flawed, fundamentally false?
Dave Bittner: If it is so - the point here is we want to analyze it to see if it is. We're not assuming it is.
Ben Yelin: Exactly. We are not assuming it is at all. Now, what EFF has done is they've documented over a series of years - really over the past decade - a number of cases in which DNA analysis programs have been used. And they've discovered that those programs are not uniquely immune to errors and bugs, as they say in their blog post here. And so it would be improper for a criminal defendant or a defense team to simply say, you know what? You got us. We trust you. Whatever you're using is probably fine. You know, that wouldn't be proper legal defense work. When you know that you have a history of DNA analysis programs having loopholes or being imperfect or coming up with false negatives or false positives, a defense attorney has an obligation to go hard at that system. And I think that's what's happening in this case here.
Dave Bittner: Yeah, interesting. And I suppose we'll have to keep an eye on this one to see how it plays out.
Ben Yelin: Absolutely, yeah. The case is going to be litigated in the next several months, so we'll see what happens.
Dave Bittner: All right. It's an interesting one. My story this week - this comes from Motherboard over on the Vice website, written by Joseph Cox, who...
Ben Yelin: Our boy Joseph Cox.
Dave Bittner: I know. We really should send him a box of cookies or something 'cause we...
Ben Yelin: Yeah.
Dave Bittner: ...(Laughter) So much of - he's really right down the center of our lane. And we certainly make use of a lot of the things that he writes here. So thank you, Joseph.
Ben Yelin: Yes.
Dave Bittner: The article that he writes is - it's titled "This Billion Dollar Company Considers Privacy Laws a Threat to Its Business." And I highlighted this one this week because I think it's sort of a fascinating look inside the business of a data broker. This is about a company called ZoomInfo. And, well, let me back up here first and just share a little story. You know, before the days of mass data collection, when the internet was still pretty new and we were all bright-eyed and optimistic and (laughter)...
Ben Yelin: Feels like...
Dave Bittner: ...Naive (laughter).
Ben Yelin: ...Such a long time ago, doesn't it?
Dave Bittner: Right - way back in the heady days of the mid-'90s. I remember as a small-business owner, it was not unusual for me to get a letter in the mail that would say, hey; we are such and such a company, and we are in the business of collecting basically other people's Rolodexes. And so if you will sell us your Rolodex, we guarantee you it will be completely anonymous. None of your - none of the people in your Rolodex will know that you have ratted them out (laughter). And in exchange for some money, we will collect your Rolodex. And that's the deal we want to make.
Ben Yelin: Now, did you have to send them the literal Rolodex, the thing that actually turns around in circles?
Dave Bittner: No, no, no, no. We were - they were looking for electronic records...
Ben Yelin: Ah, OK.
Dave Bittner: ...At this point.
Ben Yelin: OK, so this isn't ancient history.
Dave Bittner: No, no, no. But I want to say, I mean, I think at that point in time, there were still - a lot of people were still keeping, you know, real, old-fashioned actual Rolodexes. It was still a thing. I'm sure there are still some folks out there today who live by it. So I tell you that story because it's reminded me of this. This story reminds me of those days.
Dave Bittner: So ZoomInfo - from what I gather in this article, they are in a similar business. They have a free version of one of their products where - they have a thing called a contributor network. And basically, the way it works is you install some little widget or device on your machine. That widget or device or hook, software hook - whatever it is, it connects to your email client. And it looks through your emails, and it gathers up business contacts - so names, email addresses, phone numbers, business addresses, all those sorts of things - and sends them off to ZoomInfo. And so they build this huge database of, basically, names and addresses - businesspeople all over the world. They say that their contributor network captures 50 million records every day.
Ben Yelin: That's a lot.
Dave Bittner: That is a lot. And so in exchange for you installing that widget, that ability, however it's done, you get access to the database. So if you want to find out I'd really like to get in touch with such and such potential business contact at such and such a company, odds are that this database may have a lead for you, may have a phone number that otherwise would be - would take more time to get or be hard to get or something like that. So you can see that there is value in that exchange, right?
Ben Yelin: Absolutely. Yeah.
Dave Bittner: Yeah.
Ben Yelin: That's sort of what they do on - I know they do this on LinkedIn, where if you volunteer to make your profile public, you can see who views your profile. But you have to agree to having yourself tracked as to which profiles you view.
Dave Bittner: Right.
Ben Yelin: So that's, you know, the same mutual exchange that we're seeing here.
Dave Bittner: Right. So what Joseph Cox points out in this article is that the folks at ZoomInfo - they did a filing for a public offering. They went public back in June, and they raised nearly a billion dollars. So there's big money in Big Data, Ben (laughter). And one of the things they pointed out was that privacy legislation, privacy regulations that are both in place right now, things like the GDPR and the California Consumer Privacy Act but also the direction things are heading - so they point out that the FTC, the Federal Trade Commission, is taking an increasingly active approach to enforcing data privacy - that these could all have an effect on their bottom line, that this business model may have limited viability or variable viability, I suppose, depending on how these things get cracked down upon going forward. So I thought it was really an interesting view through - also interesting to me that it is through some of these public disclosures - so through regulation, because they - in order to get - to raise their billion dollars - right?
Ben Yelin: Right.
Dave Bittner: They have to reveal these things. Yeah. And so you can see how that's kind of in the public interest here. We get a view inside of these companies, how they're looking at some of these things, because of these things they have to disclose.
Ben Yelin: So when I first saw this article when we decided we were going to discuss it on today's podcast, my instinct was to take out the world's tiniest violin and just...
Dave Bittner: (Laughter).
Ben Yelin: ...And cry those deep tears for...
Dave Bittner: Right. Yeah (laughter).
Ben Yelin: ...The fact...
Dave Bittner: Right.
Ben Yelin: ...That ZoomInfo is not going to be able to take advantage of our lax federal privacy laws the way it is now...
Dave Bittner: Right.
Ben Yelin: ...To make itself billions of dollars. Now, you know, my first instinct on these things is not always correct.
Dave Bittner: (Laughter).
Ben Yelin: I think that's been proven many times throughout our conversations.
Dave Bittner: (Laughter).
Ben Yelin: It may or may not be correct here. I certainly, from ZoomInfo's perspective, can see why data privacy legislation - whether it's CCPA or GDPR - would put their profitability at risk. It would decrease - because, you know, particularly the CCPA would allow individuals to opt out of this type of service, it would decrease the pool of people who are part of this giant database. And that would make the database less valuable and would cut away at, you know, those billions of dollars in potential profit.
Ben Yelin: But those of us who care about broader policy and the public good would say, you know, it's a small price to pay for this data scraping company to have its profits cut into so that the rest of us have greater rights in the data that we share online. I guess I'm saying I kind of stand by the tiny violin thing. Am I wrong?
Dave Bittner: I don't think so.
Ben Yelin: Yeah, poor Zoom. They probably don't want to be associated with ZoomInfo.
Dave Bittner: Yeah. So ZoomInfo points out that they only collect business contact information, which is the same information that would customarily be found on a business card. So it's not like they're brokering in - I don't know - Social Security numbers or - to me, this seems more like a nuisance than anything else. You know, what it means is I'm probably going to get more unwelcome soliciting calls, which, you know, is a nuisance more than anything else.
Ben Yelin: Right. It's not personally - it's not PII, for example.
Dave Bittner: Right. Right.
Ben Yelin: And they said, in fairness to them, in their statement that they take data privacy seriously. They comply with CCPA and GDPR. They go above and beyond their data-scraping competitors in protecting information and making sure that the only thing available on this database, as you said, is information that would be on a business card. But it's certainly, to me, not a justification for any policymaker to reconsider whether to institute data privacy legislation.
Ben Yelin: You know, I don't think that ZoomInfo has the right to have as giant a database as it possibly can to maximize its data-scraping profitability. I think the public interest weighs into more robust data privacy laws. That's certainly the decision that the European Union has made, the state of California has made and, as this article mentions, the FTC, the Federal Trade Commission, in some of its regulatory policies. So, you know, I would say that public interest outweighs the fact that ZoomInfo's database is going to be considerably smaller as data privacy laws start to increase.
Dave Bittner: Yeah. It's an interesting point you bring up, though, which is, is your business information PII? Is it out of bounds for those privacy types of things because it's not personal? It's business, my business phone number. Is that a piece of personal information - my business address? I don't - I'm not sure that it is.
Ben Yelin: It's not. I mean, this always depends on, like, which specific statute you're referring to, and in most of our privacy statutes, whether it's HIPAA or anything else, it certainly depends on who's releasing the information. But this would generally not be considered PII if it's just a business address and a business phone number. It's not somebody's driver's license number, Social Security number, biometric data. It's nothing like that.
Dave Bittner: Right.
Ben Yelin: And it's certainly not an offensive invasion of privacy compared to some of the other things we talk about frequently on this podcast.
Dave Bittner: Right. All right. Well, again, the article is over on Motherboard, the Vice website. It's titled "This Billion Dollar Company Considers Privacy Laws a Threat to Its Business," written by Joseph Cox. We'll have a link to that in the show notes.
Dave Bittner: Ben, you recently had the pleasure of having a really interesting conversation with Professor Julia Lane. Do you want to set that up for us? What was your conversation focused on?
Ben Yelin: Absolutely. I had the pleasure of speaking with Professor Julia Lane, a professor at NYU Wagner Graduate School of Public Service. She has written a book called "Democratizing Our Data." I read the book, and it was one of those that I wanted to read in one sitting because it was that fascinating - basically, on the importance of updating our data infrastructure to match the policy needs of the 21st century. A lot of the data we used was developed to handle the problems of the 1930s. That's how outdated our data collection system is. And we see it related to economic statistics and other social science statistics. So it was really a fascinating discussion on how we can go about democratizing our data.
Dave Bittner: All right. Well, let's have a listen. It's Professor Julia Lane.
Julia Lane: I'm an economist by training. And I have been worrying about data issues for probably most of my career. I seem to have spent my life building data that answers questions. And, you know, I was interested, for example, on the effect of training on workers' earnings and the productivity of firms - and so built a large-scale national program.
Julia Lane: And about four or five years ago, there was a big sea change, I think, in the way in which governments started to think about data. So the focus has been very much on surveys. As many of you will be familiar with, the decennial census is in the field right now. And it's a very important way in which the government at all levels - federal, state and local government - make decisions. So where am I going to put schools? For business, where am I going to locate my business? For people who are looking for jobs, where are the jobs?
Julia Lane: So the role of data in everyday life is absolutely critical. And yet what has become increasingly clear is the way in which we do collect data and the way in which we make it available for the public to make decisions is just not up to the task. So the book very much came out of 20, 25 years of working with data and then trying to think about how can we improve the way in which we do things.
Ben Yelin: So how would you characterize the problem with data in the public sector? What is sort of your central critique of how the government collects data, makes data public or does not make data public in some circumstances?
Julia Lane: So here's the challenge - is our whole system was basically set up 100 years ago. There was the Great Depression. People wanted to figure out what was going on with the economy, and we didn't know. Herbert Hoover had to rely on stock market indices and freight car loadings in order to figure out what was happening to economic activity...
Ben Yelin: Right.
Julia Lane: ...Whether it had shrunk by 25% or 10%. So a researcher by the name of Simon Kuznets and some others - Stone and so on - they worked with the government to develop an aggregate measure of the economy. And they had pathbreaking changes, and they were able to show just how much economic activity shrunk.
Julia Lane: A framework that was started almost 100 years ago has been the framework whereby we have structured our federal statistical system. And it was actually quite useful in World War II. It helped us figure out how much food to grow and how much manufacturing to produce. But that framework still holds now, where agriculture and manufacturing are much less important. And all the numbers were allocated at the national level, not at the local level. And there was only one way to collect it. You know, you had to go out and ask people.
Julia Lane: Now turn around and look to see how data are being generated now, and there's massive amounts of information that are being generated, and our statistical system hasn't kept pace. The private sector has taken off. Our five biggest companies in the United States and probably the world are data companies - you know, Facebook, Amazon, Apple, Microsoft, Google. And yet in the federal sector, in the public sector, state and local as well, we don't use data to anything like that level of precision.
Ben Yelin: In the private sector, there's sort of the obvious incentive for the Facebooks and the Googles to have robust data collection, obviously for their own advertising purposes. They have the bottom line as their main motivating factor. It seems like we might have a political problem, where you have to convince policymakers of the importance of robust public data. How do you think we can go about doing that? I mean, how can we raise awareness on this issue and make it real for our policymakers?
Julia Lane: You hit the nail on the head. Part of the issue here is that public data - there's no bottom line. There are massive numbers of people who benefit - school districts, kids, small businesses - but because it's a public good, there's no price that's put on it. And there's no real incentive to innovate because the way the public budgetary process works is you get a line item in the budget and that - by God, that's what you're going to produce, year after year after year, and there's no incentive to innovate.
Julia Lane: There's other issues as well. You went to exactly the core of the issue. The other thing is - is in the private sector, privacy and confidentiality aren't nearly as important as in the public sector.
Ben Yelin: Right.
Julia Lane: In the public sector, there are statutory rules. You go to jail. You get massive fines if you reidentify an individual. In the private sector, you make a lot of money.
Ben Yelin: Right.
Julia Lane: (Laughter) So there's a very real tradeoff. So, really, what needs to happen is there needs to be a cataclysmic change, just like the Great Depression, and I would argue we're here now. We just are in a situation where the economy has gone off a cliff. The labor market data are - people desperately need it at the local level, and we don't have it. The statistical agency is struggling to report. As we know, the unemployment numbers that were reported for May were - had some flaws that meant that the 13% that they reported should actually have been 16%.
Ben Yelin: Right.
Julia Lane: And the month before, it should have been 19%. The payroll employment said it was up by 2.5 million, but there was an adjustment made that - for the first time, and so no one knows actually what that number is. So we have methodological problems that are systemic, and our statistical system isn't up to the task of providing that data. So you can think about how do we restructure, and that's what the book's about.
Ben Yelin: I was sort of thinking about it because the crises we're all dealing with right now are very data-heavy, if that makes sense. You know, in order to allocate resources for the COVID epidemic, you have to know the local spread of the disease or what the R0 is in a given state or city. When we're talking about the economy, you identify those issues in terms of getting the correct statistics from the Bureau of Labor Statistics. And then just in this last week, you know, I've been thinking about the problem of police violence, and so much of our understanding of the depth and scope of that problem has been through data.
Julia Lane: Right.
Ben Yelin: But as you say, that's largely been done by the private sector. I've seen it in The Washington Post and other places. So I guess my question is, do you think we will get to that heightened level of awareness where there will be more of a public consciousness around the need for public data and a revolution in the public sector in terms of data collection?
Julia Lane: I think the elements are there. I think the cataclysm - I keep referring to it as a cataclysm - you know, mandates more data. Governors and local authorities are flying blind. You know, we have had 40 million people apply for unemployment. But there's the data about who they are, where they came from, what work is available - that data is actually available in the programmatic agencies, but it's not pulled together in a way that people can make sense of. And yet the governors and the workforce boards and the local entities need to have that data to allocate very limited resources. So how do you do that best? How do you get people back to work?
Julia Lane: So I think very often what effects change is need. Just like the Great Depression created GDP and our national unemployment statistics, I think we're going to see a massive - a sea change in the interest in local data.
Julia Lane: Let me give an example - and I think what it's going to have to be is pushing data collection and use down to the local level. An example is, you know, back in 1900 we had more than 6,000 people died in a hurricane that hit Galveston, Texas, right? Why did that happen? Well, part of the reason was it was not enough data and bad communication. So the only place that could issue hurricane warnings was the Washington, D.C., office of the Weather Bureau. And Washington, D.C., thought the storm would pass over Florida, not Texas. And warnings to Galveston came too late.
Julia Lane: And so what happened after that was the National Weather Service with a national network that democratizes collection and processing of data. You've got the same thing that happened in agricultural productivity. You had a national extension system that grew out of local farmers needing to know how to take care of their crops and how to make their agricultural productivity high. And they built farmers institutes, and you had the land grant system that came out of that and the training that went with it. With the Manhattan Project, you had the national lab system that got put together, where you brought really smart people in to solve problems. So this country has responded to massive need in the past. And when I was writing the book, I wasn't obviously aware that something quite this devastating was going to happen.
Ben Yelin: Sure. Sure. Yeah.
Julia Lane: (Laughter) I mean...
Ben Yelin: None of us could have anticipated this, yeah.
Julia Lane: None of us - none of this could happen. But I think we need to democratize our economic and social data systems.
Ben Yelin: So if we do reach that inflection point and there is political support for democratizing our data, what would a piece of federal legislation look like? As specific as you can get without alienating people who don't speak legalese...
Julia Lane: (Laughter).
Ben Yelin: ...What would you like to see in a new piece of legislation on this issue?
Julia Lane: So I think it needs to be a combination of a national lab system and an ag extension program. So the problems need to be posed by the state and local areas. Unemployment in Detroit is very different from unemployment in Los Angeles, and it's very different from in Houston, right?
Ben Yelin: Right.
Julia Lane: Or in a rural area in Arkansas or in Idaho - those areas understand their data. Just like we don't have one single weather number, we need to have local measures that come from the local community, not top-down, that are pushed out.
Julia Lane: I call, actually, in the book for a national lab for community data which brings together the capacities of our great public universities and private universities, together with the needs of the people who really understand the data-generating process. Both the government agencies and the private sector - they have as much interest in figuring out what's going on in their local economies as anyone. And putting that together in a secure environment - we now have the capacity to do that. Doesn't have to go to Washington to be secure. You can have data in the cloud. You know, I know you know a great deal about cybersecurity. We have FedRAMP protocols. We know how to do this.
Ben Yelin: Right.
Julia Lane: And we have data scientists who are trained. There's not just a small cadre of trained government statisticians. There are data scientists throughout the country who can work on very hard local problems.
Ben Yelin: So the last thing I wanted to ask you about is just the connection to democracy and democratic norms. I mean, it seems like one of the key takeaways from your book is that without this sort of revolution in how we publicize data, how we collect data, it really is a large threat to our democratic system, small-d democratic system. Can you just touch on that and talk about what you see as the long-term consequences to our democratic system of, you know, not making some of the changes that you recommend in your book?
Julia Lane: What I'm worried about is that if we don't pay attention to the slow disintegration of the federal system, the lack of funding, the lack of bureaucratic impetus from Congress - and you hit that nail right on the head - that we won't have good data and good measures. And the core to a democracy is understanding and being able to measure the impact of different decisions. If you don't have that, then you really run the danger of having a dictatorship because they will just assert that whatever they are doing is good for everyone in society, and you don't have any means of contradicting or of establishing a set of facts.
Julia Lane: So when I worked for the World Bank, you could very much tell the level of democracy in a country by how tight a control the government kept of the data. And you can absolutely see why it's so important.
Dave Bittner: All right, Ben. Wow, really interesting stuff. A couple things came to mind for me as I listened to the interview. First of all, government lags the private sector - right? - no news to anyone who listens to this show regularly.
Ben Yelin: No. The subtitle of this show should be government lags the private sector.
Dave Bittner: (Laughter) I also thought it was fascinating - something that I never really thought about was so much of the way we do things is an accident of history, how she points out, you know, the Great Depression and, you know, how we started measuring things, measuring our economy. And we just keep on doing some of those things.
Dave Bittner: And it makes me wonder, are there moments where we need to take a look and recalibrate and say, are we - you know, do we need to get ourselves off the virtual gold standard, if you will? You know, that sort of thing.
Ben Yelin: Yeah. I mean, I think they call it in the social science world path dependency. We're so reliant on the way things have always been, and there's very rarely that breaking point where policymakers have incentive to reinvent the wheel because reinventing the wheel is hard. And coming up with new ways of collecting vital data, specifically as it relates to things like the Bureau of Labor Statistics, would be rather revolutionary and, you know, something that federal agencies are just not generally in the position to undertake.
Dave Bittner: Right.
Ben Yelin: So the upshot of that is we're relying on very outdated data collection systems. And one thing I got from that interview is so many of the major policy issues of our current time here, the issues that have dominated news stories for months, are highly dependent on reliable data. Where are those COVID infections spiking? Which cities are suffering economically? Which sectors of the economy are suffering? Which police departments have a history of racial biases? These are things that are such crucial questions, and if we don't have access to reliable data - reliable data that's usable and that provides proper context - then we're not going to be able to make wise policy decisions. And so I think, you know, that's the real-world impact on falling so far behind in terms of the type of data that we collect.
Dave Bittner: Yeah. It struck me, too, that, you know, as you and I record this, we're in the midst of a data controversy as the federal government is cutting short the census data gathering.
Ben Yelin: Yeah, they sure are. I think they decided to cut short their supplemental census data collection by a month. You know, obviously, the COVID epidemic presents certain complications in people going door-to-door and collecting that data. But we rely on census data for everything - right? - you know, our allocation of funding from the Department of Education, really all federal appropriations, our representation in Congress. That's all dependent on the accuracy of the census count. If we get that wrong, then our public policy is going to suffer.
Ben Yelin: And so, you know, I just think what's great about this book is it's bringing to light a problem that I think is solvable. If we actually committed to updating our data infrastructure and going into these federal agencies and determining how we can get rid of these 1930s-era data collection systems, I think we could do what the Googles and Apples of the world have figured out how to do and collect the data that is most relevant to the decisions we have to make.
Ben Yelin: But because of government inertia and that concept of path dependency, we're not doing it. So I'm just glad that with her book, she's bringing that issue to light. And I really hope people read the book. It was an excellent read, an easy read and something I'd highly recommend.
Dave Bittner: Yeah. All right. Well, our thanks to Julia Lane for joining us. Once again, the book is titled "Democratizing Our Data: A Manifesto."
Dave Bittner: That is our show. We want to thank all of you for listening.
Dave Bittner: The "Caveat" podcast is proudly produced in Maryland at the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. Our coordinating producers are Kelsea Bond and Jennifer Eiben. Our executive editor is Peter Kilpe. I'm Dave Bittner.
Ben Yelin: And I'm Ben Yelin.
Dave Bittner: Thanks for listening.