Uncovering Hidden Risks 6.21.23
Ep 10 | 6.21.23

How eDiscovery Can Help You Reduce Data and Risks in Three Steps


[ Music ]

Erica Toelle: Hello, and welcome to Uncovering Hidden Risks, a new podcast from Microsoft, where we explore how organizations can take a holistic approach to data protection and reduce their overall risk. I'm Erica Toelle, Senior Product Marketing Manager on the Microsoft Purview team. And now, let's get into this week's episode. Welcome to another episode of the Uncovering Hidden Risks podcast. In today's episode we will discuss some strategies and best practices to mitigate security and compliance risks by doing more in-place eDiscovery to support investigations and litigation. As data volumes continue to balloon, it's becoming clear that the quickest path to victory does not involve the fewest steps. Let's explore ways to defensibly move data minimization decisions upstream to collaboratively expedite the eDiscovery process and reduce risk within the safety of your own tenant. First, let's start by introducing today's guest host who will join us for the discussion. Caitlin Fitzgerald is the Senior Product Marketing Manager focused on eDiscovery and audit solutions at Microsoft. Welcome, Caitlin, to the podcast.

Caitlin Fitzgerald: Thanks, Erica. I'm excited to be here. I've been at Microsoft now just close to 10 years, and my current role has been super interesting as I get to work with some amazing individuals on solutions that customers and partners are really passionate about. Every organization, small or large, regulated or unregulated, encounters scenarios in which they need to find that needle in the haystack, or evidence to determine what happened in a security breach, or to support an internal investigation, including what steps they need to take to reduce that risk in the future. And I'm especially thankful to work closely with our special guest today, who's graciously agreed to join us for this session to share his experience working with our Microsoft Purview eDiscovery solution, and the ways in which his team has been able to reduce security and compliance risk within Microsoft.

Erica Toelle: Perfect. And that's a great segue to introduce today's guest, EJ Bastien, the Director of Discovery Programs at Microsoft. He and his team have lots of experience using technology to address the challenges of eDiscovery in this modern cloud world. And they have some strategies and best practices to share to help mitigate risk. So, if you've always been wondering, "How does Microsoft do it?", this is the podcast episode for you. EJ, how are you doing today?

EJ Bastien: I'm great. Thanks Erica and thanks Caitlin, for the invitation. I'm really excited to be here; this is going to be a fun one. So, as you mentioned, I'm Microsoft's director of Discovery Programs. I lead the eDiscovery and litigation support function in our litigation department, where I manage a multidisciplinary team of program managers, engineers, paralegals, and records managers. I've been here for 18 years, focused on eDiscovery and litigation operations the entire time. We've rebuilt our approach to eDiscovery numerous times throughout nearly two decades, and I've been here for every step of the way. Think of me as a process architect, right? Finding ways to comply with our legal obligations within our environment. I use our technology, but ultimately, my job is making sure we meet our obligations. I partner very closely with the product team; helping them understand the plight of an eDiscovery practitioner. And I get to be the first customer that actually puts these features that show up in Purview to work; making sure that our processes evolve with the technology that we create.

Erica Toelle: Yeah. And I think it's perfect we describe you as our first and best customer for eDiscovery, which is so true. Maybe to kick us off, Caitlin, are there any trends that you're seeing that are affecting the eDiscovery space?

Caitlin Fitzgerald: Yeah. Great question, Erica. I have three trends that I'd like to talk about. So, the first one is the amount of data and electronically stored information, also known as ESI, that is getting created, shared, and stored across organizations, including new data types and modalities, and the challenges it creates for organizations to maintain a strong security and compliance posture, all while enabling productivity for their workers, and how that impacts eDiscovery, which is about, you know, efficiently finding the relevant data to support those investigations and litigation. The second big trend is the connection between security, compliance, privacy, and legal teams and how that's not -- it's not just a CISO that is on point to protect company and customer data, but it's the role of the entire C-suite, including chief compliance officers, legal officers, and privacy officers. They all need to take steps to minimize the amount of sensitive data that is exported outside of their secure tenant and compliance boundary. And the last one is also very topical. It's the need for automation to reduce manual tasks and human error and do more with less, freeing up people's time to focus on higher-value analytical work, for example.

Erica Toelle: So, EJ, given your experience and the current trends, what advice would you give to other organizations that are looking to get a handle on the growing amount of data, and how are you approaching some of the new technology innovations?

EJ Bastien: As data just grows unchecked, the types of problems that it presents are also growing. All right? You've got problems from making sure you can find the important subsets that are necessary for whatever you are trying to find amongst that ever-growing hay bale; those needles are getting finer and finer. There's problems with throughput. The more data you have, are you going to be able to meet your deadlines? Right. Are you going to be able to finish the process that you have to complete in order to find those needles? There's also problems with security and data access, right? How can I safely get to what I need without making an inappropriate attack surface for other people to exploit and find what they want in our data? So, ultimately as data grows, it's not just the familiar data types that are causing the glut. We've got new modalities, as Caitlin mentioned. We've got new applications. We've got people working in new ways. And trying to fit the new data types into tried-and-true industry standard methods doesn't quite fit the same way that it once did. You get new content like Loop. You get new content that doesn't export and render in an offline review tool in the same way that it was intended to be experienced by the end users. So, finding ways of keeping up with all of those to facilitate the downstream review and production processes that you need to follow to meet your legal obligations is a constant focus for my team, and I assume, for my counterparts that are other customers. The security component is one that I think is often overlooked in that process, right? General counsel organizations are in a hurry to meet their legal obligations and they rely on a team of experts, whether it's outside counsel or legal service providers that help them process data and get through the review and production component of their responsibilities on time and on budget, and the challenges that are stressing your organization are also stressing theirs. Keeping up with security requirements, keeping up with the ability to process such large volumes of data, keeping up with the expense that the data growth creates, so that their clients get a fair deal and that they can efficiently manage their business, is all being tested again and again with every inflection point. So, doing more in-house. Looking at what you can do to help get in front of all those pieces. Finding the important subsets of data. Helping your partners be prepared to handle the data that your organization uses that their experts may have never seen before is a really important thing. The role of an in-house practitioner is growing every day, and the opportunities to help prevent problems are growing every day. With that, we need to dynamically explore those opportunities and embrace those opportunities. Learn the things that you haven't yet become comfortable with. Find new partners, maybe internally amongst the [inaudible], amongst your IT counterparts, or hire engineers if your department's able to, to find more efficient ways of scaling and keeping up with the data growth.

Erica Toelle: Do you have any other advice you'd give for organizations that are learning to implement an effective eDiscovery strategy with the tools that are available today?

EJ Bastien: Sure. This is something that I get asked a lot. You know, eDiscovery capabilities in M365, bringing things closer to where the data originates, is uncomfortable for companies who've long relied on outsourcing the whole process. Outsourcing to partners who understand how to get the data into a state in which it can be minimized and reviewed and categorized and analyzed and produced. And outsourcing to law firms who can oversee the process. Bringing that in-house is a lot of responsibility, and I think that it's good to understand first the details of how it's done today, for your organization and your own matters. And then explore the capabilities that exist in your tenant, right? When you explore the concepts, right? The first thing that people often do is get it into a state where data can be filtered electronically, reliably, right? Where you have the extracted text -- a complete, extracted text of the data and a complete rendering of the metadata so that you can use them all in compound -- complex queries or classifiers, to minimize the data down to the relevant subset that actually merits review. Can you do those things at the collection point? And in 365 most data lives in an indexed state and the indexers are really cautious about identifying the subset of items that may not be fully searchable so that you can make decisions. Explore what you can do with those capabilities, right? Take, for example, some of the sources that have been collected and processed in offline tools and get the queries that were run against them and the filters, and maybe try replicating them in Purview. Maybe try searching those very locations with the same queries, the same filters, and see if you can get to a familiar subset.

Erica Toelle: Yeah. It sounds like you're testing the system, right? With the tools and the known dataset.

EJ Bastien: Precisely. Gain trust by using data that you already know. Look for things that you would expect to find so you could build your confidence in your ability to find them when you don't know what you're looking for in the future. But also, expect to do a lot of that. Expect to keep up with the evolution of the capabilities and put them to use for you. There's a learning curve with everything; get comfortable on it, because there's never a dull moment. But use it regularly. Assess the inputs and outputs and do a lot of comparison and benchmark against your past self and mark your progress to see how much value you're providing for your own organization. We were astonished at how fast we saw an ROI in our investment in this space, and we use that to justify building out our team even further, and having more and more FTE resources dedicated to the eDiscovery processing and minimization because it has tremendous impact on our budget; positively avoiding downstream costs.
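
[ Editor's sketch ] The benchmarking idea discussed here, rerunning known queries against known locations and checking whether you arrive at a familiar subset, can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `compare_result_sets` and the item IDs are hypothetical, it assumes both the offline tool and the in-place search can export a list of stable item identifiers, and it does not call any Purview API.

```python
def compare_result_sets(baseline_ids, candidate_ids):
    """Compare a trusted baseline result set with a new search's result set.

    baseline_ids: item IDs from the known (for example, offline) workflow.
    candidate_ids: item IDs returned by the search being evaluated.
    Both inputs are hypothetical exports of stable item identifiers.
    """
    baseline = set(baseline_ids)
    candidate = set(candidate_ids)
    matched = baseline & candidate
    return {
        "matched": len(matched),
        # Items the known workflow found but the new search did not.
        "missing_from_candidate": sorted(baseline - candidate),
        # Items only the new search found; worth reviewing, not necessarily wrong.
        "extra_in_candidate": sorted(candidate - baseline),
        # Share of the baseline that the new search reproduced.
        "recall": len(matched) / len(baseline) if baseline else 1.0,
    }


# Illustrative, made-up identifiers.
offline = ["doc-001", "doc-002", "doc-003", "doc-004"]
in_place = ["doc-002", "doc-003", "doc-004", "doc-005"]
report = compare_result_sets(offline, in_place)
print(report)
```

Tracking the recall figure and the two difference lists over repeated runs is one simple way to mark progress against your past results.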

Caitlin Fitzgerald: Hey, EJ, so you were starting to talk about the ROI you've seen, and we talked -- said at the beginning, you know, you're our first and best customer for Microsoft Purview eDiscovery Premium. And then also, I think, you know, you've implemented some of these more business process-oriented best practices that you've shared with us. What are some of the returns on that investment that you've seen?

EJ Bastien: Sure. Great question. So, we've measured our team's impact from bringing eDiscovery in-house from day one to prove that this is the best thing for our company. So, we've developed a lot of KPIs. First, we've paid close attention to the cull rate: how much data we can minimize from the raw, unfiltered volume that lives in the sources that we're collecting from, versus the export set that we're ultimately handing off for review and production. On average -- and this has been across almost 15 years -- we've culled away more than 95 percent of the unfiltered data before it ever leaves the safety of our own tenant. In fact, in the last few years, because data has grown so aggressively, our average is now about 98 percent culled, meaning we're handing off only about 2 percent -- the potentially relevant 2 percent of the content -- and that dramatically expedites the downstream eDiscovery processing, review, and production efforts. We've seen, through the years, a 91 percent reduction in our average cost per custodian. Which is impressive unto itself, but if you consider across the 15 years that we've been tracking this, the volumes -- the average volume per custodian has increased 5,600 percent. All right? It used to be -- when -- I won't tell you what the actual number was, you know, litigation is notoriously expensive, largely because of eDiscovery costs. So, let's call back in the day when we used to outsource everything, 100 percent. That 100 percent was arrived at when we typically had about 4 gigabytes per custodian that had to be processed and reviewed. Our average volume per custodian these days is well north of 200 gigabytes per custodian. So, that reduction and the efficiencies that it creates for us in the downstream processing and review and production efforts are having immense impacts on our ability to meet an ever-shrinking budget. We've also tracked closely the number of items we typically have to reprocess. 
And as I mentioned earlier, M365 keeps data generally indexed. You know, mailbox content, OneDrive, SharePoint, Teams, Yammer, et cetera, et cetera. It lives in an indexed state, so you can use that enterprise index to identify subsets of information that actually merit collection. But we've noticed when we switched from eDiscovery Standard in Purview to eDiscovery Premium, with deep indexing we get far higher fidelity search results, right? The items that are unsearchable, just the exceptions that may exist in the general enterprise index, are automatically remediated and reindexed in the -- in eDiscovery Premium. And we've seen a dramatic reduction in the number of items that are continuing blind spots. It's been a 94 percent reduction in the total number of items that we need to reprocess through our migration from eDiscovery Standard to eDiscovery Premium.

Caitlin Fitzgerald: And I think that, you know, speaks to the security piece we were talking about earlier, right? About not having that super highly sensitive data duplicated outside of the tenant.

EJ Bastien: Exactly. Right? A subset of data -- only a subset -- a relatively small subset of data is important for the case, right? And a lot of the eDiscovery work is finding that; is culling away the irrelevant. But it's only irrelevant to that matter. It doesn't mean it's not important or not sensitive content. It could be highly important, and highly sensitive, and highly confidential for the organization. And when you think about -- you think about the CISO's organization finding out the way that a lot of legal teams process their work in doing unfiltered collections and taking them out of the safety of the organization, you can just imagine how alarmed they would be at the doubling of the attack surface. Of taking the unfiltered -- irrelevant to this case, but not unimportant to the company -- content and putting it outside the safety of their own tenant. So, in addition to cost savings and also the time savings that we create by doing the targeted exports, we're dramatically protecting the rest of the content that doesn't need to be collected in the first place.

Erica Toelle: I'm really glad that you are all addressing security and how to think about that with eDiscovery. Before this podcast, I had never really connected the two together, so I really appreciate that insight. Is there anything else that you are excited about that our audience should be excited about thinking of the future of eDiscovery?

EJ Bastien: Well, there's the clear immediate opportunities with M365 Copilot and the features that have already been previewed for the world. It's clear that our company is focused in this space and we're just getting started. So, being able to integrate those capabilities of natural language interactions with collected datasets that may span custodians. It may be the datasets we're collecting for cases to categorize data, to create efficiencies in the review process, to create efficiencies in the classification process. I think the future is blindingly bright.

Erica Toelle: EJ, I'm just curious, I know your major focus is on litigation and legal matters, but are these principles that you're discussing applicable to any other type of search? Like, maybe for just a general internal investigation? A data subject request? Or any other reason that you might need to search through a large amount of data?

EJ Bastien: Yeah. Quite a lot actually. My team, we exist to manage litigation needs. We get asked for favors from a lot of different groups across the company, from security to HR. For people who need to find efficient ways of getting through a lot of data, to find those needles in the haystack, the eDiscovery feature set is very, very valuable. Recently, I was looped in on a data spillage situation where content was proliferating within our network that shouldn't have been. And we used the eDiscovery capabilities to assess where is it? Who sent it? Where did it come from and where did it go? The threading, deduplication, metadata analytics and review experience in Purview eDiscovery was a tremendous toolkit to help answer that question confidentially and quickly, without the need to over-collect content and bring it to other offline tools. We were able to, you know, within a day, provide an answer and take action.

Erica Toelle: Well, thank you so much EJ and Caitlin for joining us today. That's just about all the time we have. To wrap up, we'd love to know what is your personal motto, or what words do you live by?

Caitlin Fitzgerald: Yeah. Maybe I'll go first. So, I would say, you know, my superpower is problem solving, so I love to turn over every stone and be open to learning new ways to tackle all of life's challenges, including the exciting space of eDiscovery.

EJ Bastien: Yeah. For me, it's do it yourself; DIY. Don't wait for somebody else to do it for you. There's a lot of things to learn. There's a lot of potential to be manifest, and who better to do it than yourself?

Erica Toelle: Perfect. Great words to live by. Thank you [music]. Well, Caitlin and EJ, thank you so much for taking the time to join us today. Thank you to our audience for listening and have a great rest of your day.

EJ Bastien: Thanks again for bringing me on; this has been a blast.

Caitlin Fitzgerald: Thanks Erica and EJ.

[ Music ]

Erica Toelle: We had a great time uncovering hidden risks with you today. Keep an eye out for our next episode. And don't forget to tweet us at msftsecurity or email us at: uhr@microsoft.com. We want to know the topics you'd like to hear on a future episode. Be sure to subscribe to Uncovering Hidden Risks on your favorite podcast platform. And you can catch up on past episodes on our website, uncoveringhiddenrisks.com. Until then, remember that opportunity and risk come in pairs and it's up to you where to focus.

[ Music ]