At a glance.
- OpenAI faces lawsuit over scraping of internet data.
- Study shows 25% of kids' apps violate COPPA.
- UoM attack reportedly exposed data on over one million NHS patients.
OpenAI faces lawsuit over scraping of internet data.
After months of scrutiny over the data used to train its artificial intelligence tech, ChatGPT developer OpenAI is being sued for feeding its app scraped data. In a first-of-its-kind case, the class-action lawsuit alleges that OpenAI violated internet users’ privacy rights by using public content such as social media comments, blog posts, and Wikipedia articles to train ChatGPT. Clarkson, the California-based law firm filing the suit, wants to represent “real people whose information was stolen and commercially misappropriated to create this very powerful technology,” managing partner Ryan Clarkson explains. Like all large language models, ChatGPT relies on reading billions of words of text and using the knowledge gained to predict the best response to a user’s query, and until now it has been allowed to operate relatively unfettered. Clarkson’s goal is for the court to implement guardrails on what data can be used and how the creators of that data (in this case, internet users) are compensated.

Some AI developers argue that the use of public internet data should be considered fair use. Katherine Gardner, an intellectual-property lawyer at Gunderson Dettmer, says, “When you put content on a social media site or any site, you’re generally granting a very broad license to the site to be able to use your content in any way. It’s going to be very difficult for the ordinary end user to claim that they are entitled to any sort of payment or compensation for use of their data as part of the training.”

As the Washington Post notes, this is merely the latest in a series of suits directed at OpenAI. A class-action lawsuit was filed in November against OpenAI and Microsoft over the use of computer code hosted on GitHub to train AI tools, and earlier this month OpenAI was sued by a radio host who alleges ChatGPT produced text that wrongfully accused him of fraud. While OpenAI isn’t the only AI company under fire, its increasingly popular, publicly accessible chatbot has made it the most visible. “They’re the company that ignited this AI arms race,” Clarkson stated. “They’re the natural first target.”
Study shows 25% of kids' apps violate COPPA.
Researchers at Comparitech analyzed the top four hundred children’s apps offered in Apple’s App Store and found that one in four potentially violates the Children’s Online Privacy Protection Act (COPPA). (The findings align with a previous study of Google Play, where 25% of kids’ apps were also found to break COPPA rules.) Most of the apps found to be in violation failed to offer clear and comprehensive information on how parental consent is obtained. The researchers weren’t able to review the privacy policies of sixteen apps at all due to broken links or other issues, and a whopping twenty-six had no child privacy policy whatsoever. Five of the apps claim they aren’t technically targeting children, despite two of them having the word “kids” in the title. Nearly half (47%) of the offending apps collected data without parental consent, a direct COPPA violation. The most commonly collected type of data was a persistent identifier, such as an IP address, followed by street address, name, and online contact details. It’s worth noting that Apple, by offering apps directed at kids, is technically liable under COPPA, but legal gray areas surrounding apps and app stores have so far let such violations slip through the cracks. Apple has been contacted about the study but has not yet responded.
UoM attack reportedly exposed data on over one million NHS patients.
As we previously noted, on June 9 the University of Manchester (UoM) disclosed it had suffered a cyberattack, with the hackers claiming responsibility saying they possessed 7TB of data. Now, according to information leaked to The Independent, sources say the stolen data include the records of 1.1 million National Health Service (NHS) patients across two hundred hospitals. The data were gathered by UoM for research purposes and include records of patients treated for major trauma after terror attacks. Making matters worse, patients may not know whether they are included in the database, as they were not required to give consent for the collection of their data. The dataset, first set up in 2012, has since been secured, but UoM has warned NHS officials that there is “potential for NHS data to be made available in the public domain.” UoM has not confirmed whether the leaked details about the dataset are accurate, but it has stated that investigations are ongoing. “Our in-house data experts and external support are working around the clock to resolve this incident and respond to its impacts, and we are not able to comment further at this stage,” a UoM spokesperson stated.