🔆 Data appetite and AI training: Where creation meets imitation
A new guest author, the conflict between copyright and data, and avatars for mental health issues
🗞️ Issue 43 // ⏱️ Read Time: 8 min
Hello 👋
Remember the last time you wrote something original? Maybe it was a company report (Q3, anyone?), a social media post, or even a simple email. What if an AI system reproduced your exact text, word for word? This scenario became reality for The New York Times last year, launching a multibillion-dollar lawsuit that's redefining the intersection of AI and intellectual property.
Special Edition: We are pleased to feature Leon Ingelse, Data Scientist and AI Ethicist, as this week’s guest author.
In this week's newsletter
What we’re talking about: The copyright-infringement concerns and legal battles triggered by AI companies' scramble for high-quality, human-created content to train their models on.
How it’s relevant: AI companies have been scraping the internet for human-created visuals, texts and music, which may include your own work. Training on this data produces highly sophisticated AI models, but it also raises questions about unauthorized use and compensation.
Why it matters: Generative AI (GAI) is reshaping how we think about creativity, originality, and intellectual property. As these models learn from human-created content, they raise questions about fair compensation for creators, the future of creative professions, and how we value creativity. Understanding the GAI training process helps us protect our work and potentially collaborate with GAI models to enhance our creative process.
Big tech news of the week…
🖥️ In a recently published statement, Anthropic claims that the window for proactive risk prevention with regard to generative AI is closing fast and urges policymakers to act.
📱 Research is showing that using digital avatars for people with psychosis could help reduce distress caused by hearing voices.
🇺🇸 What will AI look like in light of the U.S. election results? Here’s an overview.
🌍 Team Lumiera is attending Norrsken Impact Week in Barcelona this week, and we will be at Web Summit in Lisbon next week. Let us know if you are around, and we will invite you to our events.
The three ingredients of AI models
AI companies invest billions in their AI products, which consist of three key elements:
Development
Computing power
Training data
Most of that money goes into the first two elements: development and computing power. The third element, training data, generally comes at no direct cost to these AI companies, because it is collected from the internet. In many cases the collected data includes copyrighted material and is gathered without the creators’ consent or remuneration.