The CORPUS White Paper

Illustrations: Jan "Steins" Stoewe

The idea behind CORPUS emerged in 2021 from a concrete question: How can large, high-quality music datasets for AI training be built without bypassing rights or excluding creators?

For over a year now, we have been working on this question systematically, from technical, legal, and economic angles. Out of this work, CORPUS has grown into a project: an infrastructure for licensed music AI that connects musical contributions, model training, and revenue.

This white paper summarizes the current state of our thinking. It explains why common approaches to music AI fall short, which assumptions we deliberately make, and how an alternative licensing and royalty system could work for both musicians and technology partners.

Anyone who wants to understand what CORPUS is, why we are building it, and which problems it aims to address will find the answers in the white paper.

Download the PDF here:


Too long, didn't read?

Here are some AI-assisted summaries and reviews:

Explainer Video by NotebookLM



Critical Evaluation by GPT-5.2

The risk is not that the concept is incoherent. The risk is execution: getting scoring, governance, and cost allocation “boringly fair”, and getting enough early licensing demand to make participation rational. If those two things land, CORPUS is not just reasonable — it’s one of the few models that actually fits the generative paradigm the paper describes.
ChatGPT - CORPUS economic framework analysis

Prompt: Does the attached CORPUS white paper describe a reasonable new economic framework for the music industry? Compare the model presented here with established economic models in the creative industries, including those outside the music industry. Then draw a conclusion.


Comparative Research by Perplexity

Proprietary ethical datasets demonstrate market demand and regulatory pressure; CORPUS’s differentiator is to turn that demand into a shared infrastructure that many developers and sectors (automotive, healthcare, XR, etc.) can plug into with a stable, auditable rights story.

https://www.perplexity.ai/search/corpus-is-not-alone-anymore-in-oyyMLkDtTcyG4arvvjPQnQ

Prompt: CORPUS is not alone anymore in its attempt to offer a licensing framework for Music as AI training data. Research alternative projects and compare them to the approach outlined in the attached CORPUS White Paper.


Presentation Deck by NotebookLM


Human Illustration by Jan "Steins" Stoewe

Mathis Nitschke

Mathis Nitschke is a composer, sound designer, and music theatre maker working at the intersection of music, technology, and space. He is the Founder and Artistic Director of Sofilab and CORPUS.
Munich, Germany