AI Model Collapse

AI model collapse occurs when a model is trained on its own output or on the output of other AI systems. Each successive generation trained on such data loses the subtleties of the original distribution and the less-represented examples in the training set. Eventually the output degenerates into gibberish, or becomes one-dimensional and loses the variety of the original text.
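The loss of rare examples can be illustrated with a toy simulation (a hypothetical sketch, not an experiment from the article): repeatedly fit an empirical distribution to a finite sample and resample from it. Any category that fails to appear in one generation can never reappear, so diversity only drifts downward.

```python
import random
from collections import Counter

def next_generation(samples, rng, n=None):
    """One 'train on your own output' step: fit an empirical
    distribution to the samples, then draw a fresh dataset from it.
    A category absent from this sample can never return later."""
    n = n or len(samples)
    counts = Counter(samples)
    categories = list(counts.keys())
    weights = [counts[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n)

rng = random.Random(0)
# Generation 0: 20 equally represented categories, 100 samples total.
data = [c for c in range(20) for _ in range(5)]
diversity = [len(set(data))]

for _ in range(200):
    data = next_generation(data, rng)
    diversity.append(len(set(data)))

print("distinct categories, gen 0:", diversity[0])
print("distinct categories, gen 200:", diversity[-1])
```

Even though each step is an unbiased resampling, sampling noise steadily eliminates the rarer categories, mirroring how subsequent model generations lose under-represented examples.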

The problem resembles inbreeding in nature: without variation, the species collapses.

This article in Nature explores AI model collapse in detail.

The researchers began by using an LLM to create Wikipedia-like entries, then trained new iterations of the model on text produced by its predecessor. As the AI-generated information — known as synthetic data — polluted the training set, the model’s outputs became gibberish. The ninth iteration of the model completed a Wikipedia-style article about English church towers with a treatise on the many colours of jackrabbit tails.

Nature