there’s been talk of “ai collapse” over the past few months, where models get trained on their own output and gradually get worse. video models face a similar feedback loop at generation time: each new frame builds on the previous ones, so the model is effectively working from data that’s degrading in real time. if you try to generate a longer clip it just turns into mush, because by then there are thousands of imperceptible errors that have been compounding frame after frame. the trick with making longer clips is staving off those compounding errors by keeping the generated frames stable for as long as possible.
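to make the compounding part concrete, here’s a rough toy sketch. it is not how any real video model works: `rollout_drift`, `frames_until`, the `gain` values and the noise level are all made up for illustration. each "frame" inherits the previous frame’s error, slightly amplified, plus a tiny fresh error of its own.

```python
import random


def rollout_drift(num_frames, gain=1.05, per_frame_noise=0.001, seed=0):
    """Toy error model (an assumption, not a real video model).

    Each frame carries over the previous frame's error multiplied by
    `gain`, then adds a small new error of its own.
    """
    rng = random.Random(seed)
    error = 0.0
    history = []
    for _ in range(num_frames):
        fresh_error = abs(rng.gauss(0.0, per_frame_noise))
        error = gain * error + fresh_error
        history.append(error)
    return history


def frames_until(history, threshold):
    """Index of the first frame whose error exceeds `threshold`, or None."""
    for index, error in enumerate(history):
        if error > threshold:
            return index
    return None


# Same per-frame noise in both runs (same seed); only the amplification differs.
unstable = rollout_drift(240, gain=1.05)
stabilised = rollout_drift(240, gain=1.01)

print("frames before visible mush (gain 1.05):", frames_until(unstable, 0.1))
print("frames before visible mush (gain 1.01):", frames_until(stabilised, 0.1))
```

in this toy, "stabilising the data" just means a lower `gain`: the per-frame errors are exactly as small in both runs, but the run that amplifies them less stays under the threshold for more frames.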