I don’t understand why, after generating 8 seconds, it can’t just use that as the base to generate more, and so on for unlimited length. Once they figure that out, I think society will deteriorate even faster.
there’s been talk of “model collapse” over the past few months, where models trained on their own output gradually get worse. video models hit a similar problem in real time: each generated frame builds on the previous one, so if you try to keep extending a clip it turns into mush, because thousands of imperceptible errors are already baked into the frames and they compound exponentially as generation continues. the trick with making longer clips is staving off that drift by keeping the output stable for as long as possible.
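here’s a toy sketch of that compounding-error effect (purely illustrative, nothing to do with any real video model’s internals; `predict_next` is a made-up stand-in for next-frame prediction): each step copies the previous frame with a tiny error and a slight gain, and the drift snowballs instead of averaging out.

```python
import random

# toy illustration: each "frame" is just a number the model tries to hold steady.
# every prediction step adds a tiny error, and the next step builds on the
# already-slightly-wrong frame, so errors compound instead of cancelling.
def predict_next(frame, noise=0.01):
    # hypothetical stand-in for a model's next-frame prediction: it copies the
    # previous frame but is off by a small random amount, and slightly amplifies
    # whatever drift is already there (gain > 1).
    return frame * 1.02 + random.gauss(0, noise)

frame = 0.0  # ground truth stays at 0.0 forever
for step in range(1, 241):  # roughly 10 seconds of 24 fps video
    frame = predict_next(frame)
    if step % 48 == 0:
        print(f"frame {step:3d}: drift from ground truth = {abs(frame):.4f}")
```

run it a few times and the drift is tiny for the first couple of “seconds”, then balloons, which is basically why clips look fine at 8 seconds and fall apart if you keep extending them.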
Outside of computing costs, there is no limit to the length of video generated, but keeping the output coherent and contextual for more than a few seconds is a completely different puzzle to solve.
Same reason GPT-3 could write realistic Reddit comments half a decade ago, but even the latest models still tend to lose the thread once the output runs much longer than a paragraph or two.