Also, even this vastly overstates how difficult it would be.
You don’t need to train an entire network to make doorbell camera videos or pictures. There are techniques (like IP-Adapter) that take a single reference photo at inference time and carry its style over to whatever else you generate. In applications like ComfyUI, this is a matter of dropping a node onto the generation graph and choosing a photo (or several photos): three or four clicks.
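For anyone who would rather see it in code than in a node graph, here is a rough sketch of the same idea using the Hugging Face diffusers library instead of ComfyUI. Treat the specifics as assumptions on my part: the `h94/IP-Adapter` repo and the `ip-adapter_sd15.bin` weight file are the commonly used public ones for Stable Diffusion 1.5, the function names are made up for illustration, and the base model id is whatever SD 1.5 checkpoint you have access to.

```python
def build_ip_adapter_pipeline(base_model, device="cuda"):
    """Load a text-to-image pipeline and attach IP-Adapter weights.

    base_model is a Stable Diffusion 1.5 checkpoint id or local path
    (an assumption: the adapter file below targets SD 1.5).
    """
    # Heavy dependencies are imported lazily, only when a pipeline is built.
    import torch
    from diffusers import AutoPipelineForText2Image

    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(base_model, torch_dtype=dtype)
    pipe = pipe.to(device)
    # Assumed repo and weight file; no training happens anywhere in this script.
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
    )
    return pipe


def generate_in_reference_style(pipe, prompt, reference_image, scale=0.6):
    """Generate one image, conditioned on a single reference photo."""
    # scale controls how strongly the reference photo steers the output.
    pipe.set_ip_adapter_scale(scale)
    result = pipe(prompt=prompt, ip_adapter_image=reference_image)
    return result.images[0]


# Example usage (paths and model id are placeholders):
#   from diffusers.utils import load_image
#   pipe = build_ip_adapter_pipeline("path/or/id/of/an/sd15/checkpoint")
#   ref = load_image("one_doorbell_frame.jpg")
#   img = generate_in_reference_style(pipe, "a person at a front door at night", ref)
#   img.save("out.png")
```

The reference photo is only passed in at generation time, which is the whole point: swapping to a different look means swapping one image file, not retraining anything.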