• 0 Posts
  • 21 Comments
Joined 7 months ago
Cake day: December 5th, 2023

  • A lot of this stems from instances running old versions with loose registration requirements, like no captcha. This is a problem in a federated system because there’s no barrier for a banned user to just jump to another instance.

    Perhaps it would be a good idea if, when Lemmy has anti-spam measures implemented like rate-limiting and captchas for registration, it disabled federation with instances that are at a lower version, to motivate small instances to upgrade and enable the new features.
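
    As a rough sketch of what such a version gate could look like (this is not Lemmy's actual code; `MIN_VERSION`, `parse_version`, and `should_federate` are hypothetical names, and the threshold is a placeholder for whichever release ships the anti-spam features), an instance could compare the version a peer reports against a minimum before accepting federation:

```python
from typing import Tuple

# Placeholder threshold (assumption): whichever release first ships the
# anti-spam features would go here.
MIN_VERSION = "0.19.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.19.3' or '0.19.3-beta.1' into a comparable tuple like (0, 19, 3)."""
    # Drop pre-release and build suffixes; a pre-release is treated the same
    # as its final release, which is a simplification.
    core = version.split("-", 1)[0].split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        if not piece.isdigit():
            raise ValueError(f"unparseable version: {version!r}")
        parts.append(int(piece))
    # Pad so that '0.19' compares equal to '0.19.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def should_federate(peer_version: str, minimum: str = MIN_VERSION) -> bool:
    """Return True only if the peer reports a version at or above the minimum."""
    try:
        return parse_version(peer_version) >= parse_version(minimum)
    except ValueError:
        # A peer whose version cannot be read is treated as outdated.
        return False
```

    A peer reporting "0.18.5" would be refused under this sketch, while "0.19.3" would be accepted. Note that a self-reported version is easy to fake, so this only nudges honest admins to upgrade.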






  • We’re already at that point. Even recipe sites, which I’ll give the benefit of the doubt and assume aren’t ML-generated, are so similar, boring, and irrelevant that nobody reads them.

    In the past few months, I’ve also noticed a lot of sites showing up in my Google search results that purport to be relevant or to answer my question, but turn out to be completely useless when I actually read them. For example, I couldn’t figure out how to take a friend’s Instagram story and reshare it to my own if I wasn’t tagged in it. Several pages were titled to look useful, but all of them offered only alternatives.










  • “copyright only protects them from people republishing their content”

    This is not correct. Copyright protects reproduction, derivation, distribution, performance, and display of a work.

    “People also ingest their content and can make derivative works without problem. OpenAI are just doing the same, but at a level of ability that could be disruptive to some companies.”

    Yes, you can legally make derivative works, but without a license it has to be fair use. In this case, they didn’t just use one whole work in its entirety; they likely scraped thousands of whole NYT articles.

    “This isn’t even really very harmful to the NYT, since the historical material used doesn’t even conflict with their primary purpose of producing new news.”

    This isn’t necessarily correct either. I assume they sell access to their archives, for research or whatever. Being able to retrieve articles verbatim through ChatGPT does harm their business.





  • Generally you’re correct, but copyright law does concern itself with learning. Fair use exemptions require consideration of the purpose and character of the use, explicitly mentioning nonprofit educational purposes. They also require consideration of the effect on the potential market for the original work. (There are other required factors, but they’re less relevant here.)

    So yeah, tracing a comic book to learn drawing is totally fine, as long as that’s what you’re doing it for. Tracing a comic to reproduce and sell is totally not fine, and that’s basically what OpenAI is doing here: slurping up whole works to improve their saleable product, which can generate new works to compete with the originals.


  • 17 USC § 106, exclusive rights in copyrighted works:

    Subject to sections 107 through 122, the owner of copyright under this title has the exclusive rights to do and to authorize any of the following:

    (1) to reproduce the copyrighted work in copies or phonorecords;

    (2) to prepare derivative works based upon the copyrighted work;

    (3) to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

    (4) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;

    (5) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and

    (6) in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission.

    Clearly, this is capable of reproducing a work, and is derivative of the work. I would argue that it’s displayed publicly as well, if you can use it without an account.

    You could argue fair use, but I doubt any of the four factors would weigh in favor of this use, let alone the balance of them.