@SheeEttin

SheeEttin@programming.dev · 8 months ago

A lot of this stems from instances running old versions with loose registration requirements, like no captcha. This is a problem in a federated system because there’s no barrier for a banned user to just jump to another instance.

Perhaps it would be a good idea if, once Lemmy implements anti-spam measures such as rate limiting and registration captchas, it disabled federation with instances running older versions. That would motivate small instances to upgrade and enable the new features.
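
To make the idea concrete, here is a rough sketch of what such a version gate might look like. This is not how Lemmy works today, as far as I know. The names `MIN_FEDERATION_VERSION`, `parse_version`, and `should_federate` are made up for illustration, the minimum version number is a placeholder, and how the remote instance's version string is obtained is left out entirely.

```python
import re

# Hypothetical cutoff: the first release that ships the anti-spam features.
# The number is a placeholder, not a real Lemmy release requirement.
MIN_FEDERATION_VERSION = (0, 19, 0)


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', ignoring any suffix such as '-rc.1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch))


def should_federate(remote_version: str,
                    minimum: tuple[int, int, int] = MIN_FEDERATION_VERSION) -> bool:
    """Return True only if the remote instance meets the minimum version.

    An unparseable version is treated as too old (an assumption of this sketch).
    """
    try:
        return parse_version(remote_version) >= minimum
    except ValueError:
        return False
```

Treating unparseable versions as too old is the strict choice. A softer rollout could log those instances for admin review instead of cutting them off.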

SheeEttin@programming.dev · 9 months ago

https://en.wikipedia.org/wiki/Tay_(chatbot)

SheeEttin@programming.dev · 9 months ago

That’s not busy work. Busy work, as explained in the article, is work that doesn’t really accomplish anything, like re-folding towels that have already been folded. Or as I’ve had to do before, sweep a perfectly spotless sidewalk. Data validation is valid work.

SheeEttin@programming.dev · 9 months ago

Only if you file suit and the court finds it enforceable. Sometimes they say you can sue anyway.

SheeEttin@programming.dev · 9 months ago

They should fix that, because it’s certainly degrading the experience on Lemmy. A good number of these replies have tags longer than their actual content.

SheeEttin@programming.dev · 9 months ago

We’re already at that point. Even recipe sites, which I’ll give the benefit of the doubt and assume aren’t ML-generated, are so similar, boring, and irrelevant that nobody reads them.

In the past few months, I’ve also noticed a lot of sites showing up in my Google search results purporting to be relevant or to answer my question, but when I actually read them, they turn out to be completely useless. For example, I couldn’t figure out how to take a friend’s Instagram story and reshare it to my own if I wasn’t tagged in it. Several pages were titled to look useful, but all of them gave only alternatives.

SheeEttin@programming.dev · 9 months ago

Yes, it’s fine.

If you have vote brigading, ban them, take it up with the instance admin, and defederate, in that order.

SheeEttin@programming.dev · 10 months ago

https://downforeveryoneorjustme.com/leminal.space

SheeEttin@programming.dev · 10 months ago

Big trucks aren’t necessarily all that heavy. The bed is entirely empty space, remember.

SheeEttin@programming.dev · 10 months ago

Broadcom is so good at it, they wrecked VMware years before even completing the acquisition.

SheeEttin@programming.dev · 10 months ago

Yeah but that’s a waste of light. Why use a floodlight when you can use a laser?

SheeEttin@programming.dev · 10 months ago

It’s not

SheeEttin@programming.dev · 10 months ago

Right, but it’s not a pure list of facts. When you set it to paper, it’s unique, and you could argue it’s art. In fact, a quick Google search found one such example: https://www.saatchiart.com/art/Painting-Shopping-list-1/2146403/10186433/view

Granted, that one was presumably intended to be a work of art on creation and your weekly shopping list isn’t, but the intent during creation isn’t all that important for US copyright law. You create it, you get the rights.

SheeEttin@programming.dev · 10 months ago

I’m not aware of any federal case law on copyright and AI. Happy to read some if you have a suggestion.

SheeEttin@programming.dev · 10 months ago

> copyright only protects them from people republishing their content

This is not correct. Copyright protects reproduction, derivation, distribution, performance, and display of a work.

> People also ingest their content and can make derivative works without problem. OpenAI are just doing the same, but at a level of ability that could be disruptive to some companies.

Yes, you can legally make derivative works, but without a license, it has to be fair use. In this case, not only did they use a whole work in its entirety, they likely scraped thousands of whole NYT articles.

> This isn’t even really very harmful to the NYT, since the historical material used doesn’t even conflict with their primary purpose of producing new news.

This isn’t necessarily correct either. I assume they sell access to their archives, for research or whatever. Being able to retrieve articles verbatim through ChatGPT does harm their business.

SheeEttin@programming.dev · 10 months ago

That is not correct. Copyright subsists in all original works of authorship fixed in any tangible medium of expression. https://www.law.cornell.edu/uscode/text/17/102

Legally, when you write your shopping list, you instantly have the rights to that work, no publication or registration necessary. You can choose to publish it later, or not at all, but you still own the rights. Someone can’t break into your house, look at your unpublished works, copy them, and publish them as if they were their own originals.

SheeEttin@programming.dev · 10 months ago

There are issues other than publishing, but that’s the biggest one. They are not acting merely as a conduit for the work, though: they are ingesting it and deriving new work from it. The use of the copyrighted work is integral to their product, which makes it a big deal.

SheeEttin@programming.dev · 10 months ago

True. I fully expect that the court will rule against OpenAI here, because it very obviously does not meet any fair use exemption.

SheeEttin@programming.dev · 10 months ago

Generally you’re correct, but copyright law does concern itself with learning. Fair use exemptions require consideration of the purpose and character of the use, explicitly mentioning nonprofit educational purposes. The statute also lists the effect on the potential market for the original work. (There are other required factors, but they’re less relevant here.)

So yeah, tracing a comic book to learn drawing is totally fine, as long as that’s what you’re doing it for. Tracing a comic to reproduce and sell is totally not fine, and that’s basically what OpenAI is doing here: slurping up whole works to improve their saleable product, which can generate new works to compete with the originals.

SheeEttin@programming.dev · 10 months ago

17 USC § 106, exclusive rights in copyrighted works:

Subject to sections 107 through 122, the owner of copyright under this title has the exclusive rights to do and to authorize any of the following:

(1) to reproduce the copyrighted work in copies or phonorecords;

(2) to prepare derivative works based upon the copyrighted work;

(3) to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;

(4) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;

(5) in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and

(6) in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission.

Clearly, this is capable of reproducing a work, and is derivative of the work. I would argue that it’s displayed publicly as well, if you can use it without an account.

You could argue fair use, but I doubt this use would meet any of the four test factors, let alone all of them.