StarCoderData: a large-scale code dataset derived from the permissively licensed GitHub collection The Stack (v1.2) (Kocetkov et al., 2022), which applies deduplication and filtering of opted-out files. In addition to source code, the dataset includes supplementary resources such as GitHub Issues and Jupyter Notebooks (Li et al., 2023).
What about the AI that I run on my local GPU that is using a model trained on open source and public works?
It’s cool as hell to train models, don’t get me wrong, but if you use them as assistants, won’t you still slowly stop thinking?
So Nazgûl.
Feels like telling me not to use a calculator so I don’t forget how to add and subtract.
I’ve settled on a future model where AIs are familiars that level up from their experience more naturally and are less immediately omnipotent
Sounds like the rings of the Elves to me
This is very cool. Any advice a simple software engineer (me) could follow to practice the same?
Install LM Studio
Tell LM Studio to download the Apertus model: https://en.wikipedia.org/wiki/Apertus_(LLM)
Bob’s ur uncle.
Stick to 8B models for video cards with 8 GB of VRAM.
Thanks! I’ve always wanted an uncle bob, too!
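The 8B-for-8GB rule of thumb above is basically weight-size arithmetic: quantized weights take roughly (parameter count × bits per weight ÷ 8) bytes, plus some headroom for the KV cache and activations. A back-of-the-envelope sketch — the 1.5 GB overhead figure is an assumption for illustration; real KV-cache needs vary with context length and quantization:

```python
def model_vram_gb(params_billion: float, bits_per_weight: float,
                  overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate: quantized weight size plus a fixed
    headroom guess for KV cache and activations (assumed, not measured)."""
    weight_gb = params_billion * bits_per_weight / 8  # e.g. 8B @ 4-bit = 4 GB
    return weight_gb + overhead_gb

# An 8B model at 4 bits per weight: ~4 GB of weights plus headroom,
# which is why it fits comfortably on an 8 GB card.
print(round(model_vram_gb(8, 4), 1))  # 5.5
```

The same sketch shows why an unquantized (16-bit) 8B model does not fit: 16 GB of weights alone.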
your local model wouldn’t exist without sauron (openai)
People were making LLMs before openai/chatgpt tbf.
It’s the “destroy the environment and economy in an attempt to make something that sucks just enough to justify not paying people fairly so you can advertise to rich assholes gambling their generational wealth” that OpenAI invented for the LLMs.
what are those LLMs you mention that people are still using? never heard of them, sounds like a cop out
That is slightly less unethical than Claude or whatever, but it is still unethical.
Can you elaborate on why this is unethical?
I use 0.2 kWh of electricity to spend a day coding with this model:
https://en.wikipedia.org/wiki/Apertus_(LLM)
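That 0.2 kWh figure is just energy = average power × time. A sketch with assumed numbers — the 25 W average draw and the 8-hour day are illustrative placeholders, not measurements from the comment above:

```python
# Hypothetical figures: local inference is bursty, so average extra
# GPU draw over a coding day can be far below the card's peak wattage.
avg_power_watts = 25   # assumed average extra draw during a coding day
hours = 8              # assumed length of the coding day

energy_kwh = avg_power_watts * hours / 1000  # W * h -> kWh
print(energy_kwh)  # 0.2
```

At those assumed numbers a full day of local inference costs about as much electricity as leaving one bright light bulb on.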
It is still trained on open source code on GitHub. These code communities seemingly have no way to opt out of their free (libre) contributions being used as training data, nor does the resulting code generation contribute anything back to those communities. It is a form of license stripping. That’s just one issue.
Just because your inference running locally doesn’t use much electricity doesn’t mean you’ve sidestepped all of the other ethical issues surrounding LLMs.
It is not trained on open source code on GitHub.
But I can use it to analyze a datasheet and generate a library for an obscure module that I can then upload to Github and contribute to the community.
Apertus is most certainly trained on source code hosted on GitHub. It is laid out here in their technical report:
https://github.com/swiss-ai/apertus-tech-report
It uses a large dataset called The Stack, among others.
That’s not random GitHub accounts or “delicensing” anything. People had to opt IN to be part of “The Stack”. Apertus isn’t training itself from community code.
I’m tired of arguing with you about this, and you’re still wrong. It was opt-out, not opt-in, based initially on a GitHub crawl of 137M repos and 52B files before filtering & dedup.
But again, you’d have to set your project to public and your license to “anyone can take my code and do whatever they want with it” before it’d even be added to that list. That’s opt-in, not opt-out. I don’t see the ethical dilemma here. I’m pretty sure I’ve found ethical AI that produces good value for me and society, and I’m going to keep telling people about it and how to use it.