  • 30 years away from it (reduced from the original 100 years they provided only 5 years ago)

    More like: estimates on this are completely unreliable. That 100 years could just as well have been 1,000 years. It really meant “until an unpredictable technological paradigm shift happens”. “100 years in the future” is the “when we have warp drives and star gates” of estimates: “when we have advanced to the next level of technology, whenever that happens; treat 100 years as a safe minimum, not an actual year count”.

    30 years means “we can maybe see a potential path to this via hypothetical technological developments on the horizon”. It’s the classic “fusion is always 30 years away”. Until one time it isn’t, but that 30-year loop can repeat indefinitely if the hypotheticals don’t turn into reality. We thought “maybe that will work, once we put our minds to it”; it didn’t, so on to chasing the next path.

    I only know of one project with a real 100-year estimate: the Onkalo deep geological repository for spent nuclear fuel in Finland. It is expected to take 100 years to fill and is to be sealed in the 2120s, and that is an actual date. All the technology is known and the sealing process is known; it just happens to take a century to fill the repository bit by bit. Finland is a fairly stable country and the radiation hazard is so long-term that whatever government is in place in the 2120s will most likely seal the repository.

    Unless “we invent warp drives” happens before that and some new process for actually getting rid of the waste efficiently and very safely is found. (And no, that doesn’t include current recycling methods, since those aren’t good enough to get rid of this large an amount with a small enough risk of side harms. Surprise: Finland studied this as an alternative and simply decided “recycling is not good enough, simple enough, efficient enough or safe enough yet; bury it in a bedrock tomb”.)


  • The main issue comes from the GDPR. When one uses consent as the legal basis for collecting and using information, it has to be a free choice, so one can’t offer “pay us and we collect less information about you”. Hence “pay or consent” is blatantly illegal. Showing generic ads? You don’t need consent for that; the consent is “I vote with my browser address bar”. The thing is just that nobody wants to serve untracked ads anymore…

    So in this case DMA Article 5(2) is basically just a reinforcement and emphasis of an existing GDPR principle. From The Verge:

    “exercise their right to freely consent to the combination of their personal data.”

    From the regulation:

    1. The gatekeeper shall not do any of the following:
      (a) process, for the purpose of providing online advertising services, personal data of end users using services of third parties that make use of core platform services of the gatekeeper;
      (b) combine personal data from the relevant core platform service with personal data from any further core platform services or from any other services provided by the gatekeeper or with personal data from third-party services;
      (c) cross-use personal data from the relevant core platform service in other services provided separately by the gatekeeper, including other core platform services, and vice versa; and
      (d) sign in end users to other services of the gatekeeper in order to combine personal data,

    unless the end user has been presented with the specific choice and has given consent within the meaning of Article 4, point (11), and Article 7 of Regulation (EU) 2016/679.

    Surprise: Regulation (EU) 2016/679 is… the GDPR. So yes, it’s a new violation, but it pretty much amounts to “gatekeepers are under extra scrutiny for GDPR matters: violate it and we can charge you for both a GDPR and a DMA violation, plus the DMA adds some extra rules and explicitness”.

    I think the GDPR technically already bans combining data without permission, since it demands consent for every use case under consent-based processing. There must be consent for processing; combining is processing, so it needs consent. However, that is an interpretation of the GDPR’s general principle. The DMA just makes it explicit: “these specific kinds of processing, yes, these need consent per the GDPR”. It also rules out arguing the “legitimate interest” legal basis for this processing, explicitly ruling that these types of processing don’t fall under legitimate interest for these companies and are only and explicitly consent-based actions.


  • That is just its core function doing its thing: transforming inputs to outputs based on learned pattern matching.

    It may not have been trained on translation explicitly, but it has very much been trained on “these things match” via its training material. You know what its training set most likely contained… dictionaries, which is as good as asking it to learn translation. Another thing most likely in the training data: language course books, with matching translated sentences in them. You didn’t explicitly tell it to learn to translate, but in practice the training data selection did it for you.
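    To make that concrete, here is a minimal sketch (with made-up Finnish–English word pairs) of how a scraped dictionary page becomes implicit translation supervision. Nothing in it says “learn to translate”, but next-token prediction on such lines teaches exactly that mapping:

    ```python
    # Hypothetical example: a bilingual dictionary page as a web crawl sees it.
    dictionary_page = [
        ("kissa", "cat"),
        ("koira", "dog"),
        ("talo", "house"),
    ]

    # Flattened into plain training text, each line is a tiny translation lesson:
    # to predict the token after "kissa - ", the model must produce "cat".
    corpus_lines = [f"{fi} - {en}" for fi, en in dictionary_page]
    for line in corpus_lines:
        print(line)
    ```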




  • Well, the difference is that you have to know coding to know whether the AI produced what you actually wanted.

    Anyone can read a letter and tell whether the AI hallucinated or actually produced what was wanted.

    With code, it might produce something that on first try does what you asked. However, it turns out the AI hallucinated a bug into the code for some edge or special case.

    Hallucinating is not a minor hiccup or a minor bug; it is a fundamental feature of LLMs, because an LLM isn’t actually smart. It is a stochastic regurgitator. It doesn’t know what you asked or understand what it is doing; it matches prompt patterns to outputs. With enough training patterns to match, it statistically usually ends up about right, but this is not guaranteed, and that is the main weakness of the system. More good training data makes good results more likely. However, for business-critical stuff you aren’t interested in whether it got it about right the other 99 times: it has to get it right, 100%, this one time, since this code goes into a production business deployment.
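    A quick back-of-the-envelope illustration of why “about right most of the time” doesn’t cut it (the numbers are assumed, purely to show the compounding): even a 99% per-task success rate collapses fast across many independent critical outputs.

    ```python
    # Assumed figures for illustration only: if each generated piece of code is
    # independently correct with probability 0.99, the chance that all 100
    # business-critical outputs are correct is 0.99 ** 100.
    p_correct = 0.99
    n_tasks = 100
    print(p_correct ** n_tasks)  # ~0.366: only about a 37% chance of zero failures
    ```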

    I guess one could write a comprehensive enough verified test suite, including all the edge cases, and verify the result with that. However, now you have just shifted the job: instead of a programmer programming the program, you have a programmer programming the very, very comprehensive testing routines. And those can’t be done by the LLM, since the whole point of the testing routines is to check for the inherent unreliability of the LLM output.
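    As a minimal sketch of what that shifted job looks like, assume a hypothetical LLM-generated function `parse_price`. The human-written tests, edge cases included, are what actually carry the specification:

    ```python
    import unittest

    def parse_price(text: str) -> float:
        # Stand-in for the LLM-generated code under test (hypothetical).
        return float(text.strip().lstrip("$").replace(",", ""))

    class TestParsePrice(unittest.TestCase):
        def test_plain(self):
            self.assertEqual(parse_price("$19.99"), 19.99)

        def test_thousands_separator(self):
            self.assertEqual(parse_price("$1,234.50"), 1234.50)

        def test_surrounding_whitespace(self):
            self.assertEqual(parse_price("  $5  "), 5.0)

        def test_malformed_input_rejected(self):
            # Exactly the kind of edge case an LLM plausibly gets wrong:
            # garbage must raise, not silently become a number.
            with self.assertRaises(ValueError):
                parse_price("nineteen dollars")

    if __name__ == "__main__":
        unittest.main()
    ```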

    It’s a nice toy for someone wanting to make quick and dirty test code (maybe) to do thing X, and then try to find out whether it actually does what was asked or has unforeseen behavior, since I didn’t write the code and don’t know what its behavior is designed to be. Good for toying around and maybe for quick and dirty brainstorming. Not good enough for anything critical that has to be guaranteed to work under the promise of a service contract and so on.

    So the real big job of the future will not be prompt engineering, but quality assurance and testing engineers who have to be around to guard against hallucinating LLMs and similar AIs. Prompts can be gotten from anyone; what is harder is finding out whether the prompt actually produced what it was supposed to produce.