How could an artificial intelligence (meaning large-language-model-based generative AI) be better for information access and retrieval than an encyclopedia with a clean classification model and a search engine?
If we add a processing step in which a genAI "digests" perfectly structured data and tries, however imperfectly, to regurgitate things it doesn't understand, aren't we just adding noise?
I'm talking about specific use cases like "draw me a picture explaining how a pressure regulator works", or "can you explain to me how to code a recursive pattern matching algorithm, please?".
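For that second use case, the kind of answer one might hope for is something like the following minimal recursive matcher. This is only a sketch in Python, supporting just "." (any character) and "*" (zero or more of the preceding element) in the style of a toy regex engine; it is my own illustration, not any particular model's actual output:

    def match(pattern: str, text: str) -> bool:
        """Recursively match `text` against `pattern`, where '.' matches any
        single character and 'x*' matches zero or more occurrences of x."""
        if not pattern:
            return not text
        # Does the first character of the text match the first pattern element?
        first = bool(text) and pattern[0] in (text[0], ".")
        if len(pattern) >= 2 and pattern[1] == "*":
            # Either skip the 'x*' element, or consume one character and retry.
            return match(pattern[2:], text) or (first and match(pattern, text[1:]))
        return first and match(pattern[1:], text[1:])

    assert match("ab*c", "abbbc")
    assert not match("a.c", "abd")

The recursion bottoms out when the pattern is exhausted; the "*" case branches between consuming input and dropping the starred element, which is what makes the matcher recursive rather than a simple left-to-right scan.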
I also understand how it can help people who don't want to, or can't, make the effort to learn an encyclopedia's classification plan, or how a search engine's syntax works.
But on a fundamental level, aren't we just adding an uncontrollable noise-injection step to a decent, time-tested information flow?
That can be solved if you teach the model the meta-knowledge with intermediate steps, for example:
    prompt: 34*3 =
    step 1: 4*3 + 30*3
    step 2: 12 + 10*3*3
    step 3: 12 + 10*9
    step 4: 12 + 90
    step 5: 100 + 2
    step 6: 102
    result: 102
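For what it's worth, traces like this can also be generated synthetically rather than collected. Here is a tiny Python sketch of one way to do that; the output format and the `multiplication_trace` name are my own invention, not any known training pipeline:

    def multiplication_trace(a: int, b: int) -> str:
        """Emit a step-by-step decomposition of a*b over a's decimal digits,
        loosely in the spirit of the hand-written example above."""
        digits = [int(d) for d in str(a)]
        n = len(digits)
        # Per-digit partial products, lowest place value first: 34*3 -> 4*3 + 30*3.
        terms = [f"{d * 10 ** (n - 1 - i)}*{b}" for i, d in enumerate(digits)][::-1]
        partials = [d * 10 ** (n - 1 - i) * b for i, d in enumerate(digits)][::-1]
        lines = [
            f"prompt: {a}*{b} =",
            f"step 1: {' + '.join(terms)}",
            f"step 2: {' + '.join(str(p) for p in partials)}",
            f"step 3: {sum(partials)}",
            f"result: {a * b}",
        ]
        return "\n".join(lines)

    print(multiplication_trace(34, 3))
    # prompt: 34*3 =
    # step 1: 4*3 + 30*3
    # step 2: 12 + 90
    # step 3: 102
    # result: 102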
It's hard to find such training data, though. But Claude, for example, already uses intermediate steps: it preprocesses your input multiple times, writes code, runs that code to process your input, and even that still isn't the final response. Unfortunately, it's already smarter than some junior developers, and the consequences of that are worrying.
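To make the "writes code, runs code" pattern concrete, here is a rough sketch of that loop. The `generate` parameter stands in for any text-generation call; everything here is an assumption for illustration and says nothing about how Claude actually works internally:

    import subprocess
    import sys
    import tempfile

    def answer_with_code(question: str, generate) -> str:
        """Sketch of the 'draft a program, execute it, then answer' pattern.
        `generate` is a placeholder callable: prompt string in, text out."""
        # Step 1: the model drafts a small program for the question.
        code = generate(f"Write a Python script that computes: {question}")
        # Step 2: run the draft in a subprocess and capture its output.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=10
        )
        # Step 3: the model writes the final response using the tool output.
        return generate(
            f"Question: {question}\nProgram output: {result.stdout}\nAnswer:"
        )

The point of the intermediate execution step is that the model's final answer is conditioned on verified program output rather than on its own first guess.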