The Library of Babel
I have always been fascinated by “The Library of Babel”, the short story by Borges: https://www.amazon.com/Library-Babel-Jorge-Luis-Borges/dp/9333185879/
and I think it is an interesting story to revisit as we try to understand and make sense of LLMs and ChatGPT.
If you have not read it and want to, please stop reading this post, read the story, and come back, as I am going to spoil the main plot in the next section.
In this short story, the author describes the universe as a vast and virtually infinite library. This library contains every possible 410-page book of a certain format and character set. Essentially, the library holds all possible combinations of 25 symbols: 22 letters, the comma, the period, and the space.
The librarians claim it contains all possible combinations of their symbol set and thus all knowledge. Because the number of possible combinations is so astronomically large, almost all the books are complete gibberish. The library's inhabitants go on various quests to search for meaningful books, creating theories and cults around the search. Some hope to find a book that indexes all the others, some seek their own personal histories, and some even worship the gibberish books.
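To get a feel for the scale, here is a rough back-of-the-envelope count of the books in the library. The 25 symbols and 410 pages come from the description above; the 40 lines per page and 80 characters per line come from Borges's text (which says "some eighty" letters per line) and are treated here as exact, which is an assumption. The function names are mine and purely illustrative.

```python
import math

SYMBOLS = 25          # 22 letters, comma, period, space
PAGES = 410           # pages per book
LINES_PER_PAGE = 40   # from Borges's text, not stated in this post
CHARS_PER_LINE = 80   # "some eighty" in Borges's text; assumed exact here


def chars_per_book():
    """Number of character positions in a single book."""
    return PAGES * LINES_PER_PAGE * CHARS_PER_LINE


def digits_in_book_count():
    """Number of decimal digits in SYMBOLS ** chars_per_book().

    Uses logarithms so the huge number itself is never built.
    """
    return math.floor(chars_per_book() * math.log10(SYMBOLS)) + 1


if __name__ == "__main__":
    print("characters per book:", chars_per_book())
    print("digits in the number of books:", digits_in_book_count())
```

Under these assumptions each book has 1,312,000 characters, and the number of distinct books is a number with roughly 1.8 million digits. That is finite, but far too large for any search by hand to find the meaningful books.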
The narrator concludes that the library has no end. He implies that the library is essentially chaotic, that any order or meaningfulness is imposed arbitrarily by human minds. The story thus explores themes of infinity, information overload, and the human desire for pattern and meaning.
Although there are fundamental differences in how GPT is constructed and the way it works (GPT does not have pre-stored outputs and generates responses dynamically), looking at GPT from a Borgesian perspective helps make sense of its intelligence (and lack thereof), and is instructive on how to use it and why not to take its responses for the truth. At least it helped me!
If we imagine a library where every possible book that can be written exists, then GPT can be viewed as a mere search through these infinite possibilities. It is not a random search but one guided by the traces left by every past conversation, every past story, every documented moment of human life used in its training.
So what a user of GPT does is provide the beginning of a book, a first sentence, and let GPT find the most probable book, using as a trace every sentence that has been written before.
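The idea of "finding the most probable book by following traces" can be illustrated with a deliberately tiny sketch. This is not how GPT works internally (GPT uses a neural network over tokens and usually samples rather than always taking the top choice); it is a toy word-pair counter, with a made-up three-sentence corpus and hypothetical function names, that only shows the principle of continuing a prompt along the most travelled path.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words followed it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def continue_text(follows, prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # no trace to follow: this word was never continued
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Toy "training data": the traces left by what was written before.
corpus = [
    "the library contains every book",
    "the library contains every possible page",
    "every book is a combination of symbols",
]
follows = train_bigrams(corpus)

if __name__ == "__main__":
    print(continue_text(follows, "the library", max_words=3))
```

With this corpus, the prompt "the library" is continued as "the library contains every book", because "book" followed "every" more often than "possible" did. A prompt made of a word the corpus never saw is returned unchanged: there is no trace to follow.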
Most of the time, ChatGPT responses make sense and even look magical.
But it does produce wrong or made-up results from time to time, which people call hallucinations - a better marketing term for failure or error.
Keep in mind that these foundation models have been trained on much of what is on the internet, where one finds news and stories but also fiction and … crap.
When you’re using ChatGPT, you’re embarking on a journey to find the book that answers your question, following the traces of everything that has been written or tried before on the internet.
There are use cases which are fairly safe: as a non-native English speaker, I use it to check my spelling and grammar, and sometimes to translate. It is also great for “non-additive tasks” such as summarization and text simplification.
Any use case that relies heavily on “generation”, i.e. advice or recommendation systems and fact checking, especially in domains with a long history of bias (medical, legal, HR …), would require a full solution around GPT including grounding, fine-tuning, reinforcement learning, ensembles … there is an entire industry building guardrails.
It works especially well when taking roads that many people have taken before.
I started this AI blog journey with a literature analogy and would like to close with another masterpiece, this time of poetry: The Road Not Taken (https://www.poetryfoundation.org/poems/44272/the-road-not-taken)
@dominiq