I’m still trying to wrap my head around this new generation of AI tools, mostly ChatGPT and Midjourney, as those are the big tools my coworkers and peers are using. But during my lunch break, I was reading a long-form post written from a history professor’s perspective on the technology. The author, Bret Devereaux, did a great job, in my opinion, of simplifying how ChatGPT was created and, ultimately, what it is doing:
It is crucial to note, however, what the data is that is being collected and refined in the training system here: it is purely information about how words appear in relation to each other. That is, how often words occur together, how closely, in what relative positions and so on. It is not, as we do, storing definitions or associations between those words and their real world referents, nor is it storing a perfect copy of the training material for future reference. ChatGPT does not sit atop a great library it can peer through at will; it has read every book in the library once and distilled the statistical relationships between the words in that library and then burned the library.
ChatGPT does not understand the logical correlations of these words or the actual things that the words (as symbols) signify (their ‘referents’). It does not know that water makes you wet, only that ‘water’ and ‘wet’ tend to appear together and humans sometimes say ‘water makes you wet’ (in that order) for reasons it does not and cannot understand.
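To make that idea concrete for myself, I sketched a tiny toy in Python. This is purely my own illustration, not anything from the article (real models use neural networks and track far richer relationships than simple bigram counts): it reads a handful of sentences, keeps only counts of which word follows which, deletes the original text, and then generates new output from the statistics alone.

```python
# A toy illustration of the "statistics, not understanding" idea:
# count which word tends to follow which, throw the text away,
# and generate new sentences from the counts alone.
import random
from collections import defaultdict

corpus = [
    "water makes you wet",
    "rain makes you wet",
    "water is wet",
    "the sun makes you warm",
]

# "Read the library": record how often each word follows another.
follows = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1

# "Burn the library": from here on, only the counts survive.
del corpus

def next_word(word):
    """Pick a follower, weighted by how often it appeared after `word`."""
    candidates = follows.get(word)
    if not candidates:
        return None
    options, weights = zip(*candidates.items())
    return random.choices(options, weights=weights)[0]

# Generate a few words starting from "water". The program has no idea
# what water *is*; it only knows which words tended to come next.
word, output = "water", ["water"]
while word and len(output) < 6:
    word = next_word(word)
    if word:
        output.append(word)
print(" ".join(output))
```

The output can read like fluent English about water and wetness, even though nothing in the program has any notion of what water is. That, as I understand it, is the quoted point in miniature.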
I highly recommend checking out the whole article. I think one of the important things we need to keep in mind is how these AI tools are created, because that can help us better understand how to use them. This explanation breaks it down in a really simple way, and it helps me understand how ChatGPT works and why it sometimes makes the kinds of mistakes it has in the past (mistakes covered in plenty of tech YouTube videos; I really like the one MKBHD made). I’d like to write more about this topic in the future, but right now I’m still trying to get my head around the technology.