One Word At A Time, Or A Note About ChatGPT Principles
ChatGPT is indeed a hot topic nowadays - so when I shared a note about Slack jumping on the ChatGPT bandwagon in our work chat, a colleague recommended an article about why and how it works.
I gave it a try (and it's A LOT, for me at least) and even though I honestly skimmed through a large part of it, I think it's still a worthy read - even if just for the parts that still make some sense.
Hard Things Simple (Or Not)
So the article itself is here: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
As I've mentioned above, it's a lot - and I'm not going to replicate any of it here. Instead, here's why I think it's worth reading and how it could be consumed if it gets overwhelming at some point.
It emphasises an important idea (one my own attempts to use ChatGPT also highlight): it's not a search engine, a knowledge base, or anything else we humans might think it is given the results it produces. Instead, it's a complex probabilistic model that selects the next word to put in a sentence based on probabilities calculated from the very large amount of text it's been trained on.
However, it's built on neural networks, so in a way the mechanisms underneath are trying to replicate how the human brain works - so in a certain sense it actually is based on what we humans might think it is.
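To make the "one word at a time" idea concrete, here's a minimal toy sketch of next-word generation. This is NOT what ChatGPT actually does - its probabilities come from a huge neural network trained on vast amounts of text, not a tiny hand-made table - and the names (`NEXT_WORD_PROBS`, `next_word`, `generate`) are purely illustrative:

```python
import random

# Toy "language model": for each word, the probabilities of the next word.
# In ChatGPT these probabilities come from a neural network; here they are
# just a small hand-made table, purely for illustration.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.5, "dog": 0.3, "best": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"ran": 0.7, "sat": 0.3},
    "sat": {"quietly": 1.0},
    "ran": {"away": 1.0},
}

def next_word(word, temperature=1.0):
    """Pick the next word at random, weighted by its probability.

    A higher temperature flattens the distribution (more surprising
    choices); a lower one sharpens it (more predictable text).
    """
    candidates = NEXT_WORD_PROBS.get(word)
    if not candidates:
        return None  # no known continuation: stop generating
    words = list(candidates)
    weights = [p ** (1.0 / temperature) for p in candidates.values()]
    return random.choices(words, weights=weights, k=1)[0]

def generate(start, max_words=10, temperature=1.0):
    """Build a sentence one word at a time, as the article describes."""
    sentence = [start]
    while len(sentence) < max_words:
        word = next_word(sentence[-1], temperature)
        if word is None:
            break
        sentence.append(word)
    return " ".join(sentence)

print(generate("the"))  # e.g. "the cat sat quietly"
```

The temperature knob is the same trick the article mentions: always picking the single most probable word produces flat, repetitive text, so a bit of weighted randomness is deliberately mixed in.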
The article goes - at least for someone without any neural network background, like myself - fairly deep into the theoretical part, which at some point becomes overwhelming and incomprehensible.
So I found it useful to start from the beginning and read as far as my understanding of the text would take me, and once it became too much, skip to the last part of the article - roughly around the "So … What Is ChatGPT Doing, and Why Does It Work?" chapter.
That last chapter provides some important thoughts on the subject - such as (just a few quotes):
it’s just saying things that “sound right” based on what things “sounded like” in its training material.
this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought.
unlike even in typical algorithmic computation, ChatGPT doesn’t internally “have loops” or “recompute on data”. And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain.
Even though it's easy to dismiss the ChatGPT phenomenon as "just a probabilistic model", it's an impressive achievement that is quite certainly a huge milestone in the development of language processing, neural networks, and AI in general.