Large language models (LLMs) use tokens as the basic units of input and output when processing language. Tokens are used both when training the model (reading text) and when asking an LLM to respond to a prompt.
During training, tokens are fragments of language such as words, subwords, characters, or punctuation marks. The LLM learns how these tokens relate to each other, and in what frequencies and contexts they appear.
When generating a response, the LLM turns the prompt into tokens to understand its meaning, then uses the patterns it has learned from tokenized data to create the response.
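To make "turning text into tokens" concrete, here is a minimal toy sketch of subword tokenization. The vocabulary and the greedy longest-match strategy are invented for illustration; real LLM tokenizers (such as BPE-based ones) learn their vocabularies and merge rules from large amounts of training text.

```python
def tokenize(text, vocab):
    """Greedily split text into the longest pieces found in the vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, shrinking until one matches.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Fall back to a single character if nothing in the vocab matches.
            tokens.append(text[i])
            i += 1
    return tokens

# A hypothetical toy vocabulary of subword pieces.
vocab = {"token", "iz", "ation", " is", " fun"}
print(tokenize("tokenization is fun", vocab))
# → ['token', 'iz', 'ation', ' is', ' fun']
```

Notice that one word ("tokenization") can split into several tokens, while a short common word (" is") stays whole — this is why token counts rarely match word counts.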
#chatgpt #LLM #token #tokens