The parameters in Chat and Text completion are:
- Model: The name of the language model used to generate the text. For example, `gpt-3.5-turbo` or `text-davinci-003`.
- System Message Template (Chat only): A template that defines how system messages are formatted in the chat transcript. For example, `{"role": "system", "content": "..."}`.
- Body Message Template: A template that defines how user and assistant messages are formatted in the chat transcript. For example, `{"role": "user", "content": "..."}` or `{"role": "assistant", "content": "..."}`.
- Maximum Tokens: The maximum number of tokens the model can generate. A token is a unit of text, such as a word, part of a word, or a punctuation mark. For example, `max_tokens = 2000`.
- Temperature: A parameter that controls the randomness of the text generation. A higher temperature gives more diverse and creative output, but also more errors and inconsistency. A lower temperature gives more predictable and coherent output, but it can also be more repetitive. For example, `temperature = 0.8`.
- Nucleus sampling (top_p): A parameter that sets the cumulative probability threshold for selecting the next token. The model considers only the most likely tokens, taken in order of decreasing probability until their combined probability reaches this value, and ignores the rest. This helps to avoid low-probability tokens that can lead to nonsensical text. For example, `top_p = 0.9`.
- Frequency Penalty: A parameter that penalizes tokens in proportion to how often they have already appeared in the generated text. A higher frequency penalty means less repetition, but also more risk of losing coherence and relevance. A lower frequency penalty means more repetition, but also more consistency and fluency. For example, `frequency_penalty = 0.5`.
- Presence Penalty: A parameter that penalizes tokens that have already appeared in the chat transcript, regardless of how many times they appeared. A higher presence penalty means less reuse of previous words and phrases, but also more risk of losing context and continuity. A lower presence penalty means more reuse of previous words and phrases, but also more risk of sounding redundant. For example, `presence_penalty = 0.2`.
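To make the sampling parameters above more concrete, here is a minimal Python sketch of how temperature, top_p, and the two penalties could act on a toy next-token distribution. It is an illustration under stated assumptions, not the model's actual implementation: the function names (`apply_penalties`, `softmax_with_temperature`, `nucleus_filter`), the toy vocabulary, and the logit values are all hypothetical, and the penalty formula (subtract the frequency penalty once per prior occurrence and the presence penalty once if the token has appeared at all) is the commonly documented form, assumed here.

```python
import math


def apply_penalties(logits, counts, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that were already generated.

    Assumed formula: logit - count * frequency_penalty - presence_penalty (if count > 0).
    """
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        presence = presence_penalty if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - presence
    return adjusted


def softmax_with_temperature(logits, temperature=1.0):
    """Turn scores into probabilities; temperature must be > 0 in this sketch."""
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def nucleus_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    # Renormalize so the surviving tokens form a valid distribution.
    return {token: prob / total for token, prob in kept}


# Hypothetical toy scores for the next token, plus counts of tokens already generated.
toy_logits = {"the": 2.0, "a": 1.0, "cat": 0.5, "xylophone": -3.0}
already_generated = {"the": 3}

penalized = apply_penalties(toy_logits, already_generated,
                            frequency_penalty=0.5, presence_penalty=0.2)
probs = softmax_with_temperature(penalized, temperature=0.8)
candidates = nucleus_filter(probs, top_p=0.9)
print(candidates)
```

In this sketch the penalties are applied to the raw scores first, temperature then sharpens or flattens the resulting distribution, and top_p finally trims the unlikely tail before a token would be sampled. Actual services may order or implement these steps differently.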