# OpenAI Python Library

## Install the OpenAI Library
```bash
# install from PyPI
pip install --upgrade openai
```
## Import the relevant Python Libraries and Load the OpenAI API Key
If you don’t have an API key, get your API key here.
```python
import openai
import os

from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file
openai.api_key = os.getenv('OPENAI_API_KEY')  # more secure
```
Here, we use a `.env` file to store our OpenAI API key. Alternatively, you can set `openai.api_key` to a plain-text value:
```python
openai.api_key = "sk-..."  # less secure (not recommended)
```
## Use the gpt-3.5-turbo model and the chat completion API
A chat API call has two required inputs:

- `model`: the name of the model you want to use (e.g., gpt-3.5-turbo)
- `messages`: a list of message objects, where each object has at least two fields:
    - `role`: the role of the messenger (either system, user, or assistant)
    - `content`: the content of the message (e.g., "Write me a beautiful poem")
Typically, a conversation will start with a system message, followed by alternating user and assistant messages, but you are not required to follow this format.
Let’s look at an example chat API call to see how the chat format works in practice.
```python
MODEL = "gpt-3.5-turbo"

response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    temperature=0,
)
response
```
```
<OpenAIObject chat.completion id=chatcmpl-6pjrV9CvZ2ivOSxzZrBdEidUB6xfs at 0x13362cf90> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Orange who?",
        "role": "assistant"
      }
    }
  ],
  "created": 1677789041,
  "id": "chatcmpl-6pjrV9CvZ2ivOSxzZrBdEidUB6xfs",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 5,
    "prompt_tokens": 38,
    "total_tokens": 43
  }
}
```
As you can see, the response object has a few fields:

- `id`: the ID of the request
- `object`: the type of object returned (e.g., chat.completion)
- `created`: the timestamp of the request
- `model`: the full name of the model used to generate the response
- `usage`: the number of tokens used to generate the replies, counting prompt, completion, and total
- `choices`: a list of completion objects (only one, unless you set n greater than 1)
    - `message`: the message object generated by the model, with role and content
    - `finish_reason`: the reason the model stopped generating text (either stop, or length if the max_tokens limit was reached)
    - `index`: the index of the completion in the list of choices
Let’s get the completion:
```python
print(response.choices[0].message.content)
```
Orange who?
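The other response fields listed above can be read the same way. As a minimal sketch (assuming the `response` object from the call above and the pre-1.0 `openai` client used throughout this page), you could inspect the token usage and finish reason like this:

```python
# inspect other fields of the same response object (illustrative)
print(response.usage.total_tokens)        # prompt + completion tokens, e.g. 43 above
print(response.choices[0].finish_reason)  # "stop" if the model finished naturally
print(response["model"])                  # dict-style access also works on the response object
```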
```python
# example without a system message
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "What's the capital of Canada?"},
    ],
    temperature=0,
)

print(response['choices'][0]['message']['content'])
```
The capital of Canada is Ottawa.
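The `choices` list holds a single completion by default. As a hedged sketch (reusing the same `MODEL` and client set up earlier; the prompt is illustrative), passing `n` greater than 1 returns multiple choices you can iterate over:

```python
# request two completions in one call; n controls the number of choices returned
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[{"role": "user", "content": "Name a primary color."}],
    n=2,
    temperature=1,
)
for choice in response.choices:
    print(choice.index, choice.message.content)
```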
## Counting Tokens
When you submit your request, the API transforms the messages into a sequence of tokens.
The number of tokens used affects:
- the cost of the request
- the time it takes to generate the response
- whether the reply gets cut off for hitting the maximum token limit (4,096 tokens for gpt-3.5-turbo)
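You can count tokens locally with the tiktoken library before sending a request. As a quick illustrative sketch (the example string is arbitrary), encode a string with the cl100k_base encoding used by gpt-3.5-turbo:

```python
import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo
encoding = tiktoken.get_encoding("cl100k_base")
tokens = encoding.encode("Knock knock. Who's there?")
print(tokens)       # list of integer token IDs
print(len(tokens))  # number of tokens in the string
```

Counting tokens for a whole chat request also has to account for the message formatting, which the helper below handles: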
```python
import tiktoken


def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    """Returns the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")
    if model == "gpt-3.5-turbo-0301":  # note: future models may deviate from this
        num_tokens = 0
        for message in messages:
            num_tokens += 4  # every message follows <im_start>{role/name}\n{content}<im_end>\n
            for key, value in message.items():
                num_tokens += len(encoding.encode(value))
                if key == "name":  # if there's a name, the role is omitted
                    num_tokens += -1  # role is always required and always 1 token
        num_tokens += 2  # every reply is primed with <im_start>assistant
        return num_tokens
    else:
        raise NotImplementedError(f"""num_tokens_from_messages() is not presently implemented for model {model}.
See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.""")
```
```python
messages = [
    {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
    {"role": "system", "name": "example_user", "content": "New synergies will help drive top-line growth."},
    {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
    {"role": "system", "name": "example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
    {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
    {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
]
```
```python
# example token count from the function defined above
print(f"{num_tokens_from_messages(messages)} prompt tokens counted.")
```
126 prompt tokens counted.
```python
# example token count from the OpenAI API
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=messages,
    temperature=0,
)

print(f'{response["usage"]["prompt_tokens"]} prompt tokens used.')
```
126 prompt tokens used.
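The local estimate and the API’s reported count agree here. As a small follow-up sketch (reusing `messages`, `response`, and the helper defined above), you can check the two programmatically:

```python
# compare the local estimate against the API's reported prompt token count (illustrative)
local_count = num_tokens_from_messages(messages)
api_count = response["usage"]["prompt_tokens"]
print(f"local estimate: {local_count}, API reported: {api_count}, match: {local_count == api_count}")
```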
See the API reference and the OpenAI Cookbook for more details.