OpenAI Few Shot Prompt Engineering
What is Few Shot Prompt Engineering?
Prompt engineering is emerging as an important discipline for working with and optimizing the output of large language models (LLMs). Among the several techniques for chatting with an OpenAI GPT model, few shot prompting is one of the most effective ways to coach the LLM and help it deliver replies with greater fidelity.
The paper Language Models are Few-Shot Learners explains how the few shot technique improves the NLP performance of GPT-3.
With the few shot approach, the user supplies the model with a few examples of how she would like the LLM to respond. Instead of conceptually explaining the reply she’s looking for, the user simply feeds the LLM a few choice examples. Few shot prompting is a powerful technique, enabling customization of a large language model’s behavior without fine-tuning it or updating its weights.
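As a quick illustration, a bare-bones few shot prompt for sentiment labeling simply interleaves example texts with the desired answers and ends with the new input for the model to complete (the reviews below are invented for illustration only):

Text: "The battery easily lasts a full day and it charges quickly."
Positive

Text: "The strap snapped after two days of light use."
Negative

Text: "Great picture, terrible speakers."

The model is expected to continue the pattern and answer with a single label.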
In this post we’ll show an example of this technique by prompting OpenAI to classify movie reviews as positive or negative. To help it along, we supply a handful of made-up review texts, along with the corresponding sentiment (positive/negative) for each one. Working with a CSV file of IMDB movie reviews from Kaggle, we then feed each record to OpenAI and ask it to label the review accordingly.
Kaggle user VolodymyrGavrysh describes the CSV file as: “data set to experiment with classification tasks; is small sub sample of IMDB.”
OpenAI Chat: Movie Reviews Classifier
import os
from openai import OpenAI

# Create the OpenAI client; the API key is read from the environment
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
Full source code for openai-fewshotprompting.py.
To access the OpenAI API, we need an account and the unique API key associated with it. There are several ways to get the key into our Python code. Here we store the key in a text file, then set its content as an environment variable for our script to read.
In a Linux environment, we store the API key in a file named openai_key and execute the script as follows:

export OPENAI_API_KEY=$(cat openai_key)
python openai-fewshotprompting.py
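If exporting the variable by hand is inconvenient, another common option is to keep the key in a local .env file and load it with the python-dotenv package; a minimal sketch, not part of the original script:

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads OPENAI_API_KEY=... from a .env file into the process environment
api_key = os.environ.get("OPENAI_API_KEY")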
import pandas as pd

# Load the IMDB reviews and map the numeric labels to readable strings
data_file = "imdb_10K_sentiments_reviews.csv"
df = pd.read_csv(data_file, encoding='utf-8')
df['label'] = df['label'].replace({0: 'Negative', 1: 'Positive'})
df.head()
Next we create a Pandas data frame, fill it with the movie review records from the IMDB CSV file, and replace the numeric labels with their text equivalents (Positive or Negative). We’ll instruct GPT to process these records, assessing the sentiment of each review and supplying its own label.
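To confirm that the relabeling took effect, we can print the value counts of the label column (a quick check, using the data frame built above):

# Expect two entries, Positive and Negative, with their respective counts
print(df['label'].value_counts())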
system_message = """
You are a binary classifier for sentiment analysis.
Given a text, based on its sentiment, you classify it into
one of two categories: positive or negative.
You can use the following texts as examples:

Text: "I am pleasantly surprised by this fountain pen, the weight and feel in the hand is substantial and the ink distribution is smooth and solid, no skips."
Positive

Text: "What was delivered was a burn piece of goat cheese and 9 small cubes of squash."
Negative

Text: "Easy to assemble. Sturdy and looks good. Was perfect solution for story my many plants."
Positive

Text: "The printed code style (grey with horrible white lines) is really disturbing for the eye."
Negative

ONLY return the sentiment as output (without punctuation).
Text:
"""
This is where few shot prompting enters the picture. We supply the LLM with sample reviews and their associated labels. The goal is for the model to pick up the association and work with this template as it processes the IMDB data.
df_sampling = df.sample(n=10, random_state=12)

def classify_review(review):
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": review}
        ]
    )
    return completion.choices[0].message.content
For this example, we have version 1.35.3 of the openai package installed via pip.
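To reproduce that setup, the release can be pinned explicitly at install time (one possible way):

pip install openai==1.35.3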
After extracting a small working set of movie review records, we create a function classify_review
that chats with OpenAI. For each record in the sub-sample, we call the function, passing the movie review on to OpenAI to assess and label.
df_sampling['predicted'] = df_sampling['review'].apply(classify_review)
print(df_sampling)
We’ve updated the sample set with a new column at the end called predicted. Comparing it with the data in the label column, we see that the LLM does a pretty decent job classifying the movie reviews. When running the script several times with different sample records, the model’s predicted label occasionally deviates from the label that came with the IMDB data.
review label predicted
5669 There won't be one moment in this film where y... Positive Positive
8800 Salva and his pal Bigardo have been at the mar... Positive Negative
3205 I grew up with H.R. Pufnstuff and the dashingl... Positive Positive
8731 Other than the great cinematography by the mar... Negative Negative
6412 Not the worst movie I've seen but definitely n... Negative Negative
6828 This is the first Michael Vartan movie i've se... Positive Positive
5795 Except for the acting of Meryl Streep, which i... Negative Negative
9617 After losing the Emmy for her performance as M... Positive Positive
4643 Firstly let me get this of my chest I hate Oct... Negative Negative
1844 I figure this to be an "alternate reality" tee... Negative Negative
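To put a number on that agreement, a short check comparing the two columns can be appended to the script (a sketch; it assumes the predicted values come back spelled exactly like the labels, e.g. Positive / Negative):

# Count how many of the model's predictions match the original IMDB labels
matches = (df_sampling['label'] == df_sampling['predicted']).sum()
total = len(df_sampling)
print(f"{matches} of {total} predictions match the original labels ({matches / total:.0%})")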