OpenAI Few-Shot Prompt Engineering

Categories: Python, LLM, OpenAI, GPT, prompt engineering

Python code for OpenAI chat showcasing the few-shot prompting technique

Author: Dennis Chua
Published: July 1, 2024

What is Few-Shot Prompt Engineering?

Prompt engineering is emerging as an important discipline for building and optimizing applications on top of large language models (LLMs). Among the several techniques for chatting with an OpenAI GPT model, few-shot prompting is one of the most effective ways to coach the LLM and help it deliver replies with greater fidelity.

The paper Language Models are Few-Shot Learners explains how the few-shot technique improves GPT-3's performance on NLP tasks.

With the few-shot approach, the user supplies the model with a few examples of how she would like the LLM to respond. Instead of describing the desired reply in the abstract, the user simply feeds the LLM a few choice examples. Few-shot prompting is a powerful technique: it customizes a large language model's behavior without fine-tuning or updating its weights.
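As a tiny illustration before we get to the movie-review example (the texts below are invented for this sketch), a few-shot prompt simply lines up a few labeled examples ahead of the new input:

# A bare-bones few-shot prompt: two labeled examples, then the text to classify.
few_shot_prompt = """Classify the sentiment of each text.
Text: "The battery easily lasts the whole week." -> Positive
Text: "It stopped working after two days." -> Negative
Text: "Setup took five minutes and everything just worked." ->"""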

In this post we’ll show an example of this technique by prompting OpenAI to classify movie reviews as positive or negative. To help it along, we supply a handful of made-up review texts, along with the corresponding sentiment (positive/negative) for each one. Working with a CSV file of IMDB movie reviews from Kaggle, we feed each record to OpenAI and ask it to rate the film review accordingly.

Kaggle user VolodymyrGavrysh describes the CSV file as: "data set to experiment with classification tasks; is small sub sample of IMDB."

OpenAI Chat: Movie Reviews Classifier

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

Full source code for openai-fewshotprompting.py.

To access the OpenAI GPT server, we need an account and the unique API key associated with it. There are several ways to get the key into our Python code. Here we store the key in a text file, then set its contents as an environment variable for our script to read.

In a Linux environment, we store the API key in a file named openai_key and execute the script the following way: export OPENAI_API_KEY=$(cat openai_key); python openai-fewshotprompting.py
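If you prefer to skip the shell step, a minimal alternative sketch (assuming the key sits in the same local file, openai_key) is to read the file directly in Python and pass the key to the client:

from pathlib import Path
from openai import OpenAI

# Read the key file directly instead of exporting an environment variable.
api_key = Path("openai_key").read_text().strip()
client = OpenAI(api_key=api_key)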

import pandas as pd

# Load the IMDB reviews and map the numeric labels to text.
data_file = "imdb_10K_sentiments_reviews.csv"
df = pd.read_csv(data_file, encoding='utf-8')
df['label'] = df['label'].replace({0: 'Negative', 1: 'Positive'})
df.head()

Next we create a Pandas data frame, fill it with all the movie review records from the IMDB CSV file, and map the numeric labels to their text equivalents (0 becomes Negative, 1 becomes Positive). We’ll instruct GPT to process these records, assessing the sentiment of each review and supplying its own label.
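Before calling the API, a quick sanity check of the loaded frame confirms the shape and the label balance (a small sketch using the review and label columns shown above):

# Confirm the expected size and the label distribution.
print(df.shape)
print(df['label'].value_counts())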

system_message = """
You are a binary classifier for sentiment analysis.
Given a text, based on its sentiment, you classify it into 
one of two categories: positive or negative.
You can use the following texts as examples:
Text: "I am pleasantly surprised by this fountain pen, the weight and feel in the hand is substantial and the ink distribution is smooth and solid, no skips."
Positive
Text: "What was delivered was a burn piece of goat cheese and 9 small cubes of squash."
Negative
Text: "Easy to assemble. Sturdy and looks good. Was perfect solution for story my many plants."
Positive
Text: "The printed code style (grey with horrible white lines) is really disturbing for the eye."
Negative
ONLY return the sentiment as output (without punctuation).
Text:
"""

This is where few-shot prompting enters the picture. We supply the LLM with sample reviews and their associated labels; the goal is for the model to pick up the association and apply the same pattern as it processes the IMDB data.
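The examples are written straight into the system message above. If you want to vary them, one possible sketch (the helper name and the example texts here are hypothetical, not part of the original script) builds the same kind of prompt from a list of (text, label) pairs:

# Hypothetical helper: assemble a few-shot system prompt from labeled examples.
few_shot_examples = [
    ("The weight and feel in the hand is substantial and the ink is smooth.", "Positive"),
    ("What arrived looked nothing like the photos and fell apart in a week.", "Negative"),
]

def build_system_message(examples):
    lines = [
        "You are a binary classifier for sentiment analysis.",
        "Given a text, based on its sentiment, you classify it into",
        "one of two categories: positive or negative.",
        "You can use the following texts as examples:",
    ]
    for text, label in examples:
        lines.append(f'Text: "{text}"')
        lines.append(label)
    lines.append("ONLY return the sentiment as output (without punctuation).")
    lines.append("Text:")
    return "\n".join(lines)

system_message = build_system_message(few_shot_examples)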

df_sampling = df.sample(n=10, random_state=12)

def classify_review(review):
  # Send the few-shot system prompt plus the review text; return the model's reply.
  completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
      {"role": "system", "content": system_message},
      {"role": "user", "content": review}
    ]
  )
  return completion.choices[0].message.content

For this example, we have version 1.35.3 of the openai package installed via pip.

After extracting a small working set of movie review records, we create a function classify_review that chats with OpenAI. For each record in the sub-sample, we call the function, passing the movie review on to OpenAI to assess and label.

df_sampling['predicted'] = df_sampling['review'].apply(classify_review)
print(df_sampling)
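Looping over the sample means one API call per review, and an occasional call can fail or hit a rate limit. As a defensive variant (a sketch, not part of the original script, relying on the exception classes exposed by the openai 1.x package), you could wrap the call with a simple retry:

import time
import openai

def classify_review_with_retry(review, retries=3):
  # Retry rate-limited requests with a short exponential backoff.
  for attempt in range(retries):
    try:
      return classify_review(review)
    except openai.RateLimitError:
      if attempt == retries - 1:
        raise  # out of retries; surface the error
      time.sleep(2 ** attempt)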

We’ve updated the sample set with a new column at the end called predicted. Comparing it with the label column, we see that the LLM does a pretty decent job of classifying the movie reviews. When we run the script several times with different sample records, the model’s predicted label occasionally deviates from the label that came with the IMDB data.

                                                 review     label predicted
5669  There won't be one moment in this film where y...  Positive  Positive
8800  Salva and his pal Bigardo have been at the mar...  Positive  Negative
3205  I grew up with H.R. Pufnstuff and the dashingl...  Positive  Positive
8731  Other than the great cinematography by the mar...  Negative  Negative
6412  Not the worst movie I've seen but definitely n...  Negative  Negative
6828  This is the first Michael Vartan movie i've se...  Positive  Positive
5795  Except for the acting of Meryl Streep, which i...  Negative  Negative
9617  After losing the Emmy for her performance as M...  Positive  Positive
4643  Firstly let me get this of my chest I hate Oct...  Negative  Negative
1844  I figure this to be an "alternate reality" tee...  Negative  Negative
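To put a number on the agreement visible above, a quick check (a sketch that simply compares the two columns from this particular run) is:

# Fraction of sampled reviews where the prediction matches the IMDB label.
accuracy = (df_sampling['label'] == df_sampling['predicted']).mean()
print(f"Agreement with the original labels: {accuracy:.0%}")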