Zero-shot prompting is a technique used with large language models (LLMs) such as GPT (Generative Pre-trained Transformer) that enables the model to undertake tasks it hasn't been explicitly trained on. It involves presenting a task to a language model without any task-specific examples or fine-tuning. The model is expected to understand and execute the task based solely on its pre-existing knowledge and the general instructions provided in the prompt. We communicate with the model using a prompt that explains what we want to achieve, and the model uses its pre-trained knowledge, acquired from a vast amount of text data, to infer the best way to complete the task.
This capability is pivotal because it lets a single model tackle new tasks without any task-specific examples, labeled data, or fine-tuning.
Example of Zero-Shot Prompting
Let's consider the task of sentiment classification. Here's how you would set up your prompt:
Task: Sentiment classification
Classes: Positive, neutral, negative
Text: "That shot selection was awesome."
Prompt: "Classify the given text into one of the following sentiment categories: positive, neutral, negative."
The model's response would likely be "positive" because it has learned from its training data that the word "awesome" is associated with positive sentiment.
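For illustration, the following is a minimal sketch of running this zero-shot prompt through an API. It assumes the OpenAI Python client is installed and OPENAI_API_KEY is set; the model name and temperature are illustrative choices, and any chat-capable model would work the same way.

from openai import OpenAI

client = OpenAI()

# The zero-shot prompt: task instruction plus the text to classify, no examples.
prompt = (
    "Classify the given text into one of the following sentiment categories: "
    "positive, neutral, negative.\n\n"
    'Text: "That shot selection was awesome."\n'
    "Sentiment:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; swap in any chat model
    messages=[{"role": "user", "content": prompt}],
    temperature=0,        # deterministic output suits classification
)

print(response.choices[0].message.content)  # expected: "positive"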
Few-shot prompting is a technique used to guide large language models (LLMs), such as ChatGPT and Llama, to perform specific tasks or understand particular contexts using only a small number of examples. In few-shot prompting, you provide the model with a few carefully selected examples (typically between 2 and 10) that demonstrate both the input and the desired output of the task. These examples help the model infer the pattern or context of the task, which it then attempts to generalize to new, unseen inputs.
It's important to note that the model does not update its internal weights during few-shot prompting. The model temporarily "learns" or infers patterns from the provided examples but discards this information once the interaction is over.
Example 1:
Input: "Do you have the latest model of the XYZ smartphone in stock?"
Response: "Thank you for your inquiry. Yes, we have the latest XYZ smartphone model available. Would you like to place an order?"
Example 2:
Input: "Is the ABC laptop available in your store?"
Response: "Thank you for reaching out. The ABC laptop is currently out of stock, but we expect new shipments to arrive next month. Can we notify you when it's available?"
Your task:
Input: "Can you tell me if you have the DEF headphones in stock?"
Response:
In this scenario, the model is provided with two examples of customer inquiries regarding product availability, along with the corresponding email responses. In the first example, the product is in stock, and the response includes an offer to place an order. In the second example, the product is out of stock, and the response offers to notify the customer when it becomes available.
When the model is tasked with generating a response to a new inquiry about DEF headphones, it applies the pattern observed in the previous examples to craft an appropriate reply. This might involve confirming the product's availability and suggesting next steps if it's in stock, or explaining that the product is out of stock and offering alternatives or a notification service.
This approach enables the model to understand the context of customer service in a business setting and to generate responses that are both relevant and considerate of the customer's needs.
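As a rough sketch of how such a prompt can be assembled programmatically, the Python snippet below formats the two exemplars from this section plus the new inquiry into a single few-shot prompt; the resulting string can then be sent to any LLM (for example, with the client shown in the zero-shot sketch above).

# Exemplar texts are taken from the customer-service example in this section.
EXEMPLARS = [
    ("Do you have the latest model of the XYZ smartphone in stock?",
     "Thank you for your inquiry. Yes, we have the latest XYZ smartphone model "
     "available. Would you like to place an order?"),
    ("Is the ABC laptop available in your store?",
     "Thank you for reaching out. The ABC laptop is currently out of stock, but "
     "we expect new shipments to arrive next month. Can we notify you when it's "
     "available?"),
]

def build_few_shot_prompt(new_inquiry: str) -> str:
    # Each exemplar becomes an Input/Response pair; the new inquiry is left open
    # so the model completes the final Response.
    blocks = [f'Input: "{inp}"\nResponse: "{resp}"' for inp, resp in EXEMPLARS]
    blocks.append(f'Input: "{new_inquiry}"\nResponse:')
    return "\n\n".join(blocks)

print(build_few_shot_prompt("Can you tell me if you have the DEF headphones in stock?"))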
Exemplars (Examples)
Exemplars are specific instances or examples that demonstrate how a task should be performed, helping to guide machine learning models, especially in few-shot prompting scenarios. The customer-service example above shows how few-shot prompting can be approached using exemplars for a business-related task, such as drafting email responses to inquiries about product availability, while avoiding common pitfalls.
Many-shot prompting is a variant of few-shot learning where, instead of using a handful of examples (e.g., around 10), you use several hundred examples (e.g., 500-800). Models with large context windows, such as Gemma, can accommodate many examples in a single prompt. However, a significant downside of utilizing such large context windows is the increased computational cost and slower inference times. With this many examples, it may be more efficient to fine-tune the model directly, avoiding the repeated cost of processing large context lengths during every inference.
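A rough sketch of the idea, assuming you already have a dataset of several hundred (question, answer) pairs, is shown below; whether the assembled prompt actually fits depends on the model's context window and your cost budget.

# Many-shot prompting sketch. Assumption: `dataset` is a list of
# (question, answer) tuples you supply, e.g. 500-800 items.
def build_many_shot_prompt(dataset, new_question, max_examples=500):
    shots = [f"Q: {q}\nA: {a}" for q, a in dataset[:max_examples]]
    shots.append(f"Q: {new_question}\nA:")  # the model completes this final answer
    return "\n\n".join(shots)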
In-Context Learning refers to a large language model's ability to perform tasks by interpreting examples provided in the input prompt, without updating its internal parameters. Few-shot prompting and many-shot prompting are both forms of in-context learning. Despite the term "learning," the model doesn't actually update its weights or retain information beyond the current interaction. Instead, it temporarily infers patterns or rules from the examples in the prompt but discards this inferred knowledge once the interaction concludes.
Metadata prompting is an approach designed to simplify and streamline the process of instructing large language models (LLMs). It applies principles of modularity and separation of concerns to prompt engineering, enhancing the effectiveness of communication with LLMs. Traditionally, prompts often combine task descriptions with explanations of various entities involved, resulting in complex and cluttered instructions.
The core principle of metadata prompting is to separate the task description from entity explanations. It encourages users to start by clearly defining the main task, using all necessary entities without worrying about explaining them. To distinguish entities within the task description, they are enclosed in backticks (`). This allows for a focused and concise task description while clearly marking which terms will be explained later.
After the task is clearly defined, each entity that requires explanation is described separately in JSON format, with the entity names as keys and their explanations as corresponding values.
By structuring prompts in this way, metadata prompting keeps the task description concise, clearly separates what to do from what each term means, and produces more efficient, readable, and adaptable instructions for AI models, ultimately improving the quality of AI-generated outputs and making the process of working with LLMs more user-friendly.
As an example, consider a situation where a user wants to assign custom tags to each paragraph in an extensive document. Given the limitations on the token size that an LLM can handle, the document would need partitioning into segments. Yet, for every segment, crucial context like the document's title, headings, and preceding paragraphs must be provided. Traditional prompting methods might fall short here, as LLMs could have difficulty discerning metadata from the main content. In contrast, metadata prompting offers a more straightforward way to communicate this structure.
Tag each of `target-paragraphs` with one of the `tags` considering `article-title`, `headings` and `preceding-paragraphs`.
tags: """
tagA: definition of tag A
tagB: definition of tag B
""",
article-title: """Article title""",
headings: """
h1: heading with type Heading 1
h2: heading with type Heading 2
"""
preceding-paragraphs: """Provide 2 paragraphs that come before the target paragraphs to give more context"""
target-paragraphs: """Provide the paragraphs you want the task to summarize"""
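A minimal sketch of assembling such a prompt programmatically is shown below. The field names mirror the example above; the placeholder contents and the choice to serialize the metadata with json.dumps are assumptions about one reasonable way to do it.

import json

# Metadata prompting sketch: a concise task description that references entities
# in backticks, followed by the entities themselves serialized as JSON.
task = (
    "Tag each of `target-paragraphs` with one of the `tags` considering "
    "`article-title`, `headings` and `preceding-paragraphs`."
)

metadata = {
    "tags": {"tagA": "definition of tag A", "tagB": "definition of tag B"},
    "article-title": "Article title",
    "headings": {"h1": "heading with type Heading 1",
                 "h2": "heading with type Heading 2"},
    "preceding-paragraphs": ["<paragraph before targets 1>", "<paragraph before targets 2>"],
    "target-paragraphs": ["<paragraph to tag 1>", "<paragraph to tag 2>"],
}

prompt = task + "\n\n" + json.dumps(metadata, indent=2)
print(prompt)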
Leveraging the impressive natural language understanding (NLU) and in-context learning abilities of LLMs, AI agents typically use text as the interface between their components, allowing them to plan, call external tools, evaluate, reflect, and improve without additional training.
Chain of thought (CoT) prompting is a technique used to encourage language models to break down complex problems into a series of smaller, interconnected steps or thoughts, mimicking the way humans reason through problems. The model is prompted to generate a series of short sentences that mirror the reasoning process a person might employ in solving a task: it decomposes the problem, works through the intermediate steps, and only then states the final answer.
There are several approaches to prompting a model to generate intermediate reasoning steps in a chain of thought. The most common and the one used in the original paper by Wei et al. (2022) is few-shot learning. In this approach, the model is provided with a few examples of problems along with their corresponding chains of thought and final answers. The model learns from these examples and applies the same reasoning pattern to new, unseen problems, relying on its ability to generalize from a small number of examples.
In their experiments, Wei et al. (2022) provided the model with examples of problems, each demonstrating the step-by-step reasoning process (see the worked example reproduced in the paper).
Source: Paper link
Note: For automating the selection of exemplars, Auto-CoT is a good read: Paper link. See also this good summary article.
When presented with a new question, the model uses these examples as a reference to generate its own chain of thought and final answer. The authors found that this few-shot learning approach led to significant improvements in the model's performance on various reasoning tasks, including arithmetic, commonsense reasoning, and symbolic manipulation. The generated chains of thought also provided valuable insights into the model's reasoning process, making its outputs more interpretable and trustworthy.
Typical implementation:
Questions 1 to n are the few-shot exemplars, each paired with its reasoning and answer; the final question is the new one the model is asked to solve.
Question: {question 1}
Reasoning: Let's think step-by-step. {reasoning 1}
Answer: {answer 1}
...
Question: {question n}
Reasoning: Let's think step-by-step. {reasoning n}
Answer: {answer n}
Question: {new question}
Reasoning: Let's think step-by-step.
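The snippet below is a minimal sketch of this template in Python; the arithmetic exemplar is illustrative, and real prompts would use several exemplars drawn from the target task.

# Chain-of-thought few-shot prompt sketch.
EXEMPLARS = [
    ("A store had 23 apples and sold 8. How many apples are left?",
     "The store started with 23 apples and sold 8, so 23 - 8 = 15 apples remain.",
     "15"),
]

def build_cot_prompt(new_question: str) -> str:
    blocks = [
        f"Question: {q}\nReasoning: Let's think step-by-step. {r}\nAnswer: {a}"
        for q, r, a in EXEMPLARS
    ]
    # End with the reasoning cue so the model continues with its own chain of thought.
    blocks.append(f"Question: {new_question}\nReasoning: Let's think step-by-step.")
    return "\n\n".join(blocks)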
Other approaches to prompting a model to generate intermediate reasoning steps include:
1. Zero-shot Chain of Thought: Appending a phrase such as "Let's think step by step", "Break down your reasoning into clear steps", or "Take a deep breath and work on this problem step-by-step" to the original prompt encourages the model to break its reasoning process into a series of logical intermediate steps rather than attempting to reach the final answer in one leap (a small sketch follows this list).
2. Structured prompts: Prompts that include explicit placeholders for intermediate reasoning steps, which the model fills in along with the final answer. For instance, a prompt might be structured as follows:
Question: [Original question]
Step 1: [Placeholder for first reasoning step]
Step 2: [Placeholder for second reasoning step]
...
Step N: [Placeholder for final reasoning step]
Answer: [Placeholder for final answer]
The model completes each placeholder in order before giving the final answer.
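As a quick illustration of the zero-shot approach in item 1, the sketch below simply appends a reasoning cue to the question; the example question and the arithmetic in the comment are illustrative.

# Zero-shot chain-of-thought sketch: no exemplars, just a reasoning cue.
def zero_shot_cot(question: str) -> str:
    return f"{question}\n\nLet's think step by step."

prompt = zero_shot_cot(
    "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"
)
# The model is expected to produce intermediate steps
# (45 minutes = 0.75 h; 60 / 0.75 = 80 km/h) before stating the final answer.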
How is it Different from Standard Prompting?
Standard prompting might involve asking a model a direct question and receiving a direct answer, without any explanation of the steps taken to reach that answer. CoT prompting, on the other hand, explicitly asks the model to show its work, providing a step-by-step breakdown of its reasoning. This not only leads to more accurate answers in many cases but also provides an explanation that can be helpful for users to understand the model's thought process.
Business Example: Enhancing Customer Support with RAG and CoT
Consider an online retailer implementing a chatbot equipped with RAG and chain of thought prompting to handle customer inquiries. A customer asks a complicated question about a product's features, compatibility with other devices, and return policy.
This example illustrates how chain of thought prompting in RAG transforms the way LLMs handle complex queries, enabling them to provide more accurate, detailed, and contextually relevant responses. By mimicking human-like reasoning and adaptability, this approach significantly enhances the capabilities of AI in business applications, particularly in areas requiring deep understanding and nuanced responses.
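A rough sketch of how such a chatbot's prompt might be assembled is shown below; `retrieve` stands in for whatever retrieval component you use (for example, a vector-store search), and the instruction wording is an assumption rather than a prescribed template.

# RAG + chain-of-thought sketch. Assumption: `retrieve` returns relevant
# snippets (product specs, compatibility notes, return policy) for the question.
def build_rag_cot_prompt(customer_question: str, retrieve) -> str:
    snippets = retrieve(customer_question)          # e.g., top-k passages from a document store
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "You are a customer-support assistant. Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Customer question: {customer_question}\n\n"
        "Reason step by step about the product's features, device compatibility, "
        "and the return policy, then give a clear final answer."
    )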