
OpenAI GPT-3 Models List: Explained with Examples

In the ever-evolving landscape of natural language processing (NLP), OpenAI’s GPT-3 models have garnered significant attention for their ability to understand and generate human-like text. The GPT-3 models discussed in this blog post can be accessed through the OpenAI API and the OpenAI Playground. In this post, we will go through the list of OpenAI GPT-3 models, with explanations and examples of their capabilities. Although GPT-3.5 models are more powerful than their GPT-3 counterparts, only the GPT-3 models are currently available for fine-tuning. Whether you are an experienced data scientist or a curious generative AI enthusiast, understanding these models is crucial to making the most of their NLP capabilities.

GPT-3 Models Details & Examples

The GPT-3 models offer different levels of power and speed for various tasks related to understanding and generating natural language. Among these models, Davinci is the most capable, while Ada is the fastest. The four GPT-3 text models, in order of capability from highest to lowest, are:

  • text-davinci-003
  • text-curie-001
  • text-babbage-001
  • text-ada-001

While Davinci is the most capable, the other models are faster. We recommend starting with Davinci to obtain the best results and to validate the value provided by Azure OpenAI. Once the proof of value (POV) or proof of concept (POC) works, the choice of model can be optimized to balance latency against output quality for the specific application.
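
The start-with-Davinci-then-optimize approach can be sketched as a small selection loop. This is a minimal sketch under assumptions, not a prescribed method: `run` and `is_good_enough` are hypothetical callables that you would supply (one sends your prompt to a given model, the other judges the output), and the loop assumes that less capable models are faster and cheaper, as described above.

```python
from typing import Callable, Optional, Sequence

# GPT-3 text models from least to most capable (the reverse of the list above).
MODELS_LEAST_CAPABLE_FIRST = (
    "text-ada-001",
    "text-babbage-001",
    "text-curie-001",
    "text-davinci-003",
)


def pick_model(
    run: Callable[[str], str],
    is_good_enough: Callable[[str], bool],
    models: Sequence[str] = MODELS_LEAST_CAPABLE_FIRST,
) -> Optional[str]:
    """Return the first (least capable) model whose output passes the check.

    `run` and `is_good_enough` are hypothetical callables supplied by the caller.
    Returns None when no model in `models` passes.
    """
    for model in models:
        if is_good_enough(run(model)):
            return model
    return None
```

In practice, the quality check would first be tuned against Davinci's output during the POC, and the loop would then show how far down the list the application can go.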

Here are the details of the models listed above.

| Model | Use Cases | Capabilities | Max Tokens | Speed | Real-world Examples |
|---|---|---|---|---|---|
| text-davinci-003 | Complex intent, cause and effect, summarization for audience | Capable of performing any task the other models can do, deep understanding of content. This is the most capable of all GPT-3 models. | 4097 tokens | Not as fast as other models | Requires more computing resources |
| text-curie-001 | Language translation, complex classification, text sentiment, summarization | Suitable for nuanced tasks, fast performance. This is also a fast model with lower cost than Davinci. | 2049 tokens | Powerful and fast model | Providing real-time language translation in chat applications, generating product summaries. |
| text-babbage-001 | Moderate classification, semantic search classification | Capable of performing straightforward tasks, excels in semantic search. This is also a fast model with lower cost than Davinci. | 2049 tokens | Fast model | Ranking search results based on semantic relevance, classifying customer support tickets. |
| text-ada-001 | Parsing text, simple classification, address correction, keywords | Suited for parsing text, address correction, and certain kinds of classification tasks. This is the fastest model in the GPT-3 series and is available at the lowest cost. | 2049 tokens | Fast model | Parsing and extracting information from legal documents, correcting addresses in shipping labels. |
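
The Max Tokens values in the table above can be turned into a quick pre-flight check. The sketch below assumes that the limit covers the prompt and the completion together and that token counts are already known (counting tokens is outside its scope). `fits_context` and `models_that_fit` are hypothetical helper names.

```python
from typing import List

# Max token limits taken from the table above.
MAX_TOKENS = {
    "text-davinci-003": 4097,
    "text-curie-001": 2049,
    "text-babbage-001": 2049,
    "text-ada-001": 2049,
}


def fits_context(model: str, prompt_tokens: int, completion_tokens: int) -> bool:
    """Check whether prompt plus completion stays within the model's limit.

    Assumption: the limit is shared by the prompt and the completion.
    """
    return prompt_tokens + completion_tokens <= MAX_TOKENS[model]


def models_that_fit(prompt_tokens: int, completion_tokens: int) -> List[str]:
    """Return the models (in table order) that can hold the whole request."""
    return [
        model
        for model in MAX_TOKENS
        if fits_context(model, prompt_tokens, completion_tokens)
    ]
```

For example, a 3,000-token prompt with a 1,000-token completion fits only text-davinci-003 under these limits.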

You might be wondering about the format of GPT-3 model names such as text-davinci-003. What do text, davinci, and 003 mean? Let’s quickly go through this.

The names of OpenAI’s GPT-3 models follow a specific pattern: {capability}-{family}[-{input-type}]-{identifier}.

Let’s consider the example of the text-davinci-003 model to illustrate the different elements of its format:

  1. {capability}: The capability of the text-davinci-003 model is “text.” This indicates that the model is specifically designed to understand and generate natural language text. It can handle a wide range of language-related tasks such as summarization, creative content generation, and complex intent analysis.
  2. {family}: The text-davinci-003 model belongs to the “davinci” family. The davinci family of models is known for its exceptional capabilities and versatility. It offers the most advanced features and can perform any task that the other GPT-3 models can handle, often with less instruction.
  3. {input-type} (Optional): The text-davinci-003 name has no input-type element, so the model does not require any specific input format or embedding type. Some names do carry one: in text-search-curie-query-001, for example, the input type is “query”.
  4. {identifier}: The “003” in text-davinci-003 represents the version identifier of the model. It indicates that this is a specific iteration or version of the text-davinci model. The version identifier helps keep track of updates and improvements made to the model over time, ensuring that users are aware of the specific variant they are working with.
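
The naming pattern above can be made concrete with a small parser. This is a minimal sketch under assumptions: it treats ada, babbage, curie, and davinci as the only families, takes everything before the family as the capability, and returns None for names that do not follow the pattern (such as gpt-3.5-turbo or the bare davinci). `ModelName` and `parse_model_name` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional

# Assumption: these four are the only model families.
FAMILIES = ("ada", "babbage", "curie", "davinci")


class ModelName(NamedTuple):
    capability: str
    family: str
    input_type: Optional[str]
    identifier: str


def parse_model_name(name: str) -> Optional[ModelName]:
    """Split a name of the form {capability}-{family}[-{input-type}]-{identifier}."""
    parts = name.split("-")
    family_positions = [i for i, part in enumerate(parts) if part in FAMILIES]
    if len(family_positions) != 1:
        return None  # no family, or an ambiguous name
    idx = family_positions[0]
    capability = "-".join(parts[:idx])
    rest = parts[idx + 1:]
    # The pattern needs a capability before the family and one or two parts after it.
    if not capability or len(rest) not in (1, 2):
        return None
    input_type = rest[0] if len(rest) == 2 else None
    return ModelName(capability, parts[idx], input_type, rest[-1])


print(parse_model_name("text-davinci-003"))
print(parse_model_name("text-search-curie-query-001"))
```

The first call yields capability "text", family "davinci", no input type, and identifier "003"; the second yields capability "text-search", family "curie", input type "query", and identifier "001".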

Get OpenAI GPT-3 & GPT-3.5 Models List: Python Example

You can execute the following code to get the list of models supported by OpenAI. First, install the OpenAI library and create a secret key.

  • Use the command pip install openai to install the OpenAI library. The code in this post uses the openai.Model interface of releases before 1.0, which pip install "openai<1" selects.
  • Then, create a secret key: log in to your OpenAI account, click your profile in the top right corner, click “View API Keys”, and create your secret key on that page.

Execute the following code to get the list of GPT models.

import os

import openai

# Read the secret key from an environment variable instead of hard-coding it
openai.api_key = os.environ["OPENAI_API_KEY"]

# Get the models list
# Note: openai.Model.list() belongs to openai-python releases before 1.0
models = openai.Model.list()

# Print the ID of each model
for model in models.data:
    print(model.id)

The code above prints the following as of 30/08/2023:

  • davinci
  • text-davinci-001
  • text-search-curie-query-001
  • gpt-3.5-turbo
  • babbage
  • text-babbage-001
  • curie-instruct-beta
  • davinci-similarity
  • code-davinci-edit-001
  • text-similarity-curie-001
  • ada-code-search-text
  • gpt-3.5-turbo-0613
  • text-search-ada-query-001
  • gpt-3.5-turbo-16k-0613
  • babbage-search-query
  • ada-similarity
  • text-curie-001
  • gpt-3.5-turbo-16k
  • text-search-ada-doc-001
  • text-search-babbage-query-001
  • code-search-ada-code-001
  • curie-search-document
  • davinci-002
  • text-search-davinci-query-001
  • text-search-curie-doc-001
  • babbage-search-document
  • babbage-002
  • babbage-code-search-text
  • text-embedding-ada-002
  • davinci-instruct-beta
  • davinci-search-query
  • text-similarity-babbage-001
  • text-davinci-002
  • code-search-babbage-text-001
  • text-davinci-003
  • text-search-davinci-doc-001
  • code-search-ada-text-001
  • ada-search-query
  • text-similarity-ada-001
  • ada-code-search-code
  • whisper-1
  • text-davinci-edit-001
  • davinci-search-document
  • curie-search-query
  • babbage-similarity
  • ada
  • ada-search-document
  • text-ada-001
  • text-similarity-davinci-001
  • curie-similarity
  • babbage-code-search-code
  • code-search-babbage-code-001
  • text-search-babbage-doc-001
  • gpt-3.5-turbo-0301
  • curie
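
The listing code above uses the module-level interface of openai-python releases before 1.0. For readers on a 1.x release, a roughly equivalent sketch follows. It assumes the `OpenAI` client class, its `models.list()` method, and an `OPENAI_API_KEY` environment variable; `list_model_ids` and its `contains` parameter are hypothetical names introduced here. The function takes the client as an argument so it can be tried without network access.

```python
from typing import Any, List, Optional


def list_model_ids(client: Any, contains: Optional[str] = None) -> List[str]:
    """Return sorted model IDs, optionally keeping only IDs that contain `contains`."""
    ids = [model.id for model in client.models.list()]
    if contains is not None:
        ids = [model_id for model_id in ids if contains in model_id]
    return sorted(ids)


# Usage with openai-python 1.x (needs network access and an API key):
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   for model_id in list_model_ids(client, contains="davinci"):
#       print(model_id)
```

Sorting the IDs makes a long list like the one above easier to scan, and the optional filter narrows it to one family.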

Conclusion

The GPT-3 models represent a remarkable leap forward in natural language processing and understanding. Each model offers unique capabilities, from deep understanding and complex intent analysis to fast parsing and classification. Real-world examples show the tangible benefits of using GPT-3, such as generating personalized news summaries, performing real-time language translation, and improving search result rankings.

When selecting a GPT-3 model, it is crucial to consider the compute resources required and the trade-off between capability and speed. A recommended approach is to start with Davinci for experimentation and then optimize the model choice for the right balance of latency and performance. As GPT-3 continues to evolve, it has the potential to change how we process and generate language and how we interact with it in our projects and customer experiences. Please feel free to reach out if you need any clarification.

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.
