LLM pricing calculator
Calculate and compare the cost of using the OpenAI GPT, Anthropic Claude, Meta Llama 3, Google Gemini, and Mistral LLM APIs with this simple, free calculator. Prices are current as of June 2024.
Name | Context window | Input / 1K tokens | Output / 1K tokens | Per call (1K in + 1K out) | Total (20 calls) |
---|---|---|---|---|---|
gpt-4o | 128,000 | $0.005 | $0.015 | $0.02 | $0.4 |
gpt-4o-2024-05-13 | 128,000 | $0.005 | $0.015 | $0.02 | $0.4 |
gpt-4-turbo | 128,000 | $0.01 | $0.03 | $0.04 | $0.8 |
gpt-3.5-turbo-0125 | 16,385 | $0.0005 | $0.0015 | $0.002 | $0.04 |
gpt-3.5-turbo-instruct | 4,096 | $0.0015 | $0.002 | $0.0035 | $0.07 |
Claude 3.5 Sonnet | 200,000 | $0.003 | $0.015 | $0.018 | $0.36 |
Claude 3 Opus | 200,000 | $0.015 | $0.075 | $0.09 | $1.8 |
Claude 3 Sonnet | 200,000 | $0.003 | $0.015 | $0.018 | $0.36 |
Claude 3 Haiku | 200,000 | $0.00025 | $0.00125 | $0.0015 | $0.03 |
Gemini 1.5 Pro (prompts over 128K) | 1,000,000 | $0.007 | $0.021 | $0.028 | $0.56 |
Gemini 1.5 Pro (prompts up to 128K) | 128,000 | $0.0035 | $0.0105 | $0.014 | $0.28 |
Gemini 1.5 Flash (prompts over 128K) | 1,000,000 | $0.0007 | $0.0021 | $0.0028 | $0.056 |
Gemini 1.5 Flash (prompts up to 128K) | 128,000 | $0.00035 | $0.00105 | $0.0014 | $0.028 |
Gemini 1.0 Pro | 128,000 | $0.0005 | $0.0015 | $0.002 | $0.04 |
llama-3-70b-instruct | 8,192 | $0.00059 | $0.00079 | $0.00138 | $0.0276 |
llama-3-8b-instruct | 8,192 | $0.00008 | $0.00008 | $0.00016 | $0.0032 |
llama-3-8b | 8,192 | $0.00005 | $0.0001 | $0.00015 | $0.003 |
open-mixtral-8x22b | 65,536 | $0.002 | $0.006 | $0.008 | $0.16 |
open-mixtral-8x7b | 32,768 | $0.0007 | $0.0007 | $0.0014 | $0.028 |
open-mistral-7b | 8,192 | $0.00025 | $0.00025 | $0.0005 | $0.01 |
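To see how the per-call and total figures above are derived, here is a minimal Python sketch of the calculation. The rates dictionary copies a few of the June 2024 prices from the table; the 1,000-input plus 1,000-output tokens per call and the 20-call total are assumptions chosen to mirror the table's last two columns, so adjust them for your own workload.

```python
# Minimal sketch: estimate LLM API costs from per-1K-token rates.
# Prices (USD per 1,000 tokens) are copied from the June 2024 table above.
RATES = {
    "gpt-4o":          {"input": 0.005,   "output": 0.015},
    "claude-3-haiku":  {"input": 0.00025, "output": 0.00125},
    "open-mistral-7b": {"input": 0.00025, "output": 0.00025},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call given prompt (input) and completion (output) token counts."""
    rate = RATES[model]
    return input_tokens / 1000 * rate["input"] + output_tokens / 1000 * rate["output"]

if __name__ == "__main__":
    # Same assumptions as the table: 1,000 input + 1,000 output tokens per call, 20 calls.
    for model in RATES:
        per_call = call_cost(model, 1000, 1000)
        print(f"{model}: ${per_call:.4f} per call, ${per_call * 20:.2f} for 20 calls")
```

Swap in your own model names, prices, and measured token counts to estimate a real workload.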
AI model pricing: Guide to GPT-4, Claude 3, Gemini, and more
Companies like OpenAI, Anthropic, Google, Meta, and Mistral make AI models that can chat with you, write stories, and answer questions. Here's a simple way to understand how they decide what it costs to use these models.
What is a GPT token?
Imagine tokens like pieces of a puzzle. Each token is a small part of a word. For example, 1,000 tokens are about 750 words. If we say "This sentence is 5 tokens," it means the sentence is made up of 5 small pieces, or tokens.
A good way to remember is that one token is about four characters long, so one token is usually a little less than a whole word.
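If you want to count tokens yourself rather than eyeball the four-characters rule, the sketch below shows both: a rough character-based estimate, and an exact count via OpenAI's tiktoken library (cl100k_base is the encoding used by the GPT-3.5/GPT-4 family). Other providers use different tokenizers, so treat the exact count as OpenAI-specific and the estimate as a ballpark for Claude, Gemini, Llama, and Mistral.

```python
def rough_token_count(text: str) -> int:
    """Rule-of-thumb estimate: roughly 4 characters per token in English text."""
    return max(1, len(text) // 4)

def token_count(text: str) -> int:
    """Exact count for OpenAI models via tiktoken if installed; otherwise the rough estimate."""
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5-turbo / gpt-4 models
        return len(enc.encode(text))
    except ImportError:
        return rough_token_count(text)

print(rough_token_count("This sentence is 5 tokens."))  # ~6 by the rule of thumb
```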
What is Context Length?
When talking about AI models, you’ll hear about "context length." This is like the model’s memory span. It’s how much information the model can remember at one time.
Context length is the amount of information or the number of tokens a model can keep in mind while it’s working. If a model has a context length of 8,000 tokens, it can remember 8,000 tokens worth of information in one go.
Why Does Context Length Matter?
- Doing Complex Tasks: If the model has a longer memory, it can handle bigger tasks, like summarizing a long article or answering detailed questions.
- Remembering Conversations: In a chat, a longer memory helps the model remember more of the conversation, making it better at replying (see the trimming sketch after this list).
- Cost: Models with longer memory usually cost more because they need more computer power.
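One practical consequence of a fixed context length is that long chat histories have to be trimmed before each call. Here is a minimal sketch, assuming messages are simple dictionaries with a "content" field and reusing the rough four-characters-per-token estimate from above; a real application would use the provider's tokenizer and keep system messages pinned.

```python
def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the most recent messages that fit within the model's context window."""
    kept, used = [], 0
    for msg in reversed(messages):               # walk from newest to oldest
        cost = max(1, len(msg["content"]) // 4)  # rough token estimate
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))                  # restore chronological order

history = [
    {"role": "user", "content": "Summarize this long article..."},
    {"role": "assistant", "content": "Here is a summary..."},
    {"role": "user", "content": "Now translate it to French."},
]
print(trim_history(history, max_tokens=8000))
```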
AI models comparison
Here are some popular AI models and what makes them special:
OpenAI GPT-4
Known for its reasoning ability, GPT-4 can handle tough questions and tasks. It's slower and more expensive, but there are cheaper, faster variants: GPT-4 Turbo and GPT-4o.
OpenAI GPT-3.5 Turbo
Fast and affordable, GPT-3.5 Turbo is ideal for conversational use and a solid default for building chatbots.
Anthropic's Claude 3
Available in three versions with different abilities: Opus is the most powerful, Haiku the most affordable, and Sonnet sits in between. All Claude 3 models have a 200K-token context window, and the newer Claude 3.5 Sonnet keeps Sonnet's pricing.
Llama 3 by Meta
Llama 3's weights are openly available, and hosted APIs for it are among the cheapest in the table. It's powerful yet cost-effective for tasks like writing and answering questions.
Google's Gemini
These models can handle text, pictures, and even video. The most capable is Gemini Ultra, and Gemini Pro is also very strong. The Gemini 1.5 models can take in up to 1 million tokens of context.
Mistral AI
Known for making small, fast models that are cheap to use. Their models, like Mistral 7B and Mixtral 8x7B, offer great performance for their price and can handle many tasks.
Conclusion
Understanding the pricing and capabilities of different AI models is crucial for choosing the right one for your needs. Whether you need a model for chatting, complex problem-solving, or handling multiple types of media, there is an option available to fit your budget and requirements. Consider the token usage and context length to make an informed decision that balances performance and cost.