OpenAI – the creators of ChatGPT and current designers of our collective futures with AI – announced a metric tonne of new updates, features, and products at their inaugural DevDay this Monday. I was there, and here is my no-hype, no-nonsense, pragmatic guide to what was released and what it means for you.
GPT-4 Turbo: Upgraded model with upgraded speed, upgraded “knowledge,” and lower pricing
All software has version releases. So does GPT – the underlying foundation model for ChatGPT and OpenAI’s other language-based AI services. The latest version – GPT-4 Turbo – boasts:
- Faster speed. GPT-4 was notoriously slow compared to GPT-3.5 and GPT-3.5 Turbo. GPT-4 Turbo is reportedly significantly faster, meaning you don’t have to wait as long for responses. This is especially noticeable with their text-to-speech features.
- Bigger context window. GPT-4 Turbo has a 128,000-token context window vs GPT-4’s 8,000 – 32,000 tokens. This means you can now provide around 300 pages of text for the system to reference during your session without it losing track of what you’re talking about. LLM systems are language transformers, and the more context you provide, the better they are able to perform tasks. This has big implications which I’ll address in the sections below on GPTs and the Assistant API.
- Updated knowledge cutoff. GPT-3, 3.5, 3.5 Turbo, and 4 were all trained on data collected before September 2021. This meant if you asked them about something that happened after that date, they could not answer. GPT-4 Turbo’s knowledge cutoff is April 2023, and in the DevDay keynote OpenAI CEO Sam Altman said they will “try to never let it get that out of date again.”
- Lower cost. GPT-4 Turbo is 3x cheaper than GPT-4 for prompts, and 2x cheaper for completions. This is significant for developers building things with OpenAI’s API because every token costs money. This is a transparent play to get more developers to work with the platform. As for ChatGPT, the pricing stays the same, so the majority of users won’t see any pricing impact.
- Other things: multimodal by default in ChatGPT (DALL-E 3, web lookup, and the code interpreter trigger automatically), invite-only fine-tuning for GPT-4, and increased rate limits for the API.
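To make the pricing difference concrete, here’s a back-of-the-envelope comparison in Python. The per-1,000-token prices below are the ones announced at DevDay ($0.01/$0.03 for GPT-4 Turbo vs $0.03/$0.06 for GPT-4) – treat them as illustrative, since pricing changes often:

```python
def request_cost(prompt_tokens, completion_tokens, prompt_price, completion_price):
    """Dollar cost of one API request, given per-1,000-token prices."""
    return (prompt_tokens / 1000) * prompt_price \
         + (completion_tokens / 1000) * completion_price

# A 100,000-token prompt (roughly a 250-page book) with a 1,000-token answer:
turbo = request_cost(100_000, 1_000, 0.01, 0.03)  # GPT-4 Turbo
gpt4 = request_cost(100_000, 1_000, 0.03, 0.06)   # GPT-4 (hypothetical: it can't fit 100K tokens anyway)

print(f"GPT-4 Turbo: ${turbo:.2f}, GPT-4: ${gpt4:.2f}")  # → $1.03 vs $3.06
```

At scale those per-request differences compound fast, which is exactly the point of the price cut.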
What this means for us
GPT-4 Turbo is the next version of GPT, and based on OpenAI’s release history we can expect either a GPT-4.5 Turbo or a GPT-5 in the relatively near future. The model provides incremental and obvious improvements, and a clear pattern emerges: base models improve performance; Turbo models improve speed, extend context windows, and lower cost. The real-world implications of this new model are significant:
- ChatGPT will appear “smarter” and more “knowledgeable,” meaning people will be more inclined to think of these systems as “intelligent” and neutral arbiters of the truth. This continues to be a serious societal problem and will be amplified every time the models get upgraded.
- Using GPT models for practical things got a lot easier. Context window limits have been a major issue for use cases including knowledge retrieval from large documents, summary writing, and more. The enormous context window of GPT-4 Turbo means students can use ChatGPT to summarize academic articles and entire textbooks, writers can use it to review entire chapters and even books, and data professionals can use it to parse much larger data sets.
- More people will use ChatGPT and GPT-based systems for more advanced tasks, leaning harder on the mythologized “reasoning” within these systems to make decisions. The systems will produce completions good enough to pass a cursory review, leading people to think they are doing good work. Education is necessary to help people understand why this is not the case.
GPTs: The first step towards GPT Agents and the sidelining of plugins
ChatGPT users can now create so-called GPTs – effectively tailored versions of ChatGPT with custom instructions, expanded “knowledge,” and specialized actions. These GPTs are built from the ChatGPT interface and programmed using natural language, meaning you don’t have to be a programmer to build them. This democratizes the creation of custom GPT agents and gives people new AI capabilities.
- Each GPT has its own custom instructions – a large system prompt where you describe what the GPT is for, what it should do, and how it should behave.
- You can upload “knowledge” to a GPT in the form of documents and other data and the GPT will refer to this knowledge in its completions. For example, you can upload a textbook as a PDF and tell the GPT to act like a teaching assistant and it can help you learn the content of the textbook, quiz you on important topics, provide summaries, etc.
- Actions allow you to connect GPTs to external services and customize their interactions. For example you can connect a GPT to a weather API and instruct it on how to pull real-time data from that API for accurate reporting.
- You can create private GPTs with any content you want.
- You can share GPTs (when you do they go through a copyright check to make sure you’re not sharing content you don’t own the rights to).
- Enterprise users can create enterprise-only GPTs to share within their orgs.
- There will be a future GPT marketplace where you can buy and sell GPTs with profit sharing.
- Currently GPTs are in beta, available only to ChatGPT Plus users, and being rolled out slowly. It’s unclear whether they will become available to non-paying users.
- Some mentions were made of how actions can tie into ChatGPT plugins, but reading between the lines the message is quite clear: plugins are being silently sidelined in favour of GPTs.
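For context on what an action definition involves: actions are described to a GPT with an OpenAPI-style specification. Here’s a minimal sketch for the weather example above – the server URL, endpoint, and parameter names are all invented for illustration:

```python
# Hypothetical OpenAPI-style spec describing a weather action to a GPT.
# Every URL and name here is made up for illustration.
weather_action = {
    "openapi": "3.0.0",
    "info": {"title": "Weather lookup", "version": "1.0.0"},
    "servers": [{"url": "https://api.example-weather.test"}],
    "paths": {
        "/current": {
            "get": {
                "operationId": "getCurrentWeather",
                "summary": "Current conditions for a given city",
                "parameters": [
                    {
                        "name": "city",
                        "in": "query",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
            }
        }
    },
}
```

The model leans on the `operationId` and `summary` fields to decide when and how to call the endpoint, so descriptive naming matters more here than in a spec written for humans.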
What this means for us
GPTs will become the new primary way people use ChatGPT because they eliminate the need to state the purpose of your interaction with each chat. GPTs will also dramatically accelerate advanced use of ChatGPT because they bring down some significant barriers to entry:
- The massive 128,000-token context window allows you to upload entire books as “knowledge” in a GPT, meaning every student can and will create a GPT for every textbook they own and use it to supercharge their learning.
- Sharing of GPTs means as people create new capabilities with ChatGPT they’re able to give those capabilities to others. This will be especially important for things like helpdesk, documentation search, and internal enterprise operations.
- The plugins ecosystem is fading into irrelevancy, both because GPTs take over their role and because the release of GPTs meant the death of hundreds of well-funded startups and projects built around the plugins they created. For example, every “talk to your PDF” type plugin is now meaningless as GPTs do this by default.
- OpenAI will have a nightmare task on their hands as they try to moderate the tsunami on top of an avalanche of GPTs people make and try to sell in their marketplace. Moderation will be key, and it will be enormously costly.
Assistants: The programmer’s path to agents
Along with GPTs (which belong in ChatGPT), OpenAI released the Assistant API which provides the same family of functionality for programmers who build tools utilizing GPT services. With the Assistant API comes a bunch of features that make the work of every developer a lot easier:
- Threaded conversations, so you don’t have to keep track of every prompt/response pair in your own database
- Function calling that can invoke multiple functions at once
- “Knowledge” retrieval from documents (low-key, low-investment RAG for smaller documents)
- A stateful API, because this is 2023, not 1996
- API access to the code interpreter (and a future path towards custom code interpreters)
- Future promises including a multimodal API, async support, WebSockets, and webhooks
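To illustrate the function-calling piece, here’s a sketch of a chat request payload using the `tools` format that supports calling multiple functions at once. The model name matches the GPT-4 Turbo preview announced at DevDay; the weather function itself is hypothetical:

```python
# Sketch of a chat-completion request that exposes a function to the model.
# With parallel function calling, one model response can contain several
# tool calls -- e.g. one get_weather call per city in the question.
request = {
    "model": "gpt-4-1106-preview",  # GPT-4 Turbo preview
    "messages": [
        {"role": "user", "content": "What's the weather in Oslo and Bergen?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical function
                "description": "Current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
```

Your code still has to execute the returned tool calls and feed the results back; the model only decides which functions to call and with what arguments.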
What this means for us
If you’ve built any application on top of the OpenAI API, chances are you now have to rebuild it. Many of the new features released (threading, multi-function calling, retrieval, statefulness) replace custom features developers were forced to build due to the lack of core support in the API. This will be enormously expensive and damaging to many projects, but it is necessary to move the entire space forward. The absence of these features in the original API was a deficiency, and their introduction is long overdue. One thing I didn’t see was any mention of proper authentication. The current key-based auth in the OpenAI API is sub-optimal at best and leaves developers rigging their own security around their apps, which is… not great.
- Building anything with the API is now way easier.
- This is an aggressive play to onboard more developers, and OpenAI is clearly taking slow developer adoption seriously.
- The importance of parallel function calling cannot be overstated – this is the path to a lot of advanced functionality.
- Building extensions to OpenAI’s features remains risky as the API and underlying services evolve, so make sure you have room to rapidly iterate and change.
- I expect we’ll see a continuation of this rapid evolution of API features for a long time, so stay nimble.
OpenAI is rapidly evolving from a startup with an insanely popular experimental service into a full-fledged platform company with professional products on offer. The rapid evolution of their products shows no sign of slowing down, and I expect that by next year’s DevDay what was released this week will look quaint and old.
If you’re working with AI in any way, get used to the constant change and uncertainty, because they are going to keep accelerating for a while.