Large language models (LLMs for short) are AI models trained to understand and generate text. Before they can do so, they train on massive datasets containing books, websites, and other text sources, learning the statistical relationships between words, phrases, and concepts (human-labeled data typically comes in later, to fine-tune behavior). As they process more data, they start identifying patterns in the text.
Identifying patterns is one of the capabilities that helps the model generate text. The simplest explanation for what happens when AI processes your prompt is that it predicts the word (or, more precisely, the token) most likely to appear next. It repeats this process as it writes an answer, always appending the most likely continuation.
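This prediction loop can be illustrated with a toy sketch. The tiny word-pair model below is a deliberate oversimplification (real LLMs use neural networks over huge datasets, not frequency counts), but the generation loop is the same idea: look at the last word, pick the most likely next one, repeat.

```python
from collections import Counter, defaultdict

# A tiny made-up corpus standing in for training data.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which: a crude stand-in for learned patterns.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start: str, length: int = 5) -> str:
    """Repeatedly pick the single most likely next word (greedy decoding)."""
    words = [start]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:  # no known continuation: stop early
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))
```

Real chatbots also add some randomness instead of always taking the single top word, which is why the same prompt can produce different answers.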
ChatGPT, launched in late 2022, surprised everyone with its understanding, reasoning, and text generation capabilities. In the early days, people were asking questions, generating content, and testing the limits of what AI could do. Today, millions of users keep a browser tab open on their computers, always ready to move forward in their projects.
Over time, other AI chatbot apps with their own LLMs appeared on the market. The most notable ones are:
You can judge the quality of new chatbots in two major ways. The first is to look at a breakdown of the AI model's benchmarks and capabilities, which shows what to expect from its responses: how well it reasons, codes, or communicates, for example.
The other aspect to consider is the chatbot app's features. This refers to the interface, user experience, and connected tools built around the AI model. These can include searching the internet, generating files, or rendering previews of AI-written code.
But while ChatGPT has good short-term memory within a conversation, it doesn't reliably remember details across conversations. Even with its recent memory feature, it can't accurately recall every key detail about you, your work, and your personal life.
Since LLM responses are statistical predictions, these models can generate answers that seem plausible but are actually false. Called hallucinations, these errors tend to happen when the chatbot answers questions about topics with little training data to rely on. While accuracy can be quite high for mainstream topics, if you dive deep into a niche field of knowledge, the chatbot can hallucinate frequently.
The potential for hallucinations is closely connected to LLMs' limited ability to adapt to niche topics. Since they have little training data on these subjects, their usefulness drops: you either provide the background data at the start of every conversation, or you risk the chatbot making mistakes or being inaccurate.
While each provider aims to give their chatbot a distinctive personality, it's hard to control the chatbot's writing style with in-app settings. You can do so with a prompt, but then you have to restate your preferences every time you start a conversation.
LLMs require powerful hardware to generate responses, meaning there is little chance you'll ever run an OpenAI model on a cost-effective on-premises device. This adds data privacy risks: you always have to send your prompts and data to the AI model provider, subject to whatever they choose to write in their privacy policy.