AI Stores
What is an AI Store? How does it work? How to contribute or benefit? What is next? A gentle discussion about (the future of) ChatGPT, large language models, foundation models, and more.
We are entering the era of AI Stores.
Updates on November 7, 2023
Progress in LLMs has not yet provided sufficient evidence that the foundation is solid enough to support the idea of AI Stores. See my other posts for more details.
Updates on March 24, 2023
OpenAI launched ChatGPT Plugins on March 23.
Plugins can be regarded as “apps”.
The post below says, “One issue is that it is desirable to have an educated estimate of the speed of AI progress. Imagine that a startup built on GPT-3 may face embarrassment after the launch of ChatGPT.”
It is worthwhile to highlight that, in contrast to an App Store, providers of language models may eat into (part of) the territory of downstream apps as the models improve. This is due to the current stage of AI: we are still far from AGI.
Clearly, many things have happened since this blog was published on Feb 14, 2023. The blog itself does not need major updates, though, beyond some up-to-date information such as new launches.
Two new blogs:
Will AGI Emerge from Large Language Models? (Feb 28, 2023)
Where is the boundary for large language models? (March 12, 2023) “It is naturally incentivized for AI models providers to keep improving models’ capacities. AI models may expand their capacities 1) horizontally by prompting automation, 2) horizontally by integrating tools, and / or 3) vertically by incorporating common blocks for specialized domains. LLMs will improve, likely adeptly and significantly, esp. at the current stage, in the beginning and many giants are joining. As a result, unfortunately, AI models providers may eat the business territories of downstream model users. This is different from App Stores.”
What is an AI Store?
An AI Store provides general AI capabilities of knowledge representation and decision making.
An AI store supports various types of AI assistants, e.g., drafting an email or a report, scheduling an appointment, drawing a picture, creating a melody, chatting with a kid or an elderly person, suggesting a piece of computer code, proposing a drug molecule, playing Diplomacy, and so on. A powerful capacity is to delegate a task to an existing mature tool, e.g., a translator, a calculator, a calendar, a Python interpreter, Mathematica, or you name it.
An AI Store is powered by general AI methods, learning and search, in particular, deep learning, machine learning, representation learning, (un/self-)supervised learning, reinforcement learning, causal inference, logic, and reasoning, with the recent large language models (LLMs), or foundation models in general, as stepping stones.
Large language models, or foundation models in general, including OpenAI ChatGPT, OpenAI DALL·E, Google LaMDA, Google PaLM and Flan-PaLM, Deepmind Sparrow, Anthropic Claude, Deepmind Chinchilla, Nvidia Megatron-Turing NLG, Deepmind Gopher, BigScience/HuggingFace BLOOM, Google BERT (a survey), Google T5, Deepmind AlphaFold, GitHub Copilot (using OpenAI Codex), Deepmind AlphaCode, Google Robotics Transformer (RT-1), Stable Diffusion, Google MusicLM, Meta Galactica, Meta Cicero, Deepmind Adaptive Agent (AdA) and many more, are propelling a gigantic wave of AI research and development, and are generating huge hope and hype from a titanic crowd. If you are not isolated from news and social media, you can feel the heat.
As a note, foundation models are “trained on broad data (generally using self-supervision at scale) that can be adapted to a wide range of downstream tasks.”
People call it the moment of the computer, the moment of the operating system, the moment of the Internet, or the moment of the iPhone. People call it a paradigm shift. People call it a game changer.
We coin the term “AI Store”, in honour of the App Store.
How does an AI Store work?
An AI Store builds a foundation from human knowledge: from language, images, audio, video, and all other modalities of information. An AI Store usually includes one or more general AI models and many specialized AI models.
An AI Store connects users and AI models with convenient interfaces, which may be an application, a browser, or a voice assistant, on a phone, on a watch, on glasses, on a computer, on a virtual reality (VR) device, as a physical robot, or in some new form yet to be invented.
An AI Store is an ecosystem. It provides general digital, smart products and services to the general public. Moreover, it provides the foundation for researchers and developers to make refinements, either at retail, via API calls, or at wholesale, via model fine-tuning. All of this in turn will help the AI Store improve its foundation models, resulting in refined products and services.
How to contribute to or benefit from AI Stores?
It is time for a student, a researcher, a developer, an entrepreneur, an officer, or even anyone, to consider how to contribute to or benefit from such an AI Store.
A study by Joshua Tenenbaum and his colleagues evaluates LLMs from a cognitive perspective w.r.t. formal competence (knowledge of linguistic rules and patterns) and functional competence (understanding and using language in the world) and shows that “LLMs are good models of language but incomplete models of human thought”.
As a result, currently, we should leverage LLMs’ competence as good models of language, and, when necessary, work to improve their functional competence, e.g., factuality, safety, and reasoning. Meanwhile, we need to watch the pace of AI progress. In general, prompting can improve LLMs, fine-tuning with domain data can yield a specialized LLM, combining an LLM with a search engine and/or a knowledge base can improve factuality, and integrating LLMs with mature tools can achieve various functionalities. See more details below.
A foundation model usually consumes a huge amount of resources to train, and even to run inference, including compute, data, and talent. To contribute or to benefit, the first factor is feasibility. We loosely categorize the amount of resources into four types: small, medium, large, and extra large.
With Small Resources
With small resources, one may simply be a customer, leveraging an AI Store for fun or for her endeavour. For example, ChatGPT can be a writing aid or a companion.
The aim of an AI Store is the convenience of the general public: basic knowledge of a language like English is enough to communicate with an application, without any formal training in traditional computer programming. One may resort to her favourite website or App for tutorials, e.g., about how to use ChatGPT, or how to run a business with it. Many people are already using ChatGPT, Copilot and/or DALL·E.
In fact, there are many startups using LLMs, e.g., see TechCrunch’s coverage of GPT-3 and ChatGPT respectively for more info. One issue is that it is desirable to have an educated estimate of the speed of AI progress. Imagine that a startup built on GPT-3 may face embarrassment after the launch of ChatGPT.
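To make the small-resource, “customer” path concrete, here is a minimal sketch of calling a hosted LLM through an API, roughly in the style of the OpenAI completions endpoint at the time of writing; the model name, parameters, and client-library details are assumptions and may well have changed.

```python
# A minimal sketch of "retail" access to an AI Store via an API call.
# The model name and library interface are assumptions; check the
# provider's current documentation before use.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Completion.create(
    model="text-davinci-003",    # assumed GPT-3-family model name
    prompt="Draft a short, friendly email inviting a colleague to a "
           "project kickoff meeting on Friday at 10am.",
    max_tokens=200,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```

No model weights, training data, or GPUs are needed on the user’s side, which is exactly what makes this path feasible with small resources.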
LLMs may help with software issues. A software system requires specification and verification to work properly, and there may be security issues for an anomaly detection system to tackle. As a concrete example, a smart contract on a blockchain is a piece of software.
With small resources, one may develop auxiliary techniques to elicit the best value of an LLM. An obvious approach is prompt engineering. Prompting may be ad-hoc and temporary; however, it appears effective, and will likely stay for a while. Studies show that prompting can perform better than fine-tuning. See a survey paper about prompting in ACM Computing Surveys and a companion website by Graham Neubig and his colleagues. See also the Prompt Engineering Guide.
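As a minimal illustration of prompt engineering, the sketch below assembles a few-shot prompt with a simple chain-of-thought cue; the task, the examples, and the `complete` helper are made-up placeholders for whatever LLM call one actually uses.

```python
# A minimal few-shot prompt with a chain-of-thought cue.
# `complete` stands in for any LLM completion call (e.g., the API sketch above).
examples = [
    ("A shop sells pens at $2 each. How much do 3 pens cost?",
     "Each pen is $2, so 3 pens cost 3 * 2 = $6. The answer is $6."),
    ("A train travels 60 km per hour for 2 hours. How far does it go?",
     "Distance = speed * time = 60 * 2 = 120 km. The answer is 120 km."),
]

question = "A box holds 12 eggs. How many eggs are in 5 boxes?"

prompt = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
prompt += f"\n\nQ: {question}\nA: Let's think step by step."

# answer = complete(prompt)
```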
Google applied instruction prompt tuning to Flan-PaLM to encode clinical knowledge, resulting in Med-PaLM.
Another approach is to integrate a foundation model with a mature tool, which may be based on a software API, symbolic AI, or another deep neural network, leveraging the best of both worlds. It is possible to integrate an LLM with a Python interpreter, Mathematica, a Question-Answer (QA) system, a translator, a calculator, etc., with virtually endless potential. Examples include Meta Toolformer, LangChain, ChatGPT+Mathematica, Program-aided Language Models, A Neuro-Symbolic Perspective on LLMs, etc.
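Here is a minimal sketch of the tool-integration idea, in the spirit of program-aided language models: the LLM writes Python, and a local interpreter does the actual computation. The `complete` helper is again a placeholder, and model-generated code should of course be sandboxed before execution.

```python
# A minimal program-aided sketch: delegate arithmetic to a Python interpreter.
# `complete` is a stand-in for any LLM completion call; this is an illustration,
# not the Toolformer or LangChain implementation.
def solve_with_python(question, complete):
    prompt = (
        "Write Python code that computes the answer to the question "
        "and stores it in a variable named `result`.\n"
        f"Question: {question}\nCode:\n"
    )
    code = complete(prompt)
    scope = {}
    exec(code, scope)            # in practice, sandbox untrusted generated code
    return scope.get("result")
```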
Queries to a search engine can help improve factual accuracy and provide evidence, e.g., Deepmind Sparrow and OpenAI WebGPT. Another approach to improve factuality is to combine an LM with a knowledge graph, see e.g., Stanford QA-GNN.
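A minimal sketch of grounding answers in retrieved evidence is shown below, assuming hypothetical `search` and `complete` helpers; systems like Sparrow and WebGPT are far more sophisticated, but the basic pattern of placing retrieved snippets into the prompt looks roughly like this.

```python
# A minimal retrieval-augmented sketch: retrieve evidence, then answer from it.
# `search` and `complete` are hypothetical stand-ins for a search API and an LLM call.
def answer_with_evidence(question, search, complete, k=3):
    snippets = search(question)[:k]
    evidence = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using only the evidence below, and cite "
        "sources by number.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}\nAnswer:"
    )
    return complete(prompt)
```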
With small resources, a researcher or a student may study fundamentals of LLMs, foundation models, or AI, e.g., open source like the HuggingFace Community, underlying AI systems e.g., Ray, novel prompting techniques, traditional studies for deep learning, machine learning, reinforcement learning, natural language processing (NLP), computer vision, robotics, causal inference, logic, reasoning, neuro-symbolic programming, etc. Such studies will shed light on further developments of foundation models.
One particularly interesting topic is about (plagiarism) detection, e.g., GPTZero (built during New Year 2023), a watermark for LLMs, and DetectGPT.
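For a flavour of what detection can look like, here is a crude perplexity-based heuristic using a small open model: machine-generated text tends to be unusually predictable under a language model. This is only an illustration of the general idea, not a reimplementation of GPTZero, watermarking, or DetectGPT, and the threshold is arbitrary.

```python
# A crude perplexity-based detection heuristic; illustrative only.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # mean token cross-entropy
    return torch.exp(loss).item()

def looks_machine_generated(text, threshold=40.0):
    # Lower perplexity is weak evidence of machine generation;
    # the threshold is arbitrary and dataset-dependent.
    return perplexity(text) < threshold
```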
With small resources, as discussed above, one does not expect to change the underlying foundation model.
Note: There may not be a clear boundary between resource categories. Some projects mentioned in this subsection may require medium, large, or even extra large resources, e.g., integrating foundation models with tools and research for foundation models, even fundamentals.
With Medium, Large, or Extra Large Resources
With medium resources, one may fine-tune a foundation model with domain data, resulting in a model specialized in, e.g., healthcare, education, or law. In this case, one needs access to the model and updates part or all of the weights. See a blog by Sebastian Ruder (in Feb 2021) or Section 4.3 Adaptation in the foundation models paper (updated July 2022).
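A minimal sketch of such domain fine-tuning with the Hugging Face Trainer follows, using a small base model and a hypothetical in-domain text file as placeholders; fine-tuning a genuinely large model involves far more engineering (distributed training, careful evaluation, data curation).

```python
# A minimal causal-LM fine-tuning sketch; model name and data file are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"                         # stand-in for a larger base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# "domain_corpus.txt" is a hypothetical file of in-domain text (e.g., clinical notes).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```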
With large resources, one may consider building an in-house large model. Yet such a model is relatively small, e.g., with parameters at the level of billions rather than hundreds of billions. Examples include Stanford PubMedGPT 2.7B and Microsoft BioGPT for healthcare, and ProGen for biotechnology.
With extra large resources, one should consider building a large model, and importantly, making improvements to the state of the art as much as possible.
With large, especially extra large, resources, one should think big and far, in particular, how to achieve similar or even better performance with much lower resource requirements, how to approach or even achieve AGI, and how to shift the current paradigm. See next section for more discussions.
What’s next?
A plausible paradigm shift is both promising and challenging. The current large language models or foundation models have shown impressive capacities, yet they are far from perfect, with issues of factuality, safety, reasoning, etc. Even so, we expect a wider adoption, a lot of customized foundation models for many imaginable applications, and a storm of downstream AI models.
Impacts on Society
We expect AI Stores to have impacts, both positive and negative, on society, in all aspects, e.g., education (an interview with Noam Chomsky, a position paper, and graduate-level exams), copyright, industries, research, and science, and, for programming in particular, job security and even the end of it. We expect AI Stores to undergo the creative destruction process, with the Code Red response from Google as a good example. We expect all sorts of discussions and debates, e.g., the blurry JPEG analogy.
Deep Learning
Deep learning, including (un/self-)supervised learning and representation learning, and in particular Transformers (and improvements to their efficiency), is indispensable for LLMs and foundation models. This is basically common sense now. Even many kids are starting to study AI/Python coding.
Reinforcement Learning
Reinforcement learning from human feedback (RLHF) plays a critical role in human alignment and facilitates learning of the objective function, the single most important ingredient for all optimization tasks. OpenAI ChatGPT collects human data. Deepmind Sparrow and Anthropic Claude design rules and reduce the reliance on human involvement. RL is also applied in algorithmic settings in LLMs, e.g., prompt optimization (like RLPrompt and TEMPERA), offline learning, hindsight instruction relabeling, etc.
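To make the RLHF recipe slightly more concrete, here is a minimal sketch of the reward-model step: a scalar reward model trained on human preference pairs with the standard pairwise loss. The encoder is a placeholder, and the trained reward model would then serve as the objective for RL fine-tuning of the policy (e.g., with PPO).

```python
# A minimal reward-model sketch for RLHF; the text encoder is a placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, encoder, hidden_size):
        super().__init__()
        self.encoder = encoder                  # any text encoder -> [batch, hidden]
        self.head = nn.Linear(hidden_size, 1)   # scalar reward per response

    def forward(self, texts):
        return self.head(self.encoder(texts)).squeeze(-1)

def preference_loss(reward_model, chosen, rejected):
    # Encourage reward(chosen) > reward(rejected) for each human-labelled pair.
    r_chosen = reward_model(chosen)
    r_rejected = reward_model(rejected)
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```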
Excitingly, there is already an RL adaptive agent based on foundation models. It is an inspiring starting point for RL generalists targeting various applications, e.g., in one or several areas of operations research, optimal control, economics, finance, and computer systems.
The classic experiment by Held and Hein showed that “only the active kittens developed meaningful visually-guided behavior that was tested in separate tasks”. RL is an ideal option to address such interactive perception. Google SayCan studies grounding language in robotic affordances. Pierre-Yves Oudeyer and colleagues study functional grounding of LLMs with RL. Sergey Levine discusses the purpose of a language model and how RL can help fulfill it.
We expect that reinforcement learning and foundation models will influence each other significantly in the near and far future.
Universal Turing Machine
Dale Schuurmans shows that “transformer-based large language models are computationally universal when augmented with an external memory”. Giannou et al. show that “transformer networks can be used as universal computers by programming them with specific weights and placing them in a loop”.
Artificial General Intelligence
There are heated debates about whether LLMs are the right path to artificial general intelligence (AGI) and human-like intelligence, and how far ChatGPT is from them. See, e.g., LeCun’s take on current auto-regressive LLMs. One way to keep up-to-date with such discussions is to follow experts on Twitter, e.g., Yann LeCun, Gary Marcus, and Melanie Mitchell.
There are emergent abilities of LLMs with in-context learning and chain-of-thought prompting. ReAct combines action prompts with chain-of-thought for better reasoning. See also a blog about emergent abilities in ChatGPT and a talk about the role of data and optimization for emergence. More studies are needed to establish the reliability and interpretability of emergent abilities.
People may wonder, can we expect reasoning as emergent abilities, based on maximum likelihood language models? People may wonder, can we expect intervention and counterfactuals as emergent abilities, based on a computing/learning architecture merely for association?
One hypothesis is that, in an LLM, a correlation-based method somehow recovers causation phenomena present in large language corpora, somewhat like an imitation learning method can, to some extent, learn a near-optimal sequential decision policy. Expert moves helped AlphaGo; however, tabula rasa is the right way for computer Go, as shown by AlphaZero, although for real-life problems, reincarnating RL would be helpful.
People will investigate how to handle causation, logic, and reasoning, whether combined with foundation models or not, to fully achieve AGI, which also demands answers to inquiries from cognitive science, neuroscience, and philosophy. AGI is still an open problem. To scale, or not to scale, that is the question.
Future Work
For research in computing/AI, the following questions appear particularly interesting:
Can we design a new architecture, which requires 1000x less compute and data than Transformers, with an innate reasoning capacity?
ENIAC weighed 30 tons; an iPhone weighs a little more than 100 g, fits in a small pocket, and has far greater compute, storage, and communication capacities. Foundation models now have hundreds of billions of parameters. Can we build the “iPhone version of foundation models”?
Epilogue
AI Stores, with the current large language models, foundation models, and AI in general, present compelling products and services, i.e., through AI models. Meanwhile, there is still large room for the current AI technologies to improve. Cost is not a negligible factor, either. For AI Stores to prosper, the foundation has to be solid, technically and economically.
More and more people, especially early adopters, will be using ChatGPT, DALL·E and the like, i.e., as customers of AI Stores. More and more geeks and entrepreneurs will be developing downstream AI models, to enrich the products and services in AI Stores. More and more researchers and developers will be busy continuously solidifying the foundation of AI Stores.
An AI Store needs background music, right? Here we go:
Learning From Feedback (I Want ChatGPT) by Pablo Samuel Castro.
To True AI.
2023.02.14.
(In Mandarin, “AI” is pronounced the same as the word for love.)