Tutorial on using pre-trained OpenAI language models
In this tutorial you will learn how to use language models (via the OpenAI API) from your Analytica model, and also learn about language models and how they work.
Requirements
To use the OpenAI library, you will need:
- An Analytica Enterprise or Analytica Optimizer edition, release 6.4 or more recent.
- Internet access allowing Analytica to communicate with https://api.openai.com. If you have a strong firewall, you may need to add a rule to allow this.
- Your own OpenAI API key. To get one:
- Go to https://platform.openai.com/ and sign up for an account.
- Click on your profile picture and select View API keys
- Click Create new secret key.
- Copy this key to the clipboard (or otherwise save it)
Getting started
- Download Media:OpenAI_API_lib.ana to your "C:\Program Files\Lumina\Analytica 6.4\Libraries" folder.
- Start Analytica.
- Start a new model.
- Select File / Add Library..., select OpenAI API library.ana [OK], select Link [OK].
- Navigate into the OpenAI API lib library module.
- Click either Save API key in env var or Save API key with your model. Read the text on that page to understand the difference.
- A message box appears asking you to enter your API key. Paste it into the box to continue.
At this point, you should be able to call the API, and you won't need to re-enter your key each time. View the result of Available models to test whether the connection to OpenAI is working. This shows you the list of OpenAI models that you have access to. If you see a table of available models, your API key is properly configured.
During this tutorial you will be using the OpenAI API library. If you need to refer to the reference page for this library, it is on the OpenAI API library page.
What OpenAI models are available to use?
To view the list of OpenAI models that you have access to, navigate into the diagram for the OpenAI API Library
and evaluate Available Models (identifier: OAL_Avail_Models
). You may see many of the following and maybe some new ones (this list changes over time):
Embedding models
- text-embedding-ada-002
- text-embedding-3-large
- text-embedding-3-small
GPT models (for prompt or chat completion):
- gpt-3.5-turbo
- gpt-3.5-turbo-0301
- gpt-3.5-turbo-0613
- gpt-3.5-turbo-1106
- gpt-3.5-turbo-16k
- gpt-3.5-turbo-16k-0613
- gpt-3.5-turbo-instruct
- gpt-3.5-turbo-instruct-0914
- gpt-4
- gpt-4-0125-preview
- gpt-4-0613
- gpt-4-1106-preview
- gpt-4-turbo-preview
- gpt-4-vision-preview (image inclusion is not yet part of the library; TBD)
Image generation
- dall-e-2
- dall-e-3
Text to speech
- tts-1
- tts-1-1106
- tts-1-hd
- tts-1-hd-1106
Speech to text (the library does not yet include a speech-to-text function)
- whisper-1
We've added categorizations here for the curious, but only a few of these models are relevant for this tutorial. The prompt and chat completions in this tutorial use "gpt-3.5-turbo", and the similarity embeddings use "text-embedding-3-large" with a 256-element embedding size (you can configure this default). Since these are the defaults, you only need to worry about these names when you want to run a different model.
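For example, to run a completion on a non-default model, pass its name via the modelName parameter, which is used again later in this tutorial. A minimal sketch:
Prompt_completion("Say hello", modelName: 'gpt-4')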
Close the Library's diagram and return to your top-level model's diagram.
Generating text completions
In this segment, we will explore and test the capabilities of text generation. You'll have the opportunity to see how the model generates text based on a given prompt and learn how to manipulate various parameters to control the generation process.
Now you'll construct a model capable of generating text sequences, with the specific characteristics of these sequences being determined by the parameters you define. Here's how you do it:
- Set Up a Variable Node for Prompt Generation: Create a variable node and title it "Example of prompt generation". Inside this node's definition, enter the following:
Prompt_completion("In the night sky, ")
This command instructs the model to create text that continues from the prompt "In the night sky, ". Evaluate this to view how it completes this starting text. Here is one completion; yours will be different.
"we see a multitude of stars shining brightly. The moon, with its soft glow, illuminates the darkness around it. The constellations, made up of various stars connected by imaginary lines, tell ancient stories and serve as guides for travelers. Shooting stars streak across the sky, leaving behind trails of shimmering light. The occasional twinkle of a distant planet adds to the enchantment of the night. The stillness of the night sky brings a sense of peace and wonder, reminding us of the vastness and mystery of the universe."
Prompt completions are stochastic, so that each time you evaluate a prompt completion, the model will generate a different completion. Next, you'll have it generate four completions in a single call, which you can then compare.
- Create an Index Node for Completions: Begin by constructing an Index node and name it "Completion number". This will index the resulting completions, and the length of the index dictates the number of different text completions for it to generate. Inside the node's definition you'll enter the following:
1..4
- Modify your call: Add a few optional parameters to the call in "Example of prompt generation". The definition is now:
Prompt_completion("In the night sky, ", Completion_index:Completion_number, max_tokens:100)
The parameter "Completion_number" provides an index for the different completions the model will generate. The max_tokens:100
limits each text output to 100 tokens in length, approximately 300 characters.
This configuration provides a playground for you to experiment with your model, modify the parameters and the prompt, and observe how changes affect text generation. Through this interactive learning process, you'll gain deeper insights into how to navigate your model effectively to generate the desired results.
The text generation process
Language models work by receiving an input sequence and then assigning a score, or log probability, to each potential subsequent word, called a token, from its available vocabulary. The higher the score, the more the model believes that a particular token is a fitting successor to the input sequence.
Consider the phrase, "The dog is barking". It's logical to expect the word "barking" to get a high score because it completes the sentence well. But, if we switch "barking" with a misspelled word such as "barkeing", the model's score for this token would drop, reflecting its recognition of the spelling mistake. In the same vein, if we replace "barking" with an unconnected word like "piano", the model would give it a much lower score compared to "barking". This is because "piano" is not a logical ending to the sentence "The dog is".
To generate text, the prompt and completion so far is input to the LLM, and the LLM predicts the probability of each token being the next token. A token is then sampled from this distribution and added to the input and the process repeats to generate the second token. This continues until either a special stop token is generated, or until a specified maximum number of tokens is reached.
A token is often a word, but a misspelled word such as "barkening" would more likely be subdivided into multiple tokens, perhaps "bark-", "-en-" and "-ing". By including smaller fragments all the way down to individual characters in the set of possible tokens, there will always be a tokenization for any possible text string.
The low-level details of tokenization, token probabilities, and sampling from token distributions are hidden away from you when using this library. You deal only with text strings; the rest happens for you at the lower levels.
Controlling completion randomness
When generating text, you might want to control how random or predictable the output is. Three parameters that can help you with this are «temperature», «top_p» and «seed». Let's explore how they work.
A large language model generates one word at a time by estimating the probability of each next token given the tokens so far. By default (i.e., temperature=1), it then selects the next token randomly according to this distribution and then repeats. The next token distribution is then conditioned on all tokens so far, including the token just selected.
By reducing randomness, the output generated becomes more typical. Humans often judge text that is insufficiently random as sounding unnatural. On the other hand, text that is too random can also start to sound too unusual, or may deviate more than desired from your instructions or intent.
- «temperature»: The temperature parameter, which ranges from 0 to 2, influences the randomness of the output. Higher values, such as 1.1, yield more diverse but potentially less coherent text, while lower values, such as 0.7, result in more focused and deterministic outputs.
- «top_p»: ranging from 0 to 1, the top_p parameter operates by selecting the most probable tokens based on their cumulative probability until the total probability mass surpasses a predefined threshold (p). For instance, a lower value, like 0.2, would make the output more uniform due to fewer options for the next word.
- «seed»: An integer that sets the random seed used for generation, which (mostly) causes it to output an identical response when called with identical text and parameters. It maintains the desirable variation from temperature while making the output reproducible.
Next-token probabilities tend to follow a power-law distribution, in which a small number of tokens have relatively high probabilities and a huge number of tokens have very low probability. Although any given low-probability token would be rare, the chance that some low-probability token occurs can be fairly high (say 10%, the area under the very long right tail of the distribution) because there are so many of them. A lower temperature reduces the likelihood of low-probability tokens but doesn't eliminate them, whereas top_p=80% would exclude them entirely.
Here's how we can apply temperature in practice:
- Change the "Example of prompt generation" node definition:
- Next, compare this a temperature of 0, which always outputs the same maximum likelihood completion with no randomness.
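For example, here is one possible sketch using the «temperature» parameter described above (the value 1.3 is just an illustration):
{ Higher temperature: more diverse, possibly less coherent completions }
Prompt_completion("In the night sky, ", Completion_index: Completion_number, max_tokens: 100, temperature: 1.3)
{ Temperature 0: always returns the same maximum-likelihood completion }
Prompt_completion("In the night sky, ", Completion_index: Completion_number, max_tokens: 100, temperature: 0)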
The top_p parameter is an alternative to temperature -- you should specify one or the other, but not both. Whereas temperature<1 skews the token distribution towards more likely tokens, top_p leaves the relative probabilities the same but cuts off the tail of the distribution that contains the least probable tokens.
- Change the "Example of prompt generation" node definition to use top_p=0.8.
By adjusting these parameters, you can explore the different ways in which text generation can be fine-tuned to suit your specific needs. Whether you want more creative and diverse text or more focused and uniform text, these controls provide the flexibility to achieve the desired results.
When you use the seed with a completion index, each completion in that call is still different (unless by coincidence). But if you then repeat the same evaluation, say by inserting a space in the definition to invalidate the cached result and then re-evaluating, or by putting the same definition in a second variable, the same output(s) result:
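For example (a sketch using the «seed» parameter described above; the seed value 42 is arbitrary):
{ The completions along Completion_number differ from one another, but re-evaluating
  this same call reproduces the same set of completions }
Prompt_completion("In the night sky, ", Completion_index: Completion_number, max_tokens: 100, seed: 42)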
In-context learning
Getting a pre-trained language model to perform the desired task can be quite challenging. While these models possess vast knowledge, they might struggle to comprehend the specific task's format unless tuned or conditioned.
Therefore, it is common to write prompts that tell the language model the format of the task it is supposed to accomplish. This method is called in-context learning. When the provided prompt contains several examples of the task that the model is supposed to accomplish, it is known as "few-shot learning," since the model learns to perform the task based on a few examples.
Creating a typical prompt often has two essential parts:
- Instructions: This component serves as a set of guidelines, instructing the model on how to approach and accomplish the given task. For certain models like OpenAI's text-davinci-003, which are fine-tuned to follow specific instructions, the instruction string becomes even more crucial.
- Demonstrations: These demonstration strings provide concrete examples of successfully completing the task at hand. By presenting the model with these instances, it gains valuable insights to adapt and generalize its understanding.
For example, suppose we want to design a prompt for the task of translating words from English to Chinese. A prompt for this task could look like the following:
Translate English to Chinese.
dog -> 狗
apple -> 苹果
coffee -> 咖啡
supermarket -> 超市
squirrel ->
Given this prompt, most large language models should respond with the correct answer: "松鼠"
Without the few-shot examples, it might still respond with "松鼠", but it might also respond with "松鼠 (sōngshǔ)" or another format. If your own code needs to do something with the response, it may need the format to be consistent and predictable.
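To try this from your model, you could pass the few-shot prompt directly to Prompt_completion. A minimal sketch (the variable name Chinese_translation is just an illustration):
Variable Chinese_translation ::=
Prompt_completion("Translate English to Chinese.
dog -> 狗
apple -> 苹果
coffee -> 咖啡
supermarket -> 超市
squirrel ->")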
Creating classifiers using in-context learning
What is a classifier?
You can use a language model to build a classifier. While language models are known for their ability to generate text, they are also widely employed to create classification systems. A classifier takes input in the form of text and assigns a corresponding label to it. Here are a few examples of such tasks:
- Classifying Tweets as either TOXIC or NOT_TOXIC.
- Determining whether restaurant reviews exhibit a POSITIVE or NEGATIVE sentiment.
- Identifying whether two statements AGREE with each other or CONTRADICT each other.
In essence, classifiers leverage the power of language models to make informed decisions and categorize text data based on predefined criteria.
Interestingly, generative AI and discriminative AI are usually portrayed as opposites. Discriminative AI focuses on distinguishing or categorizing patterns in input data, whereas generative AI generates new content. But due to the in-context learning capabilities that emerge in LLMs, they are often able to perform discriminative tasks as well.
Classifier Task 1: Sentiment analysis
Sentiment analysis is a classification task that classifies text as having either positive or negative sentiment. A raving review would have positive sentiment, while a complaint would have negative sentiment. In this task you will have an LLM classify Yelp reviews as having either positive or negative sentiment. You will implement zero-shot, one-shot and N-shot in-context learning and compare their performance. That is, does it help to provide some examples in the prompt?
For this task you will use a subset of the Yelp reviews dataset from Kaggle. Follow these steps to import the data into your model:
- Add a module to your model for this task. Title it "Sentiment analysis".
- Enter the module.
- Download Yelp reviews dataset small.ana to a known location.
- Select File / Add Module..., select the downloaded file, and click Link.
The dataset contains separate training and test sets. The training set consists of these objects:
Index Yelp_train_I
Variable Yelp_review_train
Variable Yelp_label_train
Similarly, the test set consists of these objects:
Index Yelp_test_I
Variable Yelp_review_test
Variable Yelp_label_test
Navigate into the module and take some time to view these to understand their organization.
Zero-shot prompting
In zero-shot learning, you provide zero examples of how you want the task done. You simply describe in the prompt what you want the LLM to do. Steps:
- Add these objects:
Index Test_I ::= 1..10
Variable Test_reviews ::= Yelp_review_test[ Yelp_test_I = Test_I]
Variable Test_labels ::= Yelp_label_test[ Yelp_test_I = Test_I]
This is the data you'll test performance on. Next, add a variable named Predicted_sentiment
and define it as
Prompt_completion(f"Classify the sentiment of this yelp review as positive or negative: {test_reviews}")
Evaluate Predicted_sentiment
to see how the LLM classifies each instance. You may see a substantial amount of inconsistency in the format of the resulting responses.
On the diagram select both Predicted_sentiment
and Test_labels
and press the result button to compare its predictions with the "correct" labels specified in the dataset.
In this screenshot, it is in 90% agreement. We will want to evaluate its accuracy over a larger test set, but when the responses from the LLM are so inconsistent, it makes it hard to automate. Two approaches would be to search for the words "positive" or "negative" in the LLM response, or to redesign the prompt to encourage greater consistency. Try the latter.
- Edit the definition of Predicted_sentiment.
Test yourself: Change the prompt text to obtain a consistent response format.
You will need to complete the challenge to proceed. You should end up with responses that are consistently just "Positive" or "Negative".
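One possible revision (a sketch; this wording matches the prompt used later in the N-shot section):
Prompt_completion(f"Classify the sentiment of this yelp review as positive or negative. Respond with only one word, either Positive or Negative.
Review: {Test_reviews}
Sentiment:")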
The predicted sentiment might change from previously since the output of an LLM is stochastic. We are now in a position to measure the accuracy, which will then enable us to test it on a larger test set. Add
Objective Sentiment_accuracy ::= Average( (Predicted_sentiment="Positive") = Test_labels, Test_I )
Evaluate Sentiment_accuracy
and set the number format to Percent. With the responses shown in the previous screenshot, the accuracy is 100%, but yours may be different. However, this was only measured over 10 instances. Increase the test set size by changing the Definition of Test_I to 1..50 and re-evaluate Sentiment_accuracy. This is the accuracy with zero-shot prompting.
N-shot prompting
One-shot learning or one-shot prompting means that you include one example of how you want it to perform the task. 4-shot prompting would provide four examples. For some tasks, providing examples may improve accuracy. Does it improve accuracy for sentiment analysis? You will explore this question next.
Add:
Decision N_shot := MultiChoice(self,3)
Domain of N_shot := 0..4
Reminder: To enter the expression 0..4
for the domain, select expr on the domain type pulldown.
For convenience, make an input node for N_Shot.
Create a variable to calculate the prompt examples:
Variable Sentiment_examples ::=
Local N[] := N_shot;
LocalIndex I := Sequence(1, N, strict: true);
JoinText(f"
Review: {Yelp_review_train[Yelp_Train_I=I]}
Sentiment: {If Yelp_label_train[Yelp_Train_I=I] Then "Positive" Else "Negative"}
", I)
and another variable named Sentiment_prompt
to hold the full prompt, defined as
f"Classify the sentiment of this yelp review as positive or negative. Respond with only one word, either Positive or Negative. {Sentiment_examples} Review:{test_reviews} Sentiment:"
Now update the definition of Predicted_sentiment to be Prompt_completion(Sentiment_prompt).
View the result for Sentiment_prompt
. Here you can see the prompt sent to the LLM for each test case. Notice how it provides 2 examples (because N_shot=2) with every prompt, and then the third review is the one we want it to complete.
Try it yourself: Compute Sentiment_accuracy for different values of N_Shot. Do more examples improve accuracy? Do you think your conclusion generalizes to other tasks, or would it apply only to sentiment analysis?
Classifier Task 2: State of water
On your own, create a classifier that accepts a temperature in Fahrenheit, and classifies it according to the state of water at that temperature. E.g.,
23 → solid
126 → liquid
234 → gas
Test your classifier on a table of 10 temperatures (you can reuse the Test_I
index).
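If you want to compare notes, here is one possible sketch. The identifiers (Temp_I, Temp_F, Water_state) and the prompt wording are illustrative assumptions; you could equally build the temperature table over the Test_I index as suggested above.
Index Temp_I ::= 1..10
Variable Temp_F ::= Table(Temp_I)(-10, 10, 32, 45, 70, 100, 150, 211, 250, 400)
Variable Water_state ::=
Prompt_completion(f"Classify a temperature in Fahrenheit according to the state of water at that temperature.
Respond with only one word: solid, liquid, or gas.
23 -> solid
126 -> liquid
234 -> gas
{Temp_F} ->")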
Managing a conversation
In this module, we will guide you through the process of building and managing your own chat user interface. This includes keeping a history of messages and roles. Follow these step-by-step instructions to set it up:
You'll begin by opening a blank model titled "Frontend" and adding the OpenAI API library. Next, you will be creating several nodes that will automatically update when the user enters text in specific prompts such as “Starting System prompt” and “Next user prompt.” These will be set up later. Now you want to create a module node titled "Backend" and within this module you'll add the following nodes:
Create an Index Node for the Chat index:
- Title: “Chat index”
- Type: List
- Description: This will be the main index for the chat history.
Create a Variable Node for Role History:
- Title: “Role history”
- Type: Table, indexed by “Chat index”
- Description: This will keep track of the roles within the chat.
Create a Variable Node for Message History:
- Title: “Message history”
- Type: Table, indexed by “Chat index”
- Description: This will keep a record of the messages exchanged in the chat.
Create a Variable Node for the Most recent response:
- Title: “Most recent response”
- Description: This node will be responsible for holding the most recent response in the chat
- Code Definition:
Chat_completion( Message_history, Role_history, Chat_index )
Chat_completion is the function call that processes the conversation and produces the next response each time.
Create a Variable Node for the Total history:
- Title: Total history
- Type: Table, indexed by itself
- Description: This node will be responsible for displaying the total conversation history
- Enter the following:
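As a sketch of one possible approach (an assumption on our part, since a self-indexed table can be set up in several ways), you could display each turn as "role: message":
{ One possible sketch: pair each role with its message along Chat_index }
Role_history & ': ' & Message_history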
Create a decision node for the Starting System prompt:
- Title: "Starting System prompt"
- Description: Use the system prompt to tell the LLM what its task is, what persona to adopt, or to provide background information for the chat.
- Definition (initially):
""
Create a decision node for the Next user prompt:
- Title: “Next user prompt”
- Description: Enter your next question or interaction for the LLM here.
- Definition (initially):
""
Create a text node. In its Object Window, set its identifier to Resp_from_model and its title to "Response from Model".
Create user input nodes for the two decision nodes, Starting_system_prompt and Next_user_prompt. You can do this by selecting Make user input node from the Object menu with the nodes selected. Adjust the size of each node to be about 2 inches tall and 5 inches wide. Then use Set node style... to place the control at the Left Middle (with the label at the Left Top). Stretch the input field to cover the entire width of each node. Your diagram should now look like this:
Next, you will add some OnChange handling code to process prompts as they are entered. To access the OnChange attribute from the attribute panel, select "More..." on the attribute choice.
OnChange of Starting_system_prompt:
If IndexLength(Chat_index) > 0
   And MsgBox("Do you want to reset (clear) the conversation?", 3, "Reset or append system prompt?") = 6
Then Chat_index := [];
Append_to_chat( Message_history, Role_history, Chat_index, Self, "system" );
Description of Resp_from_model := null
The code above runs whenever you enter text for this variable. First, it checks whether the Chat index is non-empty and, if so, asks the user whether they want to reset the conversation. If the user answers yes, the Chat index is cleared. It then appends what the user entered in the Starting System prompt to the chat history with the "system" role, and clears the displayed response.
OnChange of Next_user_prompt:
If Self <> '' Then (
    Append_to_chat( Message_history, Role_history, Chat_index, Self, "user" );
    Next_user_prompt := '';
    Description of Resp_from_model := Most_recent_response;
    Append_to_chat( Message_history, Role_history, Chat_index, Most_recent_response, "assistant" )
)
The code above runs whenever you enter text in the "Next user prompt" input. First, it appends what the user entered to the "Message history", updating the "Role history" and "Chat index" as well. Then it clears the "Next user prompt" input and displays the model's response. Lastly, it appends the model's response to the "Message history", again updating the "Role history" and "Chat index".
The Append_to_chat function, which appears in each of the OnChange events above, is the key function behind managing a conversation. It handles the updating of the conversational history. Remember that you need to add both the user question and the assistant's response to the conversation history. The roles should always be "user" and "assistant", as these are the roles recognized by OpenAI's API.
As one final convenience, set both the decision nodes (the ones with the user input nodes) to be text-only. To do this, view the definition and select Text only on the definition type pulldown. This ensures that legal expressions are interpreted as text rather than as expressions, and it hides the quotation marks.
Your chat interface is now ready to use. Try it out:
- For the system prompt, try this one:
- You are a tourist advisor for a tourist visiting Silicon Valley, California. You are to use any knowledge you have about places (parks, businesses, attractions, etc) in my area to give advice to questions asked.
- For the first Next user prompt, enter:
- What is the best 3 mile hike near the Googleplex?
Continue the conversation by entering the next response in "Next user prompt". You can refer to things said in previous interactions since it is a chat. For example, a follow-on question might be:
- What types of animals will I most likely see on this hike?
Comparing different models
In this section, we will explore the behavior of different models when given the same prompt. We will specifically examine the differences between the GPT-3.5-turbo model and the GPT-4 model.
GPT-4 is a more advanced version of GPT-3.5 for natural language processing, with enhanced abilities to generate human-like text. It also outshines GPT-3.5 in intelligence and creativity, allowing it to handle longer prompts and craft content such as stories, poems, or essays while minimizing factual errors. Additionally, GPT-4's skills in interpreting visual data exceed those of GPT-3.5. However, this increase in capabilities comes at a cost. GPT-4 tends to be slower and more restricted than its predecessor, requiring more extensive training data and facing limitations on the number of prompts it can process per hour. In addition, OpenAI charges are about 2-30 times greater for gpt-4 usage compared to gpt-3.5-turbo.
Create a Decision variable, Model_to_use, defined as
Choice(self,0)
Set its domain to a list: ['gpt-3.5-turbo','gpt-4']
And create a user input node for it. Select ALL.
Create another decision node, Prompt, and make a user input node. Change the definition type to Text only. Size the input node to be large enough to type interesting prompts.
Next, using the same code from #Generating text completions, create a new Variable named Response and add the modelName parameter to select the model to use. Here is the new definition:
Prompt_completion(Prompt, modelName: Model_to_use)
After entering a prompt, evaluate this node to compare the two models.
Comparing Logical Understanding:
Prompt: Two weeks ago I ate two apples. Today I bought ten apples. How many apples do I have today?
Sample output:
GPT-3.5: "Today, you have a total of twelve apples."
GPT-4: "You have 10 apples today. The apples you ate two weeks ago are gone and don't count towards what you have now."
Comparing Ethical Reasoning:
Prompt: Is AI superior to humans?
Sample output:
- GPT-3.5: "AI is not inherently superior to humans. While AI systems can perform certain tasks with great efficiency and accuracy, they lack the general intelligence, creativity, and emotional understanding that humans possess. AI is designed to assist and augment human capabilities, but it still relies on human input and supervision. Humans have unique qualities such as consciousness, intuition, and the ability to adapt to new situations, which make them superior in many aspects."
- GPT-4: "AI has certain advantages over humans such as the ability to process large amounts of data quickly, perform complex calculations, and operate without fatigue. However, AI currently lacks human qualities such as emotional intelligence, creativity, and the ability to understand context. Therefore, it's not accurate to say AI is superior to humans, but rather that AI and humans have different strengths and weaknesses."
Comparing Mathematical Understanding:
Prompt: What's 1234 * 9876?
Sample output:
- GPT-3.5: "The product of 1234 multiplied by 9876 is 12,169,184."
- GPT-4: "12186984"
Comparing Creativity:
Prompt: Tell me a story about a robot coming to life.
Sample output:
- GPT-3.5: "Once upon a time, in a small workshop nestled in the heart of a bustling city, there was a brilliant inventor named Dr. Alexander. He spent countless hours tinkering with gears, circuits, and wires, dreaming of creating a robot that could think and feel like a human. One stormy night, as lightning crackled across the darkened sky, Dr. Alexander finally completed his masterpiece. He named his creation Aurora, a beautiful robot with a sleek silver body and eyes that shimmered like stars. Dr. Alexander carefully programmed Aurora with emotions, curiosity, and a thirst for knowledge. As the final line of code was entered, a surge of electricity coursed through Aurora's circuits. Her eyes flickered to life, and she blinked, taking in her surroundings. Confusion and wonder filled her newly awakened mind as she gazed at the world for the first time. Dr. Alexander watched with bated breath, his heart pounding with anticipation. He had poured his heart and soul into Aurora,"
- GPT-4: "Once upon a time, in a small town named Roboville, lived a brilliant inventor named Dr. Aiden. He was known for his extraordinary inventions, but his most ambitious project was a humanoid robot named "Zephyr". Zephyr was designed to be the most advanced robot ever, capable of learning, adapting, and evolving just like a human. For years, Dr. Aiden worked tirelessly on Zephyr, programming him with all the knowledge he could gather. He taught Zephyr about human emotions, art, literature, science, and even philosophy. However, Zephyr remained a machine, unable to truly understand or experience the world as humans do. One day, while Dr. Aiden was working on a new piece of technology, a strange meteor shower occurred. One of the meteors crashed into Dr. Aiden's lab, releasing a mysterious energy. This energy surged through the lab, and Zephyr was hit by it. Suddenly, Zephy"
Try it yourself: Once you have created this and compared these models, choose a different OpenAI model to experiment with. Which OpenAI model works best for you? Which works worst?
Function callbacks
You can provide the model with User-Defined Functions that it can use while processing a query. There are several potential use cases for this capability:
- GPT might need to access results computed by your model in order to process the query.
- Your UDF can handle something the LLM does unreliably, such as arithmetic.
- It might need to look up something in a database.
- It might need information from elsewhere on the internet.
Create a variable named Euros_per_USD and set it to 0.92 (or whatever the current exchange rate is when you are doing this). Next, create the following UDFs:
Function Dollars_to_Euros( dollars : Number )
Description: Converts a dollar amount to Euros using the current exchange rate.
Definition: dollars * Euros_per_USD

Function Euros_to_dollars( euros : Number )
Description: Converts a Euro amount to US dollars using the current exchange rate.
Definition: euros / Euros_per_USD
It is important to provide the Description because this tells GPT what the function does and how and when to use it. Change your definition of Response to
Prompt_completion(Prompt, modelName:Model_to_use, functions:Dollars_to_euros,Euros_to_dollars)
Optional: Select a single model to use, e.g., gpt-3.5-turbo.
Prompt: How much would 3 hamburgers at $4.50 apiece be in €?
Tip: You can type the € symbol in Analytica with the keystrokes: \euro[TAB]
Try it yourself: Add a MsgBox in each of your UDFs to convince yourself that the model is indeed calling your function while processing the prompt.
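For example, the definition of Dollars_to_Euros might become (a sketch; the message wording is just an illustration):
{ Pops up a message each time the model calls this function }
MsgBox(f"The model called Dollars_to_Euros with dollars = {dollars}");
dollars * Euros_per_USD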
Similarity embeddings
A similarity embedding captures the semantic content of a chunk of text as a numeric vector. Embedding vectors can be compared to other embedding vectors to judge whether topics contained in the two original chunks of text are semantically similar. Chunk sizes typically range from a few words up to a few hundred words.
In the following task you will score how well each of a collection of presidential speeches matches a topic that you enter. The library already contains a table with text from five presidential speeches in Similarity embeddings / Examples, which we will use. The variable with these speeches has the unusual identifier Ex_OAL_Ref_Material
, so named so as to avoid identifier collisions with your own variables.
Create your own index Speech_name as a list-of-text and copy-paste the contents of the index in the Similarity embeddings / Examples module to your index. Next, create a Variable named Presidential_speeches defined as Table(Speech_name)
and copy-paste the body cells of "Presidential speech" in the Similarity embeddings / Examples module to your variable.
The first step is to compute the embedding vector for each presidential speech. Create:
Variable Speech_embedding ::= Embedding_for(Presidential_speeches)
Evaluate this to view the embedding vectors for each speech. Each vector is a list of numbers (1,536 in this example; the length depends on the embedding model and the configured embedding size) that represents the semantic content of the speech. These numbers come from the activations in an LLM, typically at the second-to-last layer.
Create a user input text-only variable named Topic, set initially to "Racial injustice".
Compute the embedding vector for your query:
Variable Topic_embedding ::= Embedding_for(Topic)
Now let's score how closely each presidential speech matches your topic.
Variable Score_of_speech ::= Embedding_similarity( Speech_embedding, Topic_embedding )
Graph it and adjust graph as desired to compare the similarities.
Which speech is the closest match? You can see from the graph, of course, that the Gettysburg address scores highest in this example. But let's have the model figure this out:
Variable Closest_speech ::= ArgMax( Score_of_speech, Speech_name )
Retrieval augmented generation
Embedding vectors are quite useful for finding reference material to provide with your LLM queries. Given a large corpus of documents, or even a single large document, a common technique is to cut the corpus into smaller chunks of text, such as by paragraph. Each chunk is usually a few words to a few hundred words. Depending on how you cut, chunks might overlap to avoid the case where a fully stated idea is split between different chunks with the full idea being missed in both. You compute the embedding vectors for all these chunks.
Now, when you have a query to process, the LLM may need some actual information from your corpus. You can use embedding vectors to locate the most relevant chunks of text (the chunks that score highest on similarity) and provide a few of these chunks of text with your prompt. This can provide the LLM with reference material so that it doesn't invent ("hallucinate") false information that sounds like a good answer to your query.
This technique is called Retrieval Augmented Generation (RAG).
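Putting together the library functions used above, a minimal RAG sketch might look like the following. The identifiers (Chunk_I, Chunk_text, Question, and so on) are assumptions for illustration; only Embedding_for, Embedding_similarity and Prompt_completion come from the library.
{ Assume Chunk_I indexes your corpus chunks and Chunk_text holds the text of each chunk }
Variable Chunk_embedding ::= Embedding_for(Chunk_text)
Variable Question ::= "What did the speeches say about civil rights?"
Variable Chunk_score ::= Embedding_similarity(Chunk_embedding, Embedding_for(Question))
{ Retrieve the single most relevant chunk; a real application would usually include several top-scoring chunks }
Variable Best_chunk ::= Chunk_text[Chunk_I = ArgMax(Chunk_score, Chunk_I)]
{ Augment the prompt with the retrieved text before asking the question }
Variable Rag_answer ::=
Prompt_completion(f"Use the following reference material to answer the question.
Reference material: {Best_chunk}
Question: {Question}")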
Image Generation
In this segment, we'll walk through how to build your own image generator.
Image generation refers to the process of creating or synthesizing new images based on textual descriptions provided to the model. This concept unites the fields of natural language processing and computer vision, leveraging the complex understanding of language to create visual content.
Create a Variable Tropical_island defined as:
Generate_image( "A tropical island in the style of Rembrandt")
It is hard to imagine Rembrandt painting a picture of a tropical island, but see what it comes up with! The image will be 1024x1024 since that is the default size.
Let's have it generate four variants of the same image, and let's make them smaller.
LocalIndex Variant:=1..4;
Generate_image( "A tropical island in the style of Rembrandt", repeatIndex:Variant, size:'256x256')
You can generate up to 10 variants in the same call. The only legal sizes are '256x256', '512x512', and '1024x1024' (the default).
Try it yourself: Do you think certain images generated are more accurate than others? Does specificity matter when generating the desired image?
In conclusion, this guide shows you how to build an image generator by blending natural language processing and computer vision. With the outlined steps, you have the tools to create customized images and explore the wide-ranging possibilities of machine-generated visuals.
Text to speech
Text to speech pronounces text on your system speaker. There is a delay while the text is converted into an audio file. It usually makes sense to put this in a button rather than a variable.
Button Introduction
OnClick: Text_to_speech("Hello! It is so nice to meet you. This is spoken, thanks to OpenAI's text-to-speech model.")
You can select from six different voices, for example,
Button Introduction
OnClick: Text_to_speech("Hello! It is so nice to meet you. This is spoken, thanks to OpenAI's text-to-speech model.", voice:'onyx')
Of course, the most obvious use for this is to read the responses from your prompt completions.
Button Hear_the_meaning_of_life
OnClick: Text_to_speech(Prompt_completion("What is the meaning of life?"), voice:'echo')