OpenAI API library
Release: 4.6 • 5.0 • 5.1 • 5.2 • 5.3 • 5.4 • 6.0 • 6.1 • 6.2 • 6.3 • 6.4 • 6.5
Requires the Analytica Enterprise edition or better
The OpenAI API library is a collection of Analytica functions that interface with generative A.I. models from within your Analytica model. You can leverage the flexibility of large language models (LLMs) to perform tasks that would be hard to do in a formal program or model, and you can generate images from text. The library is also a great way to learn about generative A.I. from within Analytica. This page is a reference to the functions in the library. It is accompanied by a Tutorial on using the library. Going through the tutorial is a great way to learn about LLMs.
Download: OpenAI API lib.ana (v. 0.1)
Requirements
To use this library, you must have:
- An Analytica Enterprise or Analytica Optimizer edition, release 6.3 or higher.
- Your own OpenAI API key.
- Internet access allowing Analytica to communicate with https://api.openai.com. If you have a strong firewall, you may need to add a rule to allow this.
To get an OpenAI API key
- Go to https://platform.openai.com/ and sign up for an account.
- Click on your profile picture and select View API keys.
- Click Create new secret key.
- Copy this key to the clipboard (or otherwise save it).
Getting started
- Download the library to your "C:\Program Files\Lumina\Analytica 6.5\Libraries" folder.
- Launch Analytica.
- Load your model, or start a new model.
- Select File / Add Library..., select OpenAI API library.ana, [OK], select Link, [OK].
- Navigate into the OpenAI API lib library module.
- Press either Save API key in env var or Save API key with your model. Read the text on that page to understand the difference.
- A message box appears asking you to enter your API key. Paste it into the box to continue.
At this point, you should be able to call the API, and you won't need to re-enter your key each time. View the result of Available models to test whether the connection to OpenAI is working. This shows you the list of OpenAI models that you have access to.
Text generation from a prompt
Many machine learning and A.I. inference tasks are performed by providing prompt text and asking an LLM to complete the text.
Function Prompt_completion( prompt, modelName, «optional parms» )
Returns a text completion from the provided starting «prompt». This example demonstrates the basic usage:
Prompt_completion("The little red corvette is a metaphor for") → "sensuality and excitement"
The function has multiple return values:
- The main return value is the textual response (the content).
- If «Completion_index» is specified, this is a set of completions indexed by «Completion_index».
- The finish_reason. Usually Null, but may be "stop" if a «stop_sequence» is encountered.
- Number of prompt tokens
- Number of tokens in the response
- A reference to the full response from the API call
Examples
Prompt_completion("Translate 'happy birthday' into Chinese") → "生日快乐 (shēng rì kuài lè)"
Local ( completion_text, finish_reason, prompt_tokens, completion_tokens, raw_response ) :=
        Prompt_completion("The little red corvette is a metaphor for")
Do [ completion_text, finish_reason, prompt_tokens, completion_tokens, raw_response ]
→
completion_text: "freedom, desire, and youthfulness"
finish_reason: "stop"
prompt_tokens: 16
completion_tokens: 15
raw_response: «ref»
Optional parameters
The function has many optional parameters:
- «modelName»: The OpenAI model to use. It must support Chat. 'gpt-3.5', 'gpt-3.5-turbo' and 'gpt-4' are common choices.
- «functions»: One or more functions that the LLM can call while generating its completions.
- «temperature»: A value between 0 and 2.
- Smaller values are more focused and deterministic; higher values are more random. Default = 1.
- «top_p»: A value 0<top_p<=1. An alternative to sampling temperature.
- Do not specify both «temperature» and «top_p».
- A value of 0.1 means only tokens comprising the top 10% of probability mass are considered.
- «Completion_index»: Specify an index if you want more than one alternative completion.
- The results will have this index if specified. The length of this index specifies how many completions to generate.
- «stop_sequences»: You can specify up to 4 stop sequences.
- When it generates one of these sequences, it stops and returns the completion up to that point.
- «max_tokens»: The maximum number of tokens to generate in the chat completion.
- «presence_penalty»: Number between -2.0 and 2.0.
- Positive values penalize new tokens based on whether they appear in the text so far.
- «frequency_penalty»: Number between -2.0 and 2.0.
- Positive values penalize new tokens based on their existing frequency in the text so far.
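For instance, a call that uses a few of these optional parameters might look like this (the prompt, parameter values, and the response shown are purely illustrative):
Prompt_completion("List three primary colors, separated by commas", modelName: 'gpt-3.5-turbo', temperature: 0.2, max_tokens: 20) → "Red, blue, yellow"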
See the tutorial on using this library for more details. Also, see #Function callbacks below.
Managing a chat
A chat is a conversation with several back-and-forth interactions with the LLM. To use a chat, you need to store the chat history in variables within your model. The Append_to_chat function makes it easy to manage your conversation history over successive interactions. The Chat_completion function processes the next response in a conversation.
A chat is encoded by three nodes:
- The chat index.
- A message history, which is a Table on the chat index.
- A role history, which is a Table on the chat index.
Each interaction has one of three possible roles:
'system'
,'user'
or'assistant'
. The first two mark text that you, your end-user, or your model creates, whereas'assistant'
marks the responses from the LLM. Typically you will have one'system'
prompt at the beginning of the chat with the instructions for the LLM.Function Append_to_chat( messageHistory, roleHistory, ChatIndex, message, role )
Destructively appends a new message (and role) to the end of a conversation. A conversation consists of three globals:
- A «ChatIndex», usually 1..n for n interactions so far.
- A «messageHistory» variable, defined as a Table( «ChatIndex» ).
- A «roleHistory», defined as a Table( «ChatIndex» ).
«message» is the new message to append. «role» should be either 'user', 'assistant' or 'system'.
Because this destructively changes global variables, it must be called from an event handler like OnClick or OnChange, and cannot be called while an evaluation is in progress.
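As a sketch of how these pieces fit together (the identifiers ChatIndex, Message_history, Role_history and Next_question are assumed names for illustration), the OnClick attribute of a "Send" button might be defined along these lines:
{ Append the user's new message, obtain the LLM's reply, then append the reply }
Append_to_chat( Message_history, Role_history, ChatIndex, Next_question, 'user' );
Local reply := Chat_completion( Message_history, Role_history, ChatIndex );
Append_to_chat( Message_history, Role_history, ChatIndex, reply, 'assistant' )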
Function Chat_completion( messages, roles, Chat_index, modelName, «more optional parameters»)
This returns the next response in a Chat given the history of messages so far.
Required parameters:
- «messages» : The sequence of messages so far, in order, indexed by «Chat_index».
- «roles» : Each message has a role, which must be one of 'system', 'user', 'assistant', or 'function'.
- «Chat_index»: Indexes the sequence of messages in the conversation
The function has multiple return values:
- The main return value is the textual response (the content).
- If «Completion_index» is specified, this is a set of completions indexed by «Completion_index».
- The finish_reason. Usually Null, but may be "stop" if a «stop_sequence» is encountered.
- Number of tokens in the prompt, which includes all the tokens in «messages».
- Number of tokens in the response
- A reference to the full response from the API call
Optional parameters:
- «modelName»: The OpenAI model to use. It must support Chat. 'gpt-3.5', 'gpt-3.5-turbo' and 'gpt-4' are common choices.
- «functions» : One or more functions that the LLM can call during its completions.
- «temperature»: A value between 0 and 2.
- Smaller values are more focused and deterministic; higher values are more random. Default = 1.
- «top_p»: A value 0<top_p<=1. An alternative to sampling temperature.
- Do not specify both «temperature» and «top_p».
- A value of 0.1 means only tokens comprising the top 10% of probability mass are considered.
- «Completion_index»: Specify an index if you want more than one alternative completion.
- The results will have this index if specified. The length of this index specifies how many completions are generated.
- «stop_sequences»: You can specify up to 4 stop sequences.
- When one of these sequences is generated, the API stops generating.
- «max_tokens»: The maximum number of tokens to generate in the chat completion.
- «presence_penalty»: Number between -2.0 and 2.0.
- Positive values penalize new tokens based on whether they appear in the text so far.
- «frequency_penalty»: Number between -2.0 and 2.0.
- Positive values penalize new tokens based on their existing frequency in the text so far.
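For example, assuming Message_history and Role_history are tables defined on ChatIndex as described above (the names are illustrative), you can capture several of the return values at once, just as with Prompt_completion:
Local ( content, finish_reason, prompt_tokens, completion_tokens, raw_response ) :=
        Chat_completion( Message_history, Role_history, ChatIndex, modelName: 'gpt-4', temperature: 0.5 )
Do content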
Function callbacks
You can provide the Prompt_completion and Chat_completion functions with your own User-Defined Functions that the LLM can call while it is generating the response to your prompt. You could use this, for example, to allow the LLM to gather results that your model computes and incorporate them into the conversation. You can also use this to provide tools for tasks it is not very good at on its own, such as arithmetic.
Your callback functions should have only simple parameters, accepting scalar text or numbers. The language models do not have a way to pass arrays or indexes. It is a good idea to qualify each parameter as either Text or Number. The Description of your function provides the language model with guidance for when it should use your function.
For example:
Function get_current_weather( location : text ; unit : text optional )
Description: Get the current weather in a given location
Parameter Enumerations:
	unit
		"celsius" |
		"fahrenheit" |
Definition:
AskMsgText(f"What is the current weather in {location}?","API function call")
To allow it to use this function, pass the function identifier in the «functions» parameter, e.g.,
Prompt_completion("Do I need an umbrella today? I'll be taking a hike in Portland, Oregon", functions: get_current_weather)
When you evaluate this, a message box appears on the screen asking you to provide the answer to "What is the current weather in Portland, Oregon?". This message box appears when the AskMsgText in get_current_weather is evaluated as a result of the LLM calling the function. Type: Drizzly with occasional thunder showers, and the final return value is
"Yes, it is recommended to bring an umbrella today as there are occasional thunder showers in Portland, Oregon."
Using a meaningful parameter name can help the language model understand what value to pass for that parameter, but the LLM often benefits from an additional description of each parameter. To do this, add the parameter descriptions inside your function Description using the following format:
Description:
	Get the current weather in a given location.
	Parameters:
	• location: The location in the format City, State. E.g., "Tucson, AZ".
	• unit: Units to use for temperature.
If a parameter description has more than one line, you should indent each line (using TAB). To find the parameter descriptions, the library looks for this format -- the title "Parameters", followed by lines where the first character is a bullet (either * or •, the latter is typed with the keystrokes \bullet[TAB]), followed by the parameter name, a colon, then the description. The parameter name may optionally appear inside chevrons, e.g., «location», in which case the bullet is optional.
Use the ParameterEnumeration attribute to specify possible enumerated values for parameters that expect specific values. You may need to make this attribute visible first. The parameter name should appear on its own line, then each enumerated value should appear on a separate indented line. Each value should be followed by a bar (|) and then an optional description of the value (this description isn't passed to the LLM, but is used by Expression Assist). Text values should appear with explicit quotes.
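For instance, the ParameterEnumeration attribute of the get_current_weather function above could contain the following (the descriptions after each bar are illustrative):
unit
	"celsius" | Temperature in degrees Celsius
	"fahrenheit" | Temperature in degrees Fahrenheit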
Similarity embeddings
A similarity embedding captures the semantic content of a chunk of text as a numeric vector. Embedding vectors can be compared to other embedding vectors to judge whether topics contained in the two original chunks of text are semantically similar. Chunk sizes typically range from a few words up to a few hundred words.
Similarity embeddings have many uses, one of the most common being Retrieval Augmented Generation, in which your code finds a small number of reference text chunks whose embeddings are similar to the user's question and then includes these in the LLM prompt, along with the actual question, when calling Prompt_completion or Chat_completion.
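As a minimal sketch of this pattern using the functions described below (the node names Reference_chunks, Chunk_index and User_question are assumptions for illustration):
Variable Chunk_embeddings := Embedding_for( Reference_chunks ) { indexed by Chunk_index and Ada_embedding }
Variable Question_similarity := Embedding_similarity( Embedding_for( User_question ), Chunk_embeddings )
Variable Best_chunk := Reference_chunks[ Chunk_index = ArgMax( Question_similarity, Chunk_index ) ]
Variable Answer := Prompt_completion( f"Use this reference to answer the question. Reference: {Best_chunk} Question: {User_question}" )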
Function Embedding_for(txt)
Returns an embedding vector for the text «txt». You can pass a single text or an array of texts. If you pass an array, all text strings are passed in a single API call. It uses the text-embedding-ada-002 OpenAI model, which is tuned for fast similarity embedding; the price charged by OpenAI for each embedding is extremely low. Each embedding is indexed by the index Ada_embedding.
Function Embedding_similarity( x, y )
Compares two embedding vectors, «x» and «y», and returns the similarity, where a larger number corresponds to being more similar.
Example:
- Index Record_num := 1..10
- Variable Processor_name ::= Table(Record_num) { CPU & GPU names in inconsistent formats }
- Variable Proc_name_embedding ::= Embedding_for(Processor_name)
- Variable Query ::= "Graphics card for gamer"
- Variable Similarity_to_query ::= Embedding_similarity( Embedding_for(Query), Proc_name_embedding )
Side note: To generate this plot, in Graph Setup / Chart type, select Bar chart, Swap horizontal and vertical, and Sort by data spread. Then press the XY button, check Use another variable and add Processor_name. Set the vertical axis pivot to Processor_name. Finally, right-click on the key for Record_num and select Hide Key.
Image generation
Function Generate_image( prompt, RepeatIndex, size )
Generates an image from a textual description.
- «prompt»: Textual prompt describing what you want an image of.
- «RepeatIndex»: (optional) Specify an index if you want multiple variants of the same «prompt». One variant is generated for each element of the index, up to 10.
- «size»: The size of the image. It must be one of '256x256', '512x512' or '1024x1024'. If unspecified, the default is '1024x1024'.
Example:
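For example, a call like the following (the prompt and size are just for illustration) returns the generated image, which you can view as the result of the variable that calls it:
Generate_image("A watercolor painting of a red sports car parked by a mountain lake at sunset", size: '512x512')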
API Errors
When calling OpenAI's API, you will encounter various errors from the API server (in the form of HTTP error codes). Unfortunately, these are common even when you don't have a bug in your own code, and are often hard to deal with in a graceful or automatic way. Future revisions of this library will hopefully incorporate enhancements that reduce the frustration of these errors as we gain more experience with ways to deal with them.
Two common error sources that we have observed are:
- Intermittent errors from the server. These are often codes like "page not found". The exact same set of queries may work when repeated, but it is unclear why the server reports the error in one run but not the next. We have seen these more often when sending a rapid sequence of calls, which might reflect a cause that is somehow correlated with bursts of calls, or it might just be that we notice them more when executing code that requires many calls.
- Rate limit exceeded errors when using the gpt-4 model. These occur even when only a small number of queries are sent in quick succession, such as when breaking a large page into 20 chunks and sending 20 queries in succession to process each chunk separately. The actual query rate appears to be far below the rate limits published by OpenAI, yet the errors still occur.
These errors can be especially frustrating because they both tend to occur while array abstracting over a problem (where each part is solved by a separate call). If an error causes the calculation to abort, the results for the earlier calls in that iteration are not retained. If you get the rate-limit errors with gpt-4, you are charged by OpenAI for the tokens in the queries that fail. (GPT-4 queries are fairly expensive -- for example, it can be on the order of $1 to process the text of a long article, whereas gpt-3.5-turbo queries are dirt cheap).
Errors like these that are issued by OpenAI's server tend to be cryptic and not very informative about the cause. At present, these errors and the frustration of dealing with them are probably the dominant limitation of the current library.
We expect the library to evolve with time, with issues like these being a big area for improvements, so you should be prepared to update to newer versions of the library.