Retrieving Content From the Web
This article was created in 2007 and is now obsolete. The functionality developed in this article can today be achieved simply by calling the built-in functions ReadFromURL and ParseCSV. Nevertheless, if you are developing your own custom data or application integration, the techniques demonstrated in this article may still be relevant for you.
This page details a simple example of integrating external data into Analytica. In this example, historical stock data for a given stock is read from a web site and imported into an Analytica model.
The example makes use of an external program, written in C++/CLR using Microsoft Visual Studio 2005, and thus demonstrates the integration of an external program with Analytica. The source code and steps for creating this program, ReadURL.exe, are given below. If you don't have Microsoft Visual Studio 2005, or don't wish to compile ReadURL.exe yourself, you can download it from this link: ReadURL.exe
A webinar based on the content of this article can be viewed at: Calling-External-Applications.wmv (requires Windows Media Player)
Minimal Requirements
- Analytica Enterprise 4.0
- .NET 2.0 framework
- Internet access (to Yahoo finance)
- Optional: Microsoft Visual Studio 2005
Getting Started
First, you need to obtain ReadURL.exe. You can either download it from the link, or you can compile it yourself using the source code provided in the section at the end of this article. You should take note of the full path of your executable. In my case, after compiling it, the executable was at:
C:\Src\ReadURL\Debug\ReadURL.exe
Understanding Historical Stock Data on Yahoo Finance
The Yahoo Finance web site provides historical stock data for free, downloadable for each stock as a CSV file. Before actually downloading the data, take a few minutes to understand how you can access the data as a CSV file.
In a web browser, go to http://finance.yahoo.com. In the stock symbol box, enter MSFT (the symbol for Microsoft) and press the Get Quotes button. On the resulting page, click the Historical Prices link, enter a start date of Jan 3, 2007 and an end date of Aug 8, 2007, and press Get Prices.
Near the bottom of the resulting page is a link labeled "Download to Spreadsheet". By hovering over the link, you can see the URL that loads the CSV file for MSFT between Jan 3, 2007 and Aug 8, 2007. It is:
http://ichart.finance.yahoo.com/table.csv?s=MSFT&a=0&b=3&c=2007&d=7&e=8&f=2007&g=d&ignore=.csv
Take a few moments to study the URL; its format should become clear fairly quickly. The «s» parameter is the stock symbol, the «a» parameter is one less than the month number of the start date (so January is 0), «b» is the day of the month of the start date, «c» is the start year, and «d», «e», and «f» encode the end date in the same way. We need to know this format because we want our Analytica model to construct this URL for a given stock symbol, start date, and end date.
Constructing the URL
Start Analytica. Create two variable nodes, Start_date and End_date, and set the number format for both to Date.
Set the definition of Start_date to: 3 Jan 2007
Set the definition of End_date to: Today()
Create a variable named Stock_sym and define it as the text 'MSFT'.
Now let's create a user-defined function (UDF) that constructs the necessary URL from the symbol, start date, and end date. Drag a function object to your diagram, name it Historical_Stock_URL, and define it along the lines shown below.
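A minimal sketch of such a definition follows, assuming Analytica's DatePart function for extracting the year, month, and day from a date; the parameter names here are illustrative, so adapt them to your model:
Function Historical_Stock_URL
Parameters: (sym: Text; startDate, endDate: Number)
Definition:
'http://ichart.finance.yahoo.com/table.csv?s=' & sym
& '&a=' & (DatePart(startDate, 'M') - 1) & '&b=' & DatePart(startDate, 'D') & '&c=' & DatePart(startDate, 'Y')
& '&d=' & (DatePart(endDate, 'M') - 1) & '&e=' & DatePart(endDate, 'D') & '&f=' & DatePart(endDate, 'Y')
& '&g=d&ignore=.csv'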
We can test this by creating another variable, named URL, with the definition:
Historical_Stock_URL(stock_sym, start_date, end_date)
Evaluating this variable returns the URL we saw earlier.
Reading the Data
Next, we'll create a UDF that reads the content from that URL as a long text string. Drag another function node to your diagram, name it ReadFromUrl, and define it as follows.
You may need to modify the path in the first parameter to point to your copy of ReadURL.exe.
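A minimal sketch of such a definition, assuming Analytica Enterprise's RunConsoleProcess function; the cmdLine parameter name and the executable path are assumptions to adjust for your installation:
Function ReadFromUrl
Parameters: (url: Text)
Definition:
RunConsoleProcess('C:\Src\ReadURL\Debug\ReadURL.exe', cmdLine: 'ReadURL.exe ' & url)
RunConsoleProcess launches the console program, waits for it to finish, and returns whatever it wrote to standard output, which is exactly how ReadURL.exe emits the page content.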
The file we'll read is a CSV file, so to parse it we'll use the Flat File Library. Select File → Add Library... and select Flat File Library.ana. You can either link or embed the library.
Now, create a variable node titled Textual Price Data, defined as:
ParseCsvText(ReadFromUrl(Url))
Evaluate this node and you should see the historical price data imported from Yahoo Finance.
Coercing the Data
The data just imported is currently all text, which means we cannot treat the cell contents numerically with arithmetic operations. Also, it might be preferable to use Date as an index (although this depends on your own needs). Here we will use Date as an index, but we'll name it Trading_Date.
If you plan to do much analysis using this data, it is more convenient to break each column out as a separate variable. This allows you to use an identifier such as Open_price in expressions, rather than a subscript operation like Data[Field = 'Open']. We'll demonstrate both methods here.
We'll use a variable node for our Trading_date index, rather than an index node. The reason is that Analytica evaluates index nodes proactively (immediately when they are defined, or when the model is loaded), whereas it evaluates variable nodes on demand. A variable node can serve as an index just fine; this difference in when evaluation occurs is the key functional distinction.
Create a variable node, titled Trading Date, defined as:
CopyIndex(ParseDate(Textual_Price_Data[.Field = 'Date']))
Set its number format to Date.
Create a variable titled Fields, defined as:
Subset(Textual_price_data.Field <> 'Date')
This will serve as our field index in the full numeric table.
Now create a variable titled Stock Data, defined as:
ParseNum(Textual_price_data[@.Row = @Trading_date, .Field = Fields])
Here we are re-indexing by the two indexes defined previously and converting the data to numeric. The result is the stock data as numbers in table form.
For the second method, breaking out each column as a separate variable, create a variable Open_Price defined as:
ParseNum(Textual_Price_Data[.Field = 'Open'])
and repeat for High_Price, Low_Price, Close_Price, Volume, and Adj_Close.
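Once the columns are numeric, ordinary arithmetic works on them. As a quick illustration, a hypothetical variable Intraday_Return (not part of the original model) could be defined as:
(Close_Price - Open_Price) / Open_Price
which would have been meaningless on the raw text data.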
Creating ReadURL.exe
In the above example, you have the option of using the ReadURL.exe provided, or you can compile it yourself if you have Visual Studio 2005. Going through the process of creating and compiling this simple application is a highly worthwhile exercise if you think you may need to create your own custom integrations or custom external computations.
To read data from the web, we created a simple C++/CLR application in Microsoft Visual Studio 2005. Select File → New → Project/Solution → C++ → CLR/.NET Console Application. We named our project ReadURL and placed it in C:\Src. In the file ReadURL.cpp, we placed the following code:
// ReadURL.cpp : main project file.
#include "stdafx.h"

using namespace System;
using namespace System::Net;
using namespace System::IO;

int main(array<String ^> ^args)
{
    // Expect exactly one command-line argument: the URL to read.
    if (args->Length != 1) {
        Console::Error->WriteLine(L"Usage: ReadURL <url>");
        return -1;
    }
    String^ url = args[0];
    WebRequest^ request = WebRequest::Create(url);
    WebResponse^ response = nullptr;
    StreamReader^ reader = nullptr;
    try {
        // Issue the request and echo the response body to stdout.
        response = request->GetResponse();
        Stream^ stream = response->GetResponseStream();
        reader = gcnew StreamReader(stream, Text::Encoding::UTF8);
        Console::Write(reader->ReadToEnd());
    } catch (WebException^ x) {
        Console::Error->WriteLine(x->Message);
        return -2;
    }
    // Release the network resources before exiting.
    if (reader != nullptr) reader->Close();
    if (response != nullptr) response->Close();
    return 0;
}
This program accepts one command-line parameter, a URL; it reads from that URL and echoes the text to stdout.
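You can test it from a command prompt before wiring it into Analytica; for example (using the URL constructed earlier, and optionally redirecting the output to a file):
C:\Src\ReadURL\Debug>ReadURL.exe "http://ichart.finance.yahoo.com/table.csv?s=MSFT&a=0&b=3&c=2007&d=7&e=8&f=2007&g=d&ignore=.csv" > msft.csv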