DbQuery

Release:	… • 4.6 • 5.0 • 5.1 • 5.2 • 5.3 • 5.4 • • 6.0 • 6.1 • 6.2 • 6.3 • 6.4 • 6.5 • 6.6

(Requires Analytica Enterprise or Power Player)

DbQuery(connection, sql, key, datesAsText)

Used to define an index Variable. The definition of the index should contain only one DbQuery() statement. «Connection» specifies a data source as a textual connection string (e.g., 'DSN = MyDatabase')or it can be a «Database connection» obtained by calling DbConnection.. The «sql» parameter defines an SQL query.

When placed as the definition of an index Variable, DbQuery() will be evaluated as soon as the definition is complete. When it is evaluated, the actual query is performed. The resulting result table is cached in the Att_recordSet attribute of the variable (or local index) containing the call to DbQuery, which is then subsequently accessed by DbTable() or DbLabels().

DbQuery() returns a sequence 1..n, where n is the number of records (rows) in the result table. Alternatively, you can use one column from the result set as the index labels by specifying the name of the column in the optional «key» parameter.

DbQuery() should appear only once in a definition, and if it is embedded in an expression, the expression must return a list with n elements.

DbQuery() processes the sql statement in read-only mode, so that the data source cannot be altered as a result of executing this statement. To alter the data source, use DbWrite().

Local Indexes as record sets

A local index can hold the record set. This means you can create a user-defined function that bundles several database query steps. For example:

Function DbFullTable(connectionStr, sql: text)

Definition:

Index Rows := DbQuery(connectionStr, sql);

Index Cols := DbLabels(Rows);

DbTable(Rows, Cols)

This user-defined function returns the full relational table in a single call, using local indexes, Rows and Cols.

Date and Time data

Date and time values retreived from the database are returned as text. Beginning with release 4.2.3, you can also opt to have date or time data types returned as date-time numbers. A date-time number in Analytica encodes a point in time as the number of days elapsed since the date origin (usually Jan 1, 1904). Date-time numbers respond to the Date style Number format.

The system variable, Sys_DbDatesAsText contains a global preference for the handling of dates by DbQuery or DbWrite. When this system variable is 0 (false), dates are returned as text. When set to 1, dates are returned as date-time numbers. For backward compatibility with pre-4.2.3 releases, it is 0 by default.

Individual calls to DbQuery can override the system default by specifying the optional «datesAsText» parameter.

Connection String Syntax

Comprehensive references for connection string syntax across different database services are at www.connectionStrings.com.

Using a configured DSN

The «connection» parameter specifies the data source that you will be querying. The most common syntax for the connection string is:

"DSN=«name»"

where «name» is the name of a system or user data source that you have previously configured in the Data Sources (ODBC) wizard (accessible in Windows via

Start → Programs → Administrative Tools → Data Sources (ODBC).

Note: Don't add spaces on either side of the equal sign. Spaces on either side of the equal sign may prevent it from finding the data source. You can augment a DSN-based connection string with additional keyword values. These are specific to the database driver you are using, and the names of these additional keywords must be found from other documentation. For example, some databases require a login and password, which can be added to the connection string as:

"DSN=MyDatabase; UID=John; PWD=Lightning"

Some databases require two logins, one for the server, and another for the DBMS. In this case, both can be specified:

"DSN=MyDatabase; UID=John; PWD=Lightning; UIDDBMS=JQR; PWDDBMS=Thunder"

There are a couple of advantages to using a DSN to specify your data source. First, the wizard in Windows makes it relatively easy to select the appropriate driver and configure the parameters used to access the data source. Second, the «connection» parameter in your model remains very simple. And third, the model is not reliant on the precise location of the database, so that if someone else uses your model on a different machine, the connection string can remain valid. On the downside, it does mean that whenever you share your model with someone else, they must configure a DSN before using it.

Directly accessing the database file

You can also access your database directly, without configuring a DSN. To do this, at a minimum you need to specify the ODBC driver name and the database location in the connection string.

The driver name can be discovered by evaluating

SqlDriverInfo("")

from within Analytica.

Many drivers use the keyword DBQ for specifying the file path to the database. Some database drivers (examples are Microsoft Access and Excel) store the database in a single file, so the path specifies the actual file. Other drivers (e.g., Paradox, Text driver) store their database in many files in a single directory, in which case DBQ locates the directory. Here are some examples of connection strings:

"Driver=Microsoft Access Driver (*.mdb); DBQ=C:\MyData\MyDatabase.mdb"

"Driver=Microsoft Excel Driver (*.xls); DBQ=" & CurrentModelFolder() & "\SourceData.xls"

"Driver=Microsoft Paradox Driver (*.db); DBQ=C:\MyData\MyDatabaseDirectory"

"Driver=Microsoft Text Driver (*.txt; *.csv); DBQ="C:\MyData\MyCsvTables"

Note that when specifying the location, you must use absolute path names. The ODBC drivers do not know anything about Analytica's current folders. You may find CurrentModelFolder and CurrentDataFolder functions useful for computing full path names when your data source is in the same directory as your model, or a subdirectory.

SQL Syntax

Databases contain tables, each table having a name. The simplest and most common SQL statement is one that retrieve all the data from a given table:

"SELECT * FROM tableName"

Many databases also allow you to define queries, which may join several underlying tables to create a new computed table results. The names of such predefined queries can usually be used in place of tableName as well.

The SELECT statement is SQL's workhorse for reading data. You can expand on this simple form by selecting only a subset of the columns, selecting only a subset of the rows using a search condition, joining multiple tables, sorting the selected rows, eliminating duplicate rows, or grouping (aggregating) the rows. The following shows the general syntax used by most SQL databases, where brackets surround optional parts:

SELECT column_list

[FROM list_of_tables]

[WHERE search_condition]

[ORDER BY column_list [ASC|DESC]]

[GROUP BY group_by_expression]

[HAVING group_search_conditon]

If you are creating a complex query, involving the joining of multiple tables and complex WHERE or GROUP BY criteria, a convenient way to do this by using an SQL designer application. Graphical user interfaces that help you create your SQL queries can be found in Microsoft Access, in a program called Microsoft Query, in Microsoft SQL Server management studio, etc.

You can often bring entire tables into Analytica and "join" them from within Analytica, using Analytica expressions. However, if you are dealing with huge databases, then use of more sophisticated SQL can be used to reduce the amount of data that needs to be transferred and held in Analytica's memory.

Note that DbQuery cannot be used to modify the database -- it executes its SQL in a read-only model. Other SQL statements can be used to modify a database -- creating new tables, adding or removing data to/from tables, etc. To utilize these SQL statements, you must use DbWrite.

Unicode characters

In Microsoft SQL Query and SQL Express, Unicode strings must appear in SQL as, for example, N'一些文字'. In other words, it is not enough to surround the text with single quotes, but a capital N must appear in front of the opening quote. Without this, the unicode characters get converted to question marks in the query.

I don't know if this is specific to MS SQL, or also applies to other databases. This is documented on Microsoft's website in Using Unicode Data.

ODBC to read Excel data

ODBC can be used to read data from Excel spreadsheets, at least in a rudimentary way. However, if you want to read from Excel spreadsheets, we recommend using the Spreadsheet Access functions instead, which are similar in result, but can also be used from Analytica 64-bit (The JET database drivers, required for ODBC queries into Excel, are only available for use by 32-bit processs). Nevertheless, understanding how to access spreadsheets through ODBC can be informative, so a description is provided here.

When using ODBC from Excel, data that you might read should be organized as a table, where the first row of the table contains column labels. If you have a single table on a worksheet, you can use:

"Select * from [Sheet1$]"

The Excel ODBC driver wants the the name of the sheet to be followed by a dollar sign, and because of this, you must surround the sheet name and dollar sign with square brackets.

If you name a region in your workbook, you can read the contents of that region using:

"Select * from [RegionName]"

In this case, you don't have to specify a worksheet name, and you don't include the trailing dollar sign. The first row of your region needs to have column labels. If the first row does not have column labels, it will omit the first row from the result and return placeholder labels like ["F1", "F2", "F3"].

You can query a cell range using, e.g.:

"Select * from [Sheet1$D17:F55]"

Again, the first row (in this example, D17:F17), of the range needs to have labels. I do not know of a way of reading a transposed table (i.e., where the labels are row headers, and the fields for each record are in the same column). If the block of data don't have the labels for the first row, as long as it doesn't start in row 1, you can always start your query from one row previous. E.g., if you want cells F9:H12, then query F8:H12.

One downside with using ODBC to query an Excel workbook is that the ODBC driver does not seem to pick up on changes made from Excel itself while Analytica and Excel are both open. If you change something in Excel, you may need to exit Analytica, then save the Excel file with the change, then re-enter Analytica. It appears that the Excel ODBC driver locks the file in certain ways and caches certain things so that changes are not always picked up once the driver has been loaded. When you load the file into Excel once the workbook has already been queried, it seems to convert to a read-only mode a soon as you attempt to save.

On the upside, there are advantages to using ODBC for reading data from Excel over OLE Linking. Especially if you use a one table per sheet organization, then the size of the table adapts very gracefully with ODBC when the amount of data in Excel changes.

Programs with ODBC support

ODBC drivers exist for the following programs (this is only a partial list), which means if you need to exchange data between these applications and Analytica, the use of ODBC may be an option.

SQL Server
Microsoft Access
Oracle
MySQL
IBM DB2
Sybase
Paradox
Microsoft Excel
Text files -- csv, fixed-width delimited, others.
FoxPro

AS/400 (iSeries)
Caché
Filemaker
Firebird
Informix
Ingres
Interbase
Lotus Notes
Mimer SQL
Pervasive
Postgre SQL
Progress
SQLite
Teradata
Salesforce

These support the OLE DB interface (Analytica uses straight ODBC) and thus may be usable (we are not totally sure):

Microsoft Project
Lightbase
SQLBase
VistaDB
Active Directory

64-bit ODBC Drivers

To use ODBC from Analytica 64-bit, you must have 64-bit versions of the ODBC driver for your database. Microsoft SQL Server, Oracle, MySQL and most other major commercial databases provide 64-bit ODBC drivers, and can be used with Analytica 64-bit.

The Microsoft JET drivers (for Microsoft Access, Microsoft Excel and CSV file queries) require Microsoft Office 2010 or later, and you must install the x64 edition of Office. Prior to the Office 2010 release, 64-bit drivers were not available. If you install Office 2010 x64, then you can query Access, Excel, or flat files directly via ODBC.

Without MS Access x64 installed (e.g., if you have only Office 2003 or 2007), you won't be able to query an Access database directly. There is a method, however, of querying a Microsoft Access database from Analytica 64-bit by configuring the free Microsoft SQL Server express driver to act as a proxy to the Access database. The steps for doing this are detailed at Querying Access database from Analytica 64. Again, this is now unnecessary if you have MS Access 2010 x64.

Library

Database Functions

Additional details

Follow a step-by-step mini-tutorial to get your first DbQuery working with a Microsoft Access database here: DbQuery/Step-by-Step querying Microsoft Access

An Analytica User Group webinar on the use of DbQuery to read data in relational databases can be viewed at: Using-ODBC-Queries.wmv.

A bug has been identified in 4.0 (fixed in 4.1) that causes the following message to appear incorrectly in some circumstances:

First arg, <rowIndex>, to DBLabels must be defined with a DBQuery( ) statement

The bug only occurs when the node holding the DbQuery is an index node that has just been defined. If you use a Variable node, rather than an index node, the problem does not occur, or if a model is saved and reloaded the problem does not occur. If you change the index to a variable node, you must force the definition to be re-evaluated, such as by inserting a space in the definition. If you encounter this problem when it should not be appearing, our recommendation is to save and reload the model, which will correct the problem.