Difference between revisions of "Functions To Read Excel Worksheets"
(howToIndex options require 4.1.1) |
|||
Line 67: | Line 67: | ||
** 4 = Use first row for column labels: The first row in the range is not included in the result returned, but the values in the first row are used as the column labels for the local index ''.Column''. | ** 4 = Use first row for column labels: The first row in the range is not included in the result returned, but the values in the first row are used as the column labels for the local index ''.Column''. | ||
** 8 = Use first column for row labels: The first column in the range is not included in the result returned, but the values from the first column are used as row labels for the local index ''.Row''. | ** 8 = Use first column for row labels: The first column in the range is not included in the result returned, but the values from the first column are used as row labels for the local index ''.Row''. | ||
− | ** 16 = | + | ** 16 = Do not issue an error if the supplied ''colIndex'' or ''rowIndex'' are not the correct length. |
+ | |||
+ | ''Note: Options 4, 8 and 16 require release 4.1.1 or later.'' | ||
Depending on the dimensions of the cell range, the result may be a scalar (singe cell), a column vector, a row vector, or a 2-D array. | Depending on the dimensions of the cell range, the result may be a scalar (singe cell), a column vector, a row vector, or a 2-D array. |
Revision as of 19:09, 21 April 2008
What's new in Analytica 4.1? >
The functions described on this page allow you to easily retrieve results directly from an existing Excel spreadsheet. These can be used as an alternative to OLE linking or ODBC, which are two other methods for retrieving data from spreadsheets. They are introduced on an experimental basis in Analytica 4.1, and are only available from the Enterprise edition (or higher). They may be also used from ADE.
When you use these functions to query a spreadsheet, the values are obtained when the functions are evaluated, but no dependencies are maintained from the original data. Thus, if the data is changed in the spreadsheet, the Analytica results do not automatically invalidate, as occurs when OLE linking is used. Unless you have done something special to copy the values in your model, your results will be re-evaluated if a model is re-loaded at a later time, so that each time you start your model, you will be getting the most up-to-date data from the spreadsheet.
These functions provide access only to the computed results in a spreadsheet, but can be used to access the contents of arbitrary cells or named ranges, regardless of how the spreadsheet is laid out. The functions described here only provide read-access to the spreadsheet cells.
OpenExcelFile( filename, showDialog )
Opens the Excel file at the indicated filename and returns a special Workbook object, which displays in an Analytica result table as <<ExcelWorkbook>>.
The filename parameter is relative to the CurrentDataDirectory.
If you set showDialog to true, it will display a file dialog, even if the file filename exists. If showDialog is false (the default), it will display the dialog only if it does not find the file (the same behavior as ReadTextFile).
The filename can also accept a workbook name that is current open in Excel, even if it is not yet saved.
Example
OpenExcelFile( "C:\MyModels\Sales Numbers.xls" ) → «ExcelWorkbook»
WorksheetCell( workbook, sheet, column, row )
Reads the contents of one cell of a worksheet given its coordinates. The function fully array abstracts, so you use to read a range of cells by specifying the column and/or row as arrays.
Parameters:
- workbook: A workbook object returned by OpenExcelFile
- sheet: The name of a worksheet inside the workbook, or its number, where 1 is the first worksheet, etc.
- Column: The column label, e.g., "A", "B", or "AB", or the column number as an integer.
- Row: The row number
If the worksheet cell is empty, it returns Null. It flags an error if workbook is not a valid workbook, if it does not contain sheet, or if the coordinates are invalid.
If you change the cell after evaluating a variable that uses WorksheetCell(), it has no effect, unless and until you recompute the variable. Use OLE if you need automatic dependency maintenance between Excel and Analytica.
Examples
These expressions are different ways to get the same result, the value from cell C7 in the first sheet, "Sheet1" of workbook:
WorksheetCell( workbook, "Sheet1", "C", 7 ) WorksheetCell( workbook, "Sheet1", 3, 7 ) WorksheetCell( workbook, 1, 3, 7 )
Suppose the spreadsheet contains a 2-D table in the region C4:J19. The columns of this table correspond to the years 2008..2015. The rows correspond to different assets. It is easier to refer to the columns by number, so that the columns "C" thru "J" are columns 3 thru 10. To hold this 2-D table, we need two indexes in Analytica, Time and Asset.
Index Time := 2008..2015 Index Asset := 1..16 Variable Workbook := OpenExcelFile( "C:\Asset Data.xls" ) Variable Data := WorksheetCell( workbook, "Sheet1", @Time+2, @Asset+3 )
WorksheetRange( workbook, sheet, range, colIndex, rowIndex, howToIndex)
Reads a range of cells from an Excel worksheet. The range can be the name of a named range in the Excel workbook, or it can be a cell range such as: "C7" or "C7:F12".
Parameters:
- workbook: A workbook object returned by OpenExcelFile
- sheet: The name of a worksheet inside the workbook, or a number.
- range: A cell range. This may be a named range in the spreadsheet, or it may be of the form "A1:BC99"
- colIndex: (optional) A pre-existing index for indexing the column dimension of the result.
- rowIndex: (optional) A pre-existing index for indexing the row dimension of the result.
- howToIndex: (optional) A set of flags controlling how the result is indexed when colIndex or rowIndex is not specified. Any of the following values can be added together for this parameter:
- 1 = Force column index. Forces a column index to be included even if the range spans only a single column. Has no effect if colIndex is specified.
- 2 = Force row index. Forces a row index to be included even if the range spans only a single row. Has no effect if rowIndex is specified.
- 4 = Use first row for column labels: The first row in the range is not included in the result returned, but the values in the first row are used as the column labels for the local index .Column.
- 8 = Use first column for row labels: The first column in the range is not included in the result returned, but the values from the first column are used as row labels for the local index .Row.
- 16 = Do not issue an error if the supplied colIndex or rowIndex are not the correct length.
Note: Options 4, 8 and 16 require release 4.1.1 or later.
Depending on the dimensions of the cell range, the result may be a scalar (singe cell), a column vector, a row vector, or a 2-D array.
If the range spans more than one column in the spreadsheet, then the result must be indexed by a column index. If you already have this index, you can specify it in the optional colIndex parameter; otherwise, a local index named .Column is created by the function to serve as a dimension of the result. By default, the elements of the column index are set to the sheet's column labels, e.g., "A","B",...."AA","AB",..]. If desired, you can also use the first row of values as elements for the .Column index (this row is then not included in the actual array retrieved -- thus should only be used when the range spans at least two rows). To use the first row for the elements of the local column index, specify howToIndex:4 and leave the colIndex parameter unspecified.
If your result falls entirely within a single column, but you wish to force a local column index to be included, so that the result is a 2-dimensional array of size 1xR, rather than a 1-D vector of length R, then specify howToIndex:1. If you are using a named range and don't know how many columns there actually are, and it is possible there is only one, then you would want to force a column index so that an error doesn't occur if you use result.Column in an expression.
If the range spans more than one row in the spreadsheet, then a row index must dimension the result. If you already have this index, specify it in the rowIndex parameter; otherwise, if the range spans more than one row, a local index named .Row is created by the function and used to dimension the result. The local .Row index that is created contains the row numbers from the spreadsheet as its elements, so for example, if the range is "C7:F12", the .Row index would contain the elements [7,8,9,10,11,12]. To force the creation of a local .Row index when the result spans only a single row, specify howToIndex:2. To use the values from the first column as the elements of the local .Row index, specify howToIndex:8.
If you specify colIndex or rowIndex, and the length of the indexes do not match the length of the specified range, the range is truncated if the index is too short, or the result filled with Null values beyond the range length if the index is too long. If the flag howToIndex:16 is specified, an error is issued if the index length does not match.
Empty cells in the range are Null in the result.
No dependency is maintained between the original data in the spreadsheet and the result. The data retrieved are the value that were there at the time the function is evaluated. If the data in the spreadsheet changes after the function has been evaluated, the results are not invalidated and continue to retain the original values.
Array Abstraction Caveats
If any of the parameters to WorksheetRange is array-valued, Analytica will array abstract as it does with any function. With array-valued parameters, multiple workbooks, multiple worksheets, and multiple ranges could be queried with a single call. This works seamlessly in certain cases, but can be problematic in other cases. Thus, when you use WorksheetRange, you should ensure that it is either in a place in your model where array abstraction would not need to occur, including in the future if new dimensions are introduced, or that that you have used it in a way that is guaranteed to array abstract seamlessly.
The caveats with array abstraction occur in cases where there is at least a possibility that a local .Row or .Column index might be introduced by the function. A separate .Column or .Row index is created for each range that needs one. The element values of those indexes are specific to the range. If multiple ranges are being queried, the result could then have multiple .Column or .Row indexes, which are combined in the final result into a very high-dimensional result. Even if the .Column or .Row indexes are identical, separate local indexes are created for each range. Therefore, if there is a possibility that a local .Column or .Row index will need to be introduced, you should ensure that your use of the function will not be subject to array abstraction.
It is safe to array-abstract over WorksheetRange if you have specified both colIndex and rowIndex parameters explicitly. It is also safe if you can guarantee that all the ranges being queried are single cells. You are safe array-abstrating without specifying the rowIndex parameter if you can ensure that all ranges being ranges are column vectors (i.e., span only one row), and likewise, you are safe omitting the colIndex parameter if you can guarantee that all ranges span only one column.
Examples
These example use the following spreadsheet:
In the above spreadsheet, several cell ranges are labelled as named ranges, as follows:
Label | Range |
---|---|
Rate | B1 |
Year | B3:F3 |
Cash_flow | B4:F4 |
Divisions | A7:A9 |
Employee_count | B7:F9 |
WorksheetRange( wb, "Sheet1", Rate ) → 0.08
WorksheetRange( wb, "Sheet1", "B1") → 0.08
WorksheetRange( wb, "Sheet1", "B3:F3" ) →
.Column → | 'B' | 'C' | 'D' | 'E' | 'F' |
---|---|---|---|---|---|
2008 | 2009 | 2010 | 2011 | 2012 |
Index Year := CopyIndex( WorksheetRange( wb, "Sheet1", "Year", howToIndex:1 ) ); WorksheetRange( wb, "Sheet1", "Cash_flow", colIndex:Year ) →
.Year → | 2008 | 2009 | 2010 | 2011 | 2012 |
---|---|---|---|---|---|
-100 | 10 | 30 | 50 | 60 |
Note: howToIndex:1 was specified for Year here so that we would have a 1-D array even if only one year were present in the spreadsheet.
WorksheetRange( wb, "Sheet1", "Employee_count" ) →
.Column → | 'B' | 'C' | 'D' | 'E' | 'F' | |
---|---|---|---|---|---|---|
.Row ↓ |
7 | 24 | 27 | 28 | 32 | 35 |
8 | 13 | 13 | 13 | 13 | 13 | |
9 | 25 | 22 | 21 | 19 | 16 |
Index Time := [2008,2009,2010,2011,2012]; WorksheetRange( wb, "Sheet1", "A7:F9", colIndex:Time, howToIndex:8) →
Time → | 2008 | 2009 | 2010 | 2011 | 2012 | |
---|---|---|---|---|---|---|
.Row ↓ |
"Div A" | 24 | 27 | 28 | 32 | 35 |
"Div B" | 13 | 13 | 13 | 13 | 13 | |
"Div C" | 25 | 22 | 21 | 19 | 16 |
See Also
- DbQuery -- ODBC can also be used to read from Excel worksheets
- ReadTextFile
- ReadCsvFile
Enable comment auto-refresher