Difference between revisions of "ReadCsvFile"

(See Also ParseCsvText)
m
 
(9 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
[[Category:Database Functions]]
 
[[Category:Database Functions]]
 +
[[Category: Flat File library functions]]
 +
 
[[Category:Doc Status D]] <!-- For Lumina use, do not change -->
 
[[Category:Doc Status D]] <!-- For Lumina use, do not change -->
 +
 +
ReadCSVFile is an obsolete library function. We recommend that you use [[ReadTextFile]] and  [[ParseCSV]] functions to respectively read and parse a CSV file.
  
 
(''Requires Analytica Enterprise or Power Player'')
 
(''Requires Analytica Enterprise or Power Player'')
  
= ReadCsvFile(filename'', linesep,colsep'') =
+
== ReadCsvFile(filename'', linesep, colsep'') ==
  
 
Reads and parses a tabular text file, known as a CSV (for "Comma-Separated Values") format, in which data is arranged in columns on separate lines, and in which the first line contains column labels.
 
Reads and parses a tabular text file, known as a CSV (for "Comma-Separated Values") format, in which data is arranged in columns on separate lines, and in which the first line contains column labels.
  
Normally columns are separated by commas, although the optional colsep parameter can be specified if a different separation string is used.   
+
Normally columns are separated by commas, although the optional «colsep» parameter can be specified if a different separation string is used.   
  
An array of text values is returned, indexed by two local indexes, Row and Field.  Typically some columns may contain text, others numbers, and other dates.  You must post-process the result returned from ReadCsvFile using, for example, [[ParseDate]] and [[ParseNum]] to convert the text columns to numeric values.  The Row index contains the column labels.
+
An array of text values is returned, indexed by two local indexes, <code>Row</code> and <code>Field</code>.  Typically some columns may contain text, others numbers, and other dates.  You must post-process the result returned from [[ReadCsvFile]] using, for example, [[ParseDate]] and [[ParseNum]] to convert the text columns to numeric values.  The <code>Row</code> index contains the column labels.
  
 
The data within a column in the file must never contain the separator strings.
 
The data within a column in the file must never contain the separator strings.
  
 
To use this function with tab-delimited data, use:
 
To use this function with tab-delimited data, use:
ReadCsvFile(filename,colsep:[[Chr]](9))
+
:<code>ReadCsvFile(filename, colsep: Chr(9))</code>
 
 
= Library =
 
  
Flat File Library.ana
+
== Library ==
 +
Download: [[media:Flat File Library.ana|Flat File Library.ana]]
  
(Use '''File &rarr; Add Library...''' to add this library)
+
(Use [[File menu|File]] &rarr; '''Add Library...''' to add this library)
  
= Example =
+
== Example ==
  
Suppose the file "data.txt" is in the [[CurrentDataDirectory]] and contains the following:
+
Suppose the file "data.txt" is in the [[CurrentDataFolder]] and contains the following:
PlantID,Name,Capacity,Start Date,Type
+
:<code>PlantID, Name, Capacity, Start Date, Type</code>
G0372,Moss Landing,2218,7/11/2002,Oil/Gas
+
:<code>G0372, Moss Landing, 2218, 7/11/2002, Oil/Gas</code>
N0002,San Onofre,2254,1/1/1968,Nuclear
+
:<code>N0002, San Onofre, 2254, 1/1/1968, Nuclear</code>
N0001,Diablo Canyon,2218,1/1/1985,Nuclear
+
:<code>N0001, Diablo Canyon, 2218, 1/1/1985, Nuclear</code>
  
 
The following expression
 
The following expression
ReadCsvFile("data.txt")
+
:<code>ReadCsvFile("data.txt")</code>
returns an array of text indexed by local indexes Row and Fields as follows:
+
returns an array of text indexed by local indexes <code>Row</code> and <code>Fields</code> as follows:
{| border="1"
+
:{| class="wikitable"
! !! !! colspan="5" | .Fields
+
! !! colspan="5" | .Fields &#9654;
 
|-
 
|-
! !! !! PlantID !! Name !! Capacity !! Start Date !! Type
+
! .Row  &#9660; !! PlantID !! Name !! Capacity !! Start Date !! Type
 
|-
 
|-
! rowspan="3" | .Row
+
! 1
! 1 || G0372 || Moss Landing || 2218 || 7/11/2002 || Oil/Gas
+
|G0372 || Moss Landing || 2218 || 7/11/2002 || Oil/Gas
 
|-
 
|-
! 2 |N0002 || San Onofre || 2254 || 1/1/1968 || Nuclear
+
! 2  
 +
| N0002 || San Onofre || 2254 || 1/1/1968 || Nuclear
 
|-
 
|-
! 3 |N0001 || Diablo Canyon || 2218 || 1/1/1985 || Nuclear
+
! 3  
 +
| N0001 || Diablo Canyon || 2218 || 1/1/1985 || Nuclear
 
|}
 
|}
  
In a case like this, one would typically want to parse the Capacity and Start Date columns to actual numeric values.  So the expression would be enhanced to read
+
In a case like this, one would typically want to parse the <code>Capacity</code> and <code>Start Date</code> columns to actual numeric values.  So the expression would be enhanced to read
  
var f := ReadCsvFile('data.txt');
+
:<code>Var f := ReadCsvFile('data.txt');</code>
If f.Fields='Capacity' Then [[ParseNum]](f[.Fields='Capacity'])
+
:<code>If f.Fields = 'Capacity' Then ParseNum(f[.Fields = 'Capacity'])</code>
Else if f.Fields='Start Date' Then [[ParseDate]]([f[.Fields='Start Date'])
+
:<code>Else If f.Fields = 'Start Date' Then ParseDate([f[.Fields = 'Start Date'])</code>
Else f
+
:<code>Else f</code>
  
 
It would actually be a bit more elegant to break Capacity and Start_Date out as separate variables, e.g.:
 
It would actually be a bit more elegant to break Capacity and Start_Date out as separate variables, e.g.:
Variable RawData := ReadCsvFile('data.txt')
+
:<code>Variable RawData := ReadCsvFile('data.txt')</code>
Variable Capacity := [[ParseNum]]( RawData[.Fields='Capacity'] )
+
:<code>Variable Capacity := ParseNum(RawData[.Fields = 'Capacity'])</code>
Variable Start_Date := [[ParseDate]]( RawData[.Fields='Start Date'] )
+
:<code>Variable Start_Date := ParseDate(RawData[.Fields = 'Start Date'])</code>
 
 
In this way, it is possible to display Capacity and Start_Date using separate number formats.  (Note: In the previous code, RawData, Capacity and Start_Date are each variable nodes on a diagram, and their definition is shown to the right of :=).  With Capacity as numeric, you can apply numeric operations, such as
 
Sum( Capacity, Capacity.Rows )
 
  
= See Also =
+
In this way, it is possible to display <code>Capacity</code> and <code>Start_Date</code> using separate number formats.  (Note: In the previous code, <code>RawData, Capacity</code> and <code>Start_Date</code> are each variable nodes on a diagram, and their definition is shown to the right of <code>:=</code> ).  With <code>Capacity</code> as numeric, you can apply numeric operations, such as <code>Sum(Capacity, Capacity.Rows)</code>
  
 +
== See Also ==
 +
* [[media:Flat File Library.ana|Flat File Library.ana]]
 
* [[ReadTextFile]]
 
* [[ReadTextFile]]
 
* [[ParseCsvText]]
 
* [[ParseCsvText]]
 
* [[WriteCsvFile]]
 
* [[WriteCsvFile]]
* [[ParseNum]], [[ParseDate]]
+
* [[ParseNum]]
 +
* [[ParseDate]]
 +
* [[DbQuery]] -- An ODBC Text Driver can alternatively be used to read from CSV-formatted files.
 +
* [[Files and editing]]

Latest revision as of 23:50, 18 July 2018


ReadCSVFile is an obsolete library function. We recommend that you use ReadTextFile and ParseCSV functions to respectively read and parse a CSV file.

(Requires Analytica Enterprise or Power Player)

ReadCsvFile(filename, linesep, colsep)

Reads and parses a tabular text file, known as a CSV (for "Comma-Separated Values") format, in which data is arranged in columns on separate lines, and in which the first line contains column labels.

Normally columns are separated by commas, although the optional «colsep» parameter can be specified if a different separation string is used.

An array of text values is returned, indexed by two local indexes, Row and Field. Typically some columns may contain text, others numbers, and other dates. You must post-process the result returned from ReadCsvFile using, for example, ParseDate and ParseNum to convert the text columns to numeric values. The Row index contains the column labels.

The data within a column in the file must never contain the separator strings.

To use this function with tab-delimited data, use:

ReadCsvFile(filename, colsep: Chr(9))

Library

Download: Flat File Library.ana

(Use FileAdd Library... to add this library)

Example

Suppose the file "data.txt" is in the CurrentDataFolder and contains the following:

PlantID, Name, Capacity, Start Date, Type
G0372, Moss Landing, 2218, 7/11/2002, Oil/Gas
N0002, San Onofre, 2254, 1/1/1968, Nuclear
N0001, Diablo Canyon, 2218, 1/1/1985, Nuclear

The following expression

ReadCsvFile("data.txt")

returns an array of text indexed by local indexes Row and Fields as follows:

.Fields ▶
.Row ▼ PlantID Name Capacity Start Date Type
1 G0372 Moss Landing 2218 7/11/2002 Oil/Gas
2 N0002 San Onofre 2254 1/1/1968 Nuclear
3 N0001 Diablo Canyon 2218 1/1/1985 Nuclear

In a case like this, one would typically want to parse the Capacity and Start Date columns to actual numeric values. So the expression would be enhanced to read

Var f := ReadCsvFile('data.txt');
If f.Fields = 'Capacity' Then ParseNum(f[.Fields = 'Capacity'])
Else If f.Fields = 'Start Date' Then ParseDate([f[.Fields = 'Start Date'])
Else f

It would actually be a bit more elegant to break Capacity and Start_Date out as separate variables, e.g.:

Variable RawData := ReadCsvFile('data.txt')
Variable Capacity := ParseNum(RawData[.Fields = 'Capacity'])
Variable Start_Date := ParseDate(RawData[.Fields = 'Start Date'])

In this way, it is possible to display Capacity and Start_Date using separate number formats. (Note: In the previous code, RawData, Capacity and Start_Date are each variable nodes on a diagram, and their definition is shown to the right of := ). With Capacity as numeric, you can apply numeric operations, such as Sum(Capacity, Capacity.Rows)

See Also

Comments


You are not allowed to post comments.