Difference between revisions of "ReadCsvFile"

Revision as of 23:50, 18 July 2018

ReadCSVFile is an obsolete library function. We recommend that you use ReadTextFile and ParseCSV functions to respectively read and parse a CSV file.

(Requires Analytica Enterprise or Power Player)

ReadCsvFile(filename, linesep, colsep)

Reads and parses a tabular text file, known as a CSV (for "Comma-Separated Values") format, in which data is arranged in columns on separate lines, and in which the first line contains column labels.

Normally columns are separated by commas, although the optional «colsep» parameter can be specified if a different separation string is used.

An array of text values is returned, indexed by two local indexes, Row and Field. Typically some columns may contain text, others numbers, and other dates. You must post-process the result returned from ReadCsvFile using, for example, ParseDate and ParseNum to convert the text columns to numeric values. The Row index contains the column labels.

The data within a column in the file must never contain the separator strings.

To use this function with tab-delimited data, use:

ReadCsvFile(filename, colsep: Chr(9))

Library

Download: Flat File Library.ana

(Use File → Add Library... to add this library)

Example

Suppose the file "data.txt" is in the CurrentDataFolder and contains the following:

PlantID, Name, Capacity, Start Date, Type

G0372, Moss Landing, 2218, 7/11/2002, Oil/Gas

N0002, San Onofre, 2254, 1/1/1968, Nuclear

N0001, Diablo Canyon, 2218, 1/1/1985, Nuclear

The following expression

ReadCsvFile("data.txt")

returns an array of text indexed by local indexes Row and Fields as follows:

	.Fields ▶
.Row ▼	PlantID	Name	Capacity	Start Date	Type
1	G0372	Moss Landing	2218	7/11/2002	Oil/Gas
2	N0002	San Onofre	2254	1/1/1968	Nuclear
3	N0001	Diablo Canyon	2218	1/1/1985	Nuclear

In a case like this, one would typically want to parse the Capacity and Start Date columns to actual numeric values. So the expression would be enhanced to read

Var f := ReadCsvFile('data.txt');

If f.Fields = 'Capacity' Then ParseNum(f[.Fields = 'Capacity'])

Else If f.Fields = 'Start Date' Then ParseDate([f[.Fields = 'Start Date'])

Else f

It would actually be a bit more elegant to break Capacity and Start_Date out as separate variables, e.g.:

Variable RawData := ReadCsvFile('data.txt')

Variable Capacity := ParseNum(RawData[.Fields = 'Capacity'])

Variable Start_Date := ParseDate(RawData[.Fields = 'Start Date'])

In this way, it is possible to display Capacity and Start_Date using separate number formats. (Note: In the previous code, RawData, Capacity and Start_Date are each variable nodes on a diagram, and their definition is shown to the right of := ). With Capacity as numeric, you can apply numeric operations, such as Sum(Capacity, Capacity.Rows)

@@ Line 1: / Line 1: @@
 [[Category:Database Functions]]
+[[Category: Flat File library functions]]
 [[Category:Doc Status D]] <!-- For Lumina use, do not change -->
+ReadCSVFile is an obsolete library function. We recommend that you use [[ReadTextFile]] and  [[ParseCSV]] functions to respectively read and parse a CSV file.
 (''Requires Analytica Enterprise or Power Player'')
-= ReadCsvFile(filename'', linesep,colsep'') =
+== ReadCsvFile(filename'', linesep, colsep'') ==
 Reads and parses a tabular text file, known as a CSV (for "Comma-Separated Values") format, in which data is arranged in columns on separate lines, and in which the first line contains column labels.
-Normally columns are separated by commas, although the optional colsep parameter can be specified if a different separation string is used.
+Normally columns are separated by commas, although the optional «colsep» parameter can be specified if a different separation string is used.
-An array of text values is returned, indexed by two local indexes, Row and Field.  Typically some columns may contain text, others numbers, and other dates.  You must post-process the result returned from ReadCsvFile using, for example, [[ParseDate]] and [[ParseNum]] to convert the text columns to numeric values.  The Row index contains the column labels.
+An array of text values is returned, indexed by two local indexes, <code>Row</code> and <code>Field</code>.  Typically some columns may contain text, others numbers, and other dates.  You must post-process the result returned from [[ReadCsvFile]] using, for example, [[ParseDate]] and [[ParseNum]] to convert the text columns to numeric values.  The <code>Row</code> index contains the column labels.
 The data within a column in the file must never contain the separator strings.
 To use this function with tab-delimited data, use:
- ReadCsvFile(filename,colsep:[[Chr]](9))
+:<code>ReadCsvFile(filename, colsep: Chr(9))</code>
-= Library =
-Flat File Library.ana
+== Library ==
+Download: [[media:Flat File Library.ana|Flat File Library.ana]]
-(Use '''File &rarr; Add Library...''' to add this library)
+(Use [[File menu|File]] &rarr; '''Add Library...''' to add this library)
-= Example =
+== Example ==
-Suppose the file "data.txt" is in the [[CurrentDataDirectory]] and contains the following:
+Suppose the file "data.txt" is in the [[CurrentDataFolder]] and contains the following:
- PlantID,Name,Capacity,Start Date,Type
+:<code>PlantID, Name, Capacity, Start Date, Type</code>
- G0372,Moss Landing,2218,7/11/2002,Oil/Gas
+:<code>G0372, Moss Landing, 2218, 7/11/2002, Oil/Gas</code>
- N0002,San Onofre,2254,1/1/1968,Nuclear
+:<code>N0002, San Onofre, 2254, 1/1/1968, Nuclear</code>
- N0001,Diablo Canyon,2218,1/1/1985,Nuclear
+:<code>N0001, Diablo Canyon, 2218, 1/1/1985, Nuclear</code>
 The following expression
- ReadCsvFile("data.txt")
+:<code>ReadCsvFile("data.txt")</code>
-returns an array of text indexed by local indexes Row and Fields as follows:
+returns an array of text indexed by local indexes <code>Row</code> and <code>Fields</code> as follows:
-{| border="1"
+:{| class="wikitable"
-! !! !! colspan="5" | .Fields
+! !! colspan="5" |  .Fields &#9654;
 |-
-! !! !! PlantID !! Name !! Capacity !! Start Date !! Type
+!  .Row  &#9660; !! PlantID !! Name !! Capacity !! Start Date !! Type
 |-
-! rowspan="3" | .Row
+! 1
-! 1 || G0372 || Moss Landing || 2218 || 7/11/2002 || Oil/Gas
+|G0372 || Moss Landing || 2218 || 7/11/2002 || Oil/Gas
 |-
-! 2 ||  N0002 || San Onofre || 2254 || 1/1/1968 || Nuclear
+! 2
+| N0002 || San Onofre || 2254 || 1/1/1968 || Nuclear
 |-
-! 3 ||  N0001 || Diablo Canyon || 2218 || 1/1/1985 || Nuclear
+! 3
+| N0001 || Diablo Canyon || 2218 || 1/1/1985 || Nuclear
 |}
-In a case like this, one would typically want to parse the Capacity and Start Date columns to actual numeric values.  So the expression would be enhanced to read
+In a case like this, one would typically want to parse the <code>Capacity</code> and <code>Start Date</code> columns to actual numeric values.  So the expression would be enhanced to read
- var f := ReadCsvFile('data.txt');
+:<code>Var f := ReadCsvFile('data.txt');</code>
- If f.Fields='Capacity' Then [[ParseNum]](f[.Fields='Capacity'])
+:<code>If f.Fields = 'Capacity' Then ParseNum(f[.Fields = 'Capacity'])</code>
- Else if f.Fields='Start Date' Then [[ParseDate]]([f[.Fields='Start Date'])
+:<code>Else If f.Fields = 'Start Date' Then ParseDate([f[.Fields = 'Start Date'])</code>
- Else f
+:<code>Else f</code>
 It would actually be a bit more elegant to break Capacity and Start_Date out as separate variables, e.g.:
- Variable RawData := ReadCsvFile('data.txt')
+:<code>Variable RawData := ReadCsvFile('data.txt')</code>
- Variable Capacity := [[ParseNum]]( RawData[.Fields='Capacity'] )
+:<code>Variable Capacity := ParseNum(RawData[.Fields = 'Capacity'])</code>
- Variable Start_Date := [[ParseDate]]( RawData[.Fields='Start Date'] )
+:<code>Variable Start_Date := ParseDate(RawData[.Fields = 'Start Date'])</code>
-In this way, it is possible to display Capacity and Start_Date using separate number formats.  (Note: In the previous code, RawData, Capacity and Start_Date are each variable nodes on a diagram, and their definition is shown to the right of :=).  With Capacity as numeric, you can apply numeric operations, such as
- Sum( Capacity, Capacity.Rows )
-= See Also =
+In this way, it is possible to display <code>Capacity</code> and <code>Start_Date</code> using separate number formats.  (Note: In the previous code, <code>RawData, Capacity</code> and <code>Start_Date</code> are each variable nodes on a diagram, and their definition is shown to the right of <code>:=</code> ).  With <code>Capacity</code> as numeric, you can apply numeric operations, such as <code>Sum(Capacity, Capacity.Rows)</code>
+== See Also ==
+* [[media:Flat File Library.ana|Flat File Library.ana]]
 * [[ReadTextFile]]
 * [[ParseCsvText]]
 * [[WriteCsvFile]]
-* [[ParseNum]], [[ParseDate]]
+* [[ParseNum]]
+* [[ParseDate]]
+* [[DbQuery]] -- An ODBC Text Driver can alternatively be used to read from CSV-formatted files.
+* [[Files and editing]]