Use create to make informants from a data file

Important parameters

General

allowDuplicate - it is quite likely that the column you want to make your informant from will have repeated element names. This parameter just tells create that this is O.K.; it does not actually make duplicate entries in the set.

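searchElemName1 and replaceElemName1 - use this search-and-replace pair to adjust the element names as they are read in. Good for removing commas or periods and simplifying the names. These two should always be used together. Further pairs are numbered in sequence (searchElemName2 and replaceElemName2, and so on), as in the examples below.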

wholeWordElemName - specifies whether the search strings above match a substring or a whole word

replaceWhiteSpaceInElemName - replaces any tab, space or other “white space” in the element names with the given string. White space is allowed in element names when reading in, but it is not fully supported by all tools, so it's best either to replace the white space here or to map to informants without white space before doing any serious manipulation.
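
To make the effect of these parameters concrete, here is a rough Python approximation of the clean-up. It is not how create is implemented: the helper name clean_elem_names, the order in which the steps run, the substring matching, and the keep-first handling of repeated names are all assumptions made for illustration.

```python
import re

def clean_elem_names(raw_names, replacements, white_space_repl=None, allow_duplicate=True):
    """Approximate the element-name clean-up (hypothetical helper, not part of create)."""
    cleaned = []
    seen = set()
    for name in raw_names:
        # Apply each search/replace pair in order (substring matching assumed).
        for search, replace in replacements:
            name = name.replace(search, replace)
        # Replace every white-space character if a replacement string was given.
        if white_space_repl is not None:
            name = re.sub(r"\s", white_space_repl, name)
        if name in seen:
            if not allow_duplicate:
                raise ValueError(f"duplicate element name: {name}")
            continue  # a repeated name is tolerated but not added to the set again
        seen.add(name)
        cleaned.append(name)
    return cleaned

names = clean_elem_names(
    ["Mining & Oil and Gas Extraction", "Manufacturing Industries", "Manufacturing Industries"],
    [(",", "-"), ("&", "and")],
    white_space_repl="_",
)
print(names)  # ['Mining_and_Oil_and_Gas_Extraction', 'Manufacturing_Industries']
```

The repeated "Manufacturing Industries" entry appears once in the result, which mirrors the description of allowDuplicate above: duplicates are accepted on input but not stored twice.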

From a Line

elemNamesFromLineNum - use this to specify the line/row

firstCol - use this to indicate the first column; only needed if the set should start at a column other than the first
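
As a sketch of what reading names from a line amounts to, the Python below picks one line of a delimited file and keeps the cells from a starting column onward. The helper names_from_line is hypothetical, and 1-based line and column numbering is an assumption (it is consistent with the examples on this page, where column 2 is called the second column).

```python
import csv
import io

# Three lines abridged from the sample file shown further down this page.
SAMPLE = """Region,,,ALBERTA
Year,,,2006
,,a,b,c,CO2,CH4
"""

def names_from_line(text, line_num, first_col=1, delimiter=","):
    """Return the cells of one line, starting at first_col (both 1-based, assumed)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows[line_num - 1][first_col - 1:]

print(names_from_line(SAMPLE, line_num=3, first_col=4))  # ['b', 'c', 'CO2', 'CH4']
```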

From a Column

elemNamesFromColNum - use this to specify the column

heading=on/off - if this parameter is set to on, create uses the second row as the start point, with no consideration of commented-out lines

firstLine - use this only if you want to start on a row other than the first, again with no consideration of commented-out lines. Note that this overrides the “heading” setting!
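
The interaction between heading and firstLine can be sketched in Python as follows. This only illustrates the rule stated above (firstLine wins over heading); the helper names_from_column, the 1-based numbering, and the small SAMPLE text with its "Sector" header are assumptions invented for the example.

```python
import csv
import io

# Hypothetical three-line file with a heading row.
SAMPLE = """Sector,CO2
Electricity and Heat Generation,53600
Manufacturing Industries,6978
"""

def names_from_column(text, col_num, heading=False, first_line=None, delimiter=","):
    """Return the cells of one column (hypothetical helper, 1-based numbering assumed)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if first_line is not None:
        start = first_line            # an explicit firstLine overrides heading
    else:
        start = 2 if heading else 1   # heading=on skips the first row
    # Skip rows that are too short to contain the requested column.
    return [row[col_num - 1] for row in rows[start - 1:] if len(row) >= col_num]

print(names_from_column(SAMPLE, 1, heading=True))
print(names_from_column(SAMPLE, 1, heading=True, first_line=3))
```

The first call skips the heading row; the second starts at row 3 even though heading is on.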

Example Reading in sets

This is a sample .csv file from which we will read some sets:

Region,,,ALBERTA,,,,,,,,,,
Year,,,2006,,,,,,,,,,
Table 10:,,,Sectoral Greenhouse Gas Emission Summary,,,,,,,,,,

Greenhouse Gas Categories,,,,,Greenhouse Gases,,,,,,,,
,,a,b,c,CO2,CH4,CH4e,N2O,N2Oe, HFCs , PFCs ,SF6,TOTAL
,,,,Global Warming Potential,,,21,,310,,,,
,,,,Unit,kt,kt,kt CO2  equivalent, kt ,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent
,,,,,,,,,,,,,
,,,,,,,,,,,,,
,Stationary Combustion Sources,,,,"123,801",74,"1,557",3,819,,,,"126,177"
,Electricity and Heat Generation,,,,"53,600",2,31.7,1,304.3,,,,"53,936"
,Fossil Fuel Production and Refining,,,,"38,618",70,"1,475.00",1,287.4,,,,"40,380"
,Mining & Oil and Gas Extraction,,,,"11,408",0,4.5,0,83.5,,,,"11,496"
,Manufacturing Industries,,,,"6,978",0,5.9,0,56.6,,,,"7,041"

Here it is as a .csv file that you can download and open with Excel to see it better and to test the code below: emtestfile.csv

Read from a column

The following code reads the second column as a set:

string $importPath, $fileName
buildstring ($importPath, $home, "/testScripts")		

buildstring ($fileName, $importPath, "/emtestfile.csv")
localinformant ECemRowTitles[] = create (; object=set, delimiter=",", \
	allowDuplicate=on, \
	elemNamesFromColNum=2, firstLine=9, \
	searchElemName1=",", replaceElemName1="-", \
	searchElemName2="&", replaceElemName2="and", \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)	
display (ECemRowTitles[])
export (ECemRowTitles[]; file=$importPath/ECemRowTitles.txt)

Read from a line/row

The following code reads line 6 as a set, starting at column 4:

string $importPath, $fileName
buildstring ($importPath, $home, "/testScripts")		

buildstring ($fileName, $importPath, "/emtestfile.csv")
localinformant GHGInvEmType[] = create (; object=set, delimiter=",", \
	allowDuplicate=on, \
	elemNamesFromLineNum=6, firstCol=4, \
	searchElemName1=",", replaceElemName1="-", \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)

display (GHGInvEmType[])
export (GHGInvEmType[]; file=$importPath/GHGInvEmType.txt)

Example Test Load

Once you have code for reading the sets it's often useful to do a quick test load of the data. This file pulls together all the pieces:

$informPath = $home + "/V4/informants"

string $importPath, $fileName
buildstring ($importPath, $home, "/testScripts")		

buildstring ($fileName, $importPath, "/emtestfile.csv")

localinformant ECemRowTitles[] = create (; object=set, delimiter=",", \
	allowDuplicate=on, \
	elemNamesFromColNum=2, firstLine=11, \
	searchElemName1="", replaceElemName1="blank", \
	searchElemName2="&", replaceElemName2="and", \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)	
display (ECemRowTitles[])
export (ECemRowTitles[]; file=$importPath/ECemRowTitles.txt)


localinformant GHGInvEmType[] = create (; object=set, delimiter=",", \
	allowDuplicate=on, \
	elemNamesFromLineNum=6, firstCol=3, \
	searchElemName1="", replaceElemName1="blank", \
	searchElemName2="&", replaceElemName2="and", \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)

display (GHGInvEmType[])
export (GHGInvEmType[]; file=$importPath/GHGInvEmType.txt)

local data[rows,cols] = create (; dim=ECemRowTitles, dim=GHGInvEmType, \
	dataFormat="coordinate", fileFormat="text", allCoord=off, delimiter=",", \
	firstLine=11, firstCol=2, ignoreExtraCols=on, ignoreMissingCols=on, \
	searchData=",", replaceData="" , \
	searchElemName1="", replaceElemName1="blank", \
	searchElemName2="&", replaceElemName2="and", \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)

table (data[rows,cols])
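
The searchData/replaceData pair in the data load above is there because the sample file writes numbers with thousands separators, such as "53,600". As a hedged illustration of that clean-up, the Python sketch below parses one line of the sample file and strips the commas before converting to numbers. The helper parse_values is hypothetical, and Python's csv module is used for the quoted fields; how create itself treats quotes is not stated here.

```python
import csv
import io

# One data line copied from the sample file above.
LINE = ',Electricity and Heat Generation,,,,"53,600",2,31.7,1,304.3,,,,"53,936"\n'

def parse_values(line, first_value_col=6, search=",", replace=""):
    """Convert the value cells of one line to floats (hypothetical helper)."""
    row = next(csv.reader(io.StringIO(line)))
    values = []
    for cell in row[first_value_col - 1:]:
        cell = cell.replace(search, replace).strip()  # "53,600" -> "53600"
        values.append(float(cell) if cell else None)  # empty cell -> no value
    return values

print(parse_values(LINE))
# [53600.0, 2.0, 31.7, 1.0, 304.3, None, None, None, 53936.0]
```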

And here's a growing list of “magic” element name character replacements:

localinformant NHS2011Profile_fld[] = create (; object=set, delimiter=",", \
	allowDuplicate=off, \
	elemNamesFromColNum=2, firstLine=1, \
	searchElemName1=",", replaceElemName1="", \
	searchElemName2=".", replaceElemName2="_", \
	searchElemName3="&", replaceElemName3="and", \
	searchElemName4="/", replaceElemName4="_", \
	searchElemName5="(", replaceElemName5="", \
	searchElemName6=")", replaceElemName6="", \
	searchElemName7="]", replaceElemName7="", \
	searchElemName8="[", replaceElemName8="", \
	searchElemName9="-", replaceElemName9="_", \
	searchElemName10="'", replaceElemName10="", \
	searchElemName11="$", replaceElemName11="dlr", \
	searchElemName12="%", replaceElemName12="pct", \
	searchElemName13=":", replaceElemName13="", \
	searchElemName14="=", replaceElemName14="eq", \
	stripLeadingWhiteSpace=off, \
	replaceWhiteSpaceInElemName="_" , \
	file=$fileName)
howtos/workwithdata/createsetsfromdatafile.txt · Last modified: 2015/06/22 15:46 by marcus.williams