====== Use create to make informants from a data file ======

===== Important parameters =====

==== General ====

**allowDuplicate** - the column (or row) you build your informant from will quite likely contain repeated element names. This parameter tells create that repeats are O.K.; it does not actually add the duplicate entries to the set.

**searchElemName1** and **replaceElemName1** - a search-and-replace pair that adjusts the element names as they are read in. Good for removing commas or periods and simplifying the names. These two should always be used together. Further pairs are numbered in sequence (searchElemName2 and replaceElemName2, and so on), as in the examples below.

**wholeWordElemName** - specifies whether the search strings above match a substring or a whole word.

**replaceWhiteSpaceInElemName** - replaces any tab, space or other "white space" in the element names. White space is allowed in element names when reading in, but it is not fully supported by all tools, so it is best either to replace the white space here or to map to informants without white space before doing any serious manipulation.

==== From a Line ====

**elemNamesFromLineNum** - specifies the line/row to read the element names from.

**firstCol** - indicates the first column. Only needed if the set should start at a column other than the first.

==== From a Column ====

**elemNamesFromColNum** - specifies the column to read the element names from.

**heading=on/off** - if set to on, reading starts at the second row, with no consideration of commented-out lines.

**firstLine** - use this only if you want to start on a row other than the first, with no consideration of commented-out lines. Note that this overrides the "heading" setting!

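To get a feel for what the element-name parameters do before running create, here is a minimal Python sketch of the clean-up described above. It is not how create is implemented: the function name ''clean_elem_name'' and its arguments are made up for illustration. It assumes the pairs are applied in numbered order, that whole-word matching means the entire name must equal the search string, that an empty search string only matches an empty name, and that white space is replaced last, one character at a time.

```python
import re


def clean_elem_name(name, pairs, whole_word=False, whitespace_to=None):
    """Illustrative only: mimic the documented element-name clean-up.

    pairs         -- ordered (search, replace) tuples, like the numbered
                     searchElemNameN / replaceElemNameN parameters
    whole_word    -- ASSUMPTION: True means the whole name must equal
                     the search string for the replacement to happen
    whitespace_to -- replacement for each white-space character, like
                     replaceWhiteSpaceInElemName (None leaves it alone)
    """
    for search, replace in pairs:
        if whole_word or search == "":
            # ASSUMPTION: an empty search string matches only an empty name.
            if name == search:
                name = replace
        else:
            name = name.replace(search, replace)
    if whitespace_to is not None:
        # ASSUMPTION: white space is handled after the search/replace pairs.
        name = re.sub(r"\s", whitespace_to, name)
    return name


pairs = [(",", "-"), ("&", "and")]
print(clean_elem_name("Mining & Oil and Gas Extraction", pairs, whitespace_to="_"))
# prints Mining_and_Oil_and_Gas_Extraction
```

Running a handful of raw names through a sketch like this is a cheap way to spot names that collapse to the same cleaned value before loading any data.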
===== Example Reading in sets =====

This is a sample .csv file from which we will read some sets:

  Region,,,ALBERTA,,,,,,,,,,
  Year,,,2006,,,,,,,,,,
  Table 10:,,,Sectoral Greenhouse Gas Emission Summary,,,,,,,,,,
  Greenhouse Gas Categories,,,,,Greenhouse Gases,,,,,,,,
  ,,a,b,c,CO2,CH4,CH4e,N2O,N2Oe, HFCs , PFCs ,SF6,TOTAL
  ,,,,Global Warming Potential,,,21,,310,,,,
  ,,,,Unit,kt,kt,kt CO2 equivalent, kt ,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent,kt CO2 equivalent
  ,,,,,,,,,,,,,
  ,,,,,,,,,,,,,
  ,Stationary Combustion Sources,,,,"123,801",74,"1,557",3,819,,,,"126,177"
  ,Electricity and Heat Generation,,,,"53,600",2,31.7,1,304.3,,,,"53,936"
  ,Fossil Fuel Production and Refining,,,,"38,618",70,"1,475.00",1,287.4,,,,"40,380"
  ,Mining & Oil and Gas Extraction,,,,"11,408",0,4.5,0,83.5,,,,"11,496"
  ,Manufacturing Industries,,,,"6,978",0,5.9,0,56.6,,,,"7,041"

Here it is as a .csv you can download and open with Excel to see it better and to test the code below: {{:howtos:workwithdata:emtestfile.csv|}}

==== Read from a column ====

The following code reads the second column:

  string $importPath, $fileName
  buildstring ($importPath, $home, "/testScripts")
  buildstring ($fileName, $importPath, "/emtestfile.csv")
  localinformant ECemRowTitles[] = create (; object=set, delimiter=",", \
      allowDuplicate=on, \
      elemNamesFromColNum=2, firstLine=9, \
      searchElemName1=",", replaceElemName1="-", \
      searchElemName2="&", replaceElemName2="and", \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
  display (ECemRowTitles[])
  export (ECemRowTitles[]; file=$importPath/ECemRowTitles.txt)

==== Read from a line/row ====

The following code reads line 6, starting at the fourth column:

  string $importPath, $fileName
  buildstring ($importPath, $home, "/testScripts")
  buildstring ($fileName, $importPath, "/emtestfile.csv")
  localinformant GHGInvEmType[] = create (; object=set, delimiter=",", \
      allowDuplicate=on, \
      elemNamesFromLineNum=6, firstCol=4, \
      searchElemName1=",", replaceElemName1="-", \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
  display (GHGInvEmType[])
  export (GHGInvEmType[]; file=$importPath/GHGInvEmType.txt)

===== Example Test Load =====

Once you have code for reading the sets it's often useful to do a quick test load of the data. This file pulls together all the pieces:

  $informPath = $home + "/V4/informants"
  string $importPath, $fileName
  buildstring ($importPath, $home, "/testScripts")
  buildstring ($fileName, $importPath, "/emtestfile.csv")
  localinformant ECemRowTitles[] = create (; object=set, delimiter=",", \
      allowDuplicate=on, \
      elemNamesFromColNum=2, firstLine=11, \
      searchElemName1="", replaceElemName1="blank", \
      searchElemName2="&", replaceElemName2="and", \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
  display (ECemRowTitles[])
  export (ECemRowTitles[]; file=$importPath/ECemRowTitles.txt)
  localinformant GHGInvEmType[] = create (; object=set, delimiter=",", \
      allowDuplicate=on, \
      elemNamesFromLineNum=6, firstCol=3, \
      searchElemName1="", replaceElemName1="blank", \
      searchElemName2="&", replaceElemName2="and", \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
  display (GHGInvEmType[])
  export (GHGInvEmType[]; file=$importPath/GHGInvEmType.txt)
  local data[rows,cols] = create (; dim=ECemRowTitles, dim=GHGInvEmType, \
      dataFormat="coordinate", fileFormat="text", allCoord=off, delimiter=",", \
      firstLine=11, firstCol=2, ignoreExtraCols=on, ignoreMissingCols=on, \
      searchData=",", replaceData="" , \
      searchElemName1="", replaceElemName1="blank", \
      searchElemName2="&", replaceElemName2="and", \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
  table (data[rows,cols])

And here's a growing list of "magic" element name character replacements:

  localinformant NHS2011Profile_fld[] = create (; object=set, delimiter=",", \
      allowDuplicate=off, \
      elemNamesFromColNum=2, firstLine=1, \
      searchElemName1=",", replaceElemName1="", \
      searchElemName2=".", replaceElemName2="_", \
      searchElemName3="&", replaceElemName3="and", \
      searchElemName4="/", \
      replaceElemName4="_", \
      searchElemName5="(", replaceElemName5="", \
      searchElemName6=")", replaceElemName6="", \
      searchElemName7="]", replaceElemName7="", \
      searchElemName8="[", replaceElemName8="", \
      searchElemName9="-", replaceElemName9="_", \
      searchElemName10="'", replaceElemName10="", \
      searchElemName11="$", replaceElemName11="dlr", \
      searchElemName12="%", replaceElemName12="pct", \
      searchElemName13=":", replaceElemName13="", \
      searchElemName14="=", replaceElemName14="eq", \
      stripLeadingWhiteSpace=off, \
      replaceWhiteSpaceInElemName="_" , \
      file=$fileName)
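
Since the list keeps growing, renumbering the pairs by hand gets tedious. A small Python sketch, run outside the tool itself, can generate the numbered parameter lines from an ordered list. The helper name ''elem_name_params'' is hypothetical and the output simply mirrors the layout used above. It does no escaping, so a search string containing a double quote would need handling by hand.

```python
def elem_name_params(pairs):
    """Return numbered searchElemNameN/replaceElemNameN lines.

    pairs is an ordered list of (search, replace) strings. Each output
    line ends with a continuation backslash so it can be pasted into
    the middle of a create (...) call.
    """
    lines = []
    for n, (search, replace) in enumerate(pairs, start=1):
        lines.append(
            f'searchElemName{n}="{search}", replaceElemName{n}="{replace}", \\'
        )
    return "\n".join(lines)


# The first few replacements from the list above.
magic = [(",", ""), (".", "_"), ("&", "and")]
print(elem_name_params(magic))
```

Inserting a new pair in the middle of the Python list renumbers everything after it automatically, which keeps each search and replace parameter paired with the same number.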