TOOL was originally built when memory was scarce and processors were slow. Each tool was optimized for performance in that world, which meant fixed positions for dimensions: the key dimension of a single-input tool usually had to be on the far right, and binary tools (operating on two inputs) required matching dimensions in specific locations. Reorders were scattered throughout the code to accommodate these requirements.
Over the years processors and memory improved, and the more commonly used tools that operate on one key dimension were updated to take advantage of the better machines by allowing the key dimension to be anywhere. This reduces the need for reorders and makes TOOL code much easier to read and write.
More recently the "binary" type tools, which operate on two inputs that interact with each other, were improved as well, so the dimension orders of the two inputs no longer have to match in any way. This has greatly improved code readability, but at a cost in memory.
Today TOOL code is much easier to read and write, which is ideal, and as long as your objects have a "reasonable" number of elements (say 100M or fewer) you don't need to think about TOOL memory efficiency. If you have more elements, or you are running on a memory-strapped machine, you do have to concern yourself with efficiency; read on.
Facts for a 32-bit machine:
These are the limitations we have until we move to 64-bit machines and optimize the TOOL language to use them.
Any tool needs memory for all its inputs AND outputs, but not always enough for the full objects to be in memory at once. How much of each object is pulled into memory depends on how the objects are ordered. Read on!
These tools can have the "key" dimension (usually specified with dim=) anywhere in the object, but the further to the right it is, the less of the input object has to be in memory at a time.
Given an object A[a,b,c,d] that you want to run a single-input tool on with key dimension:
If the key dimension is d, the tool brings y elements of the object into memory at a time, where y = extent(d). It processes that section, then moves on to the next section, also of size extent(d). This is repeated x times, where x = extent(a) * extent(b) * extent(c).
If the key dimension is b, the tool brings y elements of the object into memory at a time, where y = extent(b) * extent(c) * extent(d). It processes that section, then moves on to the next. This is repeated x times, where x = extent(a).
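The chunking arithmetic above can be sketched as follows (in Python rather than TOOL, and the function name and example extents are made up for illustration): the chunk size y is the product of the extents from the key dimension rightward, and the number of chunks x is the product of the extents to its left.

```python
from math import prod

def chunking(extents, key_index):
    """Return (y, x): elements in memory at a time, and number of chunks."""
    y = prod(extents[key_index:])   # key dimension and everything right of it
    x = prod(extents[:key_index])   # everything left of the key dimension
    return y, x

# A[a,b,c,d] with illustrative extents a=8, b=58, c=2, d=48
extents = [8, 58, 2, 48]
print(chunking(extents, 3))  # key dimension d: y = 48, x = 8*58*2 = 928
print(chunking(extents, 1))  # key dimension b: y = 58*2*48 = 5568, x = 8
```

Note that y * x is always the total element count, so moving the key dimension right trades fewer elements in memory for more passes over the object.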
Here's a real-world example and a test that was done.
Input object:
tmp1[]: array of real numbers, single precision
version: 3
desc: local tmp1[] =
dim1: SET; stateLOT; 8
dim2: SET; SD96; 58
dim3: SET; lndCByIrr; 2
dim4: CAT; agrCACat: agrRegType.crop; 48
dim5: SEQ; yearbuilt: 1856:2001:5; year
dim6: SET; agrCondType; 3
dim7: SEQ; time: 1861:2001:5; year
units: tonne / hectare
scientific measure: tonne / hectare
SI signature: m^-2 kg
data: 116259840 elements, empty
Test1 - map the first dimension (stateLOT) using a dimension mapping:
local tmp2[] = map (tmp1[]; stateLOT->state)
Test2 - reorder the stateLOT dimension to the far right, do the dimension mapping, and reorder back:
local tmp1Reord[] = reorder (tmp1[]; 2,3,4,5,6,7,1)
local tmp2Reord[] = map (tmp1Reord[]; stateLOT->state)
local tmp3Reord[] = reorder (tmp2Reord[]; 7,1,2,3,4,5,6)
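Applying the chunk-size rule for single-input tools to this example gives a rough sense of why the two tests differ (a sketch in Python, not TOOL; actual TOOL internals, and the cost of the reorders themselves, may differ). The extents are read off the dim1..dim7 listing, with SEQ extents computed as (last - first)/step + 1.

```python
from math import prod

# stateLOT, SD96, lndCByIrr, agrCACat, yearbuilt, agrCondType, time
extents = [8, 58, 2, 48, 30, 3, 29]
assert prod(extents) == 116259840   # matches the "data:" line above

# Test1: stateLOT is dim1 (far left), so each chunk spans everything from
# dim1 rightward -- the whole object must be in memory at once.
test1_chunk = prod(extents)         # 116,259,840 elements, 1 pass

# Test2: after the reorder, stateLOT is dim7 (far right), so each chunk is
# only extent(stateLOT) elements, processed many times.
test2_chunk = extents[0]            # 8 elements per chunk
test2_passes = prod(extents[1:])    # 14,532,480 passes
```

So Test2's map needs only a tiny fraction of the object in memory at a time, at the price of two reorder steps and many more passes.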
Runtime comparisons:
– Again, order matters
– Basically, if there are matching dimensions, keep them together in the left-most positions
– These tools require bringing the WHOLE input and output objects into memory
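Since binary tools hold both full inputs and the full output in memory (per the note above), the footprint is easy to estimate. A sketch in Python (the helper name is made up; 4 bytes per element assumes the single-precision reals shown in the tmp1[] example):

```python
def binary_tool_bytes(elements_a, elements_b, elements_out, bytes_per_elem=4):
    """Estimated memory for a binary tool: both inputs plus the output."""
    return (elements_a + elements_b + elements_out) * bytes_per_elem

# e.g. two inputs the size of tmp1[] and a same-shaped output:
n = 116259840
total = binary_tool_bytes(n, n, n)
print(total / 2**30)   # roughly 1.3 GiB -- already tight on a 32-bit machine
```

This is why large objects fed to binary tools are where the 32-bit limits bite first.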