The Excel Importer
The Excel Importer allows you to import data from an Excel file into a (working) prototype, effectively adding it to the population of the running prototype.
This text is meant for the Ampersand user who has more population than can be dealt with by POPULATION statements. It is also useful if you have existing spreadsheets filled with tables that you want to use as population in your Ampersand script. Please note that some editing of your Excel file may be required to make Ampersand understand how columns match Ampersand relations.
Nothing special is required for installation; this extension is enabled by default.
Importing Excel files in a running prototype
If the user has the required access rights to import a population (by default no login/roles are required), the Excel import is available via the menu bar.
How to create importable Excel files
When you have a prototype running for an Ampersand context, you can import data for that prototype from an Excel file, which effectively adds the population specified in the Excel file to the population that is currently already in the database. This section describes how to construct an Excel file that can be used to do this.
There are two ways to specify the data:
1. Using an interface definition (detected automatically when the sheet name equals the interface name)
2. Using the [import block] syntax
Using an interface definition
To be specified. The functionality is already implemented.
Using the [import block] syntax
Let us consider a small (useless) Ampersand model, defined as follows:
rAA :: A*A [PROP]
rAB :: A*B [UNI]
rAC :: A*C
sAB :: A->B
tAD :: A*Delta
uBA :: B*A
If you want to specify data elements in an Excel file in order to populate such a model, you must define so-called 'blocks' that contain this data. The importer looks for such blocks throughout the Excel file: blocks may be on different sheets, and all sheets are inspected.
Here is an example of such a 'block' (note: all blocks must start in the leftmost column; if not, they are disregarded):
Here is the specification of a block:
- A 'block' consists of 2 header rows followed by lines of data. A 'block' terminates whenever a next block starts or the end of a sheet is reached. Empty lines are disregarded.
- Every cell in the leftmost column whose contents starts with the character '[' is the first cell in the first header row of a block. The contents of this cell is further disregarded.
- Subsequent cells in the first header row must either be empty or contain the name of a relation that is known in your Ampersand model, optionally followed by a ~ (flip) character. In our example, such cells may hence only be empty ('') or contain the name of one of the relations declared above, optionally followed by ~.
- The second header row only contains cells that are empty, contain a concept name, or contain a concept name and a delimiter. In our example, such cells may only be empty ('') or contain the name of a concept of the model above, optionally with a delimiter.
- The first cell (in the second header row) must contain the source (left) concept of all relations specified in the first header row. It may not be empty.
- Every subsequent cell (in the second header row) must either be empty, or contain the name of the target (right) concept of the relation that is specified in the same column in the first header row.
- There is an exception to the previous two specifications: if a cell in the first header row specifies a flipped relation, the leftmost cell in the second header row specifies the target concept for the (unflipped) relation, and the cell below the flipped relation specifies the source concept for the (unflipped) relation. In the example, all relations in the first header row have source concept A, except for relation uBA, which has concept A as its target concept and B as its source concept.
- Every subsequent row in this block is called a data row. Cells in a data row are either empty or non-empty. If a non-empty cell contains a formula, this formula is evaluated to obtain the cell contents. If a non-empty cell does not contain a formula, its contents is obtained as is (see notes for errors in formula evaluation). From here on, when we talk about 'the contents of a cell', the value obtained from (evaluating the expression in) that cell is meant.
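The block structure described above can be sketched in a few lines of Python. This is only an illustration, not the importer's actual code: it assumes a sheet has already been read into a list of rows (each a list of cell values, with None or "" for an empty cell), and the names find_blocks and is_empty are hypothetical. It also assumes that rows appearing before the first block belong to no block and are skipped.

```python
from typing import List, Optional

Cell = Optional[str]  # None or "" stands for an empty cell
Row = List[Cell]


def is_empty(cell: Cell) -> bool:
    return cell is None or cell == ""


def find_blocks(rows: List[Row]) -> List[dict]:
    """Split the rows of one sheet into blocks (hypothetical helper).

    A block starts at every row whose leftmost cell begins with '['.
    The next non-empty row is taken as the second header row; all rows
    up to the next block (or the end of the sheet) are data rows.
    """
    blocks: List[dict] = []
    current = None
    for row in rows:
        if all(is_empty(c) for c in row):
            continue  # empty lines are disregarded
        first = row[0]
        if not is_empty(first) and first.startswith("["):
            # First header row: keep the whole row so that column
            # indices stay aligned with the rows below it.
            current = {"relations": row, "concepts": None, "data": []}
            blocks.append(current)
        elif current is None:
            continue  # assumption: rows before the first block are ignored
        elif current["concepts"] is None:
            current["concepts"] = row  # second header row
        else:
            current["data"].append(row)
    return blocks


# Illustrative sheet (made up for this sketch, not the example from the text).
sheet = [
    ["some title", None, None],
    ["[block 1]", "rAB", "uBA~"],
    ["A", "B", "B"],
    ["alfa1", "beta1", "beta1"],
    [None, None, None],
    ["[block 2]", "rAC"],
    ["A", "C"],
    ["alfa1", "char1"],
]
blocks = find_blocks(sheet)
```

Keeping the full header rows (including the first cell) is a deliberate choice in this sketch: the relation in column n of the first header row and the concept in column n of the second header row then share an index with the data cells beneath them.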
Data rows are interpreted as follows:
- When the first cell in a data row is empty, the content of all other cells in that row is disregarded (you may use such cells to include comments, computations, or whatever else you like)
- When the first cell in a data row is not empty, the content of all other non-empty cells and the content of the first cell may define a set of pairs (srcAtom,tgtAtom), each of which is to be inserted into the population of the Ampersand model, where
  - 'srcAtom' is the contents of the first cell;
  - 'tgtAtom' is the (untrimmed) contents of a non-empty cell;
  - for the relation to which the pair (srcAtom,tgtAtom) is to be added,
    - its name is specified in the first header row of the block, in the same column as 'tgtAtom';
    - its SRC concept is specified in the second header row, in the first column;
    - its TGT concept is specified in the second header row, in the same column as 'tgtAtom'.
This means that the example is equivalent to the following population specification (note that the cell containing 'CMT' is disregarded, as it is a comment):
POPULATION rAA CONTAINS [ ("alfa1", "alfa1") ]
POPULATION rAB CONTAINS [ ("alfa1", "beta1") ]
POPULATION rAB CONTAINS [ ("alfa2", "beta2") ]
POPULATION rAC CONTAINS [ ("alfa1", "char1") ]
POPULATION rAC CONTAINS [ ("alfa1", "char2") ]
POPULATION rAC CONTAINS [ ("alfa1", "char3") ]
POPULATION rAC CONTAINS [ ("alfa2", "char2") ]
POPULATION rAC CONTAINS [ ("alfa3", "char2") ]
POPULATION rAC CONTAINS [ ("alfa3", "char3") ]
POPULATION rAC CONTAINS [ ("alfa3", "char4") ]
POPULATION sAB CONTAINS [ ("alfa1", "beta2") ]
POPULATION sAB CONTAINS [ ("alfa3", "beta1") ]
POPULATION tAD CONTAINS [ ("alfa1", "d1") ]
POPULATION tAD CONTAINS [ ("alfa2", "d2") ]
POPULATION tAD CONTAINS [ ("alfa3", "d1") ]
POPULATION uBA CONTAINS [ ("beta1", "alfa1") ]
POPULATION uBA CONTAINS [ ("beta3", "alfa2") ]
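The way data rows turn into pairs, including the exception for flipped relations, can be sketched as follows. This is a minimal Python illustration under assumptions, not the importer's implementation: the function name block_to_pairs and the shape of the result (a dictionary keyed by relation name, source concept and target concept) are hypothetical, the sample block is made up for the sketch, and '_NEW' handling and formula evaluation are left out.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Optional[str]  # None or "" stands for an empty cell
# key: (relation name, SRC concept, TGT concept); value: set of (src, tgt) pairs
Population = Dict[Tuple[str, str, str], Set[Tuple[str, str]]]


def block_to_pairs(relations: List[Cell], concepts: List[Cell],
                   data: List[List[Cell]]) -> Population:
    """Interpret one block; both header rows include their first cell."""
    population: Population = {}
    left_concept = concepts[0]  # must not be empty
    for row in data:
        if not row or not row[0]:
            continue  # first cell empty: the whole row is disregarded
        first_atom = row[0]
        for col in range(1, len(row)):
            cell = row[col]
            rel = relations[col] if col < len(relations) else None
            concept = concepts[col] if col < len(concepts) else None
            if not cell or not rel or not concept:
                continue  # empty cell, or column without relation/concept
            if rel.endswith("~"):
                # Flipped: the first column holds the target atoms.
                key = (rel[:-1], concept, left_concept)
                pair = (cell, first_atom)
            else:
                key = (rel, left_concept, concept)
                pair = (first_atom, cell)
            population.setdefault(key, set()).add(pair)
    return population


# Made-up block for illustration; the last column has no relation name.
relations = ["[example]", "rAA", "rAB", "rAC", "uBA~", None]
concepts = ["A", "A", "B", "C", "B", None]
data = [
    ["alfa1", "alfa1", "beta1", "char1", "beta1", None],
    [None, "CMT", None, None, None, None],  # comment row
    ["alfa2", None, "beta2", "char2", "beta3", "a remark"],
]
population = block_to_pairs(relations, concepts, data)
```

In this sketch the flipped column uBA~ yields the pairs ("beta1", "alfa1") and ("beta3", "alfa2") in uBA, while the 'CMT' row and the 'a remark' cell contribute nothing.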
Notes
- You need NOT know about the internals of the database to use this plugin (at least, that's the idea).
- You may specify formulae instead of texts. The result of the formula will be read (and converted to text) before being inserted into the database. This allows for dynamic construction of identifiers, precomputation of tables, adapting dates to today's date, etc. Note, however, that this does not always work flawlessly. In particular, functions such as HLOOKUP are known to produce errors (that we are not capable of fixing), so such functions should be avoided.
- If you use '_NEW' in the first column, the (dirty) identifier for the atom will be automatically generated. If you use '_NEW' in a subsequent column on the same row, this will be replaced with the (dirty) identifier for the source atom (which you can use e.g. to populate property-relations). Note that as we also support formulae, you may use those to achieve the same result (and exercise control over the actual dirty identifiers used).
- It is possible to store all sorts of data in the spreadsheet that will not interfere with the database population. The contents of the following cells is disregarded and can therefore be used for other purposes:
  - cells in a row whose first cell is empty;
  - cells in a column where the cell that specifies the relation name or the TGT concept is empty.
- When you use something like 'CLASSIFY X ISA Y' in your model, and want to populate an atom 'xy', then you should populate it in the block where 'X's are populated. In this block, you can not only populate relations that have source concept X, but also relations that have source concept Y.
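The '_NEW' note above can be illustrated with a small Python sketch. It is an assumption-laden illustration, not the importer's code: the function resolve_new is hypothetical, the identifier format ("A_1", "A_2", ...) is invented for the example (the text does not say what generated identifiers look like), and the sketch assumes that '_NEW' in a later column is always replaced by the row's source atom, whether or not that atom was itself generated.

```python
import itertools
from typing import Callable, List, Optional

Cell = Optional[str]


def resolve_new(row: List[Cell], make_id: Callable[[], str]) -> List[Cell]:
    """Replace '_NEW' markers in one data row (hypothetical helper)."""
    if not row or not row[0]:
        return list(row)  # rows with an empty first cell are disregarded anyway
    # '_NEW' in the first column: generate an identifier for the source atom.
    src = make_id() if row[0] == "_NEW" else row[0]
    # '_NEW' in a later column: reuse the identifier of the source atom.
    return [src] + [src if cell == "_NEW" else cell for cell in row[1:]]


# Invented identifier scheme, only to make the example deterministic.
counter = itertools.count(1)


def make_id() -> str:
    return f"A_{next(counter)}"


first = resolve_new(["_NEW", "_NEW", "beta1"], make_id)
second = resolve_new(["_NEW", None, "beta2"], make_id)
third = resolve_new(["alfa1", "_NEW", None], make_id)
```

With a block whose columns are rAA and rAB, the first row would add the generated atom to the property relation rAA and link it to "beta1" in rAB.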
That's all, Folks!