This document contains instructions for running the CSV to GDF converter application within the EPISODES Platform. The application converts a CSV (Comma-Separated Values) file into a GDF (Generic Data Format) file with the .mat extension.

To obtain more general information about working with applications within the Platform, see Applications Quick Start Guide.

CATEGORY Converters

KEYWORDS Format conversion

CITATION If you use the results or visualizations retrieved from this application in a publication, then you must cite the data source as follows:
Orlecka-Sikora, B., Lasocki, S., Kocot, J. et al. (2020) An open data infrastructure for the study of anthropogenic hazards linked to georesource exploitation. Sci Data 7, 89, doi: 10.1038/s41597-020-0429-3.

Requirements for input file

The input file to be converted to the GDF file format should be a valid CSV file compatible with one of the GDF file types included in the list, with values separated by commas (','); see also the CSV format specifications, e.g., https://tools.ietf.org/html/rfc4180. The file may contain a header line, which defines the names of the columns in the GDF file; if it does not, uncheck the "Use csv header" checkbox. Numbers within the file should use the English locale (a dot as the decimal separator). Text that contains special characters (including the comma, which is used to separate values) should be enclosed in double quotes.
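As an illustration of the quoting rule above, the following minimal sketch uses Python's standard csv module to write a small file in which a text value contains a comma. The column name Station and the sample rows are assumptions made up for this example; they are not taken from the list of GDF file types.

```python
import csv
import io


def write_rows(header, rows):
    """Serialize rows as comma-separated text, quoting only fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=",",
        quoting=csv.QUOTE_MINIMAL,  # quote only fields with commas, quotes or newlines
        lineterminator="\n",
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data: "Station" is an illustrative custom column.
csv_text = write_rows(
    ["Station", "Water_Level"],
    [["Well A, north", 12.89], ["Well B", 10.42]],
)
print(csv_text)
```

The text value with the comma is written inside double quotes, while the numbers keep the dot as the decimal separator.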

Example input file

The following excerpt provides an example of the contents of a valid input CSV file containing data for the water level GDF type. It defines two columns with three entries. Each entry contains a value for each column; if a value is not specified, its field should be left empty (so that two commas follow each other, or the line ends with a comma). The most popular column names and formats are described in this guide; however, custom columns are also allowed. The time should be written as a single text value in any format; however, this format has to be specified later within the application form (see the Filling form values section).

Date,Water_Level
"2020 Jan 01 00:00:00.000",12.89
"2020 Jan 02 00:00:00.000",10.42
"2020 Jan 03 00:00:00.000",11.36

Excerpt 1. Sample content of a CSV input file.
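Before uploading, a file such as the one in Excerpt 1 can be checked locally. The sketch below is an assumption-based illustration and not part of the application: it reads the text with Python's csv module, verifies that every entry has one value per column, and lists empty values. The sample text repeats Excerpt 1, except that the last water level has been deliberately left empty to show how a missing value looks.

```python
import csv
import io

# Excerpt 1 with the last Water_Level value intentionally left empty.
SAMPLE = '''Date,Water_Level
"2020 Jan 01 00:00:00.000",12.89
"2020 Jan 02 00:00:00.000",10.42
"2020 Jan 03 00:00:00.000",
'''


def read_entries(text):
    """Return (header, entries) and fail if an entry has the wrong number of values."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    entries = [row for row in reader if row]  # skip completely blank lines
    for number, row in enumerate(entries, start=1):
        if len(row) != len(header):
            raise ValueError(
                f"entry {number} has {len(row)} values, expected {len(header)}"
            )
    return header, entries


header, entries = read_entries(SAMPLE)

# Collect (entry number, column name) pairs for every empty value.
missing = [
    (number, header[index])
    for number, row in enumerate(entries, start=1)
    for index, value in enumerate(row)
    if value == ""
]
print(header, len(entries), missing)
```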

Input file specification

The application requires a single file of type Catalog in CSV; see the Requirements for input file section for details.


Figure 1. Application input file specification.

Options specification

From the drop-down list (marked with (1) in Figure 2), choose the type of output file that is expected; it must be related to the input file according to the list of GDF file types. If the type is not in the list, use the generic type (Generic GDF Data). In the middle field (marked with (2) in Figure 2), you can indicate whether the selected input file has a header. In addition, a description of the file's contents can be added to the resulting file (marked with (3) in Figure 2).

Figure 2. Options specification.

If the file you have selected does not have a direct mapping to the selected result type, a message to this effect is displayed above the options fields (Figure 3).


Figure 3. Message displayed when the selected result type has no exact mapping to the GDF file types for the specified input file.

Filling form values

The application form is generated based on the specific input file: the column names from the header line are displayed in the first column (see Figure 4). By default, all columns will be present in the resulting GDF file; to exclude any of them, uncheck the box next to the column name (marked with (1) in Figure 4). To read the file content correctly, you must set the column content type (marked with (2) in Figure 4), so that the program knows how to interpret the values in that column. If the content type is Date and time, you must also set the time format (marked with (3) in Figure 4) so that the time fields can be read correctly. It is also possible to specify a different name for the column in the resulting GDF file (by default, the name is the same as in the header) or to add a description or unit. Additionally, a display format can be added to specify a custom mode of display (e.g. engineering notation) with the help of a wizard (opened with the icon marked with (4) in Figure 4; its options are shown in Figure 5).

If a column name from the file header is defined among the column names of the GDF types, all of the above properties are already filled with defaults by the system (rows from Date to Water_Level in Figure 4). For other columns, the user has to specify them.

Figure 4. Application form with most important elements marked.

Figure 5. Wizard used for display format specification.

Produced output

The result file will have the same name as the input, with only the extension changed to .mat, and it will be visualized within the application outputs. 
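The naming rule above can be expressed in a few lines. This is only a sketch of the rule as described, using Python's pathlib; the function name result_name and the file name water_level.csv are hypothetical.

```python
from pathlib import PurePosixPath


def result_name(input_name):
    """Keep the input file name and replace only its extension with .mat."""
    return PurePosixPath(input_name).with_suffix(".mat").name


print(result_name("water_level.csv"))
```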

Figure 6. Result file visualization.

Troubleshooting

The most common errors when running the application are caused by an incorrect specification of the input format: either the format of the time field or the type of a field is wrong. If the value in a date/time field does not match the time format set in the form (field marked with (3) in Figure 4), the application fails with an error stating that it cannot parse the field, and shows the content of the field on which the parsing failed; this is illustrated in Figure 7. In this example, the time field has the value 2010 Jan 01 00:00:00.000 (the same format as in the sample input file), so the format of the field is YYYY MMM dd HH:mm:ss.SSS, with the month written as a three-letter abbreviation. However, the format column in the application form was set to YYYY-MMM-dd HH:mm:ss.SSS, which expects values such as 2014-Jan-01 00:05:02.017, and therefore the system could not read the value. It is also important that all dates/times within one CSV column have the same format.

Figure 7. Application error in case of incorrect time format specification.
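The mismatch described above can be reproduced outside the Platform. The sketch below uses Python's datetime.strptime, whose format codes differ from the pattern letters used in the application form; the two Python format strings are assumed equivalents of the patterns discussed above, not the notation the application itself accepts. Parsing of %b relies on English month abbreviations, which is the default in an unmodified Python locale.

```python
from datetime import datetime

# Assumed Python equivalents of the form patterns (illustrative only).
SPACE_FORMAT = "%Y %b %d %H:%M:%S.%f"  # YYYY MMM dd HH:mm:ss.SSS
DASH_FORMAT = "%Y-%b-%d %H:%M:%S.%f"   # YYYY-MMM-dd HH:mm:ss.SSS


def matches(value, fmt):
    """Return True if the text value can be parsed with the given format."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


def first_mismatch(values, fmt):
    """Return the first value that does not match the format, or None."""
    for value in values:
        if not matches(value, fmt):
            return value
    return None


value = "2010 Jan 01 00:00:00.000"
print(matches(value, SPACE_FORMAT))  # format agrees with the value
print(matches(value, DASH_FORMAT))   # dashes expected, spaces found

# A column that mixes both formats fails at the first inconsistent value.
column = ["2010 Jan 01 00:00:00.000", "2014-Jan-01 00:05:02.017"]
print(first_mismatch(column, SPACE_FORMAT))
```

A check like first_mismatch can be run over a whole date column to confirm that every value uses the same format before the file is submitted.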
