What files can be stored with the application
The application code stored in the repository created for the application has to contain at least one file whose name matches the executableScriptName property of the Application Definition file. This file is the entry point of the application: the computation is started by calling this script. Apart from that, the repository may contain any number of other application files, optionally organized in directories. The names of the other files and directories do not have to be specified or known beforehand, but the user has to ensure that the internal function calls use the correct syntax.
When writing the code, bear in mind that it will be executed on a remote computing infrastructure, so there are some limitations and rules you have to follow:
- Do not add any GUI components to your application. It will run on a remote computing node that has no access to a graphical interface. If you need to input parameters, use the forms generated by the EPISODES Platform. If you need to display plots or other graphics, save them to files and declare those files as the application results (see the Application Definition file guide for information on how to specify them).
- Do not use any interactive inputs. At the moment, the EPISODES Platform does not support requesting user input from an application that is running on a remote computing node; this feature is envisaged but not yet implemented. Therefore, your application has to run in batch mode, where all the inputs are specified at the moment of running.
- Do not use absolute paths in your scripts, as they lead to an error when the script is executed on a remote computing infrastructure. You can still use relative paths, because the directory structure of the application repository is preserved during the computation.
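The rules above can be illustrated with a minimal Python sketch that uses only the standard library. The directory name resources, the helper names resource_path and write_summary, and the file name summary.txt are hypothetical and chosen only for illustration. The sketch also assumes that the working directory during the computation is the root of the application repository.

```python
from pathlib import Path

# Hypothetical layout: auxiliary files live in a "resources" directory
# inside the application repository.
RESOURCES_DIR = Path("resources")


def resource_path(file_name):
    """Build a path relative to the working directory; reject absolute paths."""
    path = RESOURCES_DIR / file_name
    if path.is_absolute():
        raise ValueError(f"absolute paths are not portable: {path}")
    return path


def write_summary(values, output_name="summary.txt"):
    """Write results to a file instead of printing or displaying them."""
    output_path = Path(output_name)
    output_path.write_text("\n".join(str(value) for value in values) + "\n")
    return output_path
```

A file written this way would still have to be declared as an application result in the Application Definition file.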
The script selected as the executable of the application (with the executableScriptName property) has to be written in the language set in scriptLanguage in the Application Definition file. It also has to declare its inputs and outputs in accordance with that file. As the format of the script and of the loaded data depends on the programming language, they are described on a per-language basis in the following sections.
MATLAB
When choosing MATLAB (or Octave) as the programming language, the file must contain a main function with the same number of inputs and outputs as defined in the Application Definition file. Before the script is executed, the data is read from all the input files and passed to the function in the order in which the files were declared in the Application Definition file. With the default multiplicity setting (single file) for an input file (see the Application Definition file structure description), the file is read into a MATLAB variable, which is passed directly as an argument to the function. If the multiplicity is set to any other value, all the files provided by the user for this input are read into separate MATLAB variables and packed into a single cell array, which is passed as a single argument to the function. Note that the order of variables in the cell array is not guaranteed (see the information on the multiplicity and typeLabel properties in the Application Definition file structure description). The input parameter values (which are of simple types) are passed directly as arguments to the function right after the input file variables, in the same order in which they were declared in the Application Definition file. After the script is executed, the outputs declared with the isReturnedValue property set to true are automatically saved to .mat files (other outputs have to be saved manually inside the function).
The input and output variables defined in the function can have any names as long as they are allowed by MATLAB syntax.
Example
The example below shows the connection between the Application Definition file (Excerpt 1) and the script containing the main function (Excerpt 2), including the order of arguments in that function.
{
    "scriptLanguage" : "MATLAB",
    "executableScriptName" : "sampleFunction.m",
    "inputFiles" : [ "integer_vector", "string_vector" ],
    "inputParameters" : [ "TEXT", "DOUBLE" ],
    "outputs": [
        { "dataType" : "double_vector", "isReturnedValue" : true },
        { "dataType" : "boolean_vector", "isReturnedValue" : true },
        { "fileName" : "plot.png" }
    ],
    "requiredTools": [ "octave" ]
}
Excerpt 1. Content of the Application Definition file with two input files, two input parameters and three outputs.
function [outputDoubleValues, outputBooleanValues] = sampleFunction(intValues, stringValues, text, multiplicator)
    outputDoubleValues = double(intValues) * multiplicator;
    outputBooleanValues = strcmp(stringValues, text);
    plot(outputDoubleValues);
    print('-dpng', 'plot.png');
end
Excerpt 2. Executable script of the application (sampleFunction.m), containing the main function.
Knowing that the order of inputs and outputs of the function corresponds to the order in which they are defined in the Application Definition file (Excerpt 1), and that inputFiles variables come before inputParameters variables, in the above example the order of variables would be as follows:
- intValues - corresponds to the input file with type 'integer_vector'
- stringValues - corresponds to the input file with type 'string_vector'
- text - corresponds to the input parameter with type TEXT
- multiplicator - corresponds to the input parameter with type DOUBLE
- outputDoubleValues - corresponds to the output with type 'double_vector'
- outputBooleanValues - corresponds to the output with type 'boolean_vector'
Additionally, the script produces another file, the image plot.png. It is not declared in the function outputs but is created inside the function.
PYTHON
When choosing Python as the programming language, your script must follow a few clear rules to be executed correctly on the EPISODES Platform.
Script Requirements
- Main Function: Your script must include a main function.
- File Name Match: The script filename must exactly match the executableScriptName defined in the Application Definition file. Example: if "executableScriptName": "sample_function.py", the file must be named sample_function.py.
- Input Arguments: All inputs are passed as arguments to the main function in the order declared in the Application Definition file.
Your main function arguments must follow this order:
- Input Files: In the exact order defined under inputFiles in the Application Definition file.
- Input Parameters: Follow after input files, in the exact order defined under inputParameters in the Application Definition file.
Example:
If your Application Definition has:
"inputFiles": ["double_vector", "string_vector"],
"inputParameters": ["TEXT", "DOUBLE"]
Then your main function signature should look like:
def main(double_values_path, string_values_path, text, scale_factor):
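The ordering rule can also be expressed as a short sketch that derives the expected positional order from an Application Definition fragment. The helper name expected_argument_order is hypothetical and not part of the EPISODES Platform, and the sketch assumes the default multiplicity (one file per input).

```python
import json

# Fragment of an Application Definition, as in the example above.
definition = json.loads("""
{
    "inputFiles": ["double_vector", "string_vector"],
    "inputParameters": ["TEXT", "DOUBLE"]
}
""")


def expected_argument_order(definition):
    """List (kind, type) pairs in the order main() receives them."""
    files = [("file", data_type) for data_type in definition.get("inputFiles", [])]
    parameters = [("parameter", data_type) for data_type in definition.get("inputParameters", [])]
    # Input files always come first, input parameters follow.
    return files + parameters


for position, (kind, data_type) in enumerate(expected_argument_order(definition)):
    print(position, kind, data_type)
```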
Input Files Handling:
- Input files are passed as relative paths to the files.
Handling Multiple Files (multiplicity > 1):
- Files defined with multiplicity > 1 are passed separately as arguments.
- The order of multiple files isn't guaranteed; manage accordingly inside your script.
- See the information on the multiplicity and typeLabel properties in the Application Definition file structure description.
- You are responsible for loading the files using Python tools.
Example for .mat files:
from scipy.io import loadmat
data = loadmat(file_path)
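Note that scipy.io.loadmat returns a dictionary that contains metadata entries (keys such as __header__, __version__ and __globals__) alongside the stored variables. The sketch below filters those entries out. The helper name stored_variables is hypothetical, the dictionary is a stand-in for a real loadmat result, and nothing is assumed about the variable names used inside the files provided by the platform.

```python
def stored_variables(mat_contents):
    """Return only the user variables from a loadmat() result dictionary."""
    return {
        name: value
        for name, value in mat_contents.items()
        # loadmat() adds metadata keys wrapped in double underscores.
        if not (name.startswith("__") and name.endswith("__"))
    }


# Stand-in for the dictionary returned by loadmat(file_path).
example_contents = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "__globals__": [],
    "values": [1.0, 2.0, 3.0],
}
print(stored_variables(example_contents))
```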
Output Handling
- Result files must be manually saved in the script using appropriate Python tools.
- Use the exact filenames declared in your Application Definition file under outputs.
Example (.mat files):
from scipy.io import savemat
savemat('output_file_name.mat', {'variable_name': variable})
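Because result files are not saved automatically, it can help to verify at the end of the script that every declared file was actually written. The following is a minimal sketch of such a check. The helper name missing_outputs is hypothetical, and the list passed to it should contain the file names you declared under outputs.

```python
from pathlib import Path


def missing_outputs(declared_file_names):
    """Return the declared output files that do not exist as files."""
    return [name for name in declared_file_names if not Path(name).is_file()]


# Example usage at the end of main(); the name matches the savemat example above.
missing = missing_outputs(["output_file_name.mat"])
if missing:
    print("Missing result files:", ", ".join(missing))
```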
Example
The example below shows the connection between the Application Definition file (Excerpt 3) and the script containing the main function (Excerpt 4) and the order of arguments in that function.
{
    "scriptLanguage" : "PYTHON",
    "executableScriptName" : "sample_function.py",
    "inputFiles" : [ "double_vector", "string_vector" ],
    "inputParameters" : [ "TEXT", "DOUBLE" ],
    "outputs": [
        {
            "dataType" : "double_vector",
            "fileFormat" : "MAT",
            "fileName" : "output_double_values.mat",
            "isReturnedValue" : false
        },
        {
            "dataType" : "boolean_vector",
            "fileFormat" : "MAT",
            "fileName" : "output_boolean_values.mat",
            "isReturnedValue" : false
        },
        {
            "dataType" : "image_data",
            "fileFormat" : "PNG",
            "fileName" : "plot.png"
        }
    ],
    "requiredTools" : [ "python" ]
}
Excerpt 3. Content of the Application Definition file with two input files, two input parameters and three outputs.
import numpy as np
from scipy.io import loadmat, savemat
import matplotlib.pyplot as plt

def main(double_values_path, string_values_path, text, scale_factor):
    # 1. Load and flatten the arrays from the .mat files
    loaded_doubles = loadmat(double_values_path)
    double_values = next(array for array in loaded_doubles.values() if isinstance(array, np.ndarray)).ravel().astype(float)
    loaded_strings = loadmat(string_values_path)
    string_values = np.array([str(item) for item in next(array for array in loaded_strings.values() if isinstance(array, np.ndarray)).ravel()])

    # 2. Scale the numeric values and create a boolean vector based on the string match
    scaled_doubles = double_values * scale_factor
    boolean_mask = (string_values == text)

    # 3. Save the scaled numeric values and boolean vector to .mat files
    savemat('output_double_values.mat', {'output_double_values': scaled_doubles})
    savemat('output_boolean_values.mat', {'output_boolean_values': boolean_mask})

    # 4. Plot the scaled numeric values and save the figure
    plt.plot(scaled_doubles)
    plt.savefig('plot.png')
    plt.close()
Excerpt 4. Executable script of the application (sample_function.py), containing the main function.
Argument and File Correspondence
The order of the inputs of the function corresponds to the order in which they are defined in the Application Definition file (Excerpt 3), and the outputs are saved under the file names declared there:
Inputs:
- double_values_path → input file with type "double_vector"
- string_values_path → input file with type "string_vector"
- text → input parameter "TEXT"
- scale_factor → input parameter "DOUBLE"
Outputs (saved manually within script):
- output_double_values.mat → output with type double_vector
- output_boolean_values.mat → output with type boolean_vector
- plot.png → output with type image_data
When an input file is defined with a multiplicity greater than 1, each file is passed as a separate argument. The order is not guaranteed, so the script should handle this internally.
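One possible way to handle this is sketched below. It assumes a single input with multiplicity greater than 1 whose file paths arrive as the leading positional arguments, followed by a known number of input parameters. This layout is an assumption made for illustration, not documented platform behaviour, and split_arguments is a hypothetical helper. Sorting by file name is just one way to obtain a deterministic order.

```python
from pathlib import Path

PARAMETER_COUNT = 2  # assumption: two input parameters, as in Excerpt 3


def split_arguments(arguments, parameter_count=PARAMETER_COUNT):
    """Separate leading file paths from trailing parameter values."""
    if len(arguments) < parameter_count:
        raise ValueError("fewer arguments than declared parameters")
    boundary = len(arguments) - parameter_count
    file_paths = list(arguments[:boundary])
    parameters = list(arguments[boundary:])
    # The file order is not guaranteed, so impose one here (by file name).
    file_paths.sort(key=lambda path: Path(path).name)
    return file_paths, parameters


# Example: two files in arbitrary order, then the TEXT and DOUBLE parameters.
file_paths, parameters = split_arguments(("b.mat", "a.mat", "label", 2.0))
print(file_paths, parameters)
```

If the meaning of each file matters rather than just a stable order, the script would need to identify files by their content or name instead of by position.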