Workflows in the EPISODES Platform are a mechanism that allows for additional organization of data processing inside the workspace. They let you configure the execution of applications so that one application starts automatically after another, fed with data that is either specified beforehand or produced by the preceding applications.
With this guide you will learn about workflows - a mechanism that lets you automate and manage data processing inside your workspace. We will explain how workflows are constructed and how to use and edit them. Workflows are similar to regular EPISODES Platform applications, so basic knowledge of what these applications are and how they are run is a prerequisite for using workflows. If you are not yet familiar with working with applications, please read the Applications Quick Start Guide first.
A workflow is a sequence of tasks that processes a set of data. It can perform, in an automated way, several actions that would otherwise have to be invoked one by one. Within the EPISODES Platform, the tasks composing a workflow are applications - any set of applications present in your workspace can become a workflow. We will demonstrate the workflow mechanism on an example that uses the application Ground Motion Prediction Equations: GMPE calculation and two other applications that prepare data for it. The sequence of applications used is: Ground Motion Parameters Catalog builder, then Catalog Filter, then Ground Motion Prediction Equations: GMPE calculation.
Normally, when used in the workspace, the three applications have to be managed separately. To run the last application (Ground Motion Prediction Equations: GMPE calculation), you first have to run the Ground Motion Parameters Catalog builder, then use its results for Catalog Filter, and finally create and run Ground Motion Prediction Equations: GMPE calculation on the results of the latter. The amount of work required to run such a sequence grows with the number of component applications, whereas with a workflow the whole sequence is reduced to one operation. Figure 1 compares this sequence of applications located in the workspace with a workflow created from them.
Figure 1. Comparison of application sequence in workspace and a workflow created from this sequence
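The difference between the two modes can be sketched conceptually in Python. This is only an illustration and not the EPISODES Platform API: the three functions are hypothetical stand-ins for the applications, and `run_workflow` is an assumed helper that chains them so that each result feeds the next step.

```python
from functools import reduce
from typing import Callable, List

# Hypothetical stand-ins for the three applications. Each one takes the
# previous result and returns a new one (here just a list of step labels).
def gm_parameters_catalog_builder(data: List[str]) -> List[str]:
    return data + ["GM parameters catalog built"]

def catalog_filter(data: List[str]) -> List[str]:
    return data + ["catalog filtered"]

def gmpe_calculation(data: List[str]) -> List[str]:
    return data + ["GMPE calculated"]

Step = Callable[[List[str]], List[str]]

def run_workflow(steps: List[Step], initial: List[str]) -> List[str]:
    """Run the steps in order, feeding each result into the next step."""
    return reduce(lambda data, step: step(data), steps, initial)

# Manual mode: three separate operations, each started by the user.
built = gm_parameters_catalog_builder(["input catalog"])
filtered = catalog_filter(built)
manual = gmpe_calculation(filtered)

# Workflow mode: one operation that runs the whole sequence.
workflow = [gm_parameters_catalog_builder, catalog_filter, gmpe_calculation]
automated = run_workflow(workflow, ["input catalog"])
```

Both modes produce the same result; the workflow only removes the need to start each step by hand.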
A workflow constructed within the EPISODES Platform assumes that the user (a scientist) experiments with the data and with the analyses performed by the applications in the workspace and, on finding the combination of data and parameters that best suits their research, saves it as a workflow. While the workflow itself is a well-known concept, the approach presented here differs considerably from the most common workflow solutions (see Related publications).
A workflow can be created by choosing the Transform to workflow action (see Figure 2) from the file menu of any directory in the workspace, except the root directory (/). Note that an application is also a directory, so you can also create a workflow from an application item in the workspace.
Figure 2. File menu displayed for a directory in workspace, with the Transform to workflow action marked in red. The marking in grey shows the content that will become internal to the created workflow
The Transform to workflow action creates a workflow from the whole sub-tree starting at the directory on which the action was initiated. In Figure 2 this is the application GMParametersBuilder, and the workflow will contain the part of the tree marked in grey. After this action, the workspace will contain only a single entry marked with "W" (meaning workflow), listing the results from all the applications included in the workflow (you will learn how to configure which results are shown later in this guide). In the example from Figure 2, creating the workflow produces the result shown in Figure 3.
The workflow item in the workspace (see the GMParametersBuilder item in Figure 3) is similar to an application item: it is displayed as a directory with a colorful status icon on the right. As with other workspace items, it has an action menu available (see Figure 3). The Expand action is specific to workflows and is the reverse of creating a workflow (the Transform to workflow operation, visible in Figure 2): it transforms the workflow back into a regular directory. Other actions are similar to those available for a directory or an application (compare the My Workspace Quick Start Guide and the Applications Quick Start Guide). Note, however, that the Rename action changes only the name of the workflow, not the name of the underlying directory. Actions like Upload are not available for a workflow, because the directory structure inside is not visible once the workflow is created and the result of such an action could be ambiguous (see also the Editing workflow section).
Note also that if the base workflow directory (in our example, the GMParametersBuilder directory) contained any files that were not used by any of the applications, they will simply be hidden from you, as the workflow shows only the content related to the applications inside. Therefore, if you want to have access to these files once the workflow is created, move them outside of its directory.
Creating a workflow is a fully reversible operation, as it changes neither the logical structure of the applications nor the files inside. If any amendments are needed within the underlying directory structure, the workflow can be expanded (the Expand action described earlier) and created again after the necessary corrections.
Figure 3. Directories from Figure 2 after creating the workflow, with file menu displayed
Note that, in the example above, some inputs (LGCD_Catalog.mat and LGCD_GM_Catalog_1.mat in Figure 3) are outside of the GMParametersBuilder directory. Files like these, that is, inputs of the applications inside the workflow that lie outside of the transformed directory, become workflow inputs (see also the description below and Figure 4).
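The rule for deciding which files become workflow inputs can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the function `workflow_inputs`, the application name `CatalogFilter`, the paths, and the file name `gm_parameters.mat` are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, List

def workflow_inputs(root: str, app_inputs: Dict[str, List[str]]) -> List[str]:
    """Return the application inputs that lie outside the workflow root."""
    root_path = PurePosixPath(root)
    external: List[str] = []
    for inputs in app_inputs.values():
        for item in inputs:
            # Files inside the root are passed between applications
            # internally, so they do not become workflow inputs.
            inside = root_path in PurePosixPath(item).parents
            if not inside and item not in external:
                external.append(item)
    return external

# Hypothetical layout loosely following Figure 3.
app_inputs = {
    "GMParametersBuilder": ["/LGCD_Catalog.mat", "/LGCD_GM_Catalog_1.mat"],
    "CatalogFilter": ["/GMParametersBuilder/gm_parameters.mat"],
}
inputs = workflow_inputs("/GMParametersBuilder", app_inputs)
```

In this sketch only the two catalog files outside the transformed directory are reported as workflow inputs; the intermediate file inside the directory stays internal.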
As with any other application or file inside the workspace, a workflow can be displayed by clicking on its item in the workspace tree (the GMParametersBuilder item in Figure 3). Once the workflow is shown, you can see the following contents:
LGCD_Catalog.mat and LGCD_GM_Catalog_1.mat files were outside the GMParametersBuilder directory. Other inputs are hidden because, if they were inside the directory, they were passed from one application to another. You can modify them after expanding the workflow back into a directory.
Each of these contents is displayed in the same order in which the applications were added to the source directory.
Figure 4. View of an open workflow with the most important elements marked
Editing a workflow lets you customize which forms and outputs are visible within the workflow view (Figure 4). Not all features of a workflow are editable, since some of them depend on the structure of the underlying directory. For this reason you cannot add or remove applications, nor change their order. Neither can you change which files are the inputs to the workflow, as they are determined by the inputs of the applications inside (see also the previous section). To change the workflow structure, you need to expand the workflow (see the previous section) and create it again.
To edit the workflow, choose the Edit / debug workflow button marked with (6) in Figure 4. This switches the workflow view into an editable form: a view similar to the previous one, but with additional options that control the editable features. Each of them is described below in the order in which it appears in the workflow edit view. We recommend editing the workflow only after its structure is well established, as expanding it will erase all the changes made while editing.
Figure 5 shows the part of the editing view that lets you edit the workflow description and application names. The workflow Description is a summary that you can fill in (by editing the field marked with (1) in Figure 5) with your own description of what the workflow does and what it should be used for. By default it is filled with the list of applications that constitute the workflow. The application names are your custom names displayed next to the individual inputs, forms, statuses and outputs of each application (see
Figure 5. Part of workflow editing view containing workflow description and application names, with most important elements marked, filled with default values (upper figure) and after changing to custom names (lower figure)
Figure 6. Workflow from Figure 3 after changing the application names as in Figure 5
Figure 7 shows the part of the editing view that lets you edit the visibility of workflow forms. The form for each application can be shown or hidden by using the '+' and '-' buttons, respectively. After using the '-' button (marked with (1) in Figure 7), the form is added to Hidden forms (marked with (2) in Figure 7). It can be made visible again by using the '+' button (marked with (3) in Figure 7). Hidden forms are not displayed in the main workflow view, which also means that their parameters cannot be edited when running the workflow. By default, all forms are visible.
Figure 7. Part of workflow editing view containing controls of visibility of forms, with most important elements marked
A similar mechanism applies to the outputs of the applications: their visibility can be changed with the same '+' and '-' buttons as for forms. If an application output is set as hidden, it is additionally removed from the workflow view in the workspace tree (see Figure 8).
Figure 8. Workflow from Figure 6 after hiding the outputs of the first two applications
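The visibility rules for forms and outputs can be modelled with a small sketch. It is purely illustrative: the class `WorkflowView`, its methods, and the short application names `CatalogFilter` and `GMPE` are hypothetical and do not correspond to any platform API.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class WorkflowView:
    """Hypothetical model of form and output visibility in a workflow."""
    apps: List[str]
    hidden_forms: Set[str] = field(default_factory=set)
    hidden_outputs: Set[str] = field(default_factory=set)

    def hide_form(self, app: str) -> None:
        # Corresponds to the '-' button: the form moves to "Hidden forms".
        self.hidden_forms.add(app)

    def show_form(self, app: str) -> None:
        # Corresponds to the '+' button: the form becomes visible again.
        self.hidden_forms.discard(app)

    def hide_output(self, app: str) -> None:
        self.hidden_outputs.add(app)

    def visible_forms(self) -> List[str]:
        return [a for a in self.apps if a not in self.hidden_forms]

    def tree_outputs(self) -> List[str]:
        # Hidden outputs are also removed from the workspace tree entry.
        return [a for a in self.apps if a not in self.hidden_outputs]

view = WorkflowView(apps=["GMParametersBuilder", "CatalogFilter", "GMPE"])
all_visible_by_default = view.visible_forms()

view.hide_form("CatalogFilter")
after_hiding_form = view.visible_forms()
view.show_form("CatalogFilter")

# As in Figure 8: hide the outputs of the first two applications.
view.hide_output("GMParametersBuilder")
view.hide_output("CatalogFilter")
remaining_outputs = view.tree_outputs()
```

Keeping the hidden items in separate sets means that hiding never changes the list or order of applications, which matches the restriction that the workflow structure itself is not editable.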
The last part of the workflow edit view is dedicated to workflow debugging (see Figure 9). The controls in this part give you more insight into the workflow structure and let you change inputs in ways that are otherwise unavailable. However, be careful: manipulating this part can unexpectedly change the workflow structure, causing the workflow view to become desynchronized from the actual structure.
Figure 9. Part of workflow editing view containing workflow debugging options
The changes made while editing the workflow can be saved using the Save workflow template button in the lower right corner (see Figure 10). After saving, the workflow view is displayed again (as in Figure 4) with the changes applied.
Figure 10. Workflow editing view with the Save workflow template button visible
Once a workflow is created, you have only limited options for reorganizing the applications inside and the data flow between them (these options are described in the section Editing the workflow). If, after creating it, you see that the workflow should have different inputs, or that the flow of data between the applications inside should be organized differently, you can always expand it with the Expand workflow action accessible from the workspace item menu (see Figure 11). This operation returns the workspace to the state before the workflow was created (in our example, the workspace tree goes back to the situation from Figure 2). You can then rearrange the individual applications, add or delete them and, when ready, create the workflow anew. Creating and expanding workflows operates only on file metadata: the files are neither moved nor copied, so these operations do not affect the space used in your workspace and do not consume much computing resources. You can therefore create and expand workflows as often as you like.
Note that, when expanding a workflow, all the configuration made while Editing the workflow will be lost. We therefore advise editing the workflow only after its structure is settled.
Figure 11. File menu displayed for a directory in workspace, with the Expand action marked in red.
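The create/expand cycle can be pictured as toggling a piece of metadata on a directory while leaving its files untouched. The sketch below is an assumption-laden illustration: the `Directory` class, the `workflow_config` field, both functions, and the file names are hypothetical, and the real platform may store this information differently.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class Directory:
    """Hypothetical workspace directory; `files` stands for stored data."""
    name: str
    files: Tuple[str, ...]
    # Metadata only: None means a regular directory, a dict means a workflow.
    workflow_config: Optional[Dict[str, str]] = None

def transform_to_workflow(directory: Directory) -> None:
    # Only metadata is written; a fresh workflow starts with default settings.
    directory.workflow_config = {"description": "default"}

def expand(directory: Directory) -> None:
    # Back to a regular directory; any edited configuration is discarded.
    directory.workflow_config = None

d = Directory("GMParametersBuilder", files=("catalog.mat", "result.mat"))
files_before = d.files

transform_to_workflow(d)
d.workflow_config["description"] = "My GMPE pipeline"  # edit the workflow
expand(d)                 # the edited description is lost here
transform_to_workflow(d)  # created anew with the default configuration
```

After the full cycle the files are exactly as before, while the custom description is gone, which is why editing is best left until the structure is settled.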
As we said in one of the first sections, a workflow is similar to an application. This also means that it can be used inside another workflow, in the same way as an application. Expanding on the example used in the previous sections (the workflow created as in Figure 1), we can now add another application before the GMParametersBuilder workflow, namely the CatalogMerger application, which merges two Ground Motion Catalogs into one. The result is used in the GMParametersBuilder workflow instead of the original LGCD_GM_Catalog_1.mat file (see Figure 12). Workflows can be nested like this any number of times.
Figure 12. Workflow created on top of another workflow and an application
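Nesting works because a workflow can stand wherever an application can. A minimal sketch of this idea, assuming a hypothetical `Workflow` class and stand-in applications built with a hypothetical `app` helper (none of these are platform APIs, and `CatalogFilter` and `GMPE` are illustrative short names):

```python
from typing import Callable, List

Step = Callable[[List[str]], List[str]]

class Workflow:
    """Hypothetical workflow: callable like an application, so it can nest."""

    def __init__(self, name: str, steps: List[Step]) -> None:
        self.name = name
        self.steps = steps

    def __call__(self, data: List[str]) -> List[str]:
        # Run every step in order, passing each result to the next step.
        for step in self.steps:
            data = step(data)
        return data

def app(label: str) -> Step:
    """Build a stand-in application that appends its label to the trace."""
    return lambda data: data + [label]

# Inner workflow, as created in the earlier sections.
inner = Workflow(
    "GMParametersBuilder",
    [app("GMParametersBuilder"), app("CatalogFilter"), app("GMPE")],
)

# Outer workflow: CatalogMerger runs first, then the whole inner workflow.
outer = Workflow("CatalogMerger", [app("CatalogMerger"), inner])
trace = outer([])
```

Because `inner` is used as an ordinary step, the same construction can be repeated to nest workflows to any depth.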
Makuch, M., M. Malawski, J. Kocot, and T. Szepieniec (2020) Applying workflows to scientific projects represented in file system directory tree. In: 2020 IEEE/ACM Workflows in Support of Large-Scale Science (WORKS), pp. 25-32, https://doi.org/10.1109/WORKS51914.2020.00009
Makuch, M., M. Malawski, J. Kocot, and T. Szepieniec (2022) Model and system for scientific workflows represented in file system directory tree. In: 2022 Future Generation Computer Systems, ISSN 0167-739X, https://doi.org/10.1016/j.future.2022.03.023