Designing your first workflow
We now have a basic understanding of how to call a conversion node on our data and how to visualize the results. Now let's put it all together and create our first workflow, spanning the most basic nodes.
Here we will take our uploaded image, max project it, threshold it (yeah, analysis), and then measure the fraction of the image that is above the threshold (yeah, quantification). In the course of this tutorial you will get to understand:
- What even are workflows?
- What is a Workflow Scheduler?
- How to create a workflow?
- How to deploy a workflow on a Scheduler?
Before we start
You are familiar with this by now: there are a few things we need to do before we can start.
First, what do we mean by a Workflow?
A workflow, in the Arkitekt sense, is a processing pipeline that uses a series of Nodes to process your data in a stream. These are nodes just like the previously mentioned Show on Napari or Convert File nodes. You can either stitch them together in the GUI, or import them from a file or even from this website. We will do the first, which will hopefully also help you familiarize yourself a bit more with the UI.
Let's look first at the workflow we would like to create, and then we will go through the steps to create it.
Show Workflow
This is the workflow we would like to create. For now we have disabled the import feature, so you should really try to create it on your own.
This is probably the most basic workflow you can create, but it will teach you a lot about the Arkitekt Workflow and its design. A few things to note here:

- This is an Arkitekt Workflow that we just exported from Arkitekt and then embedded in this website. Arkitekt workflows are just visual representations of a processing workflow. They are stored in a JSON file and can be imported and exported from Arkitekt. You can also import them from this website, but we will get to that later. (The first sketch after this list illustrates the underlying idea.)
- About the Nodes: Nodes in Arkitekt Workflows are front and center. They are the building blocks of your workflow, and thus the building blocks of your analysis. As you have seen in the previous section, every Node has inputs and outputs that you can connect to other nodes. This connection then defines the flow of data through your workflow. Importantly, you will notice the nodes termed `Input` and `Output`. These are special nodes that are part of every workflow, and they are the entry and exit points of your analysis. When you connect a node to the `Input` node, you are telling Arkitekt that your workflow will expect the input type of that node as a parameter when you run it. Equally, when you connect a node to the `Output` node, you are telling Arkitekt that your workflow will return the output type of that node when you run it. Workflows are just Nodes: this abstraction of `Input` and `Output` nodes is core to the concept of a workflow in Arkitekt. Each workflow has exactly one `Input` and one `Output` node. And as our workflows are just nodes, these inputs and outputs will then be the inputs and outputs of the workflow node. This means that you can use workflows in workflows, and you can use workflows just as nodes on your data. But we will get to that later.
- About their colors: If you have connected the website and followed the tutorial until now, you might notice that the nodes are colored yellow. This is because we have not yet installed apps that provide the functionality for the nodes in this workflow. This illustrates another core feature of Arkitekt: the separation of workflow design from workflow execution. You can design and share a workflow irrespective of the apps that provide the functionality for its nodes. This means that you can design a workflow and share it with others, even though they might run it on completely different apps. This makes workflows a great way to share analysis pipelines, and to make them reproducible and universal.
- About the data as a stream: Arkitekt workflows are designed to process your data as a stream rather than as a batch. Each node processes your data as it comes in and then passes it on to the next node. Nodes do not wait for all data to arrive, but process it autonomously as it comes in. This means that you can process data ridiculously fast, and, importantly, you can process data that is still being generated. This is a core feature of Arkitekt, and we will get to it later. (The second sketch after this list illustrates the idea in plain Python.) You will also note that the edges are labeled with `@mikro/representation` and `@mikro/metric`. These labels correspond to the data types that are passed between the nodes. The `@` symbol indicates that these are mikro data types, and `representation` and `metric` indicate the type of data. The first two nodes transform an image into an image (images are represented as a `representation`), and the last node returns a `metric` (metrics are scalar values attached to an image, here the fraction). This `metric` will also be the output of our workflow.
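To make the first point more concrete, here is a purely illustrative sketch, written as a Python dictionary, of the kind of information a workflow export captures: a list of nodes and a list of typed edges between them. The node names and field names are placeholders and this is not the actual Arkitekt JSON schema.

```python
# Illustrative only: NOT the real Arkitekt workflow schema.
# A workflow export boils down to a graph: nodes plus typed edges.
workflow = {
    "nodes": [
        {"id": "input",   "kind": "Input"},               # entry point of the workflow
        {"id": "project", "kind": "Maximum Projection"},  # placeholder node names
        {"id": "otsu",    "kind": "Otsu Threshold"},
        {"id": "measure", "kind": "Measure Fraction"},
        {"id": "output",  "kind": "Output"},              # exit point of the workflow
    ],
    "edges": [
        {"source": "input",   "target": "project", "type": "@mikro/representation"},
        {"source": "project", "target": "otsu",    "type": "@mikro/representation"},
        {"source": "otsu",    "target": "measure", "type": "@mikro/representation"},
        {"source": "measure", "target": "output",  "type": "@mikro/metric"},
    ],
}
```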
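And to illustrate the streaming point without any Arkitekt code at all, here is a plain-Python analogy using generators: each stage handles items as they arrive and hands them on immediately, so the last stage starts producing results before the first stage has finished.

```python
# Plain-Python analogy of stream processing; no Arkitekt API involved.
def acquire_images():
    """Pretend the microscope is still acquiring while we analyze."""
    for i in range(5):
        yield f"image-{i}"

def max_project(images):
    for image in images:
        yield f"projected({image})"    # passed downstream immediately

def otsu_threshold(images):
    for image in images:
        yield f"thresholded({image})"  # no waiting for the full batch

# The final stage receives its first item before acquisition is done.
for result in otsu_threshold(max_project(acquire_images())):
    print(result)
```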
Enough talking
Let's start by creating this workflow. First we need to install the apps that provide the functionality for the nodes in this workflow. We will need two new plugins: one provides the functionality that the nodes in this workflow need, and the other will be used to actually run the workflow in the background. So let's install them.
New plugins
Make sure to install the stdlib and reaktor apps. You can do that by clicking on the Install button below, or the classic way, by adding these Repositories to your plugin store.
Now follow the previously outlined steps to Appify the latest stdlib and reaktor versions, and deploy them to your server.
You should now be able to search for Otsu Threshold in the dashboard search bar, and find the node we just installed.
Creating the workflow
Now that we have the Nodes we need, let's create the workflow. For this we can finally go to the Workflows tab in the sidebar.
Here we can see a list of all our workflows, and we can create a new one by clicking on the "Create Workflow" button.
You will be presented with the Arkitekt Workflow Designer, which is a drag and drop interface for creating workflows.
You can drag Nodes from the nodes sidebar into the workflow, and connect them by dragging the output of one Node to the input of another.
Let's see the design in progress:
- Open the Workflows tab in the sidebar: The Workflows tab is where you can create and manage the workflows that you run on your data.
- Click on "Create Workspace" on the bottom right: Give it a name like "Analysis Run" and click on "Create". A workspace is a place to create and manage versions of your workflow. Workflows are automatically versioned, which means that if you change a workflow, you create a new version of it. This is important for the reproducibility and traceability of your analysis.
- You are now presented with the Arkitekt Workflow Designer: The Arkitekt Workflow Designer is a drag and drop interface for creating workflows. You can drag nodes from the nodes sidebar into the workflow, and connect them by dragging the output of one node to the input of another. Just search for your nodes in the search bar and drag them into the workflow. Make sure to connect the `Input` and `Output` nodes to your workflow, as they are required for every workflow.
- Set the necessary parameters in the sidebar: Some nodes require you to specify parameters. You can do that by clicking on the node and then setting the parameters in the sidebar. For example, the `Otsu Threshold` node lets you specify whether to apply a Gaussian blur before thresholding. This is not necessary for our workflow, so we can leave it at the default value. However, we do want to change the value that should be measured by the `Measure Fraction` node. We are interested in the fraction of the image that is below the threshold, so we change it to 0. We can also rename the metric key to "Background Fraction" to be more descriptive. (The sketch at the end of this section shows roughly what these three nodes compute.)
- Save the workflow: You can save the workflow by clicking on the "Save" button on the bottom right. This will save the workflow to your server.
Once saved, your workflow is immediately available to be run on your data. You don't need to deploy it manually anymore.
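As a reference for what the workflow actually computes, here is a rough batch-style sketch of the three processing nodes in plain numpy/scikit-image. This is not how Arkitekt executes the workflow (Arkitekt streams the data through the deployed apps), and the function and parameter names are made up for illustration.

```python
# Batch-style sketch of what the three nodes compute; illustration only.
import numpy as np
from skimage.filters import gaussian, threshold_otsu

def background_fraction(stack: np.ndarray, blur_sigma: float = 0.0) -> float:
    """Max-project a z-stack, Otsu-threshold it, and return the fraction
    of pixels below the threshold (our "Background Fraction" metric)."""
    projected = stack.max(axis=0)                      # max projection node
    if blur_sigma > 0:                                 # optional Gaussian blur before thresholding
        projected = gaussian(projected, sigma=blur_sigma)
    binary = projected > threshold_otsu(projected)     # Otsu Threshold node
    return float((binary == 0).mean())                 # Measure Fraction node, value 0 (below threshold)

# Example on a random 10-slice stack of 512x512 images
print(f"Background Fraction: {background_fraction(np.random.rand(10, 512, 512)):.3f}")
```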
Running the workflow
Running the workflow on our data should be quite straightforward. Let's see that in the next section...