REANA-Workflow-Controller

REANA-Workflow-Controller is a component of the REANA reusable and reproducible research data analysis platform. It takes care of instantiating and managing computational workflows.

Features

  • start workflow execution

  • control workflow steps

  • support for several workflow specifications (CWL, Yadage, Serial)

Usage

Detailed information on how to install and use REANA can be found at docs.reana.io.

REST API

The REANA-Workflow-Controller component offers a REST API for managing workflows. Detailed REST API documentation can be found here.

REANA Workflow Controller workflows REST API.

reana_workflow_controller.rest.workflows.create_workflow()

Create workflow and its workspace.

reana_workflow_controller.rest.workflows.get_workflow_diff(workflow_id_or_name_a, workflow_id_or_name_b)

Get diff between two workflows.

reana_workflow_controller.rest.workflows.get_workflow_parameters(workflow_id_or_name)

Get workflow parameters.

reana_workflow_controller.rest.workflows.get_workflow_retention_rules(workflow_id_or_name: str, user: str)

Get the retention rules of a workflow.

reana_workflow_controller.rest.workflows.get_workflows(args, paginate=None)

Returns all workflows.
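
The paginate argument of get_workflows suggests that the workflow list is returned one page at a time. The following is a minimal sketch of such a pagination step, not the controller's actual implementation: the paginate_items name, the page and size parameters, and the keys of the returned dictionary are all assumptions made for illustration. The total is computed once and reused for the has_next flag.

```python
from typing import Any, Dict, List, Sequence


def paginate_items(items: Sequence[Any], page: int = 1, size: int = 10) -> Dict[str, Any]:
    """Return one page of ``items`` plus the metadata needed to request others.

    Hypothetical helper: parameter names and result keys are illustrative.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive integers")
    total = len(items)  # counted once, then reused for the has_next flag
    start = (page - 1) * size
    page_items: List[Any] = list(items[start : start + size])
    return {
        "items": page_items,
        "total": total,
        "page": page,
        "size": size,
        "has_prev": page > 1,
        "has_next": start + size < total,
    }
```

Requesting a page past the end of the list returns an empty items list rather than raising, so a client can stop paging as soon as has_next is false.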

REANA Workflow Controller interactive sessions REST API.

reana_workflow_controller.rest.workflows_session.close_interactive_session(workflow_id_or_name)

Close an interactive workflow session.

reana_workflow_controller.rest.workflows_session.open_interactive_session(workflow_id_or_name, interactive_session_type)

Start an interactive session inside the workflow workspace.

REANA Workflow Controller status REST API.

reana_workflow_controller.rest.workflows_status.get_workflow_logs(workflow_id_or_name, paginate=None, **kwargs)

Returns logs of a specific workflow from a workflow engine.

reana_workflow_controller.rest.workflows_status.get_workflow_status(workflow_id_or_name)

Get workflow status.

reana_workflow_controller.rest.workflows_status.set_workflow_status(workflow_id_or_name)

Set workflow status.
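
Every function in this module addresses a workflow through a single workflow_id_or_name argument. The sketch below shows one way such a value could be classified. It assumes that IDs are UUIDs and that a name may carry a numeric run number after its last dot (for example "myanalysis.3"). The WorkflowRef type and the parse_workflow_id_or_name helper are hypothetical and are not part of the controller's code.

```python
import uuid
from typing import NamedTuple, Optional


class WorkflowRef(NamedTuple):
    """Parsed form of a workflow_id_or_name value (hypothetical helper type)."""

    workflow_id: Optional[uuid.UUID]
    name: Optional[str]
    run_number: Optional[int]


def parse_workflow_id_or_name(value: str) -> WorkflowRef:
    """Classify a string as a UUID or as a name with an optional run number."""
    if not value:
        raise ValueError("workflow_id_or_name must not be empty")
    try:
        # A value that parses as a UUID is treated as a workflow ID.
        return WorkflowRef(uuid.UUID(value), None, None)
    except ValueError:
        pass
    # Otherwise it is a name. Assumption: a purely numeric suffix after the
    # last dot is the run number, e.g. "myanalysis.3".
    name, sep, suffix = value.rpartition(".")
    if sep and name and suffix.isdigit():
        return WorkflowRef(None, name, int(suffix))
    return WorkflowRef(None, value, None)
```

A name given without a run number leaves run_number as None; the caller then has to decide which run is meant, for instance the most recent one.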

REANA Workflow Controller workspaces REST API.

reana_workflow_controller.rest.workflows_workspace.delete_file(workflow_id_or_name, file_name)

Delete the specified file.

reana_workflow_controller.rest.workflows_workspace.download_file(workflow_id_or_name, file_name)

Returns the requested file.

reana_workflow_controller.rest.workflows_workspace.get_files(workflow_id_or_name, paginate=None)

Returns the workspace file list.

reana_workflow_controller.rest.workflows_workspace.move_files(workflow_id_or_name)

Move files within workspace.

reana_workflow_controller.rest.workflows_workspace.upload_file(workflow_id_or_name)

Adds a file to the workspace.
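
The workspace endpoints can list files by glob pattern and can package several files into a zip for download (see the 0.7.3 changelog entry). The sketch below illustrates that idea using only the Python standard library. The function names are hypothetical, and the controller itself relies on utilities from reana-commons whose behaviour may differ.

```python
import fnmatch
import io
import zipfile
from pathlib import Path
from typing import List


def list_workspace_files(workspace: Path, pattern: str = "*") -> List[str]:
    """Return workspace-relative paths of regular files matching a glob pattern.

    Note that with fnmatch, "*" also matches "/" and therefore crosses
    directory boundaries: "*.png" matches "results/plot.png".
    """
    relative_paths = (
        path.relative_to(workspace).as_posix()
        for path in workspace.rglob("*")
        if path.is_file()
    )
    return sorted(p for p in relative_paths if fnmatch.fnmatchcase(p, pattern))


def zip_workspace_files(workspace: Path, pattern: str) -> bytes:
    """Package every file matching the pattern into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for relative_path in list_workspace_files(workspace, pattern):
            # Store entries under their workspace-relative names.
            archive.write(workspace / relative_path, arcname=relative_path)
    return buffer.getvalue()
```

Building the archive in memory keeps the sketch short. For large workspaces, writing to a temporary file or streaming the archive would be the safer choice.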

Changelog

0.9.3 (2024-03-04)

Build

  • docker: non-editable submodules in “latest” mode (#551) (af74d0b)

  • python: bump all required packages as of 2024-03-04 (#574) (1373f4c)

  • python: bump shared REANA packages as of 2024-03-04 (#574) (e31d903)

Features

  • manager: call shutdown endpoint before workflow stop (#559) (719fa37)

  • manager: increase termination period of run-batch pods (#572) (f05096a)

  • manager: pass custom env variables to job controller (#571) (646f071)

  • manager: pass custom env variables to workflow engines (#571) (cb9369b)

Bug fixes

  • manager: graceful shutdown of job-controller (#559) (817b019)

  • manager: use valid group name when calling groupadd (#566) (73a9929), closes #561

  • stop: store engine logs of stopped workflow (#563) (199c163), closes #560

Code refactoring

Code style

Continuous integration

  • commitlint: addition of commit message linter (#555) (b9df20a)

  • commitlint: allow release commit style (#575) (b013d49)

  • commitlint: check for the presence of concrete PR number (#562) (4b8f539)

  • pytest: move to PostgreSQL 14.10 (#568) (9b6bfa0)

  • release-please: initial configuration (#555) (672083d)

  • release-please: update version in Dockerfile/OpenAPI specs (#558) (4be8086)

  • shellcheck: fix exit code propagation (#562) (c5d4982)

Documentation

  • authors: complete list of contributors (#570) (08ab9a3)

0.9.2 (2023-12-12)

  • Adds automated multi-platform container image building for amd64 and arm64 architectures.

  • Adds metadata labels to Dockerfile.

  • Changes CVMFS support to allow users to automatically mount any available repository.

  • Changes how pagination is performed in order to avoid counting the total number of records twice.

  • Changes the workflow deletion endpoint to return a different and more appropriate message when deleting all the runs of a workflow.

  • Fixes job status consumer exception while attempting to fetch workflow engine logs for workflows that were not successfully scheduled.

  • Fixes runtime uWSGI warning by rebuilding uWSGI with the PCRE support.

0.9.1 (2023-09-27)

  • Adds the timestamp of when the workflow was stopped (run_stopped_at) to the workflow list and the workflow status endpoints.

  • Adds PDF files to the list of file types that can be previewed from the web interface.

  • Changes the deletion of a workflow to automatically delete an open interactive session attached to its workspace.

  • Changes the k8s specification for interactive session pods to include labels for improved subset selection of objects.

  • Changes the k8s specification for interactive session ingress resource to include annotations.

  • Changes uWSGI configuration to increase buffer size, add vacuum option, etc.

  • Fixes job status inconsistency when stopping a workflow by setting the job statuses to stopped for any running jobs.

  • Fixes job status consumer to correctly rollback the database transaction when an error occurs.

  • Fixes uWSGI memory consumption on systems with a very high allowed number of open files.

  • Fixes uWSGI and consume-job-queue command to gracefully stop when being terminated.

  • Fixes container image names to be Podman-compatible.

0.9.0 (2023-01-19)

  • Adds the remote origin of workflows submitted via Launch-on-REANA (launcher_url) to the workflow list endpoint.

  • Adds support for Kerberos authentication for workflow orchestration.

  • Adds the REANA_WORKSPACE environment variable to Jupyter notebooks and terminals.

  • Adds an option to the workflow list endpoint to sort workflows by highest disk and CPU quota usage.

  • Adds support for specifying and listing workspace file retention rules.

  • Changes workflow list endpoint to add the possibility to filter by workflow ID.

  • Changes the deployment of interactive sessions to use networking/v1 Kubernetes API.

  • Changes default consumer prefetch count to handle 10 messages instead of 200 in order to reduce the probability of 406 PRECONDITION errors on message acknowledgement.

  • Changes to Flask v2.

  • Changes job status consumer to improve logging for not-alive workflows.

  • Changes the deletion of a workflow to also update the user disk quota usage if the workspace is deleted.

  • Changes the CWD of Jupyter’s terminals to the directory of the workflow’s workspace.

  • Changes the k8s specification of interactive sessions’ pods to remove the environment variables used for service discovery.

  • Changes GitLab integration to use reana as pipeline name instead of default when setting status of a commit.

  • Changes the deletion of a workflow to always remove the workflow’s workspace and to fail if the request is asking not to delete the workspace.

  • Changes the move_files endpoint to allow moving files while a workflow is running.

  • Changes the deployment of interactive sessions to improve security by not automounting the Kubernetes service account token.

  • Changes workspace file management commands to use common utility functions present in reana-commons.

  • Changes to PostgreSQL 12.13.

  • Changes the deployment of job-controller to avoid unnecessarily mounting the database’s directory.

  • Changes the base image of the component to Ubuntu 20.04 LTS and reduces final Docker image size by removing build-time dependencies.

  • Fixes the download of files by changing the default MIME type to application/octet-stream.

  • Fixes the workflow list endpoint to correctly parse the boolean parameters include_progress, include_workspace_size and include_retention_rules.

  • Fixes Kerberos authentication for long-running workflows by renewing the Kerberos ticket periodically.

  • Fixes job status consumer by discarding invalid job IDs.

0.8.2 (2022-10-06)

  • Fixes delete --include-all-runs functionality to delete only workflow owner’s past runs.

0.8.1 (2022-02-07)

  • Adds configuration environment variable to set default timeout for user’s jobs for the Kubernetes compute backend (REANA_KUBERNETES_JOBS_TIMEOUT_LIMIT).

  • Adds configuration environment variable to set maximum custom timeout limit that users can assign to their jobs for the Kubernetes compute backend (REANA_KUBERNETES_JOBS_MAX_USER_TIMEOUT_LIMIT).
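
The two variables introduced in this release interact as a default and an upper bound. The sketch below shows one plausible way to combine them. It assumes that both values are integers expressed in seconds, that an unset variable means "no limit", and that a request above the maximum is rejected. The resolve_job_timeout helper is hypothetical and is not taken from the controller's code.

```python
import os
from typing import Mapping, Optional


def resolve_job_timeout(
    requested: Optional[int], env: Mapping[str, str] = os.environ
) -> Optional[int]:
    """Pick the timeout to apply to a job, or None if no timeout applies.

    Assumptions: values are whole seconds; an unset variable means no limit.
    """
    default_raw = env.get("REANA_KUBERNETES_JOBS_TIMEOUT_LIMIT")
    max_raw = env.get("REANA_KUBERNETES_JOBS_MAX_USER_TIMEOUT_LIMIT")
    default_timeout = int(default_raw) if default_raw else None
    max_timeout = int(max_raw) if max_raw else None

    if requested is None:
        # The user did not ask for a custom timeout: fall back to the default.
        return default_timeout
    if requested <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    if max_timeout is not None and requested > max_timeout:
        raise ValueError(
            f"requested timeout {requested} exceeds the maximum of {max_timeout}"
        )
    return requested
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.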

0.8.0 (2021-11-22)

  • Adds user quota accounting.

  • Adds new job properties started_at and finished_at to the /logs endpoint.

  • Adds configuration environment variable to limit the number of messages received in the job status consumer (prefetch_count).

  • Adds file search capabilities to the workflow workspace endpoint.

  • Adds Snakemake workflow engine support.

  • Adds support for custom workflow workspace path.

  • Changes to PostgreSQL 12.8.

  • Changes workflow run manager to query the specific workflow engine during pod deletion.

  • Fixes workflow list endpoint query logic to improve optimization.

0.7.4 (2021-07-05)

  • Changes internal dependencies.

0.7.3 (2021-04-28)

  • Adds configuration environment variable to set job memory limits for the Kubernetes compute backend (REANA_KUBERNETES_JOBS_MEMORY_LIMIT).

  • Adds configuration environment variable to set maximum custom memory limits that users can assign to their job containers for the Kubernetes compute backend (REANA_KUBERNETES_JOBS_MAX_USER_MEMORY_LIMIT).

  • Adds support for listing files using glob patterns.

  • Adds support for glob patterns and directory downloads, packaging files into a zip.
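
The memory limit variables introduced in this release set a default and a maximum for job containers. Comparing a requested limit against the maximum requires converting both to a common unit. The sketch below assumes that the values use Kubernetes-style quantity strings such as "256Mi" or "4Gi". The memory_to_bytes and resolve_job_memory helpers are hypothetical, and only plain and suffixed non-negative numbers are handled.

```python
import re
from decimal import Decimal
from typing import Optional

# Multipliers for decimal (k, M, G, ...) and binary (Ki, Mi, Gi, ...) suffixes.
_UNITS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}
_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|Pi|Ei|k|M|G|T|P|E)?$")


def memory_to_bytes(quantity: str) -> int:
    """Convert a quantity string such as "256Mi" to a number of bytes."""
    match = _QUANTITY.match(quantity.strip())
    if match is None:
        raise ValueError(f"invalid memory quantity: {quantity!r}")
    number, unit = match.groups()
    return int(Decimal(number) * _UNITS[unit or ""])


def resolve_job_memory(
    requested: Optional[str], default_limit: Optional[str], max_limit: Optional[str]
) -> Optional[str]:
    """Return the memory limit to apply, rejecting requests above the maximum."""
    if requested is None:
        return default_limit
    if max_limit is not None and memory_to_bytes(requested) > memory_to_bytes(max_limit):
        raise ValueError(f"requested memory {requested} exceeds maximum {max_limit}")
    return requested
```

In a deployment, default_limit and max_limit would come from REANA_KUBERNETES_JOBS_MEMORY_LIMIT and REANA_KUBERNETES_JOBS_MAX_USER_MEMORY_LIMIT respectively.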

0.7.2 (2021-03-17)

  • Adds new configuration to toggle Kubernetes user jobs cleanup.

  • Fixes job-status-consumer exception detection for better resilience.

0.7.1 (2021-02-03)

  • Fixes minor code warnings.

  • Changes CI system to include Python flake8 and Dockerfile hadolint checkers.

0.7.0 (2020-10-20)

  • Adds possibility to restart workflows.

  • Adds exposure of workflow engine logs.

  • Adds possibility to pass workflow operational options.

  • Adds progress report information on workflow list response.

  • Adds code mount in dev mode in workflow engines and job controller.

  • Adds preview flag to file download endpoint.

  • Fixes deletion of workflows in queued state.

  • Fixes CVMFS availability for interactive sessions.

  • Fixes jobs status update.

  • Fixes response on close interactive session action.

  • Changes runtime component creation to use centrally configured namespace from REANA-Commons.

  • Changes workflow engine pod labelling for better traceability.

  • Changes logs endpoint to provide richer information.

  • Changes git clone depth when retrieving GitLab projects.

  • Changes REANA submodule installation in editable mode for live code updates for developers.

  • Changes base image to use Python 3.8.

  • Changes code formatting to respect black coding style.

  • Changes documentation to single-page layout.

0.6.1 (2020-05-25)

  • Upgrades REANA-Commons package using latest Kubernetes client version.

0.6.0 (2019-12-20)

  • Modifies the batch workflow run creation, including an instance of REANA-Job-Controller running alongside the workflow engine (sidecar pattern). Only DB and workflow workspace are mounted.

  • Refactors volume mounts using reana-commons base.

  • Provides user secrets to the job controller.

  • Extends workflow APIs for GitLab integration.

  • Allows stream file uploads.

0.5.0 (2019-04-23)

  • Adds support for creating interactive sessions so the workspace can be explored and modified through a Jupyter notebook.

  • Creates workflow engine instances on demand for each user and makes CVMFS available inside of them.

  • Adds new endpoint to compare two workflows. The output is a git-like diff which can be configured to show differences at metadata level, workspace level or both.

  • Adds new endpoint to delete workflows including the stopped ones.

  • Adds new endpoints to delete and move files within the workspace. The deletion can also be done recursively with a wildcard.

  • Adds new endpoint which returns workflow parameters.

  • Adds new endpoint to query the disk usage of a given workspace.

  • Makes Docker image slimmer by using python:3.6-slim.

  • Centralises log level and log format configuration.

0.4.0 (2018-11-06)

  • Improves AMQP re-connection handling. Switches from pika to kombu.

  • Improves REST API documentation rendering.

  • Changes license to MIT.

0.3.2 (2018-09-25)

  • Modifies job input identification process for caching purposes, adding compatibility with CephFS storage volumes.

0.3.1 (2018-09-07)

  • Harmonises date and time outputs amongst various REST API endpoints.

  • Separates workflow parameters and engine parameters when running Serial workflows.

  • Pins REANA-Commons and REANA-DB dependencies.

0.3.0 (2018-08-10)

  • Adds support for Serial workflows.

  • Tracks progress of workflow runs.

  • Adds uWSGI for production deployments.

  • Allows downloading of any file from a workflow workspace.

0.2.0 (2018-04-19)

  • Adds support for Common Workflow Language workflows.

  • Adds support for specifying workflow names in REST API requests.

  • Adds sequential incrementing of workflow run numbers.

  • Adds support for nested inputs and runtime code directory uploads.

  • Improves error messages and information.

  • Prevents multiple starts of the same workflow.

0.1.0 (2018-01-30)

  • Initial public release.

Contributing

Bug reports, issues, feature requests, and other contributions are welcome. If you find a demonstrable problem that is caused by the REANA code, please:

  1. Search for already reported problems.

  2. Check if the issue has been fixed or is still reproducible on the latest master branch.

  3. Create an issue, ideally with a test case.

If you create a pull request fixing a bug or implementing a feature, you can run the tests to ensure that everything is operating correctly:

$ ./run-tests.sh

Each pull request should preserve or increase code coverage.

License

MIT License

Copyright (C) 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024 CERN.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

In applying this license, CERN does not waive the privileges and immunities granted to it by virtue of its status as an Intergovernmental Organization or submit itself to any jurisdiction.

Authors

The list of contributors in alphabetical order: