tasks module

Collection of Prefect task-annotated functions for use in cloud-based numerical weather modelling workflows. Most of these tasks are thin wrappers around other functions.

Keep things cloud platform agnostic at this layer.

tasks.create_scratch(provider: str, configfile: str, mountpath: str = '/ptmp') → cloudflow.services.ScratchDisk.ScratchDisk

Provides a high speed scratch disk if available. Creates and mounts the disk.

Parameters
  • provider (str) – Name of an implemented provider.

  • configfile (str) – The Job configuration file

  • mountpath (str, optional) – Path where the scratch disk is mounted. Default: '/ptmp'

Returns

scratch – Returns the ScratchDisk object

Return type

ScratchDisk

tasks.delete_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk)

Unmounts and deletes the scratch disk

Parameters

scratch (ScratchDisk) – The scratch disk object

tasks.fetchpy_and_run(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService)

Prototype for injecting user-developed scripts into a workflow. This is currently implemented only for the hlfs example; additional work is needed to generalize the process.

Parameters
  • job (Job) – The Job object

  • service (StorageService) – The cloud or local storage service implementation

Notes

WARNING: This could allow arbitrary code execution.

tasks.forecast_run(cluster: cloudflow.cluster.Cluster.Cluster, job: cloudflow.job.Job.Job)

Run the forecast

Parameters
  • cluster (Cluster) – The cluster to run on

  • job (Job) – The job to run

tasks.job_init(cluster: cloudflow.cluster.Cluster.Cluster, configfile) → cloudflow.job.Job.Job

Initialize the Job object.

Parameters
  • cluster (Cluster) – The Cluster object to use for this Job

  • configfile (str) – The Job configuration file

Returns

job – An implemented sub-class of Job

Return type

Job

Notes

The hardware cannot be fully separated from the job: nprocs is needed to set up the Job.

tasks.mount_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk, cluster: cloudflow.cluster.Cluster.Cluster)

Mounts the scratch disk on each node of the cluster

Parameters
  • scratch (ScratchDisk) – The scratch disk object

  • cluster (Cluster) – The cluster object that contains the hostnames

tasks.run_pynotebook(pyfile: str)

Wraps the execution of a python3 script

Parameters

pyfile (str) – The path and filename of the python3 script to run.
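
As an illustration of what wrapping the execution of a python3 script can look like, here is a minimal sketch using only the standard library. The helper name run_script and the choice to raise RuntimeError on a non-zero exit code are assumptions made for this example; they are not taken from the tasks module, which may handle failures differently (for example through Prefect signals).

```python
import subprocess
import sys


def run_script(pyfile: str) -> int:
    """Run a Python script in a child interpreter and return its exit code.

    Hypothetical helper for illustration only; not part of the tasks module.
    """
    # Use the interpreter that is running this code, so the script sees
    # the same Python version and installed packages.
    result = subprocess.run(
        [sys.executable, pyfile],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Assumption: surface a failure as an exception so the caller
        # (for example a workflow task) can mark the step as failed.
        raise RuntimeError(f"{pyfile} failed: {result.stderr.strip()}")
    return result.returncode
```

Running the script in a separate process keeps its global state and any crash isolated from the workflow process.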

tasks.save_to_cloud(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService, filespecs: list, public=False)

Save job output files to cloud storage.

Parameters
  • job (Job) – A Job object that contains the required attributes:
      BUCKET - bucket name
      BCKTFOLDER - bucket folder
      CDATE - simulation date
      OUTDIR - source path

  • service (StorageService) – An implemented service for your cloud provider.

  • filespecs (list of str) – File specifications to match using glob.glob. Example: ["*.nc", "*.png"]

  • public (bool, optional) – Whether the files should be made public. Default: False
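
The file selection described for filespecs can be sketched as follows. This is a hedged illustration of matching several glob patterns against a source directory, not the module's actual implementation; the helper name collect_files is hypothetical, and the upload step is omitted because the StorageService method names are not documented here.

```python
import glob
import os


def collect_files(outdir: str, filespecs: list) -> list:
    """Return the sorted paths in outdir that match any of the glob patterns.

    Hypothetical helper for illustration only; not part of the tasks module.
    """
    matches = set()
    for spec in filespecs:
        # glob.glob expands shell-style wildcards such as "*.nc".
        matches.update(glob.glob(os.path.join(outdir, spec)))
    # A set removes duplicates when two patterns match the same file.
    return sorted(matches)
```

For example, collect_files(job.OUTDIR, ["*.nc", "*.png"]) would gather the NetCDF and PNG files from the job's source path, assuming OUTDIR holds a directory path as described above.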

tasks.storage_init(provider: str) → cloudflow.services.StorageService.StorageService

Class factory that returns an implementation of StorageService.

StorageService is the abstract base class that provides a generic interface for multiple cloud platforms.

Parameters

provider (str) – Name of an implemented provider.

Returns

service – Returns a specific implementation of the StorageService interface.

Return type

StorageService

Raises

signals.FAIL – Raised if the provider is not supported.

Notes

The following providers are implemented:

AWS S3 - S3Storage
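
The class factory pattern described for storage_init can be sketched as below. This is a simplified stand-in under stated assumptions, not the cloudflow source: the upload_file method, the provider key "AWS", and the function name storage_init_sketch are hypothetical, and ValueError is used in place of Prefect's signals.FAIL so that the example runs with the standard library alone.

```python
from abc import ABC, abstractmethod


class StorageService(ABC):
    """Simplified stand-in for the abstract storage interface."""

    @abstractmethod
    def upload_file(self, filename: str, bucket: str, key: str,
                    public: bool = False) -> None:
        """Upload one file (hypothetical method name and signature)."""


class S3Storage(StorageService):
    """Stand-in implementation that only records the requested uploads."""

    def __init__(self) -> None:
        self.uploaded = []

    def upload_file(self, filename: str, bucket: str, key: str,
                    public: bool = False) -> None:
        self.uploaded.append((filename, bucket, key, public))


# Registry of implemented providers; the key "AWS" is an assumption.
_PROVIDERS = {"AWS": S3Storage}


def storage_init_sketch(provider: str) -> StorageService:
    """Return a StorageService implementation for the named provider."""
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        # The documented task raises signals.FAIL here instead.
        raise ValueError(f"Unsupported provider: {provider}") from None
```

Callers depend only on the StorageService interface, so supporting another platform means adding one subclass and one registry entry.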