tasks module
Collection of Prefect task-annotated functions for use in cloud-based numerical weather modelling workflows. These tasks are thin wrappers around other functions. Keep this layer cloud-platform agnostic.
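As a rough sketch, the tasks below compose into a Prefect (1.x style) flow along the following lines. The import path, provider name, and configuration file path are illustrative assumptions, and the cluster object is created outside this module:

    from prefect import Flow

    from cloudflow.workflows import tasks  # import path assumed


    def build_forecast_flow(cluster, configfile: str) -> Flow:
        # Calls inside the Flow context only build the task graph;
        # nothing executes until flow.run() is called.
        with Flow('forecast-workflow') as flow:
            job = tasks.job_init(cluster, configfile)
            scratch = tasks.create_scratch('AWS', configfile)  # provider name assumed
            mounted = tasks.mount_scratch(scratch, cluster)
            fcst = tasks.forecast_run(cluster, job, upstream_tasks=[mounted])
            tasks.delete_scratch(scratch, upstream_tasks=[fcst])
        return flow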
tasks.create_scratch(provider: str, configfile: str, mountpath: str = '/ptmp') → cloudflow.services.ScratchDisk.ScratchDisk

Provides a high-speed scratch disk if available. Creates and mounts the disk.

- Parameters
provider (str) – Name of an implemented provider.
configfile (str) – The Job configuration file.
mountpath (str, optional) – Where to mount the scratch disk. Default: '/ptmp'.
- Returns
scratch – The ScratchDisk object.
- Return type
ScratchDisk
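For example, executed directly outside of a flow (in Prefect 1.x, .run() invokes the task's underlying function; provider name and paths are illustrative assumptions):

    from cloudflow.workflows import tasks  # import path assumed

    # Create and mount a scratch disk at a non-default mount point.
    scratch = tasks.create_scratch.run('AWS', 'configs/job.config',
                                       mountpath='/mnt/scratch')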
tasks.delete_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk)

Unmounts and deletes the scratch disk.

- Parameters
scratch (ScratchDisk) – The scratch disk object.
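A common pattern is to guarantee cleanup even when a run fails, as in this sketch (names are illustrative assumptions):

    from cloudflow.workflows import tasks  # import path assumed

    scratch = tasks.create_scratch.run('AWS', 'configs/job.config')
    try:
        ...  # run the forecast, post-processing, etc.
    finally:
        # Always unmount and delete the scratch disk to avoid orphaned storage.
        tasks.delete_scratch.run(scratch)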
tasks.fetchpy_and_run(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService)

Prototype for injecting user-developed scripts into a workflow. This is currently implemented only for the hlfs example; additional work is needed to generalize this process.

- Parameters
job (Job) – The Job configuration object.
service (StorageService) – The cloud or local storage service implementation.

Notes

WARNING: This could potentially allow arbitrary code execution!
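Given the warning above, this should only be pointed at storage whose contents are fully trusted. A minimal sketch, assuming job and service objects built with the other tasks in this module:

    # The fetched script executes with the privileges of the workflow
    # process, so only use storage you control.
    tasks.fetchpy_and_run.run(job, service)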
tasks.forecast_run(cluster: cloudflow.cluster.Cluster.Cluster, job: cloudflow.job.Job.Job)

Run the forecast.

- Parameters
cluster (Cluster) – The cluster to run the forecast on.
job (Job) – The Job object for this forecast.
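Within a flow, the forecast is typically ordered after the scratch disk is mounted; upstream_tasks is Prefect 1.x's mechanism for explicit ordering. A sketch, where cluster, job, and scratch come from the other tasks in this module:

    from prefect import Flow

    with Flow('forecast-only') as flow:
        mounted = tasks.mount_scratch(scratch, cluster)
        tasks.forecast_run(cluster, job, upstream_tasks=[mounted])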
tasks.job_init(cluster: cloudflow.cluster.Cluster.Cluster, configfile) → cloudflow.job.Job.Job

Initialize the Job object.

- Parameters
cluster (Cluster) – The Cluster object to use for this Job.
configfile (str) – The Job configuration file.
- Returns
job – An implemented sub-class of Job.
- Return type
Job

Notes

The hardware cannot be fully separated from the job: nprocs is needed to set up the Job.
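For example (the config path is illustrative; the cluster must already exist since nprocs comes from the hardware):

    # cluster: a Cluster object created by the cluster factory, outside this module
    job = tasks.job_init.run(cluster, 'configs/job.config')
    print(job.CDATE, job.OUTDIR)  # available attributes depend on the Job sub-class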
tasks.mount_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk, cluster: cloudflow.cluster.Cluster.Cluster)

Mounts the scratch disk on each node of the cluster.

- Parameters
scratch (ScratchDisk) – The scratch disk object.
cluster (Cluster) – The cluster object that contains the hostnames.
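For example, once the cluster nodes are running (scratch and cluster come from create_scratch and the cluster factory respectively):

    # Make the scratch filesystem visible on every node of the cluster.
    tasks.mount_scratch.run(scratch, cluster)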
tasks.run_pynotebook(pyfile: str)

Wraps the execution of a python3 script.

- Parameters
pyfile (str) – The path and filename of the python3 script to run.
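For example (the script path is an illustrative assumption):

    # Run a post-processing script with python3.
    tasks.run_pynotebook.run('post/make_plots.py')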
tasks.save_to_cloud(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService, filespecs: list, public=False)

Save files matching the given file specifications to cloud storage.

- Parameters
job (Job) – A Job object that contains the required attributes: BUCKET (bucket name), BCKTFOLDER (bucket folder), CDATE (simulation date), and OUTDIR (source path).
service (StorageService) – An implemented service for your cloud provider.
filespecs (list of str) – File specifications to match using glob.glob. Example: ["*.nc", "*.png"]
public (bool, optional) – Whether the files should be made public. Default: False
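For example, uploading model output and plots (provider name is an assumption; the destination is determined by the Job attributes listed above):

    service = tasks.storage_init.run('AWS')  # provider name assumed
    # Upload all NetCDF files and PNG images found under job.OUTDIR.
    tasks.save_to_cloud.run(job, service, ['*.nc', '*.png'], public=True)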
tasks.storage_init(provider: str) → cloudflow.services.StorageService.StorageService

Class factory that returns an implementation of StorageService.

StorageService is the abstract base class that provides a generic interface for multiple cloud platforms.

- Parameters
provider (str) – Name of an implemented provider.
- Returns
service – A specific implementation of the StorageService interface.
- Return type
StorageService
- Raises
signals.FAIL – Raises an exception if the provider is not supported.

Notes

The following providers are implemented:
AWS S3 – S3Storage
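Since an unsupported provider raises Prefect's FAIL signal, callers may guard the factory, as in this sketch (provider string is illustrative):

    from prefect.engine import signals

    from cloudflow.workflows import tasks  # import path assumed

    try:
        service = tasks.storage_init.run('AWS')
    except signals.FAIL:
        # Unsupported provider: surface the failure or fall back to local storage.
        raise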