tasks module
Collection of Prefect task-annotated functions for use in cloud-based numerical weather modelling workflows. These tasks are thin wrappers around other functions. Keep this layer cloud-platform agnostic.
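As a rough sketch, the tasks below compose into a Prefect (1.x style) flow along the following lines. The import path, provider name, and configuration file path are illustrative assumptions, and the cluster object is created outside this module:

    from prefect import Flow

    from cloudflow.workflows import tasks  # import path assumed


    def build_forecast_flow(cluster, configfile: str) -> Flow:
        # Calls inside the Flow context only build the task graph;
        # nothing executes until flow.run() is called.
        with Flow('forecast-workflow') as flow:
            job = tasks.job_init(cluster, configfile)
            scratch = tasks.create_scratch('AWS', configfile)  # provider name assumed
            mounted = tasks.mount_scratch(scratch, cluster)
            fcst = tasks.forecast_run(cluster, job, upstream_tasks=[mounted])
            tasks.delete_scratch(scratch, upstream_tasks=[fcst])
        return flow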
tasks.create_scratch(provider: str, configfile: str, mountpath: str = '/ptmp') → cloudflow.services.ScratchDisk.ScratchDisk

Provides a high-speed scratch disk if available. Creates and mounts the disk.

- Parameters
provider (str) – Name of an implemented provider.
configfile (str) – The Job configuration file.
mountpath (str, optional) – Where to mount the scratch disk. Default: '/ptmp'.
- Returns
scratch – The ScratchDisk object.
- Return type
ScratchDisk
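For example, executed directly outside of a flow (in Prefect 1.x, .run() invokes the task's underlying function; provider name and paths are illustrative assumptions):

    from cloudflow.workflows import tasks  # import path assumed

    # Create and mount a scratch disk at a non-default mount point.
    scratch = tasks.create_scratch.run('AWS', 'configs/job.config',
                                       mountpath='/mnt/scratch')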
tasks.delete_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk)

Unmounts and deletes the scratch disk.

- Parameters
scratch (ScratchDisk) – The scratch disk object.
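A common pattern is to guarantee cleanup even when a run fails, as in this sketch (names are illustrative assumptions):

    from cloudflow.workflows import tasks  # import path assumed

    scratch = tasks.create_scratch.run('AWS', 'configs/job.config')
    try:
        ...  # run the forecast, post-processing, etc.
    finally:
        # Always unmount and delete the scratch disk to avoid orphaned storage.
        tasks.delete_scratch.run(scratch)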
tasks.fetchpy_and_run(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService)

Prototype for injecting user-developed scripts into a workflow. This is currently implemented only for the hlfs example; additional work is needed to generalize this process.

- Parameters
job (Job) – The Job configuration object.
service (StorageService) – The cloud or local storage service implementation.

Notes

WARNING: This could potentially allow arbitrary code execution!
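Given the warning above, this should only be pointed at storage whose contents are fully trusted. A minimal sketch, assuming job and service objects built with the other tasks in this module:

    # The fetched script executes with the privileges of the workflow
    # process, so only use storage you control.
    tasks.fetchpy_and_run.run(job, service)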
tasks.forecast_run(cluster: cloudflow.cluster.Cluster.Cluster, job: cloudflow.job.Job.Job)

Run the forecast.

- Parameters
cluster (Cluster) – The cluster to run the forecast on.
job (Job) – The Job object for this forecast.
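Within a flow, the forecast is typically ordered after the scratch disk is mounted; upstream_tasks is Prefect 1.x's mechanism for explicit ordering. A sketch, where cluster, job, and scratch come from the other tasks in this module:

    from prefect import Flow

    with Flow('forecast-only') as flow:
        mounted = tasks.mount_scratch(scratch, cluster)
        tasks.forecast_run(cluster, job, upstream_tasks=[mounted])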
tasks.job_init(cluster: cloudflow.cluster.Cluster.Cluster, configfile) → cloudflow.job.Job.Job

Initialize the Job object.

- Parameters
cluster (Cluster) – The Cluster object to use for this Job.
configfile (str) – The Job configuration file.
- Returns
job – An implemented sub-class of Job.
- Return type
Job

Notes

The hardware cannot be fully separated from the job: nprocs is needed to set up the Job.
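For example (the config path is illustrative; the cluster must already exist since nprocs comes from the hardware):

    # cluster: a Cluster object created by the cluster factory, outside this module
    job = tasks.job_init.run(cluster, 'configs/job.config')
    print(job.CDATE, job.OUTDIR)  # available attributes depend on the Job sub-class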
tasks.mount_scratch(scratch: cloudflow.services.ScratchDisk.ScratchDisk, cluster: cloudflow.cluster.Cluster.Cluster)

Mounts the scratch disk on each node of the cluster.

- Parameters
scratch (ScratchDisk) – The scratch disk object.
cluster (Cluster) – The cluster object that contains the hostnames.
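For example, once the cluster nodes are running (scratch and cluster come from create_scratch and the cluster factory respectively):

    # Make the scratch filesystem visible on every node of the cluster.
    tasks.mount_scratch.run(scratch, cluster)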
tasks.run_pynotebook(pyfile: str)

Wraps the execution of a python3 script.

- Parameters
pyfile (str) – The path and filename of the python3 script to run.
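For example (the script path is an illustrative assumption):

    # Run a post-processing script with python3.
    tasks.run_pynotebook.run('post/make_plots.py')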
tasks.save_to_cloud(job: cloudflow.job.Job.Job, service: cloudflow.services.StorageService.StorageService, filespecs: list, public=False)

Save files matching the given file specifications to cloud storage.

- Parameters
job (Job) – A Job object that contains the required attributes: BUCKET (bucket name), BCKTFOLDER (bucket folder), CDATE (simulation date), and OUTDIR (source path).
service (StorageService) – An implemented service for your cloud provider.
filespecs (list of str) – File specifications to match using glob.glob. Example: ["*.nc", "*.png"]
public (bool, optional) – Whether the files should be made public. Default: False
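For example, uploading model output and plots (provider name is an assumption; the destination is determined by the Job attributes listed above):

    service = tasks.storage_init.run('AWS')  # provider name assumed
    # Upload all NetCDF files and PNG images found under job.OUTDIR.
    tasks.save_to_cloud.run(job, service, ['*.nc', '*.png'], public=True)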
tasks.storage_init(provider: str) → cloudflow.services.StorageService.StorageService

Class factory that returns an implementation of StorageService.

StorageService is the abstract base class that provides a generic interface for multiple cloud platforms.

- Parameters
provider (str) – Name of an implemented provider.
- Returns
service – A specific implementation of the StorageService interface.
- Return type
StorageService
- Raises
signals.FAIL – Raises an exception if the provider is not supported.

Notes

The following providers are implemented:
AWS S3 – S3Storage
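Since an unsupported provider raises Prefect's FAIL signal, callers may guard the factory, as in this sketch (provider string is illustrative):

    from prefect.engine import signals

    from cloudflow.workflows import tasks  # import path assumed

    try:
        service = tasks.storage_init.run('AWS')
    except signals.FAIL:
        # Unsupported provider: surface the failure or fall back to local storage.
        raise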