AWSCluster module

Cluster implementation for AWS EC2 clusters.

class AWSCluster.AWSCluster(configfile)

Bases: cloudflow.cluster.Cluster.Cluster

Implementation of the Cluster interface for AWS.

platform

The cloud provider. This will always be ‘AWS’ for this implementation.

Type

str

nodeType

EC2 instance type.

Type

str

nodeCount

Number of instances in this cluster.

Type

int

NPROCS

Total number of processors in this cluster.

Type

int

PPN

Number of processors (physical cores) per node.

Type

int

tags

Specific tags to attach to the resources provisioned.

Type

list of dict of str

image_id

ID of the AWS EC2 Amazon Machine Image (AMI).

Type

str

key_name

Name of the key pair used for SSH access to the instance. This should be configured when creating the AMI.

Type

str

sg_ids

Security group IDs.

Type

list of str

subnet_id

ID of the VPC subnet to run in.

Type

str

placement_group

The cluster placement group to use.

Type

str

daskscheduler

A reference to the Dask scheduler process started on the cluster.

Type

Popen

daskworker

A reference to the Dask worker process started on the cluster.

Type

Popen

getCoresPN()

Get the number of cores per node in this cluster.

Returns

self.PPN – the number of cores per node in this cluster. Assumes a homogeneous cluster.

Return type

int
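If every node has the same number of cores, the total processor count follows directly from the node count and the per-node value. The sketch below illustrates that arithmetic only; the helper name total_procs and the sample numbers are hypothetical, and the source does not state how NPROCS is actually computed.

```python
def total_procs(node_count, ppn):
    """Total processors for a cluster whose nodes all have `ppn` cores."""
    return node_count * ppn


# Example: 4 nodes with 18 physical cores each (made-up values).
nprocs = total_procs(4, 18)
```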

getHosts()

Get the list of hosts in this cluster.

Returns

hosts – list of private DNS names

Return type

list of str

getHostsCSV()

Get a comma-separated list of hosts in this cluster.

Returns

hosts – a comma-separated list of private DNS names

Return type

str
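As a rough illustration of the relationship between the list form and the comma-separated form, the sketch below joins a list of host names into one string. The function name hosts_to_csv and the host names are hypothetical; this is not the module's own code.

```python
def hosts_to_csv(hosts):
    """Join a list of host names into one comma-separated string."""
    return ",".join(hosts)


# Made-up private DNS names, in the style EC2 assigns to instances.
hosts = ["ip-10-0-0-11.ec2.internal", "ip-10-0-0-12.ec2.internal"]
hosts_csv = hosts_to_csv(hosts)
```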

getState()

Returns the cluster state. Not currently used.

parseConfig(cfDict)

Parses the configuration dictionary into class attributes.

Parameters

cfDict (dict) – Dictionary containing this cluster's parameterized settings.
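One way such a mapping from dictionary to attributes could look is sketched below. The class ClusterSettings, the method name parse_config, and the dictionary keys are all assumptions chosen to mirror the attribute names documented above; the actual key names in a configuration file may differ.

```python
class ClusterSettings:
    """Hypothetical stand-in for the attributes a cluster object holds."""

    def parse_config(self, cfDict):
        # Required settings: a missing key raises KeyError immediately.
        self.platform = cfDict["platform"]
        self.nodeType = cfDict["nodeType"]
        self.nodeCount = int(cfDict["nodeCount"])
        # Optional settings fall back to empty lists.
        self.tags = cfDict.get("tags", [])
        self.sg_ids = cfDict.get("sg_ids", [])


settings = ClusterSettings()
settings.parse_config(
    {"platform": "AWS", "nodeType": "c5.large", "nodeCount": "2"}
)
```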

readConfig(configfile)

Reads a JSON configuration file into a dictionary.

Parameters

configfile (str) – Full path and filename of a JSON configuration file for this cluster.

Returns

cfDict – Dictionary containing this cluster's parameterized settings.

Return type

dict
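A minimal sketch of reading a JSON file into a dictionary with the standard json module follows. The function name read_config and the sample keys are hypothetical and only illustrate the file-to-dict step described here.

```python
import json
import os
import tempfile


def read_config(configfile):
    """Load a JSON cluster configuration file and return it as a dict."""
    with open(configfile, "r") as cf:
        return json.load(cf)


# Write a small, made-up configuration to a temporary file and read it back.
sample = {"platform": "AWS", "nodeType": "c5.large", "nodeCount": 2}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
    path = tmp.name

cfDict = read_config(path)
os.remove(path)
```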

setState(state)

Set the cluster state. Not currently used.

start()

Provision the configured cluster in the cloud.

Returns

self.__instances – the list of Instances started. See boto3 documentation.

Return type

list of EC2.Instance
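To show how the documented attributes could feed a launch request, the sketch below assembles keyword arguments in the shape accepted by boto3's EC2 create_instances call. Whether this module builds its request this way is an assumption; the helper name build_launch_args and all values are placeholders, and no AWS call is made.

```python
def build_launch_args(image_id, node_type, node_count, key_name,
                      sg_ids, subnet_id, placement_group, tags):
    """Assemble keyword arguments for an EC2 launch request."""
    return {
        "ImageId": image_id,
        "InstanceType": node_type,
        "KeyName": key_name,
        # Request exactly node_count instances.
        "MinCount": node_count,
        "MaxCount": node_count,
        "SecurityGroupIds": sg_ids,
        "SubnetId": subnet_id,
        "Placement": {"GroupName": placement_group},
        "TagSpecifications": [{"ResourceType": "instance", "Tags": tags}],
    }


launch_args = build_launch_args(
    image_id="ami-0123456789abcdef0",          # placeholder AMI ID
    node_type="c5.large",
    node_count=2,
    key_name="my-key-pair",                    # placeholder key pair name
    sg_ids=["sg-0123456789abcdef0"],           # placeholder security group
    subnet_id="subnet-0123456789abcdef0",      # placeholder subnet
    placement_group="my-placement-group",      # placeholder group name
    tags=[{"Key": "Name", "Value": "example-cluster"}],
)

# With AWS credentials configured, the request itself would look like:
#   import boto3
#   instances = boto3.resource("ec2").create_instances(**launch_args)
```

The tags argument uses the list-of-dicts form with "Key" and "Value" entries that EC2 tag specifications expect.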

terminate()

Shut down and remove the EC2 instances in this cluster.

Also terminates any associated Dask Worker and Scheduler processes.

Returns

responses – a list of the responses from EC2.Instance.terminate(). See boto3 documentation.

Return type

list of dict
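The shutdown sequence can be sketched with stand-in objects, as below. The helper terminate_all, the FakeInstance class, and the order of operations (processes first, then instances) are assumptions for illustration; the real method works on boto3 instance objects and Popen handles, and no AWS call is made here.

```python
def terminate_all(instances, processes=()):
    """Stop helper processes, then terminate each instance and collect responses."""
    for proc in processes:
        # Skip process handles that were never started.
        if proc is not None:
            proc.terminate()
    return [instance.terminate() for instance in instances]


class FakeInstance:
    """Stand-in for an EC2 instance object; it only echoes its own ID."""

    def __init__(self, instance_id):
        self.id = instance_id

    def terminate(self):
        return {"TerminatingInstances": [{"InstanceId": self.id}]}


responses = terminate_all([FakeInstance("i-aaa"), FakeInstance("i-bbb")])
```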