Customizing Profiles for Slurm Integration
Edit the Profile JSON for Your Purposes:
Tagging instances created by the backend is controlled by two sections, depending on the function of the asset:
Controllers are On-Demand instances that manage other instances. By default, they are tagged as seen on lines 6-9, above, and 1-4 below.
- CODE
{ "Key": "exostellar.xspot-role", "Value": "xspot-controller" }
To add additional tags, duplicate lines 1-4 as 5-8 below (as many times as you need), noting that an additional comma is added on line 4.
- CODE
{ "Key": "exostellar.xspot-role", "Value": "xspot-controller" }, { "Key": "MyCustomKey", "Value": "MyCustomValue" }
Don’t forget the comma between tags.
Workers will be created by Controllers as needed and they can be On-Demand/Reserved instances or EC2 Spot. By default, they are tagged as seen on lines 26-30, above, and 1-4 below:
- CODE
{ "Key": "exostellar.xspot-role", "Value": "xspot-worker" }
Add as many tags as needed.
- CODE
{ "Key": "exostellar.xspot-role", "Value": "xspot-worker" }, { "Key": "MyCustomKey", "Value": "MyCustomValue" }
Don’t forget the comma between tags.
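One way to avoid comma mistakes when adding tags is to build the tag list programmatically and let a JSON serializer emit the separators. The following is a minimal sketch in Python, not part of the product: the helper name add_tag is hypothetical, and it operates on a standalone list of tags rather than on the real profile file, whose surrounding structure is not reproduced here.

```python
import json

def add_tag(tags, key, value):
    """Return a new tag list with one extra Key/Value pair appended."""
    return tags + [{"Key": key, "Value": value}]

# Default worker tag from the profile, plus one custom tag.
worker_tags = [{"Key": "exostellar.xspot-role", "Value": "xspot-worker"}]
worker_tags = add_tag(worker_tags, "MyCustomKey", "MyCustomValue")

# json.dumps places the commas between tags, so none can be forgotten.
print(json.dumps(worker_tags, indent=2))
```

The printed list can then be compared against the tag section of the profile, or pasted into it, before the profile is saved.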
Note: Line numbers listed below reference the above example file. Once changes start being made on the system, the line numbers may change.
Line 11 - InstanceType: Controllers do not generally require large instances. In terms of performance, these On-Demand Instances can be set as c5.xlarge or m5.xlarge with no adverse effect.
Line 20 - MaxControllers: This defines an upper bound for your configuration. Each Controller will manage up to 80 workers, so with the default on line 20, "MaxControllers": 10, the upper bound is 800 nodes joining your production cluster. If you plan to autoscale past 800 nodes joining your production cluster, MaxControllers should be increased. If you want to lower that upper bound, MaxControllers should be decreased.
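The sizing arithmetic can be sketched as a small Python helper. This is only an illustration: the function name max_controllers_for is hypothetical, and the figure of 80 workers per Controller is taken from the text above.

```python
import math

WORKERS_PER_CONTROLLER = 80  # from the text: a Controller manages up to 80 workers

def max_controllers_for(target_nodes):
    """Smallest MaxControllers value that covers target_nodes workers."""
    if target_nodes < 1:
        raise ValueError("target_nodes must be at least 1")
    return math.ceil(target_nodes / WORKERS_PER_CONTROLLER)

print(max_controllers_for(800))   # 10, matching the default
print(max_controllers_for(2000))  # 25
```

Rounding up matters here: a target of 801 nodes already needs an eleventh Controller.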
Line 21 - ProfileName: This is used for your logical tracking, in the event you configure multiple profiles.
Lines 31-34 - InstanceTypes: Here in the Worker section, this refers to On-Demand instances: if there is no EC2 Spot availability, these are the instance types you want to run on.
Lines 38-43 - SpotFleetTypes: Here in the Worker section, this refers to EC2 Spot instance types. Because of the discounts, you may be comfortable with a much broader range of instance types. More types and families here mean more opportunities for cost optimization.
Priorities can be managed by appending a : and an integer, e.g. m5:1 is a higher priority than c5:0.
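To make the priority suffix concrete, here is a small Python sketch that splits entries such as m5:1 and orders them so that the larger integer comes first, as described above. The function names are hypothetical, and treating an entry without a suffix as priority 0 is an assumption made for this example, not documented backend behavior.

```python
def parse_priority(entry, default=0):
    """Split 'm5:1' into ('m5', 1); entries without ':' get the assumed default."""
    name, sep, prio = entry.partition(":")
    return name, int(prio) if sep else default

def by_priority(entries):
    """Order entries from highest to lowest priority integer."""
    parsed = (parse_priority(e) for e in entries)
    return sorted(parsed, key=lambda pair: pair[1], reverse=True)

print(by_priority(["c5:0", "m5:1"]))  # [('m5', 1), ('c5', 0)]
```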
Line 48 - EnableHyperthreading: Set to False to disable hyperthreading.
Line 53 - NodeGroupName: This string appears in Controller Name tagging: <profile>-NGN-count
All other fields/lines can be ignored in the asset.