# Exostellar Terraform Modules


Modules for deploying Exostellar resources using Terraform.

## [Partner Branch Documentation](docs/partner-branch.md)


## Usage


### 1. Existing Cluster Flow

Sets up the Exostellar environment on an existing EKS cluster.

For additional information, refer to [existing-cluster-flow](examples/existing-cluster-flow/README.md).

#### Steps

1. Import the module:

    ```hcl
    module "existing_cluster_flow" {
        source = "git::ssh://git@github.com/Exostellar/terraform-exostellar-modules//modules/existing-cluster-full?ref=v0.0.5"

        eks_cluster      = "my-exostellar-cluster"
        aws_region       = "us-east-1"
        ems_ami_id       = "ami-XXXXXXXXXXXXXXXXX"
        ssh_key_name     = "my-ssh-key-pair-name"
        license_filepath = "/path/to/exo-license-file"
    }
    ```
    > Note: This is a minimal example with only the mandatory fields. For the full list of inputs and other useful
    > details, refer to the [existing-cluster-flow example](./examples/existing-cluster-flow/main.tf).
    - The `license_filepath` field is optional. If you provide an empty string (`""`), the Terraform modules will skip
      adding the license. However, a valid license is required for the Exostellar Management Server (EMS) to operate.
      You can add the license later by logging in to the EMS UI and uploading it from the **Settings** page.
    - The `ssh_key_name` field is also optional. Pass an empty string (`""`) to omit it.
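
    Because an empty `license_filepath` is accepted, a mistyped path is the more likely mistake. The following is a
    minimal preflight sketch that you could run before `terraform plan`. The helper name `check_license_path` is
    hypothetical and is not part of the modules:

    ```shell
    # Hypothetical preflight helper; not shipped with the modules.
    # Returns 0 when the path is empty (license skipped) or points to a readable file.
    check_license_path() {
        license_path="$1"
        if [ -z "$license_path" ]; then
            echo "license_filepath is empty: upload the license from the EMS UI later"
            return 0
        fi
        if [ -r "$license_path" ]; then
            echo "license file found: $license_path"
            return 0
        fi
        echo "license file not readable: $license_path" >&2
        return 1
    }
    ```

    For example, `check_license_path "/path/to/exo-license-file" && terraform plan -input=false`.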

2. Prerequisite Configurations:

    - Configure AWS CNI:

        1. Prevent AWS CNI from running on `x-compute` nodes by adding the following condition to its DaemonSet. For
           details and steps, refer to [Configure-AWS-CNI.md](docs/Configure-AWS-CNI.md).

            ```
            {"key": "eks.amazonaws.com/nodegroup","operator": "NotIn","values": ["x-compute"]}
            ```

        2. Disable `AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG` on the `aws-node` container in the AWS CNI DaemonSet. For
           details and steps, refer to [Configure-AWS-CNI.md](docs/Configure-AWS-CNI.md).

    - Configure AWS CSI:

        1. Prevent AWS CSI from running on `x-compute` nodes by adding the following condition to its DaemonSet. For
           details and steps, refer to [Configure-AWS-CSI.md](docs/Configure-AWS-CSI.md).

            ```
            {"key": "eks.amazonaws.com/nodegroup","operator": "NotIn","values": ["x-compute"]}
            ```

        2. Annotate the EBS CSI driver's service account with the IAM role ARN for IRSA, then restart the EBS CSI
           driver add-on (or its pods, if it was installed using the Helm chart). For details and steps, refer to
           [Configure-AWS-CSI.md](docs/Configure-AWS-CSI.md).

            ```
            eks.amazonaws.com/role-arn: arn:aws:iam::<aws-account-id>:role/<cluster-name>-ebs-csi
            ```
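
    The prerequisite changes above can be applied with `kubectl`. The following is a minimal sketch, not the documented
    procedure. It assumes the default EKS object names in `kube-system` (`aws-node` and `ebs-csi-node` DaemonSets,
    `ebs-csi-controller` Deployment, `ebs-csi-controller-sa` service account) and that each DaemonSet already has a
    node-affinity term to append to. The helper names are hypothetical; the linked docs remain the reference.

    ```shell
    # Hypothetical helpers; object names assume a default EKS add-on install.
    KUBECTL="${KUBECTL:-kubectl}"

    # JSON patch that appends the x-compute exclusion to the first node-affinity term.
    x_compute_patch() {
        printf '%s' '[{"op":"add","path":"/spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/-","value":{"key":"eks.amazonaws.com/nodegroup","operator":"NotIn","values":["x-compute"]}}]'
    }

    # IAM role ARN used for IRSA, following the <cluster-name>-ebs-csi pattern above.
    ebs_csi_role_arn() { # usage: ebs_csi_role_arn <aws-account-id> <cluster-name>
        printf 'arn:aws:iam::%s:role/%s-ebs-csi' "$1" "$2"
    }

    # Keep AWS CNI off x-compute nodes and disable custom networking.
    configure_cni() {
        "$KUBECTL" -n kube-system patch daemonset aws-node --type=json -p "$(x_compute_patch)" &&
            "$KUBECTL" -n kube-system set env daemonset/aws-node AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=false
    }

    # Keep AWS CSI off x-compute nodes, annotate the service account, restart the controller.
    configure_csi() { # usage: configure_csi <aws-account-id> <cluster-name>
        "$KUBECTL" -n kube-system patch daemonset ebs-csi-node --type=json -p "$(x_compute_patch)" &&
            "$KUBECTL" -n kube-system annotate serviceaccount ebs-csi-controller-sa --overwrite \
                "eks.amazonaws.com/role-arn=$(ebs_csi_role_arn "$1" "$2")" &&
            "$KUBECTL" -n kube-system rollout restart deployment ebs-csi-controller
    }
    ```

    For example, `configure_cni && configure_csi 123456789012 my-exostellar-cluster`. Setting `KUBECTL=echo` prints
    the commands instead of running them.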

3. Helm login to public ECR:

    - Run the following AWS CLI command to retrieve the authentication token required by Helm to access Amazon ECR
      (Elastic Container Registry) Public. This step is necessary because the Terraform modules use the following Helm
      charts published from the public ECR:

        1. Exostellar’s Karpenter (xKarpenter)

        2. Exostellar’s CNI (Exo-CNI)

        3. Exostellar’s CSI (Exo-CSI)

        > Amazon ECR Public still uses AWS authentication, even for public repositories. The repositories are
        > accessible to all AWS users.

        ```sh
        aws ecr-public get-login-password --region "us-east-1" \
            | helm registry login --username "AWS" --password-stdin "public.ecr.aws"
        ```

        > Permissions needed: [helm-auth-policy.json](policy/helm-auth-policy.json). Refer to the
        > [IAM Permissions](#iam-permissions) section for more details.

4. To deploy the existing-cluster-flow, run the following commands:

    ```bash
    terraform init
    terraform plan -input=false
    terraform apply -auto-approve
    ```

    > Permissions needed: [terraform-modules-policy.json](policy/terraform-modules-policy.json). Refer to the
    > [IAM Permissions](#iam-permissions) section for more details.
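
    If you prefer to review the plan before applying it, one common variation is to save the plan to a file and apply
    exactly that file. The following is a minimal sketch using standard Terraform CLI flags. The function name
    `deploy_existing_cluster_flow` and the plan file name `tfplan` are arbitrary choices, not something the modules
    require:

    ```shell
    # TERRAFORM can be overridden (for example with `echo`) for a dry run.
    TERRAFORM="${TERRAFORM:-terraform}"

    # Initializes the working directory, saves a plan, then applies that saved plan.
    deploy_existing_cluster_flow() {
        "$TERRAFORM" init -input=false &&
            "$TERRAFORM" plan -input=false -out=tfplan &&
            "$TERRAFORM" apply -input=false tfplan
    }
    ```

    Applying a saved plan file does not prompt for approval, so `-auto-approve` is not needed in this variation.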

5. To destroy the existing-cluster-flow, run the following command:

    ```bash
    terraform destroy -auto-approve -refresh=true
    ```

6. (Optional) Revert the prerequisite configurations from step 2.

### 2. Standalone Flow

Deploys an EKS cluster and sets up the Exostellar environment on it.

For additional information, refer to [standalone-flow](examples/standalone-flow/README.md).

> [!WARNING]
> The standalone deployment flow is currently outdated and does not incorporate the latest module changes. It is still
> under development and may not reliably deploy all required resources.


## Examples


- [existing-cluster-flow](./examples/existing-cluster-flow/README.md) - This module deploys all Exostellar resources
onto an existing cluster. The provided example includes details about all inputs to the module, mandatory and optional.


## IAM Permissions


The IAM (Identity and Access Management) policies below grant only the access required by the Terraform modules and
related operations (such as the Helm login to public ECR):

<details>

<summary>Terraform Modules Policy</summary>

Attach the IAM policy ([terraform-modules-policy.json](policy/terraform-modules-policy.json)) to the IAM user or role
used by the AWS CLI. This policy grants Terraform the necessary permissions to deploy and destroy Exostellar’s Terraform
modules.

> ⚠️ **Warning**<br>
> Any modifications to this policy could cause internal application failures. Changes are not recommended, but if
> necessary, proceed with caution.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DeleteSecurityGroup",
                "ec2:DescribeImages",
                "ec2:DescribeInstanceAttribute",
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeTags",
                "ec2:DescribeVolumes",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcs",
                "ec2:ModifyInstanceAttribute",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RunInstances",
                "ec2:TerminateInstances"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "eks:AssociateAccessPolicy",
                "eks:CreateAccessEntry",
                "eks:DeleteAccessEntry",
                "eks:DescribeAccessEntry",
                "eks:DescribeCluster",
                "eks:DisassociateAccessPolicy",
                "eks:ListAssociatedAccessPolicies"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:AddRoleToInstanceProfile",
                "iam:AttachRolePolicy",
                "iam:CreateInstanceProfile",
                "iam:CreateOpenIDConnectProvider",
                "iam:CreatePolicy",
                "iam:CreateRole",
                "iam:DeleteInstanceProfile",
                "iam:DeleteOpenIDConnectProvider",
                "iam:DeletePolicy",
                "iam:DeleteRole",
                "iam:DeleteRolePolicy",
                "iam:DetachRolePolicy",
                "iam:GetInstanceProfile",
                "iam:GetOpenIDConnectProvider",
                "iam:GetPolicy",
                "iam:GetPolicyVersion",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "iam:ListAttachedRolePolicies",
                "iam:ListInstanceProfilesForRole",
                "iam:ListPolicyVersions",
                "iam:ListRolePolicies",
                "iam:PassRole",
                "iam:PutRolePolicy",
                "iam:RemoveRoleFromInstanceProfile",
                "iam:TagInstanceProfile",
                "iam:TagPolicy",
                "iam:TagRole"
            ],
            "Resource": "*"
        }
    ]
}
```
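
One way to create and attach this policy with the AWS CLI is sketched below. It assumes the Terraform caller is an IAM
role (for an IAM user, `aws iam attach-user-policy` is the equivalent call). The function names, and the policy and
role names in the usage note, are hypothetical.

```shell
# AWS can be overridden (for example with `echo`) for a dry run.
AWS="${AWS:-aws}"

# ARN of a customer-managed policy in the given account.
policy_arn() { # usage: policy_arn <aws-account-id> <policy-name>
    printf 'arn:aws:iam::%s:policy/%s' "$1" "$2"
}

# Creates the policy from a local JSON file and attaches it to a role.
create_and_attach_policy() { # usage: <aws-account-id> <policy-name> <policy-file> <role-name>
    "$AWS" iam create-policy --policy-name "$2" --policy-document "file://$3" &&
        "$AWS" iam attach-role-policy --role-name "$4" --policy-arn "$(policy_arn "$1" "$2")"
}
```

For example, `create_and_attach_policy 123456789012 exo-terraform-modules policy/terraform-modules-policy.json
my-terraform-role`. The same helper can be pointed at `policy/helm-auth-policy.json`.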

</details>

<details>

<summary>Helm Auth Policy</summary>

Attach the IAM policy ([helm-auth-policy.json](policy/helm-auth-policy.json)) to the IAM user or role used by the AWS
CLI. This grants AWS CLI the permission to retrieve the auth token required for Helm to access the public ECR and fetch
the Exostellar Helm charts.

```json
{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": [
				"ecr-public:GetAuthorizationToken"
			],
			"Resource": "*"
		},
		{
			"Effect": "Allow",
			"Action": [
				"sts:GetServiceBearerToken"
			],
			"Resource": "*"
		}
	]
}
```

</details>

<details>

<summary>Info: Exostellar IAM Policies</summary>

> ℹ️ **Note**<br>
> These policies are for informational purposes only. Terraform modules create these as a part of deployment. No action
> is required from the user.

The following IAM policies are provisioned by the Terraform modules to ensure secure and appropriate access for
individual Exostellar components.

<details>

<summary>1. Exostellar Management Server (EMS) Policy</summary>

The following IAM policy [exostellar-management-server.json](modules/ems/policy/exostellar-management-server.json) will
be created for Exostellar Management Server (EMS)'s secure access:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:RunInstances",
                "ec2:StopInstances",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeTags",
                "ec2:CreateTags",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:DeleteLaunchTemplate",
                "ec2:TerminateInstances",
                "ec2:AssignPrivateIpAddresses",
                "ec2:UnassignPrivateIpAddresses",
                "ec2:AttachNetworkInterface",
                "ec2:DetachNetworkInterface",
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:ModifyNetworkInterfaceAttribute",
                "ec2:DescribeRegions"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:CreateServiceLinkedRole",
                "iam:ListRoles",
                "iam:ListInstanceProfiles",
                "iam:PassRole",
                "iam:GetRole"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeImages",
                "ec2:DescribeImageAttribute",
                "ec2:DescribeKeyPairs",
                "ec2:DescribeInstanceTypeOfferings",
                "iam:GetInstanceProfile",
                "iam:SimulatePrincipalPolicy",
                "sns:Publish",
                "ssm:GetParameters",
                "ssm:GetParametersByPath"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateVolume",
                "ec2:DescribeVolumes",
                "ec2:AttachVolume",
                "ec2:ModifyInstanceAttribute",
                "ec2:DetachVolume",
                "ec2:DeleteVolume"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateInstanceExportTask",
                "ec2:DescribeExportTasks",
                "ec2:RebootInstances",
                "ec2:CreateSnapshot",
                "ec2:DescribeSnapshots",
                "ec2:LockSnapshot",
                "ec2:CopySnapshot",
                "ec2:DeleteSnapshot",
                "kms:DescribeKey",
                "kms:Encrypt",
                "kms:CreateGrant",
                "kms:ListGrants",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:RevokeGrant",
                "kms:GenerateDataKey*"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary>2. X-spot Controller Policy</summary>

The following IAM policy [xspot-controller.json](modules/iam/policy/xspot-controller.json) will be created for X-spot
controller's secure access:


```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances",
        "ec2:StopInstances",
        "ec2:DescribeSpotPriceHistory",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstanceStatus",
        "ec2:DescribeTags",
        "ec2:CreateTags",
        "ec2:CreateFleet",
        "ec2:CreateLaunchTemplate",
        "ec2:DeleteLaunchTemplate",
        "ec2:TerminateInstances",
        "ec2:AssignPrivateIpAddresses",
        "ec2:UnassignPrivateIpAddresses",
        "ec2:AttachNetworkInterface",
        "ec2:DetachNetworkInterface",
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:ModifyNetworkInterfaceAttribute",
        "ec2:DescribeRegions",
        "ec2:CreateVolume",
        "ec2:DescribeVolumes",
        "ec2:AttachVolume",
        "ec2:ModifyInstanceAttribute",
        "ec2:DetachVolume",
        "ec2:DeleteVolume",
        "ec2:CreateInstanceExportTask",
        "ec2:DescribeExportTasks",
        "ec2:RebootInstances",
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "iam:CreateServiceLinkedRole",
        "iam:ListRoles",
        "iam:ListInstanceProfiles",
        "iam:PassRole",
        "iam:GetRole",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeImages",
        "ec2:DescribeKeyPairs",
        "ec2:DescribeInstanceTypeOfferings",
        "iam:GetInstanceProfile",
        "iam:SimulatePrincipalPolicy",
        "sns:Publish",
        "ssm:GetParameters",
        "ssm:GetParametersByPath",
        "kms:DescribeKey",
        "kms:Encrypt",
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:RevokeGrant",
        "kms:GenerateDataKey*"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "eks:DescribeCluster"
      ],
      "Resource": "*"
    }
  ]
}
```

</details>

<details>

<summary>3. X-spot Worker Policy</summary>

The following IAM policy [xspot-worker.json](modules/iam/policy/xspot-worker.json) will be created for X-spot worker's
secure access:


```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "ec2:UnassignPrivateIpAddresses"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:ModifyInstanceMetadataOptions",
        "eks:DescribeCluster"
      ],
      "Resource": "*"
    }
  ]
}
```

</details>

<details>

<summary>4. Exostellar's Karpenter (xKarpenter) Policy</summary>

The following IAM policy [karpenter-policy.json](modules/karpenter/policy/karpenter-policy.json) will be created for
Exostellar's Karpenter (xKarpenter)'s secure access:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

<details>

<summary>5. Exo Node Controller Policy</summary>

The following IAM policy [exo-node-controller-policy.json](modules/karpenter/policy/exo-node-controller-policy.json)
will be created for Exo Node Controller's secure access:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "eks:DescribeCluster"
            ],
            "Resource": "*"
        }
    ]
}
```

</details>

</details>


## Using AWS SSM


### What is AWS SSM?

AWS Systems Manager (SSM) is a service that lets you securely manage and automate tasks on your EC2 instances and other
AWS resources. It uses the SSM Agent (installed on instances) to allow:

- Remote access to instances without SSH or bastion hosts.
- Command execution (e.g., patching, updates, or custom scripts).
- Automation via predefined or custom SSM documents.
- Centralized management with logging and auditing in CloudTrail.

In short, SSM replaces SSH with a secure, IAM-based, auditable way to manage your infrastructure.

For more info, refer to the doc: [What is AWS Systems Manager?][AWS Systems Manager Docs]

### How to use SSM?

Exostellar Management Server (EMS) supports SSM out of the box: the SSM Agent is preinstalled in the EMS AMI.

To enable SSM in the existing cluster flow, set the `ssm_enabled` flag in `examples/existing-cluster-flow/main.tf`:

```hcl
module "existing_cluster_flow" {
    source = "git::ssh://git@github.com/Exostellar/terraform-exostellar-modules//modules/existing-cluster-full?ref=v0.0.5"

    eks_cluster      = "my-exostellar-cluster"
    aws_region       = "us-east-1"
    ems_ami_id       = "ami-XXXXXXXXXXXXXXXXX"
    license_filepath = "/path/to/exo-license-file"

    # Change the SSM and SSH related fields as follows:
    ssm_enabled  = true
    ssh_key_name = "" # Even if a value is passed, it is ignored because ssm_enabled = true
}
```

Setting `ssm_enabled = true` does the following:
1. Disables the ingress rule for SSH (port 22) in the security group of the EMS EC2 instance.
2. Removes the SSH key pair from the EMS EC2 instance.
   > Note: In this case, even if `ssh_key_name` is set, it will be ignored.
3. Attaches the `AmazonSSMManagedInstanceCore` policy to the EMS role.
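
Once the instance is managed by SSM, a shell session can be opened without SSH. The following is a minimal sketch. It
assumes the Session Manager plugin for the AWS CLI is installed locally, that the caller has IAM permission for
`ssm:StartSession` (not included in the policies in this README), and that you already know the EMS instance ID. The
function name and the instance ID in the usage note are placeholders.

```shell
# AWS can be overridden (for example with `echo`) for a dry run.
AWS="${AWS:-aws}"

# Opens an interactive Session Manager shell on the given instance.
ems_ssm_session() { # usage: ems_ssm_session <instance-id> [aws-region]
    "$AWS" ssm start-session --target "$1" --region "${2:-us-east-1}"
}
```

For example, `ems_ssm_session i-0123456789abcdef0 us-east-1`.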

> [!WARNING]
> SSM is not yet supported in the standalone flow. Support will be added soon.


## Force Clean-up


> [!WARNING]
> The force clean-up script, [clean-up-exostellar-module-resources.sh](scripts/clean-up-exostellar-module-resources.sh),
> is currently outdated and does not reflect the latest changes to the modules. It is still a work in progress and may
> not reliably clean up all resources.


## Version Matrix


The following table maps each `terraform-exostellar-modules` version to the compatible versions of the other Exostellar
components. It is recommended to upgrade to the latest `terraform-exostellar-modules` version and the corresponding
components.

| [Exostellar Terraform Modules][Modules Release Page] | [Exostellar Management Server (EMS)][Marketplace Release Page] | [Xspot Controller][Marketplace Release Page] | [Xspot Worker][Marketplace Release Page] | [Exostellar's Karpenter (xKarpenter)][xkarpenter Release Page] | [Exostellar's CNI (Exo-CNI)][Exo-CNI Release Page] | [Exostellar's CSI (Exo-CSI)][Exo-CSI Release Page] |
|:----------------------------:|:----------------------------------:|:----------------:|:------------:|:-----------------------------------:|:--------------------------:|:--------------------------:|
| v0.0.1 | v2.2.2+ <br>(Exostellar Release 2.5.2) | v3.2.1+ <br>(Exostellar Release 2.5.2) | v3.2.1+ <br>(Exostellar Release 2.5.2) | v2.0.2+ | v1.19.0 | v1.37.0 |
| v0.0.2 | v2.2.4+ <br>(Exostellar Release 2.5.3) | v3.2.2+ <br>(Exostellar Release 2.5.3) | v3.2.2+ <br>(Exostellar Release 2.5.3) | v2.0.3+ | v1.20.0 | v1.46.0 |
| v0.0.3 | v2.2.5+ <br>(Exostellar Release 2.5.4) | v3.2.3+ <br>(Exostellar Release 2.5.4) | v3.2.3+ <br>(Exostellar Release 2.5.4) | v2.0.4+ | v1.20.0 | v1.46.0 |
| v0.0.4 | v2.3.3+ <br>(Exostellar Release 2.6.3) | v3.3.2+ <br>(Exostellar Release 2.6.3) | v3.3.2+ <br>(Exostellar Release 2.6.3) | v2.0.5+ | v1.20.0 | v1.46.0 |
| v0.0.5 | v2.4.0+ <br>(Exostellar Release 2.7.0) | v3.4.0+ <br>(Exostellar Release 2.7.0) | v3.4.0+ <br>(Exostellar Release 2.7.0) | v2.0.6+ | v1.20.0 | v1.46.0 |

Refer to the [release information page][Marketplace Release Page] for AMI details of Exostellar Management Server (EMS),
Xspot Controller, and Xspot Worker.


## Authors


This module is maintained by [Exostellar][Exostellar].


## License


All rights reserved. Copyright 2023 [Exostellar Inc][Exostellar homepage].

---

[Exostellar]: https://github.com/exostellar/
[Exostellar homepage]: https://exostellar.io/
[AWS Systems Manager Docs]: https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html
[Marketplace Release Page]: https://exostellar.atlassian.net/wiki/spaces/ENG/pages/314638358/EAR+Non-Marketplace+Release+Information
[Modules Release Page]: https://github.com/exostellar/terraform-exostellar-modules/releases
[xkarpenter Release Page]: https://github.com/exostellar/karpenter/releases
[Exo-CNI Release Page]: https://github.com/Exostellar/amazon-vpc-cni-k8s/releases
[Exo-CSI Release Page]: https://github.com/Exostellar/aws-ebs-csi-driver/releases
