Scaling Helpers in AWS EC2

Overview

This topic explains the mechanics involved in scaling cloud helpers automatically using Autoscaling Groups, for Incredibuild for Windows and Incredibuild for Linux. The process involves these steps:

  1. Create Security Groups

  2. Select or create a Helper AMI

  3. Create a Helper Startup Script (one or more)

  4. Create a Launch Template (one or more)

  5. Create an Autoscaling Group (one or more)

Because of the way autoscaling works, you will need to have a single Helper machine alive in the VPC at all times. The Autoscaling Group spins up machines based on the load on the entire group, so one machine needs to be available as a bootstrap.

Requirements

To use autoscaling you must have:

  • An AWS account with sufficient EC2 and Auto Scaling permissions (see Permissions Required in AWS).

  • Network access between your Coordinator and Initiators and the VPC where the Helpers will run, over all relevant TCP ports (see System Requirements).

  • If your environment is hybrid, see specific network requirements in Hybrid Environment Considerations.

It is assumed that:

  • You have a working Incredibuild environment, either on-prem or in your VPC.

  • You have a network connection between your Coordinator and Initiators to/from your VPC.

  • You are experienced with Incredibuild technology.

  • You are well versed in managing Amazon EC2 instances.

Step 1: Create Security Groups

The Helper you spin up needs to allow incoming traffic from the Coordinator and Initiators.

The command below creates a dedicated Security Group that allows access to the necessary ports. You can remove the RDP (3389), SSH (22) and ICMP rules once the environment is operational.

Incredibuild for Windows

Use the following command:

VPC="vpc-a5ac07cc"

JSON=$(aws ec2 create-security-group --group-name "HelperSecGrp" \
       --description "Security Group for Cloud Helpers" \
       --vpc-id "$VPC")
echo "$JSON"
SEC=$(jq -r .GroupId <<< "$JSON")
aws ec2 authorize-security-group-ingress \
  --group-id "$SEC" \
  --ip-permissions \
    '{"FromPort":31105,"IpProtocol":"tcp","ToPort":31105,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":31106,"IpProtocol":"tcp","ToPort":31170,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":3389,"IpProtocol":"tcp","ToPort":3389,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":22,"IpProtocol":"tcp","ToPort":22,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":-1,"IpProtocol":"icmp","ToPort":-1,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}'

Incredibuild for Linux

Use the following command:

VPC="vpc-a5ac07cc"

JSON=$(aws ec2 create-security-group --group-name "HelperSecGrp" \
       --description "Security Group for Cloud Helpers" \
       --vpc-id "$VPC")
echo "$JSON"
SEC=$(jq -r .GroupId <<< "$JSON")
aws ec2 authorize-security-group-ingress \
  --group-id "$SEC" \
  --ip-permissions \
    '{"FromPort":2088,"IpProtocol":"tcp","ToPort":2088,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":2089,"IpProtocol":"tcp","ToPort":2089,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":3389,"IpProtocol":"tcp","ToPort":3389,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":22,"IpProtocol":"tcp","ToPort":22,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}' \
    '{"FromPort":-1,"IpProtocol":"icmp","ToPort":-1,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}'
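
Both commands above open every port to 0.0.0.0/0. As a minimal sketch, assuming your Coordinator and Initiators sit in a known address range, the same rules can be generated for that range only. The ingress_rule helper and the 10.0.0.0/16 range are hypothetical and are not part of Incredibuild or the AWS CLI.

```shell
# Hypothetical helper: print one --ip-permissions entry in the same JSON
# shape used above, limited to a single CIDR range.
# Usage: ingress_rule PROTOCOL FROM_PORT TO_PORT CIDR
ingress_rule() {
    printf '{"FromPort":%s,"IpProtocol":"%s","ToPort":%s,"IpRanges":[{"CidrIp":"%s"}]}\n' \
        "$2" "$1" "$3" "$4"
}

# Assumption: 10.0.0.0/16 stands in for the address range of your
# Coordinator and Initiators.
TRUSTED_CIDR="10.0.0.0/16"

# Incredibuild for Windows Helper ports, as listed above.
ingress_rule tcp 31105 31105 "$TRUSTED_CIDR"
ingress_rule tcp 31106 31170 "$TRUSTED_CIDR"
```

Each printed line can be passed as one quoted argument after --ip-permissions, for example "$(ingress_rule tcp 31105 31105 "$TRUSTED_CIDR")".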

Step 2: Select or Create a Helper AMI

Incredibuild for Windows using Windows-based Helpers

Use the Windows Server 2022 Base AMI for new instances (amazon/Windows_Server-2022-English-Full-Base-2024.10.09).

Instances launched from the Amazon AMI for Windows get unique host names. If you use your own AMI instead, the launched instances inherit the host name from the AMI, which is problematic for Incredibuild, so make sure to include some method for renaming the instances.

AMI="ami-06f0776c761daa76e"

Incredibuild for Windows using Linux-based Helpers (with SCL)

To improve performance, you will need a custom AMI that has Docker and the SCL container image.

  1. Create a small instance based on Linux with Docker, for example, amzn2-ami-ecs-hvm-2.0.20241120-x86_64-ebs.

  2. SSH into the instance, and pull the Incredibuild container image. Do not create (run) a container as this will be done by the Autoscaling Group. Use the following command:

    docker image pull public.ecr.aws/incredibuild/scl:latest_10.24.0

  3. Shut down the instance and create an AMI (replace the value of BASE_INSTANCE_ID with the ID of that instance):

    BASE_INSTANCE_ID="i-006edbe8fb1d8f612"
    JSON=$(aws ec2 create-image \
        --instance-id "$BASE_INSTANCE_ID" --name "SCLHelperAMI")
    echo "$JSON"
    AMI=$(jq -r .ImageId <<< "$JSON")

  4. Wait until the image state is "available" (this should take about 15 minutes):

    aws ec2 describe-images --image-ids "$AMI" | jq -r ".Images[0].State"
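
The state check above has to be repeated by hand until it prints "available". A minimal polling sketch follows; wait_for_state and image_state are hypothetical helper names, and the retry count and interval are arbitrary choices. The AWS CLI also ships a built-in waiter, aws ec2 wait image-available, which may be simpler if its default timeout suits you.

```shell
# Hypothetical helper: run a command repeatedly until it prints the wanted
# value, or give up after a number of attempts.
# Usage: wait_for_state WANTED TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_state() {
    want=$1; tries=$2; delay=$3
    shift 3
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        state=$("$@")
        if [ "$state" = "$want" ]; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$attempt" -lt "$tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Prints the state of the image in $AMI (same query as the command above).
image_state() {
    aws ec2 describe-images --image-ids "$AMI" | jq -r ".Images[0].State"
}
```

For example, wait_for_state available 40 30 image_state checks every 30 seconds and gives up after 40 attempts (roughly 20 minutes), returning a non-zero status if the image never becomes available.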

Incredibuild for Linux

Use the Ubuntu Server 24.04 LTS AMI for new instances.

AMI="ami-08eb150f611ca277f"

Step 3: Create a Helper Startup Script

The Launch Template you create will include the details about the image (Windows or Linux), the Security Groups for the Helper instances (so they can communicate with the Coordinator and Initiators) and a startup script (that downloads Incredibuild and installs the Helper software).

Use the following example to create a startup script for Windows or Linux and convert it to base64 encoding (stored in the USERDATA environment variable). Then create the Launch Template.

The following scripts assume that COORD contains the persistent IP address or FQDN of the Coordinator.

Incredibuild for Windows using Windows-based Helpers

Prepare the details for the template (the startup PowerShell script must be in base64 format).

Starting with Incredibuild 10.24, add the option /Agent:SingleUse=True to the ibsetup command.

Starting with Incredibuild 10.24, add the option /Agent:HelperAssignmentPriority to the ibsetup command, for support of hybrid setups (where Helpers exist both on-premises and in AWS).

COORD="192.168.1.1"

cat << EOF > boot.txt
    <PowerShell>
    \$COORD="$COORD"
    \$ProgressPreference = 'SilentlyContinue';
    invoke-webrequest https://dl.incredibuild.com/ib10-latest-silent -outfile C:\Users\Administrator\Downloads\ibsetup.exe;
    invoke-command -ScriptBlock { C:\Users\Administrator\Downloads\ibsetup.exe /install /COMPONENTS=Agent /AGENT:AGENTROLE=helper /AGENT:HELPERTYPE=floating /Agent:SingleUse=True /Agent:InstallAddins=off /COORDINATOR=\$COORD
    };
    </PowerShell>
    <persist>false</persist>
EOF
USERDATA=$(base64 -w 0 boot.txt)

Incredibuild for Windows using Linux-based Helpers (with SCL)

Prepare the details for the template (the startup bash script must be in base64 format). Use the same image tag that you pulled into the AMI in Step 2, so that the Helper does not download the image again at startup.

COORD="192.168.1.1"

cat << EOF > boot.txt
#! /bin/bash -x
    docker run --detach --restart unless-stopped \
           --hostname \$HOSTNAME \
           --name IncredibuildHelper \
           --env COORD_IP=$COORD \
           --network host \
           public.ecr.aws/incredibuild/scl:latest_10.24.0
EOF
USERDATA=$(base64 -w 0 boot.txt)

Incredibuild for Linux

Prepare the details for the template (the startup bash script must be in base64 format).

COORD="192.168.1.1"
cat << EOF > boot.txt
#! /bin/bash -x
    curl -L https://incredibuild.com/downloads/incredibuild_4.12.0.run -o ./ibsetup.run
    chmod a+x ./ibsetup.run
    sudo apt-get install -y bzip2
    sudo ./ibsetup.run --action install \
        --helper enabled --license-type SUVM \
        --data-dir /etc --coordinator-machine $COORD
    sudo /opt/incredibuild/etc/init.d/incredibuild start
EOF
USERDATA=$(base64 -w 0 boot.txt)
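
Before creating the Launch Template, it can help to confirm that the encoded script decodes back to what you wrote and that it fits the EC2 user data limit of 16 KB (measured before base64 encoding). The following is a minimal sketch; check_userdata is a hypothetical helper, not an Incredibuild or AWS tool, and it assumes GNU base64 as used in the scripts above.

```shell
# Hypothetical helper: verify a startup script and its base64 form.
# Usage: check_userdata SCRIPT_FILE BASE64_STRING
check_userdata() {
    file=$1; encoded=$2
    # Arithmetic expansion strips any padding that wc may print.
    size=$(($(wc -c < "$file")))
    # EC2 limits user data to 16 KB in raw form (before base64 encoding).
    if [ "$size" -gt 16384 ]; then
        echo "startup script is $size bytes, over the 16384-byte limit" >&2
        return 1
    fi
    # Decode and compare byte for byte with the original file.
    if ! printf '%s' "$encoded" | base64 -d | cmp -s - "$file"; then
        echo "decoded user data does not match $file" >&2
        return 1
    fi
    echo "ok: $size bytes"
}
```

With the variables defined above, the call would be check_userdata boot.txt "$USERDATA".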

Step 4: Create a Launch Template

Create a Launch Template, using the following sample AWS CLI command.

The following script assumes that USERDATA contains the base64-encoded startup script (as defined above), and that AMI is the image ID for the Helper instances you want to spin up (as defined above).

In the following script, replace the value of KEY with the name of the key pair to set the authentication to the Helper instances, replace the value of SEC with a list of security group IDs that allow access to the Helpers, and VOL with the size of the Helper disk (in GB).

KEY="HelperKey"
VOL="50"
JSON=$(aws ec2 create-launch-template \
    --launch-template-name "HelperTemplate" \
    --launch-template-data '{
        "ImageId":"'$AMI'",
        "UserData":"'$USERDATA'",
        "KeyName":"'$KEY'",
        "SecurityGroupIds":["'$SEC'"],
        "BlockDeviceMappings":[{
            "DeviceName":"/dev/sda1",
            "Ebs":{
                "Encrypted":false,"DeleteOnTermination":true,"Iops":3000,
                "VolumeSize":'$VOL',"VolumeType":"gp3","Throughput":125
            }
        }]
    }')
echo "$JSON"
TMPLID=$(jq -r .LaunchTemplate.LaunchTemplateId <<< "$JSON")

The Launch Template needs to connect to the same Security Groups and Subnet as the original Helper.
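
The template above inserts $SEC as a single ID. If you need several security group IDs, as the text allows, the JSON array must contain one quoted element per ID. The following is a minimal sketch, assuming the IDs are kept space-separated in a variable; json_id_array, SEC_LIST and SEC_JSON are hypothetical names, and the two IDs are made up.

```shell
# Hypothetical helper: turn a list of IDs into a JSON array of strings.
# Usage: json_id_array ID [ID...]
json_id_array() {
    out=""
    for id in "$@"; do
        # Append a comma before every element except the first.
        out="${out:+$out,}\"$id\""
    done
    printf '[%s]\n' "$out"
}

# Example with two made-up security group IDs.
SEC_LIST="sg-0123456789abcdef0 sg-0fedcba9876543210"
# $SEC_LIST is left unquoted on purpose so that it splits into separate IDs.
SEC_JSON=$(json_id_array $SEC_LIST)
echo "$SEC_JSON"
```

In the launch template data, the value ["'$SEC'"] would then become '"$SEC_JSON"', so that the whole array is substituted.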

Step 5: Create an Autoscaling Group

The Autoscaling Group you create will control the instance types you want (minimum/maximum vCPUs, memory, etc.), the distribution between on-demand and spot instances (including further options for selecting the instance types for spot instances to balance cost vs. performance), the range of vCPUs in the group (including thresholds for when to spin up/down), and optionally a warm pool.

Create an Autoscaling Group, using the following sample AWS CLI command.

The following script assumes that TMPLID contains the Launch Template ID (as defined above).

In the following script, replace the value of SUBNET with the ID of a network subnet where your helpers can be reached from the Coordinator and Initiators, and change the various Autoscaling Group options to match your grid acceleration requirements.

SUBNET=subnet-72b37809
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "HelperASG" \
    --vpc-zone-identifier "$SUBNET" \
    --health-check-grace-period "300" \
    --default-instance-warmup "-1" \
    --health-check-type "EC2" \
    --desired-capacity-type "vcpu" \
    --max-size "4" --min-size "4" --desired-capacity "4" \
    --mixed-instances-policy '{
        "InstancesDistribution":{
            "OnDemandAllocationStrategy":"lowest-price",
            "OnDemandBaseCapacity":75,
            "OnDemandPercentageAboveBaseCapacity":25,
            "SpotAllocationStrategy":"price-capacity-optimized"
        },
        "LaunchTemplate":{
            "LaunchTemplateSpecification":{
                "LaunchTemplateId":"'$TMPLID'",
                "Version":"$Default"
            },
            "Overrides":[{
                "InstanceRequirements":{
                    "VCpuCount":{"Min":4,"Max":8},
                    "MemoryMiB":{"Min":0},
                    "MemoryGiBPerVCpu":{"Min":2,"Max":2}
                }
            }]
        }
    }'
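
The sample above sets --min-size, --max-size and --desired-capacity to the same value, which keeps the group at a fixed size until you change them. The following minimal sketch checks a set of capacity values before you create the group; check_capacity is a hypothetical helper, and the upper bound of 64 vCPUs is only an example value.

```shell
# Hypothetical helper: sanity-check capacity values before creating the group.
# Usage: check_capacity MIN DESIRED MAX
check_capacity() {
    min=$1; desired=$2; max=$3
    if [ "$min" -gt "$desired" ] || [ "$desired" -gt "$max" ]; then
        echo "invalid: need MIN <= DESIRED <= MAX" >&2
        return 1
    fi
    if [ "$min" -eq "$max" ]; then
        echo "fixed size: the group stays at $min and a scaling policy cannot change it"
    else
        echo "scalable: the group can move between $min and $max"
    fi
}

check_capacity 4 4 4     # the values used in the sample above
check_capacity 4 4 64    # assumption: 64 vCPUs as an example upper bound
```

Because the group uses --desired-capacity-type "vcpu", these numbers count vCPUs rather than instances.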

Set the scaling policy for the Autoscaling Group, using the following sample AWS CLI command.

By selecting the instance warmup and CPU utilization threshold you can balance cost and acceleration (see Performance Considerations).

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name HelperASG \
    --policy-name cpu50-target-tracking-scaling-policy \
    --policy-type TargetTrackingScaling \
    --estimated-instance-warmup "60" \
    --target-tracking-configuration '{
        "TargetValue": 50.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        }
    }'

Optionally, you can add a warm pool of instances to the Autoscaling Group, using the following sample AWS CLI command.

To use a warm pool, the Autoscaling Group must be limited to a single instance type, and the instance storage must be encrypted (set "Encrypted" to true in the Launch Template).

aws autoscaling put-warm-pool \
    --auto-scaling-group-name HelperASG \
    --pool-state Hibernated

You can apply additional policies to the Autoscaling group.

Hybrid Environment Considerations

If your Coordinator and Initiators run on-premises, then your environment is hybrid. There are a few key things to take care of so that your cloud Helpers will be able to function correctly and efficiently.

Open TCP Access to the Coordinator

The Helpers running in the cloud need access to your on-premises Coordinator over TCP port 8000, plus either port 32103 (if communication is encrypted, which is recommended) or port 31104 (if you leave communication unencrypted).

Make sure to open these ports for incoming traffic in your network firewall.
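
Once the firewall is configured, the ports can be probed from a cloud Helper. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the probe relies on bash's /dev/tcp feature and the coreutils timeout command being present on the Helper, and a successful TCP connection only shows that the port is open, not that the Coordinator service is healthy.

```shell
# Hypothetical helper: list the Coordinator ports a cloud Helper must reach.
# Usage: coordinator_ports yes|no   (yes = encrypted communication)
coordinator_ports() {
    if [ "$1" = "yes" ]; then
        echo "8000 32103"
    else
        echo "8000 31104"
    fi
}

# Hypothetical helper: succeed if a TCP connection to HOST:PORT opens
# within 3 seconds.
# Usage: port_open HOST PORT
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2> /dev/null
}

# Usage: check_coordinator HOST yes|no
check_coordinator() {
    for port in $(coordinator_ports "$2"); do
        if port_open "$1" "$port"; then
            echo "$1:$port reachable"
        else
            echo "$1:$port NOT reachable"
        fi
    done
}
```

For an encrypted setup, the call would be check_coordinator "$COORD" yes, where COORD holds the Coordinator address used in the startup scripts.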

Set Helper Assignment Priority

To achieve the highest efficiency and save on cloud costs, give on-prem Helpers a lower assignment priority value than Helpers launched in the cloud VPC. A lower priority value means that the Coordinator chooses the on-prem Helpers over the cloud Helpers.

The simplest way to do this is to use the Coordinator Manager and set the Helper Assignment Priority of the on-prem Helpers to a lower value than that of the cloud Helpers.

Alternatively, the Incredibuild installer can set the priority during installation. Instead of changing the on-prem Helpers, add the /Agent:HelperAssignmentPriority option to the installation command in the Helper startup script (which is part of the Launch Template definition) so that the cloud Helpers are installed with a priority of 5.

Performance Considerations

When you upgrade your Coordinator, update the Launch Template to install the same version on the Helper, and update the Autoscaling Group to use the new Template version.

There are several ways to balance cost vs. performance (i.e. acceleration):

  • Reducing the instance warmup time makes the group launch new instances sooner, so they are available to accelerate builds earlier, but the added capacity may exceed what the current build jobs need.

  • Reducing the CPU utilization threshold has a similar effect: instances are launched at a lower load, which improves acceleration but raises cost.

If your build load happens during specific hours of the day, consider using a schedule to spin instances up and down so they will be available before they’re needed, without relying on measuring CPU load.
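
A schedule of this kind can be sketched with scheduled actions on the Autoscaling Group. The helper names, action names, times and capacity numbers below are all made up for illustration. Cron expressions are evaluated in UTC unless you add --time-zone, and the call needs the autoscaling:PutScheduledUpdateGroupAction permission, which does not appear in the permission list in this topic. The DRY_RUN switch is a hypothetical convenience that prints the commands instead of running them.

```shell
# Print the command instead of running it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: schedule_capacity ACTION_NAME CRON_EXPRESSION MIN MAX DESIRED
schedule_capacity() {
    run aws autoscaling put-scheduled-update-group-action \
        --auto-scaling-group-name HelperASG \
        --scheduled-action-name "$1" \
        --recurrence "$2" \
        --min-size "$3" --max-size "$4" --desired-capacity "$5"
}

# Preview only: print the two commands without calling AWS.
DRY_RUN=1
schedule_capacity helpers-up   "30 7 * * 1-5" 16 64 16   # weekdays 07:30 UTC
schedule_capacity helpers-down "0 19 * * 1-5" 4 4 4      # weekdays 19:00 UTC
```

The preview drops the quotes around the cron expression, so it is meant for reading rather than for pasting back into a shell. Setting DRY_RUN=0 and calling the functions again would apply the schedule.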

Autoscaling Group Maintenance

If you need to temporarily disable an Autoscaling Group, use the following command:

aws autoscaling suspend-processes \
    --auto-scaling-group-name HelperASG

Enable the Autoscaling Group with:

aws autoscaling resume-processes \
    --auto-scaling-group-name HelperASG

Permissions Required in AWS

The following are the minimal permissions required to run the above.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CancelImageLaunchPermission",
                "ec2:CreateImage",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateLaunchTemplateVersion",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DeleteLaunchTemplate",
                "ec2:DeleteLaunchTemplateVersions",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteTags",
                "ec2:DeregisterImage",
                "ec2:DescribeImageAttribute",
                "ec2:DescribeInstances",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeLaunchTemplateVersions",
                "ec2:DescribeSecurityGroupRules",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeTags",
                "ec2:DisableImage",
                "ec2:EnableImage",
                "ec2:GetLaunchTemplateData",
                "ec2:GetSecurityGroupsForVpc",
                "ec2:ModifyImageAttribute",
                "ec2:ModifyLaunchTemplate",
                "ec2:ModifySecurityGroupRules",
                "ec2:RebootInstances",
                "ec2:RegisterImage",
                "ec2:RunInstances",
                "ec2:StartInstances",
                "ec2:StopInstances",
                "ec2:TerminateInstances",
                "ec2:UpdateSecurityGroupRuleDescriptionsEgress",
                "ec2:UpdateSecurityGroupRuleDescriptionsIngress",
                "autoscaling:AttachInstances",
                "autoscaling:BatchPutScheduledUpdateGroupAction",
                "autoscaling:CreateAutoScalingGroup",
                "autoscaling:CreateLaunchConfiguration",
                "autoscaling:DeleteAutoScalingGroup",
                "autoscaling:DeleteLaunchConfiguration",
                "autoscaling:DeleteWarmPool",
                "autoscaling:ExecutePolicy",
                "autoscaling:PutScalingPolicy",
                "autoscaling:PutWarmPool",
                "autoscaling:ResumeProcesses",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:SuspendProcesses",
                "autoscaling:UpdateAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}