The Agent Settings page lets you modify the characteristics of the agent tasks: the CPU and Memory (for the AWS Fargate tasks), the VPC and Subnet(s), and the scaling characteristics (Scaling Threshold, Minimum # of Agents and Maximum # of Agents). It also gives you some insight into your overall deployment, because each region where scanning agents exist is displayed for both event and retro agents.
The page displays any region where either an event agent or a retro agent has been deployed. The Event Agents are shown in one of two colors, green or orange. Green indicates the agents in those regions are running with the default settings, whereas orange indicates the agents in those regions are running with custom characteristics. This may occur for many reasons, but one simple example is adapting regions to the load they see: US-East-1 may see the brunt of your object handling, so you tune the scaling and raise the Max Agents, whereas your other regions see much less traffic, so you turn on Smart Scan for those regions.
This does not indicate whether the agents are running or whether any buckets in those regions are being protected. It does indicate that buckets in those given regions were scanned at one point and the agent has been deployed there.
Agent Task Settings
Task Settings apply to the actual AWS Fargate Task (container) running the scanning agents. The default settings within the CloudFormation Template run the agents with 2 vCPU and 4 GB of Memory. This is suitable in 99% of cases, and certainly when you are getting started. You also set the values for Scaling Threshold, Min Agents and Max Agents during the CloudFormation Stack creation. Whether you've determined you need to tune individual regions or reset all regions to new defaults, this section lets you easily modify the running values of each of the scanning agents.
When you first enable a bucket for protection, you are prompted to select a VPC and two Subnets to run the agents in for that region. There may be times you want to change these values after the fact, which can easily be done on this page.
|Select Region||Allows you to set the fields as the global default or individually by region|
|Scaling Threshold||This is the value that determines the number of entries in the work queue before a scaling event happens. There are a number of considerations here that can affect what this value should be. Review the Sizing Discussion to get more details on how to think through this|
|Min Agents||The minimum number of scanning agents you'd like running by default or in the given region. This value can be
|Max Agents||The maximum number of scanning agents you would like to scale to. This number can be anything greater than or equal to the minimum (except 0).|
|CPU||The amount of vCPU you would like allocated to each agent. Each additional auto-scaled agent would also have this value.|
|Memory||The amount of memory you would like allocated to each agent. Each additional auto-scaled agent would also have this value.|
The Memory must always be set to a value that is between 2x and 8x of the vCPU.
vCPU = 1, then 3gb <= memValue <= 8gb
vCPU = 2, then 4gb <= memValue <= 16gb
vCPU = 3, then 6gb <= memValue <= 24gb
vCPU = 4, then 8gb <= memValue <= 32gb
The new minimum memory requirement for agents is 3gb. We no longer offer a 2gb memory choice as we saw unstable scanning results with this value.
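The memory rule above can be expressed as a small check. This is a minimal sketch, not the console's own validation: the names `MIN_MEMORY_GB`, `allowed_memory_range` and `is_valid_memory` are hypothetical, and the code only encodes the 2x to 8x rule plus the 3gb floor described here. It does not check which CPU and memory combinations the console or AWS Fargate actually offers.

```python
from typing import Tuple

# Hypothetical constant: the 2gb choice is no longer offered, so 3gb is the floor.
MIN_MEMORY_GB = 3


def allowed_memory_range(vcpu: int) -> Tuple[int, int]:
    """Return the (min_gb, max_gb) memory range for a given vCPU count."""
    if vcpu < 1:
        raise ValueError("vCPU must be at least 1")
    low = max(2 * vcpu, MIN_MEMORY_GB)  # 2x the vCPU, but never below 3gb
    high = 8 * vcpu                     # 8x the vCPU
    return low, high


def is_valid_memory(vcpu: int, memory_gb: int) -> bool:
    """Check a memory value against the range allowed for the vCPU count."""
    low, high = allowed_memory_range(vcpu)
    return low <= memory_gb <= high
```

For instance, `allowed_memory_range(1)` gives `(3, 8)` rather than `(2, 8)`, because of the 3gb minimum.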
You'll notice a Private label associated with each VPC. This is an indicator of whether or not the VPC is tied to an Internet Gateway. The reasoning is that with an Internet Gateway in place you will have the required outbound access. The console does not require a public IP address or to be accessible from the public in general, but it does require outbound internet access to reach AWS ECR to pull new Task images.
You'll notice a Restricted label associated with each Subnet. This is an indicator of whether or not the Subnet is outbound routable to the internet. At a minimum, the agent task must be able to reach AWS ECR to pull new Task images and must be able to pull AV signature updates. Two validations are performed for this check. First, we check whether the NACL associated with the Subnet(s) has outbound open for 0.0.0.0/0. Second, we check whether there is a custom Route Table in place and verify that it is routed to an internet gateway.
Generally, the first indicator for a good configuration will be whether or not the VPC has an internet gateway or something that is taking its place. Secondly, check the subnets for their outbound access.
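If you want to reason about the two subnet checks yourself, the sketch below approximates them over plain dictionaries. It is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, the input dictionaries are assumed to follow the shape of the EC2 `DescribeNetworkAcls` and `DescribeRouteTables` responses (`Entries`, `Egress`, `CidrBlock`, `RuleAction`, `Routes`, `DestinationCidrBlock`, `GatewayId`), and NACL rule ordering is deliberately ignored.

```python
from typing import Optional


def nacl_allows_outbound(nacl: dict) -> bool:
    """True if any egress entry allows traffic to 0.0.0.0/0.

    Simplification: rule numbers (evaluation order) are not considered.
    """
    return any(
        entry.get("Egress") is True
        and entry.get("CidrBlock") == "0.0.0.0/0"
        and entry.get("RuleAction") == "allow"
        for entry in nacl.get("Entries", [])
    )


def route_table_reaches_igw(route_table: dict) -> bool:
    """True if the default route points at an internet gateway (igw-...)."""
    return any(
        route.get("DestinationCidrBlock") == "0.0.0.0/0"
        and str(route.get("GatewayId", "")).startswith("igw-")
        for route in route_table.get("Routes", [])
    )


def subnet_looks_restricted(nacl: dict, custom_route_table: Optional[dict] = None) -> bool:
    """Combine both checks; the route check only applies to a custom route table."""
    if not nacl_allows_outbound(nacl):
        return True
    if custom_route_table is not None and not route_table_reaches_igw(custom_route_table):
        return True
    return False
```

A subnet that routes outbound traffic through something other than an internet gateway would be flagged by this sketch even if it has working outbound access, so treat the result as a hint rather than a verdict.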
Changing these values will cause the agents to reboot.
If the VPC or Subnets are not suitable for the agents, the agents may fail to boot up or may enter a constant reboot cycle.
Smart Scan is an agent settings configuration that optimizes infrastructure cost. The idea behind it is to support the scenario where you don't want or need a scanning agent running full time in one or all of the regions you are protecting. This is the classic example of running the "server" only when you need it and taking advantage of scaling policies to do just that.
Smart Scan can be enabled as the global Default, so each and every region you protect is set up that way, or on a one-off basis for each region that requires it.
For example, suppose you only get new objects during work day hours and nothing comes in at night. You can switch Smart Scan on and your scanning agents will only run while work is coming in. They may even shut down during quiet periods within the work day, which adds further savings.
Another scenario is where you may have one or two really busy regions and would like to have scanning agents running full time, but in other less used regions you want to take advantage of the scan on-demand settings.
This could easily be handled by modifying the scaling policies yourself, but we have simplified it for you with a simple toggle. When you toggle on Smart Scan, whether as the global default or in an individual region, we automatically adjust the following values:
|Scaling Threshold||We set this value to 1 by default, which means that any time there is work (at least 1 item in the queue) an agent will be spun up to process that work. While it is up, if other work comes in, it will continue to run and process all of the items in the queue. You can set this value to something greater than 1. For example, if you don't want a scanning agent to spin up and process the workload until a certain amount of work has accumulated, you can set this threshold to 50 or 100. This means scanning starts once you have 50 or 100 objects waiting. This value translates to the LargeQueue CloudWatch Alarm, which controls when to spin up scanning agents.|
|Min Agents||This value is set to 0 which indicates that the agents can contract back down all the way to no running agents.|
|Max Agents||This value is set to 1, but can be modified to anything greater than 1 as well. This indicates how many agents you would permit to spin up. If you get objects in large sets, it may be useful to have this greater than 1.|
|SmallQueue CloudWatch Alarm||This attribute is not visible on the page. The value for the CloudWatch Alarm that controls the scale down of the tasks will be set to 1. This means that whenever the queue is emptied and drops below 1, all tasks are scaled down and stopped.|
If you were to simply change the
Smart Scan will allow you to go back to modifying all the fields. It also goes back to the default scaling up and down behaviors, where both the LargeQueue and SmallQueue scaling thresholds are set to the same value, which is the Scaling Threshold value as you have defined on this page.
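The difference between the two modes can be summarized in a short sketch. This is only a model of the values described in this section, with hypothetical names (`ScalingSettings`, `smart_scan_settings`, `standard_settings`); it does not call AWS or the console, and the real alarms are managed for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingSettings:
    large_queue_threshold: int  # LargeQueue alarm: when to spin agents up
    small_queue_threshold: int  # SmallQueue alarm: when to scale agents down
    min_agents: int
    max_agents: int


def smart_scan_settings(scaling_threshold: int = 1, max_agents: int = 1) -> ScalingSettings:
    """Values applied when Smart Scan is on; both arguments may be raised above 1."""
    if scaling_threshold < 1 or max_agents < 1:
        raise ValueError("scaling_threshold and max_agents must be at least 1")
    return ScalingSettings(
        large_queue_threshold=scaling_threshold,
        small_queue_threshold=1,  # scale down once the queue drops below 1
        min_agents=0,             # agents may contract to none running
        max_agents=max_agents,
    )


def standard_settings(scaling_threshold: int, min_agents: int, max_agents: int) -> ScalingSettings:
    """Values when Smart Scan is off: both alarms share the Scaling Threshold."""
    return ScalingSettings(
        large_queue_threshold=scaling_threshold,
        small_queue_threshold=scaling_threshold,
        min_agents=min_agents,
        max_agents=max_agents,
    )
```

With Smart Scan on and a threshold of 50, agents would start once 50 objects are waiting but still only scale down when the queue is empty, whereas with Smart Scan off both alarms use 50.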