API Agent Settings

The API scanning agent can be deployed to most regions at this time.

Unlike the Event Agent, which is set up when protection is turned on for a bucket in a new region, the API Agent is set up in regions of your choice, independently of whether you have S3 integrated protections (event-based and retro-based scanning) in place. As part of the deployment process, a Load Balancer is created as the entry point to the API Agent. After setup is complete, you will have an API Agent Service in your Fargate Cluster and a Load Balancer (either internet-facing or internal) as the persistent access mechanism. On top of this setup, you can configure the scanning agents and how they scale. This includes the CPU, Memory and Disk Size (for the AWS Fargate tasks), the VPC and Subnet(s), and the scaling characteristics: Minimum # of Agents and Maximum # of Agents.

With the API Agent you have choices as to whether you deploy in more than one region. With the Event Agent, we deployed infrastructure to each region with protected buckets to keep the scanning close to the data. With the API Agent effectively being an API endpoint, you could make one bit of infrastructure available to all of your applications if networking and latency allowed for it. Applications, whether on-prem, living in AWS or being run by third parties could leverage one endpoint. You have the choice to deploy to additional regions to get closer to your users as you see fit.

With all of the infrastructure we deploy, you can get more detailed deployment information on the Deployment Overview page where we represent all of the different Fargate infrastructure we have deployed.

To learn more about triggering an API scan of a file, check out the API Driven Scanning overview page.

Deployed Regions

If no API agents are currently deployed, only the Add New Region button is available. Otherwise, the page displays every region where API agents have been deployed. To set up a new region, click the Add New Region button. To edit an existing region's setup, click the specific region pill. Either action populates the bottom portion of the page with the fields required to stand up the service.

Default DNS

This will not show on the page while you are setting up a new service region. Once the setup is complete and the load balancer has been created, this page will show the default DNS value. This is the value assigned to the load balancer, not necessarily the value you plan to use with your application teams. This URL is a fully valid and working address, so you can submit API scans to it. It is recommended that you create a CNAME record in your DNS that links a friendlier name to this LB URL, so that the name you call matches the SSL certificate being used. Using the LB URL directly still works, but you would need to disable SSL certificate verification in order to access it.

Network Settings

The Network Settings section defines which network the API Agent will run in and the connectivity to it.

You start by specifying which VPC and Subnets you want the load balancer and Fargate service to run in. Whichever VPC and Subnets you pick, ensure that they meet your access needs, whether from outside or inside your network.

If you are using an API Load Balancer you can also define the Subnets of the API Load Balancer through these settings without having to make changes in the AWS console.

VPC and Subnet Access Importance

You'll notice a Public or Private associated with each VPC. This is an indicator of whether or not the VPC is tied to an Internet Gateway. The reasoning is that with an Internet Gateway in place you will have the required outbound access. The API Agent does not require a public IP address, nor does it need to be publicly accessible in general, but it does require outbound internet access to reach AWS ECR to pull new Task images.

You'll notice a Public or Restricted associated with each Subnet. This is an indicator of whether or not the Subnet is outbound routable to the internet. Minimally, the agent task must be able to reach the AWS ECR to pull new Task images and to be able to pull AV signature updates. Two validations are performed for this check. First, we check to see if the NACL associated with the Subnet(s) has outbound open for 0.0.0.0/0. Secondly, we check to see if there is a custom Route Table in place and verify it is routed to an internet gateway.
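The two subnet checks described above can be sketched in Python. This is only an illustration under stated assumptions, not the product's actual validation code: the function names are hypothetical, the input dictionaries are assumed to follow the shape of the EC2 `describe_network_acls` and `describe_route_tables` responses, and the sketch assumes a custom Route Table is present for the subnet.

```python
def nacl_allows_all_outbound(nacl: dict) -> bool:
    """Check 1: is outbound traffic to 0.0.0.0/0 allowed by the NACL?"""
    egress_rules = [
        entry for entry in nacl.get("Entries", [])
        if entry.get("Egress") and entry.get("CidrBlock") == "0.0.0.0/0"
    ]
    if not egress_rules:
        return False
    # NACL rules are evaluated in ascending rule-number order and the first
    # match wins. Simplification: protocol and port ranges are ignored here.
    first_match = min(egress_rules, key=lambda entry: entry["RuleNumber"])
    return first_match.get("RuleAction") == "allow"


def route_table_routes_to_igw(route_table: dict) -> bool:
    """Check 2: does the route table send 0.0.0.0/0 to an internet gateway?"""
    for route in route_table.get("Routes", []):
        is_default = route.get("DestinationCidrBlock") == "0.0.0.0/0"
        is_igw = str(route.get("GatewayId", "")).startswith("igw-")
        if is_default and is_igw:
            return True
    return False


def subnet_label(nacl: dict, route_table: dict) -> str:
    """Combine both checks into the label shown next to each Subnet."""
    passed = nacl_allows_all_outbound(nacl) and route_table_routes_to_igw(route_table)
    return "Public" if passed else "Restricted"


# Hypothetical sample data for illustration only.
open_nacl = {"Entries": [
    {"RuleNumber": 100, "Egress": True, "CidrBlock": "0.0.0.0/0", "RuleAction": "allow"},
    {"RuleNumber": 32767, "Egress": True, "CidrBlock": "0.0.0.0/0", "RuleAction": "deny"},
]}
igw_routes = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123456789abcdef0"},
]}
local_only_routes = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
]}

print(subnet_label(open_nacl, igw_routes))
print(subnet_label(open_nacl, local_only_routes))
```

A subnet that passes the NACL check but has no default route to an internet gateway is reported as Restricted in this sketch.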

Changing these values will cause the agents to reboot.

You also define who can access the URL via the Inbound Access CIDR. This directly corresponds to an AWS Security Group for the load balancer. Lock this down to the network(s) allowed to access the URL.
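To sanity-check an Inbound Access CIDR before saving it, you can test whether the addresses of your calling applications fall inside it. The sketch below uses only Python's standard `ipaddress` module; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def is_allowed(client_ip: str, inbound_cidrs: list) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR blocks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in inbound_cidrs)


# Hypothetical example: only the internal 10.0.0.0/8 network may call the endpoint.
allowed = ["10.0.0.0/8"]
print(is_allowed("10.1.2.3", allowed))
print(is_allowed("203.0.113.7", allowed))
```

A value of 0.0.0.0/0 would match every IPv4 caller, which is why the text recommends locking the CIDR down to specific networks.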

The Internet Facing Load Balancer determines whether the load balancer is created with public IPs or not. If you plan to leverage this API endpoint from outside the network, then you'll want this box ticked. If you plan to use the endpoint from resources on the same network then you can uncheck it.

If you later decide you want a different value for Internet Facing Load Balancer, you will have to tear the API Agent down and recreate it, because there are no AWS APIs to change this value once the load balancer exists. You can tear down the API Agent from the Deployment Overview page.

The teardown can take upwards of 15 minutes while the load balancer is removed. Once it has completed, you can set up the region again.

The SSL Certificate ARN is the SSL certificate to associate with the load balancer. You will typically create the certificate to match a friendly URL. If the certificate and API URL do not match, you will get certificate validation warnings and errors. You can code around those, but better to match them up. With the SSL Certificate ARN you have a choice to create your own or leverage ours like the Management Console. The default behavior of the Management Console is to give you persistent access through the cloudstoragesecapp.com domain. We also offer this option for the API Endpoint(s). This allows for a very fast and simple setup for the API Endpoint and doesn't require you to provide your own cert and friendly DNS entry. Your environment could look like the following: console = customerA.cloudstoragesecapp.com and your API endpoint = customerA-api.cloudstoragesecapp.com. And we'll do it all for you.
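The reason a mismatch produces validation errors can be illustrated with a simplified hostname check. This is a sketch only: real TLS clients apply fuller rules, the function name is hypothetical, and the load balancer DNS name below is a made-up example. The friendly name is the one used in the example above.

```python
def hostname_matches(cert_name: str, hostname: str) -> bool:
    """Simplified match: exact name, or a single left-most wildcard label."""
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(cert_labels) != len(host_labels):
        return False
    if cert_labels[0] == "*":
        # A wildcard covers exactly one label, so compare the remaining labels.
        return cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


cert_name = "customerA-api.cloudstoragesecapp.com"
friendly_url_host = "customerA-api.cloudstoragesecapp.com"
# Hypothetical default DNS name of a load balancer, for illustration only.
lb_default_host = "example-api-lb-1234567890.us-east-1.elb.amazonaws.com"

print(hostname_matches(cert_name, friendly_url_host))
print(hostname_matches(cert_name, lb_default_host))
```

Calling the endpoint through the friendly name passes the check, while calling the load balancer's default DNS name does not, which is when clients report certificate errors.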

When first setting up the API Endpoint you'll have an option for New Cert.

If you have your own certificate you'd like to use, simply paste the ARN value into this field and ignore the New Cert button. If you'd like us to register your API Endpoint with our DNS as described above, click the New Cert button to open a popup where you can specify the subdomain value you'd like to use for the friendly URL. Type in whatever value you'd like to use and check the availability. If it is free, click the Create button. We will create the entries in our DNS to make the friendly URL. You may see the following until the certificate and DNS settings are completely set up.

Service Settings

Field / Description

Min Agents

The minimum number of scanning agents you'd like running by default in the given region. This value cannot be 0.

Max Agents

The maximum number of scanning agents you would like to scale to. This number can be anything greater than or equal to the minimum.

Scanning Engine

You can select between Sophos, CrowdStrike, or ClamAV for your API agent scanning engine. ClamAV can scan files up to 2GB in size. If you need to scan a file larger than 2GB you will need to use Sophos. If you are using CrowdStrike you will be limited to files up to 20MB in size and a limited set of file types.

CPU

The amount of vCPU you would like allocated to each agent. Each additional auto-scaled agent would also have this value.

Memory

The amount of memory you would like allocated to each agent. Each additional auto-scaled agent would also have this value.

Disk Size (GB)

The amount of disk space assigned to each agent. Each additional auto-scaled agent would also have this value. The default value, which is included in the AWS pricing, is 20GB. If all files you will be scanning are under 15GB in size, keep the default. If a region needs to scan files larger than 15GB, increase this to a size large enough to handle your largest files. 200GB is the current maximum this number can be set to. As a result, the maximum size file that can be scanned through this scanning method is 195GB. If you require the scanning of anything larger than 195GB, please refer to the Extra Large File Scanning option, which can scan files that are terabytes in size. NOTE: There are increased costs for each GB above the 20GB size. Currently, this price per GB per hour is $0.000111, but refer to the AWS Fargate Pricing page to get the latest costs.
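The disk sizing guidance above can be turned into a small calculation. This is a sketch under assumptions: the 5GB headroom is inferred from the two pairs given in the text (20GB disk for files up to 15GB, 200GB disk for files up to 195GB), the function names are hypothetical, and the price is the figure quoted above, which may change.

```python
import math

DEFAULT_DISK_GB = 20                 # included in the base AWS pricing
MAX_DISK_GB = 200                    # current maximum setting
HEADROOM_GB = 5                      # assumption inferred from 20 -> 15 and 200 -> 195
PRICE_PER_EXTRA_GB_HOUR = 0.000111   # quoted above; check the AWS Fargate Pricing page


def required_disk_gb(largest_file_gb: float) -> int:
    """Smallest Disk Size (GB) setting that fits the largest expected file."""
    needed = math.ceil(largest_file_gb) + HEADROOM_GB
    if needed > MAX_DISK_GB:
        raise ValueError("File exceeds 195GB; see the Extra Large File Scanning option")
    return max(DEFAULT_DISK_GB, needed)


def extra_disk_cost_per_hour(disk_gb: int) -> float:
    """Hourly cost per agent for disk above the included 20GB."""
    return max(0, disk_gb - DEFAULT_DISK_GB) * PRICE_PER_EXTRA_GB_HOUR


disk = required_disk_gb(50)
print(disk, round(extra_disk_cost_per_hour(disk), 6))
```

Because every auto-scaled agent gets the same disk size, the hourly figure applies per running agent.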

The Memory must always be set to a value that is between 2x and 8x of the vCPU, with the exception that the minimum at 1 vCPU is 3GB. The supported ranges are:

vCPU = 1, then 3GB <= memValue <= 8GB
vCPU = 2, then 4GB <= memValue <= 16GB
vCPU = 3, then 6GB <= memValue <= 24GB
vCPU = 4, then 8GB <= memValue <= 32GB
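If you manage these settings through scripts, the ranges listed above can be checked before submitting a change. A minimal sketch, with a hypothetical function name and the ranges copied from the list above:

```python
# Allowed memory range in GB for each vCPU value, as listed above.
MEMORY_RANGE_GB = {
    1: (3, 8),
    2: (4, 16),
    3: (6, 24),
    4: (8, 32),
}


def is_valid_memory(vcpu: int, memory_gb: int) -> bool:
    """Return True if memory_gb is within the allowed range for vcpu."""
    if vcpu not in MEMORY_RANGE_GB:
        return False
    low, high = MEMORY_RANGE_GB[vcpu]
    return low <= memory_gb <= high


print(is_valid_memory(1, 3))
print(is_valid_memory(2, 32))
```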

Changing these values will cause the agents to reboot.

If the VPC or Subnets are not suitable for the agents (for example, they lack the outbound access described above), the agents may fail to boot or may enter a constant reboot cycle.

Once you click Setup <region-name>, the process of creating the load balancer and API Endpoint agents will commence. This can take several minutes. While this is happening, a yellow status bar is shown. You will know setup is complete when the yellow bar goes away.

Scaling Considerations

So far we have seen better results from scaling out than from scaling up, so the default values of 1 vCPU and 3GB Memory should be all you need for almost any workload. Scaling out is driven by two main factors: the number of connections to the load balancer and the CPU utilization of the task. Either factor can trigger scaling based on an average over a full minute.

For connections, the threshold is 1,000 or more connections lasting over 1 minute. If 10,000 connections come in all at once and haven't been processed within a minute, then tasks will be scaled out to match the number of connections remaining. If 10,000 were still remaining, the scaling policy would start up 9 more tasks, for 10 in total (one per 1,000 connections).
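The connection-based arithmetic above can be sketched as follows. This is an illustration rather than the actual scaling policy: the function name is hypothetical, and capping the result at the Max Agents setting is an assumption based on that field's description.

```python
import math

CONNECTIONS_PER_TASK = 1000  # threshold described above


def desired_tasks_for_connections(remaining: int, current_tasks: int, max_agents: int) -> int:
    """Task count after a connection-driven scale-out, capped at max_agents."""
    if remaining < CONNECTIONS_PER_TASK:
        return current_tasks  # below the threshold, nothing changes
    wanted = math.ceil(remaining / CONNECTIONS_PER_TASK)
    return min(max(current_tasks, wanted), max_agents)


# 10,000 connections still pending with 1 task running and Max Agents = 20.
target = desired_tasks_for_connections(10_000, current_tasks=1, max_agents=20)
print(target, "tasks in total,", target - 1, "more than before")
```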

For CPU utilization, 75% or greater utilization lasting over a minute will trigger scaling. Scaling by CPU utilization behaves slightly differently than scaling by connections in that only 1 new instance will be spun up at a time. So if the utilization averages over 75% for a minute, a second instance will spin up. If the utilization of both instances averages over 75% again, a third instance will spin up, and so forth.
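The one-at-a-time behavior of CPU-based scaling can be simulated minute by minute. This sketch models scale-out only, not the slower scale-in; the function names and sample utilization values are hypothetical, and stopping at Max Agents is an assumption based on that field's description.

```python
CPU_THRESHOLD_PERCENT = 75.0  # threshold described above


def next_task_count(avg_cpu_percent: float, current_tasks: int, max_agents: int) -> int:
    """Add at most one task when the one-minute average reaches the threshold."""
    if avg_cpu_percent >= CPU_THRESHOLD_PERCENT and current_tasks < max_agents:
        return current_tasks + 1
    return current_tasks


def simulate(per_minute_avg_cpu: list, start_tasks: int, max_agents: int) -> list:
    """Return the task count after each one-minute evaluation window."""
    tasks = start_tasks
    history = []
    for avg in per_minute_avg_cpu:
        tasks = next_task_count(avg, tasks, max_agents)
        history.append(tasks)
    return history


# Four consecutive one-minute averages, starting from 1 task with Max Agents = 3.
print(simulate([80, 90, 60, 85], start_tasks=1, max_agents=3))
```

Compared with connection-based scaling, which can jump straight to the needed task count, sustained CPU pressure grows the service by one task per evaluation window.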

Contracting back down after a scaling event takes a bit longer, since the load balancer wants to see the connections drained and the CPU utilization stabilized before dropping resources. You may see tasks linger longer than you'd expect, but they will scale back down.

Scanning Engine Consideration

When setting up your API agent you'll need to select which antivirus scanning engine the API agent will use to scan files. We support the following scanning engines for API driven scanning:

  • Sophos

  • CrowdStrike

  • ClamAV

You have the option to use a single scanning engine or you can enable multi-engine scanning to have your API agent use multiple scanning engines at the same time.
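When choosing engines, the file size limits listed for the Scanning Engine field under Service Settings are a useful filter. The sketch below is an illustration only: the function name is hypothetical, binary units (1GB = 1024^3 bytes) are an assumption, Sophos is treated as having no engine-specific size limit because none is stated, and CrowdStrike's file type restrictions are not modeled.

```python
MB = 1024 ** 2
GB = 1024 ** 3

# Per-engine size limits from the Service Settings description.
ENGINE_MAX_BYTES = {
    "CrowdStrike": 20 * MB,
    "ClamAV": 2 * GB,
    "Sophos": None,  # no engine-specific limit stated in this document
}


def engines_for_file(size_bytes: int) -> list:
    """Return the engines whose size limit accommodates the file, sorted by name."""
    return sorted(
        name for name, limit in ENGINE_MAX_BYTES.items()
        if limit is None or size_bytes <= limit
    )


print(engines_for_file(5 * MB))
print(engines_for_file(500 * MB))
print(engines_for_file(3 * GB))
```

The overall file size is still bounded by the agent's Disk Size setting, regardless of engine.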

If you toggle on Use Default Scan Settings, we will use the scanning engine settings you've configured for Event-based and Retro-based scanning through the Scan Settings page. If it's toggled off, the engine settings you configure for your API agent will be separate from your Event-based and Retro-based scan settings.

Single Engine API Scanning

Multi-Engine API Scanning

Classification Rule Sets

Similar to our data classification functionality for other storage volumes, you can also integrate data classification into your API scan. This way you'll be able to perform an antivirus scan and classify files for PII against the rulesets that you select here.

You can learn more about using the classification functions on the API Driven Scanning overview page.
