Scan Settings
There are four configuration adjustments you can make to the Scanning/Classification Agents
Tag Name changes
Infected File handling (for AV) and Matching File handling (for DC)
Scan Engine choice
Decisions around which objects to scan (Scan list / Skip list).
These Scan Behavior Settings modifications apply to all agents currently running and to new ones that spin up going forward, no matter the region. Changes made here do not require a reboot of any agent, but could take 30 seconds to take effect. Examples include custom names for the tags we apply to each object, as well as scan list and skip list functionality for the buckets.
These are global changes, so it is not currently possible to create different behavior by region or group.
Every file the S3 scanner touches has an AWS Tag applied to it. You can change the default key names if required or desired, but not the values.
scan-result
Identifies whether a file is found to be clean or having issues. Possible values: Clean, Infected, Unscannable, Error, InfectedAllowed, Suspicious
Clean = no issues found with file
Infected = malware found
InfectedAllowed = represents a file you have moved from a quarantine bucket back into its originating bucket; could be a "false positive"
Unscannable = object is password protected or greater than the maximum size allowed (20MB for CrowdStrike, 2GB for ClamAV, and 5TB for Sophos)
Error = issues accessing the object: KMS permissions, cross-account permissions, bucket policy blocking, other
Suspicious = a Sophos-only finding that means the engine didn't find anything infected within the file but has found the file to have suspicious characteristics that do not fit the file type
date-scanned
The date and time the object was scanned
message
A description of what has been identified in the file. Only populated for Error or Unscannable results.
virus-name
Name of virus identified
uploaded-by
AWS user identifier
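As a minimal sketch of how a downstream consumer might read these tags, the helper below takes a tag list in the `{"Key": ..., "Value": ...}` shape that S3 returns for object tagging and reduces it to a small summary. The function name, the `key_names` override (for renamed tag keys) and the choice of which results count as needing attention are assumptions made for illustration, not part of the product.

```python
# Results that a reviewer would probably want to look at (an assumption).
NEEDS_ATTENTION = {"Infected", "Suspicious", "Error", "Unscannable"}

# Default tag key names as listed above; pass key_names to override renamed keys.
DEFAULT_KEYS = {
    "result": "scan-result",
    "date": "date-scanned",
    "message": "message",
    "virus": "virus-name",
    "uploader": "uploaded-by",
}


def summarize_scan_tags(tag_set, key_names=None):
    """Reduce an S3 TagSet (list of {"Key", "Value"} dicts) to a summary dict."""
    names = dict(DEFAULT_KEYS)
    names.update(key_names or {})
    tags = {tag["Key"]: tag["Value"] for tag in tag_set}
    result = tags.get(names["result"])
    return {
        "scanned": result is not None,
        "result": result,
        "needs_attention": result in NEEDS_ATTENTION,
        # Prefer the virus name, fall back to the message tag if present.
        "detail": tags.get(names["virus"]) or tags.get(names["message"]),
    }


if __name__ == "__main__":
    example = [
        {"Key": "scan-result", "Value": "Infected"},
        {"Key": "virus-name", "Value": "ExampleVirus"},
    ]
    print(summarize_scan_tags(example))
```

An object with no scan-result tag comes back with `scanned` set to `False`, which is a simple way to spot files the scanner has not touched yet.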
AWS allows an object to have only 10 tags applied to it. At most we will add 5 tags (for infected files) and only 2 tags for clean files. If you have a number of tags on your object already, we will trim the number of tags we add to ensure none of the existing tags are dropped. If only 1 tag is available, for example, we will write only the scan-result tag onto the object.
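The trimming rule can be sketched as a simple budget calculation. This is only an illustration: the function name is hypothetical, and the priority order after scan-result is an assumption (the text above only states that scan-result is the tag written when a single slot remains).

```python
S3_MAX_TAGS = 10  # S3 limit on tags per object


def tags_to_write(existing_tag_count, scan_tags):
    """Return the scan tags that fit without dropping any existing tag.

    scan_tags is a list of (key, value) pairs in priority order, with
    scan-result first. The order of the remaining tags is assumed here.
    """
    free_slots = max(0, S3_MAX_TAGS - existing_tag_count)
    return scan_tags[:free_slots]


if __name__ == "__main__":
    infected = [
        ("scan-result", "Infected"),
        ("date-scanned", "2024-01-01 00:00:00Z"),
        ("virus-name", "ExampleVirus"),
        ("uploaded-by", "example-user"),
    ]
    # Object already carries 9 tags, so only scan-result fits.
    print(tags_to_write(9, infected))
```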
There are 3 main actions you can take with an infected file: move (default), delete and keep.
Move directs the scanner to take the file and place it (copy then delete) in a quarantine bucket. The console creates a quarantine bucket in each region you enable buckets for scanning. The bucket will be named uniquely with the ApplicationID tacked onto it along with the region it was created in. This is the default behavior.
Delete directs the scanner to remove the file entirely.
Keep directs the scanner to leave the file in the bucket it found it. The scanner will still tag the object appropriately.
Using the default Move action places files identified as infected into a quarantine bucket. You may need to restore a file from the quarantine bucket back into the originating bucket. You can do this on a one-time basis or permanently. Check out the Allow Suspect Files documentation for more information.
We have a number of customers also using a "2 Bucket System" for more of a "physical separation" of the object storage. They utilize a staging/dirty bucket for all files to first be placed in. This is where all the scanning takes place. Once a file is found to be clean it is copied/moved to the production/usable bucket(s). At that point downstream users and applications will be aware of the files and able to use them. This system uses a combination of our standard event-based scanning (on the staging bucket) and a lambda triggered by our real-time notifications. For more information on how to set this up, check out the 2 Bucket System write-up here.
If you leverage the Move action above, then the quarantine bucket and its settings come into play. By default, we place a quarantine bucket in each account (installed and linked accounts) that has buckets being protected by event-based scanning or on-demand/scheduled scanning. This is the default behavior because we don't want to pull files across accounts to store them. The quarantine buckets are also created per region, so we aren't pulling files across regions to store them either. So theoretically, you could have 20+ buckets added to your S3 if you are protecting data in that many regions.
We have had a number of customers determine they do not want to quarantine in the linked accounts. They would prefer to have all infected files quarantined to their centralized security account (the account the Antivirus for Amazon S3 solution is installed in). There is an option here in the settings to Move to Main Account. Simply tick the toggle and we'll change how the application quarantines files. We will start storing all infected files in a centralized quarantine bucket.
A Lifecycle Policy is added to the quarantine bucket. The default policy indicates to keep files indefinitely, but you can change that so quarantined files will be deleted after a certain number of days. Pick a value that will give you time to work through whatever workflows you have in place to examine quarantined files.
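For readers who want to see what such an expiration rule looks like in S3's own lifecycle format, here is a minimal sketch. The console manages the quarantine bucket's policy for you; the helper name and the rule ID below are hypothetical and only illustrate the shape of a rule that deletes objects a given number of days after creation.

```python
def expire_after_days(days, rule_id="expire-quarantined-files"):
    """Build an S3 lifecycle configuration that expires every object after `days`."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,              # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},   # empty prefix = whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    print(expire_after_days(30))
```

A dictionary of this shape is what boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument, should you ever need to apply a rule outside the console.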
Antivirus for Amazon S3 has been designed to work with multiple AV engines easily. Currently, there are three engines available: Sophos, CrowdStrike, and ClamAV. You can choose any engine and easily switch later, but there are some distinctions that could weigh the decision in favor of one or another.
The choice comes down to cost versus large-file support, with a bit of brand name mixed in as well.
In large environments, the additional scan costs of the Sophos engine could be somewhat accounted for by the performance gains and the need to run less infrastructure.
CrowdStrike is an excellent ML-based engine, however it comes with limitations on maximum file size and file types that it can scan.
Look to the table below for the key differentiators.
Engine | Included in base cost? | Maximum file size | Rating
Sophos | No - add $0.10 per GB | Up to 5 TB | Better
CrowdStrike | No - add $0.10 per GB | Up to 20 MB | Best for ML-based scan
ClamAV | Yes | Up to 2 GB | Good
Please note, if you choose to use both Sophos and CrowdStrike you will receive a discounted premium rate of $0.15 per GB.
Multi-engine scanning provides two options for using multiple engines: All Files and By File Size.
All Files indicates every file that is event-based scanned or on-demand/schedule scanned will be processed by the enabled engines.
By File Size indicates smaller files will be scanned by ClamAV only (<2GB) or by CrowdStrike only (<20MB), since ClamAV and CrowdStrike have size limitations and can't scan above 2GB or 20MB respectively. Larger files (>2GB or >20MB) will be scanned by Sophos only. This allows you to take advantage of the scan cost savings offered by ClamAV, or the ML-based scanning offered by CrowdStrike, for part of your files while still scanning the files that are too big for ClamAV or CrowdStrike.
Mode | Files up to 20MB | Files over 20MB and up to 2GB | Files over 2GB
All Files | All three engines | Sophos or ClamAV | Sophos-only
By File Size | CrowdStrike or ClamAV | ClamAV-only | Sophos-only
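The routing described above can be sketched as a small function. This is an illustration under stated assumptions, not the product's implementation: the function and mode names are hypothetical, sizes use binary units, the size boundaries are treated as inclusive, and an "or" in the table is read as "whichever of these engines are enabled".

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# Maximum object size each engine can scan, per the engine comparison.
MAX_SIZE = {"CrowdStrike": 20 * MB, "ClamAV": 2 * GB, "Sophos": 5 * TB}


def engines_for(size_bytes, mode, enabled):
    """Return the engines that would scan a file of the given size.

    mode is "all_files" or "by_file_size" (hypothetical names for the two
    multi-engine options); enabled is the set of engines switched on.
    An empty list means no enabled engine can handle the file.
    """
    # Engines able to handle the file, smallest size limit first.
    capable = sorted(
        (engine for engine in enabled if size_bytes <= MAX_SIZE[engine]),
        key=MAX_SIZE.get,
    )
    if mode == "all_files":
        return capable
    if mode == "by_file_size":
        # Only the engine with the tightest limit that still fits the file.
        return capable[:1]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    everything = {"Sophos", "CrowdStrike", "ClamAV"}
    for size in (10 * MB, 100 * MB, 3 * GB):
        print(size, engines_for(size, "all_files", everything),
              engines_for(size, "by_file_size", everything))
```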
As of v7.00.000 we automatically use EventBridge to resolve any bucket conflicts.
If Protect with Event Bridge is enabled globally, then we will protect all selected buckets with EventBridge without acknowledgment.
If Protect with Event Bridge is not enabled, we will protect buckets using the "best choice". If the bucket can be protected with the S3 Event Notification we will do so, but if there is a conflict we will fail over to EventBridge.
You can learn more about how EventBridge works with protected buckets here.
In certain situations, you may prefer the scanning agents to retrieve signature updates locally rather than reaching outside of your account. We have customers whose requirements do not allow applications that touch data to go out to the public internet. Local updates allow you to better control, and potentially eliminate, outbound access for the VPCs hosting the scanning agents. This option allows you to specify an Amazon S3 bucket in your account for the scanning agents to look to for signature updates. You can get the updates into this bucket however you see fit (sample lambda provided below). Each time a scanning agent boots it grabs the latest definitions. A running scanning agent checks for new signature definitions every hour for ClamAV (every 15 minutes for Sophos).
If you do not use local updates, ClamAV is set up to reach out over the internet to download the updates directly. This happens at the agent, so every agent will need to speak to the public internet (outbound only) to retrieve the updates. Sophos is updated differently. Cloud Storage Security hosts an S3 bucket in our account that we maintain and update with the Sophos update files. When running the Sophos engine, your agents will retrieve updates directly from our bucket.
For both engines you can choose to have your agents look to your local bucket. We have provided a sample, fully working lambda for each engine. The ClamAV lambda pulls the updates down from the internet and the Sophos lambda copies the update files from our bucket to yours.
If you can't wait 1 hour (or the 15 minutes) after a new signature update comes out, simply reboot all your agents and they will pick the new updates up immediately.
This is for signature updates only. Engine updates are done as part of our build process and will be rolled out with the next release.
If you choose to use Private Mirror with multi-engine scanning, you will need to set up Private Mirror for both sets of updates. You can and will need to use the same bucket for both.
We provide a lambda function for Sophos. Sophos requires authentication to download their updates, so we are hosting them in a bucket where your account will be added to the permissions list. This is done automatically by switching on Local Updates.
Because these updates already take place inside of AWS, you may not see the same need for local updates as with ClamAV, whose updates are downloaded directly from the internet. But if you still want updates coming from your own bucket, then read on.
Sophos updates are already done from a bucket: one we host and have granted your application account access to, from which the agents simply pull the updates. For local updates for the Sophos engine, all you really need to do is copy the update files from our bucket to yours. As with ClamAV updates, we have provided some sample code you can turn into a lambda to pull the updates over.
Please refer to the detailed steps for creating a Lambda to retrieve updates under the ClamAV lambda Detailed Steps. The steps will be exactly the same with the exception of Step 3 and Step 4.3 which will be explained below.
Sub for Step 3 above - Provide Code for Lambda
Copy Lambda code in the section above . . .
And paste it into the function .py file
Deploy the code
Sub for Step 4.3 above - Add Permissions to Lambda Role
Copy the policy section above . . .
And paste it into the role JSON
Remember the comma on the preceding bracket!
Make sure to follow all other steps for creating the trigger and so forth and you should be all set. An example of adding the trigger to the lambda can be found here.
Most of our customers are scanning files under the Fargate disk cap of 200GB. But, there are those in many industries (life sciences, media, etc.) that do have files that are larger than 200GB and some much more so. Antivirus for Amazon S3 supports up to the Amazon S3 file size maximum (5TB) for file scanning.
This is done by bypassing the internal disk limitations of Fargate and leveraging one or more EC2 instances for such files. The "extra large file scanning" doesn't have to be leveraged only for really big files; it can be used to scan any file larger than the disk size you have assigned to the standard scanning agents we provide. For example, let's say you occasionally need to process 50GB files, but it isn't worth it to you to keep a larger disk attached to every scanning agent running. So you keep the default disk size of 20GB and have the Extra Large File Scanning toggle switched on. Any file 15GB and smaller will be processed by the scanning agent, but any file greater than 15GB in size will be scanned by the extra large file scanning process. This ensures that no file is ever skipped due to size, and if large files are rare in your system you don't have to carry the expense of a larger default disk.
Extra large file scanning can be triggered by event-based scanning, retro scanning and even API-based scanning (scan existing API only). When any of these scanning agents picks up a file that is too large to scan (too large based on the disk size assigned under the Agent Settings or API Agent Settings) and the Extra Large File Scanning toggle is on, a Job is created. The job will be picked up and kicked off within 10 minutes. A temporary EC2 instance will be spun up with an EBS volume of the size defined in the Disk Size field. The EC2 instance will pick up the file and scan it. Because it is a "job", it is monitored under the Jobs page. On the Jobs page you can monitor the job going through "Not Started" while waiting for the EC2 instance to start up, "Scanning", and "Completed". Each "large file" is treated as its own job. If you have 50 large files come in, then 50 jobs will be kicked off, each lasting as long as it takes to scan its individual file.
You must have either the Sophos engine selected, or have the multi-engine scanning model enabled with By File Size selected if you intend to use CrowdStrike and need extra large files scanned.
Sample scenarios and scanning outcomes (note: subtract 5GB for overhead from the disk sizes for agents and for Extra Large File Scanning):
File is smaller than defined scanning agent disk size:
Scanning agent picks file up to scan
Scan Result is whatever the outcome of file is
File is larger than defined scanning agent disk size and Extra Large File Scanning is off:
Scanning agent rejects file and does not even attempt to scan it
Scan Result is set to Unscannable
File is larger than defined scanning agent disk size and Extra Large File Scanning is on:
Scanning agent creates an Extra Large File Scan Job and moves on to the next file
Large File Job shows up on the Jobs page and is kicked off within 10 minutes of creation
Scan Result is set to whatever the outcome of the file is
File is larger than Extra Large File Scanning disk size and Extra Large File Scanning is on:
Scan Result is set to Unscannable
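The scenarios above reduce to a short decision function. The sketch below is only an illustration of that logic; the function name and return labels are hypothetical, and sizes are in GB with the 5GB overhead subtracted from each disk size as noted.

```python
OVERHEAD_GB = 5  # subtracted from every disk size, per the note above


def route_file(file_gb, agent_disk_gb, xl_enabled=False, xl_disk_gb=0):
    """Decide how a file of file_gb would be handled.

    Returns "agent", "extra-large-job" or "Unscannable".
    """
    if file_gb <= agent_disk_gb - OVERHEAD_GB:
        return "agent"              # standard scanning agent scans it
    if not xl_enabled:
        return "Unscannable"        # too big and no extra large file scanning
    if file_gb <= xl_disk_gb - OVERHEAD_GB:
        return "extra-large-job"    # handed off to a temporary EC2 job
    return "Unscannable"            # too big even for the extra large disk


if __name__ == "__main__":
    # Default 20GB agent disk, extra large file scanning on with a 100GB disk.
    for size in (15, 50, 200):
        print(size, route_file(size, 20, xl_enabled=True, xl_disk_gb=100))
```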
Buckets themselves are inherently skipped by the fact they are turned off to start. When you enable a bucket for scanning, you are explicitly marking it to be scanned. When you do this and nothing else, then all objects in the given bucket will be scanned no matter the path inside the bucket. This portion is all handled from the Bucket Protection page.
The Scan List and Skip List located here inside the Agent Configuration page allow you to take this concept a step further by applying it to paths (folders) within the buckets. Scan listing and skip listing are opposites of one another.
Scan listing a path explicitly marks that specific path(s) within the bucket to be scanned. All other paths within the bucket will be ignored. You can list as many paths within the bucket as you'd like. For example, you have 5 different paths within the bucket and you want to scan only 3 of them. Simply add the 3 you want scanned to the Scan List. There is no need to add the other 2 paths to the Skip List as they will automatically be skipped.
Skip listing a path will stop that defined path(s) from being scanned, but leave all others to be scanned. Depending on how many you want to include versus exclude, you can choose which list to leverage. From the previous example, you could have just skip listed the 2 paths you didn't want to scan, which leaves the other 3 to be scanned. With numbers that split you could go either way, but if you have a much larger number of paths, one choice may become more self-evident.
The root of the bucket is also a "path" that we define in the settings as an empty path. So, if you'd like to scan or skip list the root along with your paths you can do so.
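The scan list and skip list semantics can be sketched as follows. This is a simplified illustration, not the agent's actual logic: it uses plain folder prefixes, treats the empty path as "objects directly in the bucket root", and assumes a skip list entry wins if a key happens to match both lists (the behavior for that case is not stated here). All names are hypothetical.

```python
def in_path(key, path):
    """True if the object key lives under the given path (folder)."""
    if path == "":
        # Empty path = bucket root: objects that are not inside any folder.
        return "/" not in key
    return key.startswith(path.strip("/") + "/")


def should_scan(key, scan_list=(), skip_list=()):
    """Apply skip list first (assumed precedence), then scan list."""
    if any(in_path(key, path) for path in skip_list):
        return False
    if scan_list:
        # With a scan list, only the listed paths are scanned.
        return any(in_path(key, path) for path in scan_list)
    # No scan list: everything not skipped is scanned.
    return True


if __name__ == "__main__":
    # Five folders a-e: scan listing a, b, c is equivalent to skip listing d, e.
    keys = ["a/1.txt", "d/2.txt", "readme.txt"]
    print([should_scan(k, scan_list=["a", "b", "c"]) for k in keys])
    print([should_scan(k, skip_list=["d", "e"]) for k in keys])
```

Note that the two approaches differ for root-level objects such as `readme.txt`: a scan list of three folders leaves the root unscanned unless the empty path is added, while a skip list of two folders still scans it.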
You can use a wild card (*) in the path. This is useful in a repeated sub-path structure where you need to skip a certain folder amongst all those top level folders.
For example, you have a path structure that is Year/Month/scanMe and Year/Month/skipMe, where the month reflects each month of the year, creating 12 unique paths. Underneath each Month folder you have folders you want scanned or skipped. You can create a single path to set up scanning in every Month folder like this: Year/*/scanMe. That path will scan the scanMe folder under every month in the bucket without you having to add 12 entries.
This can be leveraged not just in the path, but down to the object name as well. If you only wanted to scan a certain file type in a given bucket/folder you could put a path with *.jpg.
As the example implies, the * does not mean everything at the level where it is placed and below. If you wanted everything underneath Year you could place a path /Year/ and that would cover absolutely everything below Year. The wildcard represents only the level where it is placed.
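To make the single-level wildcard concrete, here is a minimal sketch that translates a path entry into a regular expression in which * matches any characters except a slash. The function name is hypothetical, and treating each entry as a prefix of the object key (so that everything beneath a matched folder is included) is an assumption about how entries are applied.

```python
import re


def path_matches(pattern, key):
    """True if the object key falls under the (possibly wildcarded) path.

    "*" stands for exactly one path level: it never crosses a "/".
    The pattern is matched as a prefix of the key.
    """
    pattern = pattern.lstrip("/")  # "/Year/" and "Year/" are treated alike
    regex = "[^/]*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, key) is not None


if __name__ == "__main__":
    print(path_matches("Year/*/scanMe", "Year/Jan/scanMe/report.pdf"))       # matches
    print(path_matches("Year/*/scanMe", "Year/Jan/skipMe/report.pdf"))       # no match
    print(path_matches("Year/*/scanMe", "Year/Jan/extra/scanMe/report.pdf")) # no match
```

The third call does not match because the wildcard covers only the Month level and not the additional folder beneath it.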
Similar to wild cards, and often used with them, you can specify a global entry for your Scan and Skip Lists. If you have a repeated path structure across all your buckets and want to define a scan / skip entry that applies to all of them, you can select the _GLOBAL_ option in place of a bucket to define the entry across all buckets.
Let's take a look at an example. The following is an S3 bucket with 4 paths (folders) in it. With the default settings, meaning no paths defined, the root and all 4 folders will be scanned.
First thing you need to do is enter the <bucket name> in the Scan List or Skip List field and click Add Scan list.
Next, you'll add the path you want to scan and click the Add Entry button. Note: this can be the full multi-depth path.
That's it. You've now identified that the only thing you will scan in the css-protect-01 bucket will be the scan-me folder and nothing else. The steps above can be used for skip listing as well. Look below at the examples.
You probably wouldn't have a scan list and a skip list entry for the same bucket as we see in this example, but it is possible and I'm sure a scenario could be found to support it.
With the Two-Bucket System you can move objects from a source bucket or region to a different bucket and/or prefix after they have been successfully scanned and tagged as Clean.
All other scan result types (Infected, Error, Unscannable) remain in the protected source bucket.
When using this method you no longer need to add Lambda Functions to move your files, as outlined here. With the AV Two-Bucket System Configuration the agent task itself will promote the clean files as part of the scanning process.
This feature will not protect the bucket(s) configured within the setting. You will still need to ensure the bucket(s) are protected within the Bucket Protection page.
There are two options for configuring the Two-Bucket System:
By Region
When using this setting, any protected bucket within the chosen region will have its clean files delivered to the destination bucket.
Choose a region from the Add Region selection
Choose a target destination bucket
Optionally, choose a desired prefix to place objects in.
By Bucket
Choose a bucket from the Add Bucket selection
Choose a target destination bucket
Optionally, choose a desired prefix to place objects in.
If you have very long object paths, specifying a prefix for either By Region or By Bucket could cause you to exceed the max key length and impact file delivery.
Delete any region or bucket by clicking the delete icon next to its source.
Be sure to click the save button to apply your changes.
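The key-length caution above can be checked ahead of time. Amazon S3 limits object keys to 1,024 bytes of UTF-8, so a long source key plus a prefix can exceed it. The sketch below assumes the destination key is simply the prefix, a slash, and the original key; that joining rule and the function name are assumptions for illustration.

```python
S3_MAX_KEY_BYTES = 1024  # S3 object key limit, measured in UTF-8 bytes


def destination_key(prefix, source_key):
    """Build the destination key and fail early if it would be too long."""
    prefix = prefix.strip("/")
    key = f"{prefix}/{source_key}" if prefix else source_key
    size = len(key.encode("utf-8"))
    if size > S3_MAX_KEY_BYTES:
        raise ValueError(
            f"destination key is {size} bytes, over the {S3_MAX_KEY_BYTES}-byte limit"
        )
    return key


if __name__ == "__main__":
    print(destination_key("clean/", "uploads/2024/report.pdf"))
```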
Source bucket in the primary account and destination bucket in a linked account
To set up a two-bucket system with a source bucket in the primary account and a destination bucket in another linked account, you need to add permissions to the destination bucket in the linked account. These permissions are necessary for the agent to send objects to that bucket.
Steps to Configure the Two-Bucket System
Add the source bucket and the destination bucket in the AV Console.
Log in to the account where the destination bucket is located.
Navigate to Amazon S3, then Buckets.
Select the destination bucket in your linked account.
Click on Permissions.
Scroll down to the Bucket Policy section and select Edit.
Add the following JSON policy to the destination bucket.
Replace the placeholders in the policy:
awsaccountnumber with the AWS account number of the primary account.
appID with the application ID of your console.
destination-bucket with the name of the destination bucket.
This policy allows the agent to put the clean objects in the linked account's bucket.
To set up a two-bucket system with a source bucket in a linked account and a destination bucket in the primary account, you need to add permissions to both buckets. These permissions allow the agent to transfer objects between the source and destination buckets.
Setting up the policy for the source bucket in the linked account:
Log in to the linked account where the source bucket is located.
Navigate to Amazon S3, then Buckets.
Select the source bucket in the linked account.
Click on Permissions.
Scroll down to the Bucket Policy section and select Edit.
Add the following JSON policy to the source bucket
Replace the placeholders in the policy:
<awsaccountnumber> with the AWS account number of the primary account.
<appID> with the application ID of your console.
<source-bucket> with the name of the source bucket.
This policy allows the source bucket to place objects in the destination bucket within the primary account.
Setting up the policy in the destination bucket in the primary account:
Log in to the linked account and go to CloudFormation.
Select the CloudStorageSecurity Linked Account stack.
Click on Resources and select the hyperlink for the Remote Access Role.
Copy the ARN for the CloudStorageSecRemoteRole.
Log in to the primary account.
Navigate to Amazon S3, then Buckets.
Select the destination bucket in the primary account.
Click on Permissions.
Scroll down to the Bucket Policy section and select Edit.
Add the following JSON policy to the destination bucket:
Replace the placeholders in the policy:
<arn:aws:iam::<awsaccountnumber>:role/CloudStorageSecAgentRole-<appID> with the ARN you copied from the CloudStorageSecRemoteRole in the linked account.
<destination-bucket> with the name of the destination bucket.
This policy allows the remote role to place objects into the destination bucket.