
Scan Settings

Scan Configuration Overview

There are four configuration adjustments you can make to the Scanning Agents: Tag Name changes, Infected File handling, Scan Engine choice, and decisions around which objects to scan (Scan List / Skip List). These Scan Behavior Settings apply to all agents currently running and to any new agents that spin up going forward, no matter the region. Changes made here do not require a reboot for any agent, but can take up to 30 seconds to take effect. Examples include customizing the names of the tags we apply to each object and providing scan list and skip list functionality for the buckets.

Scan Settings

Note

These are global changes and therefore it is not currently possible to create different behavior by region or group.

Object Tag Keys

Every file the S3 scanner touches has an AWS tag applied to it. You can change the default key names if required or desired, but not the values.

Object Tagging

Key              Description
scan-result      Identifies whether a file was found to be clean or to have issues. Possible values: Clean, Infected, Unscannable, Error, InfectedAllowed
  • Clean = no issues found with the file
  • Infected = malware found; note: encrypted and password protected files will also be marked as infected at this time
  • InfectedAllowed = represents a file you have moved from a quarantine bucket back into its originating bucket; could be a "false positive"
  • Unscannable = object is password protected or greater than the maximum size allowed (2 GB for ClamAV or 195 GB for Sophos)
  • Error = problems accessing the object: KMS permissions, cross-account permissions, a bucket policy blocking access, other
date-scanned     The date and time the object was scanned
message          A description of what has been identified in the file. Only populated for Error or Unscannable results.
virus-name       Name of the virus identified
uploaded-by      AWS user identifier

Note

AWS allows an object to have at most 10 tags applied to it. We will add at most 5 tags (for infected files) and only 2 tags for clean files. If you already have a number of tags on your object, we will trim the number of tags we add to ensure none of your existing tags are dropped. If only 1 tag slot is available, for example, we will write only the scan-result tag onto the object.
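Downstream code can read these tags to decide how to handle an object. The following is a minimal sketch (bucket and key names are placeholders) of checking the scan-result tag with boto3:

import boto3
from typing import Optional

s3 = boto3.client("s3")

def get_scan_result(bucket: str, key: str) -> Optional[str]:
    """Return the value of the scan-result tag, or None if the object has no such tag."""
    tag_set = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
    return next((tag["Value"] for tag in tag_set if tag["Key"] == "scan-result"), None)

# Hypothetical usage:
# if get_scan_result("my-protected-bucket", "uploads/report.pdf") == "Clean":
#     ...  # safe to hand the object to downstream processing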

Action for Infected Files

Quarantine Options

There are three main actions you can take with an infected file: Move (default), Delete, and Keep.

Move directs the scanner to place the file (copy then delete) in a quarantine bucket. The console creates a quarantine bucket in each region where you enable buckets for scanning. Each quarantine bucket is named uniquely, with the ApplicationID and the region it was created in appended to the name. This is the default behavior.

Delete directs the scanner to remove the file entirely.

Keep directs the scanner to leave the file in the bucket where it was found. The scanner will still tag the object appropriately.

Restore Quarantined Files

Using the default Move action places files identified as infected into a quarantine bucket. You may need to restore a file from the quarantine bucket back into its originating bucket. You can do this on a one-time basis or permanently. Check out the Allow Suspect Files documentation for more information.

Anecdotally

Most customers continue with the default value of Move, but we have seen a number go with Keep. They keep the objects in place and leverage a bucket policy that makes them accessible only when they are tagged with scan-result=Clean.

Sample Bucket Policy to do this
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BlockAccessExceptClean",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<staging bucket or any bucket>/*",
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalArn": [
                        "arn:aws:iam::<account-number>:role/CloudStorageSecConsoleRole-<applicationID>",
                        "arn:aws:iam::<account-number>:role/CloudStorageSecAgentRole-<applicationID>",
                        "<account-number>"
                    ],
                    "s3:ExistingObjectTag/scan-result": [
                        "Clean",
                        "Unscannable",
                        "InfectedAllowed"
                    ]
                }
            }
        }
    ]
}

We also have a number of customers using a "2 Bucket System" for more of a "physical separation" of the object storage. They utilize a staging/dirty bucket where all files are first placed; this is where all the scanning takes place. Once a file is found to be clean, it is copied/moved to the production/usable bucket(s). At that point, downstream users and applications become aware of the files and can use them. This system uses a combination of our standard event-based scanning (on the staging bucket) and a lambda triggered by our real-time notifications (a sketch of such a lambda is shown below). For more information on how to set this up, check out the 2 Bucket System write-up here.
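The sketch below is illustrative only, not our packaged code. It assumes the real-time notification arrives via SNS with a JSON message containing bucketName, objectKey and scanResult fields, and that the production bucket name is supplied through a PRODUCTION_BUCKET environment variable; adjust the parsing to match the notification format you have configured.

import json
import os
import boto3

s3 = boto3.client("s3")

# Hypothetical environment variable naming the production bucket.
PRODUCTION_BUCKET = os.environ["PRODUCTION_BUCKET"]

def lambda_handler(event, context):
    for record in event.get("Records", []):
        # Assumed message shape; adjust to the actual notification payload.
        message = json.loads(record["Sns"]["Message"])
        if message.get("scanResult") != "Clean":
            continue  # leave anything not clean in the staging bucket
        source_bucket = message["bucketName"]
        key = message["objectKey"]
        # Copy the clean object to the production bucket, then remove it from staging.
        s3.copy_object(
            Bucket=PRODUCTION_BUCKET,
            Key=key,
            CopySource={"Bucket": source_bucket, "Key": key},
        )
        s3.delete_object(Bucket=source_bucket, Key=key)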

Quarantine Buckets


If you leverage the Move action above, then the quarantine bucket and its settings come into play. By default, we place a quarantine bucket in each account (installed and linked accounts) that has buckets being protected by event-based scanning or on-demand/scheduled scanning. We do this by default because we don't want to pull files across accounts to store them. The quarantine buckets are also created per region so we aren't pulling files across regions to store them either. So, theoretically, you could end up with 20+ quarantine buckets in S3 if you are protecting data in that many regions.

We have had a number of customers decide they do not want to quarantine in the linked accounts. They would prefer to have all infected files quarantined to their centralized security account (the account where the Antivirus for Amazon S3 solution is installed). There is an option here in the settings to Move to Main Account. Simply tick the toggle and we'll change how the application quarantines files: all infected files will be stored in a centralized quarantine bucket.

A Lifecycle Policy is added to the quarantine bucket. The default policy keeps files indefinitely, but you can change it so quarantined files are deleted after a certain number of days. Pick a value that gives you time to work through whatever workflows you have in place to examine quarantined files.
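Expiring quarantined objects after a set number of days is a standard S3 lifecycle rule. You would normally change this through the console setting described above; the boto3 call below is only an equivalent sketch of a 30-day rule, with the bucket name as a placeholder.

import boto3

s3 = boto3.client("s3")

# Placeholder name; substitute your actual quarantine bucket.
QUARANTINE_BUCKET = "<your-quarantine-bucket>"

# Delete quarantined objects 30 days after they are created.
s3.put_bucket_lifecycle_configuration(
    Bucket=QUARANTINE_BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "ExpireQuarantinedObjects",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": 30},
            }
        ]
    },
)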

Scanning Engine

Scan Engine

Antivirus for Amazon S3 has been designed to work with multiple AV engines. Currently, two engines are available: Sophos and ClamAV. You can choose either one and even switch easily after the fact, but there are some distinctions that could weigh the decision in favor of one or the other. Look to the table below for the key differentiators. The choice comes down to cost versus large-file support, with a bit of brand name mixed in as well. In large environments, the additional scan cost of the Sophos engine can be partially offset by its performance gains and the need to run less infrastructure.

Engine   Included in Base Cost          Scans Large Files   Performance
Sophos   No (additional $0.10 per GB)   Up to 195 GB        Better
ClamAV   Yes                            Up to 2 GB          Good

Multi-engine Scanning

Multi Scan Engine
Multi-engine scanning was introduced in version 5.06.001 and offers two ways to use all the supported engines: All Files and By File Size. All Files means every file that is event-based scanned or on-demand/scheduled scanned is processed by both engines (files too large for ClamAV are scanned by Sophos only). By File Size means smaller files (< 2 GB) are scanned by ClamAV only (since ClamAV has a size limitation and cannot scan above 2 GB) and larger files (> 2 GB) are scanned by Sophos only. This allows you to take advantage of the scan cost savings offered by ClamAV for part of your files while still scanning the files that are too big for ClamAV.

Multi-Engine Selection   Smaller Files (< 2 GB)   Larger Files (> 2 GB)
All Files                Both Engines             Sophos-only
By File Size             ClamAV-only              Sophos-only
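Conceptually, the routing works like the sketch below. The 2 GB boundary comes from the ClamAV size limit; the function and mode names are illustrative only, not the product's actual implementation.

CLAMAV_MAX_BYTES = 2 * 1024 ** 3  # ClamAV's 2 GB scanning limit

def choose_engines(object_size_bytes, mode):
    """Illustrative routing only."""
    if object_size_bytes > CLAMAV_MAX_BYTES:
        return ["Sophos"]            # too large for ClamAV in either mode
    if mode == "All Files":
        return ["ClamAV", "Sophos"]  # smaller files are scanned by both engines
    return ["ClamAV"]                # "By File Size": smaller files use ClamAV only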

Private Mirror - Local Signature Updates

Private Mirror
In certain situations, you may prefer the scanning agents to retrieve signature updates locally rather than reaching outside of your account. We have customers who have requirements where the applications that touch data cannot go out to the public internet. Local updates will allow you to better control and potentially eliminate outbound access for the VPCs hosting the scanning agents. This option allows you to specify an Amazon S3 bucket in your account for the scanning agents to look to for signature updates. You can get the updates into this bucket however you see fit (sample lambda provided below). Each time a scanning agent boots it grabs the latest definitions. A running scanning agent will check every 1 hours for ClamAV (every 15 minutes for Sophos) for new signature definitions.

If you do not use local updates, ClamAV is set up to reach out over the internet and download the updates directly. This happens at the agent, so any and all agents will need to speak to the public internet (outbound only) to retrieve the updates. Sophos is updated differently: Cloud Storage Security hosts an S3 bucket in our account that we maintain and update with the Sophos update files. When running the Sophos engine, your agents retrieve updates directly from our bucket.

For both engines you can choose to have your agents look to your local bucket. We have provided a sample, fully working lambda for each engine. The ClamAV lambda pulls the updates down from the internet, and the Sophos lambda copies the update files from our bucket to yours.

Note

If you can't wait the 1 hour (or 15 minutes) after a new signature update comes out, simply reboot all your agents and they will pick up the new updates immediately.


This is for signature updates only. Engine updates are done as part of our build process and will be rolled out with the next release.


If you choose to use Private Mirror with multi-engine scanning, you will need to set up Private Mirror for both sets of updates. You can use the same bucket for both, and in fact you will need to.

Sample Lambdas

We have provided two lambda functions: one for ClamAV and one for Sophos. ClamAV has recently changed its mechanism for downloading updates, and the lambda below accommodates this. Sophos requires authentication to download its updates, so we are hosting them in a bucket where your account will be added to the permissions list. This is done automatically when you switch on Local Updates.

ClamAV Lambda

The following lambda package can be leveraged to pull the three update files down to the Amazon S3 bucket of your choice, identified in the BUCKET_NAME environment variable. You will be required to make a few configuration changes: set the BUCKET_NAME environment variable, add a CloudWatch trigger event to kick off on the timing of your choice (hourly, every 4 hours, etc.), modify the timeout to be 3 minutes, and set the memory to at least 1024 MB. With this you should be all set.

Download, import and configure:

https://css-public-docs.s3.amazonaws.com/update_clamav_defs.zip
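To give a sense of what such a function does before you download the package, here is an illustrative outline only (not the contents of the zip above). It assumes the standard ClamAV database files main.cvd, daily.cvd and bytecode.cvd published at https://database.clamav.net/ and the BUCKET_NAME environment variable described above; the actual packaged lambda may fetch and handle files differently.

import os
import urllib.request
import boto3

s3 = boto3.client("s3")

# The destination mirror bucket, taken from the environment variable described above.
BUCKET_NAME = os.environ["BUCKET_NAME"]

# Standard ClamAV database files; the packaged lambda may handle additional files.
DEFINITION_FILES = ["main.cvd", "daily.cvd", "bytecode.cvd"]
CLAMAV_MIRROR = "https://database.clamav.net"

def lambda_handler(event, context):
    for name in DEFINITION_FILES:
        # Download each definition file to Lambda's temporary storage...
        local_path = f"/tmp/{name}"
        urllib.request.urlretrieve(f"{CLAMAV_MIRROR}/{name}", local_path)
        # ...then upload it to the mirror bucket. Per the note below, the agents
        # read these objects via their object URL, so they are uploaded as public
        # (this requires the bucket to allow public ACLs).
        s3.upload_file(local_path, BUCKET_NAME, name, ExtraArgs={"ACL": "public-read"})
    return {"statusCode": 200, "body": "ClamAV definitions updated."}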

Note

The signature update process uses the object URL to access the objects. Therefore, the objects must be public so the standard URL can be used to access them.

Detailed Steps
  1. Create a new lambda from the AWS Lambda service console clam lambda create
  2. Provide initial details for new lambda clam lambda details
  3. Upload lambda zip file provided above
    clam lambda upload
  4. Modify Configuration
    clam lambda config 1

    4.1 Modify Memory and Time Out
    clam lambda config 2

    Note

    You'll need at least 1024 MB of memory. If you run into any issues with that, add a bit more and all should be well.

    I typically see this take about 45 seconds to run, so setting the timeout to 3 minutes gives a little extra headroom.

    4.2 Add Environment Variable for bucket name clam lambda config 3
    clam lambda config 4

    4.3 Modify Role Permissions clam lambda config 5

    This will take you back to this screen. Click the link to take you to the IAM console to modify the role. You can go directly as well. clam lambda config 6

    Click Edit Policy clam lambda config 7

    Add the following to the policy

    {
        "Sid": "SignatureUpdates",
        "Effect": "Allow",
        "Action": [
            "s3:PutObject",
            "s3:GetObject"
        ],
        "Resource": [
            "arn:aws:s3:::<YOUR-BUCKET-NAME-HERE>/*"
        ]
    }
    
    clam lambda config 8

    Note

    Remember the comma on the preceding bracket!

    Click Review Policy and Save Changes to complete the role modification.

    Go back to the Lambda tab and click Cancel on the role edit screen. clam lambda config 9

    4.4 Add the Event Trigger clam lambda config 10

    Select the EventBridge (CloudWatch Events) trigger clam lambda config 12

    Provide details and the cron expression for however often you want to check for updates. The example below shows checking every 2 hours for updates. clam lambda config 13
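    For reference, an every-2-hours schedule can be written either as a rate expression or an equivalent cron expression in EventBridge:

    rate(2 hours)
    cron(0 */2 * * ? *)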

You're all set now! You've created a lambda, applied the code by uploading the zip and modified the config to make it run. This will start downloading ClamAV updates to the specified bucket.

Note

If you would like to review or extend the functionality of the lambda for any reason, extract the zip locally to see the code. Make your modifications, re-zip it all, and upload it again.

Sophos Lambda

Info

Because the Sophos updates already take place inside of AWS, you may not consider local updates as necessary as they are for ClamAV, whose updates are downloaded directly from the internet. But if you still want updates coming from your own bucket, read on.

Sophos updates are already pulled from a bucket: one we host and have granted your application account access to, so the agents simply pull the updates from it. For local updates for the Sophos engine, all you really need to do is copy the update files from our bucket to yours. As we do for ClamAV updates, we have provided some sample code you can turn into a lambda to pull the updates over.

Lambda Code
import boto3
import botocore
from datetime import datetime, timezone

# Source bucket hosted by Cloud Storage Security and the definition file names.
CSS_SOPHOS_BUCKET = 'css-sophos-updates'
VDL_FILE_NAME = 'vdl.zip'
IDE_FILE_NAME = 'ide.zip'
# Destination mirror bucket in your account.
DESTINATION_SOPHOS_BUCKET = '<enter-your-bucket-name-here>'
# Marker object recording when each file was last copied.
LAST_UPDATED_FILE_NAME = 'def_files_last_updated.txt'


def lambda_handler(event, context):
    s3 = boto3.client('s3')
    # Default "last modified" timestamps used when the marker object does not exist yet.
    vdlLastModified = datetime(2006, 3, 14)
    ideLastModified = datetime(2006, 3, 14)
    try:
        lastUpdatedObj = s3.get_object(Bucket=DESTINATION_SOPHOS_BUCKET, Key=LAST_UPDATED_FILE_NAME)
        lastUpdated = lastUpdatedObj['Body'].read().decode('utf-8').split('|')
        vdlLastModified = datetime.strptime(lastUpdated[0], '%Y-%m-%dT%H:%M:%S%z')
        ideLastModified = datetime.strptime(lastUpdated[1], '%Y-%m-%dT%H:%M:%S%z')
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] not in ['404', 'NoSuchKey']:
            raise

    filesWereUpdated = False

    # Copy the VDL file only if the source object is newer than our last copy.
    try:
        s3.copy_object(
            Bucket=DESTINATION_SOPHOS_BUCKET,
            Key=VDL_FILE_NAME,
            CopySource={'Bucket': CSS_SOPHOS_BUCKET, 'Key': VDL_FILE_NAME},
            CopySourceIfModifiedSince=vdlLastModified,
            ACL='bucket-owner-full-control'
        )
        vdlLastModified = datetime.now(timezone.utc)
        filesWereUpdated = True
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] != 'PreconditionFailed':
            raise

    # Copy the IDE file only if the source object is newer than our last copy.
    try:
        s3.copy_object(
            Bucket=DESTINATION_SOPHOS_BUCKET,
            Key=IDE_FILE_NAME,
            CopySource={'Bucket': CSS_SOPHOS_BUCKET, 'Key': IDE_FILE_NAME},
            CopySourceIfModifiedSince=ideLastModified,
            ACL='bucket-owner-full-control'
        )
        ideLastModified = datetime.now(timezone.utc)
        filesWereUpdated = True
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] != 'PreconditionFailed':
            raise

    # Record the new timestamps so the next run only copies changed files.
    if filesWereUpdated:
        marker = boto3.resource('s3').Object(DESTINATION_SOPHOS_BUCKET, LAST_UPDATED_FILE_NAME)
        marker.put(Body=f'{vdlLastModified.strftime("%Y-%m-%dT%H:%M:%SZ")}|{ideLastModified.strftime("%Y-%m-%dT%H:%M:%SZ")}')

    return {
        'statusCode': 200,
        'body': 'Sophos VDL and/or IDE file(s) were updated.' if filesWereUpdated else 'Sophos VDL and IDE files are already up to date.'
    }
Permissions to add to lambda
{
    "Sid": "GetSophosFilesFromCSS",
    "Effect": "Allow",
    "Action": [
        "s3:GetObject",
        "s3:GetObjectTagging"
    ],
    "Resource": [
        "arn:aws:s3:::css-sophos-updates/*",
        "arn:aws:s3:::css-sophos-updates"
    ]
},
{
    "Sid": "PlaceSophosFilesInMirrorBucket",
    "Effect": "Allow",
    "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectTagging",
        "s3:PutObjectAcl"
    ],
    "Resource": [
        "arn:aws:s3:::<enter-your-bucket-name-here>/*",
        "arn:aws:s3:::<enter-your-bucket-name-here>"
    ]
}
Detailed Steps

Please refer to the detailed steps for creating a lambda to retrieve updates under the ClamAV lambda Detailed Steps. The steps will be exactly the same with the exception of Step 3 and Step 4.3, which are explained below.

  1. Sub for Step 3 above - Provide Code for Lambda
    Copy lambda code in the section above . . . sophos lambda copy
    And paste it into the function .py file sophos lambda upload

    Deploy the code sophos lambda deploy

  2. Sub for Step 4.3 above - Add Permissions to Lambda Role Copy the policy section above . . . sophos lambda policy
    And paste it into the role json sophos lambda role

    Note

    Remember the comma on the preceding bracket!

    Important

    Make sure to follow all other steps for creating the trigger and so forth and you should be all set.

Scan and Skip Lists

Buckets themselves are inherently skipped by the fact that they start out turned off. When you enable a bucket for scanning, you are explicitly marking it to be scanned. When you do this and nothing else, all objects in the given bucket will be scanned no matter the path inside the bucket. This portion is all handled from the Bucket Protection page.

The Scan List and Skip List located here inside the Agent Configuration page allow you to take this concept a step further by applying it to paths (folders) within the buckets. Scan listing and skip listing are opposites of one another. Scan listing a path explicitly marks that specific path (or paths) within the bucket to be scanned; all other paths within the bucket will be ignored. You can list as many paths within the bucket as you'd like. For example, suppose you have 5 different paths within a bucket and you want to scan only 3 of them. Simply add the 3 you want scanned to the Scan List. There is no need to add the other 2 paths to the Skip List, as they will automatically be skipped. Skip listing a path stops that defined path (or paths) from being scanned, but leaves all others to be scanned. Depending on how many paths you want to include versus exclude, you can choose which list to leverage. From the previous example, you could instead have skip listed the 2 paths you didn't want to scan, leaving the other 3 to be scanned. With numbers that split you could go either way, but if you have a much larger number of paths, one choice may become more self-evident.

Special Scan / Skip List Capabilities

The root of the bucket is also a "path," which we define in the settings as an empty path. So, if you'd like to scan list or skip list the root along with your paths, you can do so.


You can use a wildcard (*) in the path. This is useful in a repeated sub-path structure where you need to scan or skip a certain folder across all of the top-level folders.

For example, say you have a path structure of Year/Month/scanMe and Year/Month/skipMe, where the Month folder reflects each month of the year, creating 12 unique paths. Underneath each Month folder you have folders you want scanned or skipped. You can create a single path entry to set up scanning in every Month folder like this: Year/*/scanMe. That path will scan the scanMe folder under every month in the bucket without you having to add 12 entries.

This can be leveraged not just in the path, but down to the object name as well. If you only wanted to scan a certain file type in a given bucket/folder, you could add a path with *.jpg.
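Putting those pieces together, Scan or Skip List entries might look like the following (folder and file names are illustrative only):

Year/*/scanMe     scans the scanMe folder under every Month folder
reports/*.jpg     scans only .jpg objects within the reports folder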

Note

As implied above but not spelled out, the * does not mean everything at the level where it is placed and below. If you wanted everything underneath Year, you could add the path /Year/ and that would cover absolutely everything below Year. The wildcard represents only the single level where it is placed.


Similar to wildcards, and often used with them, you can specify a global entry for your Scan and Skip Lists. If you have a repeated path structure across all your buckets and want to define a scan/skip entry that applies to all of them, you can select the _GLOBAL_ option in place of a bucket to define the entry across all buckets.
Global Scan List example

Let's take a look at an example. The following is an S3 bucket with 4 paths (folders) in it. With the default settings, meaning no paths defined, the root and all 4 folders will be scanned. AWS S3 Bucket

The first thing you need to do is enter the <bucket name> in the Scan List or Skip List field and click Add Scan list.
Scan list - Add Bucket
You'll get:
Scan list - Result
Next, you'll add the path you want to scan and click the Add Entry button. *This can be the full multi-depth path. Scan list - Add Path

That's it. You've now identified that the only thing you will scan in the css-protect-01 bucket will be the scan-me folder and nothing else. The steps above can be used for skip listing as well. Look below at the examples.

Scan list and Skip list Empty

Scan list

Skip list

Scan list and Skip list Rules

Note

You probably wouldn't have a scan list and a skip list entry for the same bucket as we see in this example, but it is possible and I'm sure a scenario could be found to support it.


Last update: August 26, 2021