
Sizing Discussion

File size and the number of files you plan on scanning will impact how you scale your deployment to fit your needs.



To get a feel for scale and performance, we ran a series of tests with a range of file sizes: 100 KB, 1 MB, 10 MB, 100 MB, 500 MB, 1 GB, 1.5 GB, and 2 GB (the current maximum file size) with the ClamAV engine. For the Sophos engine we tested the same file sizes and also added 5 GB, 10 GB, 50 GB, 100 GB, and 150 GB files. The goal was to answer two questions:

  1. What is the throughput, in GB per hour and in objects per hour?

  2. Does throughput scale linearly as you add scanning agents?

We did this systematically with a fixed number of agents in a controlled environment. We later repeated the tests with auto-scaling in place, watching the scanning agents spin up and down as the load backed off, and found similar results with greater numbers of agents.

We tested with uniquely generated junk files and the hashing function turned off to ensure that each file would be fully scanned. This gives more accurate throughput metrics for the scanning agents.

Your Mileage May Vary

These tests are not real-world tests with your particular data sets. They are purely to give you a feel for how your environment may behave and to help you make deployment decisions. Please test with files similar to what you will see in production. We'd love to have you report your findings so we can see how your environment matches up, or help you get the most out of it.

Event Driven and Scan Existing (Retro) scanning were both tested and showed similar per-agent, per-hour scan rates. Event Driven Scanning does NOT include time spent copying or uploading to the bucket. Scan Existing does NOT include bucket crawling.

In practice, scan existing will start many agents, usually enough to complete scanning the entire bucket of objects, however large, within an hour or less.

  • Where 300 GB/hr is reported, we actually observed initial speeds from 200 GB/hr to 600 GB/hr.

  • After throttling (1 to n hours later) we observed speeds as low as 100 GB/hr (and as high as 300 GB/hr).

  • Testing was done in us-east-1, but a few tests in us-east-2 ran about 20% faster.

Throughput Table

Here are the averages of the results we observed before throttling:

Throughput results for the CSS Premium engine will be released in the future. If you have specific questions on throughput for a scanning agent using the CSS Premium engine, please contact us.

| File Size | ClamAV Engine S3 Integrated (GB/hr) | ClamAV Engine S3 Integrated (~Files/hr) | Sophos Engine S3 Integrated (GB/hr) | Sophos Engine S3 Integrated (~Files/hr) |
| --- | --- | --- | --- | --- |
| 100 KB files | ~1.75 | ~17,500 | ~2.25 | ~22,500 |
| 1 MB files | ~6.5 | ~6,500 | ~20 | ~20,000 |
| 10 MB files | ~9 | ~900 | ~100 | ~10,000 |
| 100 MB files | ~9 | ~90 | ~200 | ~2,000 |
| 500 MB files | ~9 | ~18 | ~300* | ~600 |
| 1 GB files | ~9 | ~9 | ~300* | ~300 |
| 2 GB files | ~9 | ~4.5 | ~300* | ~150 |
| 5 GB files | X | X | ~300* | ~60 |
| 10 GB files | X | X | ~300* | ~30 |
| 50 GB files | X | X | ~300* | ~6 |
| 100 GB files | X | X | ~300* | ~3 |
| 150 GB files | X | X | ~300* | ~2 |

* Pre-throttling figure; see the observed-speed notes above.

Linear Scale Out for S3 Integrated

The table above shows the results of a single scanning agent being bombarded with objects to reach an upper-end, but sustainable, throughput value. We noticed in our testing that each scanning agent you add contributes roughly the same throughput shown above. Just multiply the GB/hr and Files/hr values by the number of agents to estimate what 2 to N scanning agents would achieve.
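As a rough illustration of that linear scale-out, here is a minimal Python sketch that extrapolates a few per-agent ClamAV figures from the table above to a fleet of agents. The dictionary and function names are ours, introduced only for illustration.

```python
# Per-agent throughput figures copied from the ClamAV column of the table above.
CLAMAV_PER_AGENT = {
    "100KB": (1.75, 17_500),   # (GB/hr, files/hr) for a single agent
    "1MB":   (6.5,  6_500),
    "10MB":  (9,    900),
    "1GB":   (9,    9),
}

def fleet_throughput(file_size: str, agents: int) -> tuple[float, float]:
    """Extrapolate (GB/hr, files/hr) linearly to a fleet of `agents` scanning agents."""
    gb_per_hr, files_per_hr = CLAMAV_PER_AGENT[file_size]
    return gb_per_hr * agents, files_per_hr * agents

print(fleet_throughput("1MB", 3))   # -> (19.5, 19500) with 3 agents
```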

In our testing of API file scanning, we have seen significant performance increases at all file sizes up through 1.5 GB. We have not done extended testing yet, so we cannot post full results, and your mileage will vary based on network performance and latency.

So what does this mean?

When adopting a new solution, there are always questions about what is required to meet the business needs.

  • How much infrastructure do I need?

  • Do I scale up or scale out?

  • Do I need to run it all the time?

  • Am I trying to get a certain amount of work done in a particular window of time or can it take as long as it wants?

The answers to these questions can help you determine how you want to run the solution. The simple approach, taken with a your-mileage-may-vary consideration, is to look at your environment, identify the types of files you deal with and their average size, and apply that to the chart above to get a baseline for the amount of work the scanning agents can achieve.

For example, let's say most of your files are approximately 1 MB in size. A single agent can do ~7,000 of those files an hour. How many files per hour or per day are you receiving? Do you need to scan them in "realtime" as they come in throughout the day, or within a certain scan window? How old will you allow an object to get before it is scanned?

Extending the example, let's say 7,001 files come in all at once. A single agent will evaluate those in an hour (about 2 per second), but many of the files will sit there for tens of minutes, up to a full hour for that 7,001st file. Is that OK? If not, then we have to judge the impact of scaling additional agents in this scenario. Adding a second agent roughly doubles the throughput, so we're now at ~14,000 files per hour (about 4 per second), and you can evaluate the files in ~30 minutes instead of 60. Your oldest file would wait at most 30 minutes before getting scanned. Adding a third agent takes you down to ~20 minutes, and so on.
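Here is a minimal sketch of the arithmetic in that example, assuming each agent sustains the ~7,000 files/hr rate from the table; the function name is illustrative only.

```python
def backlog_clear_minutes(burst_files: int, files_per_hour_per_agent: float, agents: int) -> float:
    """Worst-case wait (in minutes) for the last file in a one-time burst,
    assuming linear scale-out of the per-agent scan rate."""
    fleet_rate_per_min = files_per_hour_per_agent * agents / 60
    return burst_files / fleet_rate_per_min

# The 1 MB example from the text: ~7,000 files/hr per agent, a 7,001-file burst.
for agents in (1, 2, 3):
    print(agents, "agent(s):", round(backlog_clear_minutes(7_001, 7_000, agents)), "minutes")
# -> roughly 60, 30, and 20 minutes, matching the walkthrough above.
```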

With that, you can start to think through how you want to drive your system. The main configuration available to you today for this is the Number of Messages in Queue to Trigger Agent Auto-Scaling setting chosen during deployment. It can be modified after the fact if you find your original choice is not allowing you to meet your goals. The way the auto-scaling works is that the queue must hold the number of entries you specified for at least 1 minute to trigger the alarm that generates the scaling event. Scaling back in works in much the same way, but in the opposite direction.

In the scenario above, how could you ensure no item waits more than 4 minutes? Looking at the numbers, a single agent can scan ~120 of those files per minute and therefore ~480 in 4 minutes. As soon as you see more than ~120 entries sitting in the queue for longer than 1 minute, you are starting to fall behind. But it isn't until the queue has held ~480 entries for longer than a minute that you risk missing that "at most 4 minutes" scan window. So the queue value you specify might be somewhere between 240 and 360, which allows for the time it takes to spin another agent up. If files are coming in so fast that the queue backs up and sits above roughly double that value (say 700 entries) for a minute, another alarm triggers a scaling event and another agent spins up, and so on. The queue value you pick during deployment is used in multiples for triggering scaling events up and down, so this choice should allow scanning agents to spin up on demand and keep serving that 4-minute window. As entries drop below those multiples, scanning agents will start to spin down.
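Here is a back-of-the-envelope sketch of that queue-threshold reasoning, assuming the same ~7,000 files/hr per-agent rate; the `spinup_margin` knob is an assumption we introduce for illustration, not a product setting.

```python
def queue_trigger_estimate(files_per_hour_per_agent: float,
                           max_age_minutes: float,
                           spinup_margin: float = 0.6) -> dict:
    """Back-of-the-envelope numbers for picking the auto-scaling queue threshold.
    `spinup_margin` discounts the break-even depth to leave time for a new agent
    to spin up before the scan window is at risk."""
    per_minute = files_per_hour_per_agent / 60            # ~120 files/min in the example
    falling_behind = per_minute                            # >1 minute of work queued
    break_even = per_minute * max_age_minutes              # ~480 in the 4-minute example
    return {
        "falling_behind_depth": round(falling_behind),
        "break_even_depth": round(break_even),
        "suggested_trigger": round(break_even * spinup_margin),  # lands near the 240-360 range above
    }

print(queue_trigger_estimate(7_000, 4))
```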

In this scenario, you are receiving more than 120 files per minute. If you never have this type of inflow, then a single agent will always keep up and you are always within a few seconds to a minute of scanning. The idea to take away from this section is to evaluate the inflow of objects along with the size of the objects and determine your acceptable scan window. Maybe it isn't 4 minutes, but rather 4 seconds. Thinking through how that changes your deployment allows you to determine the scaling values.

The alternative to good queue choices is to just brute force it by upping the minimum running agents. This will add infrastructure costs, but you'll have the agents ready and waiting for the loads to come in.

As items are peeled off the queue, scaling contractions will happen and the scanning agents will drop off. There is a cool-down period, so you may notice they don't drop off immediately, but the way AWS manages it is reasonable.

In the brute-force scenario the agents won't scale in, because you have raised the minimum. You'd have to change that value directly if you want it reduced.

Other sample scenarios, using the slower ClamAV throughputs (the arithmetic behind them is sketched after this list):

  • 100 GB of 1 GB files (100 files) in 4 hours: 3 agents with an auto-scaling queue of 10

    • Baseline: 1 agent = ~10 GB/hr, 3 agents = ~30 GB/hr

  • 200 GB of 100 KB files (2,000,000 files) in 3 hours: 34 agents with an auto-scaling queue of 17,000

    • Baseline: 1 agent = ~1.77 GB/hr; assume 5 agents = ~10 GB/hr, 50 agents = ~100 GB/hr

  • 1 TB of 100 MB files (10,000 files) in 2 hours: 50 agents with an auto-scaling queue of 100

    • Baseline: 1 agent = ~10 GB/hr, 50 agents = ~500 GB/hr
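As promised above, here is a minimal sketch of the arithmetic behind these scenarios, assuming linear scale-out and the same rounded ClamAV rates used in the baselines (including the ~2 GB/hr per agent implied for 100 KB files by the "5 agents = ~10 GB/hr" assumption).

```python
import math

def agents_needed(total_gb: float, gb_per_hour_per_agent: float, window_hours: float) -> int:
    """Minimum agent count to scan `total_gb` within `window_hours`,
    assuming each agent sustains the given per-agent rate."""
    required_rate = total_gb / window_hours
    return math.ceil(required_rate / gb_per_hour_per_agent)

# The sample scenarios above, using the same rounded ClamAV rates:
print(agents_needed(100, 10, 4))     # 100 GB of 1 GB files in 4 hours   -> 3 agents
print(agents_needed(200, 2, 3))      # 200 GB of 100 KB files in 3 hours -> 34 agents
print(agents_needed(1_000, 10, 2))   # 1 TB of 100 MB files in 2 hours   -> 50 agents
```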

We are happy to have a discussion with you on these metrics. Please contact us if you'd like to learn more.
