There is very little to configure to get the product up and scanning your data. The only thing you truly have to do after deployment is enable buckets for scanning. This was covered in the Initial Configuration, but we'll go into more detail here.
The Bucket Protection table is a complete list of the buckets you have in your account or linked accounts and their current status. You will see all buckets across all regions within the account the console is running in as well as active linked accounts.
The account identifier shows as Primary. This represents the default account you deployed the solution in. If you link accounts for cross-account scanning, you will see a different identifier (the nickname you gave it) for the buckets that come from other accounts.
The bucket list is refreshed every 30 minutes in the background, but if you have recently created new buckets or deleted existing ones, you can force a refresh with the Actions --> Refresh Buckets menu item at the top of the buckets list.
You may have noticed the Object Count and Total Size (GB) values for each bucket and summed for each region. These numbers are not reliable in real time. This data is pulled from CloudWatch Metrics for S3 Buckets. Amazon only updates these metrics once per day, at the end of each day, so the numbers you are seeing are always a day old.
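For readers who want to cross-check these values, the sketch below builds request parameters for the two daily S3 storage metrics that CloudWatch publishes in the `AWS/S3` namespace (`BucketSizeBytes` and `NumberOfObjects`). It is an illustration only: the helper names and the bucket name are hypothetical, it assumes the Standard storage class and 1 GB = 1024^3 bytes, and it does not claim to reproduce the query the console itself runs.

```python
from datetime import datetime, timedelta, timezone

DAY_SECONDS = 86400  # S3 storage metrics are reported once per day


def storage_metric_params(bucket, metric, storage_type, now=None):
    """Build keyword arguments for a CloudWatch metric statistics request (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        # Look back two days so that at least one daily datapoint is covered.
        "StartTime": now - timedelta(days=2),
        "EndTime": now,
        "Period": DAY_SECONDS,
        "Statistics": ["Average"],
    }


def bytes_to_gb(size_bytes):
    """Convert bytes to GB (assumption: 1 GB = 1024**3 bytes)."""
    return size_bytes / 1024 ** 3


# "example-bucket" is a placeholder name.
size_params = storage_metric_params("example-bucket", "BucketSizeBytes", "StandardStorage")
count_params = storage_metric_params("example-bucket", "NumberOfObjects", "AllStorageTypes")
```

These dictionaries have the shape expected by the boto3 CloudWatch client's `get_metric_statistics` call, so they could be passed to it as keyword arguments.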
We check certain attributes related to buckets to give you information pertinent to setting up protection. As a result, you may notice icons next to the bucket names. The two main aspects we check now are public status and the encryption status. We want you to be informed on which buckets are public and how they are public. We also want to stop you from scanning whole buckets of encrypted objects when we don't have permissions to the key to decrypt those objects. Giving the AgentRole permissions to the key will solve this issue.
Icons appear in the following cases:
- When the bucket is capable of being public, but is not actually public
    - Some of the Block Public Access checks are turned off, but there isn't an ACL or a Bucket Policy set to make the bucket public
- When the bucket is truly public via ACL or Bucket Policy
    - Some of the Block Public Access checks are turned off and there is an ACL or a Bucket Policy set to make the bucket public
    - The tooltip will give you details on the ACL settings
- When KMS encryption is enabled on the bucket and the AgentRole does not have permission to the key
    - Follow the troubleshooting steps for how to enable the AgentRole with the key
    - You will not be able to turn on protection for a bucket or perform a Scan Existing if the AgentRole does not have permission to the key
- When KMS encryption is enabled on the bucket and the AgentRole does have permission to the key
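The attribute checks above can be summarized as a small decision sketch. The function names and return labels are hypothetical, and the first branch assumes that a bucket with every Block Public Access check turned on is treated as not public; the text does not state how the console handles that case.

```python
def public_status(block_public_access_fully_on, has_public_acl_or_policy):
    """Classify a bucket's public exposure (illustrative labels, not console output)."""
    if block_public_access_fully_on:
        # Assumption: all Block Public Access checks on means no public icon.
        return "not public"
    if has_public_acl_or_policy:
        # Some checks are off and an ACL or Bucket Policy makes the bucket public.
        return "public"
    # Some checks are off, but nothing actually makes the bucket public.
    return "capable of being public"


def can_enable_protection(kms_encrypted, agent_role_has_key_access):
    """Protection and Scan Existing are blocked only when the KMS key is inaccessible."""
    return (not kms_encrypted) or agent_role_has_key_access
```

For example, `can_enable_protection(True, False)` is `False`, which corresponds to the case where the AgentRole must first be given permission to the key.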
Enable Buckets for Scanning
You can enable buckets one at a time with the toggle on each bucket row, or you can multi-select with the Select Visible button at the top of the page or by leveraging the checkboxes on each bucket row for a subset. For any buckets in new regions where you aren't currently scanning, you will be asked to configure the VPC and Subnet(s) as we saw in the Initial Configuration. If you select multiple new regions during the same action, you will be prompted to configure each one. The steps below walk through selecting two buckets from two different regions (including us-west-1) that are not currently enabled.
Select Visible means all rows available from the filtered search. If no search criteria have been entered, then all buckets in your list will be selected. If you have filtered the list down with a search, then only those search results will be selected.
This is a great way to find the set of buckets you want to take a batch action against and trigger the action.
Choosing Turn On Selected from the Actions drop-down button opens a popup where you must configure a VPC and Subnet(s) for any regions that are not already set up.
The VPC and Subnets you choose must have an outbound path to reach Amazon ECR. If not, the agents will never boot properly. As discussed in the troubleshooting topic, you can do this with outbound access to the internet or through VPC Endpoints that give you access to ECR and API.
We now show Private next to each VPC to indicate whether or not the VPC is tied to an Internet Gateway and is therefore likely to have an outbound path. We also show Restricted next to each Subnet to indicate whether each Subnet appears to have outbound routing.
Scan Existing Objects
Any time you enable a bucket for scanning, you will be asked if you would like to scan all the existing objects in that bucket as well. You can also trigger a scan of existing objects at any time from the Actions drop-down button at the top of the bucket list. In the scenario above, you turned on two buckets and were first prompted to select network settings. Once that is complete, you will be presented with the Scan Existing Objects popup. If you'd prefer not to scan existing objects at this time, you can simply click Don't Scan and the popup will close. If you do want to scan your existing objects, select some or all of the buckets you enabled for event-based scanning and then click the Scan Selected button. You must also select the disclaimer checkbox before the button will be enabled.
Turning a bucket on for event-based scanning is not a prerequisite for scanning existing objects. Whether the bucket is turned on or off, and whether it has a conflict or not, a scan existing can be triggered on it. More than one scan existing can be triggered on a bucket if desired (picture two or more distinct, non-contiguous date ranges). Triggering a scan existing for one or more buckets is simple on this page. Select one or many buckets using the Select All button or the checkboxes and then select Scan Existing Objects from the Actions button. This will pull up the same popup as seen above.
For the time window, the default is the beginning of time through the current time, which is intended to scan all objects within the bucket. The date picker allows you to select one of the preset values or create a completely custom range. Custom Range allows you to select down to specific hours and minutes of the day if needed.
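To make the time window semantics concrete, here is a minimal sketch of how an object could be tested against one or more windows. It assumes the window is compared against an object's last-modified timestamp, which the text does not confirm, and the function name is hypothetical. Each window stands for one separately triggered scan existing.

```python
from datetime import datetime, timezone


def in_any_window(last_modified, windows=None):
    """Return True if last_modified falls inside at least one (start, end) window.

    A bound of None is open-ended. No windows at all models the default of
    "beginning of time through current time", so every object qualifies.
    """
    if not windows:
        return True
    for start, end in windows:
        after_start = start is None or last_modified >= start
        before_end = end is None or last_modified <= end
        if after_start and before_end:
            return True
    return False


utc = timezone.utc
# Two distinct, non-contiguous custom ranges, specified down to the minute.
january = (datetime(2023, 1, 1, 0, 0, tzinfo=utc), datetime(2023, 1, 31, 23, 59, tzinfo=utc))
june = (datetime(2023, 6, 1, 0, 0, tzinfo=utc), datetime(2023, 6, 30, 23, 59, tzinfo=utc))

obj_time = datetime(2023, 6, 15, 8, 30, tzinfo=utc)
default_hit = in_any_window(obj_time)
january_hit = in_any_window(obj_time, [january])
either_hit = in_any_window(obj_time, [january, june])
```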
The instructions for this popup are collapsed by default. Expand for detailed steps.
The recommended number of agents is generated with the goal of completing the retro scan in a window of roughly 12 hours. Based on the number of GB to scan and the number of objects, we roughly calculate how many agents are needed to meet the 12-hour window. You can spin up more agents or reduce the number. The cost math is the same whether you complete the job in 1 hour or 24 hours because cost is based on running minutes.
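The cost argument can be checked with simple arithmetic. The sketch below is not the product's sizing formula (which depends on GB and object count); it assumes the total work is already expressed in agent-minutes, that work divides evenly across agents, and that there is no startup overhead. The function names and the 7,200 agent-minute figure are hypothetical.

```python
import math


def agents_for_window(total_agent_minutes, window_hours=12):
    """Agents needed to finish a fixed amount of work within the window."""
    return max(1, math.ceil(total_agent_minutes / (window_hours * 60)))


def billed_agent_minutes(total_agent_minutes, agents):
    """Each agent runs total/agents minutes, so the billed total is unchanged."""
    minutes_per_agent = total_agent_minutes / agents
    return minutes_per_agent * agents


work = 7200  # hypothetical job size in agent-minutes

agents_12h = agents_for_window(work, 12)  # 10 agents
agents_1h = agents_for_window(work, 1)    # 120 agents
agents_24h = agents_for_window(work, 24)  # 5 agents

cost_1h = billed_agent_minutes(work, agents_1h)
cost_24h = billed_agent_minutes(work, agents_24h)
```

Under these assumptions the 1-hour and 24-hour runs bill the same number of agent-minutes; only the wall-clock time differs.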
Crawling performance will be impacted by the specs of the Console container. The default specs of 0.5 vCPU will crawl one million objects (spread across 4 buckets in our testing) in about 4 minutes. A console with 2 vCPU crawled the same one million objects in ~70 seconds. Extrapolate that out over tens of millions or billions of objects and it can make a significant difference. We will be providing a mechanism to dynamically change the console specs for the duration of the crawl, but in the meantime please make adjustments to the specs in the Console Settings.
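As a rough illustration of that extrapolation, the sketch below scales the two measured data points linearly with object count. Linear scaling is an assumption, as is the hypothetical function name; real crawl times will vary with bucket layout and other factors.

```python
# Measured crawl times from the text: seconds per one million objects, keyed by vCPU.
SECONDS_PER_MILLION = {0.5: 240.0, 2.0: 70.0}


def estimate_crawl_seconds(object_count, vcpu):
    """Linearly extrapolate crawl time (assumes time scales with object count)."""
    return object_count / 1_000_000 * SECONDS_PER_MILLION[vcpu]


one_billion = 1_000_000_000
default_hours = estimate_crawl_seconds(one_billion, 0.5) / 3600
larger_hours = estimate_crawl_seconds(one_billion, 2.0) / 3600
```

For one billion objects this works out to roughly 67 hours at 0.5 vCPU versus roughly 19 hours at 2 vCPU.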
More details on retro scanning your existing objects can be found here.
In addition to the on-demand scanning that Scan Existing offers, you can set up schedules to scan your buckets as well. You simply need to select the buckets you would like to scan and then define the scan frequency as desired.
Select the buckets and then, from the action menu, click on Create Schedule. A modal will pop up to allow you to review the selected buckets and define the scan frequency (daily, weekly, monthly, yearly).
For more information, review the Scheduled Scans documentation page.
Notice, as we saw in the Bucket Protection Status chart, there are three overall protection statuses:
- Protected by this console
- Not Protected
- Protected by another Cloud Storage Security Antivirus for Amazon S3 console
On top of these three main statuses, you may have buckets that are in some form of conflict for scanning. There are three conflicts that can arise represented by two colors: yellow and red. The yellow color indicates a conflict we can fix, but want to make you aware of it in case there are considerations you need to make. The red color indicates one of two conflicts that we cannot automatically address. This will require you to intervene to enable scanning. Please check out the Trouble Shooting - Address Conflicts section for detailed steps on how to resolve these.
On top of those main conflicts, you can see a purple color tied to the primary account or a linked account if there is another Antivirus for Amazon S3 console running and protecting buckets.
Every field in the table can be searched using the Search field at the top of the page. Want to see only the buckets in 'east'? Search for that. Want all of the buckets that have a particular piece of text in their name? Just type in that piece of text. You can also search for multiple things separated by a space. Want to see all the buckets in us-east-1 for the Production account? Just add both of those with a space between them.
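The multi-term behavior described above can be modeled as follows. This sketch assumes that every space-separated term must match some field (AND semantics, as the us-east-1 plus Production example suggests) and that matching is a case-insensitive substring test; the console's actual matching rules may differ. The bucket rows and the function name are made up for illustration.

```python
def matches(row, query):
    """True when every space-separated term appears in at least one field."""
    terms = query.lower().split()
    fields = [str(value).lower() for value in row.values()]
    return all(any(term in field for field in fields) for term in terms)


# Hypothetical rows standing in for the bucket table.
buckets = [
    {"name": "logs-archive", "region": "us-east-1", "account": "Production"},
    {"name": "logs-archive-dev", "region": "us-east-1", "account": "Primary"},
    {"name": "media-assets", "region": "us-west-1", "account": "Production"},
]

hits = [b["name"] for b in buckets if matches(b, "us-east-1 Production")]
east = [b["name"] for b in buckets if matches(b, "east")]
```

Here `hits` contains only `logs-archive`, while the single term `east` matches both us-east-1 buckets. An empty query matches every row, which lines up with Select Visible selecting all buckets when no search has been entered.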
Special Search Terms
There are some hidden terms that can be searched on: public, encrypt, conflict, on/off.
Protection Status can be searched by either On or Off, as the toggle would indicate.
You may find bucket names that contain on within the name, which could throw the results slightly off. In this case, use column sort on Protection Status.
Bucket Conflicts can be searched for by using the word conflict in the search field. This will return all the highlighted rows that reflect a potential event conflict.
public will identify all buckets that have some public aspects to them as seen in the Bucket Attributes above.
Searching on encrypt will return all buckets that have a KMS key associated with them and identify whether the AgentRole has access to the key as seen in the Bucket Attributes above.
Additional Search Capabilities
We also provide the ability to search using regex within the search field. This gives you great flexibility to narrow down exactly what you are looking for. Whereas a general partial word specified in the Search field may pull back more rows than you'd like, the regex option allows you to pattern match more precisely.
For example, searching on webinar returns 4 matching buckets.
But if I want just the "webinar" buckets that end with "files", I could specify Regex(webinar.*files) to get the two buckets that contain "webinar" and end with "files".
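The same narrowing can be reproduced outside the console with Python's `re` module. The bucket names below are hypothetical, chosen to mirror the counts in the example, and the console's regex dialect is assumed to behave like a standard unanchored search; the `Regex(...)` wrapper is console syntax and is not part of the pattern.

```python
import re

# Hypothetical bucket names chosen to mirror the example above.
names = [
    "webinar-2021-files",
    "webinar-demo-files",
    "webinar-recordings",
    "webinar-slides",
    "canada-reports",
]

# Plain partial-word search: every name containing "webinar".
plain = [n for n in names if "webinar" in n]

# Regex search: "webinar" followed later by "files".
pattern = re.compile(r"webinar.*files")
narrowed = [n for n in names if pattern.search(n)]
```

`plain` holds four names and `narrowed` holds two. Because the search is unanchored, a name with extra text after "files" would also match; appending `$` to the pattern would require the name to end there.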
Another useful capability is that you can aggregate multiple individual searches to build a larger selected list. I can create a potentially complex regex or I can do multiple simple searches for my selections. Extending the example above, albeit a simple one, might look as follows.
Search for webinar and select the webinar buckets you want.
Notice the bottom summary line:
Showing 4 of 112 buckets - 2 selected
Remove webinar from the search, enter canada in place of it, and select the canada bucket you want.
Now notice the bottom summary line:
Showing 1 of 112 buckets - 3 selected (2 not currently visible)
Your first search selections are maintained along with any subsequent search selections, so you can build up your selected list very simply this way. This can be used for one-off retro scanning (Scan Existing) as well as the basis for creating schedule-based scans.
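The way selections persist across searches can be modeled with a small sketch. The class, its method names, and the bucket names are hypothetical; it only illustrates the behavior described above (selection stored separately from the current filter) and is not how the console is implemented.

```python
class BucketSelection:
    """Tracks selected buckets independently of the current search filter."""

    def __init__(self, all_buckets):
        self.all_buckets = list(all_buckets)
        self.visible = list(all_buckets)
        self.selected = set()

    def search(self, term):
        """Filter the visible rows; existing selections are left untouched."""
        self.visible = [b for b in self.all_buckets if term.lower() in b.lower()]

    def select(self, *names):
        """Tick the checkbox on specific visible rows."""
        self.selected.update(n for n in names if n in self.visible)

    def select_visible(self):
        """Select every row in the current filtered view."""
        self.selected.update(self.visible)

    def summary(self):
        hidden = len(self.selected - set(self.visible))
        text = (f"Showing {len(self.visible)} of {len(self.all_buckets)} buckets"
                f" - {len(self.selected)} selected")
        if hidden:
            text += f" ({hidden} not currently visible)"
        return text


# 112 made-up buckets: 4 webinar, 1 canada, 107 others.
all_buckets = (
    ["webinar-2021-files", "webinar-demo-files", "webinar-recordings", "webinar-slides"]
    + ["canada-reports"]
    + [f"data-{i:03d}" for i in range(107)]
)

selection = BucketSelection(all_buckets)
selection.search("webinar")
selection.select("webinar-2021-files", "webinar-demo-files")
first_summary = selection.summary()

selection.search("canada")
selection.select("canada-reports")
second_summary = selection.summary()
```

With this made-up data, `first_summary` and `second_summary` reproduce the two summary lines shown in the walkthrough.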