
Troubleshooting

Failed stack creations

How to address stack creation failures

The two most common stack creation failures customers see result from 1) bad parameters specified within the template and 2) deploying in an unsupported region.

Bad Parameters

As the How to Deploy page describes, there are 6 parameters you need to get right.

Ensure that the subnets you select for Subnet A and Subnet B are unique (to each other) and both are a part of the VPC you selected.

Note

Rules have been added to ensure you select these values correctly.

Ensure the CIDR range is one you can access from. Specify a network range that represents the specific network you will be accessing from (e.g. your work network) or open it up to everyone with a value of 0.0.0.0/0.

Warning

Using 0.0.0.0/0 will make the console accessible to the outside world. SSL-based authentication is in place, but you may still want to consider whether "wide open" is the strategy you want to take.

Non-supported Regions

Antivirus for Amazon S3 uses many native AWS services, and not all AWS services are available in all AWS regions. Cognito, which is leveraged for authentication and user management, is not supported in all regions. Below you will find the list of supported regions. Please deploy the console to one of these regions.

Supported regions for Console deployment: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Singapore), EU (Frankfurt), EU (Ireland), and EU (London).

Note

The agents will run in any region that supports Amazon ECS Fargate.

If you are still having issues after ensuring these two criteria are met, please Contact Us.

Addressing conflicted buckets

How to address conflicted buckets

Amazon S3 buckets have an event system associated with them. There are 15 possible events that can be triggered on a bucket. There is one particular event, All object create events, that Antivirus for Amazon S3 listens for. Because AWS only allows one event listener per event, this is where a conflict can occur. If the bucket already has an event listener assigned to All object create events, then we could have a conflict.

There are 3 types of event listeners that can be assigned to each of the events: SNS Topics, SQS Queues and Lambda Functions. Amazon Simple Notification Service (SNS) is a highly available, durable, secure, fully managed pub/sub messaging service. Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. AWS Lambda lets you run code without provisioning or managing servers.

We reflect these conflicts with colored row backgrounds in the bucket management views. A yellow row means your bucket has an SNS Topic on it for the event we care about. It is yellow because we can fix this one automatically, as you'll learn below. A red row indicates you have either a Queue or a Lambda assigned to the event. We cannot directly fix those, and they will require some intervention on your part. Read on to solve these.

Conflicted Buckets
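If you want to check what is already listening on a given bucket without digging through the S3 console, a minimal boto3 sketch follows (the bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket name -- substitute your own
config = s3.get_bucket_notification_configuration(Bucket="my-example-bucket")

# Each list is absent from the response when no listener of that type exists
for key in ("TopicConfigurations", "QueueConfigurations",
            "LambdaFunctionConfigurations"):
    for listener in config.get(key, []):
        arn = (listener.get("TopicArn") or listener.get("QueueArn")
               or listener.get("LambdaFunctionArn"))
        print(key, listener.get("Events"), arn)
```

A bucket reporting s3:ObjectCreated:* (the API name for All object create events) under QueueConfigurations or LambdaFunctionConfigurations corresponds to a red row; under TopicConfigurations, a yellow row.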

Dealing with the Yellows

Because SNS is a pub/sub messaging service, to which multiple subscribers can subscribe, we can simply subscribe the Antivirus for Amazon S3 queue to the Topic to receive all the events triggered after setup. We do this automatically when you enable a bucket in the console. So there is no real conflict when the event listener is an SNS Topic; we just want to make you aware.
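For reference, the subscription we create amounts to a single SNS API call, as in this boto3 sketch (both ARNs are placeholders; the console performs the equivalent automatically when you enable the bucket):

```python
import boto3

sns = boto3.client("sns", region_name="us-east-1")

# Placeholder ARNs -- the existing Topic on the bucket and the scanning queue.
# The queue's access policy must also allow SNS to send messages to it.
sns.subscribe(
    TopicArn="arn:aws:sns:us-east-1:123456789012:ExistingBucketTopic",
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:ScanningQueue",
)
```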

Warning

There is a chance multiple publishers push messages to this topic. If that is the case, Antivirus for Amazon S3 will see all messages and attempt to process them. If you have this setup, Contact Us to work through one additional option we can explore.

Dealing with the Reds

The simple answer here is: turn these into yellows. Done. Ok, that is simplistic and there are some steps needed here, but it is true that if these buckets only had SNS Topics on them, then everyone (your queue, your lambda, and our Antivirus for Amazon S3) could subscribe to the Topic and each get the event package needed.

Note

You should take a moment to determine what functions you are performing based on that event trigger to ensure there won't be a conflict between what you are doing and what we are doing. For example, you expect the object to be there, but we found an infection and quarantined it to another bucket. Does this break your flow?

Now let's walk through how we change from red to yellow.

The short answer is we need to place an SNS Topic on the bucket and assign it to All object create events, then subscribe the Queue/Lambda that was originally assigned to the event to the Topic, and then go back to the Antivirus for Amazon S3 console and enable the bucket for scanning. We will automatically subscribe our queue to the topic you specified.
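If you'd rather script the bucket side of this than click through the S3 console, the call looks roughly like the boto3 sketch below (bucket name and Topic ARN are placeholders). Note that this call replaces the bucket's entire notification configuration, so include any other listeners you want to keep:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket name and Topic ARN; the Topic's access policy must
# allow S3 to publish to it for this call to succeed.
s3.put_bucket_notification_configuration(
    Bucket="my-example-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:123456789012:MyBucketTopic",
                # API name for "All object create events"
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```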

You have two choices when it comes to the SNS Topic: create one from scratch or leverage the one Antivirus for Amazon S3 creates. For simplicity we'll walk through leveraging the Antivirus for Amazon S3 created Topic.

Tip

If you have not protected any buckets in the given region prior to these steps, the SNS Topic we create will not exist yet. Protect another bucket, even temporarily, in that region to have the SNS Topic created, and then you can proceed.

We'll walk through how to fix a bucket with a lambda configured on it. Fixing a red "queue" bucket follows the same process, but with a queue rather than a lambda function. What it looks like in the Antivirus for Amazon S3 console: Red Bucket What it looks like in the Amazon console:
S3 Bucket Lambda Event
As configured here, each time a new object is created/modified, All object create events fires and sends the event to a Lambda function; in this case, the helloworld lambda function.

First, we'll simply delete the event off the bucket. Here is the initial view of the event on the bucket (the previous shot was the drill-down detailed view). S3 Bucket Lambda Event Now select the event, click Delete, and then click Save. S3 Bucket Lambda Event

With this done, if you go back to the Antivirus for Amazon S3 console the bucket will no longer be conflicted. Bucket no longer conflicted
While here, enable the bucket for scanning to apply the SNS Topic to the bucket. Bucket enabled
Which gives you this now: S3 Event

You're halfway there. The next thing to do is subscribe your lambda to the Topic so it continues to receive the new object bucket events. The easiest way to do this is from the Lambda Management page. Navigate to AWS Services-->Lambda--><lambda function name> as seen below. Lambda Pre-Fix
Delete the S3 Trigger seen there and then click the + Add trigger button and populate the following screen as seen below:
Lambda Pre-Fix
Click the Add button and you will now see the following for your lambda function: Lambda Post-Fix
Make sure to click Save on the Lambda page in the upper right corner. Once you do, you will no longer see the S3 Bucket trigger and will be left with just an SNS Trigger as seen below: Lambda Post-Fix Post Save

You have now made it so your Lambda function will get triggered by the SNS Topic instead of directly from the S3 Bucket.

Note

There is one difference in the JSON the Lambda will receive. The JSON coming directly from the bucket was just that: the package of information related to the object added to the bucket. When coming from the SNS Topic, that same package of information is wrapped in an SNS JSON block as the Message. So you will just need to grab that Message block and you will be right back where you started. You can see the differences here:

S3 Bucket Event JSON

SNS Topic Event JSON
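As a rough illustration of grabbing that Message block, here is a minimal Python handler that copes with either delivery path (field names follow the standard S3 and SNS event formats):

```python
import json

def handler(event, context):
    # Normalize: collect the S3 records whether the event came directly
    # from the bucket or arrived wrapped in an SNS envelope
    s3_records = []
    for record in event.get("Records", []):
        if "Sns" in record:
            # SNS delivery: the original S3 event is a JSON string in Message
            inner = json.loads(record["Sns"]["Message"])
            s3_records.extend(inner.get("Records", []))
        elif "s3" in record:
            # Direct S3 delivery
            s3_records.append(record)

    for rec in s3_records:
        bucket = rec["s3"]["bucket"]["name"]
        key = rec["s3"]["object"]["key"]
        print(f"New object: s3://{bucket}/{key}")
```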

Now you are all set! Your Lambda function gets the event, as seen above, and the Antivirus for Amazon S3 SNS Topic also receives the event and can now scan the objects.

Note

One more reminder: if that Lambda was only tied to a single bucket and you only want it to receive events from that single bucket, you may need to take a slightly different approach. As mentioned earlier, the Antivirus for Amazon S3 SNS Topic receives events from all enabled buckets within the region. You will be better off creating a unique SNS Topic that applies only to this bucket. Subscribe your Lambda to it just as we did above and then simply enable this bucket from the console. We'll treat it as a yellow bucket and subscribe our queue to that Topic as well.

Modifying scaling info post deployment

How to modify scaling info post deployment

As discussed in the Sizing section, there may be times you need or want to change how the product scales. This is simple to do on the Agent Settings page, where you can change the scaling characteristics for all agents globally or on a per-region basis.

Note

First see whether changing these values alone meets your requirements. There are other values that may need to be tweaked if you aren't seeing the response you were hoping for. Please Contact Us for more details.

Objects show unscannable with access denied

How to fix access denied issues

This issue is seen when the customer is using AWS-KMS with a Custom KMS key for encryption on the bucket. Standard AES-256 and AWS-KMS leveraging the aws/s3 key will be read and processed just fine. When using a Custom KMS ARN, you must give the scanning agent role access to the key in order to process objects within the bucket. This is straightforward to do.

You have two options, as seen below, to enable this. Global KMS Access is preferable because it is a single action that grants access to all keys; in addition, when keys are rotated or new keys come online for additional buckets, the solution will automatically be able to leverage them. The KMS access is granted using the viaService condition within the permissions, which means the Agent can only use the keys when dealing with S3 buckets and objects. If you'd prefer to grant access on a per-key basis, follow the second option.
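For a sense of what a viaService-scoped grant looks like, here is a sketch of such a statement expressed as a Python dict (the region is a placeholder, and the exact statement generated by the CloudFormation template may differ in actions and naming):

```python
# Sketch of a viaService-scoped policy statement for the agent role.
# Resource "*" covers all keys, but the Condition restricts their use
# to calls made through S3 in the named region (placeholder shown).
kms_statement = {
    "Sid": "AllowKmsOnlyViaS3",
    "Effect": "Allow",
    "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
    "Resource": "*",
    "Condition": {
        "StringEquals": {"kms:ViaService": "s3.us-east-1.amazonaws.com"}
    },
}
```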

Global KMS access by default

Any product update starting with the 4.04.004 release will automatically turn on Global KMS access, as it is the default option within the CloudFormation template starting with that release. If you do not want this set, follow the steps below, but select No for the option.

Global KMS Access (limited scope)

Note

In the case of cross-account scanning, in the remote account you would rerun the CloudFormation template and ensure the KMS option is set to Yes much like you'll see below.

The steps below walk you through setting up the primary account.

  1. Log in to the AWS Console and navigate to the CloudFormation service (ensure you are in the appropriate region) CloudFormation menu

  2. Select the stack that represents the initial Console deployment and click the Update button (CloudStorageSec-AV-for-S3 if you kept the default name) CF Stack Update 1

  3. Leave the selection as Use current template and click the Next button CF Stack Update 2

  4. Find the Allow Access to All KMS Keys option and change this value as desired (Yes to enable, No to disable)

    Warning

    Leave all other parameters as their current values unless you desire them to change as well.

    CF Stack Update 3

  5. Click the Next button on this page and on the following screen

  6. Review the stack changes, tick the I acknowledge box and click the Update Stack button CF Stack Update 4
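If you'd rather script steps 2 through 6, a boto3 sketch follows. The parameter key AllowAccessToAllKmsKeys is an assumption for illustration; confirm the real parameter name in your template before running anything like this:

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")
stack_name = "CloudStorageSec-AV-for-S3"  # default Console stack name

stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]

# Reuse every existing parameter value except the KMS access flag.
# "AllowAccessToAllKmsKeys" is an assumed key -- check your template.
params = []
for p in stack["Parameters"]:
    if p["ParameterKey"] == "AllowAccessToAllKmsKeys":
        params.append({"ParameterKey": p["ParameterKey"],
                       "ParameterValue": "Yes"})
    else:
        params.append({"ParameterKey": p["ParameterKey"],
                       "UsePreviousValue": True})

cfn.update_stack(
    StackName=stack_name,
    UsePreviousTemplate=True,  # "Use current template"
    Parameters=params,
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
)
```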

Individual Key Access

Note

In the case of cross-account scanning, in the remote account you would assign the Custom KMS ARN to the CloudStorageSecRemoteRole.

The steps below walk you through setting up the primary account.

  1. Log in to the AWS Console and navigate to your Key Management Service
    KMS menu

  2. Select the key you are using for encryption on that bucket
    KMS key

  3. Scroll down to Key Users and click the Add button KMS key user

  4. Search for the word agent and select the CloudStorageSecAgentRole-<appID> role
    KMS key agent search

  5. Click the Add button and you should now see the following:
    KMS key user added
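The console's Add button edits the key policy for you. If you'd rather script it, the equivalent is roughly the following sketch (key ID, account ID, and role suffix are placeholders):

```python
import json

import boto3

kms = boto3.client("kms", region_name="us-east-1")

key_id = "1234abcd-12ab-34cd-56ef-1234567890ab"  # placeholder key ID
role_arn = ("arn:aws:iam::123456789012:role/"
            "CloudStorageSecAgentRole-example")   # placeholder role ARN

policy = json.loads(
    kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"])

# Append a "key user" style statement for the scanning agent role
policy["Statement"].append({
    "Sid": "AllowAntivirusAgentUse",
    "Effect": "Allow",
    "Principal": {"AWS": role_arn},
    "Action": ["kms:Decrypt", "kms:DescribeKey", "kms:GenerateDataKey"],
    "Resource": "*",
})

kms.put_key_policy(KeyId=key_id, PolicyName="default",
                   Policy=json.dumps(policy))
```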

You are now all set with regard to processing the encrypted objects within the bucket. You'll need to reprocess any objects that were scanned prior to enabling the key access.

Remote account objects not scanning

How to fix remote scanning

When you link an account, you have to deploy a cross-account role in the remote account. You will see a list of the linked accounts and their status on the Link Accounts page. If a remote account was previously set up with the cross-account role and scanning successfully, then there is an issue with the role in the remote account. It may have been deleted by an account admin by accident, or there may be any number of other reasons for the issue.

Here is what it will look like if Antivirus for Amazon S3 can no longer utilize the cross-account role to get buckets: Linked Account No Access

The first indicator that tips you off may be the Problem Files page results or a Proactive Notification. Problem Files No Access

To fix the problem, you have to fix the cross-account role. You may be able to do this in the remote account directly, but you may just need to start over with the role. If the stack still exists in the remote account, you could simply rerun the stack there. Alternatively, if you need to run the stack for the first time or again from scratch, you can click the action button on the remote account row and select Launch Stack as seen here: Linked Account Launch Stack

Note

This will launch the Quick Stack Create wizard in whichever AWS account you are currently logged into. Ensure you are logged in to the remote account in question.

My scanning agents keep starting up and immediately shutting down

Fix scanning agents never coming online

The main reason we see this happening is that the agent does not have a route (outbound over the internet or through a VPC Endpoint) to Amazon ECR. In this scenario the Fargate Service Task cannot reach ECR to pull the task image and load it properly. When this happens, you will see the following message:

STOPPED (ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 1 time(s): RequestError: send request failed caused by: Post https://api.ecr....)

The agent service just needs access to ECR. There are two main ways to accomplish this:

  • Ensure your VPC is attached to an Internet Gateway with outbound access
  • Leverage VPC Endpoints to reach ECR and make AWS API calls internally through AWS' network (see the sketch below)
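For the VPC Endpoint route, AWS requires the ecr.api and ecr.dkr interface endpoints plus an S3 gateway endpoint for Fargate image pulls (a CloudWatch Logs endpoint is typically needed as well for task logging). A boto3 sketch with placeholder IDs:

```python
import boto3

region = "us-east-1"
ec2 = boto3.client("ec2", region_name=region)

# Placeholder IDs -- substitute your VPC, subnets, security group,
# and route table
vpc_id = "vpc-0123456789abcdef0"
subnet_ids = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]
sg_ids = ["sg-0123456789abcdef0"]
route_table_ids = ["rtb-0123456789abcdef0"]

# Interface endpoints so Fargate can authenticate to and pull from ECR
for service in (f"com.amazonaws.{region}.ecr.api",
                f"com.amazonaws.{region}.ecr.dkr",
                f"com.amazonaws.{region}.logs"):
    ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId=vpc_id,
        ServiceName=service,
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
        PrivateDnsEnabled=True,
    )

# ECR serves image layers from S3, so a gateway endpoint is also required
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId=vpc_id,
    ServiceName=f"com.amazonaws.{region}.s3",
    RouteTableIds=route_table_ids,
)
```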

Important

Both the console and the scanning agents can run in private subnets and do not require public IPs, as long as they can communicate outbound and you can still get to the console.


Don't forget about Security Groups. We have seen customers get tripped up (really, they had just forgotten about the rules they had in place) when trying to gain access to the console because of the Security Group that was in place.

Warning

Security Groups can also have outbound restrictions. By default they are wide open, but that does not mean yours are. If the steps above do not fix this issue, double check your Security Group settings (outbound rules).
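A quick way to review a group's outbound rules (the group ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholder Security Group ID
sg = ec2.describe_security_groups(
    GroupIds=["sg-0123456789abcdef0"])["SecurityGroups"][0]

for rule in sg["IpPermissionsEgress"]:
    # A default group has a single egress rule: all protocols ("-1")
    # to 0.0.0.0/0
    print(rule.get("IpProtocol"), rule.get("FromPort"), rule.get("ToPort"),
          [r["CidrIp"] for r in rule.get("IpRanges", [])])
```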

If taking these steps does not resolve the problem, please Contact Us. We are here to help you!

I cannot access the management console

Fix console access

You may find that you are unable to access the management console UI. This could happen right after installation, after a reboot (from an update or another reason), or after you have changed the subdomain name. The first two scenarios typically turn out to be subnet issues; the last is typically a DNS issue (if it is temporary).

During deployment you are asked to specify a VPC and 2 Subnets (preferably in different AZs) for the Console to run in. If you want to access the console publicly, the subnets must provide public access. What we have seen with numerous customers is that they pick 2 subnets where one happens to be public and one happens to be private. Inevitably, during the initial boot or after a reboot, the console spins up on the private subnet and can no longer be accessed from the URL or public IP. Ensure the subnets you choose are both public so that no matter which one the console spins up in, you'll be able to get to it.

Note

Neither the console nor the scanning agents require a public IP, but:

  • they must have outbound routes to the internet (to pull ECR images)
  • the console must have a public IP if you want to access the application from its URL
    • if you have a VPN or Direct Connect, you can access it from the private IP and forgo the public IP

Check out the Deployment Details page for more details on public and internal routing.

There are times, typically right after the subdomain changes, that you cannot access the application from the new URL. This is typically a DNS issue, and you just have to wait for DNS to catch up in your area. If the issue persists longer than you expect, try to access the console via the public IP assigned to it, which can be identified in the AWS Console. If neither works, explore the subnet issue described above.

The Fix

There is an easy way to fix this: run a Stack Update and select two new subnets. Identify beforehand which subnets are public or which will provide the access mechanism you'd like to use. For the stack update, use the existing template and change nothing but the subnets. When you are able to get back into the console, you can double-check your VPC and Subnets on the Console Settings page.

If desired, you could create an entirely new VPC with Subnets designed to work how you expect and run the console off of those.


Tip

Allow DNS enough time to propagate to all locations.


The vast majority of the time, it will be a VPC or DNS issue as described above. If you do not believe either of those to be the case, then one other troubleshooting step you can take is to go to the ECS service in the AWS Management Console and ensure the console task is running. If it is, you can drill down into the task itself to find the public and private IPs and attempt to access the console directly from those. If you can, then you know the console is up and running properly and the issue lies in the registration with Route53. Please Contact Us if this is the issue.
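If you'd like to script that drill-down, a boto3 sketch follows (the cluster name is a placeholder; use your deployment's actual ECS cluster):

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")
ec2 = boto3.client("ec2", region_name="us-east-1")

cluster = "my-console-cluster"  # placeholder ECS cluster name
task_arns = ecs.list_tasks(cluster=cluster)["taskArns"]

if not task_arns:
    print("No running tasks -- the console task is not up")
else:
    for task in ecs.describe_tasks(cluster=cluster,
                                   tasks=task_arns)["tasks"]:
        # The task's ENI ID is nested in the attachment details
        eni_id = next(d["value"]
                      for a in task["attachments"]
                      for d in a["details"]
                      if d["name"] == "networkInterfaceId")
        eni = ec2.describe_network_interfaces(
            NetworkInterfaceIds=[eni_id])["NetworkInterfaces"][0]
        print("private:", eni["PrivateIpAddress"],
              "public:", eni.get("Association", {}).get("PublicIp", "none"))
```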

Linked Account Out of Date

Update Cross Account Role

Checking for KMS encryption on buckets was added to the product as of version 4.02.000. During this process we check the bucket attributes for Custom KMS encryption. All buckets in the deployment account (Primary by default) can be checked. But if you have linked accounts, the cross-account role that was previously created does not have the permissions needed to perform that check. This is indicated on the Bucket Protection page as a warning message tied to the account nickname (as seen below).

linked account message

You will need to update the cross-account role. Follow the steps below to update it. This action should be taken in the linked account, not the deployment account.

1) Log in to the AWS Console and navigate to the CloudFormation service (in the region you originally deployed it in)

cloud formation service

2) Select the stack that represents the Cross Account Role and click the Update button as seen below

cft role stack

3) On the Update stack page make sure to select Replace current template and then provide the cross account role stack URL in the field as indicated below. Then click the Next button.

https://css-cft.s3.amazonaws.com/LinkedAccountCloudFormationTemplate.yaml
update stack 1

4) Leave the parameters as they were on the Specify stack details page and simply click Next

update stack 2

5) Click Next on the Configure stack options page

6) Tick the I acknowledge ... box and click Update Stack

update stack 3

7) It won't take long to complete the update.

update stack 4

8) Once the update completes, navigate back to the Bucket Protection page and refresh the list of buckets to see the warning disappear and determine the KMS status of your remote buckets.

refresh bucket
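If you'd rather script steps 2 through 8, a boto3 sketch follows (the stack name is a placeholder for whatever your cross-account role stack is called; run this in the linked account):

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")
stack_name = "CloudStorageSecLinkedAccount"  # placeholder stack name

# Keep every parameter as it was ("Leave the parameters as they were")
stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
params = [{"ParameterKey": p["ParameterKey"], "UsePreviousValue": True}
          for p in stack.get("Parameters", [])]

# "Replace current template" with the published cross-account role template
cfn.update_stack(
    StackName=stack_name,
    TemplateURL="https://css-cft.s3.amazonaws.com/"
                "LinkedAccountCloudFormationTemplate.yaml",
    Parameters=params,
    Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
)
```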


Last update: December 2, 2020