Below are some of the most common questions related to Architecture.
Do I need to make any changes to my application to use the product?
No. Antivirus for Amazon S3 fits into your existing workflow; you do not have to make any changes to your application or your current workflow.
Can I scan S3 objects from more than one account from within the same deployment?
Yes, Antivirus for Amazon S3 supports cross-account scanning. This means you can centrally install the console and scanning agents to protect not only the account you are deployed within, but also any other AWS account where you can install a cross-account role.
Check out the Linked Accounts documentation for more details.
Can I set up a staging bucket for all of my files to land in first, and then move them to a production bucket after they have been found to be clean?
Two-Bucket System
Yes, it is very easy to set up a two-bucket system, and we have many customers using this approach. We provide two methods of achieving this.
First, within the Configuration > Scan Settings menu there is an AV Two-Bucket System Configuration option. This provides a quick and easy way to choose your source region or bucket(s) and a destination bucket to promote your clean files to. With this option, our agent handles promoting the clean files as part of the scanning process.
Second, you can build the promotion step yourself with a small Lambda function subscribed to the scan result notifications:
Create a Python (latest supported runtime) Lambda function with the sample code provided below
Make adjustments to the Lambda settings with the information below
Subscribe the Lambda to the SNS Notifications topic (you can also do this from the CLI; see the sketch after the notes below)
Add IAM permissions to the Lambda role with the permission block provided below
Modify the topic subscription to filter down to clean objects with the filter block provided below
Test clean and "not clean" files to ensure the behavior is as expected
*Linked Account Buckets: Optionally, you can build your two-bucket system on Linked Account buckets as well by adding these pieces.
**Multipart Files Lambda: This Lambda also uses Step Functions to handle multipart and large files.
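If you prefer the CLI for the subscription step, here is a minimal sketch (the region, account ID, topic name, and function name are placeholders for your SNS Notifications topic and copy Lambda):

# Subscribe the copy Lambda to the SNS Notifications topic
aws sns subscribe \
  --topic-arn arn:aws:sns:<region>:<account_id>:<NotificationsTopicName> \
  --protocol lambda \
  --notification-endpoint arn:aws:lambda:<region>:<account_id>:function:<CopyLambdaName>

# Allow the topic to invoke the Lambda
aws lambda add-permission \
  --function-name <CopyLambdaName> \
  --statement-id AllowSNSInvoke \
  --action lambda:InvokeFunction \
  --principal sns.amazonaws.com \
  --source-arn arn:aws:sns:<region>:<account_id>:<NotificationsTopicName>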
Sample Copy Lambda
The code below is a starting point and works out of the box, but more can be done with it and to it. Feel free to do so.
import json
import boto3
import os
from botocore.exceptions import ClientError, ParamValidationError
import random
from urllib import parse

def lambda_handler(event, context):
    try:
        print(json.dumps(event))
        SOURCE_BUCKET = os.getenv("SOURCE_BUCKET", 'any')
        DESTINATION_BUCKET = os.getenv("DESTINATION_BUCKET", '<some failover bucket>')
        DELETE_STAGING = os.getenv("DELETE_STAGING", 'no')
        # print("Source bucket is:" + SOURCE_BUCKET)
        # print("Destination bucket is:" + DESTINATION_BUCKET)
        record = event['Records'][0]
        messageBucket = record['Sns']['MessageAttributes']['bucket']
        print("The messageBucket value is:")
        print(messageBucket['Value'])
        if (messageBucket['Value'] == SOURCE_BUCKET or SOURCE_BUCKET == 'any'):
            message = json.loads(record['Sns']['Message'])
            # print("The message content is:" + str(message))
            # print("The message key is: " + message['key'])
            s3 = boto3.resource('s3')
            copy_source = {'Bucket': messageBucket['Value'], 'Key': message['key']}
            if 'PartsCount' in s3.meta.client.head_object(Bucket=messageBucket['Value'], Key=message['key'], PartNumber=1):
                # get the tags, then copy with tags specified
                # print("doing a multipart copy with tags")
                try:
                    tagging = s3.meta.client.get_object_tagging(Bucket=messageBucket['Value'], Key=message['key'])
                    # print("get object tagging = " + str(tagging))
                    s3.meta.client.copy(copy_source, DESTINATION_BUCKET, message['key'],
                                        ExtraArgs={'Tagging': parse.urlencode({tag['Key']: tag['Value'] for tag in tagging['TagSet']})})
                except Exception as e:
                    print(e)
                    raise e
            else:
                # copy as normal
                # print("doing a normal copy")
                try:
                    s3.meta.client.copy_object(CopySource=copy_source, Bucket=DESTINATION_BUCKET, Key=message['key'])
                except Exception as e:
                    print(e)
                    raise e
            print("Copied: " + message['key'] + " to production bucket: " + DESTINATION_BUCKET)
            # print("Delete files: " + DELETE_STAGING)
            if (DELETE_STAGING == 'yes'):
                try:
                    s3.meta.client.delete_object(Bucket=SOURCE_BUCKET, Key=message['key'])
                    print("Deleted: " + message['key'] + " from source bucket: " + SOURCE_BUCKET)
                except Exception as e:
                    print(e)
                    raise e
            return {'statusCode': 200, 'body': json.dumps('Non-infected object moved to production bucket')}
        return {'statusCode': 200, 'body': json.dumps('Not from the Staging bucket')}
    except ClientError as e:
        return {'statusCode': 400, 'body': "Unexpected error: %s" % e}
    except ParamValidationError as e:
        return {'statusCode': 400, 'body': "Parameter validation error: %s" % e}
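To exercise the copy Lambda on its own before testing with real uploads, you can use a Lambda console test event shaped like the skeleton below. This skeleton only contains the fields the sample code above reads (the bucket message attribute and a key inside the message); a real notification carries additional fields, and the object named in key must actually exist in the staging bucket for the copy to succeed.

{
  "Records": [
    {
      "Sns": {
        "MessageAttributes": {
          "bucket": { "Type": "String", "Value": "<staging bucket name>" }
        },
        "Message": "{\"key\": \"path/to/test-object.txt\"}"
      }
    }
  ]
}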
Make Adjustments to Lambda Settings
There are some adjustments to the Lambda you'll probably need to make:
Set the environment variables for the Staging Bucket and Production Bucket names:
SOURCE_BUCKET: Dirty or Staging bucket name identifying the originating bucket. (Note: you can leverage SNS Topic Filtering to eliminate the need for this value and the if check inside the code.)
DESTINATION_BUCKET: Clean or Production bucket name identifying where to copy clean files to.
DELETE_STAGING: yes or no value based on whether you want the original file to be deleted after the copy has occurred.
Change the Timeout under General configuration to a value that will work for the typical file sizes you deal with. The larger the files, the longer you may want to make it so the Lambda does not time out before the copy finishes.
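If you prefer to make these adjustments from the command line, here is a minimal sketch using the AWS CLI (the function name, timeout, and bucket values are placeholders to adjust for your environment):

# Set the environment variables and raise the timeout in one call
aws lambda update-function-configuration \
  --function-name <CopyLambdaName> \
  --timeout 300 \
  --environment "Variables={SOURCE_BUCKET=<staging bucket name>,DESTINATION_BUCKET=<production bucket name>,DELETE_STAGING=no}"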
Permissions to add to Lambda Role
Add an inline policy to the Lambda role that was created along with the Lambda. Paste the below into the JSON screen and change the sections that have <> to match your staging bucket and production destination bucket names.
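As a minimal sketch of such an inline policy (this is an illustration rather than the product's official block; bucket names in <> are placeholders), the sample copy Lambda needs read, tag read, and delete access on the staging bucket, plus write and tag write access on the production bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadAndCleanUpStaging",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectTagging", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::<staging bucket name>/*"
    },
    {
      "Sid": "WriteToProduction",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:PutObjectTagging"],
      "Resource": "arn:aws:s3:::<production bucket name>/*"
    }
  ]
}

If your buckets use SSE-KMS, the role will also need the matching kms:Decrypt and kms:GenerateDataKey permissions, and the multipart/large-file variant additionally needs states:StartExecution on your state machine.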
Modify the Topic Subscription Filter
Go to the SNS Topic and find the subscription created for the Lambda. Edit the filter settings by pasting in the value below. You can modify the filter further to include more scan results, or even filter down by bucket. Bucket filtering is a good idea if you have event-based scanning set up for any other bucket and you do not want the Lambda to copy those clean files over as well.
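As a minimal sketch of such a filter policy (this assumes the scan result is published as a message attribute named scanResult; confirm the attribute names by inspecting a delivered notification, and the optional bucket entry narrows the filter to your staging bucket):

{
  "scanResult": ["Clean"],
  "bucket": ["<staging bucket name>"]
}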
Sample Multipart Files Copy Lambda
This variant, referenced in the ** note above, relies on Step Functions to copy multipart and extra-large files, since objects of 25 GB or more cannot be copied within the Lambda run time limit. Set stateMachineArn to the ARN of your copy state machine.

import json
import boto3
from botocore.exceptions import ClientError, ParamValidationError
from urllib import parse
import time
import math

def lambda_handler(event, context):
    try:
        print(json.dumps(event))
        record = event['Records'][0]
        message = json.loads(record['Sns']['Message'])
        # Modify destination bucket as appropriate
        sourceBucket = str(message['bucketName'])
        destinationBucket = sourceBucket
        # Modify destination key as appropriate
        sourceKey = str(message['key'])
        destinationKey = sourceKey.replace("incoming", "landing", 1)
        # Set to appropriate step function state machine ARN
        stateMachineArn = 'arn:aws:states:<region>:<account_id>:stateMachine:<stateMachineName>'
        print("The source bucket is: " + sourceBucket)
        print("The destination bucket is: " + destinationBucket)
        print("The source key is: " + sourceKey)
        print("The destination key: " + destinationKey)
        print("The State Machine ARN: " + stateMachineArn)
        s3 = boto3.resource('s3')
        copy_source = {'Bucket': sourceBucket, 'Key': sourceKey}
        source_bucket_and_key = f'{sourceBucket}/{sourceKey}'
        meta_data = s3.meta.client.head_object(Bucket=sourceBucket, Key=sourceKey)
        parts_meta = s3.meta.client.head_object(Bucket=sourceBucket, Key=sourceKey, PartNumber=1)
        partsCount = parts_meta["PartsCount"] if 'PartsCount' in parts_meta else 0
        if partsCount > 0:
            totalContentLength = meta_data["ContentLength"]
            partSize = math.floor(totalContentLength / partsCount)
            print(f'FileSize: {totalContentLength} -- PartSize: {partSize} -- PartsCount: {partsCount}')
            tagging = s3.meta.client.get_object_tagging(Bucket=sourceBucket, Key=sourceKey)
            tagging_encoded = parse.urlencode({tag['Key']: tag['Value'] for tag in tagging['TagSet']})
            if totalContentLength >= 25_000_000_000:
                try:
                    # 25GB or larger file, use step function to copy due to lambda run time limit
                    print("Using step function to copy extra large file")
                    # Initiate multi-part upload
                    kwargs = dict(
                        Bucket=destinationBucket,
                        Key=destinationKey,
                        Tagging=tagging_encoded,
                        Metadata=meta_data['Metadata'],
                        StorageClass=meta_data['StorageClass'] if 'StorageClass' in meta_data else None,
                        ServerSideEncryption=meta_data['ServerSideEncryption'] if 'ServerSideEncryption' in meta_data else None,
                        SSEKMSKeyId=meta_data['SSEKMSKeyId'] if 'SSEKMSKeyId' in meta_data else None,
                        BucketKeyEnabled=meta_data['BucketKeyEnabled'] if 'BucketKeyEnabled' in meta_data else None,
                        ObjectLockMode=meta_data['ObjectLockMode'] if 'ObjectLockLegalHoldStatus' in meta_data and 'ObjectLockRetainUntilDate' in meta_data else None,
                        ObjectLockRetainUntilDate=meta_data['ObjectLockRetainUntilDate'] if 'ObjectLockLegalHoldStatus' in meta_data and 'ObjectLockRetainUntilDate' in meta_data else None,
                        ObjectLockLegalHoldStatus=meta_data['ObjectLockLegalHoldStatus'] if 'ObjectLockLegalHoldStatus' in meta_data else None
                    )
                    multipart_upload_response = s3.meta.client.create_multipart_upload(
                        **{k: v for k, v in kwargs.items() if v is not None}
                    )
                    uploadId = multipart_upload_response['UploadId']
                    # kick off step function executions
                    sfn_client = boto3.client('stepfunctions')
                    parts = []
                    baseInput = {
                        'Bucket': destinationBucket,
                        'Key': destinationKey,
                        'CopySource': source_bucket_and_key,
                        'SourceBucket': sourceBucket,
                        'SourceKey': sourceKey,
                        'UploadId': uploadId,
                        'PartsCount': partsCount,
                        'CompleteUpload': False
                    }
                    for partNumber in range(1, partsCount + 1):
                        byteRangeStart = (partNumber - 1) * partSize
                        byteRangeEnd = byteRangeStart + partSize - 1
                        if byteRangeEnd > totalContentLength or partNumber == partsCount:
                            byteRangeEnd = totalContentLength - 1
                        parts.append({
                            'PartNumber': partNumber,
                            'CopySourceRange': f'bytes={byteRangeStart}-{byteRangeEnd}'
                        })
                        if len(parts) == 250 and partNumber != partsCount:
                            executionInput = dict(baseInput)
                            executionInput['Parts'] = parts
                            sfn_client.start_execution(
                                stateMachineArn=stateMachineArn,
                                name=f'copy_parts_{time.time() * 1000}',
                                input=json.dumps(executionInput)
                            )
                            parts.clear()
                    executionInput = dict(baseInput)
                    executionInput['Parts'] = parts
                    executionInput['CompleteUpload'] = True
                    sfn_client.start_execution(
                        stateMachineArn=stateMachineArn,
                        name=f'copy_parts_{time.time() * 1000}',
                        input=json.dumps(executionInput)
                    )
                    print("Step function execution started.")
                    return {'statusCode': 200, 'body': json.dumps('Non-infected object moved to production bucket')}
                except Exception as e:
                    print(f'Failed to execute step function to copy {source_bucket_and_key}')
                    print(e)
                    raise e
            else:
                # get the tags, then copy with tags specified
                print("Performing a multipart copy with tags")
                try:
                    s3.meta.client.copy(copy_source, destinationBucket, destinationKey, ExtraArgs={'Tagging': tagging_encoded})
                except Exception as e:
                    print(e)
                    raise e
        else:
            # copy as normal
            print("Performing a normal copy")
            try:
                s3.meta.client.copy_object(CopySource=copy_source, Bucket=destinationBucket, Key=destinationKey)
            except Exception as e:
                print(e)
                raise e
        print(f'Copied: {source_bucket_and_key} to destination: {destinationBucket}/{destinationKey}')
        try:
            s3.meta.client.delete_object(Bucket=sourceBucket, Key=sourceKey)
            print(f'Deleted: {source_bucket_and_key}')
        except Exception as e:
            print(e)
            raise e
        return {'statusCode': 200, 'body': json.dumps('Non-infected object moved to production bucket')}
    except ClientError as e:
        return {'statusCode': 400, 'body': "Unexpected error: %s" % e}
    except ParamValidationError as e:
        return {'statusCode': 400, 'body': "Parameter validation error: %s" % e}
Custom Quarantine Bucket
If you decide to use your own quarantine bucket, you can use these same two-bucket system steps. You only need to go to Configuration > Scan Settings and change the action for infected files to Keep, and change the "Clean" value in the topic subscription filter (described above) to "Infected".
If you need any help getting this setup, please Contact Us as we are happy to help.
How the Two-Bucket System Flows:
What ports do I need open for the product to function properly?
Port 443 for:
Outbound for Lambda calls
Outbound Console and Agent access to Amazon Elastic Container Registry (ECR)
Inbound access to Console for public access
Public access is not required as long as you have access via private IP
Port 80 for:
ClamAV signature updates
You can now set up local signature updates rather than reaching out over the internet. This lets you designate an Amazon S3 bucket from which the solution pulls its signature updates.
You can get a more detailed view and additional routing options on the Deployment Details page. In either the standard deployment or the VPC Endpoints deployment, local signature updates let you remove all non-AWS calls from the application run space. With VPC Endpoints you can remove almost all public calls as well.
Can I change the CIDR range, VPC or Subnets post deployment for the console and agents?
Yes. The Console Settings page gives you the option to modify the inbound Security Group rules, the VPC and Subnets, and the specs of the task (vCPU and Memory). The Agent Settings page allows you to change the VPC and Subnets the agents run in, the specs of the task (vCPU and Memory), as well as all the scaling configuration aspects.
Do you use AWS Lambdas or EC2 Instances?
Neither. Antivirus for Amazon S3 infrastructure is built around AWS Fargate containers. We wanted to be serverless like Lambda and faster and more flexible than EC2s. Fargate containers give you persistence and other benefits that Lambdas aren't prepared to give you yet. We explored Lambda and do see some advantages there, but not enough to win out over AWS Fargate containers.
We do leverage two Lambdas for subdomain registration, but not for any of the workload at this time. If you are interested in a Lambda-driven solution, please Contact Us to let us know. We are always exploring the best way to build and run our solution.
Do you support AWS Control Tower or Landing Zone?
A landing zone is a well-architected, multi-account AWS environment that's based on security and compliance best practices. AWS Control Tower automates the setup of a new landing zone using best-practices blueprints for identity, federated access, and account structure.
Antivirus for Amazon S3 is now tightly integrated with AWS Control Tower, and is designed to work within the landing zone context. Antivirus for Amazon S3 can be centrally deployed in a Security Services account while leveraging Linked Accounts to scan all other accounts. You can learn more about AWS Control Tower here.
Can I leverage Single Sign On (SSO) with your product?
Yes, you can leverage SSO with our solution. Antivirus for Amazon S3 utilizes Amazon Cognito for user management, and Amazon Cognito allows SAML integrations. Leveraging this capability, we can support various providers as part of SSO into our solution, both from the SSO dashboard and from within the application itself.
It is a fairly simple task to set up the connection, and AWS has been kind enough to document how to do it. Follow the instructions found here:
Upon customer request, we have documented the steps to get GSuite working as your SSO provider. Please leverage the document below. Other customers have used these steps (along with the Okta and Azure AD write-up) to set up additional providers as well, such as Keycloak.
GSuite Setup Instructions
Additional Actions Required
Okta identities will be auto-created within Amazon Cognito (and therefore Antivirus for Amazon S3) as simple Users rather than Admins, and they are not assigned to a particular Group. This state allows them to log in, but not to manage anything. You will need to assign the user to a group and the Admin role one time only; after this initial assignment, each subsequent login will allow for proper management.
SSO users will also stand out, as their username is created from the SSO sign-in process.
How can we scan your console/agent images for vulnerabilities?
You can access our images by downloading them locally or into your own ECR using the Container images commands on the Launch this software page of our AWS Marketplace listing.
You'll need to install and use the AWS CLI to authenticate to Amazon Elastic Container Registry and download the container images using the commands.
Below is a sample of the commands to pull our images from ECR for v7.01.001. However, you'll need to navigate to the Launch this software page on our AWS Marketplace listing to get the commands for our latest release.
aws ecr get-login-password \
--region us-east-1 | docker login \
--username AWS \
--password-stdin 564477214187.dkr.ecr.us-east-1.amazonaws.com
CONTAINER_IMAGES="564477214187.dkr.ecr.us-east-1.amazonaws.com/cloud-storage-security/console:v7.01.001,564477214187.dkr.ecr.us-east-1.amazonaws.com/cloud-storage-security/agent:v7.01.001"
for i in $(echo $CONTAINER_IMAGES | sed "s/,/ /g"); do docker pull $i; done
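Once the images are pulled, you can run them through whichever image scanner you already use. As one possible sketch, using the open-source Trivy scanner (Trivy is an assumption here, not a requirement):

# Scan each pulled image for known vulnerabilities
for i in $(echo $CONTAINER_IMAGES | sed "s/,/ /g"); do trivy image $i; done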