API Driven Scanning
You can programmatically implement antivirus scans and data classification in your application.
API driven scanning means scanning a file and receiving the verdict before the file is written anywhere. It fits workflows that need a verdict at the time of upload. We often hear from customers that they want the file scanned before it resides in Amazon S3. Others simply need an API driven verdict engine, and Amazon S3 may not be in play at all. This is often necessary in applications where users are waiting to be told that the upload succeeded and the file was accepted. APIs allow the application to hand the file off directly to the scanning agent.
Ultimately, API driven scanning provides an API endpoint verdict engine that can be used inside or outside of AWS. You can send files to scan from on-premises systems, from applications residing within AWS, or from anywhere you grant access. The API scanning agents sit behind an AWS Load Balancer. You can make the Load Balancer internet-facing or internal depending on your requirements. Learn more about configuring and managing the API endpoint on the API Agent Settings page.
Setup
Create a user for API use - How to
Set up and configure API Agent Region - How to
Integrate HTTP Post calls into your applications - explore samples below
You can use the programming language of your choice, as we only require you to use HTTP Post to submit the file for scanning. Below are very simple examples of how to submit a file and the results you will get back.
Please note that you can use Sophos, CSS Premium, or ClamAV for API driven scanning. If you are using ClamAV, you can only scan files up to 2GB in size. If you need to scan a file larger than 2GB using API driven scanning, you must select either Sophos or CSS Premium.
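The size rule above can be checked before a file is submitted. The following is a minimal sketch, not part of the product: the function names and the engine name strings are hypothetical, and reading "2GB" as 2 * 1024**3 bytes is an assumption.

```python
import os

# Assumption: "2GB" is read here as 2 * 1024**3 bytes.
CLAMAV_MAX_BYTES = 2 * 1024 ** 3

# Engine names taken from the text above; the lowercase spelling is illustrative.
KNOWN_ENGINES = {"sophos", "css premium", "clamav"}


def engine_can_scan(engine: str, size_bytes: int) -> bool:
    """Return True if a file of size_bytes fits the limit stated for the engine."""
    name = engine.lower()
    if name not in KNOWN_ENGINES:
        raise ValueError("unknown engine: " + engine)
    if name == "clamav":
        return size_bytes <= CLAMAV_MAX_BYTES
    # The text above states no size limit for Sophos or CSS Premium.
    return True


def file_can_be_scanned(engine: str, path: str) -> bool:
    """Convenience wrapper (hypothetical) that looks up the size of a local file."""
    return engine_can_scan(engine, os.path.getsize(path))
```

An application could call engine_can_scan before uploading and route oversized files to a Sophos or CSS Premium configuration instead.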
Steps to making the API call:
Make a request for the auth token
a. Specify a content type of JSON in the headers
b. Capture the username and password in JSON
c. HTTP Post the data block and headers to <baseURL> + /api/Token
headers = {'Content-type': 'application/json'}
json_foo = json.dumps({"username": "<username here>", "password": "<pw here>"})
r = session.post("https://<baseURL to load balancer or friendly URL>/api/Token", data=json_foo, headers=headers)
This will return the following response text:
{ "accessToken":"eyJraWQiOiI0Qk41QU1yVXdhWUUrZlBUZ0dhQTZWQUNXUmREMmh2dlMxWFgrUmNmTzd3PSIsImFsZyI6IlJTMjU2In0.eyJzdWIiOiIyMDYyZDQxMC1kMGE0LTRiNTItYjc2Yi03M2FiNWQ5Njk4YWQiLCJjb2duaXRvOmdyb3VwcyI6WyJVc2VycyIsIlByaW1hcnkiXSwiZW1haWxfdmVyaWZpZWQiOnRydWUsImlzcyI6Imh0dHBzOlwvXC9jb2duaXRvLWlkcC51cy1lYXN0LTEuYW1hem9uYXdzLmNvbVwvdXMtZWFzdC0xX1haNWpVNXcwWSIsImN1c3RvbTpoaWRlX3RyaWFsX21zZyI6IjAiLCJjb2duaXRvOnVzZXJuYW1lIjoiZWRjIiwiY3VzdG9tOnVzZXJfZGlzYWJsZWQiOiIwIiwiY3VzdG9tOmF3c19hY2NvdW50X2lkIjoiNzMwMDc5MDI1Njg4IiwiY3VzdG9tOmhpZGVfd2VsY29tZV9tc2ciOiIwIiwiYXVkIjoiNXM2M29raWtodGJxdDR2cTFtMmV1bDk4Z2kiLCJldmVudF9pZCI6IjkwOTJlZjc4LTMzZTMtNDVhNS1hZTlhLTVmN2Y0NGY2NDZmNiIsInRva2VuX3VzZSI6ImlkIiwiYXV0aF90aW1lIjoxNjI1MjgyODU5LCJleHAiOjE2MjUyODY0NTksImlhdCI6MTYyNTI4Mjg1OSwiZW1haWwiOiJzdXBwb3J0QGNsb3Vkc3RvcmFnZXNlYy5jb20ifQ.QehudPO4zTphRq9ch3p6IopzRz7m72D5LquVgnzw8iHfDBbgZLQiAM7uWtkKGQw5fYV5dsB_U0fbcrW6F3ov_U4LcpvLgP88NXk7MR9PprzIQQjvnHRU9z6wy6wavgrK-VdPiqNF7dsKaAJGW6vVZCzFzVIEKaZCThHpqVYbKdiSfVm08nvWsWEM4fxAgCFY8sAr2pNxY5VHydGc_iP4On3H7MSFh1n7ee-lH88Ao8PLWMWQBYlbR6ZFLin7KKi6lhDOE-b4cAGDgPtl4acdw6ha_AWJPxozJILQkSAesl-BbxWquphTJ-oD_jRl7DvJBSbBw3DPNzXcO4w4SMnnLA", "tokenType":"Bearer", "expiresIn":3600 }
Save the access token for the next call. It is valid for 1 hour, so you can reuse it within that window.
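Reusing the token for its one hour lifetime can be sketched with a small cache. This is an illustration under assumptions, not the product's own client: TokenCache, fetch_token, and the 60 second safety margin are hypothetical, and fetch_token is expected to return the parsed /api/Token response shown above.

```python
import time


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        # fetch_token: hypothetical callable returning the parsed /api/Token
        # response, a dict with "accessToken" and "expiresIn" (seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        # Refresh slightly early so a token never expires mid-request.
        self._safety_margin = safety_margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, fetching a new one when it is missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            response = self._fetch_token()
            self._token = response["accessToken"]
            self._expires_at = now + response["expiresIn"] - self._safety_margin
        return self._token
```

In practice fetch_token would wrap the HTTP Post to /api/Token, and each scan request would call cache.get() to build its Authorization header.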
Send the file for scanning
a. Specify the headers. At a minimum, the accessToken must be added as a Bearer Authorization header
b. Read the file as your language dictates, but send it as a multipart form upload
c. HTTP Post the file and headers to <baseURL> + /api/Scan
headers = {"Prefer": "respond-async", "Content-Type": form.content_type, 'Authorization': 'Bearer ' + accessToken}
r = session.post("https://<baseURL to load balancer or friendly URL>/api/Scan", headers=headers, data=form, timeout=4000)
This will return the following response text:
{ "dateScanned": "2021-07-02T07:04:18.8896831Z", "detectedInfections": [], "errorMessage": null, "result": "Clean" }
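An application typically turns this response into an accept or reject decision. The sketch below is one conservative way to do that, assuming the four fields shown above; the function name is hypothetical, and because "Clean" is the only result value shown in this document, anything else is treated as not accepted.

```python
import json


def is_file_accepted(response_text: str) -> bool:
    """Return True only when the scan response reports a clean file."""
    verdict = json.loads(response_text)
    # A non-empty errorMessage means the scan did not complete normally.
    if verdict.get("errorMessage"):
        return False
    # Any listed infection is grounds for rejection.
    if verdict.get("detectedInfections"):
        return False
    # "Clean" is the only result value documented above; other values
    # (not listed in this document) are rejected to stay on the safe side.
    return verdict.get("result") == "Clean"
```

The upload handler can then tell the waiting user whether the file was accepted, based on the return value.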
Available APIs
Antivirus
Currently, there are 4 available antivirus scanning API functions:
Token
Scan
Scan/Existing
Scan/URL
The Scan option has two uses: "scan and return" and "scan and upload". Read on to learn more about these APIs. For more technical docs on the APIs, please check out our API Swagger Docs.
Data Classification
There are also 3 data classification functions, which you can use in a similar way to the antivirus functions:
/api/Classify
/api/Classify/Existing
/api/Classify/URL
The Classify option has two uses: "classify and return" and "classify and upload".
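Since every function is reached by appending a path to the base URL, a small helper can keep the URLs consistent. This is a sketch under assumptions: endpoint_url is a hypothetical name, and placing Scan/Existing and Scan/URL under /api/ is inferred from the /api/Token, /api/Scan, and /api/Classify paths rather than stated here. Check the API Swagger Docs for the request fields each function expects.

```python
# Function names as listed in this document.
API_FUNCTIONS = {
    "Token",
    "Scan",
    "Scan/Existing",   # /api/ prefix assumed by analogy with /api/Scan
    "Scan/URL",        # /api/ prefix assumed by analogy with /api/Scan
    "Classify",
    "Classify/Existing",
    "Classify/URL",
}


def endpoint_url(base_url: str, function: str) -> str:
    """Build the full URL for an API function from the load balancer base URL."""
    if function not in API_FUNCTIONS:
        raise ValueError("unknown API function: " + function)
    # Strip a trailing slash so the result never contains "//api".
    return base_url.rstrip("/") + "/api/" + function
```

For example, endpoint_url(baseURL, "Classify") yields the URL to post to for a classification request, in the same way the samples build the /api/Scan URL.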
Code Samples
This is a simple command line example with the username, password, base URL, and the file to scan passed in on the command line.
python ./scanWithAPI.py <username> <password> <base-URL> <file-to-scan>
import json
import requests
from requests_toolbelt.multipart import encoder
import sys
# baseURL is the value found on the API Agent Settings page as the Default DNS
# this can also be a friendly URL which you've mapped within your DNS
baseURL = sys.argv[3]
# /api/Token is the API to retrieve the auth token
# the auth token is valid for 1 hour, so you can re-use if your application can manage it
getTokenURL = baseURL + '/api/Token'
# /api/Scan is the API to pass the file to for scanning
scanFileURL = baseURL + '/api/Scan'
# must specify the content type as JSON when retrieving the auth token
headers = {'Content-type': 'application/json'}
# as part of the /api/Token HTTP Post you must pass the username and password for the
# user created and configured inside of the Antivirus for Amazon S3 console
# the data block must be passed in JSON format
uname = sys.argv[1]
pw = sys.argv[2]
foo = {"username": uname, "password": pw}
json_foo = json.dumps(foo)
# make the HTTP post now passing in the username/pw data block and headers
session = requests.Session()
r = session.post(getTokenURL, data=json_foo, headers=headers)
# pull the auth token from the response to use in the scan call below
# valid for 1 hour if you want to re-use
jsonResponse = json.loads(r.text)
accessToken = jsonResponse["accessToken"]
# read file in from wherever it is coming from: form upload, in file system, etc
with open(sys.argv[4], 'rb') as f:
    form = encoder.MultipartEncoder({
        "documents": ("my_file", f, "application/octet-stream"),
        "composite": "NONE",
    })
    # setup headers for /api/Scan HTTP Post.
    # the only thing you really need is the 'Authorization': 'Bearer ' with the auth token
    # assigned to that. Depending on how you are reading the file or the language, you may
    # need to pass more values in the header as seen below
    headers = {"Prefer": "respond-async", "Content-Type": form.content_type, 'Authorization': 'Bearer ' + accessToken}
    # post while the file is still open: the encoder streams its contents during the request
    r = session.post(scanFileURL, headers=headers, data=form, timeout=4000)
# grab the text from the response and check however you will
# below converts the response to JSON for easy formatting and handling
parsed = json.loads(r.text)
print(json.dumps(parsed, indent=4, sort_keys=True))
# do the next portion of your workflow based on what the scan result is
if parsed['result'] == "Clean":
    print("file was clean")
    # do more work in the workflow here
session.close()
Scan Results - JSON formatted
{
"dateScanned": "2021-07-02T07:04:18.8896831Z",
"detectedInfections": [],
"errorMessage": null,
"result": "Clean"
}