API Driven Scanning

You can programmatically implement antivirus scans and data classification in your application.

API driven scanning means scanning a file and receiving the verdict before the file is written anywhere. It fits workflows that demand a verdict at the time of upload. Customers often tell us that they want a file scanned before it resides in Amazon S3, or that they just need an API driven verdict engine and Amazon S3 may not be in play at all. This is often necessary in applications where users are waiting to be told that the upload was successful and the file accepted. APIs allow the application to hand the file off directly to the scanning agent.

Ultimately, API driven scanning provides an API endpoint verdict engine that can be used inside or outside of AWS. You can send files to scan from on-premises systems, from applications residing within AWS, or from anywhere you grant access. The API scanning agents sit behind an AWS Load Balancer. You can make the Load Balancer internet-facing or internal depending on your requirements. Learn more about configuring and managing the API endpoint on the API Agent Settings page.

Setup

  1. Create a user for API use - How to

  2. Set up and configure API Agent Region - How to

  3. Integrate HTTP Post calls into your applications - explore samples below

You can use the programming language of your choice as we only require you to leverage HTTP Post to submit the file for scanning. Below are very simple examples of how to submit a file and the results you will see back.

Please note that you can use Sophos, CSS Premium, or ClamAV for API driven scanning. ClamAV can only scan files up to 2 GB in size. To scan a file larger than 2 GB using API driven scanning, you must select either Sophos or CSS Premium.
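The size limit above lends itself to a client-side check before a file is submitted. The following is a minimal sketch, not part of the product: the function name `engine_can_scan` is hypothetical, and reading "2GB" as 2 GiB (2 * 1024^3 bytes) is an assumption.

```python
# Assumption: "2GB" is interpreted as 2 GiB; adjust if the limit is decimal.
CLAMAV_MAX_BYTES = 2 * 1024**3

# Engine names as listed in the note above.
SUPPORTED_ENGINES = ("Sophos", "CSS Premium", "ClamAV")


def engine_can_scan(engine, size_bytes):
    """Return True if the chosen engine can scan a file of this size via the API."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    if engine == "ClamAV":
        return size_bytes <= CLAMAV_MAX_BYTES
    return True
```

A caller could use `os.path.getsize(path)` to obtain `size_bytes` and fall back to Sophos or CSS Premium when the check fails.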

Steps to making the API call:

  1. Make request for Auth Token

    a. Specify content type of JSON in headers

    b. Capture username and password in JSON

    c. HTTP Post the data block and headers to <baseURL> + /api/Token

    import requests
    session = requests.Session()
    headers = {'Content-type': 'application/json'}
    json_foo = {"username": "<username here>", "password": "<pw here>"}
    # json= serializes the dict into a JSON request body
    r = session.post("https://<baseURL to load balancer or friendly URL>/api/Token", json=json_foo, headers=headers)

    This will return the following response text:

    {
        "accessToken":"eyJraWQiOiI0Qk41QU1yVXdhWUUrZlBUZ0dhQTZWQUNXUmREMmh2dlMxWFgrUmNmTzd3PSIsImFsZyI6IlJTMjU2In0.eyJzdWIiOiIyMDYyZDQxMC1kMGE0LTRiNTItYjc2Yi03M2FiNWQ5Njk4YWQiLCJjb2duaXRvOmdyb3VwcyI6WyJVc2VycyIsIlByaW1hcnkiXSwiZW1haWxfdmVyaWZpZWQiOnRydWUsImlzcyI6Imh0dHBzOlwvXC9jb2duaXRvLWlkcC51cy1lYXN0LTEuYW1hem9uYXdzLmNvbVwvdXMtZWFzdC0xX1haNWpVNXcwWSIsImN1c3RvbTpoaWRlX3RyaWFsX21zZyI6IjAiLCJjb2duaXRvOnVzZXJuYW1lIjoiZWRjIiwiY3VzdG9tOnVzZXJfZGlzYWJsZWQiOiIwIiwiY3VzdG9tOmF3c19hY2NvdW50X2lkIjoiNzMwMDc5MDI1Njg4IiwiY3VzdG9tOmhpZGVfd2VsY29tZV9tc2ciOiIwIiwiYXVkIjoiNXM2M29raWtodGJxdDR2cTFtMmV1bDk4Z2kiLCJldmVudF9pZCI6IjkwOTJlZjc4LTMzZTMtNDVhNS1hZTlhLTVmN2Y0NGY2NDZmNiIsInRva2VuX3VzZSI6ImlkIiwiYXV0aF90aW1lIjoxNjI1MjgyODU5LCJleHAiOjE2MjUyODY0NTksImlhdCI6MTYyNTI4Mjg1OSwiZW1haWwiOiJzdXBwb3J0QGNsb3Vkc3RvcmFnZXNlYy5jb20ifQ.QehudPO4zTphRq9ch3p6IopzRz7m72D5LquVgnzw8iHfDBbgZLQiAM7uWtkKGQw5fYV5dsB_U0fbcrW6F3ov_U4LcpvLgP88NXk7MR9PprzIQQjvnHRU9z6wy6wavgrK-VdPiqNF7dsKaAJGW6vVZCzFzVIEKaZCThHpqVYbKdiSfVm08nvWsWEM4fxAgCFY8sAr2pNxY5VHydGc_iP4On3H7MSFh1n7ee-lH88Ao8PLWMWQBYlbR6ZFLin7KKi6lhDOE-b4cAGDgPtl4acdw6ha_AWJPxozJILQkSAesl-BbxWquphTJ-oD_jRl7DvJBSbBw3DPNzXcO4w4SMnnLA",
        "tokenType":"Bearer",
        "expiresIn":3600
    }

    Save the access token for the next call. It is valid for 1 hour, so you can reuse it within that window.

  2. Send the file for scanning

    a. Specify the headers. At a minimum, the accessToken must be added

    b. Get the file as your language dictates, but it should be a multipart form upload

    c. HTTP Post the file and headers to <baseURL> + /api/Scan

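    # form: multipart form object holding the file (for example, a requests_toolbelt MultipartEncoder)
    accessToken = r.json()["accessToken"]  # token from the api/Token response
    headers = {"Prefer": "respond-async", "Content-Type": form.content_type, 'Authorization': 'Bearer ' + accessToken}
    r = session.post("https://<baseURL to load balancer or friendly URL>/api/Scan", headers=headers, data=form, timeout=4000)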

    This will return the following response text:

    {
        "dateScanned": "2021-07-02T07:04:18.8896831Z",
        "detectedInfections": [],
        "errorMessage": null,
        "result": "Clean"
    }
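The two steps above can also be sketched end to end using only the Python standard library instead of a requests session. This is a minimal sketch under stated assumptions, not a reference client: the helper names, the example host, and the multipart field name `file` are hypothetical, so check the API Swagger Docs for the exact form field the endpoint expects. The requests are built but not sent.

```python
import json
import urllib.request
import uuid


def build_token_request(base_url, username, password):
    """Step 1: POST the credentials as JSON to <baseURL>/api/Token."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/Token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def encode_multipart(file_name, file_bytes, field_name="file"):
    """Encode one file as multipart/form-data; returns (body, content_type).

    The field name "file" is an assumption, not taken from the API docs.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def build_scan_request(base_url, access_token, file_name, file_bytes):
    """Step 2: POST the file to <baseURL>/api/Scan with the bearer token."""
    body, content_type = encode_multipart(file_name, file_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/Scan",
        data=body,
        headers={
            "Prefer": "respond-async",
            "Content-Type": content_type,
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


def is_clean(response_text):
    """True only if result is "Clean" and no infections were reported."""
    verdict = json.loads(response_text)
    return verdict["result"] == "Clean" and not verdict["detectedInfections"]


# Hypothetical host and credentials; nothing is sent over the network here.
token_request = build_token_request("https://scan.example.com", "api-user", "secret")
scan_request = build_scan_request(
    "https://scan.example.com", "<accessToken>", "report.pdf", b"%PDF-1.7"
)
```

To send either request, pass it to `urllib.request.urlopen` and decode the response body. The `accessToken` field of the first response becomes the `access_token` argument of the second call.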

Available APIs

Antivirus

Currently, there are 4 available antivirus scanning API functions:

  1. Token

  2. Scan

  3. Scan/Existing

  4. Scan/URL

The Scan option has two uses: "scan and return" and "scan and upload". Read on to learn more about these APIs. For more technical docs on the APIs, please check out our API Swagger Docs.

api/Token

To execute the api/Token API, pass the API user's username and password as JSON in the request body.

The response text has the same shape as the token response shown in the steps above: accessToken, tokenType, and expiresIn.

Save the access token for the next call. It is valid for 1 hour if you choose to reuse it.
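Because the token stays valid for 1 hour, a client can cache it rather than request a new one before every scan. The sketch below is one possible approach, not something the API provides: `TokenCache`, its `fetch_token` callable, and the 60-second safety margin are all hypothetical. `fetch_token` is assumed to return the `accessToken` and `expiresIn` values from an api/Token response.

```python
import time


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        # fetch_token: callable returning (access_token, expires_in_seconds)
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = safety_margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one if missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch_token()
            self._token = token
            # Refresh a little early so a token never expires mid-request.
            self._expires_at = now + expires_in - self._margin
        return self._token


# Stub fetcher for illustration; a real one would call api/Token.
cache = TokenCache(lambda: ("example-token", 3600))
first = cache.get()
second = cache.get()  # served from the cache, no second fetch
```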

api/Scan

Scan and Return only

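This returns response text in the same shape as the scan result shown in the steps above: dateScanned, detectedInfections, errorMessage, and result.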

Scan and Upload

To scan and upload (if clean), add the uploadTo attribute to the form data. This is as simple as augmenting the sample code to include one more field.
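As a rough illustration of that one extra field, the sketch below builds the text part of the form. The helper name `upload_target`, the bucket and key values, and the `file` field in the commented call are assumptions; only the `uploadTo` field name and its bucketname/filename.type shape come from this page.

```python
def upload_target(bucket, object_key):
    """Format the uploadTo value as bucketname/path/to/file.type."""
    if not bucket or not object_key:
        raise ValueError("both bucket and object key are required")
    return f"{bucket}/{object_key.lstrip('/')}"


# Text fields sent alongside the file part of the multipart form.
form_fields = {"uploadTo": upload_target("my-bucket", "incoming/report.pdf")}

# With the third-party requests library this could be sent as, for example:
#   requests.post(base_url + "/api/Scan", headers=auth_headers,
#                 data=form_fields, files={"file": open(path, "rb")})
# (base_url, auth_headers, path, and the "file" field name are hypothetical.)
```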

To test this, add the field to the sample code in the Code Samples section and run the script.

Uploading Metadata and Tags

We also provide the option to attach metadata and additional tags to the uploaded object. The field to use varies depending on the type of upload.

  • Multipart-form uploads: include tag and metadata in the form-data

    • Metadata form field name is metadata

    • Tag form field name is tags

  • Binary file uploads: include the tag and metadata in the HTTPS request headers

    • Metadata should be indicated by x-file-metadata header

    • Tags should be indicated by x-file-tags header

Insert tags and metadata in JSON key:value format:

{"tags":"Production"}

{"metadata":"my-metadata"}

The chosen filename must be included in the uploadTo value for the object to keep that name in the S3 bucket (e.g., uploadTo: bucketname/filename.type). If the full path of the file is not given, the file is treated as a binary data stream and uploaded as a tmp file.
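The tag and metadata fields described above differ only in name between the two upload types, which a small helper can capture. This is a sketch under a loudly stated assumption: the page only says the values use a JSON key:value format, so serializing each value with `json.dumps` is a guess, and the function name and example values are hypothetical.

```python
import json


def tag_and_metadata_fields(tags, metadata, binary_upload=False):
    """Return form fields (multipart) or headers (binary) for tags and metadata.

    Assumption: each value is sent as a JSON-encoded string.
    """
    if binary_upload:
        tag_name, metadata_name = "x-file-tags", "x-file-metadata"
    else:
        tag_name, metadata_name = "tags", "metadata"
    return {tag_name: json.dumps(tags), metadata_name: json.dumps(metadata)}


# Multipart upload: merge these into the form data next to uploadTo.
form_fields = tag_and_metadata_fields({"env": "Production"}, {"owner": "team-a"})

# Binary upload: merge these into the request headers instead.
headers = tag_and_metadata_fields(
    {"env": "Production"}, {"owner": "team-a"}, binary_upload=True
)
```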

api/Scan/Existing

To scan a file that already exists within an S3 bucket, use the api/Scan/Existing API call. This call requires you to pass a block of JSON. container and objectPath are required. versionID is optional and lets you specify a particular version to scan; if you omit it, the latest version is scanned. uploadedBy is also optional and lets you specify who is doing the scanning.

To execute this call, capture the bucket name and the full path to the file.
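The JSON block for this call could be assembled as in the following sketch. The field names container, objectPath, versionID, and uploadedBy come from the description above; the function name, the example values, and the reading of container as the bucket name are assumptions.

```python
import json


def existing_object_payload(container, object_path, version_id=None, uploaded_by=None):
    """Build the JSON body for api/Scan/Existing, omitting unset optional fields."""
    if not container or not object_path:
        raise ValueError("container and objectPath are required")
    payload = {"container": container, "objectPath": object_path}
    if version_id is not None:
        payload["versionID"] = version_id
    if uploaded_by is not None:
        payload["uploadedBy"] = uploaded_by
    return json.dumps(payload)


# Hypothetical bucket and key; without versionID the latest version is scanned.
body = existing_object_payload("my-bucket", "incoming/report.pdf", uploaded_by="api-user")
```

The resulting string would be posted with a JSON content type and the same bearer token header used for api/Scan.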

api/Scan/URL

There will be times when you want to scan a file that is reachable by URL. Presigned URLs for Amazon S3 objects are a good use case, as is anything your applications can access at a fully qualified URL. The api/Scan/URL API call lets you do this. You simply need to add the URL field to the form data.

To execute this call, capture the fully qualified URL of the file.
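A client might validate the URL before adding it to the form data, as sketched below. The exact form field name is not stated on this page, so the default `url` is an assumption to verify against the API Swagger Docs; the function name and example URL are also hypothetical.

```python
from urllib.parse import urlparse


def url_scan_fields(file_url, field_name="url"):
    """Return the form field for api/Scan/URL after a basic sanity check.

    Assumption: the field is named "url"; pass field_name to override.
    """
    parsed = urlparse(file_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("expected a fully qualified http(s) URL")
    return {field_name: file_url}


# Hypothetical presigned URL for an S3 object.
form_fields = url_scan_fields(
    "https://my-bucket.s3.amazonaws.com/report.pdf?X-Amz-Signature=abc"
)
```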

Data Classification

Currently, there are also 3 available data classification functions, which you can use in the same way as the antivirus functions:

  • /api/Classify

  • /api/Classify/Existing

  • /api/Classify/URL

The Classify option has two uses: "classify and return" and "classify and upload".
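Since the classification endpoints mirror the antivirus ones, client code can derive the path from two choices: the operation and where the file comes from. The sketch below shows that mapping; the function name and the source labels `file`, `existing`, and `url` are hypothetical, while the paths are the ones listed on this page.

```python
# Path suffix for each way of supplying the file.
_SOURCE_SUFFIX = {"file": "", "existing": "/Existing", "url": "/URL"}


def endpoint_path(operation, source="file"):
    """Return the API path for an operation ("Scan" or "Classify") and source."""
    if operation not in ("Scan", "Classify"):
        raise ValueError(f"unknown operation: {operation}")
    if source not in _SOURCE_SUFFIX:
        raise ValueError(f"unknown source: {source}")
    return f"/api/{operation}{_SOURCE_SUFFIX[source]}"


paths = [endpoint_path("Classify", s) for s in ("file", "existing", "url")]
```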

/api/Classify

Classify and Return

Classify and Upload

To classify and upload (if clean), add the uploadTo attribute to the form data. This is as simple as augmenting the sample code to include one more field.

To test this, add the field to the sample code in the Code Samples section and run the script.

The chosen filename must be included in the uploadTo value for the object to keep that name in the S3 bucket (e.g., uploadTo: bucketname/filename.type). If the full path of the file is not given, the file is treated as a binary data stream and uploaded as a tmp file.

/api/Classify/Existing

To classify a file that already exists within an S3 bucket, use the api/Classify/Existing API call. This call requires you to pass a block of JSON. container and objectPath are required. versionID is optional and lets you specify a particular version to classify; if you omit it, the latest version is used. uploadedBy is also optional and lets you specify who is submitting the file.

/api/Classify/URL

There will be times when you want to classify a file that is reachable by URL. Presigned URLs for Amazon S3 objects are a good use case, as is anything your applications can access at a fully qualified URL. The api/Classify/URL API call lets you do this. You simply need to add the URL field to the form data.

To execute this call, capture the fully qualified URL of the file.

Code Samples

You can download and use our Postman collection and environment JSON files below to assist in your testing of our API Scanning functionality:

The code samples below are simple, but they can give you a good start. If you need additional code samples, you can generate them in the programming language of your choice using Postman.

This is a simple command line example with the base URL and the file to scan passed in on the command line.

Scan Results - JSON formatted
