Amazon S3

Integrate with Amazon S3 to publish data from Cascade to your Amazon S3 buckets. Any team member with access to the workspace will be able to publish data through the integration.

Workspace: Configure your Integration

In the top-left corner, confirm the workspace you are about to integrate with Amazon S3. Any team members with access to this workspace will gain access to the integration.

  • From the left-nav of your workspace, select the "Integrations" tab

  • Select the "New Integration" button in the top-right corner

  • Select "Amazon S3" to create a new (or additional) integration with Amazon S3

  • In the provided field, name your integration

    • Note: Any name may be provided. However, it may be simplest to label it with the Amazon S3 account name.

  • Next, fill out your AWS Access Key ID and AWS Secret Access Key

  • Click "Test & Save". If the connection is successful, the status should update to "Connected".
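
Because the integration only publishes, the access key entered above can usually be scoped to write access on a single bucket. The sketch below builds such an IAM policy document. It is an illustration under assumptions: the helper name `publish_only_policy`, the bucket `my-team-exports`, and the prefix `reports/` are hypothetical, and this page does not state which permissions Cascade's connection test requires, so additional actions may be needed in practice.

```python
import json


def publish_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy that allows writing objects under one bucket/prefix.

    Assumption: s3:PutObject is enough for publishing. Cascade's actual
    requirements are not documented here and may include more actions.
    """
    # Object-level actions are granted on object ARNs: arn:aws:s3:::<bucket>/<key pattern>
    resource = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CascadePublish",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [resource],
            }
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
policy = publish_only_policy("my-team-exports", "reports/")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the IAM user that owns the access key. Limiting the resource to one prefix keeps a workspace-wide integration from writing elsewhere in the account.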

Canvas: Publish to Integration

At the moment, Cascade only provides publish access to Amazon S3. Open a new or existing workflow in the now-integrated workspace, and you'll be able to add the "Publish to Amazon S3" tool to your canvas.

  • From the tool side panel, select "Publish to Integration" to expand it

  • Drag "Publish to Amazon S3" from the tool side panel to your canvas

    • Connect it to the upstream dataset you wish to publish to Amazon S3

  • Select the "Publish to Amazon S3" tool to configure it. Options include:

    • Select Integration

      • Select the integration, named in an earlier step, for this Amazon S3 account

    • Bucket

      • Enter the name of the bucket in your S3 account that you would like to publish your objects to

    • Key

      • Enter the key (the file path) of the object you would like to publish to within your S3 bucket

        • This can be an existing key that you are overwriting or a new key that will be created

    • Output Options

      • Select "Overwrite" to publish over any existing data in Amazon S3

      • Select "Disable Publishing" to prevent the workflow from publishing new data to the Amazon S3 bucket

        • Use this option during workflow building and testing so non-finalized datasets are not pushed to S3

    • Publish File Type

      • Select the file type you would like to publish to S3:

        • CSV

        • Parquet

        • Excel

        • Hyper
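
Before running the workflow, it can help to sanity-check the Bucket, Key, and Publish File Type values described above. The following is a minimal sketch, not part of Cascade: `PublishSettings`, `check`, `s3_uri`, and the extension mapping are hypothetical names chosen for illustration. The bucket pattern covers only the basic AWS naming rules for general-purpose buckets (3 to 63 characters, lowercase letters, digits, dots, and hyphens), and whether Cascade expects the key to carry a file extension is an assumption.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Basic AWS bucket naming rules: 3-63 chars, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")

# Hypothetical mapping from the tool's file types to conventional extensions.
EXTENSIONS = {"CSV": ".csv", "Parquet": ".parquet", "Excel": ".xlsx", "Hyper": ".hyper"}


@dataclass
class PublishSettings:
    bucket: str
    key: str
    file_type: str


def check(settings: PublishSettings) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not _BUCKET_RE.fullmatch(settings.bucket) or ".." in settings.bucket:
        problems.append("bucket name does not follow S3 naming rules")
    # S3 object keys may be at most 1024 bytes when UTF-8 encoded.
    if not settings.key or len(settings.key.encode("utf-8")) > 1024:
        problems.append("key must be between 1 and 1024 bytes")
    if settings.key.startswith("/"):
        problems.append("key starts with '/', which creates an empty-named top-level folder")
    ext = EXTENSIONS.get(settings.file_type)
    if ext is None:
        problems.append(f"unknown file type: {settings.file_type}")
    elif not settings.key.lower().endswith(ext):
        problems.append(f"key does not end with {ext}")
    return problems


def s3_uri(settings: PublishSettings) -> str:
    """Format the destination as an s3:// URI."""
    return f"s3://{settings.bucket}/{settings.key}"


# Hypothetical values, for illustration only.
settings = PublishSettings("my-team-exports", "reports/2024/sales.csv", "CSV")
print(check(settings), s3_uri(settings))
```

A check like this is most useful while "Disable Publishing" is still selected, since typos in the bucket or key otherwise surface only when the workflow tries to write.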

Don't get blocked by a data integration. If you need help configuring an integration or would like to recommend one, join the Cascade Slack Community and tell us what you're after :)
