Validate Schema

Validate that a table contains specific columns.

Last updated 2 years ago

The Validate Schema tool lets a workflow verify that a source table has a particular shape (or schema) before the data is passed along to later steps. This is especially helpful when data is pulled dynamically from an external source and may change shape in the future.

Sources

The Validate Schema tool accepts two input sources:

  • Reference Schema: The table that will serve as the starting point for defining how future tables should be shaped.

  • For Validation: The table that is being compared against the expected schema.

Unlike most Cascade tools, Validate Schema does not include the normal prompt options. Instead, the tool lets users select or deselect columns for each of the two selections below:

  • Columns to Confirm: All columns are selected by default. Deselect a column to exclude it from the tool's validation checks.

  • Allow Nulls: All columns are selected by default. Deselect a column if it should not contain null values in the table for validation.
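The two selections above can be pictured as two simple checks. The following is a minimal sketch in plain Python, not Cascade's actual implementation: the function name `find_schema_problems`, the list-of-dicts table representation, and the sample columns are all assumptions made for illustration. It covers only column presence and null values; any format or data type checks the tool performs are left out.

```python
def find_schema_problems(rows, columns_to_confirm, allow_nulls):
    """Return a list of problems; an empty list means the table passes.

    rows: the table for validation, as a list of dicts (one dict per record).
    columns_to_confirm: column names that must be present (hypothetical input).
    allow_nulls: the subset of those columns in which None is acceptable.
    """
    problems = []

    # Collect every column name that appears in the table for validation.
    present = set()
    for row in rows:
        present.update(row.keys())

    for column in columns_to_confirm:
        if column not in present:
            problems.append(f"missing column: {column}")
            continue
        # Only columns that are NOT marked "Allow Nulls" are checked for nulls.
        if column not in allow_nulls and any(row.get(column) is None for row in rows):
            problems.append(f"null values in column: {column}")
    return problems


# Hypothetical sample data: "amount" has a null, and "date" is absent.
transactions = [
    {"id": 1, "amount": 9.99, "note": None},
    {"id": 2, "amount": None, "note": "refund"},
]

problems = find_schema_problems(
    transactions,
    columns_to_confirm=["id", "amount", "note", "date"],
    allow_nulls={"note"},
)
print(problems)
```

Here `note` contains a null but raises no problem because it is in `allow_nulls`, while `amount` and `date` are both reported.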

Configuration

The Validate Schema tool is configured differently from other tools: it does not contain the usual right-hand navigation bar with prompts to fill out.

In the tool modal, columns from the reference schema source can be selected to be part of the expected schema. By default, every column from the reference schema source that also appears in the source for validation is selected for validation. Each column can also be marked as allowing or disallowing null values.
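The default selection described above amounts to an intersection of the two column lists. As a rough sketch (the function name `default_selection` and the column names are hypothetical, and keeping the reference table's column order is an assumption, not documented behavior):

```python
def default_selection(reference_columns, validation_columns):
    """Columns selected by default: those in the reference schema
    that also appear in the source for validation."""
    validation = set(validation_columns)
    # Keep the reference table's column order (an assumption of this sketch).
    return [column for column in reference_columns if column in validation]


# Hypothetical column lists for the two input sources.
reference_columns = ["id", "date", "amount", "region"]
validation_columns = ["amount", "id", "date", "notes"]

selected = default_selection(reference_columns, validation_columns)

# Per the Allow Nulls default, every selected column starts out allowing nulls.
allow_nulls = set(selected)

print(selected)
```

In this example `region` is left out of the default selection because it does not appear in the source for validation, and `notes` is ignored because it is not part of the reference schema.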

Example

The most common use case of Validate Schema involves a template table and a table to validate. The template table will have the desired set of columns and formats needed for the downstream cleaning and transformation tools in the workflow canvas.

In the example shown below, the "Transaction Template" table is connected to the Validate Schema tool as the Reference Schema and the "Transactions" table is connected as the table for Validation.

When both tables are first connected to the Validate Schema tool, the tool reports errors because the validation has failed. This happens when the columns in the table for validation do not match the schema and format of the reference table.

Output

If the table for validation fails to match the criteria specified in the expected schema, the Validate Schema tool will produce an error in your workflow. Otherwise, the table for validation will be passed along as the output.
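The output behavior is all-or-nothing: either the step fails, or the table passes through unchanged. A minimal sketch of that contract follows, assuming a hypothetical `SchemaValidationError` exception and a list-of-dicts table; it checks column presence only and is not how Cascade itself is implemented.

```python
class SchemaValidationError(Exception):
    """Hypothetical error type standing in for a failed Validate Schema step."""


def validate_or_fail(rows, required_columns):
    """Raise if a required column is missing; otherwise return the table as-is."""
    present = set()
    for row in rows:
        present.update(row)
    missing = [column for column in required_columns if column not in present]
    if missing:
        raise SchemaValidationError(f"missing columns: {missing}")
    # Validation does not modify the data; the same table is handed downstream.
    return rows


transactions = [{"id": 1, "amount": 9.99}]

# Passing case: the table for validation becomes the output, unchanged.
output = validate_or_fail(transactions, ["id", "amount"])

# Failing case: the step raises instead of producing any output.
try:
    validate_or_fail(transactions, ["id", "amount", "date"])
    failure = None
except SchemaValidationError as exc:
    failure = str(exc)

print(failure)
```

Raising rather than returning a partial table mirrors the description above: later steps never receive data that failed validation.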

To clean up the table for validation, we can add an Edit Columns tool step that reformats the data. As seen below, once the necessary reformatting changes are made in Edit Columns, we can connect the updated table for validation to the Validate Schema tool, and the tool passes without any errors.

Validation checks fail
After reformatting columns using an Edit Columns tool, the validation checks pass