Validating Against External Sources

OneSchema’s validation API allows you to use your own API endpoint to create a column validator.

Validation Webhook

Similar to our data export webhooks feature, you will only need a URL endpoint and a template to get started using this feature. Once your validation webhook is configured, every time your template is mapped to a sheet, your validation webhook will be run alongside our other validations in our validation library.

OneSchema's validation API operates by sending POST requests with JSON values to the supplied endpoints. The User-Agent HTTP header is set to OneSchema/1.0.

For row hooks, each request will have a customizable batch size which defaults to 1000 values. If a list has more rows than the batch size, multiple requests for a row hook will be sent. You have to select which columns should be sent to the validation hook. Row hooks are triggered on initial validation as well as after subsequent updates to any cell in the specified columns.

For column hooks, only one request will be sent with ALL of the rows on the spreadsheet even if the spreadsheet has more than 1000 rows. Column hooks are triggered once on initial validation and each time a cell in one of the configured columns changes.

📘

Please contact your OneSchema representative for more info on setting up column hooks.

Attaching a Validation Webhook to a Template

Validation Webhooks are configured on an individual template's configuration page. More info here.

Validation JSON request schema

ParameterTypeDescription
rows*Object[]Array of rows to validate.
rows[].row_id*IntegerId of a single row. Use this value in the response data to specify errors.
rows[].values*ObjectA map from template column names to the value in that column for a single row. All values are represented as strings. Only columns that are part of the hook's configuration are included.
columns*Object[]Ordered array of mapped columns. Used for referencing custom columns.
template_key*StringAn optional key specified on the template that can be used instead of template_id to determine what template is being validated.
hook_id*IntegerA unique identifier for this validation hook.
hook_name*StringThe name of the hook, as configured on the validation hook UI.
workspace_id*IntegerThe id of Workspace containing the Sheet containing the rows being validated.
sheet_id*IntegerThe id of the Sheet containing the rows being validated.
sheet_metadata*ObjectA JSON Object containing any custom metadata associated with the Sheet at time of import. Includes two additional fields added automatically by OneSchema: original_file_name containing the name of the original uploaded file, and original_sheet_name containing the name of the sheet in the imported Excel file, if part of a multi-sheet Excel file upload.
embed_user_jwt*StringIf you are embedding OneSchema within your own site, this will contain the JWT token used to initialize a customer’s session.
request_id*StringA unique ID for this webhook request.

Example JSON POST request body

{
  "rows": [
    {
      "row_id": 1,
      "values": {
        "contact_id": "1438263",
        "contact_name": "John Doe",
        "business_vertical": "SaaS Sales"
      }
    },
    {
      "row_id": 2,
      "values": {
        "contact_id": "900372",
        "contact_name": "Aaron Smith",
        "business_vertical": "Sportswear"
      }
    },
    {
      "row_id": 3,
      "values": {
        "contact_id": "6803830",
        "contact_name": "Johnny Apple",
        "business_vertical": "Shipping Infrastructure"
      }
    },
    {
      "row_id": 4,
      "values": {
        "contact_id": "1684023",
        "contact_name": "Nancy Jackson",
        "business_vertical": "Marketing"
      }
    }
  ],
  "columns": [
    {
      "sheet_column_name": "Contact ID",
      "template_column_name": "Contact ID",
      "template_column_key": "contact_id"
    },
    {
      "sheet_column_name": "Contact Name",
      "template_column_name": "Contact Name",
      "template_column_key": "contact_name"
    },
    {
      "sheet_column_name": "Business Vertical",
      "template_column_name": "Business Vertical",
      "template_column_key": "business_vertical"
    }
  ],
  "template_key": "contact-template",
  "hook_id": 823,
  "hook_name": "Hook 1",
  "workspace_id": 23155,
  "sheet_id": 23145,
  "sheet_metadata": {
    "original_file_name": "contacts.csv"
  },
  "embed_user_jwt": "eyJhbGciOiJIUzI1NiJ9.eyJpc3MiOiI2N2JiMmU1Zi1mMGY3LTQyYTYtYTUxMS0xOGIyNWU2N2I4YzQiLCJ1c2VyX2asdasdasdasd.MaxfODdhWqVamNgK7l8mZrR-A4B2uGDuPWLOreu7dQI",
  "request_id": "req_v85bd03e1-32aa-4600-8f9c-5ff0b0d206c5"
}

In order to validate the values, OneSchema will expect the endpoint to return an array of warnings and errors. Each warning and error should specify the row_id, the column the warning or error appeared in, the severity of the issue ("warning" or "error"), a message to be displayed in the UI (see best practices for error messages), and optionally a suggestion can be provided to the user.

Issues can be raised for multiple columns in a single row. For example, if a validation hook verifies that two columns A and B sum to a value in column C, the validation hook may choose to say only column C has an error, or that each of columns A, B, and C have an error, depending on the relationship between the columns and the semantics they want to expose.

Validation Hook JSON Response Schema

ParameterTypeDescription
[]Object[]The response should be an array of validations.
[].row_idIntegerThe row_id with the warning or error.
[].columnStringThe key of the template column where the validation should be displayed.
[].column_indexInteger(Optional) The index of the column for this validation, based on the columns array provided in the JSON POST request body. Required if the column is a custom column.
[].severityStringEither "warning" (displayed in yellow) or "error" (displayed in red).
[].messageStringMessage to be displayed to user when they click on the validation.
[].suggestionString | String[](Optional) A suggestion or list of suggestions for a new value that will be displayed to the user. The user may choose whether or not to accept the suggested value.
[].optionsObject(Optional) Additional information related to the error. Keys may include:

popover_title: The title of the popover which appears when the user clicks an error or warning in OneSchema.

Example JSON Response

[
  {
    "row_id": 2,
    "column": "employee_name",
    "severity": "warning",
    "message": "Employee no longer works for company."
  },
  {
    "row_id": 2,
    "column": "employee_end_date",
    "severity": "error",
    "message": "Employee gave notice on this date.",
    "suggestion": "04/15/2020"
  },
  {
    "row_id": 3,
    "column": "employee_name",
    "severity": "error",
    "message": "Employee's name has changed",
    "suggestion": ["Johnny Appleseed", "Johnny"]
  }
]

Securing your validation hook

To secure your validation hook, we can optionally authenticate requests using HTTP Basic Auth. If a secret token is supplied, it will be used as the basic auth password in the request to the provided endpoint. No username is passed through.

Embedding

When using webhooks with embedded OneSchema, the JWT token that was used to initialize the session will also be sent in the top-level key embed_user_jwt.

Retries

If the webhook call times out after 30 seconds, or if the webhook returns a server error response code (5xx), OneSchema will retry the the call up to three times with exponential backoff.

A 408 (request timeout) or 429 (too many requests) code will also prompt OneSchema to retry.

The request_id will be the same for every retry.

Validation hooks with custom columns

Sometimes, even your custom columns require validation hooks. For example, you may have a user custom column that requires validation against your internal database when mapped against a sheet column. However, since custom columns can be mapped multiple times within the same sheet, additional information is required in the validation webhook JSON response in order to properly display validation errors and warnings.

The POST request schema specifies an array of column mappings via the columns key.

Columns object

ParameterTypeDescription
[]*Object[]Ordered array of mappings.
sheet_column_name*StringThe sheet column name of a specific mapping
template_column_name*StringThe name of the template column this sheet column is mapped to.
template_column_key*StringThe template column key this sheet column is mapped to.
is_customBoolean(Optional) Whether this template column is a custom column.
custom_column_nameString(Optional) The uploaded sheet column name for this mapped custom column.

Custom column values

Like standard template columns, custom column values are also specified via the rows array provided in the POST request body. However, all custom column values will be grouped under a single umbrella object keyed by$custom. The $custom key specifies a mapping from column_index to sheet cell value. This allows you to disambiguate between multiple sheet columns which could be mapped to the same custom template column.

Example JSON POST with custom columns

{
  "rows": [
    {
      "row_id": 1,
      "values": {
        "business_vertical": "SaaS Sales",
        "$custom": {
          "3": LOW,
          "4": LOW,
        }
      }
    },
    {
      "row_id": 2,
      "values": {
        "business_vertical": "Sportswear",
        "$custom": {
          "3": HIGH,
          "4": HIGH,
        }
      }
    },
    {
      "row_id": 3,
      "values": {
        "business_vertical": "Shipping Infrastructure", 
        "$custom": {
          "3": LOW,
          "4": HIGH,
        }
      }
    },
    {
      "row_id": 4,
      "values": {
        "business_vertical": "Marketing",
        "$custom": {
          "3": HIGH,
          "4": MED,
        }
      }
    }
  ],
  "columns": [
    {
      "sheet_column_name": "Contact ID",
      "template_column_name": "Contact ID",
      "template_column_key": "contact_id"
    },
    {
      "sheet_column_name": "Contact Name",
      "template_column_name": "Contact Name",
      "template_column_key": "contact_name"
    },
    {
      "sheet_column_name": "Business Vertical",
      "template_column_name": "Business Vertical",
      "template_column_key": "business_vertical"
    },
    {
      "sheet_column_name": "Secondary Priority",
      "template_column_name": "Tier",
      "template_column_key": "tier",
      "is_custom": true,
      "custom_column_name": "Secondary Priority"
    },
    {
      "sheet_column_name": "Priority",
      "template_column_name": "Tier",
      "template_column_key": "tier",
      "is_custom": true,
      "custom_column_name": "Priority"
    }
  ],
  "template_key": "contact-template",
  "hook_id": 823,
  "hook_name": "Hook 1",
  "workspace_id": 23155,
  "sheet_id": 23145,
  "sheet_metadata": {
    "original_file_name": "contacts.csv"
  },
  "embed_user_jwt": "eyJhbGciOiJIUzI1NiJ9.eyJpc3MiOiI2N2JiMmU1Zi1mMGY3LTQyYTYtYTUxMS0xOGIyNWU2N2I4YzQiLCJ1c2VyX2asdasdasdasd.MaxfODdhWqVamNgK7l8mZrR-A4B2uGDuPWLOreu7dQI",
  "request_id": "req_v85bd03e1-32aa-4600-8f9c-5ff0b0d206c5"
}

As before, OneSchema will expect the endpoint to return an array of warnings and errors. In addition to specifying row_id, column, severity, and message, you must also provide a column_index in any errors for custom columns.

Example JSON Response with custom columns

[
  {
    "row_id": 2,
    "column": "contact_name",
    "severity": "warning",
    "message": "Employee no longer works for company."
  },
  {
    "row_id": 3,
    "column": "tier",
    "column_index": 3,
    "severity": "error",
    "message": "Mismatched priority levels."
  },
  {
    "row_id": 4,
    "column": "tier",
    "column_index": 4,
    "severity": "warning",
    "message": "Specify priority as HIGH or LOW."
  }
]

Validation hooks with multi-mappable columns

Similar to custom columns, multi-mappable columns may be mapped multiple times in the same sheet. As such, additional information is required in the validation webhook JSON response in order to properly display validation errors and warnings.

The POST request schema specifies an array of column mappings via the columns key.

Columns object

ParameterTypeDescription
[]*Object[]Ordered array of mappings.
sheet_column_name*StringThe sheet column name of a specific mapping
template_column_name*StringThe name of the template column this sheet column is mapped to.
template_column_key*StringThe template column key this sheet column is mapped to.
is_multi_mappableBoolean(Optional) Whether this template column is multi mappable.

Multi mapped column values

Like standard template columns, multi-mappable column values are also specified via the rows array provided in the POST request body. Multi-mappable column values are reported as an array of objects; the column_index field identifies the index of the column in the columns array so that the original uploaded column can be identified. Note that multi-mappable columns values will always reported in array form even if there is only one corresponding uploaded column.

Example JSON POST with multi-mapped values

{
  "request_id": "req_v39a92433-3061-4977-8f70-9ff981d528ba",
  "rows": [
    {
      "row_id": 1,
      "values": {
        "area_name": "Area 1",
        "ground_temperature": [
          {
            "value": "59.4",
            "column_index": 1
          },
          {
            "value": "59.", 
            "column_index": 2
          },
          {
            "value": "60.9",
            "column_index": 3
          }
        ]
      }
    },
    {
      "row_id": 2,
      "values": {
        "ground_temperature": [
          {
            "value": "56.8",
            "column_index": 1
          },
          {
            "value": "57.0",
            "column_index": 2
          },
          {
            "value": "57.6",
            "column_index": 3
          }
        ],
        "area_name": "Area 2"
      }
    },
    {
      "row_id": 3,
      "values": {
        "ground_temperature": [
          {
            "value": "61.7",
            "column_index": 1
          },
          {
            "value": "62.2",
            "column_index": 2
          },
          {
            "value": "62.9",
            "column_index": 3
          }
        ],
        "area_name": "Area 3"
      }
    }
  ],
  "template_key": "temperature_sensors",
  "hook_id": <HOOK_ID>,
  "hook_name": <HOOK_NAME>
  "sheet_id": <SHEET_ID>,
  "columns": [
    {
      "sheet_column_name": "Area name",
      "template_column_name": "Area name",
      "template_column_key": "area_name"
    },
    {
      "sheet_column_name": "Ground temp 1",
      "template_column_name": "Ground Temperature",
      "template_column_key": "ground_temperature",
      "is_multi_mappable": true
    },
    {
      "sheet_column_name": "Ground temp 2",
      "template_column_name": "Ground Temperature",
      "template_column_key": "ground_temperature",
      "is_multi_mappable": true
    },
    {
      "sheet_column_name": "Ground temp 3",
      "template_column_name": "Ground Temperature",
      "template_column_key": "ground_temperature",
      "is_multi_mappable": true
    }
  ]
}

Again, OneSchema will expect the endpoint to return an array of warnings and errors. In addition to specifying row_id, column, severity, and message, you must also provide a column_index in any errors for multi mapped columns.

Example JSON Response with multi-mapped values

[
  {
    "row_id": 1,
    "column": "ground_temperature",
    "column_index": 2,
    "severity": "warning",
    "message": "Invalid temperature reading."
  }
]