Cloud Dataprep API (v7.6.0)

Download OpenAPI specification:Download

Overview

To enable programmatic control over its objects, the Cloud Dataprep Platform supports a range of REST API endpoints across its objects. This section provides an overview of the API design, methods, and supported use cases.

Most of the endpoints accept JSON as input and return JSON responses. This means that you must usually add the following headers to your request:

Content-type: application/json
Accept: application/json

ℹ️ NOTE: Access to APIs must be enabled on a per-project basis. For more information, see Enable API.

Resources

The term resource refers to a single type of object in the Cloud Dataprep Platform metadata. The API is organized by resource: each endpoint belongs to the resource it operates on. The name of a resource is typically plural and expressed in camelCase. Example: jobGroups.

Resource names are used as part of endpoint URLs, as well as in API parameters and responses.

CRUD Operations

The platform supports Create, Read, Update, and Delete operations on most resources. You can review the standards for these operations and their standard parameters below.

Some endpoints have special behavior as exceptions.

Create

To create a resource, you typically submit an HTTP POST request with the resource's required metadata in the request body. The response returns a 201 Created response code upon success with the resource's metadata, including its internal id, in the response body.

Read

An HTTP GET request can be used to read a resource or to list a number of resources.

A resource's id can be submitted in the request parameters to read a specific resource. The response usually returns a 200 OK response code upon success, with the resource's metadata in the response body.

If a GET request does not include a specific resource id, it is treated as a list request. The response usually returns a 200 OK response code upon success, with an object containing a list of resources' metadata in the response body.

When reading resources, some common query parameters are usually available. For example:

/v4/jobGroups?limit=100&includeDeleted=true&embed=jobs

Query Parameter   Type      Description
embed             string    Comma-separated list of objects to include as part of the response. See Embedding Resources.
includeDeleted    string    If set to true, response includes deleted objects.
limit             integer   Maximum number of objects to fetch. Usually 25 by default.
offset            integer   Offset after which to start returning objects. For use with the limit query parameter.
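
As a rough illustration of how these query parameters combine, the following Python sketch builds list URLs one page at a time. The base URL is taken from the examples elsewhere in this document. The helper name list_url is hypothetical and not part of the API, and the page size of 100 is only an example value.

```python
from urllib.parse import urlencode

# Base URL as it appears in this document's examples
BASE_URL = "https://api.clouddataprep.com/v4"


def list_url(resource, limit=25, offset=0, include_deleted=False, embed=()):
    """Build a list URL for a resource from the common query parameters."""
    params = {"limit": limit, "offset": offset}
    if include_deleted:
        params["includeDeleted"] = "true"
    if embed:
        # embed takes a comma-separated list of associated resources
        params["embed"] = ",".join(embed)
    # Keep commas and * readable instead of percent-encoding them
    return f"{BASE_URL}/{resource}?{urlencode(params, safe=',*')}"


# First two pages of 100 job groups, with sub-jobs embedded
page_1 = list_url("jobGroups", limit=100, offset=0, embed=["jobs"])
page_2 = list_url("jobGroups", limit=100, offset=100, embed=["jobs"])
print(page_1)
print(page_2)
```

Advancing offset by the value of limit on each call walks through the full list without overlapping pages.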

Update

Updating a resource requires the resource id, and is typically done using an HTTP PUT or PATCH request, with the fields to modify in the request body. The response usually returns a 200 OK response code upon success, with minimal information about the modified resource in the response body.

Delete

Deleting a resource requires the resource id and is typically executed via an HTTP DELETE request. The response usually returns a 204 No Content response code upon success.
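
The four operations above can be summarized as a small lookup table. This is a minimal sketch, assuming the typical behavior described in this section; individual endpoints may differ, and the helper name crud_call is hypothetical.

```python
# Base URL as it appears in this document's examples
BASE_URL = "https://api.clouddataprep.com/v4"

# operation -> (HTTP method, needs a resource id, usual success code)
CRUD = {
    "create": ("POST", False, 201),
    "read": ("GET", True, 200),
    "list": ("GET", False, 200),
    "update": ("PATCH", True, 200),  # some endpoints use PUT instead
    "delete": ("DELETE", True, 204),
}


def crud_call(operation, resource, resource_id=None):
    """Return (method, url, expected status) for a typical CRUD call."""
    method, needs_id, expected = CRUD[operation]
    if needs_id and resource_id is None:
        raise ValueError(f"{operation} requires a resource id")
    url = f"{BASE_URL}/{resource}"
    if needs_id:
        url += f"/{resource_id}"
    return method, url, expected


print(crud_call("list", "jobGroups"))
print(crud_call("delete", "jobGroups", 1))
```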

Conventions

  • Resource names are plural and expressed in camelCase.

  • Resource names are consistent between main URL and URL parameter.

  • Parameter lists are consistently enveloped in the following manner:

    { "data": [{ ... }] }
  • Field names are in camelCase and are consistent with the resource name in the URL or with the embed URL parameter.

    "creator": { "id": 1 },
    "updater": { "id": 2 },
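
To illustrate the envelope convention, here is a short Python sketch that unwraps a list response. The sample body is invented for this example and only mirrors the shapes shown above; the helper name unwrap is hypothetical.

```python
import json

# Hypothetical list response body, shaped like the envelope convention above
body = '{"data": [{"id": 1, "creator": {"id": 1}, "updater": {"id": 2}}]}'


def unwrap(response_text):
    """Return the list of resources inside the {"data": [...]} envelope."""
    payload = json.loads(response_text)
    items = payload.get("data")
    if not isinstance(items, list):
        raise ValueError('expected a {"data": [...]} envelope')
    return items


resources = unwrap(body)
creator_ids = [resource["creator"]["id"] for resource in resources]
print(creator_ids)
```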

Embedding Resources

When reading a resource, the platform supports an embed query parameter for most resources, which allows the caller to ask for associated resources in the response. Use of this parameter requires knowledge of how different resources are related to each other and is suggested for advanced users only.

In the following example, the sub-jobs of a jobGroup are embedded in the response for jobGroup=1:

https://api.clouddataprep.com/v4/jobGroups/1?embed=jobs

To get a list of all the possible embeddings, you can pass * as the value of the embed parameter. The response will contain the list of possible resources that can be embedded.

https://api.clouddataprep.com/v4/jobGroups/1?embed=*
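
The two URLs above could be assembled programmatically along these lines. This is a sketch only; the helper name resource_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as it appears in this document's examples
BASE_URL = "https://api.clouddataprep.com/v4"


def resource_url(resource, resource_id, embed=None):
    """Build the URL for one resource, optionally embedding related ones."""
    url = f"{BASE_URL}/{resource}/{resource_id}"
    if embed:
        # Leave commas and * unescaped so the URL matches the examples above
        url += "?" + urlencode({"embed": ",".join(embed)}, safe=",*")
    return url


jobs_url = resource_url("jobGroups", 1, embed=["jobs"])
discover_url = resource_url("jobGroups", 1, embed=["*"])
print(jobs_url)
print(discover_url)
```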

Errors

The Cloud Dataprep Platform uses HTTP response codes to indicate the success or failure of an API request.

  • Codes in the 2xx range indicate success.
  • Codes in the 4xx range indicate that the information provided is invalid (invalid parameters, missing permissions, etc.).
  • Codes in the 5xx range indicate an error on the servers. These are rare and should usually go away when retrying. If you experience a lot of 5xx errors, contact support.

HTTP Status Code (client errors)   Notes
400 Bad Request                    Potential reasons:
                                     • Resource doesn't exist
                                     • Request is incorrectly formatted
                                     • Request contains invalid values
403 Forbidden                      Incorrect permissions to access the resource.
404 Not Found                      Resource cannot be found.
410 Gone                           Resource has been previously deleted.
415 Unsupported Media Type         Incorrect Accept or Content-type header.
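
A client might act on these ranges roughly as follows. This is a minimal sketch: the function names and the limit of three attempts are assumptions made for illustration, not values prescribed by the platform.

```python
def classify_status(status_code):
    """Map an HTTP status code onto the ranges described above."""
    if 200 <= status_code < 300:
        return "success"
    if 400 <= status_code < 500:
        return "client_error"
    if 500 <= status_code < 600:
        return "server_error"
    return "other"


def should_retry(status_code, attempt, max_attempts=3):
    """Retry only server-side (5xx) failures, up to max_attempts tries.

    4xx responses are not retried because the request itself is invalid
    and would fail again unchanged.
    """
    return classify_status(status_code) == "server_error" and attempt < max_attempts


print(classify_status(201), classify_status(404), classify_status(503))
print(should_retry(503, attempt=1), should_retry(400, attempt=1))
```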

Request Ids

Each request has a request identifier, which can be found in the response headers, in the following form:

x-trifacta-request-id: <myRequestId>

ℹ️ NOTE: If you have an issue with a specific request, please include the x-trifacta-request-id value when you contact support.
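
When logging failures, it can help to capture this header alongside the error. The sketch below assumes the response headers are available as a plain dictionary; the sample values are made up and the helper name request_id is hypothetical.

```python
REQUEST_ID_HEADER = "x-trifacta-request-id"


def request_id(headers):
    """Return the request identifier, or None if the header is absent.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == REQUEST_ID_HEADER:
            return value
    return None


# Hypothetical response headers for illustration
sample_headers = {
    "Content-Type": "application/json",
    "X-Trifacta-Request-Id": "abc-123",
}
print(request_id(sample_headers))
```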

Versioning and Endpoint Lifecycle

  • API versioning is not synchronized to specific releases of the platform.
  • APIs are designed to be backward compatible.
  • Any changes to the API will first go through a deprecation phase.

Trying the API

You can use a third-party client, such as curl, HTTPie, Postman, or the Insomnia REST client, to test the Cloud Dataprep API.

⚠️ When testing the API, bear in mind that you are working with your live production data, not sample data or test data.

Note that you will need to pass an API token with each request.

For example, here is how to run a job with curl:

curl -X POST 'https://api.clouddataprep.com/v4/jobGroups' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <token>' \
-d '{ "wrangledDataset": { "id": "<recipe-id>" } }'
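
For readers who prefer a script to a command line, the same request could be prepared with Python's standard library roughly as follows. This is a sketch under the same assumptions as the curl command; the function name build_run_job_request is hypothetical, and the placeholders must be replaced with real values. The call that sends the request is left commented out because it would act on live production data.

```python
import json
import urllib.request


def build_run_job_request(token, recipe_id):
    """Prepare (but do not send) the POST request that runs a job."""
    body = json.dumps({"wrangledDataset": {"id": recipe_id}}).encode("utf-8")
    return urllib.request.Request(
        "https://api.clouddataprep.com/v4/jobGroups",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


req = build_run_job_request("<token>", "<recipe-id>")
print(req.get_method(), req.full_url)

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, json.load(resp))
```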

Using a graphical tool such as Postman or Insomnia, it is possible to import the API specifications directly:

  1. Download the API specification by clicking the Download button at the top of this document.
  2. Import the JSON specification in the graphical tool of your choice.
    • In Postman, you can click the import button at the top
    • With Insomnia, you can just drag-and-drop the file on the UI

Note that with Postman, you can also generate code snippets by selecting a request and clicking on the Code button.

Authentication

BearerAuth

ℹ️ NOTE: Each request to the Cloud Dataprep Platform must include authentication credentials.

API access tokens can be acquired and applied to your requests so that sensitive Personally Identifiable Information (PII) is obscured. They are compliant with common privacy and security standards. These tokens last for a preconfigured time period and can be renewed as needed.

You can create and delete access tokens through the Settings area of the application. With each request, you submit the token as part of the Authorization header.

Authorization: Bearer <tokenValue>
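
One way to keep tokens out of source code is to read them from the environment when building request headers. The following is only a sketch: the environment variable name DATAPREP_TOKEN and the helper name auth_headers are assumptions made for this example.

```python
import os


def auth_headers(token=None):
    """Return request headers carrying the bearer token.

    Falls back to the (hypothetical) DATAPREP_TOKEN environment variable
    when no token is passed explicitly.
    """
    token = token or os.environ.get("DATAPREP_TOKEN")
    if not token:
        raise RuntimeError("no API access token provided")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


headers = auth_headers("example-token-value")
print(headers["Authorization"])
```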

As needed, you can create and use additional tokens. There is no limit to the number of tokens you can create. See API Access Token API for more information.

ℹ️ NOTE: You must be a project owner to create access tokens.

Security Scheme Type: HTTP
HTTP Authorization Scheme: bearer

Connection

An object representing Cloud Dataprep's connection to an external data source. Connections can be used for import, publishing, or both, depending on type.

Create connection

POST /v4/connections
https://api.clouddataprep.com/v4/connections

Create a new connection

ref: createConnection

Authorizations:
Request Body schema: application/json
vendor
required
string

String identifying the connection's vendor

vendorName
required
string

Name of the vendor of the connection

type
required
string
Enum: "jdbc" "rest"

Type of connection

credentialType
required
string
Enum: "basic" "securityToken" "iamRoleArn" "iamDbUser"
  • basic - Simple username/password to be provided in the credentials property
  • securityToken - Connection uses username, password and security token to authenticate.
  • iamRoleArn - Connection uses username, password and optional IAM Role ARN to authenticate.
  • iamDbUser - Connection uses IAM and DbUser to connect to the database.
name
required
string

Display name of the connection.

params
required
object

This setting is populated with any parameters that are passed to the source during connection and operations. For relational sources, this setting may include the default database and extra load parameters.

ssl
boolean

When true, the Cloud Dataprep Platform uses SSL to connect to the source

description
string

User-friendly description for the connection.

disableTypeInference
boolean

If set to false, type inference has been disabled for this connection. The default is true. When type inference has been disabled, the Cloud Dataprep Platform does not apply Cloud Dataprep types to data when it is imported.

isGlobal
boolean

If true, the connection is public and available to all users. Default is false.

NOTE: After a connection has been made public, it cannot be made private again. It must be deleted and recreated.

credentialsShared
boolean

If true, the credentials used for the connection are available for use by users with whom the connection has been shared.

host
string

Host of the source

port
integer

Port number for the source

credentials
Array of basic (object) or securityToken (object) (credentialsInfo)

If present, these values are the credentials used to connect to the database.
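
Before sending a create request, a client could check a payload against the required fields and enumerated values listed above. This is a minimal sketch based only on the schema in this section; it is not the platform's own validation, and the helper name validate_connection is hypothetical.

```python
# Fields marked "required" in the request body schema above
REQUIRED_FIELDS = ("vendor", "vendorName", "type", "credentialType", "name", "params")

# Enumerated values listed in the schema above
ALLOWED_VALUES = {
    "type": {"jdbc", "rest"},
    "credentialType": {"basic", "securityToken", "iamRoleArn", "iamDbUser"},
}


def validate_connection(payload):
    """Return a list of problems found in a create-connection payload."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in payload
    ]
    for field, allowed in ALLOWED_VALUES.items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"invalid {field}: {payload[field]!r}")
    return problems


payload = {
    "vendor": "oracle",
    "vendorName": "oracle",
    "type": "jdbc",
    "credentialType": "basic",
    "name": "example_oracle_connection",
    "params": {},
}
print(validate_connection(payload))
print(validate_connection({"vendor": "oracle", "type": "odbc"}))
```

An empty list means the payload passes these local checks; the server may still reject it for other reasons.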

Responses

201

Success

Request samples

Content type
application/json
Example
{
  "vendor": "oracle",
  "vendorName": "oracle",
  "type": "jdbc",
  "name": "example_oracle_connection",
  "description": "This is an oracle connection",
  "disableTypeInference": false,
  "isGlobal": false,
  "credentialsShared": false,
  "host": "my_oracle_host",
  "port": 1521,
  "params": {},
  "credentialType": "basic",
  "credentials": []
}

Response samples

Content type
application/json
{
  "vendor": "oracle",
  "vendorName": "oracle",
  "type": "jdbc",
  "credentialType": "basic",
  "ssl": true,
  "name": "example_oracle_connection",
  "description": "string",
  "disableTypeInference": true,
  "isGlobal": true,
  "credentialsShared": true,
  "host": "example.oracle.test",
  "port": 1521,
  "id": "55",
  "uuid": "f9cab740-50b7-11e9-ba15-93c82271a00b",
  "createdAt": "2020-10-16T01:18:39Z",
  "updatedAt": "2020-10-16T01:18:39Z",
  "credentials": [],
  "creator": {},
  "updater": {