Disclaimer

Your use of this download is governed by Stonebranch’s Terms of Use, which are available at https://www.stonebranch.com/integration-hub/Terms-and-Privacy/Terms-of-Use/

Introduction

This Universal Task allows Stonebranch users to perform end-to-end orchestration and automation of jobs and clusters in a Databricks environment, on either AWS or Azure.

Overview

  • This task will use the Databricks URL and the user bearer token to connect with the Databricks environment. 

  • Users can perform the following operations on Databricks jobs:

    • Create and list jobs

    • Get job details

    • Run now jobs

    • Run submit jobs

    • Cancel run jobs

  • For Databricks clusters, this Universal Task can perform the following operations:

    • Create, start and restart a cluster

    • Terminate a cluster

    • Get cluster information

    • List clusters

  • For the Databricks File System (DBFS), this Universal Task can also upload large files.
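The operations listed above correspond closely to endpoints of the public Databricks REST API 2.0. The sketch below shows one plausible mapping. The dictionary keys and the `build_url` helper are hypothetical names chosen for illustration, and this page does not state which endpoints the Universal Task actually calls.

```python
# Hypothetical mapping from task functions to Databricks REST API 2.0 endpoints.
# The endpoint paths belong to the public API; the key names are illustrative only.
ENDPOINTS = {
    "Create Job": ("POST", "jobs/create"),
    "List Jobs": ("GET", "jobs/list"),
    "Get Job Details": ("GET", "jobs/get"),
    "Run Now Job": ("POST", "jobs/run-now"),
    "Run Submit Job": ("POST", "jobs/runs/submit"),
    "Cancel Run Job": ("POST", "jobs/runs/cancel"),
    "Create Cluster": ("POST", "clusters/create"),
    "Start Cluster": ("POST", "clusters/start"),
    "Restart Cluster": ("POST", "clusters/restart"),
    "Terminate Cluster": ("POST", "clusters/delete"),
    "Get Cluster Info": ("GET", "clusters/get"),
    "List Clusters": ("GET", "clusters/list"),
}


def build_url(databricks_url, function):
    """Return (HTTP method, full URL) for one of the functions above."""
    method, path = ENDPOINTS[function]
    return method, "{}/api/2.0/{}".format(databricks_url.rstrip("/"), path)


# The workspace URL is a placeholder, not a real environment.
print(build_url("https://example.azuredatabricks.net/", "Terminate Cluster"))
```

Note that in API 2.0 the `clusters/delete` endpoint terminates a cluster; it does not remove the cluster definition.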

Software Requirements

This integration requires a Universal Agent and a Python runtime to execute the Universal Task against a Databricks environment.

Software Requirements for Universal Template and Universal Task

Requires Python 3.6 or higher. Tested with the Universal Agent bundled Python distribution.

  • Python modules required

    • requests
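A quick way to confirm that a Python runtime meets these requirements is sketched below. The `check_environment` helper is a hypothetical name and is not part of the Universal Template; it only checks the interpreter version and whether the listed module can be found.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the requirements above


def check_environment(required_modules=("requests",)):
    """Return a list of problems; an empty list means the runtime looks usable."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python {}.{} or higher is required".format(*MIN_PYTHON))
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("missing Python module: " + name)
    return problems


print(check_environment() or "environment OK")
```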

Software Requirements for Universal Agent

  • Universal Agent for Windows x64 Version 6.9 and later with Python options installed

  • Universal Agent for Linux Version 6.9 and later with Python options installed

Software Requirements for Universal Controller

  • Universal Controller Version 6.9.0.0 and later

Software Requirements for the Application to be Scheduled

This Universal Task has been tested with the Azure Databricks environment (API version 2.0).

Technical Considerations

  • This task uses the Python requests module to make REST API calls to the Databricks environment.

  • The Databricks URL and a user bearer token are required as basic input for this Universal Task.

  • Authentication is possible either with a personal access token generated in the Databricks environment or with Azure AD-based authentication.

  • Azure AD-based authentication requires some additional configuration in the Azure portal (https://portal.azure.com).
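Whichever token type is used, the Databricks REST API expects it in an `Authorization: Bearer` header. The following is a minimal sketch of such a call, assuming the `requests` module is installed. The helper names `auth_headers` and `list_clusters` are hypothetical and do not reflect the Universal Task's internal code.

```python
def auth_headers(bearer_token):
    """Build the HTTP headers that carry the bearer token."""
    token = bearer_token.strip()
    if not token:
        raise ValueError("bearer token must not be empty")
    return {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/json",
    }


def list_clusters(databricks_url, bearer_token):
    """Call GET /api/2.0/clusters/list and return the parsed JSON response."""
    import requests  # third-party; imported here so auth_headers has no dependency

    url = databricks_url.rstrip("/") + "/api/2.0/clusters/list"
    response = requests.get(url, headers=auth_headers(bearer_token), timeout=30)
    response.raise_for_status()  # surface HTTP 4xx/5xx responses as exceptions
    return response.json()
```

A call such as `list_clusters("https://example.azuredatabricks.net", token)` would need a real workspace URL and a valid token; the URL shown is a placeholder.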

Key Features

| Feature | Description |
| --- | --- |
| Create Job | Create a job in a Databricks environment from Universal Controller, using a JSON input that defines the job. |
| List jobs | List the jobs available in the Databricks environment. |
| Get Job details | Retrieve the definition of an existing Databricks job, given its job ID. |
| Run now Jobs | Run an existing job in the Databricks environment using the runtime input parameters supplied as JSON in the Universal Task. Universal Controller monitors the run until it completes. |
| Run Submit jobs | Run a job that is defined dynamically as JSON in an input parameter of the Universal Task. Universal Controller monitors the run until it completes. |
| Cancel Run job | Cancel a job run that is in the running state in the Databricks environment. |
| Create Cluster | Create a cluster in the Databricks environment. The cluster definition is provided as JSON in a script in this Universal Task. |
| List clusters | List the clusters available in the Databricks environment. |
| Start cluster | Start a cluster that is in the stopped state in Databricks. |
| Restart cluster | Restart a cluster in the Databricks environment. |
| Terminate cluster | Terminate a cluster in the Databricks environment, given its cluster ID. |
| Get a Cluster info | Return the definition of an existing cluster in the Databricks environment as JSON. |
| Upload file to DBFS | Upload a file from the local server to the Databricks File System (DBFS). |
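The monitoring behavior described for the run features can be pictured as a polling loop against the `jobs/runs/get` endpoint. The sketch below is an assumption about how such monitoring could work, not the task's actual implementation. `wait_for_run` and its `get_run` argument are hypothetical names, while the state values come from the Jobs API 2.0.

```python
import time

# Terminal life-cycle states of a run in the Jobs API 2.0.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def wait_for_run(get_run, run_id, poll_seconds=30, sleep=time.sleep):
    """Poll get_run(run_id) until the run reaches a terminal state.

    get_run is any callable that returns the JSON body of jobs/runs/get
    as a dict. Returns the result_state ("SUCCESS", "FAILED", ...), or the
    life-cycle state when no result_state is reported.
    """
    while True:
        state = get_run(run_id).get("state", {})
        life_cycle = state.get("life_cycle_state")
        if life_cycle in TERMINAL_STATES:
            return state.get("result_state", life_cycle)
        sleep(poll_seconds)


# Usage with canned responses instead of a live workspace.
responses = iter([
    {"state": {"life_cycle_state": "PENDING"}},
    {"state": {"life_cycle_state": "RUNNING"}},
    {"state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"}},
])
print(wait_for_run(lambda run_id: next(responses), 42, sleep=lambda s: None))  # prints SUCCESS
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of the HTTP layer, which makes it easy to exercise without a Databricks environment.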

Import Databricks Integration Downloadable Universal Template

To use this downloadable Universal Template, you must first perform the following steps:

  1. This Universal Task requires the Resolvable Credentials feature. Check that the Resolvable Credentials Permitted system property has been set to true.
  2. To import the Universal Template into your Controller, follow the instructions here.
  3. When the files have been imported successfully, refresh the Universal Templates list; the Universal Template will appear on the list.

Configure Databricks Integration Universal Task

For the new Universal Task type, create a new task, and enter the task-specific details that were created in the Universal Template.

Field Descriptions for Databricks Universal Task

| Field | Description |
| --- | --- |
| Databricks URL | Specify the Databricks URL. |
| Bearer Token | Provide the Databricks personal access token or the Azure AD token. |
| Databricks Function | Select the function to perform in Databricks. |
| Create Request Script | Provide the script that defines the new job or cluster to create in Databricks. |
| Job ID | Provide the Databricks job ID. |
| Job Run Request | Specify the parameters for a JAR, notebook, Python, or spark-submit job, or the request for a job Run Submit. |
| Run ID | Specify the Databricks run ID. |
| Cluster ID | Provide the cluster ID. |
| Local file name | Local file name, including its path. |
| DBFS file name | Provide the Databricks file path and name. |
| overwrite | Specify whether an existing file in DBFS should be overwritten by the upload. |
| SSL Verify | Check if this Universal Task requires certificate verification for Databricks REST API calls. |
| Certificate Path and file name | Path of the certificate for SSL verification, if the SSL Verify field is enabled. |
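To illustrate how some of these fields could translate into a REST call, the sketch below maps the SSL fields to the `verify` argument of the `requests` module and merges the Job ID into a run-now request body. Both helper names are hypothetical, and the real task may process its fields differently. The `notebook_params` key is a Jobs API 2.0 parameter, and the values shown are made up.

```python
import json


def resolve_verify(ssl_verify, certificate_path=None):
    """Map the SSL Verify and Certificate Path fields to the value that
    requests accepts for `verify`: False, True, or a CA bundle path."""
    if not ssl_verify:
        return False
    return certificate_path or True


def run_now_body(job_id, job_run_request="{}"):
    """Merge the Job ID field into the JSON text of the Job Run Request field."""
    body = json.loads(job_run_request or "{}")
    body["job_id"] = int(job_id)
    return body


# Illustrative values only; the path and parameters are placeholders.
print(resolve_verify(True, "/etc/ssl/certs/ca-bundle.pem"))
print(run_now_body("42", '{"notebook_params": {"run_date": "2021-01-31"}}'))
```

The resulting values could then be passed to a call such as `requests.post(url, json=body, verify=verify)`.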


Examples for Databricks Integration Universal Tasks

List Cluster Job

Run now Job

Run Submit Job

List Cluster

Upload Local File to DBFS
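For the DBFS upload, the public DBFS API 2.0 offers a streaming sequence (`dbfs/create`, `dbfs/add-block`, `dbfs/close`) intended for files too large to send in a single request. The sketch below assumes the task follows that sequence, which this page does not confirm. `upload_to_dbfs` and its `post` argument are hypothetical names.

```python
import base64

BLOCK_SIZE = 1024 * 1024  # add-block accepts roughly 1 MB of data per call


def upload_to_dbfs(post, local_file, dbfs_file, overwrite=False):
    """Stream local_file to dbfs_file in blocks of at most BLOCK_SIZE bytes.

    post(endpoint, payload) is any callable that sends a POST request to
    /api/2.0/<endpoint> and returns the parsed JSON response.
    Returns the number of blocks sent.
    """
    # Open a write stream; the API returns a handle that identifies it.
    handle = post("dbfs/create", {"path": dbfs_file, "overwrite": overwrite})["handle"]
    blocks = 0
    with open(local_file, "rb") as source:
        while True:
            chunk = source.read(BLOCK_SIZE)
            if not chunk:
                break
            # Block data must be sent base64-encoded.
            data = base64.b64encode(chunk).decode("ascii")
            post("dbfs/add-block", {"handle": handle, "data": data})
            blocks += 1
    # Close the stream so the file becomes complete in DBFS.
    post("dbfs/close", {"handle": handle})
    return blocks
```

Because the HTTP call is injected as `post`, the same function works with a real `requests`-based sender or with a stub that records the calls.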


Document References

This document references the following documents:

| Name | Location | Description |
| --- | --- | --- |
| Universal Templates | https://docs.stonebranch.com/confluence/display/UC72x/Universal+Templates | User documentation for creating Universal Templates in the Universal Controller user interface. |
| Universal Tasks | https://docs.stonebranch.com/confluence/display/UC72x/Universal+Tasks | User documentation for creating Universal Tasks in the Universal Controller user interface. |



