Create

```yaml
type: "io.kestra.plugin.azure.batch.job.Create"
```

Create an Azure Batch job with tasks.

Examples

```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: env
        commands:
          - 'echo t1=$ENV_STRING'
        environments:
          ENV_STRING: "{{ inputs.first }}"

      - id: echo
        commands:
          - 'echo t2={{ inputs.second }} 1>&2'

      - id: for
        commands:
          - 'for i in $(seq 10); do echo t3=$i; done'

      - id: vars
        commands:
          - echo '::{"outputs":{"extract":"'$(cat files/in/in.txt)'"}}::'
        resourceFiles:
          - httpUrl: https://unittestkt.blob.core.windows.net/tasks/***?sv=***&se=***&sr=***&sp=***&sig=***
            filePath: files/in/in.txt

      - id: output
        commands:
          - 'mkdir -p outs/child/sub'
          - 'echo 1 > outs/1.txt'
          - 'echo 2 > outs/child/2.txt'
          - 'echo 3 > outs/child/sub/3.txt'
        outputFiles:
          - outs/1.txt
        outputDirs:
          - outs/child
```

To start the task in a container, the pool must use a microsoft-azure-batch publisher.

```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: echo
        commands:
          - 'python --version'
        containerSettings:
          imageName: python
```

Properties

delete

  • Type: boolean
  • Dynamic:
  • Required: ✔️
  • Default: true

Whether the job should be deleted upon completion.

endpoint

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The Azure Batch account endpoint.

job

  • Type: Job
  • Dynamic: ✔️
  • Required: ✔️

The job to create.

poolId

  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The ID of the pool.

resume

  • Type: boolean
  • Dynamic:
  • Required: ✔️
  • Default: true

Whether to reconnect to the current job if it already exists.
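
As a sketch, both flags can be set directly on the task; the values shown are the defaults:

```yaml
delete: true   # remove the job from Azure Batch once it completes
resume: true   # reattach to an existing job with the same ID instead of creating a new one
```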

tasks

  • Type: array
  • SubType: Task
  • Dynamic:
  • Required: ✔️

The list of tasks to be run.

accessKey

  • Type: string
  • Dynamic:
  • Required:

account

  • Type: string
  • Dynamic:
  • Required:

completionCheckInterval

  • Type: string
  • Dynamic:
  • Required:
  • Default: 1.000000000 (1 second)
  • Format: duration

The frequency with which the task checks whether the job is completed.

maxDuration

  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

The maximum total wait duration.

If null, there is no timeout and the task is delegated to Azure Batch.
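
Both timing properties take ISO 8601 durations; a sketch:

```yaml
completionCheckInterval: PT5S  # poll the job status every 5 seconds
maxDuration: PT1H              # fail the Kestra task if the job is still running after 1 hour
```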

Outputs

outputFiles

  • Type: object
  • SubType: string
  • Required:

The output files' URIs in Kestra's internal storage.

vars

  • Type: object
  • Required:

The values from the output of the commands.

Definitions

io.kestra.plugin.azure.batch.models.OutputFileBlobContainerDestination

Properties

containerUrl
  • Type: string
  • Dynamic:
  • Required: ✔️

The URL of the container within Azure Blob Storage to which to upload the file(s).

If not using a managed identity, the URL must include a Shared Access Signature (SAS) granting write permissions to the container.

identityReference

The reference to the user assigned identity to use to access Azure Blob Storage specified by containerUrl.

The identity must have write access to the Azure Blob Storage container.

path
  • Type: string
  • Dynamic: ✔️
  • Required:

The destination blob or virtual directory within the Azure Storage container.

If filePattern refers to a specific file (i.e. contains no wildcards), then path is the name of the blob to which to upload that file. If filePattern contains one or more wildcards (and therefore may match multiple files), then path is the name of the blob virtual directory (which is prepended to each blob name) to which to upload the file(s). If omitted, file(s) are uploaded to the root of the container with a blob name matching their file name.
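
Putting these properties together, a hypothetical uploadFiles entry on a task might look like this (the storage account, container, and SAS token are placeholders):

```yaml
uploadFiles:
  - filePattern: 'outs/*.txt'
    destination:
      container:
        containerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-token>
        path: results   # treated as a blob virtual directory, since the pattern contains a wildcard
    uploadOptions:
      uploadCondition: TASK_SUCCESS
```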

io.kestra.plugin.azure.batch.models.ContainerRegistry

Properties

identityReference

The reference to the user assigned identity to use to access the Azure Container Registry instead of username and password.

password
  • Type: string
  • Dynamic: ✔️
  • Required:

The password to log into the registry server.

registryServer
  • Type: string
  • Dynamic: ✔️
  • Required:

The registry server URL.

If omitted, the default is "docker.io".

userName
  • Type: string
  • Dynamic: ✔️
  • Required:

The user name to log into the registry server.
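
A sketch of containerSettings pulling from a private registry; the registry server, user, and secret name are placeholders:

```yaml
containerSettings:
  imageName: <registry-server>/python:3.11
  registry:
    registryServer: <registry-server>
    userName: <registry-user>
    password: "{{ secret('REGISTRY_PASSWORD') }}"
```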

io.kestra.plugin.azure.batch.models.OutputFileUploadOptions

Properties

uploadCondition
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: TASK_COMPLETION
  • Possible Values:
    • TASK_SUCCESS
    • TASK_FAILURE
    • TASK_COMPLETION

The conditions under which the Task output file or set of files should be uploaded.

io.kestra.plugin.azure.batch.models.ComputeNodeIdentityReference

Properties

resourceId
  • Type: string
  • Dynamic: ✔️
  • Required:

The ARM resource ID of the user assigned identity.

io.kestra.plugin.azure.batch.models.ResourceFile

Properties

autoStorageContainerName
  • Type: string
  • Dynamic: ✔️
  • Required:

The storage container name in the auto storage Account.

The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified.

blobPrefix
  • Type: string
  • Dynamic: ✔️
  • Required:

The blob prefix to use when downloading blobs from the Azure Storage container.

Only the blobs whose names begin with the specified prefix will be downloaded. The property is valid only when autoStorageContainerName or storageContainerUrl is used. This prefix can be a partial file name or a subdirectory. If a prefix is not specified, all the files in the container will be downloaded.

fileMode
  • Type: string
  • Dynamic: ✔️
  • Required:

The file permission mode attribute in octal format.

This property applies only to files being downloaded to Linux Compute Nodes. It will be ignored if it is specified for a resourceFile which will be downloaded to a Windows Compute Node. If this property is not specified for a Linux Compute Node, then a default value of 0770 is applied to the file.

filePath
  • Type: string
  • Dynamic: ✔️
  • Required:

The location on the Compute Node to which to download the file(s), relative to the Task's working directory.

If the httpUrl property is specified, the filePath is required and describes the path which the file will be downloaded to, including the file name. Otherwise, if the autoStorageContainerName or storageContainerUrl property is specified, filePath is optional and is the directory to download the files to. In the case where filePath is used as a directory, any directory structure already associated with the input data will be retained in full and appended to the specified filePath directory. The specified relative path cannot break out of the Task's working directory (for example by using ..).

httpUrl
  • Type: string
  • Dynamic: ✔️
  • Required:

The URL of the file to download.

The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified. If the URL points to Azure Blob Storage, it must be readable from compute nodes. There are three ways to get such a URL for a blob in Azure storage: include a Shared Access Signature (SAS) granting read permissions on the blob, use a managed identity with read permission, or set the ACL for the blob or its container to allow public access.

identityReference

The reference to the user assigned identity to use to access Azure Blob Storage specified by storageContainerUrl or httpUrl.

storageContainerUrl
  • Type: string
  • Dynamic: ✔️
  • Required:

The URL of the blob container within Azure Blob Storage.

The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified. This URL must be readable and listable from compute nodes. There are three ways to get such a URL for a container in Azure storage: include a Shared Access Signature (SAS) granting read and list permissions on the container, use a managed identity with read and list permissions, or set the ACL for the container to allow public access.
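
For example, a container-based resourceFiles entry could combine these properties as follows (the storage account, container, and SAS token are placeholders):

```yaml
resourceFiles:
  - storageContainerUrl: https://<storage-account>.blob.core.windows.net/<container>?<sas-token>
    blobPrefix: inputs/       # only blobs whose names start with inputs/ are downloaded
    filePath: data            # directory under the Task's working directory
    fileMode: '0644'          # applies to Linux Compute Nodes only
```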

io.kestra.plugin.azure.batch.models.TaskContainerSettings

Properties

imageName
  • Type: string
  • Dynamic: ✔️
  • Required: ✔️

The Image to use to create the container in which the Task will run.

This is the full Image reference, as would be specified to docker pull. If no tag is provided as part of the Image name, the tag :latest is used as a default.

containerRunOptions
  • Type: string
  • Dynamic: ✔️
  • Required:

Additional options to the container create command.

These additional options are supplied as arguments to the docker create command, in addition to those controlled by the Batch Service.

registry

The private registry which contains the container image.

This setting can be omitted if it was already provided at Pool creation.

workingDirectory
  • Type: string
  • Dynamic:
  • Required:
  • Possible Values:
    • TASK_WORKING_DIRECTORY
    • CONTAINER_IMAGE_DEFAULT

The location of the container Task working directory.

The default is taskWorkingDirectory. Possible values include: taskWorkingDirectory, containerImageDefault.

io.kestra.plugin.azure.batch.models.Task

Properties

commands
  • Type: array
  • SubType: string
  • Dynamic: ✔️
  • Required: ✔️

The command line of the Task.

For multi-instance Tasks, the command line is executed as the primary Task, after the primary Task and all subtasks have finished executing the coordination command line. The command line does not run under a shell, and therefore cannot take advantage of shell features such as environment variable expansion. If you want to take advantage of such features, you should invoke the shell in the command line, for example, using cmd /c MyCommand in Windows or /bin/sh -c MyCommand in Linux. If the command line refers to file paths, it should use a relative path (relative to the Task working directory), or use the Batch provided environment variable.

Command will be passed as /bin/sh -c "command" by default.

id
  • Type: string
  • Dynamic: ✔️
  • Required: ✔️
  • Max length: 64

A string that uniquely identifies the Task within the Job.

The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within a Job that differ only by case). If not provided, a random UUID will be generated.

interpreter
  • Type: string
  • Dynamic:
  • Required: ✔️
  • Default: /bin/sh
  • Min length: 1

Interpreter to be used.
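
A sketch of overriding the default interpreter for a task (the default interpreterArgs of -c still apply):

```yaml
- id: bash_task
  commands:
    - 'set -o pipefail; seq 10 | head -n 3'
  interpreter: /bin/bash
```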

constraints

The execution constraints that apply to this Task.

containerSettings

The settings for the container under which the Task runs.

If the Pool that will run this Task has containerConfiguration set, this must be set as well. If the Pool that will run this Task doesn't have containerConfiguration set, this must not be set. When this is specified, all directories recursively below the AZ_BATCH_NODE_ROOT_DIR (the root of Azure Batch directories on the node) are mapped into the container, all Task environment variables are mapped into the container, and the Task command line is executed in the container. Files produced in the container outside of AZ_BATCH_NODE_ROOT_DIR might not be reflected to the host disk, meaning that Batch file APIs will not be able to access those files.

displayName
  • Type: string
  • Dynamic: ✔️
  • Required:
  • Max length: 1024

A display name for the Task.

The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.

environments
  • Type: object
  • SubType: string
  • Dynamic: ✔️
  • Required:

A list of environment variable settings for the Task.

interpreterArgs
  • Type: array
  • SubType: string
  • Dynamic:
  • Required:
  • Default: [-c]

Interpreter args to be used.

outputDirs
  • Type: array
  • SubType: string
  • Dynamic:
  • Required:

Output directories list that will be uploaded to the internal storage.

List of keys for which temporary directories will be created. In the command, you can use a special variable named outputDirs.key. For example, if you declare ["myDir"], you can write echo 1 >> {{ outputDirs.myDir }}/file1.txt and echo 2 >> {{ outputDirs.myDir }}/file2.txt, and both files will be uploaded to the internal storage. You can then reference them in other tasks using {{ outputs.taskId.files['myDir/file1.txt'] }}.

outputFiles
  • Type: array
  • SubType: string
  • Dynamic:
  • Required:

Output file list that will be uploaded to the internal storage.

List of keys for which temporary files will be created. In the command, you can use a special variable named outputFiles.key. For example, if you declare ["first"], you can write echo 1 >> {{ outputFiles.first }} in this task, and reference the file in other tasks using {{ outputs.taskId.outputFiles.first }}.
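
A sketch combining both output mechanisms described above:

```yaml
- id: produce
  commands:
    - 'echo 1 >> {{ outputFiles.first }}'
    - 'echo 2 >> {{ outputDirs.myDir }}/file2.txt'
  outputFiles:
    - first
  outputDirs:
    - myDir
```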

requiredSlots
  • Type: integer
  • Dynamic:
  • Required:

The number of scheduling slots that the Task requires to run.

The default is 1. A Task can only be scheduled to run on a compute node if the node has enough free scheduling slots available. For multi-instance Tasks, this must be 1.

resourceFiles
  • Type: array
  • SubType: ResourceFile
  • Dynamic: ✔️
  • Required:

A list of files that the Batch service will download to the Compute Node before running the command line.

For multi-instance Tasks, the resource files will only be downloaded to the Compute Node on which the primary Task is executed. There is a maximum size for the list of resource files. When the max size is exceeded, the request will fail and the response error code will be RequestEntityTooLarge. If this occurs, the collection of ResourceFiles must be reduced in size. This can be achieved using .zip files, Application Packages, or Docker Containers.

uploadFiles
  • Type: array
  • SubType: OutputFile
  • Dynamic: ✔️
  • Required:

A list of files that the Batch service will upload from the Compute Node after running the command line.

For multi-instance Tasks, the files will only be uploaded from the Compute Node on which the primary Task is executed.

io.kestra.plugin.azure.batch.models.OutputFile

Properties

destination

The destination for the output file(s).

uploadOptions

Additional options for the upload operation, including the conditions under which to perform the upload.

filePattern
  • Type: string
  • Dynamic: ✔️
  • Required:

A pattern indicating which file(s) to upload.

Both relative and absolute paths are supported. Relative paths are relative to the Task working directory. The following wildcards are supported: * matches 0 or more characters (for example, pattern abc* would match abc or abcdef), ** matches any directory, ? matches any single character, [abc] matches one character in the brackets, and [a-c] matches one character in the range. Brackets can include a negation to match any character not specified (for example, [!abc] matches any character but a, b, or c). If a file name starts with "." it is ignored by default but may be matched by specifying it explicitly (for example *.gif will not match .a.gif, but .*.gif will). A simple example: **\*.txt matches any file that does not start in '.' and ends with .txt in the Task working directory or any subdirectory. If the filename contains a wildcard character it can be escaped using brackets (for example, abc[*] would match a file named abc*). Note that both \ and / are treated as directory separators on Windows, but only / is on Linux. Environment variables (%var% on Windows or $var on Linux) are expanded prior to the pattern being applied.

io.kestra.plugin.azure.batch.models.Job

Properties

id
  • Type: string
  • Dynamic: ✔️
  • Required: ✔️
  • Max length: 64

A string that uniquely identifies the Job within the Account.

The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within an Account that differ only by case).

displayName
  • Type: string
  • Dynamic: ✔️
  • Required:
  • Max length: 1024

The display name for the Job.

The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.

labels
  • Type: object
  • SubType: string
  • Dynamic: ✔️
  • Required:

Labels to attach to the created job.

maxParallelTasks
  • Type: integer
  • Dynamic:
  • Required:

The maximum number of tasks that can be executed in parallel for the Job.

The value of maxParallelTasks must be -1 or greater than 0, if specified. If not specified, the default value is -1, which means there's no limit to the number of tasks that can be run at once. You can update a job's maxParallelTasks after it has been created using the update job API.

priority
  • Type: integer
  • Dynamic:
  • Required:

The priority of the Job.

Priority values can range from -1000 to 1000, with -1000 being the lowest priority and 1000 being the highest priority. The default value is 0.
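
A sketch of a fuller job block; the names and values are illustrative:

```yaml
job:
  id: <job-name>
  displayName: Nightly batch run
  maxParallelTasks: 4
  priority: 100
  labels:
    env: dev
```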

io.kestra.plugin.azure.batch.models.TaskConstraints

Properties

maxTaskRetryCount
  • Type: integer
  • Dynamic:
  • Required:

The maximum number of times the Task may be retried.

The Batch service retries a Task if its exit code is nonzero. Note that this value specifically controls the number of retries for the Task executable due to a nonzero exit code. The Batch service will try the Task once, and may then retry up to this limit. For example, if the maximum retry count is 3, Batch tries the Task up to 4 times (one initial try and 3 retries). If the maximum retry count is 0, the Batch service does not retry the Task after the first attempt. If the maximum retry count is -1, the Batch service retries the Task without limit.

maxWallClockTime
  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

The maximum elapsed time that the Task may run, measured from the time the Task starts.

If the Task does not complete within the time limit, the Batch service terminates it. If this is not specified, there is no time limit on how long the Task may run.

retentionTime
  • Type: string
  • Dynamic:
  • Required:
  • Format: duration

The minimum time to retain the Task directory on the Compute Node where it ran, from the time it completes execution.

After this time, the Batch service may delete the Task directory and all its contents. The default is 7 days, i.e. the Task directory will be retained for 7 days unless the Compute Node is removed or the Job is deleted.
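
A sketch of all three constraints on a task; the values are illustrative:

```yaml
constraints:
  maxTaskRetryCount: 3     # up to 4 attempts in total
  maxWallClockTime: PT30M  # terminate the Task after 30 minutes
  retentionTime: P1D       # keep the Task directory for at least 1 day
```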

io.kestra.plugin.azure.batch.models.OutputFileDestination

Properties

container

A location in Azure Blob Storage to which the files are uploaded.
