Dataflow API . projects . locations . flexTemplates

Instance Methods

launch(projectId, location, body=None, x__xgafv=None)

Launch a job with a FlexTemplate.

Method Details

launch(projectId, location, body=None, x__xgafv=None)
Launch a job with a FlexTemplate.

Args:
  projectId: string, Required. The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, Required. The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to
which to direct the request. E.g., us-central1, us-west1. (required)
  body: object, The request body.
    The object takes the form of:

{ # A request to launch a Cloud Dataflow job from a FlexTemplate.
    "launchParameter": { # Launch FlexTemplate Parameter. # Required. Parameter to launch a job form Flex Template.
      "launchOptions": { # Launch options for this flex template job. This is a common set of options
          # across languages and templates. This should not be used to pass job
          # parameters.
        "a_key": "A String",
      },
      "jobName": "A String", # Required. The job name to use for the created job.
      "containerSpecGcsPath": "A String", # Gcs path to a file with json serialized ContainerSpec as content.
      "containerSpec": { # Container Spec. # Spec about the container image to launch.
        "image": "A String", # Name of the docker container image. E.g., gcr.io/project/some-image
        "sdkInfo": { # SDK Information. # Required. SDK info of the Flex Template.
          "language": "A String", # Required. The SDK Language.
          "version": "A String", # Optional. The SDK version.
        },
        "metadata": { # Metadata describing a template. # Metadata describing a template including description and validation rules.
          "description": "A String", # Optional. A description of the template.
          "parameters": [ # The parameters for the template.
            { # Metadata for a specific parameter.
              "label": "A String", # Required. The label to display for the parameter.
              "helpText": "A String", # Required. The help text to display for the parameter.
              "regexes": [ # Optional. Regexes that the parameter must match.
                "A String",
              ],
              "paramType": "A String", # Optional. The type of the parameter.
                  # Used for selecting input picker.
              "isOptional": True or False, # Optional. Whether the parameter is optional. Defaults to false.
              "name": "A String", # Required. The name of the parameter.
            },
          ],
          "name": "A String", # Required. The name of the template.
        },
      },
      "parameters": { # The parameters for FlexTemplate.
          # Ex. {"num_workers":"5"}
        "a_key": "A String",
      },
    },
    "validateOnly": True or False, # If true, the request is validated but not actually executed.
        # Defaults to false.
  }

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format
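
Example (a minimal sketch using the google-api-python-client library; the
project, bucket, and template path below are illustrative placeholders, and
Application Default Credentials are assumed to be configured):

  from googleapiclient.discovery import build

  # Build a client for the Dataflow v1b3 API.
  dataflow = build("dataflow", "v1b3")

  # Hypothetical values; substitute your own project, bucket, and spec file.
  body = {
      "launchParameter": {
          "jobName": "example-flex-job",
          "containerSpecGcsPath": "gs://example-bucket/templates/spec.json",
          "parameters": {"num_workers": "5"},
      },
      "validateOnly": False,
  }

  response = (
      dataflow.projects()
      .locations()
      .flexTemplates()
      .launch(projectId="example-project", location="us-central1", body=body)
      .execute()
  )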

Returns:
  An object of the form:

    { # Response to the request to launch a job from Flex Template.
    "job": { # Defines a job to be run by the Cloud Dataflow service. # The job that was launched, if the request was not a dry run and
        # the job was successfully launched.
        "pipelineDescription": { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
            # A description of the user pipeline and stages through which it is executed.
            # Created by Cloud Dataflow service.  Only retrieved with
            # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
            # form.  This data is provided by the Dataflow service for ease of visualizing
            # the pipeline and interpreting Dataflow provided metrics.
          "displayData": [ # Pipeline level display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              "url": "A String", # An optional full URL.
              "javaClassValue": "A String", # Contains value if the data is of java class type.
              "timestampValue": "A String", # Contains value if the data is of timestamp type.
              "durationValue": "A String", # Contains value if the data is of duration type.
              "label": "A String", # An optional label to display in a dax UI for the element.
              "key": "A String", # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              "namespace": "A String", # The namespace for the key. This is usually a class name or programming
                  # language namespace (e.g. a Python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              "floatValue": 3.14, # Contains value if the data is of float type.
              "strValue": "A String", # Contains value if the data is of string type.
              "int64Value": "A String", # Contains value if the data is of int64 type.
              "boolValue": True or False, # Contains value if the data is of a boolean type.
              "shortStrValue": "A String", # A possible additional shorter value to display.
                  # For example a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
            },
          ],
          "originalPipelineTransform": [ # Description of each transform in the pipeline and collections between them.
            { # Description of the type, names/ids, and input/outputs for a transform.
              "outputCollectionName": [ # User  names for all collection outputs to this transform.
                "A String",
              ],
              "displayData": [ # Transform-specific display data.
                { # Data provided with a pipeline or transform to provide descriptive info.
                  "url": "A String", # An optional full URL.
                  "javaClassValue": "A String", # Contains value if the data is of java class type.
                  "timestampValue": "A String", # Contains value if the data is of timestamp type.
                  "durationValue": "A String", # Contains value if the data is of duration type.
                  "label": "A String", # An optional label to display in a dax UI for the element.
                  "key": "A String", # The key identifying the display data.
                      # This is intended to be used as a label for the display data
                      # when viewed in a dax monitoring system.
                  "namespace": "A String", # The namespace for the key. This is usually a class name or programming
                      # language namespace (e.g. a Python module) which defines the display data.
                      # This allows a dax monitoring system to specially handle the data
                      # and perform custom rendering.
                  "floatValue": 3.14, # Contains value if the data is of float type.
                  "strValue": "A String", # Contains value if the data is of string type.
                  "int64Value": "A String", # Contains value if the data is of int64 type.
                  "boolValue": True or False, # Contains value if the data is of a boolean type.
                  "shortStrValue": "A String", # A possible additional shorter value to display.
                      # For example a java_class_name_value of com.mypackage.MyDoFn
                      # will be stored with MyDoFn as the short_str_value and
                      # com.mypackage.MyDoFn as the java_class_name value.
                      # short_str_value can be displayed and java_class_name_value
                      # will be displayed as a tooltip.
                },
              ],
              "id": "A String", # SDK generated id of this transform instance.
              "inputCollectionName": [ # User names for all collection inputs to this transform.
                "A String",
              ],
              "name": "A String", # User provided name for this transform instance.
              "kind": "A String", # Type of transform.
            },
          ],
          "executionPipelineStage": [ # Description of each stage of execution of the pipeline.
            { # Description of the composing transforms, names/ids, and input/outputs of a
                # stage of execution.  Some composing transforms and sources may have been
                # generated by the Dataflow service during execution planning.
              "componentSource": [ # Collections produced and consumed by component transforms of this stage.
                { # Description of an interstitial value between transforms in an execution
                    # stage.
                  "userName": "A String", # Human-readable name for this transform; may be user or system generated.
                  "name": "A String", # Dataflow service generated name for this source.
                  "originalTransformOrCollection": "A String", # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              "inputSource": [ # Input sources for this stage.
                { # Description of an input or output of an execution stage.
                  "userName": "A String", # Human-readable name for this source; may be user or system generated.
                  "originalTransformOrCollection": "A String", # User name for the original user transform or collection with which this
                      # source is most closely associated.
                  "sizeBytes": "A String", # Size of the source, if measurable.
                  "name": "A String", # Dataflow service generated name for this source.
                },
              ],
              "name": "A String", # Dataflow service generated name for this stage.
              "componentTransform": [ # Transforms that comprise this execution stage.
                { # Description of a transform executed as part of an execution stage.
                  "name": "A String", # Dataflow service generated name for this source.
                  "userName": "A String", # Human-readable name for this transform; may be user or system generated.
                  "originalTransform": "A String", # User name for the original user transform with which this transform is
                      # most closely associated.
                },
              ],
              "id": "A String", # Dataflow service generated id for this stage.
              "outputSource": [ # Output sources for this stage.
                { # Description of an input or output of an execution stage.
                  "userName": "A String", # Human-readable name for this source; may be user or system generated.
                  "originalTransformOrCollection": "A String", # User name for the original user transform or collection with which this
                      # source is most closely associated.
                  "sizeBytes": "A String", # Size of the source, if measurable.
                  "name": "A String", # Dataflow service generated name for this source.
                },
              ],
              "kind": "A String", # Type of tranform this stage is executing.
            },
          ],
        },
        "labels": { # User-defined labels for this job.
            #
            # The labels map can contain no more than 64 entries.  Entries of the labels
            # map are UTF8 strings that comply with the following restrictions:
            #
            # * Keys must conform to regexp:  \p{Ll}\p{Lo}{0,62}
            # * Values must conform to regexp:  [\p{Ll}\p{Lo}\p{N}_-]{0,63}
            # * Both keys and values are additionally constrained to be <= 128 bytes in
            # size.
          "a_key": "A String",
        },
        "projectId": "A String", # The ID of the Cloud Platform project that the job belongs to.
        "environment": { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
          "flexResourceSchedulingGoal": "A String", # Which Flexible Resource Scheduling mode to run in.
          "workerRegion": "A String", # The Compute Engine region
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. "us-west1". Mutually exclusive
              # with worker_zone. If neither worker_region nor worker_zone is specified,
              # default to the control plane's region.
          "userAgent": { # A description of the process that generated the request.
            "a_key": "", # Properties of the object.
          },
          "serviceAccountEmail": "A String", # Identity to run virtual machines as. Defaults to the default account.
          "version": { # A structure describing which components and their versions of the service
              # are required in order to run the job.
            "a_key": "", # Properties of the object.
          },
          "serviceKmsKeyName": "A String", # If set, contains the Cloud KMS key identifier used to encrypt data
              # at rest, AKA a Customer Managed Encryption Key (CMEK).
              #
              # Format:
              #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
          "experiments": [ # The list of experiments to enable.
            "A String",
          ],
          "workerZone": "A String", # The Compute Engine zone
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. "us-west1-a". Mutually exclusive
              # with worker_region. If neither worker_region nor worker_zone is specified,
              # a zone in the control plane's region is chosen based on available capacity.
          "workerPools": [ # The worker pools. At least one "harness" worker pool must be
              # specified in order for the job to have workers.
            { # Describes one particular pool of Cloud Dataflow workers to be
                # instantiated by the Cloud Dataflow service in order to perform the
                # computations required by a job.  Note that a workflow job may use
                # multiple pools, in order to match the various computational
                # requirements of the various stages of the job.
              "onHostMaintenance": "A String", # The action to take on host maintenance, as defined by the Google
                  # Compute Engine API.
              "sdkHarnessContainerImages": [ # Set of SDK harness containers needed to execute this pipeline. This will
                  # only be set in the Fn API path. For non-cross-language pipelines this
                  # should have only one entry. Cross-language pipelines will have two or more
                  # entries.
                { # Defines an SDK harness container for executing Dataflow pipelines.
                  "containerImage": "A String", # A docker container image that resides in Google Container Registry.
                  "useSingleCorePerContainer": True or False, # If true, recommends the Dataflow service to use only one core per SDK
                      # container instance with this image. If false (or unset) recommends using
                      # more than one core per SDK container instance with this image for
                      # efficiency. Note that Dataflow service may choose to override this property
                      # if needed.
                },
              ],
              "zone": "A String", # Zone to run the worker pools in.  If empty or unspecified, the service
                  # will attempt to choose a reasonable default.
              "kind": "A String", # The kind of the worker pool; currently only `harness` and `shuffle`
                  # are supported.
              "metadata": { # Metadata to set on the Google Compute Engine VMs.
                "a_key": "A String",
              },
              "diskSourceImage": "A String", # Fully qualified source image for disks.
              "dataDisks": [ # Data disks that are used by a VM in this workflow.
                { # Describes the data disk used by a workflow job.
                  "sizeGb": 42, # Size of disk in GB.  If zero or unspecified, the service will
                      # attempt to choose a reasonable default.
                  "diskType": "A String", # Disk storage type, as defined by Google Compute Engine.  This
                      # must be a disk type appropriate to the project and zone in which
                      # the workers will run.  If unknown or unspecified, the service
                      # will attempt to choose a reasonable default.
                      #
                      # For example, the standard persistent disk type is a resource name
                      # typically ending in "pd-standard".  If SSD persistent disks are
                      # available, the resource name typically ends with "pd-ssd".  The
                      # actual valid values are defined by the Google Compute Engine API,
                      # not by the Cloud Dataflow API; consult the Google Compute Engine
                      # documentation for more information about determining the set of
                      # available disk types for a particular project and zone.
                      #
                      # Google Compute Engine Disk types are local to a particular
                      # project in a particular zone, and so the resource name will
                      # typically look something like this:
                      #
                      # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
                  "mountPoint": "A String", # Directory in a VM where disk is mounted.
                },
              ],
              "packages": [ # Packages to be installed on workers.
                { # The packages that must be installed in order for a worker to run the
                    # steps of the Cloud Dataflow job that will be assigned to its worker
                    # pool.
                    #
                    # This is the mechanism by which the Cloud Dataflow SDK causes code to
                    # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                    # might use this to install jars containing the user's code and all of the
                    # various dependencies (libraries, data files, etc.) required in order
                    # for that code to run.
                  "name": "A String", # The name of the package.
                  "location": "A String", # The resource to read the package from. The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}
                      #   bucket.storage.googleapis.com/
                },
              ],
              "teardownPolicy": "A String", # Sets the policy for determining when to turndown worker pool.
                  # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
                  # `TEARDOWN_NEVER`.
                  # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
                  # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
                  # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
                  # down.
                  #
                  # If the workers are not torn down by the service, they will
                  # continue to run and use Google Compute Engine VM resources in the
                  # user's project until they are explicitly terminated by the user.
                  # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
                  # policy except for small, manually supervised test jobs.
                  #
                  # If unknown or unspecified, the service will attempt to choose a reasonable
                  # default.
              "network": "A String", # Network to which VMs will be assigned.  If empty or unspecified,
                  # the service will use the network "default".
              "ipConfiguration": "A String", # Configuration for VM IPs.
              "diskSizeGb": 42, # Size of root disk for VMs, in GB.  If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              "autoscalingSettings": { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
                "maxNumWorkers": 42, # The maximum number of workers to cap scaling at.
                "algorithm": "A String", # The algorithm to use for autoscaling.
              },
              "poolArgs": { # Extra arguments for this worker pool.
                "a_key": "", # Properties of the object. Contains field @type with type URL.
              },
              "subnetwork": "A String", # Subnetwork to which VMs will be assigned, if desired.  Expected to be of
                  # the form "regions/REGION/subnetworks/SUBNETWORK".
              "numWorkers": 42, # Number of Google Compute Engine workers in this pool needed to
                  # execute the job.  If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              "numThreadsPerWorker": 42, # The number of threads per worker harness. If empty or unspecified, the
                  # service will choose a number of threads (according to the number of cores
                  # on the selected machine type for batch, or 1 by convention for streaming).
              "workerHarnessContainerImage": "A String", # Required. Docker container image that executes the Cloud Dataflow worker
                  # harness, residing in Google Container Registry.
                  #
                  # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
              "taskrunnerSettings": { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
                  # using the standard Dataflow task runner.  Users should ignore
                  # this field.
                "dataflowApiVersion": "A String", # The API version of endpoint, e.g. "v1b3"
                "oauthScopes": [ # The OAuth2 scopes to be requested by the taskrunner in order to
                    # access the Cloud Dataflow API.
                  "A String",
                ],
                "baseUrl": "A String", # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                    #
                    # When workers access Google Cloud APIs, they logically do so via
                    # relative URLs.  If this field is specified, it supplies the base
                    # URL to use for resolving these relative URLs.  The normative
                    # algorithm used is defined by RFC 1808, "Relative Uniform Resource
                    # Locators".
                    #
                    # If not specified, the default value is "http://www.googleapis.com/"
                "workflowFileName": "A String", # The file to store the workflow in.
                "logToSerialconsole": True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                    # console.
                "baseTaskDir": "A String", # The location on the worker for task-specific subdirectories.
                "taskUser": "A String", # The UNIX user ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. "root".
                "vmId": "A String", # The ID string of the VM.
                "alsologtostderr": True or False, # Whether to also send taskrunner log info to stderr.
                "parallelWorkerSettings": { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
                  "shuffleServicePath": "A String", # The Shuffle service path relative to the root URL, for example,
                      # "shuffle/v1beta1".
                  "tempStoragePrefix": "A String", # The prefix of the resources the system should use for temporary
                      # storage.
                      #
                      # The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}/{object}
                      #   bucket.storage.googleapis.com/{object}
                  "reportingEnabled": True or False, # Whether to send work progress updates to the service.
                  "servicePath": "A String", # The Cloud Dataflow service path relative to the root URL, for example,
                      # "dataflow/v1b3/projects".
                  "baseUrl": "A String", # The base URL for accessing Google Cloud APIs.
                      #
                      # When workers access Google Cloud APIs, they logically do so via
                      # relative URLs.  If this field is specified, it supplies the base
                      # URL to use for resolving these relative URLs.  The normative
                      # algorithm used is defined by RFC 1808, "Relative Uniform Resource
                      # Locators".
                      #
                      # If not specified, the default value is "http://www.googleapis.com/"
                  "workerId": "A String", # The ID of the worker running this pipeline.
                },
                "harnessCommand": "A String", # The command to launch the worker harness.
                "logDir": "A String", # The directory on the VM to store logs.
                "streamingWorkerMainClass": "A String", # The streaming worker main class name.
                "languageHint": "A String", # The suggested backend language.
                "taskGroup": "A String", # The UNIX group ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. "wheel".
                "logUploadLocation": "A String", # Indicates where to put logs.  If this is not specified, the logs
                    # will not be uploaded.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
                "commandlinesFileName": "A String", # The file to store preprocessing commands in.
                "continueOnException": True or False, # Whether to continue taskrunner if an exception is hit.
                "tempStoragePrefix": "A String", # The prefix of the resources the taskrunner should use for
                    # temporary storage.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
              },
              "diskType": "A String", # Type of root disk for VMs.  If empty or unspecified, the service will
                  # attempt to choose a reasonable default.
              "defaultPackageSet": "A String", # The default package set to install.  This allows the service to
                  # select a default set of packages which are useful to worker
                  # harnesses written in a particular language.
              "machineType": "A String", # Machine type (e.g. "n1-standard-1").  If empty or unspecified, the
                  # service will attempt to choose a reasonable default.
            },
          ],
          "tempStoragePrefix": "A String", # The prefix of the resources the system should use for temporary
              # storage.  The system will append the suffix "/temp-{JOBNAME}" to
              # this resource prefix, where {JOBNAME} is the value of the
              # job_name field.  The resulting bucket and object prefix is used
              # as the prefix of the resources used to store temporary data
              # needed during the job execution.  NOTE: This will override the
              # value in taskrunner_settings.
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          "internalExperiments": { # Experimental settings.
            "a_key": "", # Properties of the object. Contains field @type with type URL.
          },
          "sdkPipelineOptions": { # The Cloud Dataflow SDK pipeline options specified by the user. These
              # options are passed through the service and are used to recreate the
              # SDK pipeline options on the worker in a language agnostic and platform
              # independent way.
            "a_key": "", # Properties of the object.
          },
          "dataset": "A String", # The dataset for the current project where various workflow
              # related tables are stored.
              #
              # The supported resource type is:
              #
              # Google BigQuery:
              #   bigquery.googleapis.com/{dataset}
          "clusterManagerApiService": "A String", # The type of cluster manager API to use.  If unknown or
              # unspecified, the service will attempt to choose a reasonable
              # default.  This should be in the form of the API service name,
              # e.g. "compute.googleapis.com".
        },
        "stepsLocation": "A String", # The GCS location where the steps are stored.
        "steps": [ # Exactly one of step or steps_location should be specified.
            #
            # The top-level steps that constitute the entire job.
          { # Defines a particular step within a Cloud Dataflow job.
              #
              # A job consists of multiple steps, each of which performs some
              # specific operation as part of the overall job.  Data is typically
              # passed from one step to another as part of the job.
              #
              # Here's an example of a sequence of steps which together implement a
              # Map-Reduce job:
              #
              #   * Read a collection of data from some source, parsing the
              #     collection's elements.
              #
              #   * Validate the elements.
              #
              #   * Apply a user-defined function to map each element to some value
              #     and extract an element-specific key value.
              #
              #   * Group elements with the same key into a single element with
              #     that key, transforming a multiply-keyed collection into a
              #     uniquely-keyed collection.
              #
              #   * Write the elements out to some data sink.
              #
              # Note that the Cloud Dataflow service may be used to run many different
              # types of jobs, not just Map-Reduce.
            "kind": "A String", # The kind of step in the Cloud Dataflow job.
            "properties": { # Named properties associated with the step. Each kind of
                # predefined step has its own required set of properties.
                # Must be provided on Create.  Only retrieved with JOB_VIEW_ALL.
              "a_key": "", # Properties of the object.
            },
            "name": "A String", # The name that identifies the step. This must be unique for each
                # step with respect to all other steps in the Cloud Dataflow job.
          },
        ],
        "stageStates": [ # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
          { # A message describing the state of a particular execution stage.
            "executionStageState": "A String", # Executions stage states allow the same set of values as JobState.
            "executionStageName": "A String", # The name of the execution stage.
            "currentStateTime": "A String", # The time at which the stage transitioned to this state.
          },
        ],
        "replacedByJobId": "A String", # If another job is an update of this job (and thus, this job is in
            # `JOB_STATE_UPDATED`), this field contains the ID of that job.
        "jobMetadata": { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
            # by the metadata values provided here. Populated for ListJobs and all GetJob
            # views SUMMARY and higher.
            # ListJob response and Job SUMMARY view.
          "sdkVersion": { # The version of the SDK used to run the job. # The SDK version used to run the job.
            "sdkSupportStatus": "A String", # The support status for this SDK version.
            "versionDisplayName": "A String", # A readable string describing the version of the SDK.
            "version": "A String", # The version of the SDK used to run the job.
          },
          "bigTableDetails": [ # Identification of a BigTable source used in the Dataflow job.
            { # Metadata for a BigTable connector used by the job.
              "instanceId": "A String", # InstanceId accessed in the connection.
              "tableId": "A String", # TableId accessed in the connection.
              "projectId": "A String", # ProjectId accessed in the connection.
            },
          ],
          "pubsubDetails": [ # Identification of a PubSub source used in the Dataflow job.
            { # Metadata for a PubSub connector used by the job.
              "subscription": "A String", # Subscription used in the connection.
              "topic": "A String", # Topic accessed in the connection.
            },
          ],
          "bigqueryDetails": [ # Identification of a BigQuery source used in the Dataflow job.
            { # Metadata for a BigQuery connector used by the job.
              "dataset": "A String", # Dataset accessed in the connection.
              "projectId": "A String", # Project accessed in the connection.
              "query": "A String", # Query used to access data in the connection.
              "table": "A String", # Table accessed in the connection.
            },
          ],
          "fileDetails": [ # Identification of a File source used in the Dataflow job.
            { # Metadata for a File connector used by the job.
              "filePattern": "A String", # File Pattern used to access files by the connector.
            },
          ],
          "datastoreDetails": [ # Identification of a Datastore source used in the Dataflow job.
            { # Metadata for a Datastore connector used by the job.
              "namespace": "A String", # Namespace used in the connection.
              "projectId": "A String", # ProjectId accessed in the connection.
            },
          ],
          "spannerDetails": [ # Identification of a Spanner source used in the Dataflow job.
            { # Metadata for a Spanner connector used by the job.
              "instanceId": "A String", # InstanceId accessed in the connection.
              "databaseId": "A String", # DatabaseId accessed in the connection.
              "projectId": "A String", # ProjectId accessed in the connection.
            },
          ],
        },
        "location": "A String", # The [regional endpoint]
            # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
            # contains this job.
        "transformNameMapping": { # The map of transform name prefixes of the job to be replaced to the
            # corresponding name prefixes of the new job.
          "a_key": "A String",
        },
        "startTime": "A String", # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
            # Flexible resource scheduling jobs are started with some delay after job
            # creation, so start_time is unset before start and is updated when the
            # job is started by the Cloud Dataflow service. For other jobs, start_time
            # always equals create_time and is immutable and set by the Cloud Dataflow
            # service.
        "clientRequestId": "A String", # The client's unique identifier of the job, re-used across retried attempts.
            # If this field is set, the service will ensure its uniqueness.
            # The request to create a job will fail if the service has knowledge of a
            # previously submitted job with the same client's ID and job name.
            # The caller may use this field to ensure idempotence of job
            # creation across retried attempts to create a job.
            # By default, the field is empty and, in that case, the service ignores it.
        "executionInfo": { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
            # isn't contained in the submitted job.
          "stages": { # A mapping from each stage to the information about that stage.
            "a_key": { # Contains information about how a particular
                # google.dataflow.v1beta3.Step will be executed.
              "stepName": [ # The steps associated with the execution stage.
                  # Note that stages may have several steps, and that a given step
                  # might be run by more than one stage.
                "A String",
              ],
            },
          },
        },
        "type": "A String", # The type of Cloud Dataflow job.
        "createTime": "A String", # The timestamp when the job was initially created. Immutable and set by the
            # Cloud Dataflow service.
        "tempFiles": [ # A set of files the system should be aware of that are used
            # for temporary storage. These temporary files will be
            # removed on job completion.
            # No duplicates are allowed.
            # No file patterns are supported.
            #
            # The supported files are:
            #
            # Google Cloud Storage:
            #
            #    storage.googleapis.com/{bucket}/{object}
            #    bucket.storage.googleapis.com/{object}
          "A String",
        ],
        "id": "A String", # The unique ID of this job.
            #
            # This field is set by the Cloud Dataflow service when the Job is
            # created, and is immutable for the life of the job.
        "requestedState": "A String", # The job's requested state.
            #
            # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
            # `JOB_STATE_RUNNING` states, by setting requested_state.  `UpdateJob` may
            # also be used to directly set a job's requested state to
            # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
            # job if it has not already reached a terminal state.
        "replaceJobId": "A String", # If this job is an update of an existing job, this field is the job ID
            # of the job it replaced.
            #
            # When sending a `CreateJobRequest`, you can update a job by specifying it
            # here. The job named here is stopped, and its intermediate state is
            # transferred to this job.
        "createdFromSnapshotId": "A String", # If this is specified, the job's initial state is populated from the given
            # snapshot.
        "currentState": "A String", # The current state of the job.
            #
            # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
            # specified.
            #
            # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
            # terminal state. After a job has reached a terminal state, no
            # further state updates may be made.
            #
            # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
        "name": "A String", # The user-specified Cloud Dataflow job name.
            #
            # Only one Job with a given name may exist in a project at any
            # given time. If a caller attempts to create a Job with the same
            # name as an already-existing Job, the attempt returns the
            # existing Job.
            #
            # The name must match the regular expression
            # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
        "currentStateTime": "A String", # The timestamp associated with the current state.
      },
  }
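
Example (a minimal sketch of handling the response from the launch call shown
earlier; the field names follow the schema above, but treat this as an
assumption-laden illustration rather than canonical error handling):

  job = response.get("job")
  if job is None:
      # A validateOnly=True request (a dry run) returns no launched job.
      print("Request validated; no job was launched.")
  else:
      # The returned job ID can be used with projects.locations.jobs.get
      # to poll for state transitions such as JOB_STATE_RUNNING.
      print("Launched job %s (id=%s) in state %s"
            % (job.get("name"), job.get("id"), job.get("currentState")))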