<html><body>
<style>

body, h1, h2, h3, div, span, p, pre, a {
  margin: 0;
  padding: 0;
  border: 0;
  font-weight: inherit;
  font-style: inherit;
  font-size: 100%;
  font-family: inherit;
  vertical-align: baseline;
}

body {
  font-size: 13px;
  padding: 1em;
}

h1 {
  font-size: 26px;
  margin-bottom: 1em;
}

h2 {
  font-size: 24px;
  margin-bottom: 1em;
}

h3 {
  font-size: 20px;
  margin-bottom: 1em;
  margin-top: 1em;
}

pre, code {
  line-height: 1.5;
  font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}

pre {
  margin-top: 0.5em;
}

h1, h2, h3, p {
  font-family: Arial, sans-serif;
}

h1, h2, h3 {
  border-bottom: solid #CCC 1px;
}

.toc_element {
  margin-top: 0.5em;
}

.firstline {
  margin-left: 2em;
}

.method {
  margin-top: 1em;
  border: solid 1px #CCC;
  padding: 1em;
  background: #EEE;
}

.details {
  font-weight: bold;
  font-size: 14px;
}

</style>

<h1><a href="dataflow_v1b3.html">Dataflow API</a> . <a href="dataflow_v1b3.projects.html">projects</a> . <a href="dataflow_v1b3.projects.locations.html">locations</a> . <a href="dataflow_v1b3.projects.locations.templates.html">templates</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
  <code><a href="#create">create(projectId, location, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a Cloud Dataflow job from a template.</p>
<p class="toc_element">
  <code><a href="#get">get(projectId, location, view=None, gcsPath=None, x__xgafv=None)</a></code></p>
<p class="firstline">Get the template associated with a template.</p>
<p class="toc_element">
  <code><a href="#launch">launch(projectId, location, body=None, gcsPath=None, dynamicTemplate_gcsPath=None, dynamicTemplate_stagingLocation=None, validateOnly=None, x__xgafv=None)</a></code></p>
<p class="firstline">Launch a template.</p>
<h3>Method Details</h3>
<div class="method">
    <code class="details" id="create">create(projectId, location, body=None, x__xgafv=None)</code>
  <pre>Creates a Cloud Dataflow job from a template.

Args:
  projectId: string, Required. The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to
which to direct the request. (required)
  body: object, The request body.
    The object takes the form of:

{ # A request to create a Cloud Dataflow job from a template.
    &quot;environment&quot;: { # The environment values to set at runtime. # The runtime environment for the job.
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;numWorkers&quot;: 42, # The initial number of Google Compute Engine instances for the job.
      &quot;zone&quot;: &quot;A String&quot;, # The Compute Engine [availability
          # zone](https://cloud.google.com/compute/docs/regions-zones/regions-zones)
          # for launching worker instances to run your pipeline.
          # In the future, worker_zone will take precedence.
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
          # If both `worker_zone` and `zone` are set, `worker_zone` takes precedence.
      &quot;additionalUserLabels&quot;: { # Additional user labels to be specified for the job.
          # Keys and values should follow the restrictions specified in the [labeling
          # restrictions](https://cloud.google.com/compute/docs/labeling-resources#restrictions)
          # page.
        &quot;a_key&quot;: &quot;A String&quot;,
      },
      &quot;additionalExperiments&quot;: [ # Additional experiment flags for the job.
        &quot;A String&quot;,
      ],
      &quot;maxWorkers&quot;: 42, # The maximum number of Google Compute Engine instances to be made
          # available to your pipeline during execution, from 1 to 1000.
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # The email address of the service account to run the job as.
      &quot;machineType&quot;: &quot;A String&quot;, # The machine type to use for the job. Defaults to the value from the
          # template if not specified.
      &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
          # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
      &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
      &quot;kmsKeyName&quot;: &quot;A String&quot;, # Optional. Name for the Cloud KMS key for the job.
          # Key format is:
          # projects/&lt;project&gt;/locations/&lt;location&gt;/keyRings/&lt;keyring&gt;/cryptoKeys/&lt;key&gt;
      &quot;bypassTempDirValidation&quot;: True or False, # Whether to bypass the safety checks for the job&#x27;s temporary directory.
          # Use with caution.
      &quot;tempLocation&quot;: &quot;A String&quot;, # The Cloud Storage path to use for temporary files.
          # Must be a valid Cloud Storage URL, beginning with `gs://`.
      &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
          # the service will use the network &quot;default&quot;.
    },
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to
        # which to direct the request.
    &quot;parameters&quot;: { # The runtime parameters to pass to the job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;jobName&quot;: &quot;A String&quot;, # Required. The job name to use for the created job.
    &quot;gcsPath&quot;: &quot;A String&quot;, # Required. A Cloud Storage path to the template from which to
        # create the job.
        # Must be a valid Cloud Storage URL, beginning with `gs://`.
  }

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format
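
Example: a minimal sketch, in Python, of assembling a request body that
respects the constraints stated in the field comments above: jobName and
gcsPath are required, Cloud Storage paths begin with gs://, workerRegion and
workerZone are mutually exclusive, and maxWorkers ranges from 1 to 1000. The
helper name build_create_request, the bucket and job names, and the
inputFile parameter are illustrative assumptions, not part of the Dataflow
API; the service performs its own validation.

```python
def build_create_request(job_name, gcs_path, parameters=None, environment=None):
    """Return a dict shaped like the create() request body described above.

    Hypothetical helper for illustration only; it checks just the
    constraints that the field comments state explicitly.
    """
    if not job_name:
        raise ValueError("jobName is required")
    if not gcs_path or not gcs_path.startswith("gs://"):
        raise ValueError("gcsPath must be a Cloud Storage URL beginning with gs://")

    env = dict(environment or {})
    # workerRegion and workerZone are documented as mutually exclusive.
    if env.get("workerRegion") and env.get("workerZone"):
        raise ValueError("workerRegion and workerZone are mutually exclusive")
    temp_location = env.get("tempLocation")
    if temp_location is not None and not temp_location.startswith("gs://"):
        raise ValueError("tempLocation must begin with gs://")
    # maxWorkers is documented as ranging from 1 to 1000.
    max_workers = env.get("maxWorkers")
    if max_workers is not None and max_workers not in range(1, 1001):
        raise ValueError("maxWorkers must be between 1 and 1000")

    body = {"jobName": job_name, "gcsPath": gcs_path}
    if parameters:
        body["parameters"] = dict(parameters)
    if env:
        body["environment"] = env
    return body


# Placeholder values; substitute a real template path and parameters.
body = build_create_request(
    "example-job",
    "gs://example-bucket/templates/example",
    parameters={"inputFile": "gs://example-bucket/input.txt"},
    environment={"workerRegion": "us-west1", "maxWorkers": 10},
)
```

The resulting dict is intended to be passed as the body argument of
create(projectId, location, body=body) described above.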

Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language agnostic and platform
          # independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
              # select a default set of packages which are useful to worker
              # harnesses written in a particular language.
          &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
              # the service will use the network &quot;default&quot;.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}
                  #   bucket.storage.googleapis.com/
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine Disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines an SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
                  # container instance with this image. If false (or unset) recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that the Dataflow service may choose to override this property
                  # if needed.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                # temporary storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
            &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
            &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                # console.
            &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
            &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
              &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                  #
                  # When workers access Google Cloud APIs, they logically do so via
                  # relative URLs. If this field is specified, it supplies the base
                  # URL to use for resolving these relative URLs. The normative
                  # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                  # Locators&quot;.
                  #
                  # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
              &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
              &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                  # &quot;dataflow/v1b3/projects&quot;.
              &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                  # &quot;shuffle/v1beta1&quot;.
              &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
              &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                  # storage.
                  #
                  # The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}/{object}
                  #   bucket.storage.googleapis.com/{object}
            },
            &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
            &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;root&quot;.
          },
          &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
            &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
            &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
          },
          &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
            &quot;a_key&quot;: &quot;A String&quot;,
          },
        },
      ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          #   bigquery.googleapis.com/{dataset}
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
        # ListJob response and Job SUMMARY view. # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        },
      ],
    },
499 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
500 # snapshot.
501 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
502 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
      &quot;pipelineDescription&quot;: { # Preliminary field: The format of this data may change at any time.
          # A description of the user pipeline and stages through which it is executed.
          # Created by Cloud Dataflow service. Only retrieved with
          # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
          #
          # A descriptive representation of the submitted pipeline as well as the executed
          # form. This data is provided by the Dataflow service for ease of visualizing
          # the pipeline and interpreting Dataflow provided metrics.
509 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
510 { # Description of the composing transforms, names/ids, and input/outputs of a
511 # stage of execution. Some composing transforms and sources may have been
512 # generated by the Dataflow service during execution planning.
513 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
514 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
515 { # Description of a transform executed as part of an execution stage.
516 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
517 # most closely associated.
518 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
519 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
520 },
521 ],
522 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
523 { # Description of an interstitial value between transforms in an execution
524 # stage.
525 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
526 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
527 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
528 # source is most closely associated.
529 },
530 ],
            &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
532 &quot;outputSource&quot;: [ # Output sources for this stage.
533 { # Description of an input or output of an execution stage.
534 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
535 # source is most closely associated.
536 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
537 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
538 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
539 },
540 ],
541 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
542 &quot;inputSource&quot;: [ # Input sources for this stage.
543 { # Description of an input or output of an execution stage.
544 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
545 # source is most closely associated.
546 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
547 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
548 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
549 },
550 ],
551 },
552 ],
553 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
554 { # Description of the type, names/ids, and input/outputs for a transform.
555 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
556 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
557 &quot;A String&quot;,
558 ],
559 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
560 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
561 &quot;displayData&quot;: [ # Transform-specific display data.
562 { # Data provided with a pipeline or transform to provide descriptive info.
563 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
564 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
565 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
566 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
567 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
568 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
569 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
570 # language namespace (i.e. python module) which defines the display data.
571 # This allows a dax monitoring system to specially handle the data
572 # and perform custom rendering.
573 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
574 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
575 # This is intended to be used as a label for the display data
576 # when viewed in a dax monitoring system.
577 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
578 # For example a java_class_name_value of com.mypackage.MyDoFn
579 # will be stored with MyDoFn as the short_str_value and
580 # com.mypackage.MyDoFn as the java_class_name value.
581 # short_str_value can be displayed and java_class_name_value
582 # will be displayed as a tooltip.
583 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
584 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
585 },
586 ],
587 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
588 &quot;A String&quot;,
589 ],
590 },
591 ],
592 &quot;displayData&quot;: [ # Pipeline level display data.
593 { # Data provided with a pipeline or transform to provide descriptive info.
594 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
595 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
596 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
597 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
598 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
599 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
600 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
601 # language namespace (i.e. python module) which defines the display data.
602 # This allows a dax monitoring system to specially handle the data
603 # and perform custom rendering.
604 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
605 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
606 # This is intended to be used as a label for the display data
607 # when viewed in a dax monitoring system.
608 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
609 # For example a java_class_name_value of com.mypackage.MyDoFn
610 # will be stored with MyDoFn as the short_str_value and
611 # com.mypackage.MyDoFn as the java_class_name value.
612 # short_str_value can be displayed and java_class_name_value
613 # will be displayed as a tooltip.
614 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
615 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
616 },
617 ],
618 },
619 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
620 # of the job it replaced.
621 #
622 # When sending a `CreateJobRequest`, you can update a job by specifying it
623 # here. The job named here is stopped, and its intermediate state is
624 # transferred to this job.
      &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
          # for temporary storage. These temporary files will be
          # removed on job completion.
          # No duplicates are allowed.
          # No file patterns are supported.
          #
          # The supported files are:
          #
          # Google Cloud Storage:
          #
          #    storage.googleapis.com/{bucket}/{object}
          #    bucket.storage.googleapis.com/{object}
        &quot;A String&quot;,
      ],
      &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
          #
          # Only one Job with a given name may exist in a project at any
          # given time. If a caller attempts to create a Job with the same
          # name as an already-existing Job, the attempt returns the
          # existing Job.
          #
          # The name must match the regular expression
          # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
      &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
          #
          # The top-level steps that constitute the entire job.
        { # Defines a particular step within a Cloud Dataflow job.
            #
            # A job consists of multiple steps, each of which performs some
            # specific operation as part of the overall job. Data is typically
            # passed from one step to another as part of the job.
            #
            # Here&#x27;s an example of a sequence of steps which together implement a
            # Map-Reduce job:
            #
            #   * Read a collection of data from some source, parsing the
            #     collection&#x27;s elements.
            #
            #   * Validate the elements.
            #
            #   * Apply a user-defined function to map each element to some value
            #     and extract an element-specific key value.
            #
            #   * Group elements with the same key into a single element with
            #     that key, transforming a multiply-keyed collection into a
            #     uniquely-keyed collection.
            #
            #   * Write the elements out to some data sink.
            #
            # Note that the Cloud Dataflow service may be used to run many different
            # types of jobs, not just Map-Reduce.
          &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
              # step with respect to all other steps in the Cloud Dataflow job.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
          &quot;properties&quot;: { # Named properties associated with the step. Each kind of
              # predefined step has its own required set of properties.
              # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
        },
      ],
      &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
          # `JOB_STATE_UPDATED`), this field contains the ID of that job.
      &quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
          # executed that isn&#x27;t contained in the submitted job.
        &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
          &quot;a_key&quot;: { # Contains information about how a particular
              # google.dataflow.v1beta3.Step will be executed.
            &quot;stepName&quot;: [ # The steps associated with the execution stage.
                # Note that stages may have several steps, and that a given step
                # might be run by more than one stage.
              &quot;A String&quot;,
            ],
          },
        },
      },
      &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
          #
          # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
          # specified.
          #
          # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
          # terminal state. After a job has reached a terminal state, no
          # further state updates may be made.
          #
          # This field may be mutated by the Cloud Dataflow service;
          # callers cannot mutate it.
      &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
          # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
          # contains this job.
      &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
          # Flexible resource scheduling jobs are started with some delay after job
          # creation, so start_time is unset before start and is updated when the
          # job is started by the Cloud Dataflow service. For other jobs, start_time
          # always equals create_time and is immutable and set by the Cloud Dataflow
          # service.
      &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
      &quot;labels&quot;: { # User-defined labels for this job.
          #
          # The labels map can contain no more than 64 entries. Entries of the labels
          # map are UTF8 strings that comply with the following restrictions:
          #
          # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
          # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
          # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
          #   size.
        &quot;a_key&quot;: &quot;A String&quot;,
      },
      &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
          # Cloud Dataflow service.
      &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
          #
          # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
          # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
          # also be used to directly set a job&#x27;s requested state to
          # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
          # job if it has not already reached a terminal state.
    }</pre>
</div>
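<p>The job <code>name</code> field above must match the regular expression <code>[a-z]([-a-z0-9]{0,38}[a-z0-9])?</code>. A minimal sketch of a client-side check follows. The helper name <code>is_valid_job_name</code> is hypothetical and not part of the API, and the service may apply further checks of its own.</p>

```python
import re

# Pattern quoted from the `name` field description of the Job resource.
JOB_NAME_PATTERN = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")


def is_valid_job_name(name):
    """Return True if `name` matches the documented job-name pattern in full.

    Hypothetical helper for illustration only; it is not part of the client library.
    """
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

<p>Using <code>fullmatch</code> rather than <code>match</code> means the whole string has to satisfy the pattern, so names of up to 40 characters pass and a trailing hyphen is rejected.</p>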

<div class="method">
    <code class="details" id="get">get(projectId, location, view=None, gcsPath=None, x__xgafv=None)</code>
  <pre>Get the template stored at the given Cloud Storage path.

Args:
  projectId: string, Required. The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to
which to direct the request. (required)
  view: string, The view to retrieve. Defaults to METADATA_ONLY.
  gcsPath: string, Required. A Cloud Storage path to the template from which to
create the job.
Must be a valid Cloud Storage URL, beginning with &#x27;gs://&#x27;.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # The response to a GetTemplate request.
      &quot;status&quot;: { # The status of the get template request. Any problems with the
          # request will be indicated in the error_details.
          #
          # The `Status` type defines a logical error model that is suitable for
          # different programming environments, including REST APIs and RPC APIs. It is
          # used by [gRPC](https://github.com/grpc). Each `Status` message contains
          # three pieces of data: error code, error message, and error details.
          #
          # You can find out more about this error model and how to work with it in the
          # [API Design Guide](https://cloud.google.com/apis/design/errors).
        &quot;message&quot;: &quot;A String&quot;, # A developer-facing error message, which should be in English. Any
            # user-facing error message should be localized and sent in the
            # google.rpc.Status.details field, or localized by the client.
        &quot;details&quot;: [ # A list of messages that carry the error details. There is a common set of
            # message types for APIs to use.
          {
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
        ],
        &quot;code&quot;: 42, # The status code, which should be an enum value of google.rpc.Code.
      },
      &quot;templateType&quot;: &quot;A String&quot;, # Template Type.
      &quot;metadata&quot;: { # Metadata describing a template. # The template metadata describing the template name, available
          # parameters, etc.
        &quot;name&quot;: &quot;A String&quot;, # Required. The name of the template.
        &quot;parameters&quot;: [ # The parameters for the template.
          { # Metadata for a specific parameter.
            &quot;label&quot;: &quot;A String&quot;, # Required. The label to display for the parameter.
            &quot;paramType&quot;: &quot;A String&quot;, # Optional. The type of the parameter.
                # Used for selecting input picker.
            &quot;helpText&quot;: &quot;A String&quot;, # Required. The help text to display for the parameter.
            &quot;name&quot;: &quot;A String&quot;, # Required. The name of the parameter.
            &quot;regexes&quot;: [ # Optional. Regexes that the parameter must match.
              &quot;A String&quot;,
            ],
            &quot;isOptional&quot;: True or False, # Optional. Whether the parameter is optional. Defaults to false.
          },
        ],
        &quot;description&quot;: &quot;A String&quot;, # Optional. A description of the template.
      },
      &quot;runtimeMetadata&quot;: { # RuntimeMetadata describing a runtime environment. # Describes the runtime metadata with SDKInfo and available parameters.
        &quot;sdkInfo&quot;: { # SDK Information. # SDK Info for the template.
          &quot;language&quot;: &quot;A String&quot;, # Required. The SDK Language.
          &quot;version&quot;: &quot;A String&quot;, # Optional. The SDK version.
        },
        &quot;parameters&quot;: [ # The parameters for the template.
          { # Metadata for a specific parameter.
            &quot;label&quot;: &quot;A String&quot;, # Required. The label to display for the parameter.
            &quot;paramType&quot;: &quot;A String&quot;, # Optional. The type of the parameter.
                # Used for selecting input picker.
            &quot;helpText&quot;: &quot;A String&quot;, # Required. The help text to display for the parameter.
            &quot;name&quot;: &quot;A String&quot;, # Required. The name of the parameter.
            &quot;regexes&quot;: [ # Optional. Regexes that the parameter must match.
              &quot;A String&quot;,
            ],
            &quot;isOptional&quot;: True or False, # Optional. Whether the parameter is optional. Defaults to false.
          },
        ],
      },
    }</pre>
</div>
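<p>The <code>metadata.parameters</code> list returned by <code>get</code> marks each parameter as optional or not (<code>isOptional</code> defaults to false). A rough sketch of how a caller might check a <code>parameters</code> map against it before calling <code>launch</code> is shown below. The function name and the sample parameter names are assumptions made for illustration; only the field names <code>metadata</code>, <code>parameters</code>, <code>name</code> and <code>isOptional</code> come from the response shape documented above.</p>

```python
def missing_required_parameters(template_response, supplied):
    """Return names of non-optional template parameters absent from `supplied`.

    `template_response` is a dict shaped like the GetTemplate response;
    `supplied` is the `parameters` map intended for a launch request.
    Hypothetical helper, not part of the client library.
    """
    parameters = template_response.get("metadata", {}).get("parameters", [])
    # isOptional defaults to false, so a parameter without the flag is required.
    required = [p["name"] for p in parameters if not p.get("isOptional", False)]
    return [name for name in required if name not in supplied]


# Illustrative response fragment; the parameter names are made up.
sample_response = {
    "metadata": {
        "parameters": [
            {"name": "inputFile"},
            {"name": "output", "isOptional": False},
            {"name": "shards", "isOptional": True},
        ]
    }
}
missing = missing_required_parameters(sample_response, {"inputFile": "gs://bucket/in.txt"})
```

<p>This only checks for presence. It does not apply the <code>regexes</code> listed for each parameter, which the service may still enforce.</p>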

<div class="method">
    <code class="details" id="launch">launch(projectId, location, body=None, gcsPath=None, dynamicTemplate_gcsPath=None, dynamicTemplate_stagingLocation=None, validateOnly=None, x__xgafv=None)</code>
  <pre>Launch a template.

Args:
  projectId: string, Required. The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to
which to direct the request. (required)
  body: object, The request body.
    The object takes the form of:

{ # Parameters to provide to the template being launched.
    &quot;transformNameMapping&quot;: { # Only applicable when updating a pipeline. Map of transform name prefixes of
        # the job to be replaced to the corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # The environment values to set at runtime. # The runtime environment for the job.
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;numWorkers&quot;: 42, # The initial number of Google Compute Engine instances for the job.
      &quot;zone&quot;: &quot;A String&quot;, # The Compute Engine [availability
          # zone](https://cloud.google.com/compute/docs/regions-zones/regions-zones)
          # for launching worker instances to run your pipeline.
          # In the future, worker_zone will take precedence.
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
          # If both `worker_zone` and `zone` are set, `worker_zone` takes precedence.
      &quot;additionalUserLabels&quot;: { # Additional user labels to be specified for the job.
          # Keys and values should follow the restrictions specified in the [labeling
          # restrictions](https://cloud.google.com/compute/docs/labeling-resources#restrictions)
          # page.
        &quot;a_key&quot;: &quot;A String&quot;,
      },
      &quot;additionalExperiments&quot;: [ # Additional experiment flags for the job.
        &quot;A String&quot;,
      ],
      &quot;maxWorkers&quot;: 42, # The maximum number of Google Compute Engine instances to be made
          # available to your pipeline during execution, from 1 to 1000.
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # The email address of the service account to run the job as.
      &quot;machineType&quot;: &quot;A String&quot;, # The machine type to use for the job. Defaults to the value from the
          # template if not specified.
      &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
          # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
      &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
      &quot;kmsKeyName&quot;: &quot;A String&quot;, # Optional. Name for the Cloud KMS key for the job.
          # Key format is:
          # projects/&lt;project&gt;/locations/&lt;location&gt;/keyRings/&lt;keyring&gt;/cryptoKeys/&lt;key&gt;
      &quot;bypassTempDirValidation&quot;: True or False, # Whether to bypass the safety checks for the job&#x27;s temporary directory.
          # Use with caution.
      &quot;tempLocation&quot;: &quot;A String&quot;, # The Cloud Storage path to use for temporary files.
          # Must be a valid Cloud Storage URL, beginning with `gs://`.
      &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
          # the service will use the network &quot;default&quot;.
    },
    &quot;update&quot;: True or False, # If set, replace the existing pipeline with the name specified by jobName
        # with this pipeline, preserving state.
    &quot;parameters&quot;: { # The runtime parameters to pass to the job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;jobName&quot;: &quot;A String&quot;, # Required. The job name to use for the created job.
  }

  gcsPath: string, A Cloud Storage path to the template from which to create
the job.
Must be a valid Cloud Storage URL, beginning with &#x27;gs://&#x27;.
  dynamicTemplate_gcsPath: string, Path to dynamic template spec file on GCS.
The file must be a JSON-serialized DynamicTemplateFieSpec object.
  dynamicTemplate_stagingLocation: string, Cloud Storage path for staging dependencies.
Must be a valid Cloud Storage URL, beginning with `gs://`.
  validateOnly: boolean, If true, the request is validated but not actually executed.
Defaults to false.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Response to the request to launch a template.
      &quot;job&quot;: { # Defines a job to be run by the Cloud Dataflow service. # The job that was launched, if the request was not a dry run and
          # the job was successfully launched.
        &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
            # If this field is set, the service will ensure its uniqueness.
            # The request to create a job will fail if the service has knowledge of a
            # previously submitted job with the same client&#x27;s ID and job name.
            # The caller may use this field to ensure idempotence of job
            # creation across retried attempts to create a job.
            # By default, the field is empty and, in that case, the service ignores it.
        &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
            #
            # This field is set by the Cloud Dataflow service when the Job is
            # created, and is immutable for the life of the job.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
        &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
            # corresponding name prefixes of the new job.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
          &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
              # options are passed through the service and are used to recreate the
              # SDK pipeline options on the worker in a language agnostic and platform
              # independent way.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
          &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
              # specified in order for the job to have workers.
            { # Describes one particular pool of Cloud Dataflow workers to be
                # instantiated by the Cloud Dataflow service in order to perform the
                # computations required by a job. Note that a workflow job may use
                # multiple pools, in order to match the various computational
                # requirements of the various stages of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -0700947 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
948 # select a default set of packages which are useful to worker
949 # harnesses written in a particular language.
950 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
951 # the service will use the network &quot;default&quot;.
952 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
Dan O'Mearadd494642020-05-01 07:42:23 -0700953 # will attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -0700954 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
955 # execute the job. If zero or unspecified, the service will
956 # attempt to choose a reasonable default.
957 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
Dan O'Mearadd494642020-05-01 07:42:23 -0700958 # service will choose a number of threads (according to the number of cores
959 # on the selected machine type for batch, or 1 by convention for streaming).
Bu Sun Kim65020912020-05-20 12:08:20 -0700960 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
961 &quot;packages&quot;: [ # Packages to be installed on workers.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700962 { # The packages that must be installed in order for a worker to run the
963 # steps of the Cloud Dataflow job that will be assigned to its worker
964 # pool.
965 #
966 # This is the mechanism by which the Cloud Dataflow SDK causes code to
967 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
Bu Sun Kim65020912020-05-20 12:08:20 -0700968 # might use this to install jars containing the user&#x27;s code and all of the
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700969 # various dependencies (libraries, data files, etc.) required in order
970 # for that code to run.
Bu Sun Kim65020912020-05-20 12:08:20 -0700971 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700972 #
973 # Google Cloud Storage:
974 #
975 # storage.googleapis.com/{bucket}
976 # bucket.storage.googleapis.com/
Bu Sun Kim65020912020-05-20 12:08:20 -0700977 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700978 },
979 ],
Bu Sun Kim65020912020-05-20 12:08:20 -0700980 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -0400981 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
982 # `TEARDOWN_NEVER`.
983 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
984 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
985 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
986 # down.
987 #
988 # If the workers are not torn down by the service, they will
989 # continue to run and use Google Compute Engine VM resources in the
Bu Sun Kim65020912020-05-20 12:08:20 -0700990 # user&#x27;s project until they are explicitly terminated by the user.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -0400991 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
992 # policy except for small, manually supervised test jobs.
993 #
994 # If unknown or unspecified, the service will attempt to choose a reasonable
995 # default.
Bu Sun Kim65020912020-05-20 12:08:20 -0700996 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
997 # Compute Engine API.
998 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
999 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
1000 },
1001 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
Dan O'Mearadd494642020-05-01 07:42:23 -07001002 # attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -07001003 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
1004 # harness, residing in Google Container Registry.
1005 #
1006 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
1007 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001008 # attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -07001009 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
1010 # service will attempt to choose a reasonable default.
1011 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
1012 # are supported.
1013 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001014 { # Describes the data disk used by a workflow job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001015 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001016 # attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -07001017 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001018 # must be a disk type appropriate to the project and zone in which
1019 # the workers will run. If unknown or unspecified, the service
1020 # will attempt to choose a reasonable default.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001021 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001022 # For example, the standard persistent disk type is a resource name
Bu Sun Kim65020912020-05-20 12:08:20 -07001023 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
1024 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001025 # actual valid values are defined by the Google Compute Engine API,
1026 # not by the Cloud Dataflow API; consult the Google Compute Engine
1027 # documentation for more information about determining the set of
1028 # available disk types for a particular project and zone.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001029 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001030 # Google Compute Engine Disk types are local to a particular
1031 # project in a particular zone, and so the resource name will
1032 # typically look something like this:
1033 #
1034 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
Bu Sun Kim65020912020-05-20 12:08:20 -07001035 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001036 },
1037 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001038 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
Dan O'Mearadd494642020-05-01 07:42:23 -07001039 # only be set in the Fn API path. For non-cross-language pipelines this
1040 # should have only one entry. Cross-language pipelines will have two or more
1041 # entries.
1042 { # Defines an SDK harness container for executing Dataflow pipelines.
Bu Sun Kim65020912020-05-20 12:08:20 -07001043 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
1044 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
Dan O'Mearadd494642020-05-01 07:42:23 -07001045 # container instance with this image. If false (or unset), recommends using
1046 # more than one core per SDK container instance with this image for
1047 # efficiency. Note that the Dataflow service may choose to override this property
1048 # if needed.
1049 },
1050 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001051 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
1052 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
1053 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
1054 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
1055 # using the standard Dataflow task runner. Users should ignore
1056 # this field.
1057 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
1058 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
1059 # taskrunner; e.g. &quot;wheel&quot;.
1060 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
1061 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
1062 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
1063 # access the Cloud Dataflow API.
1064 &quot;A String&quot;,
1065 ],
1066 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;
1067 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
1068 # will not be uploaded.
1069 #
1070 # The supported resource type is:
1071 #
1072 # Google Cloud Storage:
1073 # storage.googleapis.com/{bucket}/{object}
1074 # bucket.storage.googleapis.com/{object}
1075 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
1076 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
1077 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
1078 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
1079 # temporary storage.
1080 #
1081 # The supported resource type is:
1082 #
1083 # Google Cloud Storage:
1084 # storage.googleapis.com/{bucket}/{object}
1085 # bucket.storage.googleapis.com/{object}
1086 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
1087 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
1088 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
1089 #
1090 # When workers access Google Cloud APIs, they logically do so via
1091 # relative URLs. If this field is specified, it supplies the base
1092 # URL to use for resolving these relative URLs. The normative
1093 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1094 # Locators&quot;.
1095 #
1096 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1097 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
1098 # console.
1099 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
1100 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
1101 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
1102 #
1103 # When workers access Google Cloud APIs, they logically do so via
1104 # relative URLs. If this field is specified, it supplies the base
1105 # URL to use for resolving these relative URLs. The normative
1106 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1107 # Locators&quot;.
1108 #
1109 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1110 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
1111 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
1112 # &quot;dataflow/v1b3/projects&quot;.
1113 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
1114 # &quot;shuffle/v1beta1&quot;.
1115 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
1116 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1117 # storage.
1118 #
1119 # The supported resource type is:
1120 #
1121 # Google Cloud Storage:
1122 #
1123 # storage.googleapis.com/{bucket}/{object}
1124 # bucket.storage.googleapis.com/{object}
1125 },
1126 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
1127 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
1128 # taskrunner; e.g. &quot;root&quot;.
1129 },
1130 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
1131 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
1132 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
1133 },
1134 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
1135 &quot;a_key&quot;: &quot;A String&quot;,
1136 },
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001137 },
1138 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001139 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
1140 # related tables are stored.
1141 #
1142 # The supported resource type is:
1143 #
1144 # Google BigQuery:
1145 # bigquery.googleapis.com/{dataset}
1146 &quot;internalExperiments&quot;: { # Experimental settings.
1147 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
1148 },
1149 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
1150 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
1151 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
1152 # with worker_zone. If neither worker_region nor worker_zone is specified,
1153 # default to the control plane&#x27;s region.
1154 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
1155 # at rest, AKA a Customer Managed Encryption Key (CMEK).
1156 #
1157 # Format:
1158 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
1159 &quot;userAgent&quot;: { # A description of the process that generated the request.
1160 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1161 },
1162 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
1163 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
1164 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
1165 # with worker_region. If neither worker_region nor worker_zone is specified,
1166 # a zone in the control plane&#x27;s region is chosen based on available capacity.
1167 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -07001168 # unspecified, the service will attempt to choose a reasonable
1169 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -07001170 # e.g. &quot;compute.googleapis.com&quot;.
1171 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1172 # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001173 # this resource prefix, where {JOBNAME} is the value of the
1174 # job_name field. The resulting bucket and object prefix is used
1175 # as the prefix of the resources used to store temporary data
1176 # needed during the job execution. NOTE: This will override the
1177 # value in taskrunner_settings.
1178 # The supported resource type is:
1179 #
1180 # Google Cloud Storage:
1181 #
1182 # storage.googleapis.com/{bucket}/{object}
1183 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07001184 &quot;experiments&quot;: [ # The list of experiments to enable.
1185 &quot;A String&quot;,
1186 ],
1187 &quot;version&quot;: { # A structure describing which components and their versions of the service
1188 # are required in order to run the job.
1189 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1190 },
1191 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001192 },
Bu Sun Kim65020912020-05-20 12:08:20 -07001193 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
1194 # callers cannot mutate it.
1195 { # A message describing the state of a particular execution stage.
1196 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
1197 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
1198 &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
1199 },
1200 ],
1201 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
1202 # ListJob response and Job SUMMARY view.
1203 # This field is populated by the Dataflow service to support filtering jobs
1204 # by the metadata values provided here. Populated for ListJobs and all GetJob views SUMMARY and higher.
1205 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
1206 { # Metadata for a BigTable connector used by the job.
1207 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
1208 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1209 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1210 },
1211 ],
1212 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
1213 { # Metadata for a Spanner connector used by the job.
1214 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
1215 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1216 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1217 },
1218 ],
1219 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
1220 { # Metadata for a Datastore connector used by the job.
1221 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1222 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
1223 },
1224 ],
1225 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
1226 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
1227 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
1228 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
1229 },
1230 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
1231 { # Metadata for a BigQuery connector used by the job.
1232 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
1233 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
1234 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
1235 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
1236 },
1237 ],
1238 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
1239 { # Metadata for a File connector used by the job.
1240 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
1241 },
1242 ],
1243 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
1244 { # Metadata for a PubSub connector used by the job.
1245 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
1246 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
1247 },
1248 ],
1249 },
1250 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
1251 # snapshot.
1252 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
1253 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
1254 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed
1255 # form. This data is provided by the Dataflow service for ease of visualizing
1256 # the pipeline and interpreting Dataflow provided metrics.
1257 # Preliminary field: The format of this data may change at any time.
1258 # A description of the user pipeline and stages through which it is executed.
1259 # Created by Cloud Dataflow service. Only retrieved with JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
1260 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
1261 { # Description of the composing transforms, names/ids, and input/outputs of a
1262 # stage of execution. Some composing transforms and sources may have been
1263 # generated by the Dataflow service during execution planning.
1264 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
1265 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
1266 { # Description of a transform executed as part of an execution stage.
1267 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
1268 # most closely associated.
1269 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this transform.
1270 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1271 },
1272 ],
1273 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
1274 { # Description of an interstitial value between transforms in an execution
1275 # stage.
1276 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1277 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1278 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1279 # source is most closely associated.
1280 },
1281 ],
1282 &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
1283 &quot;outputSource&quot;: [ # Output sources for this stage.
1284 { # Description of an input or output of an execution stage.
1285 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1286 # source is most closely associated.
1287 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1288 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1289 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1290 },
1291 ],
1292 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
1293 &quot;inputSource&quot;: [ # Input sources for this stage.
1294 { # Description of an input or output of an execution stage.
1295 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1296 # source is most closely associated.
1297 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1298 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1299 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1300 },
1301 ],
1302 },
1303 ],
1304 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
1305 { # Description of the type, names/ids, and input/outputs for a transform.
1306 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
1307 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
1308 &quot;A String&quot;,
1309 ],
1310 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
1311 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
1312 &quot;displayData&quot;: [ # Transform-specific display data.
1313 { # Data provided with a pipeline or transform to provide descriptive info.
1314 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1315 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1316 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1317 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
1318 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1319 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1320 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1321 # language namespace (e.g. Python module) which defines the display data.
1322 # This allows a dax monitoring system to specially handle the data
1323 # and perform custom rendering.
1324 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1325 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1326 # This is intended to be used as a label for the display data
1327 # when viewed in a dax monitoring system.
1328 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1329 # For example a java_class_name_value of com.mypackage.MyDoFn
1330 # will be stored with MyDoFn as the short_str_value and
1331 # com.mypackage.MyDoFn as the java_class_name_value.
1332 # short_str_value can be displayed and java_class_name_value
1333 # will be displayed as a tooltip.
1334 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1335 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1336 },
1337 ],
1338 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
1339 &quot;A String&quot;,
1340 ],
1341 },
1342 ],
1343 &quot;displayData&quot;: [ # Pipeline level display data.
1344 { # Data provided with a pipeline or transform to provide descriptive info.
1345 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1346 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1347 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1348 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
1349 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1350 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1351 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1352 # language namespace (e.g. Python module) which defines the display data.
1353 # This allows a dax monitoring system to specially handle the data
1354 # and perform custom rendering.
1355 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1356 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1357 # This is intended to be used as a label for the display data
1358 # when viewed in a dax monitoring system.
1359 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1360 # For example a java_class_name_value of com.mypackage.MyDoFn
1361 # will be stored with MyDoFn as the short_str_value and
1362 # com.mypackage.MyDoFn as the java_class_name_value.
1363 # short_str_value can be displayed and java_class_name_value
1364 # will be displayed as a tooltip.
1365 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1366 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1367 },
1368 ],
1369 },
1370 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
1371 # of the job it replaced.
1372 #
1373 # When sending a `CreateJobRequest`, you can update a job by specifying it
1374 # here. The job named here is stopped, and its intermediate state is
1375 # transferred to this job.
1376 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001377 # for temporary storage. These temporary files will be
1378 # removed on job completion.
1379 # No duplicates are allowed.
1380 # No file patterns are supported.
1381 #
1382 # The supported files are:
1383 #
1384 # Google Cloud Storage:
1385 #
1386 # storage.googleapis.com/{bucket}/{object}
1387 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07001388 &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001389 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001390 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001391 #
1392 # Only one Job with a given name may exist in a project at any
1393 # given time. If a caller attempts to create a Job with the same
1394 # name as an already-existing Job, the attempt returns the
1395 # existing Job.
1396 #
1397 # The name must match the regular expression
1398 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
Bu Sun Kim65020912020-05-20 12:08:20 -07001399 &quot;steps&quot;: [ # Exactly one of steps or steps_location should be specified.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001400 #
1401 # The top-level steps that constitute the entire job.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001402 { # Defines a particular step within a Cloud Dataflow job.
1403 #
1404 # A job consists of multiple steps, each of which performs some
1405 # specific operation as part of the overall job. Data is typically
1406 # passed from one step to another as part of the job.
1407 #
Bu Sun Kim65020912020-05-20 12:08:20 -07001408 # Here&#x27;s an example of a sequence of steps which together implement a
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001409 # Map-Reduce job:
1410 #
1411 # * Read a collection of data from some source, parsing the
Bu Sun Kim65020912020-05-20 12:08:20 -07001412 # collection&#x27;s elements.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001413 #
1414 # * Validate the elements.
1415 #
1416 # * Apply a user-defined function to map each element to some value
1417 # and extract an element-specific key value.
1418 #
1419 # * Group elements with the same key into a single element with
1420 # that key, transforming a multiply-keyed collection into a
1421 # uniquely-keyed collection.
1422 #
1423 # * Write the elements out to some data sink.
1424 #
1425 # Note that the Cloud Dataflow service may be used to run many different
1426 # types of jobs, not just Map-Reduce.
Bu Sun Kim65020912020-05-20 12:08:20 -07001427 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
Dan O'Mearadd494642020-05-01 07:42:23 -07001428 # step with respect to all other steps in the Cloud Dataflow job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001429 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
1430 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001431 # predefined step has its own required set of properties.
1432 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
Bu Sun Kim65020912020-05-20 12:08:20 -07001433 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001434 },
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04001435 },
1436 ],
      &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
          # `JOB_STATE_UPDATED`), this field contains the ID of that job.
      &quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
          # executed that isn&#x27;t contained in the submitted job.
        &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
          &quot;a_key&quot;: { # Contains information about how a particular
              # google.dataflow.v1beta3.Step will be executed.
            &quot;stepName&quot;: [ # The steps associated with the execution stage.
                # Note that stages may have several steps, and that a given step
                # might be run by more than one stage.
              &quot;A String&quot;,
            ],
          },
        },
      },
      &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
          #
          # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
          # specified.
          #
          # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
          # terminal state. After a job has reached a terminal state, no
          # further state updates may be made.
          #
          # This field may be mutated by the Cloud Dataflow service;
          # callers cannot mutate it.
      &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
          # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
          # contains this job.
      &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
          # Flexible resource scheduling jobs are started with some delay after job
          # creation, so start_time is unset before start and is updated when the
          # job is started by the Cloud Dataflow service. For other jobs, start_time
          # always equals create_time; it is immutable and set by the Cloud Dataflow
          # service.
      &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
      &quot;labels&quot;: { # User-defined labels for this job.
          #
          # The labels map can contain no more than 64 entries. Entries of the labels
          # map are UTF8 strings that comply with the following restrictions:
          #
          # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
          # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
          # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
          #   size.
        &quot;a_key&quot;: &quot;A String&quot;,
      },
      &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
          # Cloud Dataflow service.
      &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
          #
          # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
          # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
          # also be used to directly set a job&#x27;s requested state to
          # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
          # job if it has not already reached a terminal state.
    },
  }</pre>
</div>
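<p>The restrictions listed for the <code>labels</code> field above can be checked on the client before a job is submitted. What follows is a minimal sketch and not part of the Dataflow client library: the names <code>label_problems</code> and <code>_value_char_ok</code> are hypothetical, only the Python standard library is used, and the sketch covers the 64-entry limit, the 128-byte size limit, and the value pattern. The key pattern is not checked here.</p>

```python
import unicodedata

MAX_LABELS = 64        # "no more than 64 entries"
MAX_BYTES = 128        # keys and values must be <= 128 bytes
MAX_VALUE_CHARS = 63   # upper bound of the {0,63} quantifier for values


def _value_char_ok(ch):
    """Return True if ch belongs to the class [\\p{Ll}\\p{Lo}\\p{N}_-]."""
    category = unicodedata.category(ch)
    return category in ("Ll", "Lo") or category.startswith("N") or ch in "_-"


def label_problems(labels):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper: it only mirrors the documented restrictions and
    does not call the Dataflow service.
    """
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append("labels map has more than %d entries" % MAX_LABELS)
    for key, value in labels.items():
        # Size limit applies to both keys and values, measured in UTF-8 bytes.
        for kind, text in (("key", key), ("value", value)):
            if len(text.encode("utf-8")) > MAX_BYTES:
                problems.append("%s %r exceeds %d bytes" % (kind, text, MAX_BYTES))
        # Value pattern: at most 63 characters, each from the allowed class.
        if len(value) > MAX_VALUE_CHARS or not all(_value_char_ok(c) for c in value):
            problems.append("value %r for key %r does not match the value pattern" % (value, key))
    return problems
```

<p>Under these assumptions, <code>label_problems({"team": "data-eng"})</code> returns an empty list, while a value containing an uppercase letter, such as <code>"Data"</code>, is reported.</p>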

</body></html>