<html><body>
<style>

body, h1, h2, h3, div, span, p, pre, a {
  margin: 0;
  padding: 0;
  border: 0;
  font-weight: inherit;
  font-style: inherit;
  font-size: 100%;
  font-family: inherit;
  vertical-align: baseline;
}

body {
  font-size: 13px;
  padding: 1em;
}

h1 {
  font-size: 26px;
  margin-bottom: 1em;
}

h2 {
  font-size: 24px;
  margin-bottom: 1em;
}

h3 {
  font-size: 20px;
  margin-bottom: 1em;
  margin-top: 1em;
}

pre, code {
  line-height: 1.5;
  font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}

pre {
  margin-top: 0.5em;
}

h1, h2, h3, p {
  font-family: Arial, sans-serif;
}

h1, h2, h3 {
  border-bottom: solid #CCC 1px;
}

.toc_element {
  margin-top: 0.5em;
}

.firstline {
  margin-left: 2em;
}

.method {
  margin-top: 1em;
  border: solid 1px #CCC;
  padding: 1em;
  background: #EEE;
}

.details {
  font-weight: bold;
  font-size: 14px;
}

</style>

<h1><a href="dataflow_v1b3.html">Dataflow API</a> . <a href="dataflow_v1b3.projects.html">projects</a> . <a href="dataflow_v1b3.projects.locations.html">locations</a> . <a href="dataflow_v1b3.projects.locations.jobs.html">jobs</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.debug.html">debug()</a></code>
</p>
<p class="firstline">Returns the debug Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.messages.html">messages()</a></code>
</p>
<p class="firstline">Returns the messages Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.snapshots.html">snapshots()</a></code>
</p>
<p class="firstline">Returns the snapshots Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.workItems.html">workItems()</a></code>
</p>
<p class="firstline">Returns the workItems Resource.</p>

<p class="toc_element">
  <code><a href="#create">create(projectId, location, body=None, replaceJobId=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#get">get(projectId, location, jobId, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Gets the state of the specified Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</a></code></p>
<p class="firstline">Requests the job status.</p>
<p class="toc_element">
  <code><a href="#list">list(projectId, location, filter=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Lists the jobs of a project.</p>
<p class="toc_element">
  <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
  <code><a href="#snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Snapshots the state of a streaming job.</p>
<p class="toc_element">
  <code><a href="#update">update(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Updates the state of an existing Cloud Dataflow job.</p>
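As a usage sketch, the `create` method above can be invoked through the google-api-python-client library. The helper below only builds a minimal request body; the project ID, region, bucket, and job name are placeholders, and the exact set of Job fields you need depends on how the pipeline is launched (most users submit jobs via an SDK rather than calling `jobs.create` directly).

```python
# Sketch: building a minimal Job body for projects.locations.jobs.create.
# "my-project", "us-central1", "my-bucket", and the job name are placeholders.

def make_job_body(job_name, temp_bucket):
    """Build a minimal Job resource dict for jobs.create."""
    return {
        "name": job_name,
        "type": "JOB_TYPE_BATCH",
        "environment": {
            # Temp storage prefix; the service appends "/temp-{JOBNAME}".
            "tempStoragePrefix": f"storage.googleapis.com/{temp_bucket}",
            "workerPools": [
                {
                    "kind": "harness",  # at least one "harness" pool is required
                    "numWorkers": 2,
                }
            ],
        },
    }

def create_job(project_id, location, body):
    # Requires google-api-python-client and credentials; not executed here.
    from googleapiclient.discovery import build
    dataflow = build("dataflow", "v1b3")
    return (
        dataflow.projects()
        .locations()
        .jobs()
        .create(projectId=project_id, location=location, body=body)
        .execute()
    )

body = make_job_body("wordcount-example", "my-bucket")
print(body["environment"]["workerPools"][0]["kind"])  # harness
```

Passing `location` explicitly (rather than relying on the default) keeps the job in the regional endpoint you intend, per the recommendation in the `create` method details below.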
<h3>Method Details</h3>
<div class="method">
    <code class="details" id="create">create(projectId, location, body=None, replaceJobId=None, view=None, x__xgafv=None)</code>
  <pre>Creates a Cloud Dataflow job.

To create a job, we recommend using `projects.locations.jobs.create` with a
[regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.create` is not recommended, as your job will always start
in `us-central1`.

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  body: object, The request body.
    The object takes the form of:

{ # Defines a job to be run by the Cloud Dataflow service.
  &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
      # If this field is set, the service will ensure its uniqueness.
      # The request to create a job will fail if the service has knowledge of a
      # previously submitted job with the same client&#x27;s ID and job name.
      # The caller may use this field to ensure idempotence of job
      # creation across retried attempts to create a job.
      # By default, the field is empty and, in that case, the service ignores it.
  &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
      #
      # This field is set by the Cloud Dataflow service when the Job is
      # created, and is immutable for the life of the job.
  &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
  &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
      # corresponding name prefixes of the new job.
    &quot;a_key&quot;: &quot;A String&quot;,
  },
  &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
    &quot;internalExperiments&quot;: { # Experimental settings.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
    },
    &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
        # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
        # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
        # with worker_zone. If neither worker_region nor worker_zone is specified,
        # default to the control plane&#x27;s region.
    &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
        # at rest, AKA a Customer Managed Encryption Key (CMEK).
        #
        # Format:
        #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
    &quot;userAgent&quot;: { # A description of the process that generated the request.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
        # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
        # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
        # with worker_region. If neither worker_region nor worker_zone is specified,
        # a zone in the control plane&#x27;s region is chosen based on available capacity.
    &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
        # unspecified, the service will attempt to choose a reasonable
        # default. This should be in the form of the API service name,
        # e.g. &quot;compute.googleapis.com&quot;.
    &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
        # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
        # this resource prefix, where {JOBNAME} is the value of the
        # job_name field. The resulting bucket and object prefix is used
        # as the prefix of the resources used to store temporary data
        # needed during the job execution. NOTE: This will override the
        # value in taskrunner_settings.
        # The supported resource type is:
        #
        # Google Cloud Storage:
        #
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
    &quot;experiments&quot;: [ # The list of experiments to enable.
      &quot;A String&quot;,
    ],
    &quot;version&quot;: { # A structure describing which components and their versions of the service
        # are required in order to run the job.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
        # options are passed through the service and are used to recreate the
        # SDK pipeline options on the worker in a language agnostic and platform
        # independent way.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
    &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
        # specified in order for the job to have workers.
      { # Describes one particular pool of Cloud Dataflow workers to be
          # instantiated by the Cloud Dataflow service in order to perform the
          # computations required by a job. Note that a workflow job may use
          # multiple pools, in order to match the various computational
          # requirements of the various stages of the job.
        &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
            # service will choose a number of threads (according to the number of cores
            # on the selected machine type for batch, or 1 by convention for streaming).
        &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
            # execute the job. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
            # will attempt to choose a reasonable default.
        &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
        &quot;packages&quot;: [ # Packages to be installed on workers.
          { # The packages that must be installed in order for a worker to run the
              # steps of the Cloud Dataflow job that will be assigned to its worker
              # pool.
              #
              # This is the mechanism by which the Cloud Dataflow SDK causes code to
              # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
              # might use this to install jars containing the user&#x27;s code and all of the
              # various dependencies (libraries, data files, etc.) required in order
              # for that code to run.
            &quot;name&quot;: &quot;A String&quot;, # The name of the package.
            &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                #
                # Google Cloud Storage:
                #
                #   storage.googleapis.com/{bucket}
                #   bucket.storage.googleapis.com/
          },
        ],
        &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
            # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
            # `TEARDOWN_NEVER`.
            # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
            # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
            # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
            # down.
            #
            # If the workers are not torn down by the service, they will
            # continue to run and use Google Compute Engine VM resources in the
            # user&#x27;s project until they are explicitly terminated by the user.
            # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
            # policy except for small, manually supervised test jobs.
            #
            # If unknown or unspecified, the service will attempt to choose a reasonable
            # default.
        &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
            # Compute Engine API.
        &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
        },
        &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
            # harness, residing in Google Container Registry.
            #
            # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
        &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
            # service will attempt to choose a reasonable default.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
            # are supported.
        &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
            # only be set in the Fn API path. For non-cross-language pipelines this
            # should have only one entry. Cross-language pipelines will have two or more
            # entries.
          { # Defines an SDK harness container for executing Dataflow pipelines.
            &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
            &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                # container instance with this image. If false (or unset) recommends using
                # more than one core per SDK container instance with this image for
                # efficiency. Note that Dataflow service may choose to override this property
                # if needed.
          },
        ],
        &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
          { # Describes the data disk used by a workflow job.
            &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                # must be a disk type appropriate to the project and zone in which
                # the workers will run. If unknown or unspecified, the service
                # will attempt to choose a reasonable default.
                #
                # For example, the standard persistent disk type is a resource name
                # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                # actual valid values are defined by the Google Compute Engine API,
                # not by the Cloud Dataflow API; consult the Google Compute Engine
                # documentation for more information about determining the set of
                # available disk types for a particular project and zone.
                #
                # Google Compute Engine Disk types are local to a particular
                # project in a particular zone, and so the resource name will
                # typically look something like this:
                #
                #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
            &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
          },
        ],
        &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
            # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
        &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
        &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
            # using the standard Dataflow task runner. Users should ignore
            # this field.
          &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
          &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;wheel&quot;.
          &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
          &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
          &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
              # access the Cloud Dataflow API.
            &quot;A String&quot;,
          ],
          &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
          &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
              # will not be uploaded.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
          &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
          &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
          &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
          &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
              # temporary storage.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
              #
              # When workers access Google Cloud APIs, they logically do so via
              # relative URLs. If this field is specified, it supplies the base
              # URL to use for resolving these relative URLs. The normative
              # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
              # Locators&quot;.
              #
              # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
          &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to the Google Compute Engine VM serial
              # console.
          &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
          &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                # storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
            &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                # &quot;dataflow/v1b3/projects&quot;.
            &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                # &quot;shuffle/v1beta1&quot;.
            &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
          },
          &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;root&quot;.
          &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
        },
        &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
          &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
          &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
        },
        &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
            # select a default set of packages which are useful to worker
            # harnesses written in a particular language.
        &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
            # the service will use the network &quot;default&quot;.
      },
    ],
    &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
        # related tables are stored.
        #
        # The supported resource type is:
        #
        # Google BigQuery:
        #   bigquery.googleapis.com/{dataset}
  },
  &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
    { # A message describing the state of a particular execution stage.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
      &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
    },
  ],
  &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
      # by the metadata values provided here. Populated for ListJobs and all GetJob
      # views SUMMARY and higher.
      # ListJob response and Job SUMMARY view.
    &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
      { # Metadata for a Datastore connector used by the job.
        &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
      },
    ],
    &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
      &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
      &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
    },
    &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
      { # Metadata for a BigQuery connector used by the job.
        &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
        &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
        &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
      },
    ],
    &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
      { # Metadata for a File connector used by the job.
        &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
      },
    ],
    &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
      { # Metadata for a PubSub connector used by the job.
        &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
      },
    ],
    &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
      { # Metadata for a BigTable connector used by the job.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
      },
    ],
    &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
      { # Metadata for a Spanner connector used by the job.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
      },
    ],
  },
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700472 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
473 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
Bu Sun Kim65020912020-05-20 12:08:20 -0700474 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
475 # snapshot.
Bu Sun Kim65020912020-05-20 12:08:20 -0700476 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
477 # A description of the user pipeline and stages through which it is executed.
478 # Created by Cloud Dataflow service. Only retrieved with
479 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
480 # form. This data is provided by the Dataflow service for ease of visualizing
481 # the pipeline and interpreting Dataflow provided metrics.
482 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
483 { # Description of the composing transforms, names/ids, and input/outputs of a
484 # stage of execution. Some composing transforms and sources may have been
485 # generated by the Dataflow service during execution planning.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700486 &quot;outputSource&quot;: [ # Output sources for this stage.
487 { # Description of an input or output of an execution stage.
488 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
489 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
490 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
491 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
492 # source is most closely associated.
493 },
494 ],
495 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
496 &quot;inputSource&quot;: [ # Input sources for this stage.
497 { # Description of an input or output of an execution stage.
498 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
499 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
500 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
501 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
502 # source is most closely associated.
503 },
504 ],
Bu Sun Kim65020912020-05-20 12:08:20 -0700505 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
506 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
507 { # Description of a transform executed as part of an execution stage.
508 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
509 # most closely associated.
510 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
511 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
512 },
513 ],
514 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
515 { # Description of an interstitial value between transforms in an execution
516 # stage.
517 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
518 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
519 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
520 # source is most closely associated.
521 },
522 ],
523 &quot;kind&quot;: &quot;A String&quot;, # Type of tranform this stage is executing.
Bu Sun Kim65020912020-05-20 12:08:20 -0700524 },
525 ],
526 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
527 { # Description of the type, names/ids, and input/outputs for a transform.
528 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
529 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
530 &quot;A String&quot;,
531 ],
532 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
533 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
534 &quot;displayData&quot;: [ # Transform-specific display data.
535 { # Data provided with a pipeline or transform to provide descriptive info.
Bu Sun Kim65020912020-05-20 12:08:20 -0700536 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700537 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
Bu Sun Kim65020912020-05-20 12:08:20 -0700538 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
539 # language namespace (e.g. a Python module) which defines the display data.
540 # This allows a dax monitoring system to specially handle the data
541 # and perform custom rendering.
542 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
543 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
544 # This is intended to be used as a label for the display data
545 # when viewed in a dax monitoring system.
546 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
547 # For example a java_class_name_value of com.mypackage.MyDoFn
548 # will be stored with MyDoFn as the short_str_value and
549 # com.mypackage.MyDoFn as the java_class_name value.
550 # short_str_value can be displayed and java_class_name_value
551 # will be displayed as a tooltip.
552 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
553 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700554 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
555 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
556 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
557 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
Bu Sun Kim65020912020-05-20 12:08:20 -0700558 },
559 ],
560 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
561 &quot;A String&quot;,
562 ],
563 },
564 ],
565 &quot;displayData&quot;: [ # Pipeline level display data.
566 { # Data provided with a pipeline or transform to provide descriptive info.
Bu Sun Kim65020912020-05-20 12:08:20 -0700567 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700568 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
Bu Sun Kim65020912020-05-20 12:08:20 -0700569 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
570 # language namespace (e.g. a Python module) which defines the display data.
571 # This allows a dax monitoring system to specially handle the data
572 # and perform custom rendering.
573 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
574 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
575 # This is intended to be used as a label for the display data
576 # when viewed in a dax monitoring system.
577 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
578 # For example a java_class_name_value of com.mypackage.MyDoFn
579 # will be stored with MyDoFn as the short_str_value and
580 # com.mypackage.MyDoFn as the java_class_name value.
581 # short_str_value can be displayed and java_class_name_value
582 # will be displayed as a tooltip.
583 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
584 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700585 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
586 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
587 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
588 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
Bu Sun Kim65020912020-05-20 12:08:20 -0700589 },
590 ],
591 },
592 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
593 # of the job it replaced.
594 #
595 # When sending a `CreateJobRequest`, you can update a job by specifying it
596 # here. The job named here is stopped, and its intermediate state is
597 # transferred to this job.
598 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700599 # for temporary storage. These temporary files will be
600 # removed on job completion.
601 # No duplicates are allowed.
602 # No file patterns are supported.
603 #
604 # The supported files are:
605 #
606 # Google Cloud Storage:
607 #
608 # storage.googleapis.com/{bucket}/{object}
609 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -0700610 &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700611 ],
Bu Sun Kim65020912020-05-20 12:08:20 -0700612 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700613 #
614 # Only one Job with a given name may exist in a project at any
615 # given time. If a caller attempts to create a Job with the same
616 # name as an already-existing Job, the attempt returns the
617 # existing Job.
618 #
619 # The name must match the regular expression
620 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
Bu Sun Kim65020912020-05-20 12:08:20 -0700621 &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700622 #
623 # The top-level steps that constitute the entire job.
624 { # Defines a particular step within a Cloud Dataflow job.
625 #
626 # A job consists of multiple steps, each of which performs some
627 # specific operation as part of the overall job. Data is typically
628 # passed from one step to another as part of the job.
629 #
Bu Sun Kim65020912020-05-20 12:08:20 -0700630 # Here&#x27;s an example of a sequence of steps which together implement a
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700631 # Map-Reduce job:
632 #
633 # * Read a collection of data from some source, parsing the
Bu Sun Kim65020912020-05-20 12:08:20 -0700634 # collection&#x27;s elements.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700635 #
636 # * Validate the elements.
637 #
638 # * Apply a user-defined function to map each element to some value
639 # and extract an element-specific key value.
640 #
641 # * Group elements with the same key into a single element with
642 # that key, transforming a multiply-keyed collection into a
643 # uniquely-keyed collection.
644 #
645 # * Write the elements out to some data sink.
646 #
647 # Note that the Cloud Dataflow service may be used to run many different
648 # types of jobs, not just Map-Reduce.
Bu Sun Kim65020912020-05-20 12:08:20 -0700649 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
Dan O'Mearadd494642020-05-01 07:42:23 -0700650 # step with respect to all other steps in the Cloud Dataflow job.
Bu Sun Kim65020912020-05-20 12:08:20 -0700651 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
652 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700653 # predefined step has its own required set of properties.
654 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
Bu Sun Kim65020912020-05-20 12:08:20 -0700655 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700656 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700657 },
658 ],
Bu Sun Kim65020912020-05-20 12:08:20 -0700659 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
660 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
661 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
662 # isn&#x27;t contained in the submitted job.
663 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
664 &quot;a_key&quot;: { # Contains information about how a particular
665 # google.dataflow.v1beta3.Step will be executed.
666 &quot;stepName&quot;: [ # The steps associated with the execution stage.
667 # Note that stages may have several steps, and that a given step
668 # might be run by more than one stage.
669 &quot;A String&quot;,
670 ],
671 },
672 },
673 },
674 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700675 #
676 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
677 # specified.
678 #
679 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
680 # terminal state. After a job has reached a terminal state, no
681 # further state updates may be made.
682 #
683 # This field may be mutated by the Cloud Dataflow service;
684 # callers cannot mutate it.
Bu Sun Kim65020912020-05-20 12:08:20 -0700685 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
686 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
687 # contains this job.
688 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
689 # Flexible resource scheduling jobs are started with some delay after job
690 # creation, so start_time is unset before start and is updated when the
691 # job is started by the Cloud Dataflow service. For other jobs, start_time
692 # always equals create_time and is immutable and set by the Cloud Dataflow
693 # service.
694 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
695 &quot;labels&quot;: { # User-defined labels for this job.
696 #
697 # The labels map can contain no more than 64 entries. Entries of the labels
698 # map are UTF-8 strings that comply with the following restrictions:
699 #
700 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
701 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
702 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
703 # size.
704 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700705 },
Bu Sun Kim65020912020-05-20 12:08:20 -0700706 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
707 # Cloud Dataflow service.
708 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
709 #
710 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
711 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
712 # also be used to directly set a job&#x27;s requested state to
713 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
714 # job if it has not already reached a terminal state.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700715}
716
Bu Sun Kim65020912020-05-20 12:08:20 -0700717 replaceJobId: string, Deprecated. This field is now in the Job message.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700718 view: string, The level of information requested in the response.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700719 x__xgafv: string, V1 error format.
720 Allowed values
721 1 - v1 error format
722 2 - v2 error format
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700723
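The `name` field described above must match the regular expression `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`. As a minimal sketch (the helper below is illustrative and not part of the generated client), a caller can pre-validate a candidate job name before submitting the request:

```python
import re

# Pattern documented for the Job "name" field: a lowercase letter, optionally
# followed by up to 39 more characters of [-a-z0-9], ending in [a-z0-9].
JOB_NAME_RE = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")

def is_valid_job_name(name):
    """Return True if `name` satisfies the documented job-name pattern."""
    return JOB_NAME_RE.fullmatch(name) is not None
```

This only checks the name's syntax; the service still rejects a create request whose name collides with an existing job (the attempt returns the existing Job instead).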
724Returns:
725 An object of the form:
726
727 { # Defines a job to be run by the Cloud Dataflow service.
Bu Sun Kim65020912020-05-20 12:08:20 -0700728 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
729 # If this field is set, the service will ensure its uniqueness.
730 # The request to create a job will fail if the service has knowledge of a
731 # previously submitted job with the same client&#x27;s ID and job name.
732 # The caller may use this field to ensure idempotence of job
733 # creation across retried attempts to create a job.
734 # By default, the field is empty and, in that case, the service ignores it.
735 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700736 #
737 # This field is set by the Cloud Dataflow service when the Job is
738 # created, and is immutable for the life of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -0700739 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
740 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -0400741 # corresponding name prefixes of the new job.
Bu Sun Kim65020912020-05-20 12:08:20 -0700742 &quot;a_key&quot;: &quot;A String&quot;,
Jon Wayne Parrott692617a2017-01-06 09:58:29 -0800743 },
Bu Sun Kim65020912020-05-20 12:08:20 -0700744 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
Bu Sun Kim65020912020-05-20 12:08:20 -0700745 &quot;internalExperiments&quot;: { # Experimental settings.
746 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
747 },
748 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
749 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
750 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
751 # with worker_zone. If neither worker_region nor worker_zone is specified,
752 # default to the control plane&#x27;s region.
753 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
754 # at rest, AKA a Customer Managed Encryption Key (CMEK).
755 #
756 # Format:
757 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
758 &quot;userAgent&quot;: { # A description of the process that generated the request.
759 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
760 },
761 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
762 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
763 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
764 # with worker_region. If neither worker_region nor worker_zone is specified,
765 # a zone in the control plane&#x27;s region is chosen based on available capacity.
766 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -0700767 # unspecified, the service will attempt to choose a reasonable
768 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -0700769 # e.g. &quot;compute.googleapis.com&quot;.
770 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
771 # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -0700772 # this resource prefix, where {JOBNAME} is the value of the
773 # job_name field. The resulting bucket and object prefix is used
774 # as the prefix of the resources used to store temporary data
775 # needed during the job execution. NOTE: This will override the
776 # value in taskrunner_settings.
777 # The supported resource type is:
778 #
779 # Google Cloud Storage:
780 #
781 # storage.googleapis.com/{bucket}/{object}
782 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -0700783 &quot;experiments&quot;: [ # The list of experiments to enable.
784 &quot;A String&quot;,
785 ],
786 &quot;version&quot;: { # A structure describing which components and their versions of the service
787 # are required in order to run the job.
788 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
789 },
790 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -0700791 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
792 # options are passed through the service and are used to recreate the
793 # SDK pipeline options on the worker in a language agnostic and platform
794 # independent way.
795 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
796 },
797 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
798 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
799 # specified in order for the job to have workers.
800 { # Describes one particular pool of Cloud Dataflow workers to be
801 # instantiated by the Cloud Dataflow service in order to perform the
802 # computations required by a job. Note that a workflow job may use
803 # multiple pools, in order to match the various computational
804 # requirements of the various stages of the job.
805 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
806 # service will choose a number of threads (according to the number of cores
807 # on the selected machine type for batch, or 1 by convention for streaming).
808 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
809 # execute the job. If zero or unspecified, the service will
810 # attempt to choose a reasonable default.
811 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
812 # will attempt to choose a reasonable default.
813 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
814 &quot;packages&quot;: [ # Packages to be installed on workers.
815 { # The packages that must be installed in order for a worker to run the
816 # steps of the Cloud Dataflow job that will be assigned to its worker
817 # pool.
818 #
819 # This is the mechanism by which the Cloud Dataflow SDK causes code to
820 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
821 # might use this to install jars containing the user&#x27;s code and all of the
822 # various dependencies (libraries, data files, etc.) required in order
823 # for that code to run.
824 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
825 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
826 #
827 # Google Cloud Storage:
828 #
829 # storage.googleapis.com/{bucket}
830 # bucket.storage.googleapis.com/
831 },
832 ],
833 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
834 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
835 # `TEARDOWN_NEVER`.
836 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
837 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
838 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
839 # down.
840 #
841 # If the workers are not torn down by the service, they will
842 # continue to run and use Google Compute Engine VM resources in the
843 # user&#x27;s project until they are explicitly terminated by the user.
844 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
845 # policy except for small, manually supervised test jobs.
846 #
847 # If unknown or unspecified, the service will attempt to choose a reasonable
848 # default.
849 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
850 # Compute Engine API.
851 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
852 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
853 },
854 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
855 # attempt to choose a reasonable default.
856 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
857 # harness, residing in Google Container Registry.
858 #
859 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
860 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
861 # attempt to choose a reasonable default.
862 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
863 # service will attempt to choose a reasonable default.
864 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
865 # are supported.
866 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
867 # only be set in the Fn API path. For non-cross-language pipelines this
868 # should have only one entry. Cross-language pipelines will have two or more
869 # entries.
870 { # Defines an SDK harness container for executing Dataflow pipelines.
871 &quot;containerImage&quot;: &quot;A String&quot;, # A Docker container image that resides in Google Container Registry.
872 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
873 # container instance with this image. If false (or unset), recommends using
874 # more than one core per SDK container instance with this image for
875 # efficiency. Note that the Dataflow service may choose to override this
876 # property if needed.
877 },
878 ],
879 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
880 { # Describes the data disk used by a workflow job.
881 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
882 # must be a disk type appropriate to the project and zone in which
883 # the workers will run. If unknown or unspecified, the service
884 # will attempt to choose a reasonable default.
885 #
886 # For example, the standard persistent disk type is a resource name
887 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
888 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
889 # actual valid values are defined by the Google Compute Engine API,
890 # not by the Cloud Dataflow API; consult the Google Compute Engine
891 # documentation for more information about determining the set of
892 # available disk types for a particular project and zone.
893 #
894 # Google Compute Engine Disk types are local to a particular
895 # project in a particular zone, and so the resource name will
896 # typically look something like this:
897 #
898 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
899 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
900 # attempt to choose a reasonable default.
901 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
902 },
903 ],
904 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
905 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
906 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
907 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
908 # using the standard Dataflow task runner. Users should ignore
909 # this field.
910 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
911 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
912 # taskrunner; e.g. &quot;wheel&quot;.
913 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
914 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
915 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
916 # access the Cloud Dataflow API.
917 &quot;A String&quot;,
918 ],
919 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
920 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
921 # will not be uploaded.
922 #
923 # The supported resource type is:
924 #
925 # Google Cloud Storage:
926 # storage.googleapis.com/{bucket}/{object}
927 # bucket.storage.googleapis.com/{object}
928 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
929 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
930 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
931 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
932 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
933 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
934 # temporary storage.
935 #
936 # The supported resource type is:
937 #
938 # Google Cloud Storage:
939 # storage.googleapis.com/{bucket}/{object}
940 # bucket.storage.googleapis.com/{object}
941 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
942 #
943 # When workers access Google Cloud APIs, they logically do so via
944 # relative URLs. If this field is specified, it supplies the base
945 # URL to use for resolving these relative URLs. The normative
946 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
947 # Locators&quot;.
948 #
949 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
950 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
951 # console.
952 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
953 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
954 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
955 # storage.
956 #
957 # The supported resource type is:
958 #
959 # Google Cloud Storage:
960 #
961 # storage.googleapis.com/{bucket}/{object}
962 # bucket.storage.googleapis.com/{object}
963 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
964 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
965 #
966 # When workers access Google Cloud APIs, they logically do so via
967 # relative URLs. If this field is specified, it supplies the base
968 # URL to use for resolving these relative URLs. The normative
969 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
970 # Locators&quot;.
971 #
972 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
973 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
974 # &quot;dataflow/v1b3/projects&quot;.
975 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
976 # &quot;shuffle/v1beta1&quot;.
977 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
978 },
979 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
980 # taskrunner; e.g. &quot;root&quot;.
981 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
982 },
983 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
984 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
985 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
986 },
987 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
988 &quot;a_key&quot;: &quot;A String&quot;,
989 },
990 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
991 # select a default set of packages which are useful to worker
992 # harnesses written in a particular language.
993 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
994 # the service will use the network &quot;default&quot;.
995 },
996 ],
997 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
998 # related tables are stored.
999 #
1000 # The supported resource type is:
1001 #
1002 # Google BigQuery:
1003 # bigquery.googleapis.com/{dataset}
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001004 },
Bu Sun Kim65020912020-05-20 12:08:20 -07001005 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
1006 # callers cannot mutate it.
1007 { # A message describing the state of a particular execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07001008 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
1009 &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001010 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07001011 },
1012 ],
1013 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
1014 # by the metadata values provided here. Populated for ListJobs and all GetJob
1015 # views SUMMARY and higher.
1016 # ListJob response and Job SUMMARY view.
Bu Sun Kim65020912020-05-20 12:08:20 -07001017 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
1018 { # Metadata for a Datastore connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001019 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001020 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001021 },
1022 ],
1023 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001024 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001025 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
1026 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
Bu Sun Kim65020912020-05-20 12:08:20 -07001027 },
1028 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
1029 { # Metadata for a BigQuery connector used by the job.
1030 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
1031 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001032 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001033 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001034 },
1035 ],
1036 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
1037 { # Metadata for a File connector used by the job.
1038 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
1039 },
1040 ],
1041 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
1042 { # Metadata for a PubSub connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001043 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001044 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
1045 },
1046 ],
1047 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
1048 { # Metadata for a BigTable connector used by the job.
1049 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1050 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1051 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
1052 },
1053 ],
1054 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
1055 { # Metadata for a Spanner connector used by the job.
1056 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1057 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1058 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001059 },
1060 ],
1061 },
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001062 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
1063 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
Bu Sun Kim65020912020-05-20 12:08:20 -07001064 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
1065 # snapshot.
Bu Sun Kim65020912020-05-20 12:08:20 -07001066 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
1067 # A description of the user pipeline and stages through which it is executed.
1068 # Created by Cloud Dataflow service. Only retrieved with
1069 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
1070 # form. This data is provided by the Dataflow service for ease of visualizing
1071 # the pipeline and interpreting Dataflow provided metrics.
1072 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
1073 { # Description of the composing transforms, names/ids, and input/outputs of a
1074 # stage of execution. Some composing transforms and sources may have been
1075 # generated by the Dataflow service during execution planning.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001076 &quot;outputSource&quot;: [ # Output sources for this stage.
1077 { # Description of an input or output of an execution stage.
1078 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1079 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1080 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1081 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1082 # source is most closely associated.
1083 },
1084 ],
1085 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
1086 &quot;inputSource&quot;: [ # Input sources for this stage.
1087 { # Description of an input or output of an execution stage.
1088 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1089 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1090 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1091 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1092 # source is most closely associated.
1093 },
1094 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001095 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
1096 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
1097 { # Description of a transform executed as part of an execution stage.
1098 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
1099 # most closely associated.
1100 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1101 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1102 },
1103 ],
1104 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
1105 { # Description of an interstitial value between transforms in an execution
1106 # stage.
1107 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1108 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1109 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1110 # source is most closely associated.
1111 },
1112 ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and input/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (i.e. python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (i.e. python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          #   * Read a collection of data from some source, parsing the
          #     collection&#x27;s elements.
          #
          #   * Validate the elements.
          #
          #   * Apply a user-defined function to map each element to some value
          #     and extract an element-specific key value.
          #
          #   * Group elements with the same key into a single element with
          #     that key, transforming a multiply-keyed collection into a
          #     uniquely-keyed collection.
          #
          #   * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
        # isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals to create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF8 strings that comply with the following restrictions:
        #
        # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
        # size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
  }</pre>
</div>

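A minimal sketch of how a caller might reach the `get` method documented below, using google-api-python-client. The discovery-client call needs credentials and network access, so it is shown in comments; the standalone helper mirrors the REST path the method resolves to. Project, location, and job IDs are placeholders.

```python
from typing import Optional

DATAFLOW_BASE = "https://dataflow.googleapis.com/v1b3"

def jobs_get_url(project_id: str, location: str, job_id: str,
                 view: Optional[str] = None) -> str:
    """Build the REST URL for projects.locations.jobs.get.

    Mirrors: GET /v1b3/projects/{projectId}/locations/{location}/jobs/{jobId}
    """
    url = f"{DATAFLOW_BASE}/projects/{project_id}/locations/{location}/jobs/{job_id}"
    if view:
        url += f"?view={view}"
    return url

# With the discovery client (requires credentials; sketch only):
# from googleapiclient.discovery import build
# service = build("dataflow", "v1b3")
# job = service.projects().locations().jobs().get(
#     projectId="my-project", location="us-central1",
#     jobId="my-job-id", view="JOB_VIEW_SUMMARY",
# ).execute()
# print(job["currentState"])
```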
<div class="method">
    <code class="details" id="get">get(projectId, location, jobId, view=None, x__xgafv=None)</code>
  <pre>Gets the state of the specified Cloud Dataflow job.

To get the state of a job, we recommend using `projects.locations.jobs.get`
with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.get` is not recommended, as you can only get the state of
jobs that are running in `us-central1`.

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  jobId: string, The job ID. (required)
  view: string, The level of information requested in response.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

  { # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME} to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language agnostic and platform
          # independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}
                  #   bucket.storage.googleapis.com/
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines a SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                  # container instance with this image. If false (or unset) recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that Dataflow service may choose to override this property
                  # if needed.
            },
          ],
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine Disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
1525 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
1526 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
1527 # will not be uploaded.
1528 #
1529 # The supported resource type is:
1530 #
1531 # Google Cloud Storage:
1532 # storage.googleapis.com/{bucket}/{object}
1533 # bucket.storage.googleapis.com/{object}
1534 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
1535 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
1536 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
1537 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
1538 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
1539 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
1540 # temporary storage.
1541 #
1542 # The supported resource type is:
1543 #
1544 # Google Cloud Storage:
1545 # storage.googleapis.com/{bucket}/{object}
1546 # bucket.storage.googleapis.com/{object}
1547 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
1548 #
1549 # When workers access Google Cloud APIs, they logically do so via
1550 # relative URLs. If this field is specified, it supplies the base
1551 # URL to use for resolving these relative URLs. The normative
1552 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1553 # Locators&quot;.
1554 #
1555 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1556 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
1557 # console.
1558 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
1559 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
1560 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1561 # storage.
1562 #
1563 # The supported resource type is:
1564 #
1565 # Google Cloud Storage:
1566 #
1567 # storage.googleapis.com/{bucket}/{object}
1568 # bucket.storage.googleapis.com/{object}
1569 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
1570 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
1571 #
1572 # When workers access Google Cloud APIs, they logically do so via
1573 # relative URLs. If this field is specified, it supplies the base
1574 # URL to use for resolving these relative URLs. The normative
1575 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1576 # Locators&quot;.
1577 #
1578 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1579 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
1580 # &quot;dataflow/v1b3/projects&quot;.
1581 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
1582 # &quot;shuffle/v1beta1&quot;.
1583 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
1584 },
1585 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
1586 # taskrunner; e.g. &quot;root&quot;.
1587 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
1588 },
1589 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
1590 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
1591 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
1592 },
1593 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
1594 &quot;a_key&quot;: &quot;A String&quot;,
1595 },
1596 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
1597 # select a default set of packages which are useful to worker
1598 # harnesses written in a particular language.
1599 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
1600 # the service will use the network &quot;default&quot;.
1601 },
1602 ],
1603 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
1604 # related tables are stored.
1605 #
1606 # The supported resource type is:
1607 #
1608 # Google BigQuery:
1609 # bigquery.googleapis.com/{dataset}
1610 },
1611 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
1612 # callers cannot mutate it.
1613 { # A message describing the state of a particular execution stage.
1614 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
1615 &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
1616 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
1617 },
1618 ],
1619 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
1620 # by the metadata values provided here. Populated for ListJobs and all GetJob
1621 # views SUMMARY and higher.
1622 # ListJob response and Job SUMMARY view.
1623 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
1624 { # Metadata for a Datastore connector used by the job.
1625 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
1626 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1627 },
1628 ],
1629 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
1630 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
1631 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
1632 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
1633 },
1634 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
1635 { # Metadata for a BigQuery connector used by the job.
1636 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
1637 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
1638 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
1639 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
1640 },
1641 ],
1642 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
1643 { # Metadata for a File connector used by the job.
1644 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
1645 },
1646 ],
1647 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
1648 { # Metadata for a PubSub connector used by the job.
1649 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
1650 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
1651 },
1652 ],
1653 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
1654 { # Metadata for a BigTable connector used by the job.
1655 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1656 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1657 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
1658 },
1659 ],
1660 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
1661 { # Metadata for a Spanner connector used by the job.
1662 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1663 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1664 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
1665 },
1666 ],
1667 },
1668 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
1669 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
1670 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
1671 # snapshot.
1672 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
1673 # A description of the user pipeline and stages through which it is executed.
1674 # Created by Cloud Dataflow service. Only retrieved with
1675 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
1676 # form. This data is provided by the Dataflow service for ease of visualizing
1677 # the pipeline and interpreting Dataflow provided metrics.
1678 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
1679 { # Description of the composing transforms, names/ids, and input/outputs of a
1680 # stage of execution. Some composing transforms and sources may have been
1681 # generated by the Dataflow service during execution planning.
1682 &quot;outputSource&quot;: [ # Output sources for this stage.
1683 { # Description of an input or output of an execution stage.
1684 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1685 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1686 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1687 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1688 # source is most closely associated.
1689 },
1690 ],
1691 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
1692 &quot;inputSource&quot;: [ # Input sources for this stage.
1693 { # Description of an input or output of an execution stage.
1694 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1695 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1696 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1697 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1698 # source is most closely associated.
1699 },
1700 ],
1701 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
1702 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
1703 { # Description of a transform executed as part of an execution stage.
1704 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
1705 # most closely associated.
1706 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1707 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1708 },
1709 ],
1710 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
1711 { # Description of an interstitial value between transforms in an execution
1712 # stage.
1713 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1714 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1715 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1716 # source is most closely associated.
1717 },
1718 ],
1719 &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
1720 },
1721 ],
1722 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
1723 { # Description of the type, names/ids, and input/outputs for a transform.
1724 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
1725 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
1726 &quot;A String&quot;,
1727 ],
1728 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
1729 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
1730 &quot;displayData&quot;: [ # Transform-specific display data.
1731 { # Data provided with a pipeline or transform to provide descriptive info.
1732 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1733 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1734 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1735 # language namespace (i.e. python module) which defines the display data.
1736 # This allows a dax monitoring system to specially handle the data
1737 # and perform custom rendering.
1738 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1739 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1740 # This is intended to be used as a label for the display data
1741 # when viewed in a dax monitoring system.
1742 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1743 # For example a java_class_name_value of com.mypackage.MyDoFn
1744 # will be stored with MyDoFn as the short_str_value and
1745 # com.mypackage.MyDoFn as the java_class_name value.
1746 # short_str_value can be displayed and java_class_name_value
1747 # will be displayed as a tooltip.
1748 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1749 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1750 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1751 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1752 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1753 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
1754 },
1755 ],
1756 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
1757 &quot;A String&quot;,
1758 ],
1759 },
1760 ],
1761 &quot;displayData&quot;: [ # Pipeline level display data.
1762 { # Data provided with a pipeline or transform to provide descriptive info.
1763 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1764 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1765 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1766 # language namespace (i.e. python module) which defines the display data.
1767 # This allows a dax monitoring system to specially handle the data
1768 # and perform custom rendering.
1769 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1770 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1771 # This is intended to be used as a label for the display data
1772 # when viewed in a dax monitoring system.
1773 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1774 # For example a java_class_name_value of com.mypackage.MyDoFn
1775 # will be stored with MyDoFn as the short_str_value and
1776 # com.mypackage.MyDoFn as the java_class_name value.
1777 # short_str_value can be displayed and java_class_name_value
1778 # will be displayed as a tooltip.
1779 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1780 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1781 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1782 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1783 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1784 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
Bu Sun Kim65020912020-05-20 12:08:20 -07001785 },
1786 ],
1787 },
1788 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
1789 # of the job it replaced.
1790 #
1791 # When sending a `CreateJobRequest`, you can update a job by specifying it
1792 # here. The job named here is stopped, and its intermediate state is
1793 # transferred to this job.
1794 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
1795 # for temporary storage. These temporary files will be
1796 # removed on job completion.
1797 # No duplicates are allowed.
1798 # No file patterns are supported.
1799 #
1800 # The supported files are:
1801 #
1802 # Google Cloud Storage:
1803 #
1804 # storage.googleapis.com/{bucket}/{object}
1805 # bucket.storage.googleapis.com/{object}
1806 &quot;A String&quot;,
1807 ],
1808 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
1809 #
1810 # Only one Job with a given name may exist in a project at any
1811 # given time. If a caller attempts to create a Job with the same
1812 # name as an already-existing Job, the attempt returns the
1813 # existing Job.
1814 #
1815 # The name must match the regular expression
1816 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
1817 &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
1818 #
1819 # The top-level steps that constitute the entire job.
1820 { # Defines a particular step within a Cloud Dataflow job.
1821 #
1822 # A job consists of multiple steps, each of which performs some
1823 # specific operation as part of the overall job. Data is typically
1824 # passed from one step to another as part of the job.
1825 #
1826 # Here&#x27;s an example of a sequence of steps which together implement a
1827 # Map-Reduce job:
1828 #
1829 # * Read a collection of data from some source, parsing the
1830 # collection&#x27;s elements.
1831 #
1832 # * Validate the elements.
1833 #
1834 # * Apply a user-defined function to map each element to some value
1835 # and extract an element-specific key value.
1836 #
1837 # * Group elements with the same key into a single element with
1838 # that key, transforming a multiply-keyed collection into a
1839 # uniquely-keyed collection.
1840 #
1841 # * Write the elements out to some data sink.
1842 #
1843 # Note that the Cloud Dataflow service may be used to run many different
1844 # types of jobs, not just Map-Reduce.
1845 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
1846 # step with respect to all other steps in the Cloud Dataflow job.
1847 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
1848 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
1849 # predefined step has its own required set of properties.
1850 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
1851 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1852 },
1853 },
1854 ],
1855 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
1856 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
1857 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
1858 # isn&#x27;t contained in the submitted job.
1859 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
1860 &quot;a_key&quot;: { # Contains information about how a particular
1861 # google.dataflow.v1beta3.Step will be executed.
1862 &quot;stepName&quot;: [ # The steps associated with the execution stage.
1863 # Note that stages may have several steps, and that a given step
1864 # might be run by more than one stage.
1865 &quot;A String&quot;,
1866 ],
1867 },
1868 },
1869 },
1870 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
1871 #
1872 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
1873 # specified.
1874 #
1875 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
1876 # terminal state. After a job has reached a terminal state, no
1877 # further state updates may be made.
1878 #
1879 # This field may be mutated by the Cloud Dataflow service;
1880 # callers cannot mutate it.
1881 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
1882 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1883 # contains this job.
1884 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
1885 # Flexible resource scheduling jobs are started with some delay after job
1886 # creation, so start_time is unset before start and is updated when the
1887 # job is started by the Cloud Dataflow service. For other jobs, start_time
1888 # always equals create_time and is immutable and set by the Cloud Dataflow
1889 # service.
1890 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
1891 &quot;labels&quot;: { # User-defined labels for this job.
1892 #
1893 # The labels map can contain no more than 64 entries. Entries of the labels
1894 # map are UTF8 strings that comply with the following restrictions:
1895 #
1896 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
1897 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
1898 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
1899 # size.
1900 &quot;a_key&quot;: &quot;A String&quot;,
1901 },
1902 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
1903 # Cloud Dataflow service.
1904 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
1905 #
1906 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
1907 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
1908 # also be used to directly set a job&#x27;s requested state to
1909 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
1910 # job if it has not already reached a terminal state.
1911 }</pre>
1912</div>
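A minimal sketch, not part of the generated reference: how the Job resource above might be polled from Python. It assumes the google-api-python-client package and application-default credentials are available; the project, region, and job IDs are placeholders. The terminal-state set follows the currentState documentation above, under which a job that reaches a terminal state receives no further state updates.

```python
# Sketch: check whether a Dataflow job has reached a terminal state.
# Assumptions: google-api-python-client is installed and application-default
# credentials are configured; project/region/job_id values are placeholders.

# Per the currentState docs above, no further state updates may be made
# once a job reaches a terminal state.
TERMINAL_STATES = {
    "JOB_STATE_DONE",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_UPDATED",
    "JOB_STATE_DRAINED",
}

def is_terminal(state):
    """Return True if a job in this state can no longer change state."""
    return state in TERMINAL_STATES

def get_job_state(project, region, job_id):
    """Fetch a job via the regional endpoint and return its currentState."""
    from googleapiclient.discovery import build
    dataflow = build("dataflow", "v1b3")
    job = dataflow.projects().locations().jobs().get(
        projectId=project, location=region, jobId=job_id).execute()
    return job["currentState"]
```

A caller would typically loop on get_job_state with a sleep until is_terminal returns True.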
1913
1914<div class="method">
1915 <code class="details" id="getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</code>
1916 <pre>Request the job status.
1917
1918To request the status of a job, we recommend using
1919`projects.locations.jobs.getMetrics` with a [regional endpoint]
1920(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
1921`projects.jobs.getMetrics` is not recommended, as you can only request the
1922status of jobs that are running in `us-central1`.
1923
1924Args:
1925 projectId: string, A project id. (required)
1926 location: string, The [regional endpoint]
1927(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1928contains the job specified by job_id. (required)
1929 jobId: string, The job to get messages for. (required)
1930 startTime: string, Return only metric data that has changed since this time.
1931Default is to return all information about all metrics for the job.
1932 x__xgafv: string, V1 error format.
1933 Allowed values
1934 1 - v1 error format
1935 2 - v2 error format
1936
1937Returns:
1938 An object of the form:
1939
1940 { # JobMetrics contains a collection of metrics describing the detailed progress
1941 # of a Dataflow job. Metrics correspond to user-defined and system-defined
1942 # metrics in the job.
1943 #
1944 # This resource captures only the most recent values of each metric;
1945 # time-series data can be queried for them (under the same metric names)
1946 # from Cloud Monitoring.
1947 &quot;metrics&quot;: [ # All metrics for this job.
1948 { # Describes the state of a metric.
1949 &quot;set&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Set&quot; aggregation kind. The only
1950 # possible value type is a list of Values whose type can be Long, Double,
1951 # or String, according to the metric&#x27;s type. All Values in the list must
1952 # be of the same type.
1953 &quot;gauge&quot;: &quot;&quot;, # A struct value describing properties of a Gauge.
1954 # Metrics of gauge type show the value of a metric across time, and is
1955 # aggregated based on the newest value.
1956 &quot;cumulative&quot;: True or False, # True if this metric is reported as the total cumulative aggregate
1957 # value accumulated since the worker started working on this WorkItem.
1958 # By default this is false, indicating that this metric is reported
1959 # as a delta that is not associated with any WorkItem.
1960 &quot;internal&quot;: &quot;&quot;, # Worker-computed aggregate value for internal use by the Dataflow
1961 # service.
1962 &quot;kind&quot;: &quot;A String&quot;, # Metric aggregation kind. The possible metric aggregation kinds are
1963 # &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;, &quot;Mean&quot;, &quot;Set&quot;, &quot;And&quot;, &quot;Or&quot;, and &quot;Distribution&quot;.
1964 # The specified aggregation kind is case-insensitive.
1965 #
1966 # If omitted, this is not an aggregated value but instead
1967 # a single metric sample value.
1968 &quot;scalar&quot;: &quot;&quot;, # Worker-computed aggregate value for aggregation kinds &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;,
1969 # &quot;And&quot;, and &quot;Or&quot;. The possible value types are Long, Double, and Boolean.
1970 &quot;meanCount&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
1971 # This holds the count of the aggregated values and is used in combination
1972 # with mean_sum above to obtain the actual mean aggregate value.
1973 # The only possible value type is Long.
1974 &quot;meanSum&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
1975 # This holds the sum of the aggregated values and is used in combination
1976 # with mean_count below to obtain the actual mean aggregate value.
1977 # The only possible value types are Long and Double.
Bu Sun Kim65020912020-05-20 12:08:20 -07001978 &quot;updateTime&quot;: &quot;A String&quot;, # Timestamp associated with the metric value. Optional when workers are
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001979 # reporting work progress; it will be filled in responses from the
1980 # metrics API.
Bu Sun Kim65020912020-05-20 12:08:20 -07001981 &quot;name&quot;: { # Identifies a metric, by describing the source which generated the # Name of the metric.
1982 # metric.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001983 &quot;name&quot;: &quot;A String&quot;, # Worker-defined metric name.
1984 &quot;origin&quot;: &quot;A String&quot;, # Origin (namespace) of metric name. May be blank for user-defined metrics;
1985 # will be &quot;dataflow&quot; for metrics defined by the Dataflow service or SDK.
Bu Sun Kim65020912020-05-20 12:08:20 -07001986 &quot;context&quot;: { # Zero or more labeled fields which identify the part of the job this
1987 # metric is associated with, such as the name of a step or collection.
1988 #
1989 # For example, built-in counters associated with steps will have
1990 # context[&#x27;step&#x27;] = &lt;step-name&gt;. Counters associated with PCollections
1991 # in the SDK will have context[&#x27;pcollection&#x27;] = &lt;pcollection-name&gt;.
1992 &quot;a_key&quot;: &quot;A String&quot;,
1993 },
Bu Sun Kim65020912020-05-20 12:08:20 -07001994 },
1995 &quot;distribution&quot;: &quot;&quot;, # A struct value describing properties of a distribution of numeric values.
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08001996 },
1997 ],
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001998 &quot;metricTime&quot;: &quot;A String&quot;, # Timestamp as of which metric values are current.
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08001999 }</pre>
2000</div>
2001
2002<div class="method">
Bu Sun Kim65020912020-05-20 12:08:20 -07002003 <code class="details" id="list">list(projectId, location, filter=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</code>
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002004 <pre>List the jobs of a project.
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08002005
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002006To list the jobs of a project in a region, we recommend using
2007`projects.locations.jobs.list` with a [regional endpoint]
2008(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To
2009list all jobs across all regions, use `projects.jobs.aggregated`. Using
2010`projects.jobs.list` is not recommended, as you can only get the list of
2011jobs that are running in `us-central1`.
2012
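The pagination described below (`pageToken` set from a previous response's `next_page_token`) can be sketched with the google-api-python-client, which wraps that loop in `list_next`. This is a hedged sketch: the helper name `list_all_jobs` and the project/region values are ours, not part of the API.

```python
# Sketch: list every Dataflow job in one region, following pagination.
# Assumes google-api-python-client is installed and credentials are set up;
# the project ID and region used by callers are placeholders.

def list_all_jobs(jobs_resource, project_id, location, view="JOB_VIEW_SUMMARY"):
    """Collect all jobs by chaining pages with list_next()."""
    jobs = []
    request = jobs_resource.list(projectId=project_id, location=location, view=view)
    while request is not None:
        response = request.execute()
        jobs.extend(response.get("jobs", []))
        # list_next returns None once the response carries no next_page_token.
        request = jobs_resource.list_next(request, response)
    return jobs

# Live usage (requires credentials), sketched:
#   from googleapiclient.discovery import build
#   service = build("dataflow", "v1b3")
#   jobs = list_all_jobs(service.projects().locations().jobs(),
#                        "my-project", "us-central1")
```

Passing the `jobs()` collection in as a parameter keeps the paging logic separate from service construction, so it works the same against any regional endpoint.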
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08002013Args:
2014 projectId: string, The project which owns the jobs. (required)
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002015 location: string, The [regional endpoint]
2016(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
2017contains this job. (required)
Bu Sun Kim65020912020-05-20 12:08:20 -07002018 filter: string, The kind of filter to use.
2019 pageToken: string, Set this to the &#x27;next_page_token&#x27; field of a previous response
2020to request additional results in a long list.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002021 pageSize: integer, If there are many jobs, limit response to at most this many.
2022The actual number of jobs returned will be the lesser of max_responses
2023and an unspecified server-defined limit.
Bu Sun Kim65020912020-05-20 12:08:20 -07002024 view: string, Level of information requested in response. Default is `JOB_VIEW_SUMMARY`.
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08002025 x__xgafv: string, V1 error format.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002026 Allowed values
2027 1 - v1 error format
2028 2 - v2 error format
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08002029
2030Returns:
2031 An object of the form:
2032
Dan O'Mearadd494642020-05-01 07:42:23 -07002033 { # Response to a request to list Cloud Dataflow jobs in a project. This might
2034 # be a partial response, depending on the page size in the ListJobsRequest.
2035 # However, if the project does not have any jobs, an instance of
Bu Sun Kim65020912020-05-20 12:08:20 -07002036 # ListJobsResponse is not returned and the request&#x27;s response
Dan O'Mearadd494642020-05-01 07:42:23 -07002037 # body is empty {}.
Bu Sun Kim65020912020-05-20 12:08:20 -07002038 &quot;jobs&quot;: [ # A subset of the requested job information.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002039 { # Defines a job to be run by the Cloud Dataflow service.
Bu Sun Kim65020912020-05-20 12:08:20 -07002040 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
2041 # If this field is set, the service will ensure its uniqueness.
2042 # The request to create a job will fail if the service has knowledge of a
2043 # previously submitted job with the same client&#x27;s ID and job name.
2044 # The caller may use this field to ensure idempotence of job
2045 # creation across retried attempts to create a job.
2046 # By default, the field is empty and, in that case, the service ignores it.
2047 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002048 #
2049 # This field is set by the Cloud Dataflow service when the Job is
2050 # created, and is immutable for the life of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002051 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
2052 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002053 # corresponding name prefixes of the new job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002054 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002055 },
Bu Sun Kim65020912020-05-20 12:08:20 -07002056 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002057 &quot;internalExperiments&quot;: { # Experimental settings.
2058 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
2059 },
2060 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
2061 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
2062 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
2063 # with worker_zone. If neither worker_region nor worker_zone is specified,
2064 # default to the control plane&#x27;s region.
2065 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
2066 # at rest, AKA a Customer Managed Encryption Key (CMEK).
2067 #
2068 # Format:
2069 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
2070 &quot;userAgent&quot;: { # A description of the process that generated the request.
2071 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2072 },
2073 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
2074 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
2075 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
2076 # with worker_region. If neither worker_region nor worker_zone is specified,
2077 # a zone in the control plane&#x27;s region is chosen based on available capacity.
2078 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -07002079 # unspecified, the service will attempt to choose a reasonable
2080 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -07002081 # e.g. &quot;compute.googleapis.com&quot;.
2082 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
2083 # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002084 # this resource prefix, where {JOBNAME} is the value of the
2085 # job_name field. The resulting bucket and object prefix is used
2086 # as the prefix of the resources used to store temporary data
2087 # needed during the job execution. NOTE: This will override the
2088 # value in taskrunner_settings.
2089 # The supported resource type is:
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002090 #
2091 # Google Cloud Storage:
2092 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002093 # storage.googleapis.com/{bucket}/{object}
2094 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07002095 &quot;experiments&quot;: [ # The list of experiments to enable.
2096 &quot;A String&quot;,
2097 ],
2098 &quot;version&quot;: { # A structure describing which components and their versions of the service
2099 # are required in order to run the job.
2100 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2101 },
2102 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002103 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
2104 # options are passed through the service and are used to recreate the
2105 # SDK pipeline options on the worker in a language agnostic and platform
2106 # independent way.
2107 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2108 },
2109 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
2110 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
2111 # specified in order for the job to have workers.
2112 { # Describes one particular pool of Cloud Dataflow workers to be
2113 # instantiated by the Cloud Dataflow service in order to perform the
2114 # computations required by a job. Note that a workflow job may use
2115 # multiple pools, in order to match the various computational
2116 # requirements of the various stages of the job.
2117 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
2118 # service will choose a number of threads (according to the number of cores
2119 # on the selected machine type for batch, or 1 by convention for streaming).
2120 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
2121 # execute the job. If zero or unspecified, the service will
2122 # attempt to choose a reasonable default.
2123 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
2124 # will attempt to choose a reasonable default.
2125 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
2126 &quot;packages&quot;: [ # Packages to be installed on workers.
2127 { # The packages that must be installed in order for a worker to run the
2128 # steps of the Cloud Dataflow job that will be assigned to its worker
2129 # pool.
2130 #
2131 # This is the mechanism by which the Cloud Dataflow SDK causes code to
2132 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
2133 # might use this to install jars containing the user&#x27;s code and all of the
2134 # various dependencies (libraries, data files, etc.) required in order
2135 # for that code to run.
2136 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
2137 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
2138 #
2139 # Google Cloud Storage:
2140 #
2141 # storage.googleapis.com/{bucket}
2142 # bucket.storage.googleapis.com/
2143 },
2144 ],
2145 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
2146 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
2147 # `TEARDOWN_NEVER`.
2148 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
2149 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
2150 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
2151 # down.
2152 #
2153 # If the workers are not torn down by the service, they will
2154 # continue to run and use Google Compute Engine VM resources in the
2155 # user&#x27;s project until they are explicitly terminated by the user.
2156 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
2157 # policy except for small, manually supervised test jobs.
2158 #
2159 # If unknown or unspecified, the service will attempt to choose a reasonable
2160 # default.
2161 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
2162 # Compute Engine API.
2163 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
2164 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
2165 },
2166 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
2167 # attempt to choose a reasonable default.
2168 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
2169 # harness, residing in Google Container Registry.
2170 #
2171 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
2172 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
2173 # attempt to choose a reasonable default.
2174 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
2175 # service will attempt to choose a reasonable default.
2176 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
2177 # are supported.
2178 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
2179 # only be set in the Fn API path. For non-cross-language pipelines this
2180 # should have only one entry. Cross-language pipelines will have two or more
2181 # entries.
2182 { # Defines an SDK harness container for executing Dataflow pipelines.
2183 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
2184 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
2185 # container instance with this image. If false (or unset) recommends using
2186 # more than one core per SDK container instance with this image for
2187 # efficiency. Note that Dataflow service may choose to override this property
2188 # if needed.
2189 },
2190 ],
2191 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
2192 { # Describes the data disk used by a workflow job.
2193 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
2194 # must be a disk type appropriate to the project and zone in which
2195 # the workers will run. If unknown or unspecified, the service
2196 # will attempt to choose a reasonable default.
2197 #
2198 # For example, the standard persistent disk type is a resource name
2199 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
2200 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
2201 # actual valid values are defined by the Google Compute Engine API,
2202 # not by the Cloud Dataflow API; consult the Google Compute Engine
2203 # documentation for more information about determining the set of
2204 # available disk types for a particular project and zone.
2205 #
2206 # Google Compute Engine Disk types are local to a particular
2207 # project in a particular zone, and so the resource name will
2208 # typically look something like this:
2209 #
2210 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
2211 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
2212 # attempt to choose a reasonable default.
2213 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
2214 },
2215 ],
2216 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
2217 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
2218 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
2219 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
2220 # using the standard Dataflow task runner. Users should ignore
2221 # this field.
2222 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
2223 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
2224 # taskrunner; e.g. &quot;wheel&quot;.
2225 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
2226 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
2227 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
2228 # access the Cloud Dataflow API.
2229 &quot;A String&quot;,
2230 ],
2231 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
2232 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
2233 # will not be uploaded.
2234 #
2235 # The supported resource type is:
2236 #
2237 # Google Cloud Storage:
2238 # storage.googleapis.com/{bucket}/{object}
2239 # bucket.storage.googleapis.com/{object}
2240 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
2241 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
2242 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
2243 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
2244 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
2245 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
2246 # temporary storage.
2247 #
2248 # The supported resource type is:
2249 #
2250 # Google Cloud Storage:
2251 # storage.googleapis.com/{bucket}/{object}
2252 # bucket.storage.googleapis.com/{object}
2253 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
2254 #
2255 # When workers access Google Cloud APIs, they logically do so via
2256 # relative URLs. If this field is specified, it supplies the base
2257 # URL to use for resolving these relative URLs. The normative
2258 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
2259 # Locators&quot;.
2260 #
2261 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
2262 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
2263 # console.
2264 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
2265 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
2266 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
2267 # storage.
2268 #
2269 # The supported resource type is:
2270 #
2271 # Google Cloud Storage:
2272 #
2273 # storage.googleapis.com/{bucket}/{object}
2274 # bucket.storage.googleapis.com/{object}
2275 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
2276 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
2277 #
2278 # When workers access Google Cloud APIs, they logically do so via
2279 # relative URLs. If this field is specified, it supplies the base
2280 # URL to use for resolving these relative URLs. The normative
2281 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
2282 # Locators&quot;.
2283 #
2284 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
2285 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
2286 # &quot;dataflow/v1b3/projects&quot;.
2287 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
2288 # &quot;shuffle/v1beta1&quot;.
2289 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
2290 },
2291 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
2292 # taskrunner; e.g. &quot;root&quot;.
2293 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
2294 },
2295 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
2296 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
2297 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
2298 },
2299 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
2300 &quot;a_key&quot;: &quot;A String&quot;,
2301 },
2302 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
2303 # select a default set of packages which are useful to worker
2304 # harnesses written in a particular language.
2305 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
2306 # the service will use the network &quot;default&quot;.
2307 },
2308 ],
2309 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
2310 # related tables are stored.
2311 #
2312 # The supported resource type is:
2313 #
2314 # Google BigQuery:
2315 # bigquery.googleapis.com/{dataset}
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002316 },
Bu Sun Kim65020912020-05-20 12:08:20 -07002317 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
2318 # callers cannot mutate it.
2319 { # A message describing the state of a particular execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07002320 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
2321 &quot;executionStageState&quot;: &quot;A String&quot;, # Executions stage states allow the same set of values as JobState.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002322 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07002323 },
2324 ],
2325 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
2326 # by the metadata values provided here. Populated for ListJobs and all GetJob
2327 # views SUMMARY and higher.
2328 # ListJob response and Job SUMMARY view.
Bu Sun Kim65020912020-05-20 12:08:20 -07002329 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
2330 { # Metadata for a Datastore connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002331 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002332 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07002333 },
2334 ],
2335 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002336 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002337 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
2338 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
Bu Sun Kim65020912020-05-20 12:08:20 -07002339 },
2340 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
2341 { # Metadata for a BigQuery connector used by the job.
2342 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
2343 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07002344 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002345 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07002346 },
2347 ],
2348 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
2349 { # Metadata for a File connector used by the job.
2350 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
2351 },
2352 ],
2353 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
2354 { # Metadata for a PubSub connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002355 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002356 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
2357 },
2358 ],
2359 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
2360 { # Metadata for a BigTable connector used by the job.
2361 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
2362 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
2363 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
2364 },
2365 ],
2366 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
2367 { # Metadata for a Spanner connector used by the job.
2368 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
2369 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
2370 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07002371 },
2372 ],
2373 },
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002374 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
2375 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
Bu Sun Kim65020912020-05-20 12:08:20 -07002376 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
2377 # snapshot.
Bu Sun Kim65020912020-05-20 12:08:20 -07002378 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
2379 # A description of the user pipeline and stages through which it is executed.
2380 # Created by Cloud Dataflow service. Only retrieved with
2381 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
2382 # form. This data is provided by the Dataflow service for ease of visualizing
2383 # the pipeline and interpreting Dataflow provided metrics.
2384 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
2385 { # Description of the composing transforms, names/ids, and input/outputs of a
2386 # stage of execution. Some composing transforms and sources may have been
2387 # generated by the Dataflow service during execution planning.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002388 &quot;outputSource&quot;: [ # Output sources for this stage.
2389 { # Description of an input or output of an execution stage.
2390 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
2391 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
2392 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
2393 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
2394 # source is most closely associated.
2395 },
2396 ],
2397 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
2398 &quot;inputSource&quot;: [ # Input sources for this stage.
2399 { # Description of an input or output of an execution stage.
2400 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
2401 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
2402 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
2403 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
2404 # source is most closely associated.
2405 },
2406 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07002407 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
2408 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
2409 { # Description of a transform executed as part of an execution stage.
2410 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
2411 # most closely associated.
2412 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
2413 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
2414 },
2415 ],
2416 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
2417 { # Description of an interstitial value between transforms in an execution
2418 # stage.
2419 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
2420 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
2421 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
2422 # source is most closely associated.
2423 },
2424 ],
2425 &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
              },
            ],
            &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
              { # Description of the type, names/ids, and input/outputs for a transform.
                &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
                &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
                  &quot;A String&quot;,
                ],
                &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
                &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
                &quot;displayData&quot;: [ # Transform-specific display data.
                  { # Data provided with a pipeline or transform to provide descriptive info.
                    &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
                    &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
                    &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                        # language namespace (i.e. python module) which defines the display data.
                        # This allows a dax monitoring system to specially handle the data
                        # and perform custom rendering.
                    &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
                    &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                        # This is intended to be used as a label for the display data
                        # when viewed in a dax monitoring system.
                    &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                        # For example a java_class_name_value of com.mypackage.MyDoFn
                        # will be stored with MyDoFn as the short_str_value and
                        # com.mypackage.MyDoFn as the java_class_name value.
                        # short_str_value can be displayed and java_class_name_value
                        # will be displayed as a tooltip.
                    &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
                    &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
                    &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
                    &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
                    &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
                    &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
                  },
                ],
                &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
                  &quot;A String&quot;,
                ],
              },
            ],
            &quot;displayData&quot;: [ # Pipeline level display data.
              { # Data provided with a pipeline or transform to provide descriptive info.
                &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
                &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
                &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                    # language namespace (i.e. python module) which defines the display data.
                    # This allows a dax monitoring system to specially handle the data
                    # and perform custom rendering.
                &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
                &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                    # This is intended to be used as a label for the display data
                    # when viewed in a dax monitoring system.
                &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                    # For example a java_class_name_value of com.mypackage.MyDoFn
                    # will be stored with MyDoFn as the short_str_value and
                    # com.mypackage.MyDoFn as the java_class_name value.
                    # short_str_value can be displayed and java_class_name_value
                    # will be displayed as a tooltip.
                &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
                &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
                &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
                &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
                &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
                &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
              },
            ],
          },
          &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
              # of the job it replaced.
              #
              # When sending a `CreateJobRequest`, you can update a job by specifying it
              # here. The job named here is stopped, and its intermediate state is
              # transferred to this job.
          &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
              # for temporary storage. These temporary files will be
              # removed on job completion.
              # No duplicates are allowed.
              # No file patterns are supported.
              #
              # The supported files are:
              #
              # Google Cloud Storage:
              #
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
              #
              # Only one Job with a given name may exist in a project at any
              # given time. If a caller attempts to create a Job with the same
              # name as an already-existing Job, the attempt returns the
              # existing Job.
              #
              # The name must match the regular expression
              # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
          &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
              #
              # The top-level steps that constitute the entire job.
            { # Defines a particular step within a Cloud Dataflow job.
                #
                # A job consists of multiple steps, each of which performs some
                # specific operation as part of the overall job. Data is typically
                # passed from one step to another as part of the job.
                #
                # Here&#x27;s an example of a sequence of steps which together implement a
                # Map-Reduce job:
                #
                # * Read a collection of data from some source, parsing the
                #   collection&#x27;s elements.
                #
                # * Validate the elements.
                #
                # * Apply a user-defined function to map each element to some value
                #   and extract an element-specific key value.
                #
                # * Group elements with the same key into a single element with
                #   that key, transforming a multiply-keyed collection into a
                #   uniquely-keyed collection.
                #
                # * Write the elements out to some data sink.
                #
                # Note that the Cloud Dataflow service may be used to run many different
                # types of jobs, not just Map-Reduce.
              &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
                  # step with respect to all other steps in the Cloud Dataflow job.
              &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
              &quot;properties&quot;: { # Named properties associated with the step. Each kind of
                  # predefined step has its own required set of properties.
                  # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
                &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
              },
            },
          ],
          &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
              # `JOB_STATE_UPDATED`), this field contains the ID of that job.
          &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
              # isn&#x27;t contained in the submitted job.
            &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
              &quot;a_key&quot;: { # Contains information about how a particular
                  # google.dataflow.v1beta3.Step will be executed.
                &quot;stepName&quot;: [ # The steps associated with the execution stage.
                    # Note that stages may have several steps, and that a given step
                    # might be run by more than one stage.
                  &quot;A String&quot;,
                ],
              },
            },
          },
          &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
              #
              # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
              # specified.
              #
              # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
              # terminal state. After a job has reached a terminal state, no
              # further state updates may be made.
              #
              # This field may be mutated by the Cloud Dataflow service;
              # callers cannot mutate it.
          &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
              # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
              # contains this job.
          &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
              # Flexible resource scheduling jobs are started with some delay after job
              # creation, so start_time is unset before start and is updated when the
              # job is started by the Cloud Dataflow service. For other jobs, start_time
              # always equals create_time and is immutable and set by the Cloud Dataflow
              # service.
          &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
          &quot;labels&quot;: { # User-defined labels for this job.
              #
              # The labels map can contain no more than 64 entries. Entries of the labels
              # map are UTF8 strings that comply with the following restrictions:
              #
              # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
              # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
              # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
              #   size.
            &quot;a_key&quot;: &quot;A String&quot;,
          },
          &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
              # Cloud Dataflow service.
          &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
              #
              # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
              # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
              # also be used to directly set a job&#x27;s requested state to
              # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
              # job if it has not already reached a terminal state.
        },
      ],
      &quot;failedLocation&quot;: [ # Zero or more messages describing the [regional endpoints]
          # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
          # failed to respond.
        { # Indicates which [regional endpoint]
            # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed
            # to respond to a request for data.
          &quot;name&quot;: &quot;A String&quot;, # The name of the [regional endpoint]
              # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
              # failed to respond.
        },
      ],
      &quot;nextPageToken&quot;: &quot;A String&quot;, # Set if there may be more results than fit in this response.
    }</pre>
</div>

<div class="method">
    <code class="details" id="list_next">list_next(previous_request, previous_response)</code>
  <pre>Retrieves the next page of results.

Args:
  previous_request: The request for the previous page. (required)
  previous_response: The response from the request for the previous page. (required)

Returns:
  A request object that you can call &#x27;execute()&#x27; on to request the next
  page. Returns None if there are no more items in the collection.
  </pre>
</div>
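For orientation, here is a minimal sketch (not part of the generated reference) of draining every result page with the list/list_next convention; jobs_api stands in for the object returned by dataflow.projects().locations().jobs(), an assumption for illustration.

```python
# Sketch: iterate every job across all result pages using the
# list()/list_next() convention of google-api-python-client.
# "jobs_api" is assumed to be the resource object returned by
# dataflow.projects().locations().jobs().
def iter_all_jobs(jobs_api, project_id, location):
    """Yield each job dict from every page of results."""
    request = jobs_api.list(projectId=project_id, location=location)
    while request is not None:
        response = request.execute()
        for job in response.get("jobs", []):
            yield job
        # list_next returns None once nextPageToken is absent.
        request = jobs_api.list_next(previous_request=request,
                                     previous_response=response)
```

This keeps pagination state out of the caller entirely; the loop terminates because list_next returns None when the response carries no nextPageToken.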

<div class="method">
    <code class="details" id="snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</code>
  <pre>Snapshot the state of a streaming job.

Args:
  projectId: string, The project which owns the job to be snapshotted. (required)
  location: string, The location that contains this job. (required)
  jobId: string, The job to be snapshotted. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request to create a snapshot of a job.
    &quot;description&quot;: &quot;A String&quot;, # User-specified description of the snapshot. May be empty.
    &quot;snapshotSources&quot;: True or False, # If true, perform snapshots for sources which support this.
    &quot;ttl&quot;: &quot;A String&quot;, # TTL for the snapshot.
    &quot;location&quot;: &quot;A String&quot;, # The location that contains this job.
  }

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Represents a snapshot of a job.
      &quot;pubsubMetadata&quot;: [ # PubSub snapshot metadata.
        { # Represents a Pubsub snapshot.
          &quot;snapshotName&quot;: &quot;A String&quot;, # The name of the Pubsub snapshot.
          &quot;topicName&quot;: &quot;A String&quot;, # The name of the Pubsub topic.
          &quot;expireTime&quot;: &quot;A String&quot;, # The expire time of the Pubsub snapshot.
        },
      ],
      &quot;creationTime&quot;: &quot;A String&quot;, # The time this snapshot was created.
      &quot;sourceJobId&quot;: &quot;A String&quot;, # The job this snapshot was created from.
      &quot;state&quot;: &quot;A String&quot;, # State of the snapshot.
      &quot;projectId&quot;: &quot;A String&quot;, # The project this snapshot belongs to.
      &quot;ttl&quot;: &quot;A String&quot;, # The time after which this snapshot will be automatically deleted.
      &quot;id&quot;: &quot;A String&quot;, # The unique ID of this snapshot.
      &quot;description&quot;: &quot;A String&quot;, # User-specified description of the snapshot. May be empty.
      &quot;diskSizeBytes&quot;: &quot;A String&quot;, # The disk byte size of the snapshot. Only available for snapshots in READY
          # state.
    }</pre>
</div>
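A hedged sketch (not part of the generated reference) of assembling the SnapshotJobRequest body documented above before passing it as the body= argument of snapshot(); the TTL duration string is an illustrative placeholder.

```python
# Sketch: build a SnapshotJobRequest body matching the schema above.
# The ttl value ("604800s", i.e. 7 days) and location are placeholders.
def make_snapshot_body(description, ttl="604800s",
                       snapshot_sources=True, location="us-central1"):
    """Return a dict suitable for the body= argument of snapshot()."""
    return {
        "description": description,
        "snapshotSources": snapshot_sources,
        "ttl": ttl,
        "location": location,
    }
```

Keeping the body construction in one helper makes it easy to validate the field set in tests before issuing the request.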

<div class="method">
    <code class="details" id="update">update(projectId, location, jobId, body=None, x__xgafv=None)</code>
  <pre>Updates the state of an existing Cloud Dataflow job.

To update the state of an existing job, we recommend using
`projects.locations.jobs.update` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.update` is not recommended, as you can only update the state
of jobs that are running in `us-central1`.

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  jobId: string, The job ID. (required)
  body: object, The request body.
    The object takes the form of:

{ # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME} to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language agnostic and platform
          # independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}
                  #   bucket.storage.googleapis.com/
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines a SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                  # container instance with this image. If false (or unset) recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that Dataflow service may choose to override this property
                  # if needed.
            },
          ],
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine Disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
            &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                # temporary storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
            &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                # console.
            &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
            &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
              &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                  # storage.
                  #
                  # The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}/{object}
                  #   bucket.storage.googleapis.com/{object}
              &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
              &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                  #
                  # When workers access Google Cloud APIs, they logically do so via
                  # relative URLs. If this field is specified, it supplies the base
                  # URL to use for resolving these relative URLs. The normative
                  # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                  # Locators&quot;.
                  #
                  # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
              &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                  # &quot;dataflow/v1b3/projects&quot;.
              &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                  # &quot;shuffle/v1beta1&quot;.
              &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
2964 },
2965 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
2966 # taskrunner; e.g. &quot;root&quot;.
2967 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
2968 },
2969 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
2970 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
2971 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
2972 },
2973 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
2974 &quot;a_key&quot;: &quot;A String&quot;,
2975 },
2976 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
2977 # select a default set of packages which are useful to worker
2978 # harnesses written in a particular language.
2979 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
2980 # the service will use the network &quot;default&quot;.
2981 },
2982 ],
2983 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
2984 # related tables are stored.
2985 #
2986 # The supported resource type is:
2987 #
2988 # Google BigQuery:
2989 # bigquery.googleapis.com/{dataset}
  },
  &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
    { # A message describing the state of a particular execution stage.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
      &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
    },
  ],
  &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
      # by the metadata values provided here. Populated for ListJobs and all GetJob
      # views SUMMARY and higher.
      # ListJob response and Job SUMMARY view.
    &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
      { # Metadata for a Datastore connector used by the job.
        &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
      },
    ],
    &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
      &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
      &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
    },
    &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
      { # Metadata for a BigQuery connector used by the job.
        &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
        &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
        &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
      },
    ],
    &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
      { # Metadata for a File connector used by the job.
        &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
      },
    ],
    &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
      { # Metadata for a PubSub connector used by the job.
        &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
      },
    ],
    &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
      { # Metadata for a BigTable connector used by the job.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
      },
    ],
    &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
      { # Metadata for a Spanner connector used by the job.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
      },
    ],
  },
  &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
  &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
  &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
      # snapshot.
  &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
      # A description of the user pipeline and stages through which it is executed.
      # Created by Cloud Dataflow service. Only retrieved with
      # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
      # form. This data is provided by the Dataflow service for ease of visualizing
      # the pipeline and interpreting Dataflow provided metrics.
    &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
      { # Description of the composing transforms, names/ids, and input/outputs of a
          # stage of execution. Some composing transforms and sources may have been
          # generated by the Dataflow service during execution planning.
        &quot;outputSource&quot;: [ # Output sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
        &quot;inputSource&quot;: [ # Input sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
        &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
          { # Description of a transform executed as part of an execution stage.
            &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                # most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
          },
        ],
        &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
          { # Description of an interstitial value between transforms in an execution
              # stage.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
      },
    ],
    &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
      { # Description of the type, names/ids, and input/outputs for a transform.
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
        &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
          &quot;A String&quot;,
        ],
        &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
        &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
        &quot;displayData&quot;: [ # Transform-specific display data.
          { # Data provided with a pipeline or transform to provide descriptive info.
            &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
            &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
            &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                # language namespace (i.e. python module) which defines the display data.
                # This allows a dax monitoring system to specially handle the data
                # and perform custom rendering.
            &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
            &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                # This is intended to be used as a label for the display data
                # when viewed in a dax monitoring system.
            &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                # For example a java_class_name_value of com.mypackage.MyDoFn
                # will be stored with MyDoFn as the short_str_value and
                # com.mypackage.MyDoFn as the java_class_name value.
                # short_str_value can be displayed and java_class_name_value
                # will be displayed as a tooltip.
            &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
            &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
            &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
            &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
            &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          },
        ],
        &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
          &quot;A String&quot;,
        ],
      },
    ],
    &quot;displayData&quot;: [ # Pipeline level display data.
      { # Data provided with a pipeline or transform to provide descriptive info.
        &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
        &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
        &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
            # language namespace (i.e. python module) which defines the display data.
            # This allows a dax monitoring system to specially handle the data
            # and perform custom rendering.
        &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
        &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
            # This is intended to be used as a label for the display data
            # when viewed in a dax monitoring system.
        &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
            # For example a java_class_name_value of com.mypackage.MyDoFn
            # will be stored with MyDoFn as the short_str_value and
            # com.mypackage.MyDoFn as the java_class_name value.
            # short_str_value can be displayed and java_class_name_value
            # will be displayed as a tooltip.
        &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
        &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
        &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
        &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
        &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
      },
    ],
  },
  &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
      # of the job it replaced.
      #
      # When sending a `CreateJobRequest`, you can update a job by specifying it
      # here. The job named here is stopped, and its intermediate state is
      # transferred to this job.
  &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
      # for temporary storage. These temporary files will be
      # removed on job completion.
      # No duplicates are allowed.
      # No file patterns are supported.
      #
      # The supported files are:
      #
      # Google Cloud Storage:
      #
      # storage.googleapis.com/{bucket}/{object}
      # bucket.storage.googleapis.com/{object}
    &quot;A String&quot;,
  ],
  &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
      #
      # Only one Job with a given name may exist in a project at any
      # given time. If a caller attempts to create a Job with the same
      # name as an already-existing Job, the attempt returns the
      # existing Job.
      #
      # The name must match the regular expression
      # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
  &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
      #
      # The top-level steps that constitute the entire job.
    { # Defines a particular step within a Cloud Dataflow job.
        #
        # A job consists of multiple steps, each of which performs some
        # specific operation as part of the overall job. Data is typically
        # passed from one step to another as part of the job.
        #
        # Here&#x27;s an example of a sequence of steps which together implement a
        # Map-Reduce job:
        #
        # * Read a collection of data from some source, parsing the
        # collection&#x27;s elements.
        #
        # * Validate the elements.
        #
        # * Apply a user-defined function to map each element to some value
        # and extract an element-specific key value.
        #
        # * Group elements with the same key into a single element with
        # that key, transforming a multiply-keyed collection into a
        # uniquely-keyed collection.
        #
        # * Write the elements out to some data sink.
        #
        # Note that the Cloud Dataflow service may be used to run many different
        # types of jobs, not just Map-Reduce.
      &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
          # step with respect to all other steps in the Cloud Dataflow job.
      &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
      &quot;properties&quot;: { # Named properties associated with the step. Each kind of
          # predefined step has its own required set of properties.
          # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
    },
  ],
  &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
      # `JOB_STATE_UPDATED`), this field contains the ID of that job.
  &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
      # isn&#x27;t contained in the submitted job.
    &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
      &quot;a_key&quot;: { # Contains information about how a particular
          # google.dataflow.v1beta3.Step will be executed.
        &quot;stepName&quot;: [ # The steps associated with the execution stage.
            # Note that stages may have several steps, and that a given step
            # might be run by more than one stage.
          &quot;A String&quot;,
        ],
      },
    },
  },
  &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
      #
      # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
      # specified.
      #
      # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
      # terminal state. After a job has reached a terminal state, no
      # further state updates may be made.
      #
      # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
  &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
      # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
      # contains this job.
  &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
      # Flexible resource scheduling jobs are started with some delay after job
      # creation, so start_time is unset before start and is updated when the
      # job is started by the Cloud Dataflow service. For other jobs, start_time
      # always equals create_time and is immutable and set by the Cloud Dataflow
      # service.
  &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
  &quot;labels&quot;: { # User-defined labels for this job.
      #
      # The labels map can contain no more than 64 entries. Entries of the labels
      # map are UTF8 strings that comply with the following restrictions:
      #
      # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
      # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
      # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
      # size.
    &quot;a_key&quot;: &quot;A String&quot;,
  },
  &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
      # Cloud Dataflow service.
  &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
      #
      # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
      # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
      # also be used to directly set a job&#x27;s requested state to
      # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
      # job if it has not already reached a terminal state.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
Bu Sun Kim65020912020-05-20 12:08:20 -07003302 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
3303 # If this field is set, the service will ensure its uniqueness.
3304 # The request to create a job will fail if the service has knowledge of a
3305 # previously submitted job with the same client&#x27;s ID and job name.
3306 # The caller may use this field to ensure idempotence of job
3307 # creation across retried attempts to create a job.
3308 # By default, the field is empty and, in that case, the service ignores it.
3309 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003310 #
3311 # This field is set by the Cloud Dataflow service when the Job is
3312 # created, and is immutable for the life of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003313 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
3314 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04003315 # corresponding name prefixes of the new job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003316 &quot;a_key&quot;: &quot;A String&quot;,
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08003317 },
Bu Sun Kim65020912020-05-20 12:08:20 -07003318 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003319 &quot;internalExperiments&quot;: { # Experimental settings.
3320 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
3321 },
3322 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
3323 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
3324 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
3325 # with worker_zone. If neither worker_region nor worker_zone is specified,
3326 # default to the control plane&#x27;s region.
3327 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
3328 # at rest, AKA a Customer Managed Encryption Key (CMEK).
3329 #
3330 # Format:
3331 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
3332 &quot;userAgent&quot;: { # A description of the process that generated the request.
3333 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
3334 },
3335 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
3336 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
3337 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
3338 # with worker_region. If neither worker_region nor worker_zone is specified,
3339 # a zone in the control plane&#x27;s region is chosen based on available capacity.
3340 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -07003341 # unspecified, the service will attempt to choose a reasonable
3342 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -07003343 # e.g. &quot;compute.googleapis.com&quot;.
3344 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
3345 # storage. The system will append the suffix &quot;/temp-{JOBNAME} to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003346 # this resource prefix, where {JOBNAME} is the value of the
3347 # job_name field. The resulting bucket and object prefix is used
3348 # as the prefix of the resources used to store temporary data
3349 # needed during the job execution. NOTE: This will override the
3350 # value in taskrunner_settings.
3351 # The supported resource type is:
3352 #
3353 # Google Cloud Storage:
3354 #
3355 # storage.googleapis.com/{bucket}/{object}
3356 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07003357 &quot;experiments&quot;: [ # The list of experiments to enable.
3358 &quot;A String&quot;,
3359 ],
3360 &quot;version&quot;: { # A structure describing which components and their versions of the service
3361 # are required in order to run the job.
3362 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
3363 },
3364 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07003365 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
3366 # options are passed through the service and are used to recreate the
3367 # SDK pipeline options on the worker in a language agnostic and platform
3368 # independent way.
3369 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
3370 },
3371 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
3372 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
3373 # specified in order for the job to have workers.
3374 { # Describes one particular pool of Cloud Dataflow workers to be
3375 # instantiated by the Cloud Dataflow service in order to perform the
3376 # computations required by a job. Note that a workflow job may use
3377 # multiple pools, in order to match the various computational
3378 # requirements of the various stages of the job.
3379 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
3380 # service will choose a number of threads (according to the number of cores
3381 # on the selected machine type for batch, or 1 by convention for streaming).
3382 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
3383 # execute the job. If zero or unspecified, the service will
3384 # attempt to choose a reasonable default.
3385 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
3386 # will attempt to choose a reasonable default.
3387 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
3388 &quot;packages&quot;: [ # Packages to be installed on workers.
3389 { # The packages that must be installed in order for a worker to run the
3390 # steps of the Cloud Dataflow job that will be assigned to its worker
3391 # pool.
3392 #
3393 # This is the mechanism by which the Cloud Dataflow SDK causes code to
3394 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
3395 # might use this to install jars containing the user&#x27;s code and all of the
3396 # various dependencies (libraries, data files, etc.) required in order
3397 # for that code to run.
3398 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
3399 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
3400 #
3401 # Google Cloud Storage:
3402 #
3403 # storage.googleapis.com/{bucket}
3404 # bucket.storage.googleapis.com/
3405 },
3406 ],
        &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
            # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
            # `TEARDOWN_NEVER`.
            # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
            # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
            # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
            # down.
            #
            # If the workers are not torn down by the service, they will
            # continue to run and use Google Compute Engine VM resources in the
            # user&#x27;s project until they are explicitly terminated by the user.
            # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
            # policy except for small, manually supervised test jobs.
            #
            # If unknown or unspecified, the service will attempt to choose a reasonable
            # default.
        &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
            # Compute Engine API.
        &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
        },
        &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
            # harness, residing in Google Container Registry.
            #
            # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
        &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
            # service will attempt to choose a reasonable default.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
            # are supported.
        &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
            # only be set in the Fn API path. For non-cross-language pipelines this
            # should have only one entry. Cross-language pipelines will have two or more
            # entries.
          { # Defines an SDK harness container for executing Dataflow pipelines.
            &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
            &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                # container instance with this image. If false (or unset), recommends using
                # more than one core per SDK container instance with this image for
                # efficiency. Note that the Dataflow service may choose to override this property
                # if needed.
          },
        ],
        &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
          { # Describes the data disk used by a workflow job.
            &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                # must be a disk type appropriate to the project and zone in which
                # the workers will run. If unknown or unspecified, the service
                # will attempt to choose a reasonable default.
                #
                # For example, the standard persistent disk type is a resource name
                # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                # actual valid values are defined by the Google Compute Engine API,
                # not by the Cloud Dataflow API; consult the Google Compute Engine
                # documentation for more information about determining the set of
                # available disk types for a particular project and zone.
                #
                # Google Compute Engine Disk types are local to a particular
                # project in a particular zone, and so the resource name will
                # typically look something like this:
                #
                #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
            &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
          },
        ],
        &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
            # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
        &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
        &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
            # using the standard Dataflow task runner. Users should ignore
            # this field.
          &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
          &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;wheel&quot;.
          &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
          &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
          &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
              # access the Cloud Dataflow API.
            &quot;A String&quot;,
          ],
          &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
          &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
              # will not be uploaded.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
          &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
          &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
          &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
          &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
              # temporary storage.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
              #
              # When workers access Google Cloud APIs, they logically do so via
              # relative URLs. If this field is specified, it supplies the base
              # URL to use for resolving these relative URLs. The normative
              # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
              # Locators&quot;.
              #
              # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
          &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to the Google Compute Engine VM serial
              # console.
          &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
          &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                # storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
            &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                # &quot;dataflow/v1b3/projects&quot;.
            &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                # &quot;shuffle/v1beta1&quot;.
            &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
          },
          &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;root&quot;.
          &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
        },
3557 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
3558 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
3559 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
3560 },
3561 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
3562 &quot;a_key&quot;: &quot;A String&quot;,
3563 },
3564 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
3565 # select a default set of packages which are useful to worker
3566 # harnesses written in a particular language.
3567 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
3568 # the service will use the network &quot;default&quot;.
3569 },
3570 ],
3571 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
3572 # related tables are stored.
3573 #
3574 # The supported resource type is:
3575 #
3576 # Google BigQuery:
3577 # bigquery.googleapis.com/{dataset}
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08003578 },
  &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
    { # A message describing the state of a particular execution stage.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
      &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
    },
  ],
  &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
      # by the metadata values provided here. Populated for ListJobs and all GetJob
      # views SUMMARY and higher.
      # ListJob response and Job SUMMARY view.
    &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
      { # Metadata for a Datastore connector used by the job.
        &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
      },
    ],
    &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
      &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
      &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
    },
    &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
      { # Metadata for a BigQuery connector used by the job.
        &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
        &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
        &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
      },
    ],
    &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
      { # Metadata for a File connector used by the job.
        &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
      },
    ],
    &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
      { # Metadata for a PubSub connector used by the job.
        &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
      },
    ],
    &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
      { # Metadata for a BigTable connector used by the job.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
      },
    ],
    &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
      { # Metadata for a Spanner connector used by the job.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
      },
    ],
  },
  &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
  &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
  &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
      # snapshot.
  &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
      # A description of the user pipeline and stages through which it is executed.
      # Created by Cloud Dataflow service. Only retrieved with
      # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
      # form. This data is provided by the Dataflow service for ease of visualizing
      # the pipeline and interpreting Dataflow provided metrics.
    &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
      { # Description of the composing transforms, names/ids, and input/outputs of a
          # stage of execution. Some composing transforms and sources may have been
          # generated by the Dataflow service during execution planning.
        &quot;outputSource&quot;: [ # Output sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
        &quot;inputSource&quot;: [ # Input sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
        &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
          { # Description of a transform executed as part of an execution stage.
            &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                # most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
          },
        ],
        &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
          { # Description of an interstitial value between transforms in an execution
              # stage.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
      },
    ],
    &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
      { # Description of the type, names/ids, and input/outputs for a transform.
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
        &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
          &quot;A String&quot;,
        ],
        &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
        &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
        &quot;displayData&quot;: [ # Transform-specific display data.
          { # Data provided with a pipeline or transform to provide descriptive info.
            &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
            &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
            &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                # language namespace (e.g. a Python module) which defines the display data.
                # This allows a dax monitoring system to specially handle the data
                # and perform custom rendering.
            &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
            &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                # This is intended to be used as a label for the display data
                # when viewed in a dax monitoring system.
            &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                # For example, a java_class_name_value of com.mypackage.MyDoFn
                # will be stored with MyDoFn as the short_str_value and
                # com.mypackage.MyDoFn as the java_class_name value.
                # short_str_value can be displayed and java_class_name_value
                # will be displayed as a tooltip.
            &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
            &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
            &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
            &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
            &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          },
        ],
        &quot;outputCollectionName&quot;: [ # User names for all collection outputs of this transform.
          &quot;A String&quot;,
        ],
      },
    ],
    &quot;displayData&quot;: [ # Pipeline level display data.
      { # Data provided with a pipeline or transform to provide descriptive info.
        &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
        &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
        &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
            # language namespace (e.g. a Python module) which defines the display data.
            # This allows a dax monitoring system to specially handle the data
            # and perform custom rendering.
        &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
        &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
            # This is intended to be used as a label for the display data
            # when viewed in a dax monitoring system.
        &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
            # For example, a java_class_name_value of com.mypackage.MyDoFn
            # will be stored with MyDoFn as the short_str_value and
            # com.mypackage.MyDoFn as the java_class_name value.
            # short_str_value can be displayed and java_class_name_value
            # will be displayed as a tooltip.
        &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
        &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
        &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
        &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
        &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
      },
    ],
  },
  &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
      # of the job it replaced.
      #
      # When sending a `CreateJobRequest`, you can update a job by specifying it
      # here. The job named here is stopped, and its intermediate state is
      # transferred to this job.
  &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
      # for temporary storage. These temporary files will be
      # removed on job completion.
      # No duplicates are allowed.
      # No file patterns are supported.
      #
      # The supported files are:
      #
      # Google Cloud Storage:
      #
      #   storage.googleapis.com/{bucket}/{object}
      #   bucket.storage.googleapis.com/{object}
    &quot;A String&quot;,
  ],
  &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
      #
      # Only one Job with a given name may exist in a project at any
      # given time. If a caller attempts to create a Job with the same
      # name as an already-existing Job, the attempt returns the
      # existing Job.
      #
      # The name must match the regular expression
      # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
  &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
      #
      # The top-level steps that constitute the entire job.
    { # Defines a particular step within a Cloud Dataflow job.
        #
        # A job consists of multiple steps, each of which performs some
        # specific operation as part of the overall job. Data is typically
        # passed from one step to another as part of the job.
        #
        # Here&#x27;s an example of a sequence of steps which together implement a
        # Map-Reduce job:
        #
        # * Read a collection of data from some source, parsing the
        #   collection&#x27;s elements.
        #
        # * Validate the elements.
        #
        # * Apply a user-defined function to map each element to some value
        #   and extract an element-specific key value.
        #
        # * Group elements with the same key into a single element with
        #   that key, transforming a multiply-keyed collection into a
        #   uniquely-keyed collection.
        #
        # * Write the elements out to some data sink.
        #
        # Note that the Cloud Dataflow service may be used to run many different
        # types of jobs, not just Map-Reduce.
      &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
          # step with respect to all other steps in the Cloud Dataflow job.
      &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
      &quot;properties&quot;: { # Named properties associated with the step. Each kind of
          # predefined step has its own required set of properties.
          # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
    },
  ],
  &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
      # `JOB_STATE_UPDATED`), this field contains the ID of that job.
  &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
      # isn&#x27;t contained in the submitted job.
    &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
      &quot;a_key&quot;: { # Contains information about how a particular
          # google.dataflow.v1beta3.Step will be executed.
        &quot;stepName&quot;: [ # The steps associated with the execution stage.
            # Note that stages may have several steps, and that a given step
            # might be run by more than one stage.
          &quot;A String&quot;,
        ],
      },
    },
  },
  &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
      #
      # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
      # specified.
      #
      # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
      # terminal state. After a job has reached a terminal state, no
      # further state updates may be made.
      #
      # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
  &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
      # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
      # contains this job.
  &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
      # Flexible resource scheduling jobs are started with some delay after job
      # creation, so start_time is unset before start and is updated when the
      # job is started by the Cloud Dataflow service. For other jobs, start_time
      # always equals create_time and is immutable and set by the Cloud Dataflow
      # service.
  &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
  &quot;labels&quot;: { # User-defined labels for this job.
      #
      # The labels map can contain no more than 64 entries. Entries of the labels
      # map are UTF8 strings that comply with the following restrictions:
      #
      # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
      # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
      # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
      #   size.
    &quot;a_key&quot;: &quot;A String&quot;,
  },
  &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
      # Cloud Dataflow service.
  &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
      #
      # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
      # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
      # also be used to directly set a job&#x27;s requested state to
      # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
      # job if it has not already reached a terminal state.
}</pre>
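<p>The schema above can be used to build a Job request body by hand. The sketch below, a minimal example rather than a service-recommended configuration, validates the documented job-name regular expression and assembles a small body with the <code>name</code>, <code>type</code>, and <code>environment.workerPools</code> fields shown above; the default values passed to <code>make_job_body</code> are illustrative assumptions.</p>
<pre>
```python
import re

# Job names must match the regular expression documented above:
# [a-z]([-a-z0-9]{0,38}[a-z0-9])?
JOB_NAME_RE = re.compile(r"^[a-z]([-a-z0-9]{0,38}[a-z0-9])?$")


def make_job_body(name, num_workers=3, machine_type="n1-standard-1"):
    """Builds a minimal Job request body using fields from the schema above.

    Field names come from the schema; the specific default values here are
    illustrative only.
    """
    if not JOB_NAME_RE.match(name):
        raise ValueError("invalid Dataflow job name: %r" % name)
    return {
        "name": name,
        "type": "JOB_TYPE_BATCH",
        "environment": {
            "workerPools": [
                {
                    "kind": "harness",
                    "numWorkers": num_workers,
                    "machineType": machine_type,
                    # TEARDOWN_ALWAYS is the policy the docs recommend
                    # except for small, manually supervised test jobs.
                    "teardownPolicy": "TEARDOWN_ALWAYS",
                }
            ],
        },
    }
```
</pre>
<p>Passing a name such as <code>&quot;WordCount&quot;</code> (uppercase) or one starting with a hyphen raises <code>ValueError</code> before any request is made, which surfaces naming mistakes locally instead of as an API error.</p>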
</div>
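<p>To read the <code>currentState</code> field documented above, a caller typically fetches the Job through the generated client (<code>googleapiclient.discovery.build(&quot;dataflow&quot;, &quot;v1b3&quot;)</code>) and the <code>projects().locations().jobs().get(...)</code> chain this page describes. The helper below is a sketch: the <code>service</code> argument is assumed to be an authorized client object, and the set of terminal states is an assumption drawn from the state names used in this reference, not an exhaustive list.</p>
<pre>
```python
# Terminal JobState values; this set is an assumption based on the state
# names mentioned in this reference and may not be exhaustive.
TERMINAL_STATES = {
    "JOB_STATE_DONE",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_UPDATED",
}


def get_current_state(service, project_id, location, job_id):
    """Fetches a Job and returns (currentState, is_terminal).

    `service` is expected to behave like the client returned by
    googleapiclient.discovery.build("dataflow", "v1b3").
    """
    job = (
        service.projects()
        .locations()
        .jobs()
        .get(projectId=project_id, location=location, jobId=job_id)
        .execute()
    )
    state = job.get("currentState", "JOB_STATE_UNKNOWN")
    return state, state in TERMINAL_STATES
```
</pre>
<p>Because the function only depends on the <code>projects().locations().jobs().get(...).execute()</code> call chain, it can be exercised against a stub in tests and against the real client in production.</p>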

</body></html>