<html><body>
<style>

body, h1, h2, h3, div, span, p, pre, a {
  margin: 0;
  padding: 0;
  border: 0;
  font-weight: inherit;
  font-style: inherit;
  font-size: 100%;
  font-family: inherit;
  vertical-align: baseline;
}

body {
  font-size: 13px;
  padding: 1em;
}

h1 {
  font-size: 26px;
  margin-bottom: 1em;
}

h2 {
  font-size: 24px;
  margin-bottom: 1em;
}

h3 {
  font-size: 20px;
  margin-bottom: 1em;
  margin-top: 1em;
}

pre, code {
  line-height: 1.5;
  font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}

pre {
  margin-top: 0.5em;
}

h1, h2, h3, p {
  font-family: Arial, sans-serif;
}

h1, h2, h3 {
  border-bottom: solid #CCC 1px;
}

.toc_element {
  margin-top: 0.5em;
}

.firstline {
  margin-left: 2em;
}

.method {
  margin-top: 1em;
  border: solid 1px #CCC;
  padding: 1em;
  background: #EEE;
}

.details {
  font-weight: bold;
  font-size: 14px;
}

</style>

<h1><a href="dataflow_v1b3.html">Dataflow API</a> . <a href="dataflow_v1b3.projects.html">projects</a> . <a href="dataflow_v1b3.projects.jobs.html">jobs</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.jobs.debug.html">debug()</a></code>
</p>
<p class="firstline">Returns the debug Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.jobs.messages.html">messages()</a></code>
</p>
<p class="firstline">Returns the messages Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.jobs.workItems.html">workItems()</a></code>
</p>
<p class="firstline">Returns the workItems Resource.</p>

<p class="toc_element">
  <code><a href="#aggregated">aggregated(projectId, filter=None, location=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">List the jobs of a project across all regions.</p>
<p class="toc_element">
  <code><a href="#aggregated_next">aggregated_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
  <code><a href="#create">create(projectId, body=None, location=None, replaceJobId=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#get">get(projectId, jobId, view=None, location=None, x__xgafv=None)</a></code></p>
<p class="firstline">Gets the state of the specified Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#getMetrics">getMetrics(projectId, jobId, startTime=None, location=None, x__xgafv=None)</a></code></p>
<p class="firstline">Request the job status.</p>
<p class="toc_element">
  <code><a href="#list">list(projectId, filter=None, location=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">List the jobs of a project.</p>
<p class="toc_element">
  <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
  <code><a href="#snapshot">snapshot(projectId, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Snapshot the state of a streaming job.</p>
<p class="toc_element">
  <code><a href="#update">update(projectId, jobId, body=None, location=None, x__xgafv=None)</a></code></p>
<p class="firstline">Updates the state of an existing Cloud Dataflow job.</p>
<h3>Method Details</h3>
<div class="method">
    <code class="details" id="aggregated">aggregated(projectId, filter=None, location=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</code>
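<p>Only <code>projectId</code> is required in the signature above; the other parameters are optional keywords. A small sketch (the helper name and the example values are hypothetical, the parameter names are the documented ones) of assembling the optional arguments before calling <code>aggregated</code>:</p>

```python
def aggregated_kwargs(project_id, **optional):
    """Build keyword arguments for jobs.aggregated(), dropping unset options.

    Accepted optional keys mirror the signature above: filter, location,
    pageToken, pageSize, view, x__xgafv.
    """
    allowed = {"filter", "location", "pageToken", "pageSize", "view", "x__xgafv"}
    unknown = set(optional) - allowed
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    kwargs = {"projectId": project_id}
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs

# e.g. first page of up to 50 jobs across all regions:
kwargs = aggregated_kwargs("my-project", pageSize=50, view="JOB_VIEW_SUMMARY")
# then: request = dataflow.projects().jobs().aggregated(**kwargs)
```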
  <pre>List the jobs of a project across all regions.

Args:
  projectId: string, The project which owns the jobs. (required)
  filter: string, The kind of filter to use.
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job.
  pageToken: string, Set this to the &#x27;next_page_token&#x27; field of a previous response
to request additional results in a long list.
  pageSize: integer, If there are many jobs, limit response to at most this many.
The actual number of jobs returned will be the lesser of max_responses
and an unspecified server-defined limit.
  view: string, Level of information requested in response. Default is `JOB_VIEW_SUMMARY`.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Response to a request to list Cloud Dataflow jobs in a project. This might
      # be a partial response, depending on the page size in the ListJobsRequest.
      # However, if the project does not have any jobs, an instance of
      # ListJobsResponse is not returned and the request&#x27;s response
      # body is empty {}.
    &quot;jobs&quot;: [ # A subset of the requested job information.
      { # Defines a job to be run by the Cloud Dataflow service.
        &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
            # If this field is set, the service will ensure its uniqueness.
            # The request to create a job will fail if the service has knowledge of a
            # previously submitted job with the same client&#x27;s ID and job name.
            # The caller may use this field to ensure idempotence of job
            # creation across retried attempts to create a job.
            # By default, the field is empty and, in that case, the service ignores it.
        &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
            #
            # This field is set by the Cloud Dataflow service when the Job is
            # created, and is immutable for the life of the job.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
        &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
            # corresponding name prefixes of the new job.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
          &quot;internalExperiments&quot;: { # Experimental settings.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
              # with worker_zone. If neither worker_region nor worker_zone is specified,
              # default to the control plane&#x27;s region.
          &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
              # at rest, AKA a Customer Managed Encryption Key (CMEK).
              #
              # Format:
              #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
          &quot;userAgent&quot;: { # A description of the process that generated the request.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
              # with worker_region. If neither worker_region nor worker_zone is specified,
              # a zone in the control plane&#x27;s region is chosen based on available capacity.
          &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
              # unspecified, the service will attempt to choose a reasonable
              # default. This should be in the form of the API service name,
              # e.g. &quot;compute.googleapis.com&quot;.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
              # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
              # this resource prefix, where {JOBNAME} is the value of the
              # job_name field. The resulting bucket and object prefix is used
              # as the prefix of the resources used to store temporary data
              # needed during the job execution. NOTE: This will override the
              # value in taskrunner_settings.
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;experiments&quot;: [ # The list of experiments to enable.
            &quot;A String&quot;,
          ],
          &quot;version&quot;: { # A structure describing which components and their versions of the service
              # are required in order to run the job.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
          &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
              # options are passed through the service and are used to recreate the
              # SDK pipeline options on the worker in a language agnostic and platform
              # independent way.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
          &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
              # specified in order for the job to have workers.
            { # Describes one particular pool of Cloud Dataflow workers to be
                # instantiated by the Cloud Dataflow service in order to perform the
                # computations required by a job. Note that a workflow job may use
                # multiple pools, in order to match the various computational
                # requirements of the various stages of the job.
              &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
                  # service will choose a number of threads (according to the number of cores
                  # on the selected machine type for batch, or 1 by convention for streaming).
              &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
                  # execute the job. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
                  # will attempt to choose a reasonable default.
              &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
              &quot;packages&quot;: [ # Packages to be installed on workers.
                { # The packages that must be installed in order for a worker to run the
                    # steps of the Cloud Dataflow job that will be assigned to its worker
                    # pool.
                    #
                    # This is the mechanism by which the Cloud Dataflow SDK causes code to
                    # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                    # might use this to install jars containing the user&#x27;s code and all of the
                    # various dependencies (libraries, data files, etc.) required in order
                    # for that code to run.
                  &quot;name&quot;: &quot;A String&quot;, # The name of the package.
                  &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}
                      #   bucket.storage.googleapis.com/
                },
              ],
              &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
                  # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
                  # `TEARDOWN_NEVER`.
                  # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
                  # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
                  # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
                  # down.
                  #
                  # If the workers are not torn down by the service, they will
                  # continue to run and use Google Compute Engine VM resources in the
                  # user&#x27;s project until they are explicitly terminated by the user.
                  # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
                  # policy except for small, manually supervised test jobs.
                  #
                  # If unknown or unspecified, the service will attempt to choose a reasonable
                  # default.
              &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
                  # Compute Engine API.
              &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
                &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
              },
              &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
                  # harness, residing in Google Container Registry.
                  #
                  # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
              &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
                  # service will attempt to choose a reasonable default.
              &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
                  # are supported.
              &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
                  # only be set in the Fn API path. For non-cross-language pipelines this
                  # should have only one entry. Cross-language pipelines will have two or more
                  # entries.
                { # Defines an SDK harness container for executing Dataflow pipelines.
                  &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
                  &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                      # container instance with this image. If false (or unset) recommends using
                      # more than one core per SDK container instance with this image for
                      # efficiency. Note that Dataflow service may choose to override this property
                      # if needed.
                },
              ],
              &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
                { # Describes the data disk used by a workflow job.
                  &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                      # must be a disk type appropriate to the project and zone in which
                      # the workers will run. If unknown or unspecified, the service
                      # will attempt to choose a reasonable default.
                      #
                      # For example, the standard persistent disk type is a resource name
                      # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                      # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                      # actual valid values are defined by the Google Compute Engine API,
                      # not by the Cloud Dataflow API; consult the Google Compute Engine
                      # documentation for more information about determining the set of
                      # available disk types for a particular project and zone.
                      #
                      # Google Compute Engine Disk types are local to a particular
                      # project in a particular zone, and so the resource name will
                      # typically look something like this:
                      #
                      #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
                  &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                      # attempt to choose a reasonable default.
                  &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
                },
              ],
              &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
                  # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
              &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
              &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
                  # using the standard Dataflow task runner. Users should ignore
                  # this field.
                &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
                &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. &quot;wheel&quot;.
                &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
                &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
                &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                    # access the Cloud Dataflow API.
                  &quot;A String&quot;,
                ],
                &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;
                &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                    # will not be uploaded.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
                &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
                &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
                &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
                &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
                &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
                &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                    # temporary storage.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
                &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                    #
                    # When workers access Google Cloud APIs, they logically do so via
                    # relative URLs. If this field is specified, it supplies the base
                    # URL to use for resolving these relative URLs. The normative
                    # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                    # Locators&quot;.
                    #
                    # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
                &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                    # console.
                &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
                &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
                  &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                      # storage.
                      #
                      # The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}/{object}
                      #   bucket.storage.googleapis.com/{object}
                  &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
                  &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                      #
                      # When workers access Google Cloud APIs, they logically do so via
                      # relative URLs. If this field is specified, it supplies the base
                      # URL to use for resolving these relative URLs. The normative
                      # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                      # Locators&quot;.
                      #
                      # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
                  &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                      # &quot;dataflow/v1b3/projects&quot;.
                  &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                      # &quot;shuffle/v1beta1&quot;.
                  &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
                },
                &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. &quot;root&quot;.
                &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
              },
              &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
                &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
                &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
              },
              &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
                &quot;a_key&quot;: &quot;A String&quot;,
              },
              &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
                  # select a default set of packages which are useful to worker
                  # harnesses written in a particular language.
              &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
                  # the service will use the network &quot;default&quot;.
            },
          ],
          &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
              # related tables are stored.
              #
              # The supported resource type is:
              #
              # Google BigQuery:
              #   bigquery.googleapis.com/{dataset}
        },
        &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
          { # A message describing the state of a particular execution stage.
            &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
            &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
            &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
          },
        ],
        &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
            # by the metadata values provided here. Populated for ListJobs and all GetJob
            # views SUMMARY and higher.
            # ListJob response and Job SUMMARY view.
          &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
            { # Metadata for a Datastore connector used by the job.
              &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
            },
          ],
          &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
            &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
            &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
            &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
          },
          &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
            { # Metadata for a BigQuery connector used by the job.
              &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
              &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
              &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
            },
          ],
          &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
            { # Metadata for a File connector used by the job.
              &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
            },
          ],
          &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
            { # Metadata for a PubSub connector used by the job.
              &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
              &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
            },
          ],
          &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
            { # Metadata for a BigTable connector used by the job.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
              &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
            },
          ],
          &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
            { # Metadata for a Spanner connector used by the job.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
            },
          ],
        },
        &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
        &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
        &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
            # snapshot.
        &quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
            # A description of the user pipeline and stages through which it is executed.
            # Created by Cloud Dataflow service. Only retrieved with
            # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
            # form. This data is provided by the Dataflow service for ease of visualizing
            # the pipeline and interpreting Dataflow provided metrics.
          &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
            { # Description of the composing transforms, names/ids, and input/outputs of a
                # stage of execution. Some composing transforms and sources may have been
                # generated by the Dataflow service during execution planning.
              &quot;outputSource&quot;: [ # Output sources for this stage.
                { # Description of an input or output of an execution stage.
                  &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
              &quot;inputSource&quot;: [ # Input sources for this stage.
                { # Description of an input or output of an execution stage.
                  &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
              &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
                { # Description of a transform executed as part of an execution stage.
                  &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                      # most closely associated.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
                },
              ],
              &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
                { # Description of an interstitial value between transforms in an execution
                    # stage.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
536       &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
537     },
538 ],
539 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
540 { # Description of the type, names/ids, and input/outputs for a transform.
541 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
542 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
543 &quot;A String&quot;,
544 ],
545 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
546 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
547 &quot;displayData&quot;: [ # Transform-specific display data.
548 { # Data provided with a pipeline or transform to provide descriptive info.
549           &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
550           &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
551           &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
552 # language namespace (i.e. python module) which defines the display data.
553 # This allows a dax monitoring system to specially handle the data
554 # and perform custom rendering.
555 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
556 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
557 # This is intended to be used as a label for the display data
558 # when viewed in a dax monitoring system.
559 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
560 # For example a java_class_name_value of com.mypackage.MyDoFn
561 # will be stored with MyDoFn as the short_str_value and
562 # com.mypackage.MyDoFn as the java_class_name value.
563 # short_str_value can be displayed and java_class_name_value
564 # will be displayed as a tooltip.
565 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
566 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
567           &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
568 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
569 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
570 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
571         },
572 ],
573 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
574 &quot;A String&quot;,
575 ],
576 },
577 ],
578 &quot;displayData&quot;: [ # Pipeline level display data.
579 { # Data provided with a pipeline or transform to provide descriptive info.
580       &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
581       &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
582       &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
583 # language namespace (i.e. python module) which defines the display data.
584 # This allows a dax monitoring system to specially handle the data
585 # and perform custom rendering.
586 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
587 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
588 # This is intended to be used as a label for the display data
589 # when viewed in a dax monitoring system.
590 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
591 # For example a java_class_name_value of com.mypackage.MyDoFn
592 # will be stored with MyDoFn as the short_str_value and
593 # com.mypackage.MyDoFn as the java_class_name value.
594 # short_str_value can be displayed and java_class_name_value
595 # will be displayed as a tooltip.
596 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
597 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
598       &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
599 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
600 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
601 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
602     },
603 ],
604 },
605 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
606 # of the job it replaced.
607 #
608 # When sending a `CreateJobRequest`, you can update a job by specifying it
609 # here. The job named here is stopped, and its intermediate state is
610 # transferred to this job.
611 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
612     # for temporary storage. These temporary files will be
613 # removed on job completion.
614 # No duplicates are allowed.
615 # No file patterns are supported.
616 #
617 # The supported files are:
618 #
619 # Google Cloud Storage:
620 #
621 # storage.googleapis.com/{bucket}/{object}
622 # bucket.storage.googleapis.com/{object}
623     &quot;A String&quot;,
624   ],
625   &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
626   #
627 # Only one Job with a given name may exist in a project at any
628 # given time. If a caller attempts to create a Job with the same
629 # name as an already-existing Job, the attempt returns the
630 # existing Job.
631 #
632 # The name must match the regular expression
633 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
634   &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
635   #
636 # The top-level steps that constitute the entire job.
637 { # Defines a particular step within a Cloud Dataflow job.
638 #
639 # A job consists of multiple steps, each of which performs some
640 # specific operation as part of the overall job. Data is typically
641 # passed from one step to another as part of the job.
642 #
643     # Here&#x27;s an example of a sequence of steps which together implement a
644     # Map-Reduce job:
645 #
646 # * Read a collection of data from some source, parsing the
647     # collection&#x27;s elements.
648     #
649 # * Validate the elements.
650 #
651 # * Apply a user-defined function to map each element to some value
652 # and extract an element-specific key value.
653 #
654 # * Group elements with the same key into a single element with
655 # that key, transforming a multiply-keyed collection into a
656 # uniquely-keyed collection.
657 #
658 # * Write the elements out to some data sink.
659 #
660 # Note that the Cloud Dataflow service may be used to run many different
661 # types of jobs, not just Map-Reduce.
662       &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
663       # step with respect to all other steps in the Cloud Dataflow job.
664       &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
665 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
666       # predefined step has its own required set of properties.
667 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
668         &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
669       },
670     },
671 ],
672   &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
673 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
674 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
675 # isn&#x27;t contained in the submitted job.
676 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
677 &quot;a_key&quot;: { # Contains information about how a particular
678 # google.dataflow.v1beta3.Step will be executed.
679 &quot;stepName&quot;: [ # The steps associated with the execution stage.
680 # Note that stages may have several steps, and that a given step
681 # might be run by more than one stage.
682 &quot;A String&quot;,
683 ],
684 },
685 },
686 },
687 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
688   #
689 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
690 # specified.
691 #
692 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
693 # terminal state. After a job has reached a terminal state, no
694 # further state updates may be made.
695 #
696 # This field may be mutated by the Cloud Dataflow service;
697 # callers cannot mutate it.
698   &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
699 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
700 # contains this job.
701 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
702 # Flexible resource scheduling jobs are started with some delay after job
703 # creation, so start_time is unset before start and is updated when the
704 # job is started by the Cloud Dataflow service. For other jobs, start_time
705   # always equals create_time and is immutable and set by the Cloud Dataflow
706 # service.
707 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
708 &quot;labels&quot;: { # User-defined labels for this job.
709 #
710 # The labels map can contain no more than 64 entries. Entries of the labels
711 # map are UTF8 strings that comply with the following restrictions:
712 #
713 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
714 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
715 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
716 # size.
717 &quot;a_key&quot;: &quot;A String&quot;,
718   },
719   &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
720 # Cloud Dataflow service.
721 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
722 #
723 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
724 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
725 # also be used to directly set a job&#x27;s requested state to
726 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
727 # job if it has not already reached a terminal state.
728   },
729 ],
730   &quot;failedLocation&quot;: [ # Zero or more messages describing the [regional endpoints]
731 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
732 # failed to respond.
733 { # Indicates which [regional endpoint]
734 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed
735 # to respond to a request for data.
736 &quot;name&quot;: &quot;A String&quot;, # The name of the [regional endpoint]
737 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
738 # failed to respond.
739 },
740 ],
741 &quot;nextPageToken&quot;: &quot;A String&quot;, # Set if there may be more results than fit in this response.
742 }</pre>
743</div>
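A response of the shape documented above can be unpacked defensively before paging on. The helper below is a hypothetical sketch, not part of the generated client: the field names `jobs`, `failedLocation`, and `nextPageToken` come from the schema above, while the sample dict and its values are invented for illustration.

```python
def summarize_jobs_response(response):
    """Extract job IDs and failed regional endpoints from a jobs response dict.

    `response` follows the documented shape: an optional 'jobs' list, an
    optional 'failedLocation' list, and an optional 'nextPageToken' that is
    present only when more results remain.
    """
    job_ids = [job.get("id") for job in response.get("jobs", [])]
    failed = [loc.get("name") for loc in response.get("failedLocation", [])]
    has_more = "nextPageToken" in response
    return job_ids, failed, has_more


# Hand-built sample mirroring the documented schema (values are made up):
sample = {
    "jobs": [{"id": "job-1", "name": "wordcount"}, {"id": "job-2", "name": "etl"}],
    "failedLocation": [{"name": "europe-west4"}],
    "nextPageToken": "token-abc",
}
print(summarize_jobs_response(sample))
```

A caller would typically warn on a non-empty `failedLocation` list, since jobs in those regional endpoints are silently missing from `jobs`.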
744
745<div class="method">
746 <code class="details" id="aggregated_next">aggregated_next(previous_request, previous_response)</code>
747 <pre>Retrieves the next page of results.
748
749Args:
750 previous_request: The request for the previous page. (required)
751 previous_response: The response from the request for the previous page. (required)
752
753Returns:
754  A request object that you can call &#x27;execute()&#x27; on to request the next
755  page. Returns None if there are no more items in the collection.
756 </pre>
757</div>
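The `aggregated_next` contract above (pass the previous request and response, receive either the next request or `None`) drives the standard pagination loop used throughout this client library. The sketch below stubs out the collection object, since a real `service.projects().jobs()` requires credentials; the stub classes and their page contents are invented for illustration, only the loop shape reflects the documented contract.

```python
class FakeJobsCollection:
    """Stub mimicking the aggregated()/aggregated_next() pagination contract."""

    def __init__(self, pages):
        # `pages` is a list of response dicts; only non-final pages
        # carry a 'nextPageToken', matching the documented behavior.
        self._pages = pages

    def aggregated(self, projectId):
        return _FakeRequest(self._pages, 0)

    def aggregated_next(self, previous_request, previous_response):
        # Returns None when the previous response had no nextPageToken.
        if "nextPageToken" not in previous_response:
            return None
        return _FakeRequest(previous_request._pages, previous_request._index + 1)


class _FakeRequest:
    def __init__(self, pages, index):
        self._pages = pages
        self._index = index

    def execute(self):
        return self._pages[self._index]


def list_all_job_ids(jobs_collection, project_id):
    """Standard loop: execute, consume the page, then ask for the next request."""
    ids = []
    request = jobs_collection.aggregated(projectId=project_id)
    while request is not None:
        response = request.execute()
        ids.extend(job["id"] for job in response.get("jobs", []))
        request = jobs_collection.aggregated_next(request, response)
    return ids


pages = [
    {"jobs": [{"id": "a"}, {"id": "b"}], "nextPageToken": "t1"},
    {"jobs": [{"id": "c"}]},
]
print(list_all_job_ids(FakeJobsCollection(pages), "my-project"))
```

With a real client the loop is identical; only the collection object changes.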
758
759<div class="method">
760    <code class="details" id="create">create(projectId, body=None, location=None, replaceJobId=None, view=None, x__xgafv=None)</code>
761  <pre>Creates a Cloud Dataflow job.
762
763To create a job, we recommend using `projects.locations.jobs.create` with a
764[regional endpoint]
765(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
766`projects.jobs.create` is not recommended, as your job will always start
767in `us-central1`.
768
769Args:
770  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
771  body: object, The request body.
772    The object takes the form of:
773
774{ # Defines a job to be run by the Cloud Dataflow service.
775   &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
776 # If this field is set, the service will ensure its uniqueness.
777 # The request to create a job will fail if the service has knowledge of a
778 # previously submitted job with the same client&#x27;s ID and job name.
779 # The caller may use this field to ensure idempotence of job
780 # creation across retried attempts to create a job.
781 # By default, the field is empty and, in that case, the service ignores it.
782 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
783   #
784 # This field is set by the Cloud Dataflow service when the Job is
785 # created, and is immutable for the life of the job.
786   &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
787 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
788   # corresponding name prefixes of the new job.
789     &quot;a_key&quot;: &quot;A String&quot;,
790   },
791   &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
792     &quot;internalExperiments&quot;: { # Experimental settings.
793 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
794 },
795 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
796 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
797 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
798 # with worker_zone. If neither worker_region nor worker_zone is specified,
799 # default to the control plane&#x27;s region.
800 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
801 # at rest, AKA a Customer Managed Encryption Key (CMEK).
802 #
803 # Format:
804 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
805 &quot;userAgent&quot;: { # A description of the process that generated the request.
806 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
807 },
808 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
809 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
810 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
811 # with worker_region. If neither worker_region nor worker_zone is specified,
812 # a zone in the control plane&#x27;s region is chosen based on available capacity.
813 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
814     # unspecified, the service will attempt to choose a reasonable
815 # default. This should be in the form of the API service name,
816     # e.g. &quot;compute.googleapis.com&quot;.
817 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
818     # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
819     # this resource prefix, where {JOBNAME} is the value of the
820 # job_name field. The resulting bucket and object prefix is used
821 # as the prefix of the resources used to store temporary data
822 # needed during the job execution. NOTE: This will override the
823 # value in taskrunner_settings.
824 # The supported resource type is:
825 #
826 # Google Cloud Storage:
827 #
828 # storage.googleapis.com/{bucket}/{object}
829 # bucket.storage.googleapis.com/{object}
830     &quot;experiments&quot;: [ # The list of experiments to enable.
831 &quot;A String&quot;,
832 ],
833 &quot;version&quot;: { # A structure describing which components and their versions of the service
834 # are required in order to run the job.
835 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
836 },
837 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
838     &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
839 # options are passed through the service and are used to recreate the
840 # SDK pipeline options on the worker in a language agnostic and platform
841 # independent way.
842 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
843 },
844 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
845 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
846 # specified in order for the job to have workers.
847 { # Describes one particular pool of Cloud Dataflow workers to be
848 # instantiated by the Cloud Dataflow service in order to perform the
849 # computations required by a job. Note that a workflow job may use
850 # multiple pools, in order to match the various computational
851 # requirements of the various stages of the job.
852 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
853 # service will choose a number of threads (according to the number of cores
854 # on the selected machine type for batch, or 1 by convention for streaming).
855 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
856 # execute the job. If zero or unspecified, the service will
857 # attempt to choose a reasonable default.
858 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
859 # will attempt to choose a reasonable default.
860 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
861 &quot;packages&quot;: [ # Packages to be installed on workers.
862 { # The packages that must be installed in order for a worker to run the
863 # steps of the Cloud Dataflow job that will be assigned to its worker
864 # pool.
865 #
866 # This is the mechanism by which the Cloud Dataflow SDK causes code to
867 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
868 # might use this to install jars containing the user&#x27;s code and all of the
869 # various dependencies (libraries, data files, etc.) required in order
870 # for that code to run.
871 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
872 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
873 #
874 # Google Cloud Storage:
875 #
876 # storage.googleapis.com/{bucket}
877 # bucket.storage.googleapis.com/
878 },
879 ],
880       &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
881 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
882 # `TEARDOWN_NEVER`.
883 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
884 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
885 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
886 # down.
887 #
888 # If the workers are not torn down by the service, they will
889 # continue to run and use Google Compute Engine VM resources in the
890 # user&#x27;s project until they are explicitly terminated by the user.
891 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
892 # policy except for small, manually supervised test jobs.
893 #
894 # If unknown or unspecified, the service will attempt to choose a reasonable
895 # default.
896 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
897 # Compute Engine API.
898 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
899 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
900 },
901 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
902 # attempt to choose a reasonable default.
903 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
904 # harness, residing in Google Container Registry.
905 #
906 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
907 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
908 # attempt to choose a reasonable default.
909 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
910 # service will attempt to choose a reasonable default.
911 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
912 # are supported.
913 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
914 # only be set in the Fn API path. For non-cross-language pipelines this
915 # should have only one entry. Cross-language pipelines will have two or more
916 # entries.
917       { # Defines an SDK harness container for executing Dataflow pipelines.
918 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
919         &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
920         # container instance with this image. If false (or unset), recommends using
921 # more than one core per SDK container instance with this image for
922 # efficiency. Note that Dataflow service may choose to override this property
923 # if needed.
924 },
925 ],
926 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
927 { # Describes the data disk used by a workflow job.
928 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
929 # must be a disk type appropriate to the project and zone in which
930 # the workers will run. If unknown or unspecified, the service
931 # will attempt to choose a reasonable default.
932 #
933 # For example, the standard persistent disk type is a resource name
934 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
935 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
936         # actual valid values are defined by the Google Compute Engine API,
937 # not by the Cloud Dataflow API; consult the Google Compute Engine
938 # documentation for more information about determining the set of
939 # available disk types for a particular project and zone.
940 #
941 # Google Compute Engine Disk types are local to a particular
942 # project in a particular zone, and so the resource name will
943 # typically look something like this:
944 #
945 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
946 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
947 # attempt to choose a reasonable default.
948 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
949 },
950 ],
951 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
952 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
953 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
954 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
955 # using the standard Dataflow task runner. Users should ignore
956 # this field.
957 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
958 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
959 # taskrunner; e.g. &quot;wheel&quot;.
960 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
961 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
962 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
963 # access the Cloud Dataflow API.
964 &quot;A String&quot;,
965 ],
966 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
967 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
968 # will not be uploaded.
969 #
970 # The supported resource type is:
971 #
972 # Google Cloud Storage:
973 # storage.googleapis.com/{bucket}/{object}
974 # bucket.storage.googleapis.com/{object}
975 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
976 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
977 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
978 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
979 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
980 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
981 # temporary storage.
982 #
983 # The supported resource type is:
984 #
985 # Google Cloud Storage:
986 # storage.googleapis.com/{bucket}/{object}
987 # bucket.storage.googleapis.com/{object}
988 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
989 #
990 # When workers access Google Cloud APIs, they logically do so via
991 # relative URLs. If this field is specified, it supplies the base
992 # URL to use for resolving these relative URLs. The normative
993 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
994 # Locators&quot;.
995 #
996 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
997 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
998 # console.
999 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
1000 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
1001 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1002 # storage.
1003 #
1004 # The supported resource type is:
1005 #
1006 # Google Cloud Storage:
1007 #
1008 # storage.googleapis.com/{bucket}/{object}
1009 # bucket.storage.googleapis.com/{object}
1010 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
1011 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
1012 #
1013 # When workers access Google Cloud APIs, they logically do so via
1014 # relative URLs. If this field is specified, it supplies the base
1015 # URL to use for resolving these relative URLs. The normative
1016 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1017 # Locators&quot;.
1018 #
1019 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1020 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
1021 # &quot;dataflow/v1b3/projects&quot;.
1022 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
1023 # &quot;shuffle/v1beta1&quot;.
1024 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
1025 },
1026 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
1027 # taskrunner; e.g. &quot;root&quot;.
1028 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
1029 },
1030 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
1031 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
1032 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
1033 },
1034 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
1035 &quot;a_key&quot;: &quot;A String&quot;,
1036 },
1037 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
1038 # select a default set of packages which are useful to worker
1039 # harnesses written in a particular language.
1040 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
1041 # the service will use the network &quot;default&quot;.
1042 },
1043 ],
1044 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
1045 # related tables are stored.
1046 #
1047 # The supported resource type is:
1048 #
1049 # Google BigQuery:
1050 # bigquery.googleapis.com/{dataset}
1051 },
1052 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
1053 # callers cannot mutate it.
1054 { # A message describing the state of a particular execution stage.
1055 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
1056 &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
1057 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
1058 },
1059 ],
1060 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
1061 # by the metadata values provided here. Populated for ListJobs and all GetJob
1062 # views SUMMARY and higher.
1063 # ListJob response and Job SUMMARY view.
1064 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
1065 { # Metadata for a Datastore connector used by the job.
1066 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
1067 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1068 },
1069 ],
1070 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
1071 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
1072 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
1073 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
1074 },
1075 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
1076 { # Metadata for a BigQuery connector used by the job.
1077 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
1078 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
1079 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
1080 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
1081 },
1082 ],
1083 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
1084 { # Metadata for a File connector used by the job.
1085 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
1086 },
1087 ],
1088 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
1089 { # Metadata for a PubSub connector used by the job.
1090 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
1091 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
1092 },
1093 ],
1094 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
1095 { # Metadata for a BigTable connector used by the job.
1096 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1097 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1098 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
1099 },
1100 ],
1101 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
1102 { # Metadata for a Spanner connector used by the job.
1103 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1104 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1105 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
1106 },
1107 ],
1108 },
1109 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
1110 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
1111 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
1112 # snapshot.
1113 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
1114 # A description of the user pipeline and stages through which it is executed.
1115 # Created by Cloud Dataflow service. Only retrieved with
1116 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
1117 # form. This data is provided by the Dataflow service for ease of visualizing
1118 # the pipeline and interpreting Dataflow provided metrics.
1119 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
1120 { # Description of the composing transforms, names/ids, and input/outputs of a
1121 # stage of execution. Some composing transforms and sources may have been
1122 # generated by the Dataflow service during execution planning.
1123 &quot;outputSource&quot;: [ # Output sources for this stage.
1124 { # Description of an input or output of an execution stage.
1125 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1126 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1127 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1128 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1129 # source is most closely associated.
1130 },
1131 ],
1132 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
1133 &quot;inputSource&quot;: [ # Input sources for this stage.
1134 { # Description of an input or output of an execution stage.
1135 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1136 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1137 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1138 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1139 # source is most closely associated.
1140 },
1141 ],
1142 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
1143 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
1144 { # Description of a transform executed as part of an execution stage.
1145 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
1146 # most closely associated.
1147 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1148 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1149 },
1150 ],
1151 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
1152 { # Description of an interstitial value between transforms in an execution
1153 # stage.
1154 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1155 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1156 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1157 # source is most closely associated.
1158 },
1159 ],
1160 &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
1161 },
1162 ],
1163 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
1164 { # Description of the type, names/ids, and input/outputs for a transform.
1165 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
1166 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
1167 &quot;A String&quot;,
1168 ],
1169 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
1170 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
1171 &quot;displayData&quot;: [ # Transform-specific display data.
1172 { # Data provided with a pipeline or transform to provide descriptive info.
1173 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1174 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1175 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1176 # language namespace (e.g. a Python module) which defines the display data.
1177 # This allows a dax monitoring system to specially handle the data
1178 # and perform custom rendering.
1179 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1180 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1181 # This is intended to be used as a label for the display data
1182 # when viewed in a dax monitoring system.
1183 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1184 # For example a java_class_name_value of com.mypackage.MyDoFn
1185 # will be stored with MyDoFn as the short_str_value and
1186 # com.mypackage.MyDoFn as the java_class_name value.
1187 # short_str_value can be displayed and java_class_name_value
1188 # will be displayed as a tooltip.
1189 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1190 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1191 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1192 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1193 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1194 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
1195 },
1196 ],
1197 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
1198 &quot;A String&quot;,
1199 ],
1200 },
1201 ],
1202 &quot;displayData&quot;: [ # Pipeline level display data.
1203 { # Data provided with a pipeline or transform to provide descriptive info.
1204 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
1205 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
1206 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1207 # language namespace (e.g. a Python module) which defines the display data.
1208 # This allows a dax monitoring system to specially handle the data
1209 # and perform custom rendering.
1210 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1211 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1212 # This is intended to be used as a label for the display data
1213 # when viewed in a dax monitoring system.
1214 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1215 # For example a java_class_name_value of com.mypackage.MyDoFn
1216 # will be stored with MyDoFn as the short_str_value and
1217 # com.mypackage.MyDoFn as the java_class_name value.
1218 # short_str_value can be displayed and java_class_name_value
1219 # will be displayed as a tooltip.
1220 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1221 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
1222 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1223 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1224 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1225 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
1226 },
1227 ],
1228 },
1229 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
1230 # of the job it replaced.
1231 #
1232 # When sending a `CreateJobRequest`, you can update a job by specifying it
1233 # here. The job named here is stopped, and its intermediate state is
1234 # transferred to this job.
1235 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
1236 # for temporary storage. These temporary files will be
1237 # removed on job completion.
1238 # No duplicates are allowed.
1239 # No file patterns are supported.
1240 #
1241 # The supported files are:
1242 #
1243 # Google Cloud Storage:
1244 #
1245 # storage.googleapis.com/{bucket}/{object}
1246 # bucket.storage.googleapis.com/{object}
1247 &quot;A String&quot;,
1248 ],
1249 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
1250 #
1251 # Only one Job with a given name may exist in a project at any
1252 # given time. If a caller attempts to create a Job with the same
1253 # name as an already-existing Job, the attempt returns the
1254 # existing Job.
1255 #
1256 # The name must match the regular expression
1257 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
1258 &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
1259 #
1260 # The top-level steps that constitute the entire job.
1261 { # Defines a particular step within a Cloud Dataflow job.
1262 #
1263 # A job consists of multiple steps, each of which performs some
1264 # specific operation as part of the overall job. Data is typically
1265 # passed from one step to another as part of the job.
1266 #
1267 # Here&#x27;s an example of a sequence of steps which together implement a
1268 # Map-Reduce job:
1269 #
1270 # * Read a collection of data from some source, parsing the
1271 # collection&#x27;s elements.
1272 #
1273 # * Validate the elements.
1274 #
1275 # * Apply a user-defined function to map each element to some value
1276 # and extract an element-specific key value.
1277 #
1278 # * Group elements with the same key into a single element with
1279 # that key, transforming a multiply-keyed collection into a
1280 # uniquely-keyed collection.
1281 #
1282 # * Write the elements out to some data sink.
1283 #
1284 # Note that the Cloud Dataflow service may be used to run many different
1285 # types of jobs, not just Map-Reduce.
1286 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
1287 # step with respect to all other steps in the Cloud Dataflow job.
1288 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
1289 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
1290 # predefined step has its own required set of properties.
1291 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
1292 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1293 },
1294 },
1295 ],
1296 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
1297 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
1298 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
1299 # isn&#x27;t contained in the submitted job.
1300 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
1301 &quot;a_key&quot;: { # Contains information about how a particular
1302 # google.dataflow.v1beta3.Step will be executed.
1303 &quot;stepName&quot;: [ # The steps associated with the execution stage.
1304 # Note that stages may have several steps, and that a given step
1305 # might be run by more than one stage.
1306 &quot;A String&quot;,
1307 ],
1308 },
1309 },
1310 },
1311 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
1312 #
1313 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
1314 # specified.
1315 #
1316 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
1317 # terminal state. After a job has reached a terminal state, no
1318 # further state updates may be made.
1319 #
1320 # This field may be mutated by the Cloud Dataflow service;
1321 # callers cannot mutate it.
1322 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
1323 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1324 # contains this job.
1325 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
1326 # Flexible resource scheduling jobs are started with some delay after job
1327 # creation, so start_time is unset before start and is updated when the
1328 # job is started by the Cloud Dataflow service. For other jobs, start_time
1329 # always equals to create_time and is immutable and set by the Cloud Dataflow
1330 # service.
1331 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
1332 &quot;labels&quot;: { # User-defined labels for this job.
1333 #
1334 # The labels map can contain no more than 64 entries. Entries of the labels
1335 # map are UTF8 strings that comply with the following restrictions:
1336 #
1337 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
1338 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
1339 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
1340 # size.
1341 &quot;a_key&quot;: &quot;A String&quot;,
1342 },
1343 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
1344 # Cloud Dataflow service.
1345 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
1346 #
1347 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
1348 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
1349 # also be used to directly set a job&#x27;s requested state to
1350 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
1351 # job if it has not already reached a terminal state.
1352}
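As a hedged illustration (not part of the generated client), the job `name` constraint documented above can be pre-checked before building the request body. The helper below is hypothetical; it only encodes the documented pattern `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`.

```python
import re

# Documented Dataflow job-name pattern: a lowercase letter, then up to 38
# characters of [-a-z0-9], ending in a lowercase letter or digit (40 max).
_JOB_NAME_RE = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")

def is_valid_job_name(name):
    """Return True if `name` matches the documented job-name pattern."""
    return _JOB_NAME_RE.fullmatch(name) is not None
```

Validating locally avoids a round trip to the service for names that would be rejected anyway.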
1353
1354 location: string, The [regional endpoint]
1355(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1356contains this job.
1357 replaceJobId: string, Deprecated. This field is now in the Job message.
1358 view: string, The level of information requested in response.
1359 x__xgafv: string, V1 error format.
1360 Allowed values
1361 1 - v1 error format
1362 2 - v2 error format
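A minimal sketch of assembling the arguments described above for this method via the google-api-python-client. The project ID, region, and job fields are placeholder values, and the helper name is hypothetical.

```python
def build_create_job_request(project_id, region, job_name, view="JOB_VIEW_SUMMARY"):
    """Collect keyword arguments for projects().jobs().create()."""
    body = {
        "name": job_name,                 # must match [a-z]([-a-z0-9]{0,38}[a-z0-9])?
        "type": "JOB_TYPE_BATCH",
        "labels": {"team": "analytics"},  # at most 64 entries, lowercase keys
    }
    return {
        "projectId": project_id,
        "location": region,  # regional endpoint that will contain the job
        "body": body,
        "view": view,        # level of information requested in the response
    }

# With an authenticated client this would be used roughly as:
#   service = googleapiclient.discovery.build("dataflow", "v1b3")
#   response = service.projects().jobs().create(
#       **build_create_job_request("my-project", "us-central1",
#                                  "wordcount-example")).execute()
```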
1363
1364Returns:
1365 An object of the form:
1366
1367 { # Defines a job to be run by the Cloud Dataflow service.
1368 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
1369 # If this field is set, the service will ensure its uniqueness.
1370 # The request to create a job will fail if the service has knowledge of a
1371 # previously submitted job with the same client&#x27;s ID and job name.
1372 # The caller may use this field to ensure idempotence of job
1373 # creation across retried attempts to create a job.
1374 # By default, the field is empty and, in that case, the service ignores it.
1375 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
1376 #
1377 # This field is set by the Cloud Dataflow service when the Job is
1378 # created, and is immutable for the life of the job.
1379 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
1380 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
1381 # corresponding name prefixes of the new job.
1382 &quot;a_key&quot;: &quot;A String&quot;,
1383 },
1384 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
1385 &quot;internalExperiments&quot;: { # Experimental settings.
1386 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
1387 },
1388 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
1389 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
1390 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
1391 # with worker_zone. If neither worker_region nor worker_zone is specified,
1392 # default to the control plane&#x27;s region.
1393 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
1394 # at rest, AKA a Customer Managed Encryption Key (CMEK).
1395 #
1396 # Format:
1397 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
1398 &quot;userAgent&quot;: { # A description of the process that generated the request.
1399 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1400 },
1401 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
1402 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
1403 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
1404 # with worker_region. If neither worker_region nor worker_zone is specified,
1405 # a zone in the control plane&#x27;s region is chosen based on available capacity.
1406 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
1407 # unspecified, the service will attempt to choose a reasonable
1408 # default. This should be in the form of the API service name,
1409 # e.g. &quot;compute.googleapis.com&quot;.
1410 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1411 # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
1412 # this resource prefix, where {JOBNAME} is the value of the
1413 # job_name field. The resulting bucket and object prefix is used
1414 # as the prefix of the resources used to store temporary data
1415 # needed during the job execution. NOTE: This will override the
1416 # value in taskrunner_settings.
1417 # The supported resource type is:
1418 #
1419 # Google Cloud Storage:
1420 #
1421 # storage.googleapis.com/{bucket}/{object}
1422 # bucket.storage.googleapis.com/{object}
1423 &quot;experiments&quot;: [ # The list of experiments to enable.
1424 &quot;A String&quot;,
1425 ],
1426 &quot;version&quot;: { # A structure describing which components and their versions of the service
1427 # are required in order to run the job.
1428 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1429 },
1430 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
1431 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
1432 # options are passed through the service and are used to recreate the
1433 # SDK pipeline options on the worker in a language agnostic and platform
1434 # independent way.
1435 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1436 },
1437 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
1438 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
1439 # specified in order for the job to have workers.
1440 { # Describes one particular pool of Cloud Dataflow workers to be
1441 # instantiated by the Cloud Dataflow service in order to perform the
1442 # computations required by a job. Note that a workflow job may use
1443 # multiple pools, in order to match the various computational
1444 # requirements of the various stages of the job.
1445 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
1446 # service will choose a number of threads (according to the number of cores
1447 # on the selected machine type for batch, or 1 by convention for streaming).
1448 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
1449 # execute the job. If zero or unspecified, the service will
1450 # attempt to choose a reasonable default.
1451 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
1452 # will attempt to choose a reasonable default.
1453 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
1454 &quot;packages&quot;: [ # Packages to be installed on workers.
1455 { # The packages that must be installed in order for a worker to run the
1456 # steps of the Cloud Dataflow job that will be assigned to its worker
1457 # pool.
1458 #
1459 # This is the mechanism by which the Cloud Dataflow SDK causes code to
1460 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
1461 # might use this to install jars containing the user&#x27;s code and all of the
1462 # various dependencies (libraries, data files, etc.) required in order
1463 # for that code to run.
1464 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
1465 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
1466 #
1467 # Google Cloud Storage:
1468 #
1469 # storage.googleapis.com/{bucket}
1470 # bucket.storage.googleapis.com/
1471 },
1472 ],
1473 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
1474 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
1475 # `TEARDOWN_NEVER`.
1476 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
1477 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
1478 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
1479 # down.
1480 #
1481 # If the workers are not torn down by the service, they will
1482 # continue to run and use Google Compute Engine VM resources in the
1483 # user&#x27;s project until they are explicitly terminated by the user.
1484 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
1485 # policy except for small, manually supervised test jobs.
1486 #
1487 # If unknown or unspecified, the service will attempt to choose a reasonable
1488 # default.
1489 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
1490 # Compute Engine API.
1491 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
1492 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
1493 },
1494 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
1495 # attempt to choose a reasonable default.
1496 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
1497 # harness, residing in Google Container Registry.
1498 #
1499 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
1500 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
1501 # attempt to choose a reasonable default.
1502 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
1503 # service will attempt to choose a reasonable default.
1504 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
1505 # are supported.
1506 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
1507 # only be set in the Fn API path. For non-cross-language pipelines this
1508 # should have only one entry. Cross-language pipelines will have two or more
1509 # entries.
1510 { # Defines a SDK harness container for executing Dataflow pipelines.
1511 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
1512 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
1513 # container instance with this image. If false (or unset) recommends using
1514 # more than one core per SDK container instance with this image for
1515 # efficiency. Note that Dataflow service may choose to override this property
1516 # if needed.
1517 },
1518 ],
1519 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
1520 { # Describes the data disk used by a workflow job.
1521 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
1522 # must be a disk type appropriate to the project and zone in which
1523 # the workers will run. If unknown or unspecified, the service
1524 # will attempt to choose a reasonable default.
1525 #
1526 # For example, the standard persistent disk type is a resource name
1527 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
1528 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
1529 # actual valid values are defined the Google Compute Engine API,
1530 # not by the Cloud Dataflow API; consult the Google Compute Engine
1531 # documentation for more information about determining the set of
1532 # available disk types for a particular project and zone.
1533 #
1534 # Google Compute Engine Disk types are local to a particular
1535 # project in a particular zone, and so the resource name will
1536 # typically look something like this:
1537 #
1538 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
1539 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
1540 # attempt to choose a reasonable default.
1541 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
1542 },
1543 ],
1544 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
1545 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
1546 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
1547 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
1548 # using the standard Dataflow task runner. Users should ignore
1549 # this field.
1550 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
1551 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
1552 # taskrunner; e.g. &quot;wheel&quot;.
1553 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
1554 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
1555 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
1556 # access the Cloud Dataflow API.
1557 &quot;A String&quot;,
1558 ],
1559 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
1560 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
1561 # will not be uploaded.
1562 #
1563 # The supported resource type is:
1564 #
1565 # Google Cloud Storage:
1566 # storage.googleapis.com/{bucket}/{object}
1567 # bucket.storage.googleapis.com/{object}
1568 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
1569 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
1570 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
1571 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
1572 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
1573 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
1574 # temporary storage.
1575 #
1576 # The supported resource type is:
1577 #
1578 # Google Cloud Storage:
1579 # storage.googleapis.com/{bucket}/{object}
1580 # bucket.storage.googleapis.com/{object}
1581 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
1582 #
1583 # When workers access Google Cloud APIs, they logically do so via
1584 # relative URLs. If this field is specified, it supplies the base
1585 # URL to use for resolving these relative URLs. The normative
1586 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1587 # Locators&quot;.
1588 #
1589 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1590 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
1591 # console.
1592 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
1593 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
1594 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1595 # storage.
1596 #
1597 # The supported resource type is:
1598 #
1599 # Google Cloud Storage:
1600 #
1601 # storage.googleapis.com/{bucket}/{object}
1602 # bucket.storage.googleapis.com/{object}
1603 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
1604 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
1605 #
1606 # When workers access Google Cloud APIs, they logically do so via
1607 # relative URLs. If this field is specified, it supplies the base
1608 # URL to use for resolving these relative URLs. The normative
1609 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1610 # Locators&quot;.
1611 #
1612 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1613 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
1614 # &quot;dataflow/v1b3/projects&quot;.
1615 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
1616 # &quot;shuffle/v1beta1&quot;.
1617 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
1618 },
1619 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
1620 # taskrunner; e.g. &quot;root&quot;.
1621 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
1622 },
1623 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
1624 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
1625 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
1626 },
1627 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
1628 &quot;a_key&quot;: &quot;A String&quot;,
1629 },
1630 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
1631 # select a default set of packages which are useful to worker
1632 # harnesses written in a particular language.
1633 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
1634 # the service will use the network &quot;default&quot;.
1635 },
1636 ],
1637 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
1638 # related tables are stored.
1639 #
1640 # The supported resource type is:
1641 #
1642 # Google BigQuery:
1643 # bigquery.googleapis.com/{dataset}
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001644 },
Bu Sun Kim65020912020-05-20 12:08:20 -07001645 &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
1646 # callers cannot mutate it.
1647 { # A message describing the state of a particular execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07001648 &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
1649 &quot;executionStageState&quot;: &quot;A String&quot;, # Executions stage states allow the same set of values as JobState.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001650 &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
Bu Sun Kim65020912020-05-20 12:08:20 -07001651 },
1652 ],
1653 &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
1654 # by the metadata values provided here. Populated for ListJobs and all GetJob
1655 # views SUMMARY and higher.
1656 # ListJob response and Job SUMMARY view.
Bu Sun Kim65020912020-05-20 12:08:20 -07001657 &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
1658 { # Metadata for a Datastore connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001659 &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001660 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001661 },
1662 ],
1663 &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001664 &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001665 &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
1666 &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
Bu Sun Kim65020912020-05-20 12:08:20 -07001667 },
1668 &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
1669 { # Metadata for a BigQuery connector used by the job.
1670 &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
1671 &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001672 &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001673 &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001674 },
1675 ],
1676 &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
1677 { # Metadata for a File connector used by the job.
1678 &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
1679 },
1680 ],
1681 &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
1682 { # Metadata for a PubSub connector used by the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001683 &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001684 &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
1685 },
1686 ],
1687 &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
1688 { # Metadata for a BigTable connector used by the job.
1689 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1690 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1691 &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
1692 },
1693 ],
1694 &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
1695 { # Metadata for a Spanner connector used by the job.
1696 &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
1697 &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
1698 &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
Bu Sun Kim65020912020-05-20 12:08:20 -07001699 },
1700 ],
1701 },
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001702 &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
1703 &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
Bu Sun Kim65020912020-05-20 12:08:20 -07001704 &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
1705 # snapshot.
Bu Sun Kim65020912020-05-20 12:08:20 -07001706 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
1707 # A description of the user pipeline and stages through which it is executed.
1708 # Created by Cloud Dataflow service. Only retrieved with
1709 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
1710 # form. This data is provided by the Dataflow service for ease of visualizing
1711 # the pipeline and interpreting Dataflow provided metrics.
1712 &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
1713 { # Description of the composing transforms, names/ids, and input/outputs of a
1714 # stage of execution. Some composing transforms and sources may have been
1715 # generated by the Dataflow service during execution planning.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001716 &quot;outputSource&quot;: [ # Output sources for this stage.
1717 { # Description of an input or output of an execution stage.
1718 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1719 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1720 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1721 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1722 # source is most closely associated.
1723 },
1724 ],
1725 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
1726 &quot;inputSource&quot;: [ # Input sources for this stage.
1727 { # Description of an input or output of an execution stage.
1728 &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
1729 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1730 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
1731 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1732 # source is most closely associated.
1733 },
1734 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001735 &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
1736 &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
1737 { # Description of a transform executed as part of an execution stage.
1738 &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
1739 # most closely associated.
1740 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1741 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1742 },
1743 ],
1744 &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
1745 { # Description of an interstitial value between transforms in an execution
1746 # stage.
1747 &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
1748 &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
1749 &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
1750 # source is most closely associated.
1751 },
1752 ],
1753 &quot;kind&quot;: &quot;A String&quot;, # Type of tranform this stage is executing.
Bu Sun Kim65020912020-05-20 12:08:20 -07001754 },
1755 ],
1756 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
1757 { # Description of the type, names/ids, and input/outputs for a transform.
1758 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
1759 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
1760 &quot;A String&quot;,
1761 ],
1762 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
1763 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
1764 &quot;displayData&quot;: [ # Transform-specific display data.
1765 { # Data provided with a pipeline or transform to provide descriptive info.
Bu Sun Kim65020912020-05-20 12:08:20 -07001766 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001767 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
Bu Sun Kim65020912020-05-20 12:08:20 -07001768 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1769 # language namespace (i.e. python module) which defines the display data.
1770 # This allows a dax monitoring system to specially handle the data
1771 # and perform custom rendering.
1772 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1773 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1774 # This is intended to be used as a label for the display data
1775 # when viewed in a dax monitoring system.
1776 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1777 # For example a java_class_name_value of com.mypackage.MyDoFn
1778 # will be stored with MyDoFn as the short_str_value and
1779 # com.mypackage.MyDoFn as the java_class_name value.
1780 # short_str_value can be displayed and java_class_name_value
1781 # will be displayed as a tooltip.
1782 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1783 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001784 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1785 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1786 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1787 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
Bu Sun Kim65020912020-05-20 12:08:20 -07001788 },
1789 ],
1790 &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
1791 &quot;A String&quot;,
1792 ],
1793 },
1794 ],
1795 &quot;displayData&quot;: [ # Pipeline level display data.
1796 { # Data provided with a pipeline or transform to provide descriptive info.
Bu Sun Kim65020912020-05-20 12:08:20 -07001797 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001798 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
Bu Sun Kim65020912020-05-20 12:08:20 -07001799 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
1800 # language namespace (i.e. python module) which defines the display data.
1801 # This allows a dax monitoring system to specially handle the data
1802 # and perform custom rendering.
1803 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
1804 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
1805 # This is intended to be used as a label for the display data
1806 # when viewed in a dax monitoring system.
1807 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
1808 # For example a java_class_name_value of com.mypackage.MyDoFn
1809 # will be stored with MyDoFn as the short_str_value and
1810 # com.mypackage.MyDoFn as the java_class_name value.
1811 # short_str_value can be displayed and java_class_name_value
1812 # will be displayed as a tooltip.
1813 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
1814 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07001815 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
1816 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
1817 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
1818 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
Bu Sun Kim65020912020-05-20 12:08:20 -07001819 },
1820 ],
1821 },
1822 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
1823 # of the job it replaced.
1824 #
1825 # When sending a `CreateJobRequest`, you can update a job by specifying it
1826 # here. The job named here is stopped, and its intermediate state is
1827 # transferred to this job.
1828 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001829 # for temporary storage. These temporary files will be
1830 # removed on job completion.
1831 # No duplicates are allowed.
1832 # No file patterns are supported.
1833 #
1834 # The supported files are:
1835 #
1836 # Google Cloud Storage:
1837 #
1838 # storage.googleapis.com/{bucket}/{object}
1839 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07001840 &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001841 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001842 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001843 #
1844 # Only one Job with a given name may exist in a project at any
1845 # given time. If a caller attempts to create a Job with the same
1846 # name as an already-existing Job, the attempt returns the
1847 # existing Job.
1848 #
1849 # The name must match the regular expression
1850 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
Bu Sun Kim65020912020-05-20 12:08:20 -07001851 &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001852 #
1853 # The top-level steps that constitute the entire job.
1854 { # Defines a particular step within a Cloud Dataflow job.
1855 #
1856 # A job consists of multiple steps, each of which performs some
1857 # specific operation as part of the overall job. Data is typically
1858 # passed from one step to another as part of the job.
1859 #
Bu Sun Kim65020912020-05-20 12:08:20 -07001860 # Here&#x27;s an example of a sequence of steps which together implement a
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001861 # Map-Reduce job:
1862 #
1863 # * Read a collection of data from some source, parsing the
Bu Sun Kim65020912020-05-20 12:08:20 -07001864 # collection&#x27;s elements.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001865 #
1866 # * Validate the elements.
1867 #
1868 # * Apply a user-defined function to map each element to some value
1869 # and extract an element-specific key value.
1870 #
1871 # * Group elements with the same key into a single element with
1872 # that key, transforming a multiply-keyed collection into a
1873 # uniquely-keyed collection.
1874 #
1875 # * Write the elements out to some data sink.
1876 #
1877 # Note that the Cloud Dataflow service may be used to run many different
1878 # types of jobs, not just Map-Reduce.
Bu Sun Kim65020912020-05-20 12:08:20 -07001879 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
Dan O'Mearadd494642020-05-01 07:42:23 -07001880 # step with respect to all other steps in the Cloud Dataflow job.
Bu Sun Kim65020912020-05-20 12:08:20 -07001881 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
1882 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001883 # predefined step has its own required set of properties.
1884 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
Bu Sun Kim65020912020-05-20 12:08:20 -07001885 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001886 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001887 },
1888 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07001889 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
1890 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
1891 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
1892 # isn&#x27;t contained in the submitted job.
1893 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
1894 &quot;a_key&quot;: { # Contains information about how a particular
1895 # google.dataflow.v1beta3.Step will be executed.
1896 &quot;stepName&quot;: [ # The steps associated with the execution stage.
1897 # Note that stages may have several steps, and that a given step
1898 # might be run by more than one stage.
1899 &quot;A String&quot;,
1900 ],
1901 },
1902 },
1903 },
1904 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001905 #
1906 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
1907 # specified.
1908 #
1909 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
1910 # terminal state. After a job has reached a terminal state, no
1911 # further state updates may be made.
1912 #
1913 # This field may be mutated by the Cloud Dataflow service;
1914 # callers cannot mutate it.
Bu Sun Kim65020912020-05-20 12:08:20 -07001915 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
1916 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1917 # contains this job.
1918 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
1919 # Flexible resource scheduling jobs are started with some delay after job
1920 # creation, so start_time is unset before start and is updated when the
1921 # job is started by the Cloud Dataflow service. For other jobs, start_time
1922 # always equals to create_time and is immutable and set by the Cloud Dataflow
1923 # service.
1924 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
1925 &quot;labels&quot;: { # User-defined labels for this job.
1926 #
1927 # The labels map can contain no more than 64 entries. Entries of the labels
1928 # map are UTF8 strings that comply with the following restrictions:
1929 #
1930 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
1931 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
1932 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
1933 # size.
1934 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001935 },
Bu Sun Kim65020912020-05-20 12:08:20 -07001936 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
1937 # Cloud Dataflow service.
1938 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
1939 #
1940 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
1941 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
1942 # also be used to directly set a job&#x27;s requested state to
1943 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
1944 # job if it has not already reached a terminal state.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001945 }</pre>
1946</div>
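<p>As a quick orientation before the per-method reference below, here is a minimal sketch of fetching a job with this resource&#x27;s <code>get</code> method via the Python client library. It assumes <code>google-api-python-client</code> is installed with application default credentials available; the project ID, job ID, and the helper name <code>get_job_state</code> are hypothetical placeholders, not values from this document.</p>

```python
def get_job_state(project_id, job_id, location="us-central1"):
    """Return the currentState of a Dataflow job, e.g. 'JOB_STATE_RUNNING'.

    Sketch only: real code should handle googleapiclient.errors.HttpError.
    """
    # Imported lazily so the sketch has no import-time dependency.
    from googleapiclient.discovery import build

    dataflow = build("dataflow", "v1b3")
    job = (
        dataflow.projects()
        .jobs()
        .get(
            projectId=project_id,
            jobId=job_id,
            location=location,
            view="JOB_VIEW_SUMMARY",  # keep the response small
        )
        .execute()
    )
    # The response is a dict matching the Job schema documented on this page.
    return job.get("currentState")
```

<p>The returned dictionary follows the Job object shape shown in the method documentation below; passing a larger <code>view</code> such as <code>JOB_VIEW_ALL</code> fills in fields like <code>steps</code> and <code>pipelineDescription</code>.</p>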
1947
1948<div class="method">
Bu Sun Kim65020912020-05-20 12:08:20 -07001949 <code class="details" id="get">get(projectId, jobId, view=None, location=None, x__xgafv=None)</code>
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001950 <pre>Gets the state of the specified Cloud Dataflow job.
1951
1952To get the state of a job, we recommend using `projects.locations.jobs.get`
1953with a [regional endpoint]
1954(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
1955`projects.jobs.get` is not recommended, as you can only get the state of
1956jobs that are running in `us-central1`.
1957
1958Args:
1959 projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
1960 jobId: string, The job ID. (required)
Bu Sun Kim65020912020-05-20 12:08:20 -07001961 view: string, The level of information requested in response.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001962 location: string, The [regional endpoint]
1963(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
1964contains this job.
1965 x__xgafv: string, V1 error format.
1966 Allowed values
1967 1 - v1 error format
1968 2 - v2 error format

Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          # storage.googleapis.com/{bucket}/{object}
          # bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language agnostic and platform
          # independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  # storage.googleapis.com/{bucket}
                  # bucket.storage.googleapis.com/
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines an SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                  # container instance with this image. If false (or unset) recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that Dataflow service may choose to override this property
                  # if needed.
            },
          ],
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine Disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                # storage.googleapis.com/{bucket}/{object}
                # bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
            &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                # temporary storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                # storage.googleapis.com/{bucket}/{object}
                # bucket.storage.googleapis.com/{object}
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
            &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                # console.
            &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
            &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
              &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                  # storage.
                  #
                  # The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  # storage.googleapis.com/{bucket}/{object}
                  # bucket.storage.googleapis.com/{object}
              &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
              &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                  #
                  # When workers access Google Cloud APIs, they logically do so via
                  # relative URLs. If this field is specified, it supplies the base
                  # URL to use for resolving these relative URLs. The normative
                  # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                  # Locators&quot;.
                  #
                  # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
              &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                  # &quot;dataflow/v1b3/projects&quot;.
              &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                  # &quot;shuffle/v1beta1&quot;.
              &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
            },
            &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;root&quot;.
            &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
          },
          &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
            &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
            &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
          },
          &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
            &quot;a_key&quot;: &quot;A String&quot;,
          },
          &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
              # select a default set of packages which are useful to worker
              # harnesses written in a particular language.
          &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
              # the service will use the network &quot;default&quot;.
        },
      ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          # bigquery.googleapis.com/{dataset}
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
        # ListJob response and Job SUMMARY view.
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
        },
      ],
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
        },
      ],
    },
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
        # A description of the user pipeline and stages through which it is executed.
        # Created by Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
        # form. This data is provided by the Dataflow service for ease of visualizing
        # the pipeline and interpreting Dataflow provided metrics.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and input/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and input/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (i.e. python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (i.e. python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #
        # storage.googleapis.com/{bucket}/{object}
        # bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          # * Read a collection of data from some source, parsing the
          # collection&#x27;s elements.
          #
          # * Validate the elements.
          #
          # * Apply a user-defined function to map each element to some value
          # and extract an element-specific key value.
          #
          # * Group elements with the same key into a single element with
          # that key, transforming a multiply-keyed collection into a
          # uniquely-keyed collection.
          #
          # * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
Bu Sun Kim65020912020-05-20 12:08:20 -07002491 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
Takashi Matsuo06694102015-09-11 13:55:40 -07002492 },
2493 },
2494 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07002495 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
2496 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
2497 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
2498 # isn&#x27;t contained in the submitted job.
2499 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
2500 &quot;a_key&quot;: { # Contains information about how a particular
2501 # google.dataflow.v1beta3.Step will be executed.
2502 &quot;stepName&quot;: [ # The steps associated with the execution stage.
2503 # Note that stages may have several steps, and that a given step
2504 # might be run by more than one stage.
2505 &quot;A String&quot;,
2506 ],
2507 },
2508 },
2509 },
2510 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002511 #
2512 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
2513 # specified.
2514 #
2515 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
2516 # terminal state. After a job has reached a terminal state, no
2517 # further state updates may be made.
2518 #
2519 # This field may be mutated by the Cloud Dataflow service;
2520 # callers cannot mutate it.
Bu Sun Kim65020912020-05-20 12:08:20 -07002521 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
2522 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
2523 # contains this job.
2524 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
2525 # Flexible resource scheduling jobs are started with some delay after job
2526 # creation, so start_time is unset before start and is updated when the
2527 # job is started by the Cloud Dataflow service. For other jobs, start_time
2528 # always equals to create_time and is immutable and set by the Cloud Dataflow
2529 # service.
2530 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
2531 &quot;labels&quot;: { # User-defined labels for this job.
2532 #
2533 # The labels map can contain no more than 64 entries. Entries of the labels
2534 # map are UTF8 strings that comply with the following restrictions:
2535 #
2536 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
2537 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
2538 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
2539 # size.
2540 &quot;a_key&quot;: &quot;A String&quot;,
Nathaniel Manista4f877e52015-06-15 16:44:50 +00002541 },
Bu Sun Kim65020912020-05-20 12:08:20 -07002542 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
2543 # Cloud Dataflow service.
2544 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
2545 #
2546 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
2547 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
2548 # also be used to directly set a job&#x27;s requested state to
2549 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
2550 # job if it has not already reached a terminal state.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002551 }</pre>
Nathaniel Manista4f877e52015-06-15 16:44:50 +00002552</div>
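The job-name pattern documented above, `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`, can be checked client-side before submitting a job. A minimal sketch with the standard library; the helper name `is_valid_job_name` is illustrative and not part of the client library (the Dataflow service enforces the same rule server-side):

```python
import re

# Pattern copied from the Job.name documentation above: a lowercase letter,
# optionally followed by up to 39 more characters from [-a-z0-9], ending in
# a letter or digit.
_JOB_NAME_RE = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")


def is_valid_job_name(name):
    """Return True if `name` fully matches the documented job-name pattern."""
    return _JOB_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_job_name("wordcount-2020")` is true, while names with uppercase letters or a trailing hyphen (e.g. `"Wordcount"`, `"job-"`) are rejected.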

<div class="method">
    <code class="details" id="getMetrics">getMetrics(projectId, jobId, startTime=None, location=None, x__xgafv=None)</code>
  <pre>Request the job status.

To request the status of a job, we recommend using
`projects.locations.jobs.getMetrics` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.getMetrics` is not recommended, as you can only request the
status of jobs that are running in `us-central1`.

Args:
  projectId: string, A project id. (required)
  jobId: string, The job to get metrics for. (required)
  startTime: string, Return only metric data that has changed since this time.
Default is to return all information about all metrics for the job.
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains the job specified by job_id.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # JobMetrics contains a collection of metrics describing the detailed progress
        # of a Dataflow job. Metrics correspond to user-defined and system-defined
        # metrics in the job.
        #
        # This resource captures only the most recent values of each metric;
        # time-series data can be queried for them (under the same metric names)
        # from Cloud Monitoring.
      &quot;metrics&quot;: [ # All metrics for this job.
        { # Describes the state of a metric.
          &quot;set&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Set&quot; aggregation kind. The only
              # possible value type is a list of Values whose type can be Long, Double,
              # or String, according to the metric&#x27;s type. All Values in the list must
              # be of the same type.
          &quot;gauge&quot;: &quot;&quot;, # A struct value describing properties of a Gauge.
              # Metrics of gauge type show the value of a metric across time, and are
              # aggregated based on the newest value.
          &quot;cumulative&quot;: True or False, # True if this metric is reported as the total cumulative aggregate
              # value accumulated since the worker started working on this WorkItem.
              # By default this is false, indicating that this metric is reported
              # as a delta that is not associated with any WorkItem.
          &quot;internal&quot;: &quot;&quot;, # Worker-computed aggregate value for internal use by the Dataflow
              # service.
          &quot;kind&quot;: &quot;A String&quot;, # Metric aggregation kind. The possible metric aggregation kinds are
              # &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;, &quot;Mean&quot;, &quot;Set&quot;, &quot;And&quot;, &quot;Or&quot;, and &quot;Distribution&quot;.
              # The specified aggregation kind is case-insensitive.
              #
              # If omitted, this is not an aggregated value but instead
              # a single metric sample value.
          &quot;scalar&quot;: &quot;&quot;, # Worker-computed aggregate value for aggregation kinds &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;,
              # &quot;And&quot;, and &quot;Or&quot;. The possible value types are Long, Double, and Boolean.
          &quot;meanCount&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
              # This holds the count of the aggregated values and is used in combination
              # with mean_sum to obtain the actual mean aggregate value.
              # The only possible value type is Long.
          &quot;meanSum&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
              # This holds the sum of the aggregated values and is used in combination
              # with mean_count to obtain the actual mean aggregate value.
              # The only possible value types are Long and Double.
          &quot;updateTime&quot;: &quot;A String&quot;, # Timestamp associated with the metric value. Optional when workers are
              # reporting work progress; it will be filled in responses from the
              # metrics API.
          &quot;name&quot;: { # Identifies a metric, by describing the source which generated the # Name of the metric.
              # metric.
            &quot;name&quot;: &quot;A String&quot;, # Worker-defined metric name.
            &quot;origin&quot;: &quot;A String&quot;, # Origin (namespace) of metric name. May be blank for user-defined metrics;
                # will be &quot;dataflow&quot; for metrics defined by the Dataflow service or SDK.
            &quot;context&quot;: { # Zero or more labeled fields which identify the part of the job this
                # metric is associated with, such as the name of a step or collection.
                #
                # For example, built-in counters associated with steps will have
                # context[&#x27;step&#x27;] = &lt;step-name&gt;. Counters associated with PCollections
                # in the SDK will have context[&#x27;pcollection&#x27;] = &lt;pcollection-name&gt;.
              &quot;a_key&quot;: &quot;A String&quot;,
            },
          },
          &quot;distribution&quot;: &quot;&quot;, # A struct value describing properties of a distribution of numeric values.
        },
      ],
      &quot;metricTime&quot;: &quot;A String&quot;, # Timestamp as of which metric values are current.
    }</pre>
</div>
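As the response documentation above notes, a "Mean" metric is reported as a sum (`meanSum`) and a count (`meanCount`) that must be combined client-side. A minimal sketch, assuming a metric dict shaped like an entry of the `metrics` list above; the helper name `mean_value` is illustrative, not part of the client library:

```python
def mean_value(metric):
    """Combine meanSum and meanCount from a "Mean" metric update into the
    actual mean aggregate value, as described in the response docs."""
    count = metric.get("meanCount", 0)
    if not count:
        return None  # no aggregated values reported yet
    return metric["meanSum"] / count


# e.g. a worker reported a sum of 12.0 over 4 aggregated values
print(mean_value({"kind": "Mean", "meanSum": 12.0, "meanCount": 4}))  # 3.0
```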
2641
2642<div class="method">
Bu Sun Kim65020912020-05-20 12:08:20 -07002643 <code class="details" id="list">list(projectId, filter=None, location=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</code>
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002644 <pre>List the jobs of a project.
Nathaniel Manista4f877e52015-06-15 16:44:50 +00002645
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002646To list the jobs of a project in a region, we recommend using
2647`projects.locations.jobs.get` with a [regional endpoint]
2648(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To
2649list the all jobs across all regions, use `projects.jobs.aggregated`. Using
2650`projects.jobs.list` is not recommended, as you can only get the list of
2651jobs that are running in `us-central1`.
2652
Nathaniel Manista4f877e52015-06-15 16:44:50 +00002653Args:
Takashi Matsuo06694102015-09-11 13:55:40 -07002654 projectId: string, The project which owns the jobs. (required)
Bu Sun Kim65020912020-05-20 12:08:20 -07002655 filter: string, The kind of filter to use.
2656 location: string, The [regional endpoint]
2657(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
2658contains this job.
2659 pageToken: string, Set this to the &#x27;next_page_token&#x27; field of a previous response
2660to request additional results in a long list.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002661 pageSize: integer, If there are many jobs, limit response to at most this many.
2662The actual number of jobs returned will be the lesser of max_responses
2663and an unspecified server-defined limit.
Bu Sun Kim65020912020-05-20 12:08:20 -07002664 view: string, Level of information requested in response. Default is `JOB_VIEW_SUMMARY`.
Takashi Matsuo06694102015-09-11 13:55:40 -07002665 x__xgafv: string, V1 error format.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002666 Allowed values
2667 1 - v1 error format
2668 2 - v2 error format
Nathaniel Manista4f877e52015-06-15 16:44:50 +00002669
2670Returns:
2671 An object of the form:
2672
Dan O'Mearadd494642020-05-01 07:42:23 -07002673 { # Response to a request to list Cloud Dataflow jobs in a project. This might
2674 # be a partial response, depending on the page size in the ListJobsRequest.
2675 # However, if the project does not have any jobs, an instance of
Bu Sun Kim65020912020-05-20 12:08:20 -07002676 # ListJobsResponse is not returned and the requests&#x27;s response
Dan O'Mearadd494642020-05-01 07:42:23 -07002677 # body is empty {}.
Bu Sun Kim65020912020-05-20 12:08:20 -07002678 &quot;jobs&quot;: [ # A subset of the requested job information.
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002679 { # Defines a job to be run by the Cloud Dataflow service.
Bu Sun Kim65020912020-05-20 12:08:20 -07002680 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
2681 # If this field is set, the service will ensure its uniqueness.
2682 # The request to create a job will fail if the service has knowledge of a
2683 # previously submitted job with the same client&#x27;s ID and job name.
2684 # The caller may use this field to ensure idempotence of job
2685 # creation across retried attempts to create a job.
2686 # By default, the field is empty and, in that case, the service ignores it.
2687 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002688 #
2689 # This field is set by the Cloud Dataflow service when the Job is
2690 # created, and is immutable for the life of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002691 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
2692 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002693 # corresponding name prefixes of the new job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002694 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002695 },
Bu Sun Kim65020912020-05-20 12:08:20 -07002696 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07002697 &quot;internalExperiments&quot;: { # Experimental settings.
2698 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
2699 },
2700 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
2701 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
2702 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
2703 # with worker_zone. If neither worker_region nor worker_zone is specified,
2704 # default to the control plane&#x27;s region.
2705 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
2706 # at rest, AKA a Customer Managed Encryption Key (CMEK).
2707 #
2708 # Format:
2709 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
2710 &quot;userAgent&quot;: { # A description of the process that generated the request.
2711 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2712 },
2713 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
2714 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
2715 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
2716 # with worker_region. If neither worker_region nor worker_zone is specified,
2717 # a zone in the control plane&#x27;s region is chosen based on available capacity.
2718 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -07002719 # unspecified, the service will attempt to choose a reasonable
2720 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -07002721 # e.g. &quot;compute.googleapis.com&quot;.
2722 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
2723 # storage. The system will append the suffix &quot;/temp-{JOBNAME} to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002724 # this resource prefix, where {JOBNAME} is the value of the
2725 # job_name field. The resulting bucket and object prefix is used
2726 # as the prefix of the resources used to store temporary data
2727 # needed during the job execution. NOTE: This will override the
2728 # value in taskrunner_settings.
2729 # The supported resource type is:
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04002730 #
2731 # Google Cloud Storage:
2732 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002733 # storage.googleapis.com/{bucket}/{object}
2734 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -07002735 &quot;experiments&quot;: [ # The list of experiments to enable.
2736 &quot;A String&quot;,
2737 ],
2738 &quot;version&quot;: { # A structure describing which components and their versions of the service
2739 # are required in order to run the job.
2740 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2741 },
2742 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Bu Sun Kim4ed7d3f2020-05-27 12:20:54 -07002743 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
2744 # options are passed through the service and are used to recreate the
2745 # SDK pipeline options on the worker in a language agnostic and platform
2746 # independent way.
2747 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
2748 },
2749 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
2750 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
2751 # specified in order for the job to have workers.
2752 { # Describes one particular pool of Cloud Dataflow workers to be
2753 # instantiated by the Cloud Dataflow service in order to perform the
2754 # computations required by a job. Note that a workflow job may use
2755 # multiple pools, in order to match the various computational
2756 # requirements of the various stages of the job.
2757 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
2758 # service will choose a number of threads (according to the number of cores
2759 # on the selected machine type for batch, or 1 by convention for streaming).
2760 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
2761 # execute the job. If zero or unspecified, the service will
2762 # attempt to choose a reasonable default.
2763 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
2764 # will attempt to choose a reasonable default.
2765 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
2766 &quot;packages&quot;: [ # Packages to be installed on workers.
2767 { # The packages that must be installed in order for a worker to run the
2768 # steps of the Cloud Dataflow job that will be assigned to its worker
2769 # pool.
2770 #
2771 # This is the mechanism by which the Cloud Dataflow SDK causes code to
2772 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
2773 # might use this to install jars containing the user&#x27;s code and all of the
2774 # various dependencies (libraries, data files, etc.) required in order
2775 # for that code to run.
2776 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
2777 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
2778 #
2779 # Google Cloud Storage:
2780 #
2781 # storage.googleapis.com/{bucket}
2782 # bucket.storage.googleapis.com/
2783 },
2784 ],
2785 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turndown worker pool.
2786 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
2787 # `TEARDOWN_NEVER`.
2788 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
2789 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
2790 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
2791 # down.
2792 #
2793 # If the workers are not torn down by the service, they will
2794 # continue to run and use Google Compute Engine VM resources in the
2795 # user&#x27;s project until they are explicitly terminated by the user.
2796 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
2797 # policy except for small, manually supervised test jobs.
2798 #
2799 # If unknown or unspecified, the service will attempt to choose a reasonable
2800 # default.
2801 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
2802 # Compute Engine API.
2803 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
2804 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
2805 },
2806 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
2807 # attempt to choose a reasonable default.
2808 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
2809 # harness, residing in Google Container Registry.
2810 #
2811 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
2812 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
2813 # attempt to choose a reasonable default.
2814 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
2815 # service will attempt to choose a reasonable default.
2816 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
2817 # are supported.
2818 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
2819 # only be set in the Fn API path. For non-cross-language pipelines this
2820 # should have only one entry. Cross-language pipelines will have two or more
2821 # entries.
2822 { # Defines a SDK harness container for executing Dataflow pipelines.
2823 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
2824 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
2825 # container instance with this image. If false (or unset) recommends using
2826 # more than one core per SDK container instance with this image for
2827 # efficiency. Note that Dataflow service may choose to override this property
2828 # if needed.
2829 },
2830 ],
2831 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
2832 { # Describes the data disk used by a workflow job.
2833 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
2834 # must be a disk type appropriate to the project and zone in which
2835 # the workers will run. If unknown or unspecified, the service
2836 # will attempt to choose a reasonable default.
2837 #
2838 # For example, the standard persistent disk type is a resource name
2839 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
2840 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
2841 # actual valid values are defined the Google Compute Engine API,
2842 # not by the Cloud Dataflow API; consult the Google Compute Engine
2843 # documentation for more information about determining the set of
2844 # available disk types for a particular project and zone.
2845 #
2846 # Google Compute Engine Disk types are local to a particular
2847 # project in a particular zone, and so the resource name will
2848 # typically look something like this:
2849 #
2850 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
2851 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
2852 # attempt to choose a reasonable default.
2853 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
2854 },
2855 ],
2856 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
2857 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
2858 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
2859 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
2860 # using the standard Dataflow task runner. Users should ignore
2861 # this field.
2862 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
2863 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
2864 # taskrunner; e.g. &quot;wheel&quot;.
2865 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
2866 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
2867 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
2868 # access the Cloud Dataflow API.
2869 &quot;A String&quot;,
2870 ],
2871 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
2872 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
2873 # will not be uploaded.
2874 #
2875 # The supported resource type is:
2876 #
2877 # Google Cloud Storage:
2878 # storage.googleapis.com/{bucket}/{object}
2879 # bucket.storage.googleapis.com/{object}
2880 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
2881 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
2882 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
2883 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
2884 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
2885 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
2886 # temporary storage.
2887 #
2888 # The supported resource type is:
2889 #
2890 # Google Cloud Storage:
2891 # storage.googleapis.com/{bucket}/{object}
2892 # bucket.storage.googleapis.com/{object}
2893 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
2894 #
2895 # When workers access Google Cloud APIs, they logically do so via
2896 # relative URLs. If this field is specified, it supplies the base
2897 # URL to use for resolving these relative URLs. The normative
2898 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
2899 # Locators&quot;.
2900 #
2901 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
2902 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
2903 # console.
                &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
                &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
                  &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                      # storage.
                      #
                      # The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}/{object}
                      #   bucket.storage.googleapis.com/{object}
                  &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
                  &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                      #
                      # When workers access Google Cloud APIs, they logically do so via
                      # relative URLs. If this field is specified, it supplies the base
                      # URL to use for resolving these relative URLs. The normative
                      # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                      # Locators&quot;.
                      #
                      # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
                  &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                      # &quot;dataflow/v1b3/projects&quot;.
                  &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                      # &quot;shuffle/v1beta1&quot;.
                  &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
                },
                &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. &quot;root&quot;.
                &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
              },
              &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
                &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
                &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
              },
              &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
                &quot;a_key&quot;: &quot;A String&quot;,
              },
              &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
                  # select a default set of packages which are useful to worker
                  # harnesses written in a particular language.
              &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
                  # the service will use the network &quot;default&quot;.
            },
          ],
          &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
              # related tables are stored.
              #
              # The supported resource type is:
              #
              # Google BigQuery:
              #   bigquery.googleapis.com/{dataset}
        },
        &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
          { # A message describing the state of a particular execution stage.
            &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
            &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
            &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
          },
        ],
        &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
            # by the metadata values provided here. Populated for ListJobs and all GetJob
            # views SUMMARY and higher.
            # ListJob response and Job SUMMARY view.
          &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
            { # Metadata for a Datastore connector used by the job.
              &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
            },
          ],
          &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
            &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
            &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
            &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
          },
          &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
            { # Metadata for a BigQuery connector used by the job.
              &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
              &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
              &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
            },
          ],
          &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
            { # Metadata for a File connector used by the job.
              &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
            },
          ],
          &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
            { # Metadata for a PubSub connector used by the job.
              &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
              &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
            },
          ],
          &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
            { # Metadata for a BigTable connector used by the job.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
              &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
            },
          ],
          &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
            { # Metadata for a Spanner connector used by the job.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
            },
          ],
        },
        &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
        &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
        &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
            # snapshot.
        &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
            # A description of the user pipeline and stages through which it is executed.
            # Created by Cloud Dataflow service. Only retrieved with
            # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
            # form. This data is provided by the Dataflow service for ease of visualizing
            # the pipeline and interpreting Dataflow provided metrics.
          &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
            { # Description of the composing transforms, names/ids, and input/outputs of a
                # stage of execution. Some composing transforms and sources may have been
                # generated by the Dataflow service during execution planning.
              &quot;outputSource&quot;: [ # Output sources for this stage.
                { # Description of an input or output of an execution stage.
                  &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
              &quot;inputSource&quot;: [ # Input sources for this stage.
                { # Description of an input or output of an execution stage.
                  &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
              &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
                { # Description of a transform executed as part of an execution stage.
                  &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                      # most closely associated.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
                },
              ],
              &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
                { # Description of an interstitial value between transforms in an execution
                    # stage.
                  &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
                  &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
                  &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                      # source is most closely associated.
                },
              ],
              &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
            },
          ],
          &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
            { # Description of the type, names/ids, and input/outputs for a transform.
              &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
              &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
                &quot;A String&quot;,
              ],
              &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
              &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
              &quot;displayData&quot;: [ # Transform-specific display data.
                { # Data provided with a pipeline or transform to provide descriptive info.
                  &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
                  &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
                  &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                      # language namespace (i.e. python module) which defines the display data.
                      # This allows a dax monitoring system to specially handle the data
                      # and perform custom rendering.
                  &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
                  &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                      # This is intended to be used as a label for the display data
                      # when viewed in a dax monitoring system.
                  &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                      # For example a java_class_name_value of com.mypackage.MyDoFn
                      # will be stored with MyDoFn as the short_str_value and
                      # com.mypackage.MyDoFn as the java_class_name value.
                      # short_str_value can be displayed and java_class_name_value
                      # will be displayed as a tooltip.
                  &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
                  &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
                  &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
                  &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
                  &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
                  &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
                },
              ],
              &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
                &quot;A String&quot;,
              ],
            },
          ],
          &quot;displayData&quot;: [ # Pipeline level display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (i.e. python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
            },
          ],
        },
        &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
            # of the job it replaced.
            #
            # When sending a `CreateJobRequest`, you can update a job by specifying it
            # here. The job named here is stopped, and its intermediate state is
            # transferred to this job.
        &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
            # for temporary storage. These temporary files will be
            # removed on job completion.
            # No duplicates are allowed.
            # No file patterns are supported.
            #
            # The supported files are:
            #
            # Google Cloud Storage:
            #
            #   storage.googleapis.com/{bucket}/{object}
            #   bucket.storage.googleapis.com/{object}
          &quot;A String&quot;,
        ],
        &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
            #
            # Only one Job with a given name may exist in a project at any
            # given time. If a caller attempts to create a Job with the same
            # name as an already-existing Job, the attempt returns the
            # existing Job.
            #
            # The name must match the regular expression
            # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
        &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
            #
            # The top-level steps that constitute the entire job.
          { # Defines a particular step within a Cloud Dataflow job.
              #
              # A job consists of multiple steps, each of which performs some
              # specific operation as part of the overall job. Data is typically
              # passed from one step to another as part of the job.
              #
              # Here&#x27;s an example of a sequence of steps which together implement a
              # Map-Reduce job:
              #
              # * Read a collection of data from some source, parsing the
              #   collection&#x27;s elements.
              #
              # * Validate the elements.
              #
              # * Apply a user-defined function to map each element to some value
              #   and extract an element-specific key value.
              #
              # * Group elements with the same key into a single element with
              #   that key, transforming a multiply-keyed collection into a
              #   uniquely-keyed collection.
              #
              # * Write the elements out to some data sink.
              #
              # Note that the Cloud Dataflow service may be used to run many different
              # types of jobs, not just Map-Reduce.
            &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
                # step with respect to all other steps in the Cloud Dataflow job.
            &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
            &quot;properties&quot;: { # Named properties associated with the step. Each kind of
                # predefined step has its own required set of properties.
                # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
              &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
            },
          },
        ],
        &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
            # `JOB_STATE_UPDATED`), this field contains the ID of that job.
        &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
            # isn&#x27;t contained in the submitted job.
          &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
            &quot;a_key&quot;: { # Contains information about how a particular
                # google.dataflow.v1beta3.Step will be executed.
              &quot;stepName&quot;: [ # The steps associated with the execution stage.
                  # Note that stages may have several steps, and that a given step
                  # might be run by more than one stage.
                &quot;A String&quot;,
              ],
            },
          },
        },
        &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
            #
            # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
            # specified.
            #
            # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
            # terminal state. After a job has reached a terminal state, no
            # further state updates may be made.
            #
            # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
        &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
            # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
            # contains this job.
        &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
            # Flexible resource scheduling jobs are started with some delay after job
            # creation, so start_time is unset before start and is updated when the
            # job is started by the Cloud Dataflow service. For other jobs, start_time
            # always equals create_time and is immutable and set by the Cloud Dataflow
            # service.
        &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
        &quot;labels&quot;: { # User-defined labels for this job.
            #
            # The labels map can contain no more than 64 entries. Entries of the labels
            # map are UTF8 strings that comply with the following restrictions:
            #
            # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
            # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
            # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
            #   size.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
            # Cloud Dataflow service.
        &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
            #
            # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
            # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
            # also be used to directly set a job&#x27;s requested state to
            # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
            # job if it has not already reached a terminal state.
      },
    ],
    &quot;failedLocation&quot;: [ # Zero or more messages describing the [regional endpoints]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # failed to respond.
      { # Indicates which [regional endpoint]
          # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed
          # to respond to a request for data.
        &quot;name&quot;: &quot;A String&quot;, # The name of the [regional endpoint]
            # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
            # failed to respond.
      },
    ],
    &quot;nextPageToken&quot;: &quot;A String&quot;, # Set if there may be more results than fit in this response.
  }</pre>
</div>

<div class="method">
    <code class="details" id="list_next">list_next(previous_request, previous_response)</code>
  <pre>Retrieves the next page of results.

Args:
  previous_request: The request for the previous page. (required)
  previous_response: The response from the request for the previous page. (required)

Returns:
  A request object that you can call &#x27;execute()&#x27; on to request the next
  page. Returns None if there are no more items in the collection.
  </pre>
</div>
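The paging contract described above — call `execute()` on a request, then ask for the next request until it is `None` — can be sketched as a short loop. The `FakeRequest` class and `fake_list_next` function below are hypothetical stand-ins for the generated client's request objects and its `list_next(previous_request, previous_response)` method; they only illustrate the control flow, not the real client internals.

```python
# Hypothetical stand-in for the request objects returned by the generated
# client; only the paging behavior of a jobs.list call is modeled here.
class FakeRequest:
    def __init__(self, pages, index=0):
        self.pages = pages    # list of jobs.list response dicts
        self.index = index    # which page this request fetches

    def execute(self):
        # One page of a jobs.list response.
        return self.pages[self.index]


def fake_list_next(previous_request, previous_response):
    # Mirrors list_next(previous_request, previous_response): returns a
    # request for the next page, or None when the collection is exhausted.
    if previous_response.get("nextPageToken") is None:
        return None
    return FakeRequest(previous_request.pages, previous_request.index + 1)


def iter_jobs(request, list_next):
    """Yield every job across all pages of a jobs.list call."""
    while request is not None:
        response = request.execute()
        for job in response.get("jobs", []):
            yield job
        request = list_next(request, response)


pages = [
    {"jobs": [{"id": "a"}, {"id": "b"}], "nextPageToken": "t1"},
    {"jobs": [{"id": "c"}]},  # no nextPageToken: this is the last page
]
all_jobs = list(iter_jobs(FakeRequest(pages), fake_list_next))
```

With the real client, `request` would come from `dataflow.projects().jobs().list(projectId=...)` and the `list_next` callable would be `dataflow.projects().jobs().list_next`.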

<div class="method">
    <code class="details" id="snapshot">snapshot(projectId, jobId, body=None, x__xgafv=None)</code>
  <pre>Snapshot the state of a streaming job.

Args:
  projectId: string, The project which owns the job to be snapshotted. (required)
  jobId: string, The job to be snapshotted. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request to create a snapshot of a job.
    &quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
    &quot;snapshotSources&quot;: True or False, # If true, perform snapshots for sources which support this.
    &quot;ttl&quot;: &quot;A String&quot;, # TTL for the snapshot.
    &quot;location&quot;: &quot;A String&quot;, # The location that contains this job.
  }

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Represents a snapshot of a job.
      &quot;pubsubMetadata&quot;: [ # PubSub snapshot metadata.
        { # Represents a Pubsub snapshot.
          &quot;snapshotName&quot;: &quot;A String&quot;, # The name of the Pubsub snapshot.
          &quot;topicName&quot;: &quot;A String&quot;, # The name of the Pubsub topic.
          &quot;expireTime&quot;: &quot;A String&quot;, # The expire time of the Pubsub snapshot.
        },
      ],
      &quot;creationTime&quot;: &quot;A String&quot;, # The time this snapshot was created.
      &quot;sourceJobId&quot;: &quot;A String&quot;, # The job this snapshot was created from.
      &quot;state&quot;: &quot;A String&quot;, # State of the snapshot.
      &quot;projectId&quot;: &quot;A String&quot;, # The project this snapshot belongs to.
      &quot;ttl&quot;: &quot;A String&quot;, # The time after which this snapshot will be automatically deleted.
      &quot;id&quot;: &quot;A String&quot;, # The unique ID of this snapshot.
      &quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
      &quot;diskSizeBytes&quot;: &quot;A String&quot;, # The disk byte size of the snapshot. Only available for snapshots in READY
          # state.
    }</pre>
</div>
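As a concrete illustration of the snapshot request body documented above, the sketch below assembles the four documented fields into a dict. The helper name and the default TTL and location values are illustrative assumptions, not part of the API.

```python
def build_snapshot_request(description="", ttl="604800s",
                           location="us-central1", snapshot_sources=True):
    """Assemble a request body for projects.jobs.snapshot.

    Field names follow the request documented above; the defaults used
    here (a one-week TTL, the us-central1 location) are illustrative.
    """
    return {
        "description": description,
        "snapshotSources": snapshot_sources,
        "ttl": ttl,
        "location": location,
    }


body = build_snapshot_request(description="pre-deploy snapshot")
# With a real client, this body would be passed to the method as:
#   dataflow.projects().jobs().snapshot(
#       projectId="my-project", jobId="my-job", body=body).execute()
```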
3332
3333<div class="method">
Dan O'Mearadd494642020-05-01 07:42:23 -07003334 <code class="details" id="update">update(projectId, jobId, body=None, location=None, x__xgafv=None)</code>
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04003335 <pre>Updates the state of an existing Cloud Dataflow job.
Nathaniel Manista4f877e52015-06-15 16:44:50 +00003336
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003337To update the state of an existing job, we recommend using
3338`projects.locations.jobs.update` with a [regional endpoint]
3339(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
3340`projects.jobs.update` is not recommended, as you can only update the state
3341of jobs that are running in `us-central1`.
3342
Nathaniel Manista4f877e52015-06-15 16:44:50 +00003343Args:
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04003344 projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
3345 jobId: string, The job ID. (required)
Dan O'Mearadd494642020-05-01 07:42:23 -07003346 body: object, The request body.
Nathaniel Manista4f877e52015-06-15 16:44:50 +00003347 The object takes the form of:
3348
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04003349{ # Defines a job to be run by the Cloud Dataflow service.
Bu Sun Kim65020912020-05-20 12:08:20 -07003350 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
3351 # If this field is set, the service will ensure its uniqueness.
3352 # The request to create a job will fail if the service has knowledge of a
3353 # previously submitted job with the same client&#x27;s ID and job name.
3354 # The caller may use this field to ensure idempotence of job
3355 # creation across retried attempts to create a job.
3356 # By default, the field is empty and, in that case, the service ignores it.
3357 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003358 #
3359 # This field is set by the Cloud Dataflow service when the Job is
3360 # created, and is immutable for the life of the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003361 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
3362 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003363 # corresponding name prefixes of the new job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003364 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003365 },
Bu Sun Kim65020912020-05-20 12:08:20 -07003366 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003367 &quot;internalExperiments&quot;: { # Experimental settings.
3368 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
3369 },
3370 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
3371 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
3372 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
3373 # with worker_zone. If neither worker_region nor worker_zone is specified,
3374 # default to the control plane&#x27;s region.
3375 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
3376 # at rest, AKA a Customer Managed Encryption Key (CMEK).
3377 #
3378 # Format:
3379 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
3380 &quot;userAgent&quot;: { # A description of the process that generated the request.
3381 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
3382 },
3383 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
3384 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
3385 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
3386 # with worker_region. If neither worker_region nor worker_zone is specified,
3387 # a zone in the control plane&#x27;s region is chosen based on available capacity.
3388 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
Dan O'Mearadd494642020-05-01 07:42:23 -07003389 # unspecified, the service will attempt to choose a reasonable
3390 # default. This should be in the form of the API service name,
Bu Sun Kim65020912020-05-20 12:08:20 -07003391 # e.g. &quot;compute.googleapis.com&quot;.
3392 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
3393 # storage. The system will append the suffix &quot;/temp-{JOBNAME} to
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003394 # this resource prefix, where {JOBNAME} is the value of the
3395 # job_name field. The resulting bucket and object prefix is used
3396 # as the prefix of the resources used to store temporary data
3397 # needed during the job execution. NOTE: This will override the
3398 # value in taskrunner_settings.
3399 # The supported resource type is:
3400 #
3401 # Google Cloud Storage:
3402 #
3403 # storage.googleapis.com/{bucket}/{object}
3404 # bucket.storage.googleapis.com/{object}
    &quot;experiments&quot;: [ # The list of experiments to enable.
      &quot;A String&quot;,
    ],
    &quot;version&quot;: { # A structure describing which components and their versions of the service
        # are required in order to run the job.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
        # options are passed through the service and are used to recreate the
        # SDK pipeline options on the worker in a language agnostic and platform
        # independent way.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
    &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
        # specified in order for the job to have workers.
      { # Describes one particular pool of Cloud Dataflow workers to be
          # instantiated by the Cloud Dataflow service in order to perform the
          # computations required by a job. Note that a workflow job may use
          # multiple pools, in order to match the various computational
          # requirements of the various stages of the job.
        &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
            # service will choose a number of threads (according to the number of cores
            # on the selected machine type for batch, or 1 by convention for streaming).
        &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
            # execute the job. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
            # will attempt to choose a reasonable default.
        &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
        &quot;packages&quot;: [ # Packages to be installed on workers.
          { # The packages that must be installed in order for a worker to run the
              # steps of the Cloud Dataflow job that will be assigned to its worker
              # pool.
              #
              # This is the mechanism by which the Cloud Dataflow SDK causes code to
              # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
              # might use this to install jars containing the user&#x27;s code and all of the
              # various dependencies (libraries, data files, etc.) required in order
              # for that code to run.
            &quot;name&quot;: &quot;A String&quot;, # The name of the package.
            &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                #
                # Google Cloud Storage:
                #
                # storage.googleapis.com/{bucket}
                # bucket.storage.googleapis.com/
          },
        ],
        &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
            # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
            # `TEARDOWN_NEVER`.
            # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
            # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
            # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
            # down.
            #
            # If the workers are not torn down by the service, they will
            # continue to run and use Google Compute Engine VM resources in the
            # user&#x27;s project until they are explicitly terminated by the user.
            # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
            # policy except for small, manually supervised test jobs.
            #
            # If unknown or unspecified, the service will attempt to choose a reasonable
            # default.
        &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
            # Compute Engine API.
        &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
        },
        &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
            # harness, residing in Google Container Registry.
            #
            # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
        &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
            # service will attempt to choose a reasonable default.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
            # are supported.
        &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
            # only be set in the Fn API path. For non-cross-language pipelines this
            # should have only one entry. Cross-language pipelines will have two or more
            # entries.
          { # Defines an SDK harness container for executing Dataflow pipelines.
            &quot;containerImage&quot;: &quot;A String&quot;, # A Docker container image that resides in Google Container Registry.
            &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                # container instance with this image. If false (or unset) recommends using
                # more than one core per SDK container instance with this image for
                # efficiency. Note that Dataflow service may choose to override this property
                # if needed.
          },
        ],
        &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
          { # Describes the data disk used by a workflow job.
            &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                # must be a disk type appropriate to the project and zone in which
                # the workers will run. If unknown or unspecified, the service
                # will attempt to choose a reasonable default.
                #
                # For example, the standard persistent disk type is a resource name
                # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                # actual valid values are defined by the Google Compute Engine API,
                # not by the Cloud Dataflow API; consult the Google Compute Engine
                # documentation for more information about determining the set of
                # available disk types for a particular project and zone.
                #
                # Google Compute Engine Disk types are local to a particular
                # project in a particular zone, and so the resource name will
                # typically look something like this:
                #
                # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
            &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
          },
        ],
        &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
            # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
        &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
        &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
            # using the standard Dataflow task runner. Users should ignore
            # this field.
          &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
          &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;wheel&quot;.
          &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
          &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
          &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
              # access the Cloud Dataflow API.
            &quot;A String&quot;,
          ],
          &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
          &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
              # will not be uploaded.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              # storage.googleapis.com/{bucket}/{object}
              # bucket.storage.googleapis.com/{object}
          &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
          &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
          &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
          &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
          &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
              # temporary storage.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              # storage.googleapis.com/{bucket}/{object}
              # bucket.storage.googleapis.com/{object}
          &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
              #
              # When workers access Google Cloud APIs, they logically do so via
              # relative URLs. If this field is specified, it supplies the base
              # URL to use for resolving these relative URLs. The normative
              # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
              # Locators&quot;.
              #
              # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
          &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
              # console.
          &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
          &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                # storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #
                # storage.googleapis.com/{bucket}/{object}
                # bucket.storage.googleapis.com/{object}
            &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
            &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                # &quot;dataflow/v1b3/projects&quot;.
            &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                # &quot;shuffle/v1beta1&quot;.
            &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
          },
          &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;root&quot;.
          &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
        },
        &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
          &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
          &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
        },
        &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
            # select a default set of packages which are useful to worker
            # harnesses written in a particular language.
        &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
            # the service will use the network &quot;default&quot;.
      },
    ],
    &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
        # related tables are stored.
        #
        # The supported resource type is:
        #
        # Google BigQuery:
        # bigquery.googleapis.com/{dataset}
  },
  &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
    { # A message describing the state of a particular execution stage.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
      &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
    },
  ],
  &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
      # by the metadata values provided here. Populated for ListJobs and all GetJob
      # views SUMMARY and higher.
      # ListJob response and Job SUMMARY view.
    &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
      { # Metadata for a Datastore connector used by the job.
        &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
      },
    ],
    &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
      &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
      &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
    },
    &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
      { # Metadata for a BigQuery connector used by the job.
        &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
        &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
        &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
      },
    ],
    &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
      { # Metadata for a File connector used by the job.
        &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
      },
    ],
    &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
      { # Metadata for a PubSub connector used by the job.
        &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
      },
    ],
    &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
      { # Metadata for a BigTable connector used by the job.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
      },
    ],
    &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
      { # Metadata for a Spanner connector used by the job.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
      },
    ],
  },
  &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
  &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
  &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
      # snapshot.
  &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
      # A description of the user pipeline and stages through which it is executed.
      # Created by Cloud Dataflow service. Only retrieved with
      # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
      # form. This data is provided by the Dataflow service for ease of visualizing
      # the pipeline and interpreting Dataflow provided metrics.
    &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
      { # Description of the composing transforms, names/ids, and input/outputs of a
          # stage of execution. Some composing transforms and sources may have been
          # generated by the Dataflow service during execution planning.
        &quot;outputSource&quot;: [ # Output sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
        &quot;inputSource&quot;: [ # Input sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
        &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
          { # Description of a transform executed as part of an execution stage.
            &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                # most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
          },
        ],
        &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
          { # Description of an interstitial value between transforms in an execution
              # stage.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
      },
    ],
    &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
      { # Description of the type, names/ids, and input/outputs for a transform.
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
        &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
          &quot;A String&quot;,
        ],
        &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
        &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
        &quot;displayData&quot;: [ # Transform-specific display data.
          { # Data provided with a pipeline or transform to provide descriptive info.
            &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
            &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
            &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                # language namespace (i.e. python module) which defines the display data.
                # This allows a dax monitoring system to specially handle the data
                # and perform custom rendering.
            &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
            &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                # This is intended to be used as a label for the display data
                # when viewed in a dax monitoring system.
            &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                # For example a java_class_name_value of com.mypackage.MyDoFn
                # will be stored with MyDoFn as the short_str_value and
                # com.mypackage.MyDoFn as the java_class_name value.
                # short_str_value can be displayed and java_class_name_value
                # will be displayed as a tooltip.
            &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
            &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
            &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
            &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
            &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          },
        ],
        &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
          &quot;A String&quot;,
        ],
      },
    ],
    &quot;displayData&quot;: [ # Pipeline level display data.
      { # Data provided with a pipeline or transform to provide descriptive info.
        &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
        &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
        &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
            # language namespace (i.e. python module) which defines the display data.
            # This allows a dax monitoring system to specially handle the data
            # and perform custom rendering.
        &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
        &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
            # This is intended to be used as a label for the display data
            # when viewed in a dax monitoring system.
        &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
            # For example a java_class_name_value of com.mypackage.MyDoFn
            # will be stored with MyDoFn as the short_str_value and
            # com.mypackage.MyDoFn as the java_class_name value.
            # short_str_value can be displayed and java_class_name_value
            # will be displayed as a tooltip.
        &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
        &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
        &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
        &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
        &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
      },
    ],
  },
  &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
      # of the job it replaced.
      #
      # When sending a `CreateJobRequest`, you can update a job by specifying it
      # here. The job named here is stopped, and its intermediate state is
      # transferred to this job.
  &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
      # for temporary storage. These temporary files will be
      # removed on job completion.
      # No duplicates are allowed.
      # No file patterns are supported.
      #
      # The supported files are:
      #
      # Google Cloud Storage:
      #
      # storage.googleapis.com/{bucket}/{object}
      # bucket.storage.googleapis.com/{object}
    &quot;A String&quot;,
  ],
  &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
      #
      # Only one Job with a given name may exist in a project at any
      # given time. If a caller attempts to create a Job with the same
      # name as an already-existing Job, the attempt returns the
      # existing Job.
      #
      # The name must match the regular expression
      # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
  &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
      #
      # The top-level steps that constitute the entire job.
    { # Defines a particular step within a Cloud Dataflow job.
        #
        # A job consists of multiple steps, each of which performs some
        # specific operation as part of the overall job. Data is typically
        # passed from one step to another as part of the job.
        #
        # Here&#x27;s an example of a sequence of steps which together implement a
        # Map-Reduce job:
        #
        # * Read a collection of data from some source, parsing the
        # collection&#x27;s elements.
        #
        # * Validate the elements.
        #
        # * Apply a user-defined function to map each element to some value
        # and extract an element-specific key value.
        #
        # * Group elements with the same key into a single element with
        # that key, transforming a multiply-keyed collection into a
        # uniquely-keyed collection.
        #
        # * Write the elements out to some data sink.
        #
        # Note that the Cloud Dataflow service may be used to run many different
        # types of jobs, not just Map-Reduce.
      &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
          # step with respect to all other steps in the Cloud Dataflow job.
      &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
      &quot;properties&quot;: { # Named properties associated with the step. Each kind of
          # predefined step has its own required set of properties.
          # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
    },
  ],
  &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
      # `JOB_STATE_UPDATED`), this field contains the ID of that job.
  &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
      # isn&#x27;t contained in the submitted job.
    &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
      &quot;a_key&quot;: { # Contains information about how a particular
          # google.dataflow.v1beta3.Step will be executed.
        &quot;stepName&quot;: [ # The steps associated with the execution stage.
            # Note that stages may have several steps, and that a given step
            # might be run by more than one stage.
          &quot;A String&quot;,
        ],
      },
    },
  },
  &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
      #
      # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
      # specified.
      #
      # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
      # terminal state. After a job has reached a terminal state, no
      # further state updates may be made.
      #
      # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
  &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
      # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
      # contains this job.
3900 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
3901 # Flexible resource scheduling jobs are started with some delay after job
3902 # creation, so start_time is unset before start and is updated when the
3903 # job is started by the Cloud Dataflow service. For other jobs, start_time
3904 # always equals to create_time and is immutable and set by the Cloud Dataflow
3905 # service.
3906 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
3907 &quot;labels&quot;: { # User-defined labels for this job.
3908 #
3909 # The labels map can contain no more than 64 entries. Entries of the labels
3910 # map are UTF8 strings that comply with the following restrictions:
3911 #
3912 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
3913 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
3914 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
3915 # size.
3916 &quot;a_key&quot;: &quot;A String&quot;,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003917 },
Bu Sun Kim65020912020-05-20 12:08:20 -07003918 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
3919 # Cloud Dataflow service.
3920 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
3921 #
3922 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
3923 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
3924 # also be used to directly set a job&#x27;s requested state to
3925 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
3926 # job if it has not already reached a terminal state.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003927}

    location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job.
    x__xgafv: string, V1 error format.
      Allowed values
        1 - v1 error format
        2 - v2 error format

Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language agnostic and platform
          # independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}
                  #   bucket.storage.googleapis.com/
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines an SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
                  # container instance with this image. If false (or unset), recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that the Dataflow service may choose to override this
                  # property if needed.
            },
          ],
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
            &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                # temporary storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
            &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to the Google Compute Engine VM serial
                # console.
            &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
            &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
              &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                  # storage.
                  #
                  # The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}/{object}
                  #   bucket.storage.googleapis.com/{object}
              &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
              &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                  #
                  # When workers access Google Cloud APIs, they logically do so via
                  # relative URLs. If this field is specified, it supplies the base
                  # URL to use for resolving these relative URLs. The normative
                  # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                  # Locators&quot;.
                  #
                  # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
              &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                  # &quot;dataflow/v1b3/projects&quot;.
              &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                  # &quot;shuffle/v1beta1&quot;.
              &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
            },
            &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;root&quot;.
            &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
          },
          &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
            &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
            &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
          },
          &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
            &quot;a_key&quot;: &quot;A String&quot;,
          },
          &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
              # select a default set of packages which are useful to worker
              # harnesses written in a particular language.
          &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
              # the service will use the network &quot;default&quot;.
        },
      ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          #   bigquery.googleapis.com/{dataset}
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
        # ListJob response and Job SUMMARY view.
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
        },
      ],
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
        },
      ],
    },
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
        # A description of the user pipeline and stages through which it is executed.
        # Created by Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
        # form. This data is provided by the Dataflow service for ease of visualizing
        # the pipeline and interpreting Dataflow provided metrics.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and input/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
        },
      ],
4329 &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
4330 { # Description of the type, names/ids, and input/outputs for a transform.
4331 &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
4332 &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
4333 &quot;A String&quot;,
4334 ],
4335 &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
4336 &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
4337 &quot;displayData&quot;: [ # Transform-specific display data.
4338 { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (e.g. a Python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example, a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (e.g. a Python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example, a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #
        #    storage.googleapis.com/{bucket}/{object}
        #    bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          # * Read a collection of data from some source, parsing the
          #   collection&#x27;s elements.
          #
          # * Validate the elements.
          #
          # * Apply a user-defined function to map each element to some value
          #   and extract an element-specific key value.
          #
          # * Group elements with the same key into a single element with
          #   that key, transforming a multiply-keyed collection into a
          #   uniquely-keyed collection.
          #
          # * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
        # executed that isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF8 strings that comply with the following restrictions:
        #
        # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
        #   size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
  }</pre>
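<p>The <code>name</code> field documented above must match the regular expression <code>[a-z]([-a-z0-9]{0,38}[a-z0-9])?</code>. As an illustrative sketch (not part of the generated reference; the helper name <code>is_valid_job_name</code> is hypothetical), a client can check a proposed job name against this pattern before submitting a request:</p>

```python
import re

# Pattern taken verbatim from the Job "name" field documentation:
# a lowercase letter, then up to 39 more characters from [-a-z0-9],
# ending with a letter or digit (so at most 40 characters total).
_JOB_NAME_RE = re.compile(r"[a-z]([-a-z0-9]{0,38}[a-z0-9])?")

def is_valid_job_name(name: str) -> bool:
    """Return True if `name` satisfies the documented job-name pattern."""
    return _JOB_NAME_RE.fullmatch(name) is not None

assert is_valid_job_name("wordcount-2020")   # lowercase letters, digits, hyphens
assert not is_valid_job_name("Wordcount")    # uppercase letters are not allowed
assert not is_valid_job_name("wordcount-")   # may not end with a hyphen
```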
</div>

</body></html>