<html><body>
<style>

body, h1, h2, h3, div, span, p, pre, a {
  margin: 0;
  padding: 0;
  border: 0;
  font-weight: inherit;
  font-style: inherit;
  font-size: 100%;
  font-family: inherit;
  vertical-align: baseline;
}

body {
  font-size: 13px;
  padding: 1em;
}

h1 {
  font-size: 26px;
  margin-bottom: 1em;
}

h2 {
  font-size: 24px;
  margin-bottom: 1em;
}

h3 {
  font-size: 20px;
  margin-bottom: 1em;
  margin-top: 1em;
}

pre, code {
  line-height: 1.5;
  font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}

pre {
  margin-top: 0.5em;
}

h1, h2, h3, p {
  font-family: Arial, sans-serif;
}

h1, h2, h3 {
  border-bottom: solid #CCC 1px;
}

.toc_element {
  margin-top: 0.5em;
}

.firstline {
  margin-left: 2em;
}

.method {
  margin-top: 1em;
  border: solid 1px #CCC;
  padding: 1em;
  background: #EEE;
}

.details {
  font-weight: bold;
  font-size: 14px;
}

</style>

<h1><a href="dataflow_v1b3.html">Dataflow API</a> . <a href="dataflow_v1b3.projects.html">projects</a> . <a href="dataflow_v1b3.projects.locations.html">locations</a> . <a href="dataflow_v1b3.projects.locations.jobs.html">jobs</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.debug.html">debug()</a></code>
</p>
<p class="firstline">Returns the debug Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.messages.html">messages()</a></code>
</p>
<p class="firstline">Returns the messages Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.snapshots.html">snapshots()</a></code>
</p>
<p class="firstline">Returns the snapshots Resource.</p>

<p class="toc_element">
  <code><a href="dataflow_v1b3.projects.locations.jobs.workItems.html">workItems()</a></code>
</p>
<p class="firstline">Returns the workItems Resource.</p>

<p class="toc_element">
  <code><a href="#create">create(projectId, location, body=None, view=None, replaceJobId=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#get">get(projectId, location, jobId, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Gets the state of the specified Cloud Dataflow job.</p>
<p class="toc_element">
  <code><a href="#getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</a></code></p>
<p class="firstline">Requests the job status.</p>
<p class="toc_element">
  <code><a href="#list">list(projectId, location, filter=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</a></code></p>
<p class="firstline">Lists the jobs of a project.</p>
<p class="toc_element">
  <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
  <code><a href="#snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Snapshots the state of a streaming job.</p>
<p class="toc_element">
  <code><a href="#update">update(projectId, location, jobId, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Updates the state of an existing Cloud Dataflow job.</p>
<h3>Method Details</h3>
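<p>As a quick orientation before the per-method details, here is a minimal sketch of building a <code>create()</code> request body with the <code>google-api-python-client</code> library. The project ID, region, bucket, and worker settings are hypothetical placeholders, and the API call itself is shown commented out; treat this as an illustrative sketch, not canonical sample code.</p>

```python
# A minimal create() request body following the Job schema documented below.
# Every concrete value (bucket, machine type, worker count) is a
# hypothetical placeholder -- adjust for your own project.
job_body = {
    'environment': {
        'tempStoragePrefix': 'storage.googleapis.com/example-bucket/temp',
        'workerPools': [
            {
                'kind': 'harness',       # only `harness` and `shuffle` are supported
                'numWorkers': 3,         # zero/unspecified lets the service choose
                'machineType': 'n1-standard-1',
            },
        ],
    },
}

# The call itself (not executed here) would look roughly like:
#
#   from googleapiclient.discovery import build
#   dataflow = build('dataflow', 'v1b3')
#   response = dataflow.projects().locations().jobs().create(
#       projectId='example-project',
#       location='us-central1',
#       body=job_body).execute()
```

<p>Pagination with <code>list()</code> and <code>list_next()</code> follows the usual client-library pattern: keep calling <code>list_next(previous_request, previous_response)</code> until it returns <code>None</code>.</p>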
<div class="method">
    <code class="details" id="create">create(projectId, location, body=None, view=None, replaceJobId=None, x__xgafv=None)</code>
  <pre>Creates a Cloud Dataflow job.

To create a job, we recommend using `projects.locations.jobs.create` with a
[regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.create` is not recommended, as your job will always start
in `us-central1`.

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  body: object, The request body.
    The object takes the form of:

{ # Defines a job to be run by the Cloud Dataflow service.
  &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
      # If this field is set, the service will ensure its uniqueness.
      # The request to create a job will fail if the service has knowledge of a
      # previously submitted job with the same client&#x27;s ID and job name.
      # The caller may use this field to ensure idempotence of job
      # creation across retried attempts to create a job.
      # By default, the field is empty and, in that case, the service ignores it.
  &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
      #
      # This field is set by the Cloud Dataflow service when the Job is
      # created, and is immutable for the life of the job.
  &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
  &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
      # corresponding name prefixes of the new job.
    &quot;a_key&quot;: &quot;A String&quot;,
  },
  &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
    &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
        # options are passed through the service and are used to recreate the
        # SDK pipeline options on the worker in a language agnostic and platform
        # independent way.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
    &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
        # specified in order for the job to have workers.
      { # Describes one particular pool of Cloud Dataflow workers to be
          # instantiated by the Cloud Dataflow service in order to perform the
          # computations required by a job. Note that a workflow job may use
          # multiple pools, in order to match the various computational
          # requirements of the various stages of the job.
        &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
            # select a default set of packages which are useful to worker
            # harnesses written in a particular language.
        &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
            # the service will use the network &quot;default&quot;.
        &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
            # will attempt to choose a reasonable default.
        &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
            # execute the job. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
            # service will choose a number of threads (according to the number of cores
            # on the selected machine type for batch, or 1 by convention for streaming).
        &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
        &quot;packages&quot;: [ # Packages to be installed on workers.
          { # The packages that must be installed in order for a worker to run the
              # steps of the Cloud Dataflow job that will be assigned to its worker
              # pool.
              #
              # This is the mechanism by which the Cloud Dataflow SDK causes code to
              # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
              # might use this to install jars containing the user&#x27;s code and all of the
              # various dependencies (libraries, data files, etc.) required in order
              # for that code to run.
            &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                #
                # Google Cloud Storage:
                #
                #   storage.googleapis.com/{bucket}
                #   bucket.storage.googleapis.com/
            &quot;name&quot;: &quot;A String&quot;, # The name of the package.
          },
        ],
        &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
            # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
            # `TEARDOWN_NEVER`.
            # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
            # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
            # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
            # down.
            #
            # If the workers are not torn down by the service, they will
            # continue to run and use Google Compute Engine VM resources in the
            # user&#x27;s project until they are explicitly terminated by the user.
            # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
            # policy except for small, manually supervised test jobs.
            #
            # If unknown or unspecified, the service will attempt to choose a reasonable
            # default.
        &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
            # Compute Engine API.
        &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
        },
        &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
            # harness, residing in Google Container Registry.
            #
            # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
        &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
            # attempt to choose a reasonable default.
        &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
            # service will attempt to choose a reasonable default.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
            # are supported.
        &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
          { # Describes the data disk used by a workflow job.
            &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                # must be a disk type appropriate to the project and zone in which
                # the workers will run. If unknown or unspecified, the service
                # will attempt to choose a reasonable default.
                #
                # For example, the standard persistent disk type is a resource name
                # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                # actual valid values are defined by the Google Compute Engine API,
                # not by the Cloud Dataflow API; consult the Google Compute Engine
                # documentation for more information about determining the set of
                # available disk types for a particular project and zone.
                #
                # Google Compute Engine Disk types are local to a particular
                # project in a particular zone, and so the resource name will
                # typically look something like this:
                #
                #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
            &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
          },
        ],
        &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
            # only be set in the Fn API path. For non-cross-language pipelines this
            # should have only one entry. Cross-language pipelines will have two or more
            # entries.
          { # Defines a SDK harness container for executing Dataflow pipelines.
            &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
            &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                # container instance with this image. If false (or unset) recommends using
                # more than one core per SDK container instance with this image for
                # efficiency. Note that Dataflow service may choose to override this property
                # if needed.
          },
        ],
        &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
            # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
        &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
        &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
            # using the standard Dataflow task runner. Users should ignore
            # this field.
          &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
          &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;wheel&quot;.
          &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
          &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
          &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
              # access the Cloud Dataflow API.
            &quot;A String&quot;,
          ],
          &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
          &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
              # will not be uploaded.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
          &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
          &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
              # temporary storage.
              #
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
          &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
          &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
              #
              # When workers access Google Cloud APIs, they logically do so via
              # relative URLs. If this field is specified, it supplies the base
              # URL to use for resolving these relative URLs. The normative
              # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
              # Locators&quot;.
              #
              # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
          &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
              # console.
          &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
          &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
            &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
            &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                # &quot;dataflow/v1b3/projects&quot;.
            &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                # &quot;shuffle/v1beta1&quot;.
            &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                # storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
          },
          &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
          &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
              # taskrunner; e.g. &quot;root&quot;.
        },
        &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
          &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
          &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
        },
        &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
      },
    ],
    &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
        # related tables are stored.
        #
        # The supported resource type is:
        #
        # Google BigQuery:
        #   bigquery.googleapis.com/{dataset}
    &quot;internalExperiments&quot;: { # Experimental settings.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
    },
    &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
        # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
        # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
        # with worker_zone. If neither worker_region nor worker_zone is specified,
        # default to the control plane&#x27;s region.
    &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
        # at rest, AKA a Customer Managed Encryption Key (CMEK).
        #
        # Format:
        #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
    &quot;userAgent&quot;: { # A description of the process that generated the request.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
        # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
        # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
        # with worker_region. If neither worker_region nor worker_zone is specified,
        # a zone in the control plane&#x27;s region is chosen based on available capacity.
    &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
        # unspecified, the service will attempt to choose a reasonable
        # default. This should be in the form of the API service name,
        # e.g. &quot;compute.googleapis.com&quot;.
    &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
        # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
        # this resource prefix, where {JOBNAME} is the value of the
        # job_name field. The resulting bucket and object prefix is used
        # as the prefix of the resources used to store temporary data
        # needed during the job execution. NOTE: This will override the
        # value in taskrunner_settings.
        # The supported resource type is:
        #
        # Google Cloud Storage:
        #
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
    &quot;experiments&quot;: [ # The list of experiments to enable.
      &quot;A String&quot;,
    ],
    &quot;version&quot;: { # A structure describing which components and their versions of the service
        # are required in order to run the job.
      &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
    },
    &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
  },
  &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
      # callers cannot mutate it.
    { # A message describing the state of a particular execution stage.
      &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
      &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
    },
  ],
  &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
      # by the metadata values provided here. Populated for ListJobs and all GetJob
      # views SUMMARY and higher.
      # ListJob response and Job SUMMARY view.
    &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
      { # Metadata for a BigTable connector used by the job.
        &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
      },
    ],
    &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
      { # Metadata for a Spanner connector used by the job.
        &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
        &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
      },
    ],
    &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
      { # Metadata for a Datastore connector used by the job.
        &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
      },
    ],
    &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
      &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
      &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
      &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
    },
    &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
      { # Metadata for a BigQuery connector used by the job.
        &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
        &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
        &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
        &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
      },
    ],
    &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
      { # Metadata for a File connector used by the job.
        &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
      },
    ],
    &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
      { # Metadata for a PubSub connector used by the job.
        &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
        &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
      },
    ],
  },
  &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
      # snapshot.
  &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
  &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
  &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
      # A description of the user pipeline and stages through which it is executed.
      # Created by Cloud Dataflow service. Only retrieved with
      # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
      # form. This data is provided by the Dataflow service for ease of visualizing
      # the pipeline and interpreting Dataflow provided metrics.
    &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
      { # Description of the composing transforms, names/ids, and input/outputs of a
          # stage of execution. Some composing transforms and sources may have been
          # generated by the Dataflow service during execution planning.
        &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
        &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
          { # Description of a transform executed as part of an execution stage.
            &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                # most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
          },
        ],
        &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
          { # Description of an interstitial value between transforms in an execution
              # stage.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
          },
        ],
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
        &quot;outputSource&quot;: [ # Output sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
          },
        ],
        &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
        &quot;inputSource&quot;: [ # Input sources for this stage.
          { # Description of an input or output of an execution stage.
            &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                # source is most closely associated.
            &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
            &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
            &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
          },
        ],
      },
    ],
    &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
      { # Description of the type, names/ids, and input/outputs for a transform.
        &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
        &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
          &quot;A String&quot;,
        ],
        &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
        &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
        &quot;displayData&quot;: [ # Transform-specific display data.
          { # Data provided with a pipeline or transform to provide descriptive info.
            &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
            &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
            &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
            &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
            &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
541 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
542 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
543 # language namespace (i.e. python module) which defines the display data.
544 # This allows a dax monitoring system to specially handle the data
545 # and perform custom rendering.
546 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
547 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
548 # This is intended to be used as a label for the display data
549 # when viewed in a dax monitoring system.
550 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
551 # For example a java_class_name_value of com.mypackage.MyDoFn
552 # will be stored with MyDoFn as the short_str_value and
553 # com.mypackage.MyDoFn as the java_class_name value.
554 # short_str_value can be displayed and java_class_name_value
555 # will be displayed as a tooltip.
556 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
557 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
558 },
559 ],
560 &quot;outputCollectionName&quot;: [ # User names for all collection outputs of this transform.
561 &quot;A String&quot;,
562 ],
563 },
564 ],
565 &quot;displayData&quot;: [ # Pipeline level display data.
566 { # Data provided with a pipeline or transform to provide descriptive info.
567 &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
568 &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
569 &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
570 &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
571 &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
572 &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
573 &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
574 # language namespace (i.e. python module) which defines the display data.
575 # This allows a dax monitoring system to specially handle the data
576 # and perform custom rendering.
577 &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
578 &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
579 # This is intended to be used as a label for the display data
580 # when viewed in a dax monitoring system.
581 &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
582 # For example a java_class_name_value of com.mypackage.MyDoFn
583 # will be stored with MyDoFn as the short_str_value and
584 # com.mypackage.MyDoFn as the java_class_name value.
585 # short_str_value can be displayed and java_class_name_value
586 # will be displayed as a tooltip.
587 &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
588 &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
589 },
590 ],
591 },
592 &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
593 # of the job it replaced.
594 #
595 # When sending a `CreateJobRequest`, you can update a job by specifying it
596 # here. The job named here is stopped, and its intermediate state is
597 # transferred to this job.
598 &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
599 # for temporary storage. These temporary files will be
600 # removed on job completion.
601 # No duplicates are allowed.
602 # No file patterns are supported.
603 #
604 # The supported files are:
605 #
606 # Google Cloud Storage:
607 #
608 # storage.googleapis.com/{bucket}/{object}
609 # bucket.storage.googleapis.com/{object}
610 &quot;A String&quot;,
611 ],
612 &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
613 #
614 # Only one Job with a given name may exist in a project at any
615 # given time. If a caller attempts to create a Job with the same
616 # name as an already-existing Job, the attempt returns the
617 # existing Job.
618 #
619 # The name must match the regular expression
620 # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
621 &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
622 #
623 # The top-level steps that constitute the entire job.
624 { # Defines a particular step within a Cloud Dataflow job.
625 #
626 # A job consists of multiple steps, each of which performs some
627 # specific operation as part of the overall job. Data is typically
628 # passed from one step to another as part of the job.
629 #
630 # Here&#x27;s an example of a sequence of steps which together implement a
631 # Map-Reduce job:
632 #
633 # * Read a collection of data from some source, parsing the
634 # collection&#x27;s elements.
635 #
636 # * Validate the elements.
637 #
638 # * Apply a user-defined function to map each element to some value
639 # and extract an element-specific key value.
640 #
641 # * Group elements with the same key into a single element with
642 # that key, transforming a multiply-keyed collection into a
643 # uniquely-keyed collection.
644 #
645 # * Write the elements out to some data sink.
646 #
647 # Note that the Cloud Dataflow service may be used to run many different
648 # types of jobs, not just Map-Reduce.
649 &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
650 # step with respect to all other steps in the Cloud Dataflow job.
651 &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
652 &quot;properties&quot;: { # Named properties associated with the step. Each kind of
653 # predefined step has its own required set of properties.
654 # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
655 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
656 },
657 },
658 ],
659 &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
660 # `JOB_STATE_UPDATED`), this field contains the ID of that job.
661 &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
662 # isn&#x27;t contained in the submitted job.
663 &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
664 &quot;a_key&quot;: { # Contains information about how a particular
665 # google.dataflow.v1beta3.Step will be executed.
666 &quot;stepName&quot;: [ # The steps associated with the execution stage.
667 # Note that stages may have several steps, and that a given step
668 # might be run by more than one stage.
669 &quot;A String&quot;,
670 ],
671 },
672 },
673 },
674 &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
675 #
676 # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
677 # specified.
678 #
679 # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
680 # terminal state. After a job has reached a terminal state, no
681 # further state updates may be made.
682 #
683 # This field may be mutated by the Cloud Dataflow service;
684 # callers cannot mutate it.
685 &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
686 # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
687 # contains this job.
688 &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
689 # Flexible resource scheduling jobs are started with some delay after job
690 # creation, so start_time is unset before start and is updated when the
691 # job is started by the Cloud Dataflow service. For other jobs, start_time
692 # always equals create_time and is immutable and set by the Cloud Dataflow
693 # service.
694 &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
695 &quot;labels&quot;: { # User-defined labels for this job.
696 #
697 # The labels map can contain no more than 64 entries. Entries of the labels
698 # map are UTF8 strings that comply with the following restrictions:
699 #
700 # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
701 # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
702 # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
703 # size.
704 &quot;a_key&quot;: &quot;A String&quot;,
705 },
706 &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
707 # Cloud Dataflow service.
708 &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
709 #
710 # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
711 # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
712 # also be used to directly set a job&#x27;s requested state to
713 # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
714 # job if it has not already reached a terminal state.
715}
716
717 view: string, The level of information requested in response.
718 replaceJobId: string, Deprecated. This field is now in the Job message.
719 x__xgafv: string, V1 error format.
720 Allowed values
721 1 - v1 error format
722 2 - v2 error format
723
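As a sketch of how the request body above might be assembled and validated from Python (all names here — the job name, bucket path, project, and region — are illustrative placeholders, and the actual API call is shown commented out rather than executed):

```python
import re

# Minimal request body for projects.locations.jobs.create, following the
# Job schema documented above. Values are placeholders, not real resources.
job_body = {
    "name": "example-wordcount-job",
    "environment": {
        # tempStoragePrefix uses the storage.googleapis.com/{bucket}/{object}
        # form listed in the schema; "my-bucket" is a hypothetical bucket.
        "tempStoragePrefix": "storage.googleapis.com/my-bucket/temp",
    },
    "labels": {"team": "data-eng"},
}

# Job names must match `[a-z]([-a-z0-9]{0,38}[a-z0-9])?` per the schema.
NAME_RE = re.compile(r"^[a-z]([-a-z0-9]{0,38}[a-z0-9])?$")
assert NAME_RE.match(job_body["name"])

# With an authorized googleapiclient discovery service object, the call
# would look roughly like this (not run here):
#   service.projects().locations().jobs().create(
#       projectId="my-project", location="us-central1", body=job_body
#   ).execute()
```

Client-side checks like the regex above only mirror the documented constraints; the service remains the authority on what it accepts.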
724Returns:
725 An object of the form:
726
727 { # Defines a job to be run by the Cloud Dataflow service.
728 &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
729 # If this field is set, the service will ensure its uniqueness.
730 # The request to create a job will fail if the service has knowledge of a
731 # previously submitted job with the same client&#x27;s ID and job name.
732 # The caller may use this field to ensure idempotence of job
733 # creation across retried attempts to create a job.
734 # By default, the field is empty and, in that case, the service ignores it.
735 &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
736 #
737 # This field is set by the Cloud Dataflow service when the Job is
738 # created, and is immutable for the life of the job.
739 &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
740 &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
741 # corresponding name prefixes of the new job.
742 &quot;a_key&quot;: &quot;A String&quot;,
743 },
744 &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
745 &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
746 # options are passed through the service and are used to recreate the
747 # SDK pipeline options on the worker in a language agnostic and platform
748 # independent way.
749 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
750 },
751 &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
752 &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
753 # specified in order for the job to have workers.
754 { # Describes one particular pool of Cloud Dataflow workers to be
755 # instantiated by the Cloud Dataflow service in order to perform the
756 # computations required by a job. Note that a workflow job may use
757 # multiple pools, in order to match the various computational
758 # requirements of the various stages of the job.
759 &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
760 # select a default set of packages which are useful to worker
761 # harnesses written in a particular language.
762 &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
763 # the service will use the network &quot;default&quot;.
764 &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
765 # will attempt to choose a reasonable default.
766 &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
767 # execute the job. If zero or unspecified, the service will
768 # attempt to choose a reasonable default.
769 &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
770 # service will choose a number of threads (according to the number of cores
771 # on the selected machine type for batch, or 1 by convention for streaming).
772 &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
773 &quot;packages&quot;: [ # Packages to be installed on workers.
774 { # The packages that must be installed in order for a worker to run the
775 # steps of the Cloud Dataflow job that will be assigned to its worker
776 # pool.
777 #
778 # This is the mechanism by which the Cloud Dataflow SDK causes code to
779 # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
780 # might use this to install jars containing the user&#x27;s code and all of the
781 # various dependencies (libraries, data files, etc.) required in order
782 # for that code to run.
783 &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
784 #
785 # Google Cloud Storage:
786 #
787 # storage.googleapis.com/{bucket}
788 # bucket.storage.googleapis.com/
789 &quot;name&quot;: &quot;A String&quot;, # The name of the package.
790 },
791 ],
792 &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
793 # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
794 # `TEARDOWN_NEVER`.
795 # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
796 # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
797 # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
798 # down.
799 #
800 # If the workers are not torn down by the service, they will
801 # continue to run and use Google Compute Engine VM resources in the
802 # user&#x27;s project until they are explicitly terminated by the user.
803 # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
804 # policy except for small, manually supervised test jobs.
805 #
806 # If unknown or unspecified, the service will attempt to choose a reasonable
807 # default.
808 &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
809 # Compute Engine API.
810 &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
811 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
812 },
813 &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
814 # attempt to choose a reasonable default.
815 &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
816 # harness, residing in Google Container Registry.
817 #
818 # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
819 &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
820 # attempt to choose a reasonable default.
821 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
822 # service will attempt to choose a reasonable default.
823 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
824 # are supported.
825 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
826 { # Describes the data disk used by a workflow job.
827 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
828 # attempt to choose a reasonable default.
829 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
830 # must be a disk type appropriate to the project and zone in which
831 # the workers will run. If unknown or unspecified, the service
832 # will attempt to choose a reasonable default.
833 #
834 # For example, the standard persistent disk type is a resource name
835 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
836 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
837 # actual valid values are defined by the Google Compute Engine API,
838 # not by the Cloud Dataflow API; consult the Google Compute Engine
839 # documentation for more information about determining the set of
840 # available disk types for a particular project and zone.
841 #
842 # Google Compute Engine Disk types are local to a particular
843 # project in a particular zone, and so the resource name will
844 # typically look something like this:
845 #
846 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
847 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
848 },
849 ],
850 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
851 # only be set in the Fn API path. For non-cross-language pipelines this
852 # should have only one entry. Cross-language pipelines will have two or more
853 # entries.
854 { # Defines a SDK harness container for executing Dataflow pipelines.
855 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
856 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
857 # container instance with this image. If false (or unset) recommends using
858 # more than one core per SDK container instance with this image for
859 # efficiency. Note that Dataflow service may choose to override this property
860 # if needed.
861 },
862 ],
863 &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
864 # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
865 &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
866 &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
867 # using the standard Dataflow task runner. Users should ignore
868 # this field.
869 &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
870 &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
871 # taskrunner; e.g. &quot;wheel&quot;.
872 &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
873 &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
874 &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
875 # access the Cloud Dataflow API.
876 &quot;A String&quot;,
877 ],
878 &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
879 &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
880 # will not be uploaded.
881 #
882 # The supported resource type is:
883 #
884 # Google Cloud Storage:
885 # storage.googleapis.com/{bucket}/{object}
886 # bucket.storage.googleapis.com/{object}
887 &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
888 &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
889 &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
890 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
891 # temporary storage.
892 #
893 # The supported resource type is:
894 #
895 # Google Cloud Storage:
896 # storage.googleapis.com/{bucket}/{object}
897 # bucket.storage.googleapis.com/{object}
898 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
899 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
900 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
901 #
902 # When workers access Google Cloud APIs, they logically do so via
903 # relative URLs. If this field is specified, it supplies the base
904 # URL to use for resolving these relative URLs. The normative
905 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
906 # Locators&quot;.
907 #
908 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
909 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
910 # console.
911 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
912 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
913 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
914 #
915 # When workers access Google Cloud APIs, they logically do so via
916 # relative URLs. If this field is specified, it supplies the base
917 # URL to use for resolving these relative URLs. The normative
918 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
919 # Locators&quot;.
920 #
921 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
922 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
923 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
924 # &quot;dataflow/v1b3/projects&quot;.
925 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
926 # &quot;shuffle/v1beta1&quot;.
927 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
928 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
929 # storage.
930 #
931 # The supported resource type is:
932 #
933 # Google Cloud Storage:
934 #
935 # storage.googleapis.com/{bucket}/{object}
936 # bucket.storage.googleapis.com/{object}
937 },
938 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
939 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
940 # taskrunner; e.g. &quot;root&quot;.
941 },
942 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
943 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
944 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
945 },
946 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
947 &quot;a_key&quot;: &quot;A String&quot;,
948 },
949 },
950 ],
951 &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
952 # related tables are stored.
953 #
954 # The supported resource type is:
955 #
956 # Google BigQuery:
957 # bigquery.googleapis.com/{dataset}
958 &quot;internalExperiments&quot;: { # Experimental settings.
959 &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
960 },
961 &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
962 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
963 # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
964 # with worker_zone. If neither worker_region nor worker_zone is specified,
965 # default to the control plane&#x27;s region.
966 &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
967 # at rest, AKA a Customer Managed Encryption Key (CMEK).
968 #
969 # Format:
970 # projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
971 &quot;userAgent&quot;: { # A description of the process that generated the request.
972 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
973 },
974 &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
975 # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
976 # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
977 # with worker_region. If neither worker_region nor worker_zone is specified,
978 # a zone in the control plane&#x27;s region is chosen based on available capacity.
979 &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
980 # unspecified, the service will attempt to choose a reasonable
981 # default. This should be in the form of the API service name,
982 # e.g. &quot;compute.googleapis.com&quot;.
983 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
984 # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
985 # this resource prefix, where {JOBNAME} is the value of the
986 # job_name field. The resulting bucket and object prefix is used
987 # as the prefix of the resources used to store temporary data
988 # needed during the job execution. NOTE: This will override the
989 # value in taskrunner_settings.
990 # The supported resource type is:
991 #
992 # Google Cloud Storage:
993 #
994 # storage.googleapis.com/{bucket}/{object}
995 # bucket.storage.googleapis.com/{object}
Bu Sun Kim65020912020-05-20 12:08:20 -0700996 &quot;experiments&quot;: [ # The list of experiments to enable.
997 &quot;A String&quot;,
998 ],
999 &quot;version&quot;: { # A structure describing which components and their versions of the service
1000 # are required in order to run the job.
1001 &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
1002 },
1003 &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001004 },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the
        # ListJob response and Job SUMMARY view. This field is populated by the
        # Dataflow service to support filtering jobs by the metadata values
        # provided here. Populated for ListJobs and all GetJob views SUMMARY and
        # higher.
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a Pub/Sub source used in the Dataflow job.
        { # Metadata for a Pub/Sub connector used by the job.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        },
      ],
    },
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as its
        # executed form. This data is provided by the Dataflow service for ease of
        # visualizing the pipeline and interpreting Dataflow-provided metrics.
        # Preliminary field: the format of this data may change at any time.
        # Created by the Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and inputs/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and inputs/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (i.e. a Python module) that defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example, a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs of this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (i.e. a Python module) that defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example, a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          #   * Read a collection of data from some source, parsing the
          #     collection&#x27;s elements.
          #
          #   * Validate the elements.
          #
          #   * Apply a user-defined function to map each element to some value
          #     and extract an element-specific key value.
          #
          #   * Group elements with the same key into a single element with
          #     that key, transforming a multiply-keyed collection into a
          #     uniquely-keyed collection.
          #
          #   * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will
        # be executed that isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The Cloud Storage location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF-8 strings that comply with the following restrictions:
        #
        # * Keys must conform to the regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to the regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes
        #   in size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
  }</pre>
</div>

<div class="method">
    <code class="details" id="get">get(projectId, location, jobId, view=None, x__xgafv=None)</code>
  <pre>Gets the state of the specified Cloud Dataflow job.

To get the state of a job, we recommend using `projects.locations.jobs.get`
with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.get` is not recommended, as you can only get the state of
jobs that are running in `us-central1`.
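
For illustration, a minimal sketch of calling this method with the
google-api-python-client follows. The project, region, and job IDs are
placeholders, and the `job_get_url` helper encodes the assumed v1b3 REST
endpoint layout; neither is part of this generated reference:

```python
# Sketch of projects.locations.jobs.get via google-api-python-client.
# build() performs network discovery, so it is kept inside the function
# rather than executed at import time.

def get_job_state(project_id, location, job_id):
    """Fetch the job and return its currentState (a JOB_STATE_* value)."""
    from googleapiclient.discovery import build  # pip install google-api-python-client
    service = build('dataflow', 'v1b3')
    job = service.projects().locations().jobs().get(
        projectId=project_id, location=location, jobId=job_id).execute()
    return job.get('currentState')

def job_get_url(project_id, location, job_id):
    """REST URL the call above resolves to (assumed v1b3 endpoint layout)."""
    return ('https://dataflow.googleapis.com/v1b3/projects/%s'
            '/locations/%s/jobs/%s' % (project_id, location, job_id))
```

For example, `get_job_state('my-project', 'us-central1', job_id)` returns a
string such as `JOB_STATE_RUNNING` from the Job object described below.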

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  jobId: string, The job ID. (required)
  view: string, The level of information requested in response.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
    &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
        # If this field is set, the service will ensure its uniqueness.
        # The request to create a job will fail if the service has knowledge of a
        # previously submitted job with the same client&#x27;s ID and job name.
        # The caller may use this field to ensure idempotence of job
        # creation across retried attempts to create a job.
        # By default, the field is empty and, in that case, the service ignores it.
    &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
        #
        # This field is set by the Cloud Dataflow service when the Job is
        # created, and is immutable for the life of the job.
    &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
    &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
        # corresponding name prefixes of the new job.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
      &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
          # options are passed through the service and are used to recreate the
          # SDK pipeline options on the worker in a language-agnostic and
          # platform-independent way.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
      &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
          # specified in order for the job to have workers.
        { # Describes one particular pool of Cloud Dataflow workers to be
            # instantiated by the Cloud Dataflow service in order to perform the
            # computations required by a job. Note that a workflow job may use
            # multiple pools, in order to match the various computational
            # requirements of the various stages of the job.
          &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
              # select a default set of packages which are useful to worker
              # harnesses written in a particular language.
          &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
              # the service will use the network &quot;default&quot;.
          &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
              # will attempt to choose a reasonable default.
          &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
              # execute the job. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
              # service will choose a number of threads (according to the number of cores
              # on the selected machine type for batch, or 1 by convention for streaming).
          &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
          &quot;packages&quot;: [ # Packages to be installed on workers.
            { # The packages that must be installed in order for a worker to run the
                # steps of the Cloud Dataflow job that will be assigned to its worker
                # pool.
                #
                # This is the mechanism by which the Cloud Dataflow SDK causes code to
                # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                # might use this to install jars containing the user&#x27;s code and all of the
                # various dependencies (libraries, data files, etc.) required in order
                # for that code to run.
              &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}
                  #   bucket.storage.googleapis.com/
              &quot;name&quot;: &quot;A String&quot;, # The name of the package.
            },
          ],
          &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
              # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
              # `TEARDOWN_NEVER`.
              # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
              # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
              # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
              # down.
              #
              # If the workers are not torn down by the service, they will
              # continue to run and use Google Compute Engine VM resources in the
              # user&#x27;s project until they are explicitly terminated by the user.
              # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
              # policy except for small, manually supervised test jobs.
              #
              # If unknown or unspecified, the service will attempt to choose a reasonable
              # default.
          &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
              # Compute Engine API.
          &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
              # harness, residing in Google Container Registry.
              #
              # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
          &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
              # attempt to choose a reasonable default.
          &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
              # service will attempt to choose a reasonable default.
          &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
              # are supported.
          &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
            { # Describes the data disk used by a workflow job.
              &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                  # must be a disk type appropriate to the project and zone in which
                  # the workers will run. If unknown or unspecified, the service
                  # will attempt to choose a reasonable default.
                  #
                  # For example, the standard persistent disk type is a resource name
                  # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                  # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                  # actual valid values are defined by the Google Compute Engine API,
                  # not by the Cloud Dataflow API; consult the Google Compute Engine
                  # documentation for more information about determining the set of
                  # available disk types for a particular project and zone.
                  #
                  # Google Compute Engine disk types are local to a particular
                  # project in a particular zone, and so the resource name will
                  # typically look something like this:
                  #
                  #   compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
              &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
            },
          ],
          &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
              # only be set in the Fn API path. For non-cross-language pipelines this
              # should have only one entry. Cross-language pipelines will have two or more
              # entries.
            { # Defines an SDK harness container for executing Dataflow pipelines.
              &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
              &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
                  # container instance with this image. If false (or unset), recommends using
                  # more than one core per SDK container instance with this image for
                  # efficiency. Note that the Dataflow service may choose to override this
                  # property if needed.
            },
          ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
1496 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
1497 # temporary storage.
1498 #
1499 # The supported resource type is:
1500 #
1501 # Google Cloud Storage:
1502 # storage.googleapis.com/{bucket}/{object}
1503 # bucket.storage.googleapis.com/{object}
1504 &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
1505 &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
1506 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
1507 #
1508 # When workers access Google Cloud APIs, they logically do so via
1509 # relative URLs. If this field is specified, it supplies the base
1510 # URL to use for resolving these relative URLs. The normative
1511 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1512 # Locators&quot;.
1513 #
1514 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1515 &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
1516 # console.
1517 &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
1518 &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
1519 &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
1520 #
1521 # When workers access Google Cloud APIs, they logically do so via
1522 # relative URLs. If this field is specified, it supplies the base
1523 # URL to use for resolving these relative URLs. The normative
1524 # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
1525 # Locators&quot;.
1526 #
1527 # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
1528 &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
1529 &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
1530 # &quot;dataflow/v1b3/projects&quot;.
1531 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
1532 # &quot;shuffle/v1beta1&quot;.
1533 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
1534 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
1535 # storage.
1536 #
1537 # The supported resource type is:
1538 #
1539 # Google Cloud Storage:
1540 #
1541 # storage.googleapis.com/{bucket}/{object}
1542 # bucket.storage.googleapis.com/{object}
1543 },
1544 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
1545 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
1546 # taskrunner; e.g. &quot;root&quot;.
1547 },
1548 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
1549 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
1550 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
1551 },
1552 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
1553 &quot;a_key&quot;: &quot;A String&quot;,
1554 },
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04001555 },
      ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          #   bigquery.googleapis.com/{dataset}
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # defaults to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          #
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components of the service, and their versions,
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
        # ListJob response and Job SUMMARY view.
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        },
      ],
    },
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
        # A description of the user pipeline and stages through which it is executed.
        # Created by the Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
        # form. This data is provided by the Dataflow service for ease of visualizing
        # the pipeline and interpreting Dataflow provided metrics.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and input/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and input/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (e.g. a Python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example, a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (e.g. a Python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example, a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          #   * Read a collection of data from some source, parsing the
          #     collection&#x27;s elements.
          #
          #   * Validate the elements.
          #
          #   * Apply a user-defined function to map each element to some value
          #     and extract an element-specific key value.
          #
          #   * Group elements with the same key into a single element with
          #     that key, transforming a multiply-keyed collection into a
          #     uniquely-keyed collection.
          #
          #   * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
        # isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF-8 strings that comply with the following restrictions:
        #
        # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
        #   size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
    }</pre>
</div>
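The Job object documented above is what the discovery-based Python client returns as a plain dict. The sketch below is illustrative only: it assumes the google-api-python-client package with application default credentials available, the `projects.locations.jobs.get` call that this response body belongs to, and placeholder project/location/job IDs. The terminal-state set mirrors the `currentState` and `requestedState` notes above; other terminal JobState values may exist.

```python
# Hedged sketch, not a definitive client: IDs are placeholders and the
# terminal-state set is the subset named in the documentation above.

# States the docs above describe as terminal (no further state updates
# once reached). The full JobState enum may define additional ones.
TERMINAL_STATES = {"JOB_STATE_DONE", "JOB_STATE_CANCELLED", "JOB_STATE_UPDATED"}


def is_terminal(current_state):
    """Return True if a Job's currentState value is one of the terminal states."""
    return current_state in TERMINAL_STATES


def fetch_job(project_id, location, job_id):
    """Fetch a Job resource via a regional endpoint, as recommended above."""
    # Imported lazily so is_terminal() stays usable without the package.
    from googleapiclient.discovery import build

    # build() discovers the Dataflow v1b3 surface at runtime; credentials
    # default to the application default credentials of the environment.
    dataflow = build("dataflow", "v1b3")
    return (
        dataflow.projects()
        .locations()
        .jobs()
        .get(projectId=project_id, location=location, jobId=job_id)
        .execute()
    )
```

A simple poller would call fetch_job repeatedly and stop once is_terminal(job["currentState"]) returns True, since the documentation above guarantees no further state updates after a terminal state.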

<div class="method">
    <code class="details" id="getMetrics">getMetrics(projectId, location, jobId, startTime=None, x__xgafv=None)</code>
  <pre>Request the job status.

To request the status of a job, we recommend using
`projects.locations.jobs.getMetrics` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.getMetrics` is not recommended, as you can only request the
status of jobs that are running in `us-central1`.

Args:
  projectId: string, A project id. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains the job specified by job_id. (required)
  jobId: string, The job to get metrics for. (required)
  startTime: string, Return only metric data that has changed since this time.
Default is to return all information about all metrics for the job.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # JobMetrics contains a collection of metrics describing the detailed progress
        # of a Dataflow job. Metrics correspond to user-defined and system-defined
        # metrics in the job.
        #
        # This resource captures only the most recent values of each metric;
        # time-series data can be queried for them (under the same metric names)
        # from Cloud Monitoring.
    &quot;metricTime&quot;: &quot;A String&quot;, # Timestamp as of which metric values are current.
    &quot;metrics&quot;: [ # All metrics for this job.
      { # Describes the state of a metric.
        &quot;set&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Set&quot; aggregation kind. The only
            # possible value type is a list of Values whose type can be Long, Double,
            # or String, according to the metric&#x27;s type. All Values in the list must
            # be of the same type.
        &quot;gauge&quot;: &quot;&quot;, # A struct value describing properties of a Gauge.
            # Metrics of gauge type show the value of a metric across time, and are
            # aggregated based on the newest value.
        &quot;cumulative&quot;: True or False, # True if this metric is reported as the total cumulative aggregate
            # value accumulated since the worker started working on this WorkItem.
            # By default this is false, indicating that this metric is reported
            # as a delta that is not associated with any WorkItem.
        &quot;internal&quot;: &quot;&quot;, # Worker-computed aggregate value for internal use by the Dataflow
            # service.
        &quot;kind&quot;: &quot;A String&quot;, # Metric aggregation kind. The possible metric aggregation kinds are
            # &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;, &quot;Mean&quot;, &quot;Set&quot;, &quot;And&quot;, &quot;Or&quot;, and &quot;Distribution&quot;.
            # The specified aggregation kind is case-insensitive.
            #
            # If omitted, this is not an aggregated value but instead
            # a single metric sample value.
        &quot;scalar&quot;: &quot;&quot;, # Worker-computed aggregate value for aggregation kinds &quot;Sum&quot;, &quot;Max&quot;, &quot;Min&quot;,
            # &quot;And&quot;, and &quot;Or&quot;. The possible value types are Long, Double, and Boolean.
        &quot;meanCount&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
            # This holds the count of the aggregated values and is used in combination
            # with mean_sum above to obtain the actual mean aggregate value.
            # The only possible value type is Long.
        &quot;meanSum&quot;: &quot;&quot;, # Worker-computed aggregate value for the &quot;Mean&quot; aggregation kind.
            # This holds the sum of the aggregated values and is used in combination
            # with mean_count below to obtain the actual mean aggregate value.
            # The only possible value types are Long and Double.
        &quot;updateTime&quot;: &quot;A String&quot;, # Timestamp associated with the metric value. Optional when workers are
            # reporting work progress; it will be filled in responses from the
            # metrics API.
        &quot;name&quot;: { # Identifies a metric, by describing the source which generated the # Name of the metric.
            # metric.
          &quot;context&quot;: { # Zero or more labeled fields which identify the part of the job this
              # metric is associated with, such as the name of a step or collection.
              #
              # For example, built-in counters associated with steps will have
1987 # For example, built-in counters associated with steps will have
1988 # context[&#x27;step&#x27;] = &lt;step-name&gt;. Counters associated with PCollections
1989 # in the SDK will have context[&#x27;pcollection&#x27;] = &lt;pcollection-name&gt;.
1990 &quot;a_key&quot;: &quot;A String&quot;,
1991 },
1992 &quot;origin&quot;: &quot;A String&quot;, # Origin (namespace) of metric name. May be blank for user-define metrics;
1993 # will be &quot;dataflow&quot; for metrics defined by the Dataflow service or SDK.
1994 &quot;name&quot;: &quot;A String&quot;, # Worker-defined metric name.
1995 },
1996 &quot;distribution&quot;: &quot;&quot;, # A struct value describing properties of a distribution of numeric values.
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08001997 },
1998 ],
Jon Wayne Parrott692617a2017-01-06 09:58:29 -08001999 }</pre>
</div>

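<p>The JobMetrics object documented above is returned to Python callers as a plain dict. The helper below is a minimal, illustrative sketch (the function name, sample metric names, and sample values are invented for the example, not part of the generated client) showing how to pull out the scalar-valued metric updates from such a response:</p>

```python
def scalar_metrics(job_metrics):
    """Return a dict mapping metric name -> scalar value.

    `job_metrics` is shaped like the JobMetrics response above: a
    "metrics" list of MetricUpdate dicts, each carrying a "name"
    structure and, for the scalar aggregation kinds ("Sum", "Max",
    "Min", "And", "Or"), a "scalar" field.
    """
    result = {}
    for update in job_metrics.get("metrics", []):
        # Only scalar-kind updates carry the "scalar" field; gauge,
        # distribution, set, and mean updates use other fields.
        if "scalar" in update:
            result[update["name"]["name"]] = update["scalar"]
    return result


# A hand-written fragment shaped like the documented JobMetrics object.
sample = {
    "metricTime": "2020-05-20T12:08:20Z",
    "metrics": [
        {
            "name": {
                "origin": "dataflow",
                "name": "ElementCount",
                "context": {"step": "s1"},
            },
            "kind": "Sum",
            "scalar": 42,
        },
        {
            "name": {"origin": "dataflow", "name": "someGauge"},
            "gauge": "",  # gauge-kind update: no "scalar" field
        },
    ],
}

print(scalar_metrics(sample))
```

<p>In practice a response like this would come from the getMetrics method documented above, e.g. <code>service.projects().locations().jobs().getMetrics(projectId=..., location=..., jobId=...).execute()</code> on a built <code>dataflow</code> v1b3 client.</p>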
<div class="method">
    <code class="details" id="list">list(projectId, location, filter=None, pageToken=None, pageSize=None, view=None, x__xgafv=None)</code>
  <pre>List the jobs of a project.

To list the jobs of a project in a region, we recommend using
`projects.locations.jobs.list` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To
list all jobs across all regions, use `projects.jobs.aggregated`. Using
`projects.jobs.list` is not recommended, as you can only get the list of
jobs that are running in `us-central1`.

Args:
  projectId: string, The project which owns the jobs. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  filter: string, The kind of filter to use.
  pageToken: string, Set this to the &#x27;next_page_token&#x27; field of a previous response
to request additional results in a long list.
  pageSize: integer, If there are many jobs, limit response to at most this many.
The actual number of jobs returned will be the lesser of max_responses
and an unspecified server-defined limit.
  view: string, Level of information requested in response. Default is `JOB_VIEW_SUMMARY`.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Response to a request to list Cloud Dataflow jobs in a project. This might
        # be a partial response, depending on the page size in the ListJobsRequest.
        # However, if the project does not have any jobs, an instance of
        # ListJobsResponse is not returned and the request&#x27;s response
        # body is empty {}.
    &quot;nextPageToken&quot;: &quot;A String&quot;, # Set if there may be more results than fit in this response.
    &quot;failedLocation&quot;: [ # Zero or more messages describing the [regional endpoints]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # failed to respond.
      { # Indicates which [regional endpoint]
          # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) failed
          # to respond to a request for data.
        &quot;name&quot;: &quot;A String&quot;, # The name of the [regional endpoint]
            # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
            # failed to respond.
      },
    ],
    &quot;jobs&quot;: [ # A subset of the requested job information.
      { # Defines a job to be run by the Cloud Dataflow service.
        &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
            # If this field is set, the service will ensure its uniqueness.
            # The request to create a job will fail if the service has knowledge of a
            # previously submitted job with the same client&#x27;s ID and job name.
            # The caller may use this field to ensure idempotence of job
            # creation across retried attempts to create a job.
            # By default, the field is empty and, in that case, the service ignores it.
        &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
            #
            # This field is set by the Cloud Dataflow service when the Job is
            # created, and is immutable for the life of the job.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
        &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
            # corresponding name prefixes of the new job.
          &quot;a_key&quot;: &quot;A String&quot;,
        },
        &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
          &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
              # options are passed through the service and are used to recreate the
              # SDK pipeline options on the worker in a language agnostic and platform
              # independent way.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
          &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
              # specified in order for the job to have workers.
            { # Describes one particular pool of Cloud Dataflow workers to be
                # instantiated by the Cloud Dataflow service in order to perform the
                # computations required by a job. Note that a workflow job may use
                # multiple pools, in order to match the various computational
                # requirements of the various stages of the job.
              &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
                  # select a default set of packages which are useful to worker
                  # harnesses written in a particular language.
              &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
                  # the service will use the network &quot;default&quot;.
              &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
                  # will attempt to choose a reasonable default.
              &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
                  # execute the job. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
                  # service will choose a number of threads (according to the number of cores
                  # on the selected machine type for batch, or 1 by convention for streaming).
              &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
              &quot;packages&quot;: [ # Packages to be installed on workers.
                { # The packages that must be installed in order for a worker to run the
                    # steps of the Cloud Dataflow job that will be assigned to its worker
                    # pool.
                    #
                    # This is the mechanism by which the Cloud Dataflow SDK causes code to
                    # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                    # might use this to install jars containing the user&#x27;s code and all of the
                    # various dependencies (libraries, data files, etc.) required in order
                    # for that code to run.
                  &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}
                      #   bucket.storage.googleapis.com/
                  &quot;name&quot;: &quot;A String&quot;, # The name of the package.
                },
              ],
              &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
                  # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
                  # `TEARDOWN_NEVER`.
                  # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
                  # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
                  # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
                  # down.
                  #
                  # If the workers are not torn down by the service, they will
                  # continue to run and use Google Compute Engine VM resources in the
                  # user&#x27;s project until they are explicitly terminated by the user.
                  # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
                  # policy except for small, manually supervised test jobs.
                  #
                  # If unknown or unspecified, the service will attempt to choose a reasonable
                  # default.
              &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
                  # Compute Engine API.
              &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
                &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
              },
              &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
                  # harness, residing in Google Container Registry.
                  #
                  # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
              &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
                  # attempt to choose a reasonable default.
              &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
                  # service will attempt to choose a reasonable default.
              &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
                  # are supported.
              &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
                { # Describes the data disk used by a workflow job.
                  &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
                      # attempt to choose a reasonable default.
                  &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
                      # must be a disk type appropriate to the project and zone in which
                      # the workers will run. If unknown or unspecified, the service
                      # will attempt to choose a reasonable default.
                      #
                      # For example, the standard persistent disk type is a resource name
                      # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
                      # available, the resource name typically ends with &quot;pd-ssd&quot;. The
                      # actual valid values are defined by the Google Compute Engine API,
                      # not by the Cloud Dataflow API; consult the Google Compute Engine
                      # documentation for more information about determining the set of
                      # available disk types for a particular project and zone.
                      #
                      # Google Compute Engine Disk types are local to a particular
                      # project in a particular zone, and so the resource name will
                      # typically look something like this:
                      #
                      # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
                  &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
                },
              ],
              &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
                  # only be set in the Fn API path. For non-cross-language pipelines this
                  # should have only one entry. Cross-language pipelines will have two or more
                  # entries.
                { # Defines an SDK harness container for executing Dataflow pipelines.
                  &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
                  &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
                      # container instance with this image. If false (or unset) recommends using
                      # more than one core per SDK container instance with this image for
                      # efficiency. Note that Dataflow service may choose to override this property
                      # if needed.
                },
              ],
              &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
                  # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
              &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
              &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
                  # using the standard Dataflow task runner. Users should ignore
                  # this field.
                &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
                &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. &quot;wheel&quot;.
                &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
                &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
                &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                    # access the Cloud Dataflow API.
                  &quot;A String&quot;,
                ],
                &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
                &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                    # will not be uploaded.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
                &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
                &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
                &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
                &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                    # temporary storage.
                    #
                    # The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}/{object}
                    #   bucket.storage.googleapis.com/{object}
                &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
                &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
                &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                    #
                    # When workers access Google Cloud APIs, they logically do so via
                    # relative URLs. If this field is specified, it supplies the base
                    # URL to use for resolving these relative URLs. The normative
                    # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                    # Locators&quot;.
                    #
                    # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
                &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
                    # console.
                &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
                &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
                  &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                      #
                      # When workers access Google Cloud APIs, they logically do so via
                      # relative URLs. If this field is specified, it supplies the base
                      # URL to use for resolving these relative URLs. The normative
                      # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                      # Locators&quot;.
                      #
                      # If not specified, the default value is &quot;http://www.googleapis.com/&quot;
                  &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
                  &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                      # &quot;dataflow/v1b3/projects&quot;.
                  &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                      # &quot;shuffle/v1beta1&quot;.
                  &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
                  &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                      # storage.
                      #
                      # The supported resource type is:
                      #
                      # Google Cloud Storage:
                      #
                      #   storage.googleapis.com/{bucket}/{object}
                      #   bucket.storage.googleapis.com/{object}
                },
                &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
                &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                    # taskrunner; e.g. &quot;root&quot;.
              },
              &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
                &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
                &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
              },
              &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
                &quot;a_key&quot;: &quot;A String&quot;,
              },
            },
          ],
          &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
              # related tables are stored.
              #
              # The supported resource type is:
              #
              # Google BigQuery:
              #   bigquery.googleapis.com/{dataset}
          &quot;internalExperiments&quot;: { # Experimental settings.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
          },
          &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
              # with worker_zone. If neither worker_region nor worker_zone is specified,
              # default to the control plane&#x27;s region.
          &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
              # at rest, AKA a Customer Managed Encryption Key (CMEK).
              #
              # Format:
              #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
          &quot;userAgent&quot;: { # A description of the process that generated the request.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
              # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
              # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
              # with worker_region. If neither worker_region nor worker_zone is specified,
              # a zone in the control plane&#x27;s region is chosen based on available capacity.
          &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
              # unspecified, the service will attempt to choose a reasonable
              # default. This should be in the form of the API service name,
              # e.g. &quot;compute.googleapis.com&quot;.
          &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
              # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
              # this resource prefix, where {JOBNAME} is the value of the
              # job_name field. The resulting bucket and object prefix is used
              # as the prefix of the resources used to store temporary data
              # needed during the job execution. NOTE: This will override the
              # value in taskrunner_settings.
              # The supported resource type is:
              #
              # Google Cloud Storage:
              #
              #   storage.googleapis.com/{bucket}/{object}
              #   bucket.storage.googleapis.com/{object}
          &quot;experiments&quot;: [ # The list of experiments to enable.
            &quot;A String&quot;,
          ],
          &quot;version&quot;: { # A structure describing which components and their versions of the service
              # are required in order to run the job.
            &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
          },
          &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
        },
        &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
            # callers cannot mutate it.
          { # A message describing the state of a particular execution stage.
            &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
            &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
            &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
          },
        ],
        &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
            # by the metadata values provided here. Populated for ListJobs and all GetJob
            # views SUMMARY and higher.
            # ListJob response and Job SUMMARY view.
          &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
            { # Metadata for a BigTable connector used by the job.
              &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
            },
          ],
          &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
            { # Metadata for a Spanner connector used by the job.
              &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
              &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
            },
          ],
          &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
            { # Metadata for a Datastore connector used by the job.
              &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
              &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
            },
          ],
          &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
            &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
            &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
            &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
          },
          &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
            { # Metadata for a BigQuery connector used by the job.
              &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
              &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
              &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
              &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
            },
          ],
          &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
            { # Metadata for a File connector used by the job.
              &quot;filePattern&quot;: &quot;A String&quot;, # File Pattern used to access files by the connector.
            },
          ],
          &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
            { # Metadata for a PubSub connector used by the job.
              &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
              &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
            },
          ],
        },
        &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
            # snapshot.
        &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
        &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
2390 &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
2391 # A description of the user pipeline and stages through which it is executed.
2392 # Created by Cloud Dataflow service. Only retrieved with
2393 # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
2394 # form. This data is provided by the Dataflow service for ease of visualizing
2395 # the pipeline and interpreting Dataflow provided metrics.
&quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
{ # Description of the composing transforms, names/ids, and input/outputs of a
# stage of execution. Some composing transforms and sources may have been
# generated by the Dataflow service during execution planning.
&quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
&quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
{ # Description of a transform executed as part of an execution stage.
&quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
# most closely associated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
},
],
&quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
{ # Description of an interstitial value between transforms in an execution
# stage.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
},
],
&quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
&quot;outputSource&quot;: [ # Output sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
},
],
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
&quot;inputSource&quot;: [ # Input sources for this stage.
{ # Description of an input or output of an execution stage.
&quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
# source is most closely associated.
&quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
&quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
&quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
},
],
},
],
&quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
{ # Description of the type, names/ids, and input/outputs for a transform.
&quot;kind&quot;: &quot;A String&quot;, # Type of transform.
&quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
&quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
&quot;displayData&quot;: [ # Transform-specific display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (i.e. python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
},
],
&quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
&quot;A String&quot;,
],
},
],
&quot;displayData&quot;: [ # Pipeline level display data.
{ # Data provided with a pipeline or transform to provide descriptive info.
&quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
&quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
&quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
&quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
&quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
&quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
&quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
# language namespace (i.e. python module) which defines the display data.
# This allows a dax monitoring system to specially handle the data
# and perform custom rendering.
&quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
&quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
# This is intended to be used as a label for the display data
# when viewed in a dax monitoring system.
&quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
# For example a java_class_name_value of com.mypackage.MyDoFn
# will be stored with MyDoFn as the short_str_value and
# com.mypackage.MyDoFn as the java_class_name value.
# short_str_value can be displayed and java_class_name_value
# will be displayed as a tooltip.
&quot;url&quot;: &quot;A String&quot;, # An optional full URL.
&quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
},
],
},
&quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
# of the job it replaced.
#
# When sending a `CreateJobRequest`, you can update a job by specifying it
# here. The job named here is stopped, and its intermediate state is
# transferred to this job.
&quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
# for temporary storage. These temporary files will be
# removed on job completion.
# No duplicates are allowed.
# No file patterns are supported.
#
# The supported files are:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;A String&quot;,
],
&quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
#
# Only one Job with a given name may exist in a project at any
# given time. If a caller attempts to create a Job with the same
# name as an already-existing Job, the attempt returns the
# existing Job.
#
# The name must match the regular expression
# `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
&quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
#
# The top-level steps that constitute the entire job.
{ # Defines a particular step within a Cloud Dataflow job.
#
# A job consists of multiple steps, each of which performs some
# specific operation as part of the overall job. Data is typically
# passed from one step to another as part of the job.
#
# Here&#x27;s an example of a sequence of steps which together implement a
# Map-Reduce job:
#
# * Read a collection of data from some source, parsing the
# collection&#x27;s elements.
#
# * Validate the elements.
#
# * Apply a user-defined function to map each element to some value
# and extract an element-specific key value.
#
# * Group elements with the same key into a single element with
# that key, transforming a multiply-keyed collection into a
# uniquely-keyed collection.
#
# * Write the elements out to some data sink.
#
# Note that the Cloud Dataflow service may be used to run many different
# types of jobs, not just Map-Reduce.
&quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
# step with respect to all other steps in the Cloud Dataflow job.
&quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
&quot;properties&quot;: { # Named properties associated with the step. Each kind of
# predefined step has its own required set of properties.
# Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
},
],
&quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
# `JOB_STATE_UPDATED`), this field contains the ID of that job.
&quot;executionInfo&quot;: { # Deprecated. Additional information about how a Cloud Dataflow job will be
# executed that isn&#x27;t contained in the submitted job.
&quot;stages&quot;: { # A mapping from each stage to the information about that stage.
&quot;a_key&quot;: { # Contains information about how a particular
# google.dataflow.v1beta3.Step will be executed.
&quot;stepName&quot;: [ # The steps associated with the execution stage.
# Note that stages may have several steps, and that a given step
# might be run by more than one stage.
&quot;A String&quot;,
],
},
},
},
&quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
#
# Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
# specified.
#
# A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
# terminal state. After a job has reached a terminal state, no
# further state updates may be made.
#
# This field may be mutated by the Cloud Dataflow service;
# callers cannot mutate it.
&quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
# (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
# contains this job.
&quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
# Flexible resource scheduling jobs are started with some delay after job
# creation, so start_time is unset before start and is updated when the
# job is started by the Cloud Dataflow service. For other jobs, start_time
# always equals create_time and is immutable and set by the Cloud Dataflow
# service.
&quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
&quot;labels&quot;: { # User-defined labels for this job.
#
# The labels map can contain no more than 64 entries. Entries of the labels
# map are UTF8 strings that comply with the following restrictions:
#
# * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
# * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
# * Both keys and values are additionally constrained to be &lt;= 128 bytes in
# size.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
# Cloud Dataflow service.
&quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
#
# `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
# `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
# also be used to directly set a job&#x27;s requested state to
# `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
# job if it has not already reached a terminal state.
},
],
}</pre>
</div>

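The Job objects in the response above are plain Python dictionaries, so the documented fields (`id`, `name`, `currentState`) can be read with ordinary dict access. A minimal sketch, assuming a response shaped like the schema above; the sample job values below are hypothetical:

```python
def summarize_jobs(response):
    """Return (id, name, currentState) tuples from a ListJobsResponse-shaped dict."""
    return [
        (job.get("id"), job.get("name"), job.get("currentState"))
        for job in response.get("jobs", [])
    ]

# Hypothetical response, matching the field names documented above.
sample = {
    "jobs": [
        {"id": "2020-01-01_00_00_00-1", "name": "wordcount",
         "currentState": "JOB_STATE_RUNNING"},
        {"id": "2020-01-01_00_00_00-2", "name": "etl",
         "currentState": "JOB_STATE_DONE"},
    ]
}
print(summarize_jobs(sample))
```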
<div class="method">
    <code class="details" id="list_next">list_next(previous_request, previous_response)</code>
  <pre>Retrieves the next page of results.

Args:
  previous_request: The request for the previous page. (required)
  previous_response: The response from the request for the previous page. (required)

Returns:
  A request object that you can call &#x27;execute()&#x27; on to request the next
  page. Returns None if there are no more items in the collection.
  </pre>
</div>

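The contract above can be driven in a loop: execute the current request, consume the page, and ask for the next request until `list_next` returns None. A hedged sketch with stand-in functions (`execute` and `fake_list_next` below are hypothetical stubs simulating the client, not real API calls):

```python
def iterate_all(first_request, execute, list_next):
    """Yield every item across pages, following list_next until it returns None."""
    request = first_request
    while request is not None:
        response = execute(request)
        for job in response.get("jobs", []):
            yield job
        # list_next returns the request for the next page, or None at the end.
        request = list_next(request, response)

# Simulate two pages of results.
pages = {
    "first": {"jobs": [1, 2], "next": "t"},
    "t": {"jobs": [3]},
}

def execute(request):
    return pages[request]

def fake_list_next(previous_request, previous_response):
    return previous_response.get("next")  # None when no further page exists

print(list(iterate_all("first", execute, fake_list_next)))
```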
<div class="method">
    <code class="details" id="snapshot">snapshot(projectId, location, jobId, body=None, x__xgafv=None)</code>
  <pre>Snapshot the state of a streaming job.

Args:
  projectId: string, The project which owns the job to be snapshotted. (required)
  location: string, The location that contains this job. (required)
  jobId: string, The job to be snapshotted. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request to create a snapshot of a job.
&quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
&quot;snapshotSources&quot;: True or False, # If true, perform snapshots for sources which support this.
&quot;ttl&quot;: &quot;A String&quot;, # TTL for the snapshot.
&quot;location&quot;: &quot;A String&quot;, # The location that contains this job.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # Represents a snapshot of a job.
&quot;state&quot;: &quot;A String&quot;, # State of the snapshot.
&quot;sourceJobId&quot;: &quot;A String&quot;, # The job this snapshot was created from.
&quot;projectId&quot;: &quot;A String&quot;, # The project this snapshot belongs to.
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this snapshot.
&quot;ttl&quot;: &quot;A String&quot;, # The time after which this snapshot will be automatically deleted.
&quot;description&quot;: &quot;A String&quot;, # User specified description of the snapshot. May be empty.
&quot;diskSizeBytes&quot;: &quot;A String&quot;, # The disk byte size of the snapshot. Only available for snapshots in READY
# state.
&quot;pubsubMetadata&quot;: [ # PubSub snapshot metadata.
{ # Represents a Pubsub snapshot.
&quot;expireTime&quot;: &quot;A String&quot;, # The expire time of the Pubsub snapshot.
&quot;snapshotName&quot;: &quot;A String&quot;, # The name of the Pubsub snapshot.
&quot;topicName&quot;: &quot;A String&quot;, # The name of the Pubsub topic.
},
],
&quot;creationTime&quot;: &quot;A String&quot;, # The time this snapshot was created.
}</pre>
</div>

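The request body above is an ordinary dictionary, so a small helper can assemble it before passing it to `snapshot(...)`. A sketch under the assumption that `ttl` takes a duration string such as `"3600s"`; the project, location, and job values in the comment are placeholders:

```python
def snapshot_body(description, ttl, location, snapshot_sources=True):
    """Build a SnapshotJobRequest-shaped dict matching the schema above."""
    return {
        "description": description,
        "ttl": ttl,
        "location": location,
        "snapshotSources": snapshot_sources,
    }

body = snapshot_body("nightly checkpoint", "3600s", "us-central1")
# With an authenticated client, this body would be passed as (placeholders):
#   dataflow.projects().locations().jobs().snapshot(
#       projectId="my-project", location="us-central1",
#       jobId="my-job-id", body=body).execute()
print(body["ttl"])
```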
<div class="method">
    <code class="details" id="update">update(projectId, location, jobId, body=None, x__xgafv=None)</code>
  <pre>Updates the state of an existing Cloud Dataflow job.

To update the state of an existing job, we recommend using
`projects.locations.jobs.update` with a [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using
`projects.jobs.update` is not recommended, as you can only update the state
of jobs that are running in `us-central1`.

Args:
  projectId: string, The ID of the Cloud Platform project that the job belongs to. (required)
  location: string, The [regional endpoint]
(https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
contains this job. (required)
  jobId: string, The job ID. (required)
  body: object, The request body.
    The object takes the form of:

{ # Defines a job to be run by the Cloud Dataflow service.
&quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
# If this field is set, the service will ensure its uniqueness.
# The request to create a job will fail if the service has knowledge of a
# previously submitted job with the same client&#x27;s ID and job name.
# The caller may use this field to ensure idempotence of job
# creation across retried attempts to create a job.
# By default, the field is empty and, in that case, the service ignores it.
&quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
#
# This field is set by the Cloud Dataflow service when the Job is
# created, and is immutable for the life of the job.
&quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
&quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
# corresponding name prefixes of the new job.
&quot;a_key&quot;: &quot;A String&quot;,
},
&quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
&quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
# options are passed through the service and are used to recreate the
# SDK pipeline options on the worker in a language agnostic and platform
# independent way.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object.
},
&quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
&quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
# specified in order for the job to have workers.
{ # Describes one particular pool of Cloud Dataflow workers to be
# instantiated by the Cloud Dataflow service in order to perform the
# computations required by a job. Note that a workflow job may use
# multiple pools, in order to match the various computational
# requirements of the various stages of the job.
&quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
# select a default set of packages which are useful to worker
# harnesses written in a particular language.
&quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
# the service will use the network &quot;default&quot;.
&quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
# will attempt to choose a reasonable default.
&quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
# execute the job. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
# service will choose a number of threads (according to the number of cores
# on the selected machine type for batch, or 1 by convention for streaming).
&quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
&quot;packages&quot;: [ # Packages to be installed on workers.
{ # The packages that must be installed in order for a worker to run the
# steps of the Cloud Dataflow job that will be assigned to its worker
# pool.
#
# This is the mechanism by which the Cloud Dataflow SDK causes code to
# be loaded onto the workers. For example, the Cloud Dataflow Java SDK
# might use this to install jars containing the user&#x27;s code and all of the
# various dependencies (libraries, data files, etc.) required in order
# for that code to run.
&quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
#
# Google Cloud Storage:
#
# storage.googleapis.com/{bucket}
# bucket.storage.googleapis.com/
&quot;name&quot;: &quot;A String&quot;, # The name of the package.
},
],
&quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to turn down the worker pool.
# Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
# `TEARDOWN_NEVER`.
# `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
# the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
# if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
# down.
#
# If the workers are not torn down by the service, they will
# continue to run and use Google Compute Engine VM resources in the
# user&#x27;s project until they are explicitly terminated by the user.
# Because of this, Google recommends using the `TEARDOWN_ALWAYS`
# policy except for small, manually supervised test jobs.
#
# If unknown or unspecified, the service will attempt to choose a reasonable
# default.
&quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
# Compute Engine API.
&quot;poolArgs&quot;: { # Extra arguments for this worker pool.
&quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
},
&quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
# harness, residing in Google Container Registry.
#
# Deprecated for the Fn API path. Use sdk_harness_container_images instead.
&quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
# attempt to choose a reasonable default.
&quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
# service will attempt to choose a reasonable default.
&quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
# are supported.
&quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
{ # Describes the data disk used by a workflow job.
&quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
# attempt to choose a reasonable default.
&quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
# must be a disk type appropriate to the project and zone in which
# the workers will run. If unknown or unspecified, the service
# will attempt to choose a reasonable default.
#
# For example, the standard persistent disk type is a resource name
# typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
# available, the resource name typically ends with &quot;pd-ssd&quot;. The
# actual valid values are defined by the Google Compute Engine API,
# not by the Cloud Dataflow API; consult the Google Compute Engine
# documentation for more information about determining the set of
# available disk types for a particular project and zone.
#
# Google Compute Engine Disk types are local to a particular
# project in a particular zone, and so the resource name will
# typically look something like this:
#
# compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
&quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
},
],
&quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
# only be set in the Fn API path. For non-cross-language pipelines this
# should have only one entry. Cross-language pipelines will have two or more
# entries.
{ # Defines an SDK harness container for executing Dataflow pipelines.
&quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
&quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends that the Dataflow service use only one core per SDK
# container instance with this image. If false (or unset), recommends using
# more than one core per SDK container instance with this image for
# efficiency. Note that the Dataflow service may choose to override this
# property if needed.
},
],
&quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
# the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
&quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
&quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
# using the standard Dataflow task runner. Users should ignore
# this field.
&quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
&quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
# taskrunner; e.g. &quot;wheel&quot;.
&quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
&quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
&quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
# access the Cloud Dataflow API.
&quot;A String&quot;,
],
&quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of endpoint, e.g. &quot;v1b3&quot;
&quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
# will not be uploaded.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
&quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
&quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
&quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
# temporary storage.
#
# The supported resource type is:
#
# Google Cloud Storage:
# storage.googleapis.com/{bucket}/{object}
# bucket.storage.googleapis.com/{object}
&quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
&quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to Google Compute Engine VM serial
# console.
&quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
&quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
&quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
#
# When workers access Google Cloud APIs, they logically do so via
# relative URLs. If this field is specified, it supplies the base
# URL to use for resolving these relative URLs. The normative
# algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
# Locators&quot;.
#
# If not specified, the default value is &quot;http://www.googleapis.com/&quot;
&quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
&quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
2910 # &quot;dataflow/v1b3/projects&quot;.
2911 &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
2912 # &quot;shuffle/v1beta1&quot;.
2913 &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
2914 &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
2915 # storage.
2916 #
2917 # The supported resource type is:
2918 #
2919 # Google Cloud Storage:
2920 #
2921 # storage.googleapis.com/{bucket}/{object}
2922 # bucket.storage.googleapis.com/{object}
2923 },
2924 &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
2925 &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
2926 # taskrunner; e.g. &quot;root&quot;.
2927 },
2928 &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
2929 &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
2930 &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
2931 },
2932 &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
2933 &quot;a_key&quot;: &quot;A String&quot;,
2934 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002935 },
2936 ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          #   bigquery.googleapis.com/{dataset}
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # defaults to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, also known as a Customer-Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          #
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components of the service, and which versions
          # of those components, are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
        # ListJob response and Job SUMMARY view.
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        },
      ],
    },
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;pipelineDescription&quot;: { # A descriptive representation of the submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
        # A description of the user pipeline and stages through which it is executed.
        # Created by the Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
        # form. This data is provided by the Dataflow service for ease of visualizing
        # the pipeline and interpreting Dataflow provided metrics.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and input/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and input/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (e.g. a Python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example, a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs of this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (e.g. a Python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example, a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          # * Read a collection of data from some source, parsing the
          #   collection&#x27;s elements.
          #
          # * Validate the elements.
          #
          # * Apply a user-defined function to map each element to some value
          #   and extract an element-specific key value.
          #
          # * Group elements with the same key into a single element with
          #   that key, transforming a multiply-keyed collection into a
          #   uniquely-keyed collection.
          #
          # * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
        # isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The Cloud Storage location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF-8 strings that comply with the following restrictions:
        #
        # * Keys must conform to the regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to the regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
        #   size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format
3297
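Before submitting a Job body like the one above, a few of the documented constraints can be checked client-side: the name must match `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`, the labels map is capped at 64 entries, and label keys and values are limited to 128 bytes. The sketch below is illustrative only and is not part of the generated client; the project ID, bucket, and job name it uses are hypothetical placeholders.

```python
import re

# Regex taken from the documented constraint on the Job "name" field.
_JOB_NAME_RE = re.compile(r"^[a-z]([-a-z0-9]{0,38}[a-z0-9])?$")

def validate_job_body(job):
    """Return a list of constraint violations for a Job request body (empty if OK)."""
    errors = []
    name = job.get("name")
    if name is not None and not _JOB_NAME_RE.match(name):
        errors.append("invalid job name")
    labels = job.get("labels", {})
    if len(labels) > 64:
        errors.append("too many labels (max 64)")
    for key, value in labels.items():
        # Keys and values are each constrained to <= 128 bytes.
        if len(key.encode("utf-8")) > 128 or len(value.encode("utf-8")) > 128:
            errors.append("label key/value over 128 bytes: %r" % key)
    return errors

# Hypothetical request body; field names follow the schema documented above.
job_body = {
    "name": "example-wordcount-job",
    "projectId": "my-project",  # placeholder project ID
    "type": "JOB_TYPE_BATCH",
    "labels": {"team": "data-eng"},
    "environment": {
        "tempStoragePrefix": "storage.googleapis.com/my-bucket/temp",
    },
}

assert validate_job_body(job_body) == []
```

A body that passes these checks can still be rejected by the service; this only catches the constraints stated in the field descriptions above.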
Returns:
  An object of the form:

    { # Defines a job to be run by the Cloud Dataflow service.
      &quot;clientRequestId&quot;: &quot;A String&quot;, # The client&#x27;s unique identifier of the job, re-used across retried attempts.
          # If this field is set, the service will ensure its uniqueness.
          # The request to create a job will fail if the service has knowledge of a
          # previously submitted job with the same client&#x27;s ID and job name.
          # The caller may use this field to ensure idempotence of job
          # creation across retried attempts to create a job.
          # By default, the field is empty and, in that case, the service ignores it.
      &quot;id&quot;: &quot;A String&quot;, # The unique ID of this job.
          #
          # This field is set by the Cloud Dataflow service when the Job is
          # created, and is immutable for the life of the job.
      &quot;currentStateTime&quot;: &quot;A String&quot;, # The timestamp associated with the current state.
      &quot;transformNameMapping&quot;: { # The map of transform name prefixes of the job to be replaced to the
          # corresponding name prefixes of the new job.
        &quot;a_key&quot;: &quot;A String&quot;,
      },
      &quot;environment&quot;: { # Describes the environment in which a Dataflow Job runs. # The environment for the job.
        &quot;sdkPipelineOptions&quot;: { # The Cloud Dataflow SDK pipeline options specified by the user. These
            # options are passed through the service and are used to recreate the
            # SDK pipeline options on the worker in a language agnostic and platform
            # independent way.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
        &quot;flexResourceSchedulingGoal&quot;: &quot;A String&quot;, # Which Flexible Resource Scheduling mode to run in.
        &quot;workerPools&quot;: [ # The worker pools. At least one &quot;harness&quot; worker pool must be
            # specified in order for the job to have workers.
          { # Describes one particular pool of Cloud Dataflow workers to be
              # instantiated by the Cloud Dataflow service in order to perform the
              # computations required by a job. Note that a workflow job may use
              # multiple pools, in order to match the various computational
              # requirements of the various stages of the job.
            &quot;defaultPackageSet&quot;: &quot;A String&quot;, # The default package set to install. This allows the service to
                # select a default set of packages which are useful to worker
                # harnesses written in a particular language.
            &quot;network&quot;: &quot;A String&quot;, # Network to which VMs will be assigned. If empty or unspecified,
                # the service will use the network &quot;default&quot;.
            &quot;zone&quot;: &quot;A String&quot;, # Zone to run the worker pools in. If empty or unspecified, the service
                # will attempt to choose a reasonable default.
            &quot;numWorkers&quot;: 42, # Number of Google Compute Engine workers in this pool needed to
                # execute the job. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;numThreadsPerWorker&quot;: 42, # The number of threads per worker harness. If empty or unspecified, the
                # service will choose a number of threads (according to the number of cores
                # on the selected machine type for batch, or 1 by convention for streaming).
            &quot;diskSourceImage&quot;: &quot;A String&quot;, # Fully qualified source image for disks.
            &quot;packages&quot;: [ # Packages to be installed on workers.
              { # The packages that must be installed in order for a worker to run the
                  # steps of the Cloud Dataflow job that will be assigned to its worker
                  # pool.
                  #
                  # This is the mechanism by which the Cloud Dataflow SDK causes code to
                  # be loaded onto the workers. For example, the Cloud Dataflow Java SDK
                  # might use this to install jars containing the user&#x27;s code and all of the
                  # various dependencies (libraries, data files, etc.) required in order
                  # for that code to run.
                &quot;location&quot;: &quot;A String&quot;, # The resource to read the package from. The supported resource type is:
                    #
                    # Google Cloud Storage:
                    #   storage.googleapis.com/{bucket}
                    #   bucket.storage.googleapis.com/
                &quot;name&quot;: &quot;A String&quot;, # The name of the package.
              },
            ],
            &quot;teardownPolicy&quot;: &quot;A String&quot;, # Sets the policy for determining when to tear down the worker pool.
                # Allowed values are: `TEARDOWN_ALWAYS`, `TEARDOWN_ON_SUCCESS`, and
                # `TEARDOWN_NEVER`.
                # `TEARDOWN_ALWAYS` means workers are always torn down regardless of whether
                # the job succeeds. `TEARDOWN_ON_SUCCESS` means workers are torn down
                # if the job succeeds. `TEARDOWN_NEVER` means the workers are never torn
                # down.
                #
                # If the workers are not torn down by the service, they will
                # continue to run and use Google Compute Engine VM resources in the
                # user&#x27;s project until they are explicitly terminated by the user.
                # Because of this, Google recommends using the `TEARDOWN_ALWAYS`
                # policy except for small, manually supervised test jobs.
                #
                # If unknown or unspecified, the service will attempt to choose a reasonable
                # default.
            &quot;onHostMaintenance&quot;: &quot;A String&quot;, # The action to take on host maintenance, as defined by the Google
                # Compute Engine API.
            &quot;poolArgs&quot;: { # Extra arguments for this worker pool.
              &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
            },
            &quot;diskSizeGb&quot;: 42, # Size of root disk for VMs, in GB. If zero or unspecified, the service will
                # attempt to choose a reasonable default.
            &quot;workerHarnessContainerImage&quot;: &quot;A String&quot;, # Required. Docker container image that executes the Cloud Dataflow worker
                # harness, residing in Google Container Registry.
                #
                # Deprecated for the Fn API path. Use sdk_harness_container_images instead.
            &quot;diskType&quot;: &quot;A String&quot;, # Type of root disk for VMs. If empty or unspecified, the service will
Sai Cheemalapatic30d2b52017-03-13 12:12:03 -04003394 # attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -07003395 &quot;machineType&quot;: &quot;A String&quot;, # Machine type (e.g. &quot;n1-standard-1&quot;). If empty or unspecified, the
3396 # service will attempt to choose a reasonable default.
3397 &quot;kind&quot;: &quot;A String&quot;, # The kind of the worker pool; currently only `harness` and `shuffle`
3398 # are supported.
3399 &quot;dataDisks&quot;: [ # Data disks that are used by a VM in this workflow.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003400 { # Describes the data disk used by a workflow job.
Bu Sun Kim65020912020-05-20 12:08:20 -07003401 &quot;sizeGb&quot;: 42, # Size of disk in GB. If zero or unspecified, the service will
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003402 # attempt to choose a reasonable default.
Bu Sun Kim65020912020-05-20 12:08:20 -07003403 &quot;diskType&quot;: &quot;A String&quot;, # Disk storage type, as defined by Google Compute Engine. This
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003404 # must be a disk type appropriate to the project and zone in which
3405 # the workers will run. If unknown or unspecified, the service
3406 # will attempt to choose a reasonable default.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04003407 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003408 # For example, the standard persistent disk type is a resource name
Bu Sun Kim65020912020-05-20 12:08:20 -07003409 # typically ending in &quot;pd-standard&quot;. If SSD persistent disks are
3410 # available, the resource name typically ends with &quot;pd-ssd&quot;. The
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003411 # actual valid values are defined the Google Compute Engine API,
3412 # not by the Cloud Dataflow API; consult the Google Compute Engine
3413 # documentation for more information about determining the set of
3414 # available disk types for a particular project and zone.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04003415 #
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07003416 # Google Compute Engine Disk types are local to a particular
3417 # project in a particular zone, and so the resource name will
3418 # typically look something like this:
3419 #
3420 # compute.googleapis.com/projects/project-id/zones/zone/diskTypes/pd-standard
Bu Sun Kim65020912020-05-20 12:08:20 -07003421 &quot;mountPoint&quot;: &quot;A String&quot;, # Directory in a VM where disk is mounted.
Sai Cheemalapati4ba8c232017-06-06 18:46:08 -04003422 },
3423 ],
Bu Sun Kim65020912020-05-20 12:08:20 -07003424 &quot;sdkHarnessContainerImages&quot;: [ # Set of SDK harness containers needed to execute this pipeline. This will
Dan O'Mearadd494642020-05-01 07:42:23 -07003425 # only be set in the Fn API path. For non-cross-language pipelines this
3426 # should have only one entry. Cross-language pipelines will have two or more
3427 # entries.
3428 { # Defines a SDK harness container for executing Dataflow pipelines.
Bu Sun Kim65020912020-05-20 12:08:20 -07003429 &quot;containerImage&quot;: &quot;A String&quot;, # A docker container image that resides in Google Container Registry.
3430 &quot;useSingleCorePerContainer&quot;: True or False, # If true, recommends the Dataflow service to use only one core per SDK
Dan O'Mearadd494642020-05-01 07:42:23 -07003431 # container instance with this image. If false (or unset) recommends using
3432 # more than one core per SDK container instance with this image for
3433 # efficiency. Note that Dataflow service may choose to override this property
3434 # if needed.
3435 },
3436 ],
          &quot;subnetwork&quot;: &quot;A String&quot;, # Subnetwork to which VMs will be assigned, if desired. Expected to be of
              # the form &quot;regions/REGION/subnetworks/SUBNETWORK&quot;.
          &quot;ipConfiguration&quot;: &quot;A String&quot;, # Configuration for VM IPs.
          &quot;taskrunnerSettings&quot;: { # Taskrunner configuration settings. # Settings passed through to Google Compute Engine workers when
              # using the standard Dataflow task runner. Users should ignore
              # this field.
            &quot;alsologtostderr&quot;: True or False, # Whether to also send taskrunner log info to stderr.
            &quot;taskGroup&quot;: &quot;A String&quot;, # The UNIX group ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;wheel&quot;.
            &quot;harnessCommand&quot;: &quot;A String&quot;, # The command to launch the worker harness.
            &quot;logDir&quot;: &quot;A String&quot;, # The directory on the VM to store logs.
            &quot;oauthScopes&quot;: [ # The OAuth2 scopes to be requested by the taskrunner in order to
                # access the Cloud Dataflow API.
              &quot;A String&quot;,
            ],
            &quot;dataflowApiVersion&quot;: &quot;A String&quot;, # The API version of the endpoint, e.g. &quot;v1b3&quot;.
            &quot;logUploadLocation&quot;: &quot;A String&quot;, # Indicates where to put logs. If this is not specified, the logs
                # will not be uploaded.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;streamingWorkerMainClass&quot;: &quot;A String&quot;, # The streaming worker main class name.
            &quot;workflowFileName&quot;: &quot;A String&quot;, # The file to store the workflow in.
            &quot;baseTaskDir&quot;: &quot;A String&quot;, # The location on the worker for task-specific subdirectories.
            &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the taskrunner should use for
                # temporary storage.
                #
                # The supported resource type is:
                #
                # Google Cloud Storage:
                #   storage.googleapis.com/{bucket}/{object}
                #   bucket.storage.googleapis.com/{object}
            &quot;commandlinesFileName&quot;: &quot;A String&quot;, # The file to store preprocessing commands in.
            &quot;languageHint&quot;: &quot;A String&quot;, # The suggested backend language.
            &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for the taskrunner to use when accessing Google Cloud APIs.
                #
                # When workers access Google Cloud APIs, they logically do so via
                # relative URLs. If this field is specified, it supplies the base
                # URL to use for resolving these relative URLs. The normative
                # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                # Locators&quot;.
                #
                # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
            &quot;logToSerialconsole&quot;: True or False, # Whether to send taskrunner log info to the Google Compute Engine VM
                # serial console.
            &quot;continueOnException&quot;: True or False, # Whether to continue taskrunner if an exception is hit.
            &quot;parallelWorkerSettings&quot;: { # Provides data to pass through to the worker harness. # The settings to pass to the parallel worker harness.
              &quot;baseUrl&quot;: &quot;A String&quot;, # The base URL for accessing Google Cloud APIs.
                  #
                  # When workers access Google Cloud APIs, they logically do so via
                  # relative URLs. If this field is specified, it supplies the base
                  # URL to use for resolving these relative URLs. The normative
                  # algorithm used is defined by RFC 1808, &quot;Relative Uniform Resource
                  # Locators&quot;.
                  #
                  # If not specified, the default value is &quot;http://www.googleapis.com/&quot;.
              &quot;reportingEnabled&quot;: True or False, # Whether to send work progress updates to the service.
              &quot;servicePath&quot;: &quot;A String&quot;, # The Cloud Dataflow service path relative to the root URL, for example,
                  # &quot;dataflow/v1b3/projects&quot;.
              &quot;shuffleServicePath&quot;: &quot;A String&quot;, # The Shuffle service path relative to the root URL, for example,
                  # &quot;shuffle/v1beta1&quot;.
              &quot;workerId&quot;: &quot;A String&quot;, # The ID of the worker running this pipeline.
              &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
                  # storage.
                  #
                  # The supported resource type is:
                  #
                  # Google Cloud Storage:
                  #
                  #   storage.googleapis.com/{bucket}/{object}
                  #   bucket.storage.googleapis.com/{object}
            },
            &quot;vmId&quot;: &quot;A String&quot;, # The ID string of the VM.
            &quot;taskUser&quot;: &quot;A String&quot;, # The UNIX user ID on the worker VM to use for tasks launched by
                # taskrunner; e.g. &quot;root&quot;.
          },
          &quot;autoscalingSettings&quot;: { # Settings for WorkerPool autoscaling. # Settings for autoscaling of this WorkerPool.
            &quot;maxNumWorkers&quot;: 42, # The maximum number of workers to cap scaling at.
            &quot;algorithm&quot;: &quot;A String&quot;, # The algorithm to use for autoscaling.
          },
          &quot;metadata&quot;: { # Metadata to set on the Google Compute Engine VMs.
            &quot;a_key&quot;: &quot;A String&quot;,
          },
        },
      ],
      &quot;dataset&quot;: &quot;A String&quot;, # The dataset for the current project where various workflow
          # related tables are stored.
          #
          # The supported resource type is:
          #
          # Google BigQuery:
          #   bigquery.googleapis.com/{dataset}
      &quot;internalExperiments&quot;: { # Experimental settings.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object. Contains field @type with type URL.
      },
      &quot;workerRegion&quot;: &quot;A String&quot;, # The Compute Engine region
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1&quot;. Mutually exclusive
          # with worker_zone. If neither worker_region nor worker_zone is specified,
          # default to the control plane&#x27;s region.
      &quot;serviceKmsKeyName&quot;: &quot;A String&quot;, # If set, contains the Cloud KMS key identifier used to encrypt data
          # at rest, AKA a Customer Managed Encryption Key (CMEK).
          #
          # Format:
          #   projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY
      &quot;userAgent&quot;: { # A description of the process that generated the request.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;workerZone&quot;: &quot;A String&quot;, # The Compute Engine zone
          # (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in
          # which worker processing should occur, e.g. &quot;us-west1-a&quot;. Mutually exclusive
          # with worker_region. If neither worker_region nor worker_zone is specified,
          # a zone in the control plane&#x27;s region is chosen based on available capacity.
      &quot;clusterManagerApiService&quot;: &quot;A String&quot;, # The type of cluster manager API to use. If unknown or
          # unspecified, the service will attempt to choose a reasonable
          # default. This should be in the form of the API service name,
          # e.g. &quot;compute.googleapis.com&quot;.
      &quot;tempStoragePrefix&quot;: &quot;A String&quot;, # The prefix of the resources the system should use for temporary
          # storage. The system will append the suffix &quot;/temp-{JOBNAME}&quot; to
          # this resource prefix, where {JOBNAME} is the value of the
          # job_name field. The resulting bucket and object prefix is used
          # as the prefix of the resources used to store temporary data
          # needed during the job execution. NOTE: This will override the
          # value in taskrunner_settings.
          #
          # The supported resource type is:
          #
          # Google Cloud Storage:
          #
          #   storage.googleapis.com/{bucket}/{object}
          #   bucket.storage.googleapis.com/{object}
      &quot;experiments&quot;: [ # The list of experiments to enable.
        &quot;A String&quot;,
      ],
      &quot;version&quot;: { # A structure describing which components and their versions of the service
          # are required in order to run the job.
        &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
      },
      &quot;serviceAccountEmail&quot;: &quot;A String&quot;, # Identity to run virtual machines as. Defaults to the default account.
    },
    &quot;stageStates&quot;: [ # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
      { # A message describing the state of a particular execution stage.
        &quot;executionStageName&quot;: &quot;A String&quot;, # The name of the execution stage.
        &quot;currentStateTime&quot;: &quot;A String&quot;, # The time at which the stage transitioned to this state.
        &quot;executionStageState&quot;: &quot;A String&quot;, # Execution stage states allow the same set of values as JobState.
      },
    ],
    &quot;jobMetadata&quot;: { # Metadata available primarily for filtering jobs. Will be included in the # This field is populated by the Dataflow service to support filtering jobs
        # by the metadata values provided here. Populated for ListJobs and all GetJob
        # views SUMMARY and higher.
        # ListJob response and Job SUMMARY view.
      &quot;bigTableDetails&quot;: [ # Identification of a BigTable source used in the Dataflow job.
        { # Metadata for a BigTable connector used by the job.
          &quot;tableId&quot;: &quot;A String&quot;, # TableId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
        },
      ],
      &quot;spannerDetails&quot;: [ # Identification of a Spanner source used in the Dataflow job.
        { # Metadata for a Spanner connector used by the job.
          &quot;databaseId&quot;: &quot;A String&quot;, # DatabaseId accessed in the connection.
          &quot;instanceId&quot;: &quot;A String&quot;, # InstanceId accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
        },
      ],
      &quot;datastoreDetails&quot;: [ # Identification of a Datastore source used in the Dataflow job.
        { # Metadata for a Datastore connector used by the job.
          &quot;projectId&quot;: &quot;A String&quot;, # ProjectId accessed in the connection.
          &quot;namespace&quot;: &quot;A String&quot;, # Namespace used in the connection.
        },
      ],
      &quot;sdkVersion&quot;: { # The version of the SDK used to run the job. # The SDK version used to run the job.
        &quot;versionDisplayName&quot;: &quot;A String&quot;, # A readable string describing the version of the SDK.
        &quot;sdkSupportStatus&quot;: &quot;A String&quot;, # The support status for this SDK version.
        &quot;version&quot;: &quot;A String&quot;, # The version of the SDK used to run the job.
      },
      &quot;bigqueryDetails&quot;: [ # Identification of a BigQuery source used in the Dataflow job.
        { # Metadata for a BigQuery connector used by the job.
          &quot;table&quot;: &quot;A String&quot;, # Table accessed in the connection.
          &quot;dataset&quot;: &quot;A String&quot;, # Dataset accessed in the connection.
          &quot;projectId&quot;: &quot;A String&quot;, # Project accessed in the connection.
          &quot;query&quot;: &quot;A String&quot;, # Query used to access data in the connection.
        },
      ],
      &quot;fileDetails&quot;: [ # Identification of a File source used in the Dataflow job.
        { # Metadata for a File connector used by the job.
          &quot;filePattern&quot;: &quot;A String&quot;, # File pattern used to access files by the connector.
        },
      ],
      &quot;pubsubDetails&quot;: [ # Identification of a PubSub source used in the Dataflow job.
        { # Metadata for a PubSub connector used by the job.
          &quot;subscription&quot;: &quot;A String&quot;, # Subscription used in the connection.
          &quot;topic&quot;: &quot;A String&quot;, # Topic accessed in the connection.
        },
      ],
    },
    &quot;createdFromSnapshotId&quot;: &quot;A String&quot;, # If this is specified, the job&#x27;s initial state is populated from the given
        # snapshot.
    &quot;projectId&quot;: &quot;A String&quot;, # The ID of the Cloud Platform project that the job belongs to.
    &quot;type&quot;: &quot;A String&quot;, # The type of Cloud Dataflow job.
    &quot;pipelineDescription&quot;: { # A descriptive representation of submitted pipeline as well as the executed # Preliminary field: The format of this data may change at any time.
        # A description of the user pipeline and stages through which it is executed.
        # Created by Cloud Dataflow service. Only retrieved with
        # JOB_VIEW_DESCRIPTION or JOB_VIEW_ALL.
        # form. This data is provided by the Dataflow service for ease of visualizing
        # the pipeline and interpreting Dataflow provided metrics.
      &quot;executionPipelineStage&quot;: [ # Description of each stage of execution of the pipeline.
        { # Description of the composing transforms, names/ids, and input/outputs of a
            # stage of execution. Some composing transforms and sources may have been
            # generated by the Dataflow service during execution planning.
          &quot;id&quot;: &quot;A String&quot;, # Dataflow service generated id for this stage.
          &quot;componentTransform&quot;: [ # Transforms that comprise this execution stage.
            { # Description of a transform executed as part of an execution stage.
              &quot;originalTransform&quot;: &quot;A String&quot;, # User name for the original user transform with which this transform is
                  # most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
            },
          ],
          &quot;componentSource&quot;: [ # Collections produced and consumed by component transforms of this stage.
            { # Description of an interstitial value between transforms in an execution
                # stage.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this transform; may be user or system generated.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
            },
          ],
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform this stage is executing.
          &quot;outputSource&quot;: [ # Output sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
          &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this stage.
          &quot;inputSource&quot;: [ # Input sources for this stage.
            { # Description of an input or output of an execution stage.
              &quot;originalTransformOrCollection&quot;: &quot;A String&quot;, # User name for the original user transform or collection with which this
                  # source is most closely associated.
              &quot;name&quot;: &quot;A String&quot;, # Dataflow service generated name for this source.
              &quot;sizeBytes&quot;: &quot;A String&quot;, # Size of the source, if measurable.
              &quot;userName&quot;: &quot;A String&quot;, # Human-readable name for this source; may be user or system generated.
            },
          ],
        },
      ],
      &quot;originalPipelineTransform&quot;: [ # Description of each transform in the pipeline and collections between them.
        { # Description of the type, names/ids, and input/outputs for a transform.
          &quot;kind&quot;: &quot;A String&quot;, # Type of transform.
          &quot;inputCollectionName&quot;: [ # User names for all collection inputs to this transform.
            &quot;A String&quot;,
          ],
          &quot;name&quot;: &quot;A String&quot;, # User provided name for this transform instance.
          &quot;id&quot;: &quot;A String&quot;, # SDK generated id of this transform instance.
          &quot;displayData&quot;: [ # Transform-specific display data.
            { # Data provided with a pipeline or transform to provide descriptive info.
              &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
              &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
              &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
              &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
              &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
              &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
              &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
                  # language namespace (e.g. a Python module) which defines the display data.
                  # This allows a dax monitoring system to specially handle the data
                  # and perform custom rendering.
              &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
              &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
                  # This is intended to be used as a label for the display data
                  # when viewed in a dax monitoring system.
              &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
                  # For example, a java_class_name_value of com.mypackage.MyDoFn
                  # will be stored with MyDoFn as the short_str_value and
                  # com.mypackage.MyDoFn as the java_class_name value.
                  # short_str_value can be displayed and java_class_name_value
                  # will be displayed as a tooltip.
              &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
              &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
            },
          ],
          &quot;outputCollectionName&quot;: [ # User names for all collection outputs to this transform.
            &quot;A String&quot;,
          ],
        },
      ],
      &quot;displayData&quot;: [ # Pipeline level display data.
        { # Data provided with a pipeline or transform to provide descriptive info.
          &quot;timestampValue&quot;: &quot;A String&quot;, # Contains value if the data is of timestamp type.
          &quot;boolValue&quot;: True or False, # Contains value if the data is of a boolean type.
          &quot;javaClassValue&quot;: &quot;A String&quot;, # Contains value if the data is of java class type.
          &quot;strValue&quot;: &quot;A String&quot;, # Contains value if the data is of string type.
          &quot;int64Value&quot;: &quot;A String&quot;, # Contains value if the data is of int64 type.
          &quot;durationValue&quot;: &quot;A String&quot;, # Contains value if the data is of duration type.
          &quot;namespace&quot;: &quot;A String&quot;, # The namespace for the key. This is usually a class name or programming
              # language namespace (e.g. a Python module) which defines the display data.
              # This allows a dax monitoring system to specially handle the data
              # and perform custom rendering.
          &quot;floatValue&quot;: 3.14, # Contains value if the data is of float type.
          &quot;key&quot;: &quot;A String&quot;, # The key identifying the display data.
              # This is intended to be used as a label for the display data
              # when viewed in a dax monitoring system.
          &quot;shortStrValue&quot;: &quot;A String&quot;, # A possible additional shorter value to display.
              # For example, a java_class_name_value of com.mypackage.MyDoFn
              # will be stored with MyDoFn as the short_str_value and
              # com.mypackage.MyDoFn as the java_class_name value.
              # short_str_value can be displayed and java_class_name_value
              # will be displayed as a tooltip.
          &quot;url&quot;: &quot;A String&quot;, # An optional full URL.
          &quot;label&quot;: &quot;A String&quot;, # An optional label to display in a dax UI for the element.
        },
      ],
    },
    &quot;replaceJobId&quot;: &quot;A String&quot;, # If this job is an update of an existing job, this field is the job ID
        # of the job it replaced.
        #
        # When sending a `CreateJobRequest`, you can update a job by specifying it
        # here. The job named here is stopped, and its intermediate state is
        # transferred to this job.
    &quot;tempFiles&quot;: [ # A set of files the system should be aware of that are used
        # for temporary storage. These temporary files will be
        # removed on job completion.
        # No duplicates are allowed.
        # No file patterns are supported.
        #
        # The supported files are:
        #
        # Google Cloud Storage:
        #
        #   storage.googleapis.com/{bucket}/{object}
        #   bucket.storage.googleapis.com/{object}
      &quot;A String&quot;,
    ],
    &quot;name&quot;: &quot;A String&quot;, # The user-specified Cloud Dataflow job name.
        #
        # Only one Job with a given name may exist in a project at any
        # given time. If a caller attempts to create a Job with the same
        # name as an already-existing Job, the attempt returns the
        # existing Job.
        #
        # The name must match the regular expression
        # `[a-z]([-a-z0-9]{0,38}[a-z0-9])?`
    &quot;steps&quot;: [ # Exactly one of step or steps_location should be specified.
        #
        # The top-level steps that constitute the entire job.
      { # Defines a particular step within a Cloud Dataflow job.
          #
          # A job consists of multiple steps, each of which performs some
          # specific operation as part of the overall job. Data is typically
          # passed from one step to another as part of the job.
          #
          # Here&#x27;s an example of a sequence of steps which together implement a
          # Map-Reduce job:
          #
          # * Read a collection of data from some source, parsing the
          #   collection&#x27;s elements.
          #
          # * Validate the elements.
          #
          # * Apply a user-defined function to map each element to some value
          #   and extract an element-specific key value.
          #
          # * Group elements with the same key into a single element with
          #   that key, transforming a multiply-keyed collection into a
          #   uniquely-keyed collection.
          #
          # * Write the elements out to some data sink.
          #
          # Note that the Cloud Dataflow service may be used to run many different
          # types of jobs, not just Map-Reduce.
        &quot;name&quot;: &quot;A String&quot;, # The name that identifies the step. This must be unique for each
            # step with respect to all other steps in the Cloud Dataflow job.
        &quot;kind&quot;: &quot;A String&quot;, # The kind of step in the Cloud Dataflow job.
        &quot;properties&quot;: { # Named properties associated with the step. Each kind of
            # predefined step has its own required set of properties.
            # Must be provided on Create. Only retrieved with JOB_VIEW_ALL.
          &quot;a_key&quot;: &quot;&quot;, # Properties of the object.
        },
      },
    ],
    &quot;replacedByJobId&quot;: &quot;A String&quot;, # If another job is an update of this job (and thus, this job is in
        # `JOB_STATE_UPDATED`), this field contains the ID of that job.
    &quot;executionInfo&quot;: { # Additional information about how a Cloud Dataflow job will be executed that # Deprecated.
        # isn&#x27;t contained in the submitted job.
      &quot;stages&quot;: { # A mapping from each stage to the information about that stage.
        &quot;a_key&quot;: { # Contains information about how a particular
            # google.dataflow.v1beta3.Step will be executed.
          &quot;stepName&quot;: [ # The steps associated with the execution stage.
              # Note that stages may have several steps, and that a given step
              # might be run by more than one stage.
            &quot;A String&quot;,
          ],
        },
      },
    },
    &quot;currentState&quot;: &quot;A String&quot;, # The current state of the job.
        #
        # Jobs are created in the `JOB_STATE_STOPPED` state unless otherwise
        # specified.
        #
        # A job in the `JOB_STATE_RUNNING` state may asynchronously enter a
        # terminal state. After a job has reached a terminal state, no
        # further state updates may be made.
        #
        # This field may be mutated by the Cloud Dataflow service;
        # callers cannot mutate it.
    &quot;location&quot;: &quot;A String&quot;, # The [regional endpoint]
        # (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that
        # contains this job.
    &quot;startTime&quot;: &quot;A String&quot;, # The timestamp when the job was started (transitioned to JOB_STATE_PENDING).
        # Flexible resource scheduling jobs are started with some delay after job
        # creation, so start_time is unset before start and is updated when the
        # job is started by the Cloud Dataflow service. For other jobs, start_time
        # always equals create_time and is immutable and set by the Cloud Dataflow
        # service.
    &quot;stepsLocation&quot;: &quot;A String&quot;, # The GCS location where the steps are stored.
    &quot;labels&quot;: { # User-defined labels for this job.
        #
        # The labels map can contain no more than 64 entries. Entries of the labels
        # map are UTF8 strings that comply with the following restrictions:
        #
        # * Keys must conform to regexp: \p{Ll}\p{Lo}{0,62}
        # * Values must conform to regexp: [\p{Ll}\p{Lo}\p{N}_-]{0,63}
        # * Both keys and values are additionally constrained to be &lt;= 128 bytes in
        #   size.
      &quot;a_key&quot;: &quot;A String&quot;,
    },
    &quot;createTime&quot;: &quot;A String&quot;, # The timestamp when the job was initially created. Immutable and set by the
        # Cloud Dataflow service.
    &quot;requestedState&quot;: &quot;A String&quot;, # The job&#x27;s requested state.
        #
        # `UpdateJob` may be used to switch between the `JOB_STATE_STOPPED` and
        # `JOB_STATE_RUNNING` states, by setting requested_state. `UpdateJob` may
        # also be used to directly set a job&#x27;s requested state to
        # `JOB_STATE_CANCELLED` or `JOB_STATE_DONE`, irrevocably terminating the
        # job if it has not already reached a terminal state.
  }</pre>
</div>
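As a usage sketch, the field descriptions above note that once a job reaches a terminal state no further state updates may be made. A small helper can encode that check against a Job resource dict as returned by `projects.locations.jobs.get`; the exact set of terminal states below is an assumption inferred from the `currentState` and `requestedState` descriptions, not an exhaustive list from the API.

```python
# Assumed terminal states, inferred from the Job field documentation above
# (JOB_STATE_DONE and JOB_STATE_CANCELLED are described as terminal;
# FAILED, UPDATED, and DRAINED are included as an assumption).
TERMINAL_STATES = {
    "JOB_STATE_DONE",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_UPDATED",
    "JOB_STATE_DRAINED",
}


def is_terminal(job: dict) -> bool:
    """Return True if the job's currentState is terminal (no further updates)."""
    return job.get("currentState") in TERMINAL_STATES


# Example Job fragment shaped like the resource documented above;
# the jobId values here are hypothetical placeholders.
job = {"currentState": "JOB_STATE_UPDATED", "replacedByJobId": "some-job-id"}
print(is_terminal(job))
```

A caller polling `jobs.get` could loop until `is_terminal(job)` returns True before inspecting fields such as `replacedByJobId`.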

</body></html>