<html><body>
<style>

body, h1, h2, h3, div, span, p, pre, a {
  margin: 0;
  padding: 0;
  border: 0;
  font-weight: inherit;
  font-style: inherit;
  font-size: 100%;
  font-family: inherit;
  vertical-align: baseline;
}

body {
  font-size: 13px;
  padding: 1em;
}

h1 {
  font-size: 26px;
  margin-bottom: 1em;
}

h2 {
  font-size: 24px;
  margin-bottom: 1em;
}

h3 {
  font-size: 20px;
  margin-bottom: 1em;
  margin-top: 1em;
}

pre, code {
  line-height: 1.5;
  font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}

pre {
  margin-top: 0.5em;
}

h1, h2, h3, p {
  font-family: Arial, sans-serif;
}

h1, h2, h3 {
  border-bottom: solid #CCC 1px;
}

.toc_element {
  margin-top: 0.5em;
}

.firstline {
  margin-left: 2em;
}

.method {
  margin-top: 1em;
  border: solid 1px #CCC;
  padding: 1em;
  background: #EEE;
}

.details {
  font-weight: bold;
  font-size: 14px;
}

</style>

<h1><a href="dlp_v2.html">Cloud Data Loss Prevention (DLP) API</a> . <a href="dlp_v2.projects.html">projects</a> . <a href="dlp_v2.projects.dlpJobs.html">dlpJobs</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
  <code><a href="#cancel">cancel(name, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Starts asynchronous cancellation on a long-running DlpJob. The server makes a best effort to cancel the DlpJob, but success is not guaranteed.</p>
<p class="toc_element">
  <code><a href="#create">create(parent, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a new job to inspect storage or calculate risk metrics.</p>
<p class="toc_element">
  <code><a href="#delete">delete(name, x__xgafv=None)</a></code></p>
<p class="firstline">Deletes a long-running DlpJob. This method indicates that the client is</p>
<p class="toc_element">
  <code><a href="#get">get(name, x__xgafv=None)</a></code></p>
<p class="firstline">Gets the latest state of a long-running DlpJob.</p>
<p class="toc_element">
  <code><a href="#list">list(parent, orderBy=None, pageSize=None, x__xgafv=None, pageToken=None, type=None, locationId=None, filter=None)</a></code></p>
<p class="firstline">Lists DlpJobs that match the specified filter in the request.</p>
<p class="toc_element">
  <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
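<p>A minimal sketch of how list() and list_next() might be combined to walk every page of results. It assumes that dlp_jobs is the resource object for this collection (typically obtained from the Google API Python client via build("dlp", "v2").projects().dlpJobs()) and that each page carries its results under a "jobs" key. The helper name iter_dlp_jobs is hypothetical.</p>

```python
from typing import Any, Dict, Iterator


def iter_dlp_jobs(dlp_jobs: Any, parent: str, **list_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every DlpJob under `parent`, following pages with list_next().

    `dlp_jobs` is assumed to expose list() and list_next() with the
    signatures shown in the method table above.
    """
    request = dlp_jobs.list(parent=parent, **list_kwargs)
    while request is not None:
        response = request.execute()
        # Assumption: each page lists its jobs under the "jobs" key.
        for job in response.get("jobs", []):
            yield job
        # list_next() is expected to return None once no pages remain.
        request = dlp_jobs.list_next(
            previous_request=request, previous_response=response
        )
```

<p>Writing the helper as a generator keeps only one page in memory at a time; extra keyword arguments such as filter or pageSize are passed through to list() unchanged.</p>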
<h3>Method Details</h3>
<div class="method">
    <code class="details" id="cancel">cancel(name, body=None, x__xgafv=None)</code>
  <pre>Starts asynchronous cancellation on a long-running DlpJob. The server
makes a best effort to cancel the DlpJob, but success is not
guaranteed.
See https://cloud.google.com/dlp/docs/inspecting-storage and
https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.

Args:
  name: string, Required. The name of the DlpJob resource to be cancelled. (required)
  body: object, The request body.
    The object takes the form of:

{ # The request message for canceling a DLP job.
  }

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

    { # A generic empty message that you can re-use to avoid defining duplicated
        # empty messages in your APIs. A typical example is to use it as the request
        # or the response type of an API method. For instance:
        #
        #     service Foo {
        #       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
        #     }
        #
        # The JSON representation for `Empty` is empty JSON object `{}`.
    }</pre>
</div>
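
<p>A hedged sketch of calling cancel() as documented above. The helper name cancel_dlp_job and the projects/{project}/dlpJobs/{job} layout of the resource name are assumptions made for illustration; dlp_jobs is assumed to be the resource object for this collection, for example build("dlp", "v2").projects().dlpJobs() from the Google API Python client.</p>

```python
from typing import Any, Dict


def cancel_dlp_job(dlp_jobs: Any, project_id: str, job_id: str) -> Dict[str, Any]:
    """Request best-effort cancellation of a DlpJob and return the response.

    The response is expected to be the empty message `{}`; cancellation
    itself is asynchronous and not guaranteed to succeed.
    """
    # Assumption: job resource names follow projects/{project}/dlpJobs/{job}.
    name = f"projects/{project_id}/dlpJobs/{job_id}"
    # The cancel request message has no fields, so the body is an empty object.
    return dlp_jobs.cancel(name=name, body={}).execute()
```

<p>Because the server only makes a best effort, a caller would normally follow up with get(name) to check the job's latest state.</p>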

<div class="method">
    <code class="details" id="create">create(parent, body=None, x__xgafv=None)</code>
  <pre>Creates a new job to inspect storage or calculate risk metrics.
See https://cloud.google.com/dlp/docs/inspecting-storage and
https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.

When no InfoTypes or CustomInfoTypes are specified in inspect jobs, the
system will automatically choose what detectors to run. By default this may
be all types, but may change over time as detectors are updated.

Args:
  parent: string, Required. The parent resource name, for example projects/my-project-id. (required)
  body: object, The request body.
    The object takes the form of:

{ # Request message for CreateDlpJobRequest. Used to initiate long running
    # jobs such as calculating risk metrics or inspecting Google Cloud
    # Storage.
  "riskJob": { # Set to choose what metric to calculate.
      # Configuration for a risk analysis job. See
      # https://cloud.google.com/dlp/docs/concepts-risk-analysis to learn more.
    "privacyMetric": { # Privacy metric to compute.
        # Privacy metric to compute for reidentification risk analysis.
      "numericalStatsConfig": { # Numerical stats
          # Compute numerical stats over an individual column, including
          # min, max, and quantiles.
        "field": { # Field to compute numerical stats on. Supported types are
            # integer, float, date, datetime, timestamp, time.
            # General identifier of a data field in a storage service.
          "name": "A String", # Name describing the field.
        },
      },
      "kMapEstimationConfig": { # k-map
          # Reidentifiability metric. This corresponds to a risk model similar to what
          # is called "journalist risk" in the literature, except the attack dataset is
          # statistically modeled instead of being perfectly known. This can be done
          # using publicly available data (like the US Census), or using a custom
          # statistical model (indicated as one or several BigQuery tables), or by
          # extrapolating from the distribution of values in the input dataset.
        "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
            # Set if no column is tagged with a region-specific InfoType (like
            # US_ZIP_5) or a region code.
        "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two columns can have the
            # same tag.
          { # A column with a semantic tag attached.
            "field": { # Required. Identifies the column.
                # General identifier of a data field in a storage service.
              "name": "A String", # Name describing the field.
            },
            "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
                # indicate an auxiliary table that contains statistical information on
                # the possible values of this column (below).
            "infoType": { # A column can be tagged with an InfoType to use the relevant public
                # dataset as a statistical model of population, if available. We
                # currently support US ZIP codes, region codes, ages and genders.
                # To programmatically obtain the list of supported InfoTypes, use
                # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
                # Type of information detected by the API.
              "name": "A String", # Name of the information type. Either a name of your choosing when
                  # creating a CustomInfoType, or one of the names listed
                  # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
                  # a built-in type. InfoType names should conform to the pattern
                  # `[a-zA-Z0-9_]{1,64}`.
            },
            "inferred": { # If no semantic tag is indicated, we infer the statistical model from
                # the distribution of values in the input data.
                # A generic empty message that you can re-use to avoid defining duplicated
                # empty messages in your APIs. A typical example is to use it as the request
                # or the response type of an API method. For instance:
                #
                #     service Foo {
                #       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
                #     }
                #
                # The JSON representation for `Empty` is empty JSON object `{}`.
            },
          },
        ],
        "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
            # used to tag a quasi-identifier column must appear in exactly one column
            # of one auxiliary table.
          { # An auxiliary table contains statistical information on the relative
              # frequency of different quasi-identifier values. It has one or several
              # quasi-identifier columns, and one column that indicates the relative
              # frequency of each quasi-identifier tuple.
              # If a tuple is present in the data but not in the auxiliary table, the
              # corresponding relative frequency is assumed to be zero (and thus, the
              # tuple is highly reidentifiable).
            "table": { # Required. Auxiliary table location.
                # Message defining the location of a BigQuery table. A table is uniquely
                # identified by its project_id, dataset_id, and table_name. Within a query
                # a table is often referenced with a string in the format of:
                # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
                # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
              "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
                  # If omitted, project ID is inferred from the API call.
              "tableId": "A String", # Name of the table.
              "datasetId": "A String", # Dataset ID of the table.
            },
            "quasiIds": [ # Required. Quasi-identifier columns.
              { # A quasi-identifier column has a custom_tag, used to know which column
                  # in the data corresponds to which column in the statistical model.
                "field": { # Identifies the column.
                    # General identifier of a data field in a storage service.
                  "name": "A String", # Name describing the field.
                },
                "customTag": "A String", # An auxiliary field.
              },
            ],
            "relativeFrequency": { # Required. The relative frequency column must contain a floating-point number
                # between 0 and 1 (inclusive). Null values are assumed to be zero.
                # General identifier of a data field in a storage service.
              "name": "A String", # Name describing the field.
            },
          },
        ],
      },
      "lDiversityConfig": { # l-diversity
          # l-diversity metric, used for analysis of reidentification risk.
        "sensitiveAttribute": { # Sensitive field for computing the l-value.
            # General identifier of a data field in a storage service.
          "name": "A String", # Name describing the field.
        },
        "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are
            # defined for the l-diversity computation. When multiple fields are
            # specified, they are considered a single composite key.
          { # General identifier of a data field in a storage service.
            "name": "A String", # Name describing the field.
          },
        ],
      },
      "kAnonymityConfig": { # K-anonymity
          # k-anonymity metric, used for analysis of reidentification risk.
        "entityId": { # Message indicating that multiple rows might be associated to a
            # single individual. If the same entity_id is associated to multiple
            # quasi-identifier tuples over distinct rows, we consider the entire
            # collection of tuples as the composite quasi-identifier. This collection
            # is a multiset: the order in which the different tuples appear in the
            # dataset is ignored, but their frequency is taken into account.
            #
            # Important note: a maximum of 1000 rows can be associated to a single
            # entity ID. If more rows are associated with the same entity ID, some
            # might be ignored.
            #
            # An entity in a dataset is a field or set of fields that correspond to a
            # single person. For example, in medical records the `EntityId` might be a
            # patient identifier, or for financial records it might be an account
            # identifier. This message is used when generalizations or analysis must take
            # into account that multiple rows correspond to the same entity.
          "field": { # Composite key indicating which field contains the entity identifier.
              # General identifier of a data field in a storage service.
            "name": "A String", # Name describing the field.
          },
        },
        "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are
            # specified, they are considered a single composite key. Structs and
            # repeated data types are not supported; however, nested fields are
            # supported so long as they are not structs themselves or nested within
            # a repeated field.
          { # General identifier of a data field in a storage service.
            "name": "A String", # Name describing the field.
          },
        ],
      },
      "categoricalStatsConfig": { # Categorical stats
          # Compute categorical stats over an individual column, including
          # number of distinct values and value count distribution.
        "field": { # Field to compute categorical stats on. All column types are
            # supported except for arrays and structs. However, it may be more
            # informative to use NumericalStats when the field type is supported,
            # depending on the data.
            # General identifier of a data field in a storage service.
          "name": "A String", # Name describing the field.
        },
      },
      "deltaPresenceEstimationConfig": { # delta-presence
          # δ-presence metric, used to estimate how likely it is for an attacker to
          # figure out that one given individual appears in a de-identified dataset.
          # Similarly to the k-map metric, we cannot compute δ-presence exactly without
          # knowing the attack dataset, so we use a statistical model instead.
        "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
            # Set if no column is tagged with a region-specific InfoType (like
            # US_ZIP_5) or a region code.
        "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two fields can have the
            # same tag.
          { # A column with a semantic tag attached.
            "field": { # Required. Identifies the column.
                # General identifier of a data field in a storage service.
              "name": "A String", # Name describing the field.
            },
            "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
                # indicate an auxiliary table that contains statistical information on
                # the possible values of this column (below).
            "infoType": { # A column can be tagged with an InfoType to use the relevant public
                # dataset as a statistical model of population, if available. We
                # currently support US ZIP codes, region codes, ages and genders.
                # To programmatically obtain the list of supported InfoTypes, use
                # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
                # Type of information detected by the API.
              "name": "A String", # Name of the information type. Either a name of your choosing when
                  # creating a CustomInfoType, or one of the names listed
                  # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
                  # a built-in type. InfoType names should conform to the pattern
                  # `[a-zA-Z0-9_]{1,64}`.
            },
            "inferred": { # If no semantic tag is indicated, we infer the statistical model from
                # the distribution of values in the input data.
                # A generic empty message that you can re-use to avoid defining duplicated
                # empty messages in your APIs. A typical example is to use it as the request
                # or the response type of an API method. For instance:
                #
                #     service Foo {
                #       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
                #     }
                #
                # The JSON representation for `Empty` is empty JSON object `{}`.
            },
          },
        ],
        "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
            # used to tag a quasi-identifier field must appear in exactly one
            # field of one auxiliary table.
          { # An auxiliary table containing statistical information on the relative
              # frequency of different quasi-identifier values. It has one or several
              # quasi-identifier columns, and one column that indicates the relative
              # frequency of each quasi-identifier tuple.
              # If a tuple is present in the data but not in the auxiliary table, the
              # corresponding relative frequency is assumed to be zero (and thus, the
              # tuple is highly reidentifiable).
            "relativeFrequency": { # Required. The relative frequency column must contain a floating-point number
                # between 0 and 1 (inclusive). Null values are assumed to be zero.
                # General identifier of a data field in a storage service.
              "name": "A String", # Name describing the field.
            },
            "quasiIds": [ # Required. Quasi-identifier columns.
              { # A quasi-identifier column has a custom_tag, used to know which column
                  # in the data corresponds to which column in the statistical model.
                "field": { # Identifies the column.
                    # General identifier of a data field in a storage service.
                  "name": "A String", # Name describing the field.
                },
                "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
                    # indicate an auxiliary table that contains statistical information on
                    # the possible values of this column (below).
              },
            ],
            "table": { # Required. Auxiliary table location.
                # Message defining the location of a BigQuery table. A table is uniquely
                # identified by its project_id, dataset_id, and table_name. Within a query
                # a table is often referenced with a string in the format of:
                # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
                # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
              "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
                  # If omitted, project ID is inferred from the API call.
              "tableId": "A String", # Name of the table.
              "datasetId": "A String", # Dataset ID of the table.
            },
          },
        ],
      },
    },
    "actions": [ # Actions to execute at the completion of the job. They are executed in the order
        # provided.
      { # A task to execute on the completion of a job.
          # See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
        "saveFindings": { # Save resulting findings in a provided location.
            # If set, the detailed findings will be persisted to the specified
            # OutputStorageConfig. Only a single instance of this action can be
            # specified.
            # Compatible with: Inspect, Risk
          "outputConfig": { # Location to store findings outside of DLP.
              # Cloud repository for storing output.
            "table": { # Store findings in an existing table or a new table in an existing
                # dataset. If table_id is not set a new one will be generated
                # for you with the following format:
                # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
                # generating the date details.
                #
                # For Inspect, each column in an existing output table must have the same
                # name, type, and mode of a field in the `Finding` object.
                #
                # For Risk, an existing output table should be the output of a previous
                # Risk analysis job run on the same source table, with the same privacy
                # metric and quasi-identifiers. Risk jobs that analyze the same table but
                # compute a different privacy metric, or use different sets of
                # quasi-identifiers, cannot store their results in the same table.
                #
                # Message defining the location of a BigQuery table. A table is uniquely
                # identified by its project_id, dataset_id, and table_name. Within a query
                # a table is often referenced with a string in the format of:
                # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
                # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
              "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
                  # If omitted, project ID is inferred from the API call.
              "tableId": "A String", # Name of the table.
              "datasetId": "A String", # Dataset ID of the table.
            },
            "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
                # used for Inspect and must be unspecified for Risk jobs. Columns are derived
                # from the `Finding` object. If appending to an existing table, any columns
                # from the predefined schema that are missing will be added. No columns in
                # the existing table will be deleted.
                #
                # If unspecified, then all available columns will be used for a new table or
                # an (existing) table with no schema, and no changes will be made to an
                # existing table that has a schema.
                # Only for use with external storage.
          },
        },
        "jobNotificationEmails": { # Enable email notification for project owners and editors on job's
            # completion/failure.
        },
        "publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha).
            # Publish the result summary of a DlpJob to the Cloud Security
            # Command Center (CSCC Alpha).
            # This action is only available for projects that are part of
            # an organization and whitelisted for the alpha Cloud Security Command
            # Center.
            # The action will publish the count of finding instances and their info types.
            # The summary of findings will be persisted in CSCC and is governed by CSCC
            # service-specific policy, see https://cloud.google.com/terms/service-terms
            # Only a single instance of this action can be specified.
            # Compatible with: Inspect
        },
        "publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count.
            # This will publish a metric to Stackdriver on each infotype requested and
            # how many findings were found for it. CustomDetectors will be bucketed
            # as 'Custom' under the Stackdriver label 'info_type'.
        },
        "publishFindingsToCloudDataCatalog": { # Publish findings to Cloud Datahub.
            # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the
            # results of the DlpJob will be applied to the entry for the resource scanned
            # in Cloud Data Catalog. Any labels previously written by another DlpJob will
            # be deleted. InfoType naming patterns are strictly enforced when using this
            # feature. Note that the findings will be persisted in Cloud Data Catalog
            # storage and are governed by Data Catalog service-specific policy, see
            # https://cloud.google.com/terms/service-terms
            # Only a single instance of this action can be specified and only allowed if
            # all resources being scanned are BigQuery tables.
            # Compatible with: Inspect
        },
        "pubSub": { # Publish a notification to a pubsub topic.
            # Publish a message into given Pub/Sub topic when DlpJob has completed. The
            # message contains a single field, `DlpJobName`, which is equal to the
            # finished job's
            # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
            # Compatible with: Inspect, Risk
          "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
              # publishing access rights to the DLP API service account executing
              # the long running DlpJob sending the notifications.
              # Format is projects/{project}/topics/{topic}.
        },
      },
    ],
    "sourceTable": { # Input dataset to compute metrics over.
        # Message defining the location of a BigQuery table. A table is uniquely
        # identified by its project_id, dataset_id, and table_name. Within a query
        # a table is often referenced with a string in the format of:
        # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
        # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
      "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
          # If omitted, project ID is inferred from the API call.
      "tableId": "A String", # Name of the table.
      "datasetId": "A String", # Dataset ID of the table.
    },
  },
  "inspectJob": { # Set to control what and how to inspect.
      # Controls what and how to inspect for findings.
    "storageConfig": { # The data to scan.
        # Shared message indicating Cloud storage type.
      "cloudStorageOptions": { # Google Cloud Storage options.
          # Options defining a file or a set of files within a Google Cloud Storage
          # bucket.
        "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
            # than this value then the rest of the bytes are omitted. Only one
            # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
        "sampleMethod": "A String",
        "fileSet": { # The set of one or more files to scan.
            # Set of files to scan.
          "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
              # `gs://&lt;bucket&gt;/&lt;path&gt;`. Trailing wildcard in the path is allowed.
              #
              # If the url ends in a trailing slash, the bucket or directory represented
              # by the url will be scanned non-recursively (content in sub-directories
              # will not be scanned). This means that `gs://mybucket/` is equivalent to
              # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
              # `gs://mybucket/directory/*`.
              #
              # Exactly one of `url` or `regex_file_set` must be set.
          "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
              # `regex_file_set` must be set.
              #
              # Message representing a set of files in a Cloud Storage bucket. Regular
              # expressions are used to allow fine-grained control over which files in the
              # bucket to include.
              #
              # Included files are those that match at least one item in `include_regex` and
              # do not match any items in `exclude_regex`. Note that a file that matches
              # items from both lists will _not_ be included. For a match to occur, the
              # entire file path (i.e., everything in the url after the bucket name) must
              # match the regular expression.
              #
              # For example, given the input `{bucket_name: "mybucket", include_regex:
              # ["directory1/.*"], exclude_regex:
              # ["directory1/excluded.*"]}`:
              #
              # * `gs://mybucket/directory1/myfile` will be included
              # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
              # across `/`)
              # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
              # full path doesn't match any items in `include_regex`)
              # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
              # matches an item in `exclude_regex`)
              #
              # If `include_regex` is left empty, it will match all files by default
              # (this is equivalent to setting `include_regex: [".*"]`).
              #
              # Some other common use cases:
              #
              # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
              # files in `mybucket` except for .pdf files
              # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
              # include all files directly under `gs://mybucket/directory/`, without matching
              # across `/`
            "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
                # the bucket that match at least one of these regular expressions will be
                # excluded from the scan.
                #
                # Regular expressions use RE2
                # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
                # under the google/re2 repository on GitHub.
              "A String",
            ],
            "bucketName": "A String", # The name of a Cloud Storage bucket. Required.
            "includeRegex": [ # A list of regular expressions matching file paths to include. All files in
                # the bucket that match at least one of these regular expressions will be
                # included in the set of files, except for those that also match an item in
                # `exclude_regex`. Leaving this field empty will match all files by default
                # (this is equivalent to including `.*` in the list).
                #
                # Regular expressions use RE2
                # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
                # under the google/re2 repository on GitHub.
              "A String",
            ],
          },
        },
        "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
            # Number of files scanned is rounded down. Must be between 0 and 100,
            # inclusive. Both 0 and 100 mean no limit. Defaults to 0.
        "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
            # number of bytes scanned is rounded down. Must be between 0 and 100,
            # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
            # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
        "fileTypes": [ # List of file type groups to include in the scan.
            # If empty, all files are scanned and available data format processors
            # are applied. In addition, the binary content of the selected files
            # is always scanned as well.
            # Images are scanned only as binary if the specified region
            # does not support image inspection and no file_types were specified.
            # Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.
          "A String",
        ],
      },
      "datastoreOptions": { # Google Cloud Datastore options.
          # Options defining a data set within Google Cloud Datastore.
        "partitionId": { # Datastore partition ID.
            # A partition ID identifies a grouping of entities. The grouping is always
            # by project and namespace, however the namespace ID may be empty.
            #
            # A partition ID contains several dimensions:
            # project ID and namespace ID.
          "projectId": "A String", # The ID of the project to which the entities belong.
          "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
        },
        "kind": { # The kind to process.
            # A representation of a Datastore kind.
          "name": "A String", # The name of the kind.
        },
      },
572 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.
573 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip
574 # inspection of entire columns which you know have no findings.
575 { # General identifier of a data field in a storage service.
576 "name": "A String", # Name describing the field.
577 },
578 ],
579 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
580 # rest of the rows are omitted. If not set, or if set to 0, all rows will be
581 # scanned. Only one of rows_limit and rows_limit_percent can be specified.
582 # Cannot be used in conjunction with TimespanConfig.
583 "sampleMethod": "A String",
584 "identifyingFields": [ # Table fields that may uniquely identify a row within the table. When
585 # `actions.saveFindings.outputConfig.table` is specified, the values of
586 # columns specified here are available in the output table under
587 # `location.content_locations.record_location.record_key.id_values`. Nested
588 # fields such as `person.birthdate.year` are allowed.
589 { # General identifier of a data field in a storage service.
590 "name": "A String", # Name describing the field.
591 },
592 ],
593 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
594 # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
595 # 100 mean no limit. Defaults to 0. Only one of rows_limit and
596 # rows_limit_percent can be specified. Cannot be used in conjunction with
597 # TimespanConfig.
598 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
599 # identified by its project_id, dataset_id, and table_name. Within a query
600 # a table is often referenced with a string in the format of:
601 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
602 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
603 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
604 # If omitted, project ID is inferred from the API call.
605 "tableId": "A String", # Name of the table.
606 "datasetId": "A String", # Dataset ID of the table.
607 },
608 },
609 "timespanConfig": { # Configuration of the timespan of the items to include in scanning.
610 # Currently only supported when inspecting Google Cloud Storage and BigQuery.
611 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
612 # Used for data sources like Datastore and BigQuery.
613 #
614 # For BigQuery:
615 # Required to filter out rows based on the given start and
616 # end times. If not specified and the table was modified between the given
617 # start and end times, the entire table will be scanned.
618 # The valid data types of the timestamp field are `INTEGER`, `DATE`,
619 # `TIMESTAMP`, and `DATETIME`.
620 #
621 # For Datastore:
622 # The only valid data type of the timestamp field is `TIMESTAMP`.
623 # A Datastore entity will be scanned if the timestamp property does not
624 # exist or its value is empty or invalid.
625 "name": "A String", # Name describing the field.
626 },
627 "endTime": "A String", # Exclude files or rows newer than this value.
628 # If set to zero, no upper time limit is applied.
629 "startTime": "A String", # Exclude files or rows older than this value.
630 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
631 # a valid start_time to avoid scanning files that have not been modified
632 # since the last time the JobTrigger executed. This will be based on the
633 # time of the execution of the last run of the JobTrigger.
634 },
635 "hybridOptions": { # Hybrid inspection options. Configuration to control jobs where the content
636 # being inspected is outside of Google Cloud Platform.
637 # Early access feature is in a pre-release state and might change or have
638 # limited support. For more information, see
639 # https://cloud.google.com/products#product-launch-stages.
640 "tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings
641 # meaningful such as the columns that are primary keys.
642 "identifyingFields": [ # The columns that are the primary keys for table objects included in
643 # ContentItem. A copy of this cell's value will be stored alongside
644 # each finding so that the finding can be traced to the specific row it came
645 # from. No more than 3 may be provided.
646 { # General identifier of a data field in a storage service.
647 "name": "A String", # Name describing the field.
648 },
649 ],
650 },
651 "labels": { # To organize findings, these labels will be added to each finding.
652 #
653 # Label keys must be between 1 and 63 characters long and must conform
654 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
655 #
656 # Label values must be between 0 and 63 characters long and must conform
657 # to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.
658 #
659 # No more than 10 labels can be associated with a given finding.
660 #
661 # Examples:
662 # * `"environment" : "production"`
663 # * `"pipeline" : "etl"`
664 "a_key": "A String",
665 },
666 "requiredFindingLabelKeys": [ # These are label keys that each inspection request must include within its
667 # 'finding_labels' map. A request may contain other labels, but any request
668 # missing one of these keys will be rejected.
669 #
670 # Label keys must be between 1 and 63 characters long and must conform
671 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
672 #
673 # No more than 10 keys can be required.
674 "A String",
675 ],
676 "description": "A String", # A short description of where the data is coming from. Will be stored once
677 # in the job. 256 max length.
678 },
679 },
680 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
681 # When used with redactContent only info_types and min_likelihood are currently
682 # used.
683 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
684 "limits": { # Configuration to control the number of findings returned.
685 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
686 # When set within `InspectContentRequest`, the maximum returned is 2000
687 # regardless of whether this is set higher.
688 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
689 { # Max findings configuration per infoType, per content item or long
690 # running DlpJob.
691 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
692 # info_type should be provided. If InfoTypeLimit does not have an
693 # info_type, the DLP API applies the limit against all info_types that
694 # are found but not specified in another InfoTypeLimit.
695 "name": "A String", # Name of the information type. Either a name of your choosing when
696 # creating a CustomInfoType, or one of the names listed
697 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
698 # a built-in type. InfoType names should conform to the pattern
699 # `[a-zA-Z0-9_]{1,64}`.
700 },
701 "maxFindings": 42, # Max findings limit for the given infoType.
702 },
703 ],
704 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
705 # When set within `InspectJobConfig`,
706 # the maximum returned is 2000 regardless of whether this is set higher.
707 # When set within `InspectContentRequest`, this field is ignored.
708 },
709 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
710 # POSSIBLE.
711 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
712 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
713 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
714 { # Custom information type provided by the user. Used to find domain-specific
715 # sensitive information configurable to the data in question.
716 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
717 "pattern": "A String", # Pattern defining the regular expression. Its syntax
718 # (https://github.com/google/re2/wiki/Syntax) can be found under the
719 # google/re2 repository on GitHub.
720 "groupIndexes": [ # The index of the submatch to extract as findings. When not
721 # specified, the entire match is returned. No more than 3 may be included.
722 42,
723 ],
724 },
725 "surrogateType": { # Message for detecting output from deidentification transformations that
726 # support reversing, such as
727 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
728 # These types of transformations are
729 # those that perform pseudonymization, thereby producing a "surrogate" as
730 # output. This should be used in conjunction with a field on the
731 # transformation such as `surrogate_info_type`. This CustomInfoType does
732 # not support the use of `detection_rules`.
734 },
735 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
736 # infoType, when the name matches one of the existing infoTypes and that infoType
737 # is specified in the `InspectContent.info_types` field. Specifying the latter
738 # adds findings to the ones detected by the system. If the built-in info type is
739 # not specified in the `InspectContent.info_types` list, then the name is treated
740 # as a custom info type.
741 "name": "A String", # Name of the information type. Either a name of your choosing when
742 # creating a CustomInfoType, or one of the names listed
743 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
744 # a built-in type. InfoType names should conform to the pattern
745 # `[a-zA-Z0-9_]{1,64}`.
746 },
747 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
748 # be used to match sensitive information specific to the data, such as a list
749 # of employee IDs or job titles.
750 #
751 # Dictionary words are case-insensitive and all characters other than letters
752 # and digits in the unicode [Basic Multilingual
753 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
754 # will be replaced with whitespace when scanning for matches, so the
755 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
756 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
757 # surrounding any match must be of a different type than the adjacent
758 # characters within the word, so letters must be next to non-letters and
759 # digits next to non-digits. For example, the dictionary word "jen" will
760 # match the first three letters of the text "jen123" but will return no
761 # matches for "jennifer".
762 #
763 # Dictionary words containing a large number of characters that are not
764 # letters or digits may result in unexpected findings because such characters
765 # are treated as whitespace. The
766 # [limits](https://cloud.google.com/dlp/limits) page contains details about
767 # the size limits of dictionaries. For dictionaries that do not fit within
768 # these constraints, consider using `LargeCustomDictionaryConfig` in the
769 # `StoredInfoType` API.
770 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
771 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
772 # at least one phrase and every phrase must contain at least 2 characters
773 # that are letters or digits. [required]
774 "A String",
775 ],
776 },
777 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
778 # is accepted.
779 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
780 # Example: gs://[BUCKET_NAME]/dictionary.txt
781 },
782 },
783 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
784 # `InspectDataSource`. Not currently supported in `InspectContent`.
785 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
786 # `organizations/433245324/storedInfoTypes/432452342` or
787 # `projects/project-id/storedInfoTypes/432452342`.
788 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
789 # inspection was created. Output-only field, populated by the system.
790 },
791 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
792 # Rules are applied in order that they are specified. Not supported for the
793 # `surrogate_type` CustomInfoType.
794 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
795 # `CustomInfoType` to alter behavior under certain circumstances, depending
796 # on the specific details of the rule. Not supported for the `surrogate_type`
797 # custom infoType.
798 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
799 # proximity of hotwords.
800 "proximity": { # Message for specifying a window around a finding to apply a detection rule.
801 # Proximity of the finding within which the entire hotword must reside.
802 # The total length of the window cannot exceed 1000 characters. Note that
803 # the finding itself will be included in the window, so that hotwords may
804 # be used to match substrings of the finding itself. For example, the
805 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
806 # adjusted upwards if the area code is known to be the local area code of
807 # a company office using the hotword regex "\(xxx\)", where "xxx"
808 # is the area code in question.
809 "windowBefore": 42, # Number of characters before the finding to consider.
810 "windowAfter": 42, # Number of characters after the finding to consider.
811 },
812 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
813 "pattern": "A String", # Pattern defining the regular expression. Its syntax
814 # (https://github.com/google/re2/wiki/Syntax) can be found under the
815 # google/re2 repository on GitHub.
816 "groupIndexes": [ # The index of the submatch to extract as findings. When not
817 # specified, the entire match is returned. No more than 3 may be included.
818 42,
819 ],
820 },
821 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
822 # part of a detection rule.
823 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
824 # levels. For example, if a finding would be `POSSIBLE` without the
825 # detection rule and `relative_likelihood` is 1, then it is upgraded to
826 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
827 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
828 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
829 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
830 # a final likelihood of `LIKELY`.
831 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
832 },
833 },
834 },
835 ],
836 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
837 # to be returned. It still can be used for rules matching.
838 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
839 # altered by a detection rule if the finding meets the criteria specified by
840 # the rule. Defaults to `VERY_LIKELY` if not specified.
841 },
842 ],
843 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
844 # included in the response; see Finding.quote.
845 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
846 # Exclusion rules contained in the set are executed last; other
847 # rules are executed in the order they are specified for each info type.
848 { # Rule set for modifying a set of infoTypes to alter behavior under certain
849 # circumstances, depending on the specific details of the rules within the set.
850 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
851 { # A single inspection rule to be applied to infoTypes, specified in
852 # `InspectionRuleSet`.
853 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
854 # proximity of hotwords.
855 "proximity": { # Message for specifying a window around a finding to apply a detection rule.
856 # Proximity of the finding within which the entire hotword must reside.
857 # The total length of the window cannot exceed 1000 characters. Note that
858 # the finding itself will be included in the window, so that hotwords may
859 # be used to match substrings of the finding itself. For example, the
860 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
861 # adjusted upwards if the area code is known to be the local area code of
862 # a company office using the hotword regex "\(xxx\)", where "xxx"
863 # is the area code in question.
864 "windowBefore": 42, # Number of characters before the finding to consider.
865 "windowAfter": 42, # Number of characters after the finding to consider.
866 },
867 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
868 "pattern": "A String", # Pattern defining the regular expression. Its syntax
869 # (https://github.com/google/re2/wiki/Syntax) can be found under the
870 # google/re2 repository on GitHub.
871 "groupIndexes": [ # The index of the submatch to extract as findings. When not
872 # specified, the entire match is returned. No more than 3 may be included.
873 42,
874 ],
875 },
876 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
877 # part of a detection rule.
878 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
879 # levels. For example, if a finding would be `POSSIBLE` without the
880 # detection rule and `relative_likelihood` is 1, then it is upgraded to
881 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
882 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
883 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
884 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
885 # a final likelihood of `LIKELY`.
886 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
887 },
888 },
889 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
890 # `InspectionRuleSet` are removed from results.
891 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
892 "pattern": "A String", # Pattern defining the regular expression. Its syntax
893 # (https://github.com/google/re2/wiki/Syntax) can be found under the
894 # google/re2 repository on GitHub.
895 "groupIndexes": [ # The index of the submatch to extract as findings. When not
896 # specified, the entire match is returned. No more than 3 may be included.
897 42,
898 ],
899 },
900 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
901 "infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps with or
902 # is contained within a finding of an infoType from this list. For
903 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
904 # `exclusion_rule` containing `exclude_info_types.info_types` with
905 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
906 # with an EMAIL_ADDRESS finding.
907 # As a result, "555-222-2222@example.org" generates only a single
908 # finding, namely the email address.
909 { # Type of information detected by the API.
910 "name": "A String", # Name of the information type. Either a name of your choosing when
911 # creating a CustomInfoType, or one of the names listed
912 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
913 # a built-in type. InfoType names should conform to the pattern
914 # `[a-zA-Z0-9_]{1,64}`.
915 },
916 ],
917 },
918 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
919 # be used to match sensitive information specific to the data, such as a list
920 # of employee IDs or job titles.
921 #
922 # Dictionary words are case-insensitive and all characters other than letters
923 # and digits in the unicode [Basic Multilingual
924 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
925 # will be replaced with whitespace when scanning for matches, so the
926 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
927 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
928 # surrounding any match must be of a different type than the adjacent
929 # characters within the word, so letters must be next to non-letters and
930 # digits next to non-digits. For example, the dictionary word "jen" will
931 # match the first three letters of the text "jen123" but will return no
932 # matches for "jennifer".
933 #
934 # Dictionary words containing a large number of characters that are not
935 # letters or digits may result in unexpected findings because such characters
936 # are treated as whitespace. The
937 # [limits](https://cloud.google.com/dlp/limits) page contains details about
938 # the size limits of dictionaries. For dictionaries that do not fit within
939 # these constraints, consider using `LargeCustomDictionaryConfig` in the
940 # `StoredInfoType` API.
941 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
942 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
943 # at least one phrase and every phrase must contain at least 2 characters
944 # that are letters or digits. [required]
945 "A String",
946 ],
947 },
948 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
949 # is accepted.
950 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
951 # Example: gs://[BUCKET_NAME]/dictionary.txt
952 },
953 },
954 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
955 },
956 },
957 ],
958 "infoTypes": [ # List of infoTypes this rule set is applied to.
959 { # Type of information detected by the API.
960 "name": "A String", # Name of the information type. Either a name of your choosing when
961 # creating a CustomInfoType, or one of the names listed
962 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
963 # a built-in type. InfoType names should conform to the pattern
964 # `[a-zA-Z0-9_]{1,64}`.
965 },
966 ],
967 },
968 ],
969 "contentOptions": [ # List of options defining data content to scan.
970 # If empty, text, images, and other content will be included.
971 "A String",
972 ],
973 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
974 # InfoType values returned by ListInfoTypes or listed at
975 # https://cloud.google.com/dlp/docs/infotypes-reference.
976 #
977 # When no InfoTypes or CustomInfoTypes are specified in a request, the
978 # system may automatically choose what detectors to run. By default this may
979 # be all types, but may change over time as detectors are updated.
980 #
981 # If you need precise control and predictability as to what detectors are
982 # run you should specify specific InfoTypes listed in the reference,
983 # otherwise a default list will be used, which may change over time.
984 { # Type of information detected by the API.
985 "name": "A String", # Name of the information type. Either a name of your choosing when
986 # creating a CustomInfoType, or one of the names listed
987 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
988 # a built-in type. InfoType names should conform to the pattern
989 # `[a-zA-Z0-9_]{1,64}`.
990 },
991 ],
992 },
993 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
994 # `inspect_config` will be merged into the values persisted as part of the
995 # template.
996 "actions": [ # Actions to execute at the completion of the job.
997 { # A task to execute on the completion of a job.
998 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
999 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
1000 # OutputStorageConfig. Only a single instance of this action can be
1001 # specified.
1002 # Compatible with: Inspect, Risk
1003 "outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.
1004 "table": { # Store findings in an existing table or a new table in an existing
1005 # dataset. If table_id is not set a new one will be generated
1006 # for you with the following format:
1007 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
1008 # generating the date details.
1009 #
1010 # For Inspect, each column in an existing output table must have the same
1011 # name, type, and mode of a field in the `Finding` object.
1012 #
1013 # For Risk, an existing output table should be the output of a previous
1014 # Risk analysis job run on the same source table, with the same privacy
1015 # metric and quasi-identifiers. Risk jobs that analyze the same table but
1016 # compute a different privacy metric, or use different sets of
1017 # quasi-identifiers, cannot store their results in the same table.
1018 #
1019 # Message defining the location of a BigQuery table. A table is uniquely identified by its project_id, dataset_id, and table_name.
1020 # Within a query a table is often referenced with a string in the format of:
1021 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
1022 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
1023 # If omitted, project ID is inferred from the API call.
1024 "tableId": "A String", # Name of the table.
1025 "datasetId": "A String", # Dataset ID of the table.
1026 },
1027 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
1028 # used for Inspect and must be unspecified for Risk jobs. Columns are derived
1029 # from the `Finding` object. If appending to an existing table, any columns
1030 # from the predefined schema that are missing will be added. No columns in
1031 # the existing table will be deleted.
1032 #
1033 # If unspecified, then all available columns will be used for a new table or
1034 # an (existing) table with no schema, and no changes will be made to an
1035 # existing table that has a schema.
1036 # Only for use with external storage.
1037 },
1038 },
1039 "jobNotificationEmails": { # Enable email notification for project owners and editors on job
1040 # completion/failure.
1042 },
1043 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
1044 # Command Center (CSCC Alpha).
1045 # This action is only available for projects which are parts of
1046 # an organization and whitelisted for the alpha Cloud Security Command
1047 # Center.
1048 # The action will publish the count of finding instances and their info types.
1049 # The summary of findings will be persisted in CSCC and is governed by CSCC
1050 # service-specific policy; see https://cloud.google.com/terms/service-terms
1051 # Only a single instance of this action can be specified.
1052 # Compatible with: Inspect
1053 },
1054 "publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This
1055 # will publish a metric to Stackdriver for each infoType requested and
1056 # the number of findings found for it. CustomDetectors will be bucketed
1057 # as 'Custom' under the Stackdriver label 'info_type'.
1058 },
1059 "publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
1060 # results of the DlpJob will be applied to the entry for the resource scanned
1061 # in Cloud Data Catalog. Any labels previously written by another DlpJob will
1062 # be deleted. InfoType naming patterns are strictly enforced when using this
1063 # feature. Note that the findings will be persisted in Cloud Data Catalog
1064 # storage and are governed by Data Catalog service-specific policy, see
1065 # https://cloud.google.com/terms/service-terms
1066 # Only a single instance of this action can be specified and only allowed if
1067 # all resources being scanned are BigQuery tables.
1068 # Compatible with: Inspect
1069 },
"pubSub": { # Publish a message into the given Pub/Sub topic when the DlpJob has completed. The
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
1074 # Compatible with: Inspect, Risk
1075 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
1076 # publishing access rights to the DLP API service account executing
1077 # the long running DlpJob sending the notifications.
1078 # Format is projects/{project}/topics/{topic}.
1079 },
1080 },
1081 ],
1082 },
"locationId": "A String", # The geographic location to store and process the job. Reserved for
# future extensions.
"jobId": "A String", # The job id can contain uppercase and lowercase letters,
# numbers, hyphens, and underscores; that is, it must match the regular
# expression: `[a-zA-Z\d-_]+`. The maximum length is 100
# characters. Can be empty to allow the system to generate one.
}
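
A minimal Python sketch of the `jobId` constraint described above, useful for checking an ID on the client before calling `create`. The helper name `is_valid_job_id` is hypothetical and not part of the client library. The character class is written as `[A-Za-z\d_-]` because Python's `re` module rejects `\d-_` as a range, and the server's own validation remains authoritative.

```python
import re

# Same character set as the documented pattern, with the hyphen moved last
# so that Python's re module parses it as a literal. re.ASCII restricts \d
# to the digits 0-9.
_JOB_ID_PATTERN = re.compile(r"[A-Za-z\d_-]+", re.ASCII)
_MAX_JOB_ID_LENGTH = 100


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id is empty or satisfies the documented constraints."""
    if job_id == "":
        # An empty ID lets the system generate one.
        return True
    if len(job_id) > _MAX_JOB_ID_LENGTH:
        return False
    return _JOB_ID_PATTERN.fullmatch(job_id) is not None
```

For example, `is_valid_job_id("nightly-scan_01")` passes the check, while an ID containing a space does not.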
1090
1091 x__xgafv: string, V1 error format.
1092 Allowed values
1093 1 - v1 error format
1094 2 - v2 error format
1095
1096Returns:
1097 An object of the form:
1098
1099 { # Combines all of the information about a DLP job.
1100 "errors": [ # A stream of errors encountered running the job.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # Detailed error codes and messages. The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
1108 # used by [gRPC](https://github.com/grpc). Each `Status` message contains
1109 # three pieces of data: error code, error message, and error details.
1110 #
1111 # You can find out more about this error model and how to work with it in the
1112 # [API Design Guide](https://cloud.google.com/apis/design/errors).
1113 "message": "A String", # A developer-facing error message, which should be in English. Any
1114 # user-facing error message should be localized and sent in the
1115 # google.rpc.Status.details field, or localized by the client.
1116 "code": 42, # The status code, which should be an enum value of google.rpc.Code.
1117 "details": [ # A list of messages that carry the error details. There is a common set of
1118 # message types for APIs to use.
1119 {
1120 "a_key": "", # Properties of the object. Contains field @type with type URL.
1121 },
1122 ],
1123 },
1124 },
1125 ],
1126 "name": "A String", # The server-assigned name.
1127 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source.
"requestedOptions": { # Snapshot of the inspection configuration. # The configuration used for this job.
"snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of
# this run.
# The inspectTemplate contains a configuration (set of types of sensitive data
# to be detected) to be used anywhere you otherwise would normally specify
# InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates
# to learn more.
"updateTime": "A String", # Output only. The last update timestamp of an inspectTemplate.
"displayName": "A String", # Display name (max 256 chars).
1136 "description": "A String", # Short description (max 256 chars).
1137 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process.
1138 # When used with redactContent only info_types and min_likelihood are currently
1139 # used.
1140 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": { # Configuration to control the number of findings returned.
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
1145 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
1146 { # Max findings configuration per infoType, per content item or long
1147 # running DlpJob.
1148 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
1149 # info_type should be provided. If InfoTypeLimit does not have an
1150 # info_type, the DLP API applies the limit against all info_types that
1151 # are found but not specified in another InfoTypeLimit.
1152 "name": "A String", # Name of the information type. Either a name of your choosing when
1153 # creating a CustomInfoType, or one of the names listed
1154 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1155 # a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
1158 "maxFindings": 42, # Max findings limit for the given infoType.
1159 },
1160 ],
1161 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectJobConfig`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
1165 },
"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
1167 # POSSIBLE.
1168 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
1169 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
1170 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
1171 { # Custom information type provided by the user. Used to find domain-specific
1172 # sensitive information configurable to the data in question.
1173 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
1174 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1175 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1176 # google/re2 repository on GitHub.
1177 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1178 # specified, the entire match is returned. No more than 3 may be included.
1179 42,
1180 ],
1181 },
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing, such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
1191 },
1192 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
1193 # infoType, when the name matches one of existing infoTypes and that infoType
1194 # is specified in `InspectContent.info_types` field. Specifying the latter
1195 # adds findings to the one detected by the system. If built-in info type is
1196 # not specified in `InspectContent.info_types` list then the name is treated
1197 # as a custom info type.
1198 "name": "A String", # Name of the information type. Either a name of your choosing when
1199 # creating a CustomInfoType, or one of the names listed
1200 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1201 # a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
1204 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
1205 # be used to match sensitive information specific to the data, such as a list
1206 # of employee IDs or job titles.
1207 #
1208 # Dictionary words are case-insensitive and all characters other than letters
1209 # and digits in the unicode [Basic Multilingual
1210 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
1211 # will be replaced with whitespace when scanning for matches, so the
1212 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
1213 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
1214 # surrounding any match must be of a different type than the adjacent
1215 # characters within the word, so letters must be next to non-letters and
1216 # digits next to non-digits. For example, the dictionary word "jen" will
1217 # match the first three letters of the text "jen123" but will return no
1218 # matches for "jennifer".
1219 #
1220 # Dictionary words containing a large number of characters that are not
1221 # letters or digits may result in unexpected findings because such characters
1222 # are treated as whitespace. The
1223 # [limits](https://cloud.google.com/dlp/limits) page contains details about
1224 # the size limits of dictionaries. For dictionaries that do not fit within
1225 # these constraints, consider using `LargeCustomDictionaryConfig` in the
1226 # `StoredInfoType` API.
1227 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
1228 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
1229 # at least one phrase and every phrase must contain at least 2 characters
1230 # that are letters or digits. [required]
1231 "A String",
1232 ],
1233 },
1234 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
1235 # is accepted.
1236 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
1237 # Example: gs://[BUCKET_NAME]/dictionary.txt
1238 },
1239 },
1240 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
1241 # `InspectDataSource`. Not currently supported in `InspectContent`.
1242 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
1243 # `organizations/433245324/storedInfoTypes/432452342` or
1244 # `projects/project-id/storedInfoTypes/432452342`.
1245 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
1246 # inspection was created. Output-only field, populated by the system.
1247 },
1248 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order that they are specified. Not supported for the
1250 # `surrogate_type` CustomInfoType.
1251 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
1252 # `CustomInfoType` to alter behavior under certain circumstances, depending
1253 # on the specific details of the rule. Not supported for the `surrogate_type`
1254 # custom infoType.
1255 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
1256 # proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
1269 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
1270 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1271 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1272 # google/re2 repository on GitHub.
1273 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1274 # specified, the entire match is returned. No more than 3 may be included.
1275 42,
1276 ],
1277 },
1278 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
1279 # part of a detection rule.
1280 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
1281 # levels. For example, if a finding would be `POSSIBLE` without the
1282 # detection rule and `relative_likelihood` is 1, then it is upgraded to
1283 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
1284 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
1285 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
1286 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
1287 # a final likelihood of `LIKELY`.
1288 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
1289 },
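
The clamping behaviour of `relativeLikelihood` described above can be sketched in Python as follows. This is an illustration only: the function name `adjust_likelihood` and the `LEVELS` list are assumptions for this example, not part of the API, and the ordering of levels is taken from the description above.

```python
# Likelihood levels from lowest to highest, as named in the description above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base: str, relative_likelihood: int) -> str:
    """Shift base by relative_likelihood levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative_likelihood
    clamped = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[clamped]


# An adjustment of 1 followed by -1, starting from VERY_LIKELY: the first
# step is clamped at VERY_LIKELY, so the second step ends at LIKELY.
step_one = adjust_likelihood("VERY_LIKELY", 1)
step_two = adjust_likelihood(step_one, -1)
```

Because each adjustment is clamped on its own, a +1 followed by a -1 does not necessarily return to the starting level.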
1290 },
1291 },
1292 ],
1293 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
1294 # to be returned. It still can be used for rules matching.
1295 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
1296 # altered by a detection rule if the finding meets the criteria specified by
1297 # the rule. Defaults to `VERY_LIKELY` if not specified.
1298 },
1299 ],
1300 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
1301 # included in the response; see Finding.quote.
1302 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.
1305 { # Rule set for modifying a set of infoTypes to alter behavior under certain
1306 # circumstances, depending on the specific details of the rules within the set.
1307 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
1308 { # A single inspection rule to be applied to infoTypes, specified in
1309 # `InspectionRuleSet`.
1310 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
1311 # proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
1324 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
1325 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1326 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1327 # google/re2 repository on GitHub.
1328 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1329 # specified, the entire match is returned. No more than 3 may be included.
1330 42,
1331 ],
1332 },
1333 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
1334 # part of a detection rule.
1335 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
1336 # levels. For example, if a finding would be `POSSIBLE` without the
1337 # detection rule and `relative_likelihood` is 1, then it is upgraded to
1338 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
1339 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
1340 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
1341 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
1342 # a final likelihood of `LIKELY`.
1343 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
1344 },
1345 },
1346 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
1347 # `InspectionRuleSet` are removed from results.
1348 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
1349 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1350 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1351 # google/re2 repository on GitHub.
1352 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1353 # specified, the entire match is returned. No more than 3 may be included.
1354 42,
1355 ],
1356 },
1357 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # An infoType list in an ExclusionRule drops a finding when it overlaps or
# is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
1366 { # Type of information detected by the API.
1367 "name": "A String", # Name of the information type. Either a name of your choosing when
1368 # creating a CustomInfoType, or one of the names listed
1369 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1370 # a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
1373 ],
1374 },
1375 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
1376 # be used to match sensitive information specific to the data, such as a list
1377 # of employee IDs or job titles.
1378 #
1379 # Dictionary words are case-insensitive and all characters other than letters
1380 # and digits in the unicode [Basic Multilingual
1381 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
1382 # will be replaced with whitespace when scanning for matches, so the
1383 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
1384 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
1385 # surrounding any match must be of a different type than the adjacent
1386 # characters within the word, so letters must be next to non-letters and
1387 # digits next to non-digits. For example, the dictionary word "jen" will
1388 # match the first three letters of the text "jen123" but will return no
1389 # matches for "jennifer".
1390 #
1391 # Dictionary words containing a large number of characters that are not
1392 # letters or digits may result in unexpected findings because such characters
1393 # are treated as whitespace. The
1394 # [limits](https://cloud.google.com/dlp/limits) page contains details about
1395 # the size limits of dictionaries. For dictionaries that do not fit within
1396 # these constraints, consider using `LargeCustomDictionaryConfig` in the
1397 # `StoredInfoType` API.
1398 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
1399 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
1400 # at least one phrase and every phrase must contain at least 2 characters
1401 # that are letters or digits. [required]
1402 "A String",
1403 ],
1404 },
1405 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
1406 # is accepted.
1407 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
1408 # Example: gs://[BUCKET_NAME]/dictionary.txt
1409 },
1410 },
1411 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
1412 },
1413 },
1414 ],
1415 "infoTypes": [ # List of infoTypes this rule set is applied to.
1416 { # Type of information detected by the API.
1417 "name": "A String", # Name of the information type. Either a name of your choosing when
1418 # creating a CustomInfoType, or one of the names listed
1419 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1420 # a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
1423 ],
1424 },
1425 ],
1426 "contentOptions": [ # List of options defining data content to scan.
1427 # If empty, text, images, and other content will be included.
1428 "A String",
1429 ],
1430 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
1431 # InfoType values returned by ListInfoTypes or listed at
1432 # https://cloud.google.com/dlp/docs/infotypes-reference.
1433 #
1434 # When no InfoTypes or CustomInfoTypes are specified in a request, the
1435 # system may automatically choose what detectors to run. By default this may
1436 # be all types, but may change over time as detectors are updated.
1437 #
# If you need precise control and predictability as to what detectors are
# run, you should specify the InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.
{ # Type of information detected by the API.
1442 "name": "A String", # Name of the information type. Either a name of your choosing when
1443 # creating a CustomInfoType, or one of the names listed
1444 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1445 # a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
1448 ],
1449 },
"createTime": "A String", # Output only. The creation timestamp of an inspectTemplate.
"name": "A String", # Output only. The template name.
#
# The template will have one of the following formats:
# `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR
# `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`
},
"jobConfig": { # Controls what and how to inspect for findings. # Inspect config.
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage
# bucket.
1461 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
1462 # than this value then the rest of the bytes are omitted. Only one
1463 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
1464 "sampleMethod": "A String",
1465 "fileSet": { # Set of files to scan. # The set of one or more files to scan.
1466 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://&lt;bucket&gt;/&lt;path&gt;`. Trailing wildcard in the path is allowed.
#
1469 # If the url ends in a trailing slash, the bucket or directory represented
1470 # by the url will be scanned non-recursively (content in sub-directories
1471 # will not be scanned). This means that `gs://mybucket/` is equivalent to
1472 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
1473 # `gs://mybucket/directory/*`.
1474 #
1475 # Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
1480 #
1481 # Included files are those that match at least one item in `include_regex` and
1482 # do not match any items in `exclude_regex`. Note that a file that matches
1483 # items from both lists will _not_ be included. For a match to occur, the
1484 # entire file path (i.e., everything in the url after the bucket name) must
1485 # match the regular expression.
1486 #
1487 # For example, given the input `{bucket_name: "mybucket", include_regex:
1488 # ["directory1/.*"], exclude_regex:
1489 # ["directory1/excluded.*"]}`:
1490 #
1491 # * `gs://mybucket/directory1/myfile` will be included
1492 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
1493 # across `/`)
1494 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
1495 # full path doesn't match any items in `include_regex`)
1496 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
1497 # matches an item in `exclude_regex`)
1498 #
1499 # If `include_regex` is left empty, it will match all files by default
1500 # (this is equivalent to setting `include_regex: [".*"]`).
1501 #
1502 # Some other common use cases:
1503 #
1504 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
1505 # files in `mybucket` except for .pdf files
1506 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
1507 # include all files directly under `gs://mybucket/directory/`, without matching
1508 # across `/`
1509 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
1510 # the bucket that match at least one of these regular expressions will be
1511 # excluded from the scan.
1512 #
1513 # Regular expressions use RE2
1514 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
1515 # under the google/re2 repository on GitHub.
1516 "A String",
1517 ],
1518 "bucketName": "A String", # The name of a Cloud Storage bucket. Required.
1519 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in
1520 # the bucket that match at least one of these regular expressions will be
1521 # included in the set of files, except for those that also match an item in
1522 # `exclude_regex`. Leaving this field empty will match all files by default
1523 # (this is equivalent to including `.*` in the list).
1524 #
1525 # Regular expressions use RE2
1526 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
1527 # under the google/re2 repository on GitHub.
1528 "A String",
1529 ],
1530 },
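
The include/exclude rule for `regexFileSet` described above can be approximated locally with the Python sketch below. The helper `is_included` is hypothetical and not part of the client library. It uses Python's `re` module rather than RE2, so it should only be trusted for patterns that behave the same in both engines, and it takes the path after the bucket name as input.

```python
import re
from typing import Sequence


def is_included(path: str,
                include_regex: Sequence[str] = (),
                exclude_regex: Sequence[str] = ()) -> bool:
    """Return True if path matches an include pattern and no exclude pattern.

    path is everything in the url after the bucket name. The whole path must
    match a pattern, so re.fullmatch is used instead of re.search.
    """
    # An empty include list is treated as [".*"], which matches every file.
    includes = list(include_regex) or [".*"]
    matches_include = any(re.fullmatch(p, path) for p in includes)
    matches_exclude = any(re.fullmatch(p, path) for p in exclude_regex)
    return matches_include and not matches_exclude


# The documented example: include "directory1/.*", exclude "directory1/excluded.*".
include = ["directory1/.*"]
exclude = ["directory1/excluded.*"]
results = {
    path: is_included(path, include, exclude)
    for path in [
        "directory1/myfile",
        "directory1/directory2/myfile",
        "directory0/directory1/myfile",
        "directory1/excludedfile",
    ]
}
```

With these inputs, the first two paths are included and the last two are not, matching the bullet list in the description above.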
1531 },
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
# Images are scanned only as binary if the specified region
# does not support image inspection and no file_types were specified.
# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.
"A String",
],
1548 },
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.
"partitionId": { # Datastore partition ID.
# A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
1557 "projectId": "A String", # The ID of the project to which the entities belong.
1558 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
1559 },
1560 "kind": { # A representation of a Datastore kind. # The kind to process.
1561 "name": "A String", # The name of the kind.
1562 },
1563 },
1564 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.
1565 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip
1566 # inspection of entire columns which you know have no findings.
1567 { # General identifier of a data field in a storage service.
1568 "name": "A String", # Name describing the field.
1569 },
1570 ],
1571 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
1572 # rest of the rows are omitted. If not set, or if set to 0, all rows will be
1573 # scanned. Only one of rows_limit and rows_limit_percent can be specified.
1574 # Cannot be used in conjunction with TimespanConfig.
1575 "sampleMethod": "A String",
1576 "identifyingFields": [ # Table fields that may uniquely identify a row within the table. When
1577 # `actions.saveFindings.outputConfig.table` is specified, the values of
1578 # columns specified here are available in the output table under
1579 # `location.content_locations.record_location.record_key.id_values`. Nested
1580 # fields such as `person.birthdate.year` are allowed.
1581 { # General identifier of a data field in a storage service.
1582 "name": "A String", # Name describing the field.
1583 },
1584 ],
1585 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
1586 # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
1587 # 100 mean no limit. Defaults to 0. Only one of rows_limit and
1588 # rows_limit_percent can be specified. Cannot be used in conjunction with
1589 # TimespanConfig.
1590 "tableReference": { # Message defining the location of a BigQuery table. # Complete BigQuery table reference.
1591 # A table is uniquely identified by its project_id, dataset_id, and table_name. Within a query
1592 # a table is often referenced with a string in the format of:
1593 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
1594 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
1595 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
1596 # If omitted, project ID is inferred from the API call.
1597 "tableId": "A String", # Name of the table.
1598 "datasetId": "A String", # Dataset ID of the table.
1599 },
1600 },
1601 "timespanConfig": { # Configuration of the timespan of the items to include in scanning.
1602 # Currently only supported when inspecting Google Cloud Storage and BigQuery.
1603 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
1604 # Used for data sources like Datastore and BigQuery.
1605 #
1606 # For BigQuery:
1607 # Required to filter out rows based on the given start and
1608 # end times. If not specified and the table was modified between the given
1609 # start and end times, the entire table will be scanned.
1610 # The timestamp field must be a BigQuery column of type `INTEGER`, `DATE`,
1611 # `TIMESTAMP`, or `DATETIME`.
1612 #
1613 # For Datastore:
1614 # The only valid data type of the timestamp field is `TIMESTAMP`.
1615 # A Datastore entity will be scanned if the timestamp property does not
1616 # exist or its value is empty or invalid.
1617 "name": "A String", # Name describing the field.
1618 },
1619 "endTime": "A String", # Exclude files or rows newer than this value.
1620 # If set to zero, no upper time limit is applied.
1621 "startTime": "A String", # Exclude files or rows older than this value.
1622 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
1623 # a valid start_time to avoid scanning files that have not been modified
1624 # since the last time the JobTrigger executed. This will be based on the
1625 # time of the execution of the last run of the JobTrigger.
1626 },
1627 "hybridOptions": { # Configuration to control jobs where the content being inspected is outside of Google Cloud Platform. # Hybrid inspection options.
1628 # Early access feature is in a pre-release state and might change or have
1629 # limited support. For more information, see
1630 # https://cloud.google.com/products#product-launch-stages.
1632 "tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings
1633 # meaningful such as the columns that are primary keys.
1634 "identifyingFields": [ # The columns that are the primary keys for table objects included in
1635 # ContentItem. A copy of this cell's value will be stored alongside
1636 # each finding so that the finding can be traced to the specific row it came
1637 # from. No more than 3 may be provided.
1638 { # General identifier of a data field in a storage service.
1639 "name": "A String", # Name describing the field.
1640 },
1641 ],
1642 },
1643 "labels": { # To organize findings, these labels will be added to each finding.
1644 #
1645 # Label keys must be between 1 and 63 characters long and must conform
1646 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
1647 #
1648 # Label values must be between 0 and 63 characters long and must conform
1649 # to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.
1650 #
1651 # No more than 10 labels can be associated with a given finding.
1652 #
1653 # Examples:
1654 # * `"environment" : "production"`
1655 # * `"pipeline" : "etl"`
1656 "a_key": "A String",
1657 },
1658 "requiredFindingLabelKeys": [ # Labels that each inspection request must include within its
1659 # 'finding_labels' map. A request may contain other labels, but any request
1660 # missing one of these will be rejected.
1661 #
1662 # Label keys must be between 1 and 63 characters long and must conform
1663 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
1664 #
1665 # No more than 10 keys can be required.
1666 "A String",
1667 ],
1668 "description": "A String", # A short description of where the data is coming from. Will be stored once
1669 # in the job. 256 max length.
1670 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001671 },
1672 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
1673 # When used with redactContent only info_types and min_likelihood are currently
1674 # used.
1675 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
Dan O'Mearadd494642020-05-01 07:42:23 -07001676 "limits": { # Configuration to control the number of findings returned.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001677 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
1678 # When set within `InspectContentRequest`, the maximum returned is 2000
1679 # regardless of whether this is set higher.
1680 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
1681 { # Max findings configuration per infoType, per content item or long
1682 # running DlpJob.
1683 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
1684 # info_type should be provided. If InfoTypeLimit does not have an
1685 # info_type, the DLP API applies the limit against all info_types that
1686 # are found but not specified in another InfoTypeLimit.
1687 "name": "A String", # Name of the information type. Either a name of your choosing when
1688 # creating a CustomInfoType, or one of the names listed
1689 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1690 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07001691 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001692 },
1693 "maxFindings": 42, # Max findings limit for the given infoType.
1694 },
1695 ],
1696 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
Dan O'Mearadd494642020-05-01 07:42:23 -07001697 # When set within `InspectJobConfig`,
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001698 # the maximum returned is 2000 regardless of whether this is set higher.
1699 # When set within `InspectContentRequest`, this field is ignored.
1700 },
1701 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
1702 # POSSIBLE.
1703 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
1704 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
1705 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
1706 { # Custom information type provided by the user. Used to find domain-specific
1707 # sensitive information configurable to the data in question.
1708 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
1709 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1710 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1711 # google/re2 repository on GitHub.
1712 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1713 # specified, the entire match is returned. No more than 3 may be included.
1714 42,
1715 ],
1716 },
1717 "surrogateType": { # Message for detecting output from deidentification transformations
1718 # that support reversing,
1719 # such as
1720 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
1721 # These types of transformations are
1722 # those that perform pseudonymization, thereby producing a "surrogate" as
1723 # output. This should be used in conjunction with a field on the
1724 # transformation such as `surrogate_info_type`. This CustomInfoType does
1725 # not support the use of `detection_rules`.
1726 },
1727 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
1728 # infoType, when the name matches one of the existing infoTypes and that infoType
1729 # is specified in the `InspectContent.info_types` field. Specifying the latter
1730 # adds findings to the ones detected by the system. If the built-in info type is
1731 # not specified in the `InspectContent.info_types` list, then the name is treated
1732 # as a custom info type.
1733 "name": "A String", # Name of the information type. Either a name of your choosing when
1734 # creating a CustomInfoType, or one of the names listed
1735 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1736 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07001737 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001738 },
1739 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
1740 # be used to match sensitive information specific to the data, such as a list
1741 # of employee IDs or job titles.
1742 #
1743 # Dictionary words are case-insensitive and all characters other than letters
1744 # and digits in the Unicode [Basic Multilingual
1745 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
1746 # will be replaced with whitespace when scanning for matches, so the
1747 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
1748 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
1749 # surrounding any match must be of a different type than the adjacent
1750 # characters within the word, so letters must be next to non-letters and
1751 # digits next to non-digits. For example, the dictionary word "jen" will
1752 # match the first three letters of the text "jen123" but will return no
1753 # matches for "jennifer".
1754 #
1755 # Dictionary words containing a large number of characters that are not
1756 # letters or digits may result in unexpected findings because such characters
1757 # are treated as whitespace. The
1758 # [limits](https://cloud.google.com/dlp/limits) page contains details about
1759 # the size limits of dictionaries. For dictionaries that do not fit within
1760 # these constraints, consider using `LargeCustomDictionaryConfig` in the
1761 # `StoredInfoType` API.
1762 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
1763 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
1764 # at least one phrase and every phrase must contain at least 2 characters
1765 # that are letters or digits. [required]
1766 "A String",
1767 ],
1768 },
1769 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
1770 # is accepted.
1771 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
1772 # Example: gs://[BUCKET_NAME]/dictionary.txt
1773 },
1774 },
1775 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
1776 # `InspectDataSource`. Not currently supported in `InspectContent`.
1777 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
1778 # `organizations/433245324/storedInfoTypes/432452342` or
1779 # `projects/project-id/storedInfoTypes/432452342`.
1780 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
1781 # inspection was created. Output-only field, populated by the system.
1782 },
1783 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
1784 # Rules are applied in the order in which they are specified. Not supported for the
1785 # `surrogate_type` CustomInfoType.
1786 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
1787 # `CustomInfoType` to alter behavior under certain circumstances, depending
1788 # on the specific details of the rule. Not supported for the `surrogate_type`
1789 # custom infoType.
1790 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
1791 # proximity of hotwords.
1792 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
1793 # The total length of the window cannot exceed 1000 characters. Note that
1794 # the finding itself will be included in the window, so that hotwords may
1795 # be used to match substrings of the finding itself. For example, the
1796 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
1797 # adjusted upwards if the area code is known to be the local area code of
1798 # a company office using the hotword regex "\(xxx\)", where "xxx"
1799 # is the area code in question.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001801 "windowBefore": 42, # Number of characters before the finding to consider.
Dan O'Mearadd494642020-05-01 07:42:23 -07001802 "windowAfter": 42, # Number of characters after the finding to consider.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001803 },
1804 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
1805 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1806 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1807 # google/re2 repository on GitHub.
1808 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1809 # specified, the entire match is returned. No more than 3 may be included.
1810 42,
1811 ],
1812 },
1813 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
1814 # part of a detection rule.
1815 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
1816 # levels. For example, if a finding would be `POSSIBLE` without the
1817 # detection rule and `relative_likelihood` is 1, then it is upgraded to
1818 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
1819 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
1820 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
1821 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
1822 # a final likelihood of `LIKELY`.
1823 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
1824 },
1825 },
1826 },
1827 ],
1828 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE, this infoType will not cause a finding
1829 # to be returned. It can still be used for rule matching.
1830 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
1831 # altered by a detection rule if the finding meets the criteria specified by
1832 # the rule. Defaults to `VERY_LIKELY` if not specified.
1833 },
1834 ],
1835 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
1836 # included in the response; see Finding.quote.
1837 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
1838 # Exclusion rules contained in the set are executed last; other
1839 # rules are executed in the order in which they are specified for each info type.
1840 { # Rule set for modifying a set of infoTypes to alter behavior under certain
1841 # circumstances, depending on the specific details of the rules within the set.
1842 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
1843 { # A single inspection rule to be applied to infoTypes, specified in
1844 # `InspectionRuleSet`.
1845 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
1846 # proximity of hotwords.
1847 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
1848 # The total length of the window cannot exceed 1000 characters. Note that
1849 # the finding itself will be included in the window, so that hotwords may
1850 # be used to match substrings of the finding itself. For example, the
1851 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
1852 # adjusted upwards if the area code is known to be the local area code of
1853 # a company office using the hotword regex "\(xxx\)", where "xxx"
1854 # is the area code in question.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001856 "windowBefore": 42, # Number of characters before the finding to consider.
Dan O'Mearadd494642020-05-01 07:42:23 -07001857 "windowAfter": 42, # Number of characters after the finding to consider.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001858 },
1859 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
1860 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1861 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1862 # google/re2 repository on GitHub.
1863 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1864 # specified, the entire match is returned. No more than 3 may be included.
1865 42,
1866 ],
1867 },
1868 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
1869 # part of a detection rule.
1870 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
1871 # levels. For example, if a finding would be `POSSIBLE` without the
1872 # detection rule and `relative_likelihood` is 1, then it is upgraded to
1873 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
1874 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
1875 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
1876 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
1877 # a final likelihood of `LIKELY`.
1878 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
1879 },
1880 },
1881 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
1882 # `InspectionRuleSet` are removed from results.
1883 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
1884 "pattern": "A String", # Pattern defining the regular expression. Its syntax
1885 # (https://github.com/google/re2/wiki/Syntax) can be found under the
1886 # google/re2 repository on GitHub.
1887 "groupIndexes": [ # The index of the submatch to extract as findings. When not
1888 # specified, the entire match is returned. No more than 3 may be included.
1889 42,
1890 ],
1891 },
1892 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
1893 "infoTypes": [ # The infoType list in an ExclusionRule drops a finding when it overlaps with or
1894 # is contained within a finding of an infoType from this list. For
1895 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
1896 # `exclusion_rule` containing `exclude_info_types.info_types` with
1897 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
1898 # with an EMAIL_ADDRESS finding.
1899 # As a result, "555-222-2222@example.org" generates only a single
1900 # finding, namely the email address.
1901 { # Type of information detected by the API.
1902 "name": "A String", # Name of the information type. Either a name of your choosing when
1903 # creating a CustomInfoType, or one of the names listed
1904 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1905 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07001906 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001907 },
1908 ],
1909 },
1910 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
1911 # be used to match sensitive information specific to the data, such as a list
1912 # of employee IDs or job titles.
1913 #
1914 # Dictionary words are case-insensitive and all characters other than letters
1915 # and digits in the Unicode [Basic Multilingual
1916 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
1917 # will be replaced with whitespace when scanning for matches, so the
1918 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
1919 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
1920 # surrounding any match must be of a different type than the adjacent
1921 # characters within the word, so letters must be next to non-letters and
1922 # digits next to non-digits. For example, the dictionary word "jen" will
1923 # match the first three letters of the text "jen123" but will return no
1924 # matches for "jennifer".
1925 #
1926 # Dictionary words containing a large number of characters that are not
1927 # letters or digits may result in unexpected findings because such characters
1928 # are treated as whitespace. The
1929 # [limits](https://cloud.google.com/dlp/limits) page contains details about
1930 # the size limits of dictionaries. For dictionaries that do not fit within
1931 # these constraints, consider using `LargeCustomDictionaryConfig` in the
1932 # `StoredInfoType` API.
1933 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
1934 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
1935 # at least one phrase and every phrase must contain at least 2 characters
1936 # that are letters or digits. [required]
1937 "A String",
1938 ],
1939 },
1940 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
1941 # is accepted.
1942 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
1943 # Example: gs://[BUCKET_NAME]/dictionary.txt
1944 },
1945 },
1946 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
1947 },
1948 },
1949 ],
1950 "infoTypes": [ # List of infoTypes this rule set is applied to.
1951 { # Type of information detected by the API.
1952 "name": "A String", # Name of the information type. Either a name of your choosing when
1953 # creating a CustomInfoType, or one of the names listed
1954 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1955 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07001956 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001957 },
1958 ],
1959 },
1960 ],
1961 "contentOptions": [ # List of options defining data content to scan.
1962 # If empty, text, images, and other content will be included.
1963 "A String",
1964 ],
1965 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
1966 # InfoType values returned by ListInfoTypes or listed at
1967 # https://cloud.google.com/dlp/docs/infotypes-reference.
1968 #
1969 # When no InfoTypes or CustomInfoTypes are specified in a request, the
1970 # system may automatically choose what detectors to run. By default this may
1971 # be all types, but may change over time as detectors are updated.
1972 #
Dan O'Mearadd494642020-05-01 07:42:23 -07001973 # If you need precise control and predictability as to what detectors are
1974 # run, specify the InfoTypes listed in the reference;
1975 # otherwise a default list will be used, which may change over time.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001976 { # Type of information detected by the API.
1977 "name": "A String", # Name of the information type. Either a name of your choosing when
1978 # creating a CustomInfoType, or one of the names listed
1979 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
1980 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07001981 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001982 },
1983 ],
1984 },
1985 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
1986 # `inspect_config` will be merged into the values persisted as part of the
1987 # template.
1988 "actions": [ # Actions to execute at the completion of the job.
1989 { # A task to execute on the completion of a job.
1990 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
1991 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
1992 # OutputStorageConfig. Only a single instance of this action can be
1993 # specified.
1994 # Compatible with: Inspect, Risk
Dan O'Mearadd494642020-05-01 07:42:23 -07001995 "outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07001996 "table": { # Message defining the location of a BigQuery table. # Store findings in an existing table or a new table in an existing
1997 # dataset. If table_id is not set, a new one will be generated
1998 # for you with the following format:
1999 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
2000 # generating the date details.
2001 #
2002 # For Inspect, each column in an existing output table must have the same
2003 # name, type, and mode of a field in the `Finding` object.
2004 #
2005 # For Risk, an existing output table should be the output of a previous
2006 # Risk analysis job run on the same source table, with the same privacy
2007 # metric and quasi-identifiers. Risk jobs that analyze the same table but
2008 # compute a different privacy metric, or use different sets of
2009 # quasi-identifiers, cannot store their results in the same table.
2010 # A table is uniquely identified by its project_id, dataset_id, and table_name. Within a query
2011 # a table is often referenced with a string in the format of:
Dan O'Mearadd494642020-05-01 07:42:23 -07002012 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
2013 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002014 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
2015 # If omitted, project ID is inferred from the API call.
2016 "tableId": "A String", # Name of the table.
2017 "datasetId": "A String", # Dataset ID of the table.
2018 },
2019 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
2020 # used for Inspect and must be unspecified for Risk jobs. Columns are derived
2021 # from the `Finding` object. If appending to an existing table, any columns
2022 # from the predefined schema that are missing will be added. No columns in
2023 # the existing table will be deleted.
2024 #
2025 # If unspecified, then all available columns will be used for a new table or
2026 # an (existing) table with no schema, and no changes will be made to an
2027 # existing table that has a schema.
Dan O'Mearadd494642020-05-01 07:42:23 -07002028 # Only for use with external storage.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002029 },
2030 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002031 "jobNotificationEmails": { # Enable email notification for project owners and editors on a job's
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002032 # completion/failure.
2034 },
2035 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
2036 # Command Center (CSCC Alpha).
2037 # This action is only available for projects that are part of
2038 # an organization and whitelisted for the alpha Cloud Security Command
2039 # Center.
2040 # The action will publish the count of finding instances and their info types.
2041 # The summary of findings will be persisted in CSCC and is governed by the CSCC
2042 # service-specific policy; see https://cloud.google.com/terms/service-terms
2043 # Only a single instance of this action can be specified.
2044 # Compatible with: Inspect
2045 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002046 "publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.
2047 # will publish a metric to Stackdriver for each infoType requested, reporting
2048 # how many findings were found for it. CustomDetectors will be bucketed
2049 # as 'Custom' under the Stackdriver label 'info_type'.
2050 },
"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
# results of the DlpJob will be applied to the entry for the resource scanned
# in Cloud Data Catalog. Any labels previously written by another DlpJob will
# be deleted. InfoType naming patterns are strictly enforced when using this
# feature. Note that the findings will be persisted in Cloud Data Catalog
# storage and are governed by Data Catalog service-specific policy; see
# https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified, and it is allowed
# only if all resources being scanned are BigQuery tables.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into the given Pub/Sub topic when the DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must grant
# publishing access rights to the DLP API service account executing
# the long-running DlpJob that sends the notifications.
# Format is projects/{project}/topics/{topic}.
},
2072 },
2073 ],
2074 },
2075 },
2076 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job.
"infoTypeStats": [ # Statistics of how many instances of each info type were found during
# the inspect job.
2079 { # Statistics regarding a specific InfoType.
2080 "count": "A String", # Number of findings for this infoType.
2081 "infoType": { # Type of information detected by the API. # The type of finding this stat is for.
2082 "name": "A String", # Name of the information type. Either a name of your choosing when
2083 # creating a CustomInfoType, or one of the names listed
2084 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
2085 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07002086 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002087 },
2088 },
2089 ],
2090 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process.
2091 "processedBytes": "A String", # Total size in bytes that were processed.
Dan O'Mearadd494642020-05-01 07:42:23 -07002092 "hybridStats": { # Statistics related to processing hybrid inspect requests. # Statistics related to the processing of hybrid inspect.
2093 # Early access feature is in a pre-release state and might change or have
2094 # limited support. For more information, see
2095 # https://cloud.google.com/products#product-launch-stages.
2096 "abortedCount": "A String", # The number of hybrid inspection requests aborted because the job ran
2097 # out of quota or was ended before they could be processed.
2098 "pendingCount": "A String", # The number of hybrid requests currently being processed. Only populated
2099 # when called via method `getDlpJob`.
2100 # A burst of traffic may cause hybrid inspect requests to be enqueued.
2101 # Processing will take place as quickly as possible, but resource limitations
2102 # may impact how long a request is enqueued for.
2103 "processedCount": "A String", # The number of hybrid inspection requests processed within this job.
2104 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002105 },
2106 },
2107 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source.
Dan O'Mearadd494642020-05-01 07:42:23 -07002108 "numericalStatsResult": { # Result of the numerical stats computation. # Numerical stats result
"quantileValues": [ # List of 99 values that partition the set of field values into 100
# equal-sized buckets.
2111 { # Set of primitive values supported by the system.
2112 # Note that for the purposes of inspection or transformation, the number
2113 # of bytes considered to comprise a 'Value' is based on its representation
2114 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2115 # 123456789, the number of bytes would be counted as 9, even though an
2116 # int64 only holds up to 8 bytes of data.
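# As an illustration of this byte-counting rule (a sketch of the arithmetic only,
# not the service's implementation), the count can be reproduced in Python with
#   len(str(123456789).encode("utf-8"))
# which yields 9, one byte per ASCII digit.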
Dan O'Mearadd494642020-05-01 07:42:23 -07002117 "floatValue": 3.14, # float
2118 "timestampValue": "A String", # timestamp
2119 "dayOfWeekValue": "A String", # day of week
2120 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002121 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2122 # types are google.type.Date and `google.protobuf.Timestamp`.
2123 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2124 # to allow the value "24:00:00" for scenarios like business closing time.
2125 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2126 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2127 # allow the value 60 if it allows leap-seconds.
2128 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2129 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002130 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002131 # and time zone are either specified elsewhere or are not significant. The date
2132 # is relative to the Proleptic Gregorian Calendar. This can represent:
2133 #
2134 # * A full date, with non-zero year, month and day values
2135 # * A month and day value, with a zero year, e.g. an anniversary
2136 # * A year on its own, with zero month and day values
2137 # * A year and month value, with a zero day, e.g. a credit card expiration date
2138 #
2139 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002140 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2141 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002142 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2143 # if specifying a year by itself or a year and month where the day is not
2144 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002145 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2146 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002147 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002148 "stringValue": "A String", # string
2149 "booleanValue": True or False, # boolean
2150 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002151 },
2152 ],
2153 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column.
2154 # Note that for the purposes of inspection or transformation, the number
2155 # of bytes considered to comprise a 'Value' is based on its representation
2156 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2157 # 123456789, the number of bytes would be counted as 9, even though an
2158 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002159 "floatValue": 3.14, # float
2160 "timestampValue": "A String", # timestamp
2161 "dayOfWeekValue": "A String", # day of week
2162 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002163 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2164 # types are google.type.Date and `google.protobuf.Timestamp`.
2165 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2166 # to allow the value "24:00:00" for scenarios like business closing time.
2167 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2168 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2169 # allow the value 60 if it allows leap-seconds.
2170 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2171 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002172 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002173 # and time zone are either specified elsewhere or are not significant. The date
2174 # is relative to the Proleptic Gregorian Calendar. This can represent:
2175 #
2176 # * A full date, with non-zero year, month and day values
2177 # * A month and day value, with a zero year, e.g. an anniversary
2178 # * A year on its own, with zero month and day values
2179 # * A year and month value, with a zero day, e.g. a credit card expiration date
2180 #
2181 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002182 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2183 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002184 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2185 # if specifying a year by itself or a year and month where the day is not
2186 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002187 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2188 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002189 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002190 "stringValue": "A String", # string
2191 "booleanValue": True or False, # boolean
2192 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002193 },
2194 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column.
2195 # Note that for the purposes of inspection or transformation, the number
2196 # of bytes considered to comprise a 'Value' is based on its representation
2197 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2198 # 123456789, the number of bytes would be counted as 9, even though an
2199 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002200 "floatValue": 3.14, # float
2201 "timestampValue": "A String", # timestamp
2202 "dayOfWeekValue": "A String", # day of week
2203 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002204 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2205 # types are google.type.Date and `google.protobuf.Timestamp`.
2206 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2207 # to allow the value "24:00:00" for scenarios like business closing time.
2208 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2209 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2210 # allow the value 60 if it allows leap-seconds.
2211 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2212 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002213 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002214 # and time zone are either specified elsewhere or are not significant. The date
2215 # is relative to the Proleptic Gregorian Calendar. This can represent:
2216 #
2217 # * A full date, with non-zero year, month and day values
2218 # * A month and day value, with a zero year, e.g. an anniversary
2219 # * A year on its own, with zero month and day values
2220 # * A year and month value, with a zero day, e.g. a credit card expiration date
2221 #
2222 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002223 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2224 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002225 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2226 # if specifying a year by itself or a year and month where the day is not
2227 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002228 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2229 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002230 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002231 "stringValue": "A String", # string
2232 "booleanValue": True or False, # boolean
2233 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002234 },
2235 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002236 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an # K-map result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002237 # estimation, not exact values.
"kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value
# doesn't correspond to any such interval, the associated frequency is
# zero. For example, the following records:
# {min_anonymity: 1, max_anonymity: 1, frequency: 17}
# {min_anonymity: 2, max_anonymity: 3, frequency: 42}
# {min_anonymity: 5, max_anonymity: 10, frequency: 99}
# mean that there are no records with an estimated anonymity of 4 or
# larger than 10.
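# A hedged sketch of checking these buckets from Python (the names `buckets` and `k`
# are hypothetical; the int() calls assume the int64 fields arrive as strings, as the
# "A String" placeholders suggest):
#   covered = any(int(b["minAnonymity"]) <= k <= int(b["maxAnonymity"]) for b in buckets)
# With the three example records above, `covered` would be False for k = 4.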
2246 { # A KMapEstimationHistogramBucket message with the following values:
2247 # min_anonymity: 3
2248 # max_anonymity: 5
2249 # frequency: 42
2250 # means that there are 42 records whose quasi-identifier values correspond
2251 # to 3, 4 or 5 people in the overlying population. An important particular
2252 # case is when min_anonymity = max_anonymity = 1: the frequency field then
2253 # corresponds to the number of uniquely identifiable records.
2254 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
2255 # number of classes returned per bucket is capped at 20.
2256 { # A tuple of values for the quasi-identifier columns.
2257 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values.
2258 "quasiIdsValues": [ # The quasi-identifier values.
2259 { # Set of primitive values supported by the system.
2260 # Note that for the purposes of inspection or transformation, the number
2261 # of bytes considered to comprise a 'Value' is based on its representation
2262 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2263 # 123456789, the number of bytes would be counted as 9, even though an
2264 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002265 "floatValue": 3.14, # float
2266 "timestampValue": "A String", # timestamp
2267 "dayOfWeekValue": "A String", # day of week
2268 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002269 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2270 # types are google.type.Date and `google.protobuf.Timestamp`.
2271 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2272 # to allow the value "24:00:00" for scenarios like business closing time.
2273 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2274 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2275 # allow the value 60 if it allows leap-seconds.
2276 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2277 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002278 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002279 # and time zone are either specified elsewhere or are not significant. The date
2280 # is relative to the Proleptic Gregorian Calendar. This can represent:
2281 #
2282 # * A full date, with non-zero year, month and day values
2283 # * A month and day value, with a zero year, e.g. an anniversary
2284 # * A year on its own, with zero month and day values
2285 # * A year and month value, with a zero day, e.g. a credit card expiration date
2286 #
2287 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002288 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2289 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002290 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2291 # if specifying a year by itself or a year and month where the day is not
2292 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002293 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2294 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002295 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002296 "stringValue": "A String", # string
2297 "booleanValue": True or False, # boolean
2298 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002299 },
2300 ],
2301 },
2302 ],
2303 "minAnonymity": "A String", # Always positive.
2304 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
2305 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity.
2306 "bucketSize": "A String", # Number of records within these anonymity bounds.
2307 },
2308 ],
2309 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002310 "kAnonymityResult": { # Result of the k-anonymity computation. # K-anonymity result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002311 "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes.
Dan O'Mearadd494642020-05-01 07:42:23 -07002312 { # Histogram of k-anonymity equivalence classes.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002313 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
2314 # classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same l-diversity value.
2316 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per
2317 # quasi-identifier column in the original KAnonymity metric message.
2318 # The order is always the same as the original request.
2319 { # Set of primitive values supported by the system.
2320 # Note that for the purposes of inspection or transformation, the number
2321 # of bytes considered to comprise a 'Value' is based on its representation
2322 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2323 # 123456789, the number of bytes would be counted as 9, even though an
2324 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002325 "floatValue": 3.14, # float
2326 "timestampValue": "A String", # timestamp
2327 "dayOfWeekValue": "A String", # day of week
2328 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002329 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2330 # types are google.type.Date and `google.protobuf.Timestamp`.
2331 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2332 # to allow the value "24:00:00" for scenarios like business closing time.
2333 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2334 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2335 # allow the value 60 if it allows leap-seconds.
2336 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2337 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002338 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002339 # and time zone are either specified elsewhere or are not significant. The date
2340 # is relative to the Proleptic Gregorian Calendar. This can represent:
2341 #
2342 # * A full date, with non-zero year, month and day values
2343 # * A month and day value, with a zero year, e.g. an anniversary
2344 # * A year on its own, with zero month and day values
2345 # * A year and month value, with a zero day, e.g. a credit card expiration date
2346 #
2347 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002348 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2349 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002350 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2351 # if specifying a year by itself or a year and month where the day is not
2352 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002353 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2354 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002355 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002356 "stringValue": "A String", # string
2357 "booleanValue": True or False, # boolean
2358 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002359 },
2360 ],
"equivalenceClassSize": "A String", # Size of the equivalence class, for example, the number of rows with the
# above set of values.
2363 },
2364 ],
2365 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
2366 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket.
2367 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket.
2368 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
2369 },
2370 ],
2371 },
"lDiversityResult": { # Result of the l-diversity computation. # L-diversity result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002373 "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies.
Dan O'Mearadd494642020-05-01 07:42:23 -07002374 { # Histogram of l-diversity equivalence class sensitive value frequencies.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002375 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
2376 # classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same l-diversity value.
2378 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class.
2379 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence
2380 # class. The order is always the same as the original request.
2381 { # Set of primitive values supported by the system.
2382 # Note that for the purposes of inspection or transformation, the number
2383 # of bytes considered to comprise a 'Value' is based on its representation
2384 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2385 # 123456789, the number of bytes would be counted as 9, even though an
2386 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002387 "floatValue": 3.14, # float
2388 "timestampValue": "A String", # timestamp
2389 "dayOfWeekValue": "A String", # day of week
2390 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002391 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2392 # types are google.type.Date and `google.protobuf.Timestamp`.
2393 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2394 # to allow the value "24:00:00" for scenarios like business closing time.
2395 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2396 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2397 # allow the value 60 if it allows leap-seconds.
2398 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2399 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002400 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002401 # and time zone are either specified elsewhere or are not significant. The date
2402 # is relative to the Proleptic Gregorian Calendar. This can represent:
2403 #
2404 # * A full date, with non-zero year, month and day values
2405 # * A month and day value, with a zero year, e.g. an anniversary
2406 # * A year on its own, with zero month and day values
2407 # * A year and month value, with a zero day, e.g. a credit card expiration date
2408 #
2409 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002410 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2411 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002412 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2413 # if specifying a year by itself or a year and month where the day is not
2414 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002415 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2416 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002417 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002418 "stringValue": "A String", # string
2419 "booleanValue": True or False, # boolean
2420 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002421 },
2422 ],
2423 "topSensitiveValues": [ # Estimated frequencies of top sensitive values.
2424 { # A value of a field, including its frequency.
2425 "count": "A String", # How many times the value is contained in the field.
2426 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
2427 # Note that for the purposes of inspection or transformation, the number
2428 # of bytes considered to comprise a 'Value' is based on its representation
2429 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2430 # 123456789, the number of bytes would be counted as 9, even though an
2431 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07002432 "floatValue": 3.14, # float
2433 "timestampValue": "A String", # timestamp
2434 "dayOfWeekValue": "A String", # day of week
2435 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002436 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2437 # types are google.type.Date and `google.protobuf.Timestamp`.
2438 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2439 # to allow the value "24:00:00" for scenarios like business closing time.
2440 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2441 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2442 # allow the value 60 if it allows leap-seconds.
2443 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2444 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002445 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002446 # and time zone are either specified elsewhere or are not significant. The date
2447 # is relative to the Proleptic Gregorian Calendar. This can represent:
2448 #
2449 # * A full date, with non-zero year, month and day values
2450 # * A month and day value, with a zero year, e.g. an anniversary
2451 # * A year on its own, with zero month and day values
2452 # * A year and month value, with a zero day, e.g. a credit card expiration date
2453 #
2454 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07002455 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2456 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002457 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2458 # if specifying a year by itself or a year and month where the day is not
2459 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07002460 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2461 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002462 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002463 "stringValue": "A String", # string
2464 "booleanValue": True or False, # boolean
2465 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002466 },
2467 },
2468 ],
2469 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class.
2470 },
2471 ],
2472 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
2473 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
2474 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence
2475 # classes in this bucket.
2476 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence
2477 # classes in this bucket.
2478 },
2479 ],
2480 },
2481 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute.
Dan O'Mearadd494642020-05-01 07:42:23 -07002482 "numericalStatsConfig": { # Compute numerical stats over an individual column, including # Numerical stats
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002483 # min, max, and quantiles.
2484 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are
2485 # integer, float, date, datetime, timestamp, time.
2486 "name": "A String", # Name describing the field.
2487 },
2488 },
Dan O'Mearadd494642020-05-01 07:42:23 -07002489 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what # k-map
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002490 # is called "journalist risk" in the literature, except the attack dataset is
2491 # statistically modeled instead of being perfectly known. This can be done
2492 # using publicly available data (like the US Census), or using a custom
2493 # statistical model (indicated as one or several BigQuery tables), or by
2494 # extrapolating from the distribution of values in the input dataset.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002495 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
Dan O'Mearadd494642020-05-01 07:42:23 -07002496 # Set if no column is tagged with a region-specific InfoType (like
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002497 # US_ZIP_5) or a region code.
Dan O'Mearadd494642020-05-01 07:42:23 -07002498 "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two columns can have the
2499 # same tag.
2500 { # A column with a semantic tag attached.
2501 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07002502 "name": "A String", # Name describing the field.
2503 },
2504 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
2505 # indicate an auxiliary table that contains statistical information on
2506 # the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
2508 # dataset as a statistical model of population, if available. We
2509 # currently support US ZIP codes, region codes, ages and genders.
2510 # To programmatically obtain the list of supported InfoTypes, use
2511 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
2512 "name": "A String", # Name of the information type. Either a name of your choosing when
2513 # creating a CustomInfoType, or one of the names listed
2514 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
2515 # a built-in type. InfoType names should conform to the pattern
2516 # `[a-zA-Z0-9_]{1,64}`.
2517 },
2518 "inferred": { # If no semantic tag is indicated, we infer the statistical model from
2519 # the distribution of values in the input data. # A generic empty message that you can re-use to avoid defining duplicated
2520 # empty messages in your APIs. A typical example is to use it as the request
2521 # or the response type of an API method. For instance:
2522 #
2523 # service Foo {
2524 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
2525 # }
2526 #
2527 # The JSON representation for `Empty` is empty JSON object `{}`.
2528 },
2529 },
2530 ],
2531 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
2532 # used to tag a quasi-identifiers column must appear in exactly one column
2533 # of one auxiliary table.
2534 { # An auxiliary table contains statistical information on the relative
2535 # frequency of different quasi-identifiers values. It has one or several
2536 # quasi-identifiers columns, and one column that indicates the relative
2537 # frequency of each quasi-identifier tuple.
2538 # If a tuple is present in the data but not in the auxiliary table, the
2539 # corresponding relative frequency is assumed to be zero (and thus, the
2540 # tuple is highly reidentifiable).
2541 "table": { # Required. Auxiliary table location. # Message defining the location of a BigQuery table. A table is uniquely
2542 # identified by its project_id, dataset_id, and table_name. Within a query
2543 # a table is often referenced with a string in the format of:
2544 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
2545 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
2546 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
2547 # If omitted, project ID is inferred from the API call.
2548 "tableId": "A String", # Name of the table.
2549 "datasetId": "A String", # Dataset ID of the table.
2550 },
2551 "quasiIds": [ # Required. Quasi-identifier columns.
2552 { # A quasi-identifier column has a custom_tag, used to know which column
2553 # in the data corresponds to which column in the statistical model.
2554 "field": { # General identifier of a data field in a storage service. # Identifies the column.
2555 "name": "A String", # Name describing the field.
2556 },
2557 "customTag": "A String", # An auxiliary field.
2558 },
2559 ],
2560 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
2561 # between 0 and 1 (inclusive). Null values are assumed to be zero.
2562 "name": "A String", # Name describing the field.
2563 },
2564 },
2565 ],
2566 },
2567 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. # l-diversity
2568 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value.
2569 "name": "A String", # Name describing the field.
2570 },
2571 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are
2572 # defined for the l-diversity computation. When multiple fields are
2573 # specified, they are considered a single composite key.
2574 { # General identifier of a data field in a storage service.
2575 "name": "A String", # Name describing the field.
2576 },
2577 ],
2578 },
2579 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. # K-anonymity
2580 "entityId": { # Message indicating that multiple rows might be associated with a
2581 # single individual. If the same entity_id is associated with multiple
2582 # quasi-identifier tuples over distinct rows, we consider the entire
2583 # collection of tuples as the composite quasi-identifier. This collection
2584 # is a multiset: the order in which the different tuples appear in the
2585 # dataset is ignored, but their frequency is taken into account.
2586 #
2587 # Important note: a maximum of 1000 rows can be associated with a single
2588 # entity ID. If more rows are associated with the same entity ID, some
2589 # might be ignored. # An entity in a dataset is a field or set of fields that correspond to a
2590 # single person. For example, in medical records the `EntityId` might be a
2591 # patient identifier, or for financial records it might be an account
2592 # identifier. This message is used when generalizations or analysis must take
2593 # into account that multiple rows correspond to the same entity.
2594 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier.
2595 "name": "A String", # Name describing the field.
2596 },
2597 },
2598 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are
2599 # specified, they are considered a single composite key. Structs and
2600 # repeated data types are not supported; however, nested fields are
2601 # supported so long as they are not structs themselves or nested within
2602 # a repeated field.
2603 { # General identifier of a data field in a storage service.
2604 "name": "A String", # Name describing the field.
2605 },
2606 ],
2607 },
2608 "categoricalStatsConfig": { # Categorical stats # Compute numerical stats over an individual column, including
2609 # number of distinct values and value count distribution.
2610 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are
2611 # supported except for arrays and structs. However, it may be more
2612 # informative to use NumericalStats when the field type is supported,
2613 # depending on the data.
2614 "name": "A String", # Name describing the field.
2615 },
2616 },
2617 "deltaPresenceEstimationConfig": { # delta-presence # δ-presence metric, used to estimate how likely it is for an attacker to
2618 # figure out that one given individual appears in a de-identified dataset.
2619 # Similarly to the k-map metric, we cannot compute δ-presence exactly without
2620 # knowing the attack dataset, so we use a statistical model instead.
2621 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
2622 # Set if no column is tagged with a region-specific InfoType (like
2623 # US_ZIP_5) or a region code.
2624 "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two fields can have the
2625 # same tag.
2626 { # A column with a semantic tag attached.
2627 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
2628 "name": "A String", # Name describing the field.
2629 },
2630 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
2631 # indicate an auxiliary table that contains statistical information on
2632 # the possible values of this column (below).
2633 "infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
2634 # dataset as a statistical model of population, if available. We
2635 # currently support US ZIP codes, region codes, ages and genders.
2636 # To programmatically obtain the list of supported InfoTypes, use
2637 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
2638 "name": "A String", # Name of the information type. Either a name of your choosing when
2639 # creating a CustomInfoType, or one of the names listed
2640 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
2641 # a built-in type. InfoType names should conform to the pattern
2642 # `[a-zA-Z0-9_]{1,64}`.
2643 },
2644 "inferred": { # If no semantic tag is indicated, we infer the statistical model from
2645 # the distribution of values in the input data. # A generic empty message that you can re-use to avoid defining duplicated
2646 # empty messages in your APIs. A typical example is to use it as the request
2647 # or the response type of an API method. For instance:
2648 #
2649 # service Foo {
2650 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
2651 # }
2652 #
2653 # The JSON representation for `Empty` is empty JSON object `{}`.
2654 },
2655 },
2656 ],
2657 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
2658 # used to tag a quasi-identifiers field must appear in exactly one
2659 # field of one auxiliary table.
2660 { # An auxiliary table containing statistical information on the relative
2661 # frequency of different quasi-identifiers values. It has one or several
2662 # quasi-identifiers columns, and one column that indicates the relative
2663 # frequency of each quasi-identifier tuple.
2664 # If a tuple is present in the data but not in the auxiliary table, the
2665 # corresponding relative frequency is assumed to be zero (and thus, the
2666 # tuple is highly reidentifiable).
2667 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
2668 # between 0 and 1 (inclusive). Null values are assumed to be zero.
2669 "name": "A String", # Name describing the field.
2670 },
2671 "quasiIds": [ # Required. Quasi-identifier columns.
2672 { # A quasi-identifier column has a custom_tag, used to know which column
2673 # in the data corresponds to which column in the statistical model.
2674 "field": { # General identifier of a data field in a storage service. # Identifies the column.
2675 "name": "A String", # Name describing the field.
2676 },
2677 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
2678 # indicate an auxiliary table that contains statistical information on
2679 # the possible values of this column (below).
2680 },
2681 ],
2682 "table": { # Required. Auxiliary table location. # Message defining the location of a BigQuery table. A table is uniquely
2683 # identified by its project_id, dataset_id, and table_name. Within a query
2684 # a table is often referenced with a string in the format of:
2685 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
2686 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
2687 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
2688 # If omitted, project ID is inferred from the API call.
2689 "tableId": "A String", # Name of the table.
2690 "datasetId": "A String", # Dataset ID of the table.
2691 },
2692 },
2693 ],
2694 },
2695 },
2696 "categoricalStatsResult": { # Result of the categorical stats computation. # Categorical stats result
2697 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column.
2698 { # Histogram of value frequencies in the column.
2699 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of
2700 # values returned per bucket is capped at 20.
2701 { # A value of a field, including its frequency.
2702 "count": "A String", # How many times the value is contained in the field.
2703 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
2704 # Note that for the purposes of inspection or transformation, the number
2705 # of bytes considered to comprise a 'Value' is based on its representation
2706 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2707 # 123456789, the number of bytes would be counted as 9, even though an
2708 # int64 only holds up to 8 bytes of data.
2709 "floatValue": 3.14, # float
2710 "timestampValue": "A String", # timestamp
2711 "dayOfWeekValue": "A String", # day of week
2712 "timeValue": { # time of day # Represents a time of day. The date and time zone are either not significant
2713 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2714 # types are google.type.Date and `google.protobuf.Timestamp`.
2715 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2716 # to allow the value "24:00:00" for scenarios like business closing time.
2717 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2718 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2719 # allow the value 60 if it allows leap-seconds.
2720 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2721 },
2722 "dateValue": { # date # Represents a whole or partial calendar date, e.g. a birthday. The time of day
2723 # and time zone are either specified elsewhere or are not significant. The date
2724 # is relative to the Proleptic Gregorian Calendar. This can represent:
2725 #
2726 # * A full date, with non-zero year, month and day values
2727 # * A month and day value, with a zero year, e.g. an anniversary
2728 # * A year on its own, with zero month and day values
2729 # * A year and month value, with a zero day, e.g. a credit card expiration date
2730 #
2731 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
2732 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2733 # month and day.
2734 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2735 # if specifying a year by itself or a year and month where the day is not
2736 # significant.
2737 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2738 # a year.
2739 },
2740 "stringValue": "A String", # string
2741 "booleanValue": True or False, # boolean
2742 "integerValue": "A String", # integer
2743 },
2744 },
2745 ],
2746 "bucketValueCount": "A String", # Total number of distinct values in this bucket.
2747 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket.
2748 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket.
2749 "bucketSize": "A String", # Total number of values in this bucket.
2750 },
2751 ],
2752 },
2753 "deltaPresenceEstimationResult": { # Delta-presence result # Result of the δ-presence computation. Note that these results are an
2754 # estimation, not exact values.
2755 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a
2756 # value doesn't correspond to any such interval, the associated frequency
2757 # is zero. For example, the following records:
2758 # {min_probability: 0, max_probability: 0.1, frequency: 17}
2759 # {min_probability: 0.2, max_probability: 0.3, frequency: 42}
2760 # {min_probability: 0.3, max_probability: 0.4, frequency: 99}
2761 # mean that there are no records with an estimated probability in [0.1, 0.2)
2762 # or greater than or equal to 0.4.
2763 { # A DeltaPresenceEstimationHistogramBucket message with the following
2764 # values:
2765 # min_probability: 0.1
2766 # max_probability: 0.2
2767 # frequency: 42
2768 # means that there are 42 records for which δ is in [0.1, 0.2). An
2769 # important particular case is when min_probability = max_probability = 1:
2770 # then, every individual who shares this quasi-identifier combination is in
2771 # the dataset.
2772 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
2773 # number of classes returned per bucket is capped at 20.
2774 { # A tuple of values for the quasi-identifier columns.
2775 "quasiIdsValues": [ # The quasi-identifier values.
2776 { # Set of primitive values supported by the system.
2777 # Note that for the purposes of inspection or transformation, the number
2778 # of bytes considered to comprise a 'Value' is based on its representation
2779 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
2780 # 123456789, the number of bytes would be counted as 9, even though an
2781 # int64 only holds up to 8 bytes of data.
2782 "floatValue": 3.14, # float
2783 "timestampValue": "A String", # timestamp
2784 "dayOfWeekValue": "A String", # day of week
2785 "timeValue": { # time of day # Represents a time of day. The date and time zone are either not significant
2786 # or are specified elsewhere. An API may choose to allow leap seconds. Related
2787 # types are google.type.Date and `google.protobuf.Timestamp`.
2788 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
2789 # to allow the value "24:00:00" for scenarios like business closing time.
2790 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
2791 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
2792 # allow the value 60 if it allows leap-seconds.
2793 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
2794 },
2795 "dateValue": { # date # Represents a whole or partial calendar date, e.g. a birthday. The time of day
2796 # and time zone are either specified elsewhere or are not significant. The date
2797 # is relative to the Proleptic Gregorian Calendar. This can represent:
2798 #
2799 # * A full date, with non-zero year, month and day values
2800 # * A month and day value, with a zero year, e.g. an anniversary
2801 # * A year on its own, with zero month and day values
2802 # * A year and month value, with a zero day, e.g. a credit card expiration date
2803 #
2804 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
2805 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
2806 # month and day.
2807 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
2808 # if specifying a year by itself or a year and month where the day is not
2809 # significant.
2810 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
2811 # a year.
2812 },
2813 "stringValue": "A String", # string
2814 "booleanValue": True or False, # boolean
2815 "integerValue": "A String", # integer
2816 },
2817 ],
2818 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these
2819 # quasi-identifier values is in the dataset. This value, typically called
2820 # δ, is the ratio between the number of records in the dataset with these
2821 # quasi-identifier values, and the total number of individuals (inside
2822 # *and* outside the dataset) with these quasi-identifier values.
2823 # For example, if there are 15 individuals in the dataset who share the
2824 # same quasi-identifier values, and an estimated 100 people in the entire
2825 # population with these values, then δ is 0.15.
2826 },
2827 ],
2828 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
2829 "bucketSize": "A String", # Number of records within these probability bounds.
2830 "maxProbability": 3.14, # Always greater than or equal to min_probability.
2831 "minProbability": 3.14, # Between 0 and 1.
2832 },
2833 ],
2834 },
2835 "requestedSourceTable": { # Input dataset to compute metrics over. # Message defining the location of a BigQuery table. A table is uniquely
2836 # identified by its project_id, dataset_id, and table_name. Within a query
2837 # a table is often referenced with a string in the format of:
2838 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
2839 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
2840 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
2841 # If omitted, project ID is inferred from the API call.
2842 "tableId": "A String", # Name of the table.
2843 "datasetId": "A String", # Dataset ID of the table.
2844 },
2845 },
2846 "state": "A String", # State of a job.
2847 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that
2848 # instantiated the job.
2849 "startTime": "A String", # Time when the job started.
2850 "endTime": "A String", # Time when the job finished.
2851 "type": "A String", # The type of job.
2852 "createTime": "A String", # Time when the job was created.
2853 }</pre>
2854</div>
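<p>A minimal sketch of how the δ-presence histogram described in the result above might be read; it is an illustration, not part of the generated reference. The helper sums <code>bucketSize</code> over buckets whose <code>minProbability</code> meets a threshold. The function name and the sample data are hypothetical. The field names come from the object above, and int64 fields are assumed to arrive as strings, as the "A String" placeholders suggest.</p>

```python
def records_at_or_above(delta_presence_result, threshold):
    """Count records in histogram buckets whose minProbability >= threshold."""
    total = 0
    buckets = delta_presence_result.get("deltaPresenceEstimationHistogram", [])
    for bucket in buckets:
        # A missing minProbability is treated as 0.0 (assumption: zero-valued
        # fields may be omitted from the JSON response).
        if bucket.get("minProbability", 0.0) >= threshold:
            # int64 fields such as bucketSize are assumed to be strings.
            total += int(bucket.get("bucketSize", "0"))
    return total


# Hypothetical sample shaped like "deltaPresenceEstimationResult" above.
sample_result = {
    "deltaPresenceEstimationHistogram": [
        {"minProbability": 0.0, "maxProbability": 0.1, "bucketSize": "17"},
        {"minProbability": 0.2, "maxProbability": 0.3, "bucketSize": "42"},
        {"minProbability": 0.3, "maxProbability": 0.4, "bucketSize": "99"},
    ]
}

print(records_at_or_above(sample_result, 0.2))  # 42 + 99 = 141
```

<p>Filtering on <code>minProbability</code> rather than <code>maxProbability</code> counts only buckets that lie entirely at or above the threshold, which seems the more conservative reading of the half-open intervals.</p>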
2855
2856<div class="method">
2857 <code class="details" id="delete">delete(name, x__xgafv=None)</code>
2858 <pre>Deletes a long-running DlpJob. This method indicates that the client is
2859no longer interested in the DlpJob result. The job will be cancelled if
2860possible.
2861See https://cloud.google.com/dlp/docs/inspecting-storage and
2862https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
2863
2864Args:
2865 name: string, Required. The name of the DlpJob resource to be deleted. (required)
2866 x__xgafv: string, V1 error format.
2867 Allowed values
2868 1 - v1 error format
2869 2 - v2 error format
2870
2871Returns:
2872 An object of the form:
2873
2874 { # A generic empty message that you can re-use to avoid defining duplicated
2875 # empty messages in your APIs. A typical example is to use it as the request
2876 # or the response type of an API method. For instance:
2877 #
2878 # service Foo {
2879 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
2880 # }
2881 #
2882 # The JSON representation for `Empty` is empty JSON object `{}`.
2883 }</pre>
2884</div>
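<p>A hedged sketch of calling <code>delete</code> through a client object shaped like this reference, using the <code>projects().dlpJobs().delete(name=...)</code> chain. The helper name and the job name are hypothetical. The sketch assumes <code>service</code> was built elsewhere, for example with <code>googleapiclient.discovery.build("dlp", "v2")</code> and valid credentials.</p>

```python
def delete_dlp_job(service, job_name):
    """Delete a DlpJob and return the parsed response (an empty object)."""
    # `service` is assumed to expose the method chain documented above.
    request = service.projects().dlpJobs().delete(name=job_name)
    return request.execute()


# Hypothetical usage, assuming google-api-python-client is installed and
# application default credentials are configured:
#
#   from googleapiclient.discovery import build
#   service = build("dlp", "v2")
#   delete_dlp_job(service, "projects/my-project/dlpJobs/my-job-id")
```

<p>Passing the service in as an argument keeps the helper free of credential handling and lets it be exercised with a stand-in object in tests.</p>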
2885
2886<div class="method">
2887 <code class="details" id="get">get(name, x__xgafv=None)</code>
2888 <pre>Gets the latest state of a long-running DlpJob.
2889See https://cloud.google.com/dlp/docs/inspecting-storage and
2890https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
2891
2892Args:
2893 name: string, Required. The name of the DlpJob resource. (required)
2894 x__xgafv: string, V1 error format.
2895 Allowed values
2896 1 - v1 error format
2897 2 - v2 error format
2898
2899Returns:
2900 An object of the form:
2901
2902 { # Combines all of the information about a DLP job.
2903 "errors": [ # A stream of errors encountered running the job.
2904 { # Detailed information about an error encountered during job execution or
2905 # the results of an unsuccessful activation of the JobTrigger.
2906 "timestamps": [ # The times the error occurred.
2907 "A String",
2908 ],
2909 "details": { # Detailed error codes and messages. # The `Status` type defines a logical error model that is suitable for
2910 # different programming environments, including REST APIs and RPC APIs. It is
2911 # used by [gRPC](https://github.com/grpc). Each `Status` message contains
2912 # three pieces of data: error code, error message, and error details.
2913 #
2914 # You can find out more about this error model and how to work with it in the
2915 # [API Design Guide](https://cloud.google.com/apis/design/errors).
2916 "message": "A String", # A developer-facing error message, which should be in English. Any
2917 # user-facing error message should be localized and sent in the
2918 # google.rpc.Status.details field, or localized by the client.
2919 "code": 42, # The status code, which should be an enum value of google.rpc.Code.
2920 "details": [ # A list of messages that carry the error details. There is a common set of
2921 # message types for APIs to use.
2922 {
2923 "a_key": "", # Properties of the object. Contains field @type with type URL.
2924 },
2925 ],
2926 },
2927 },
2928 ],
2929 "name": "A String", # The server-assigned name.
2930 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source.
2931 "requestedOptions": { # Snapshot of the inspection configuration. # The configuration used for this job.
2932 "snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of
2933 # this run. # The inspectTemplate contains a configuration (set of types of sensitive data
2934 # to be detected) to be used anywhere you otherwise would normally specify
2935 # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates
2936 # to learn more.
2937 "updateTime": "A String", # Output only. The last update timestamp of an inspectTemplate.
2938 "displayName": "A String", # Display name (max 256 chars).
2939 "description": "A String", # Short description (max 256 chars).
2940 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process.
2941 # When used with redactContent only info_types and min_likelihood are currently
2942 # used.
2943 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
2944 "limits": { # Configuration to control the number of findings returned.
2945 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
2946 # When set within `InspectContentRequest`, the maximum returned is 2000
2947 # regardless of whether this is set higher.
2948 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
2949 { # Max findings configuration per infoType, per content item or long
2950 # running DlpJob.
2951 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
2952 # info_type should be provided. If InfoTypeLimit does not have an
2953 # info_type, the DLP API applies the limit against all info_types that
2954 # are found but not specified in another InfoTypeLimit.
2955 "name": "A String", # Name of the information type. Either a name of your choosing when
2956 # creating a CustomInfoType, or one of the names listed
2957 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
2958 # a built-in type. InfoType names should conform to the pattern
2959 # `[a-zA-Z0-9_]{1,64}`.
2960 },
2961 "maxFindings": 42, # Max findings limit for the given infoType.
2962 },
2963 ],
2964 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
2965 # When set within `InspectJobConfig`,
2966 # the maximum returned is 2000 regardless of whether this is set higher.
2967 # When set within `InspectContentRequest`, this field is ignored.
2968 },
2969 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
2970 # POSSIBLE.
2971 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
2972 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
2973 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
2974 { # Custom information type provided by the user. Used to find domain-specific
2975 # sensitive information configurable to the data in question.
2976 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
2977 "pattern": "A String", # Pattern defining the regular expression. Its syntax
2978 # (https://github.com/google/re2/wiki/Syntax) can be found under the
2979 # google/re2 repository on GitHub.
2980 "groupIndexes": [ # The index of the submatch to extract as findings. When not
2981 # specified, the entire match is returned. No more than 3 may be included.
2982 42,
2983 ],
2984 },
2985 "surrogateType": { # Message for detecting output from deidentification transformations that
2986 # support reversing. # Message for detecting output from deidentification transformations
2987 # such as
2988 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
2989 # These types of transformations are
2990 # those that perform pseudonymization, thereby producing a "surrogate" as
2991 # output. This should be used in conjunction with a field on the
2992 # transformation such as `surrogate_info_type`. This CustomInfoType does
2993 # not support the use of `detection_rules`.
2994 },
2995 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
2996 # infoType, when the name matches one of existing infoTypes and that infoType
2997 # is specified in `InspectContent.info_types` field. Specifying the latter
2998 # adds findings to the one detected by the system. If built-in info type is
2999 # not specified in `InspectContent.info_types` list then the name is treated
3000 # as a custom info type.
3001 "name": "A String", # Name of the information type. Either a name of your choosing when
3002 # creating a CustomInfoType, or one of the names listed
3003 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3004 # a built-in type. InfoType names should conform to the pattern
3005 # `[a-zA-Z0-9_]{1,64}`.
3006 },
3007 "dictionary": { # A list of phrases to detect as a CustomInfoType. # Custom information type based on a dictionary of words or phrases. This can
3008 # be used to match sensitive information specific to the data, such as a list
3009 # of employee IDs or job titles.
3010 #
3011 # Dictionary words are case-insensitive and all characters other than letters
3012 # and digits in the unicode [Basic Multilingual
3013 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
3014 # will be replaced with whitespace when scanning for matches, so the
3015 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
3016 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
3017 # surrounding any match must be of a different type than the adjacent
3018 # characters within the word, so letters must be next to non-letters and
3019 # digits next to non-digits. For example, the dictionary word "jen" will
3020 # match the first three letters of the text "jen123" but will return no
3021 # matches for "jennifer".
3022 #
3023 # Dictionary words containing a large number of characters that are not
3024 # letters or digits may result in unexpected findings because such characters
3025 # are treated as whitespace. The
3026 # [limits](https://cloud.google.com/dlp/limits) page contains details about
3027 # the size limits of dictionaries. For dictionaries that do not fit within
3028 # these constraints, consider using `LargeCustomDictionaryConfig` in the
3029 # `StoredInfoType` API.
3030 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
3031 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
3032 # at least one phrase and every phrase must contain at least 2 characters
3033 # that are letters or digits. [required]
3034 "A String",
3035 ],
3036 },
3037 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
3038 # is accepted.
3039 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
3040 # Example: gs://[BUCKET_NAME]/dictionary.txt
3041 },
3042 },
3043 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
3044 # `InspectDataSource`. Not currently supported in `InspectContent`.
3045 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
3046 # `organizations/433245324/storedInfoTypes/432452342` or
3047 # `projects/project-id/storedInfoTypes/432452342`.
3048 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
3049 # inspection was created. Output-only field, populated by the system.
3050 },
3051 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
3052 # Rules are applied in order that they are specified. Not supported for the
3053 # `surrogate_type` CustomInfoType.
3054 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
3055 # `CustomInfoType` to alter behavior under certain circumstances, depending
3056 # on the specific details of the rule. Not supported for the `surrogate_type`
3057 # custom infoType.
3058 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain proximity of hotwords. # Hotword-based detection rule.
3060 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
3061 # The total length of the window cannot exceed 1000 characters. Note that
3062 # the finding itself will be included in the window, so that hotwords may
3063 # be used to match substrings of the finding itself. For example, the
3064 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
3065 # adjusted upwards if the area code is known to be the local area code of
3066 # a company office using the hotword regex "\(xxx\)", where "xxx"
3067 # is the area code in question.
3069 "windowBefore": 42, # Number of characters before the finding to consider.
3070 "windowAfter": 42, # Number of characters after the finding to consider.
3071 },
3072 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
3073 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3074 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3075 # google/re2 repository on GitHub.
3076 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3077 # specified, the entire match is returned. No more than 3 may be included.
3078 42,
3079 ],
3080 },
3081 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as part of a detection rule. # Likelihood adjustment to apply to all matching findings.
3083 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
3084 # levels. For example, if a finding would be `POSSIBLE` without the
3085 # detection rule and `relative_likelihood` is 1, then it is upgraded to
3086 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
3087 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
3088 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
3089 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
3090 # a final likelihood of `LIKELY`.
3091 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
3092 },
3093 },
3094 },
3095 ],
3096 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
3097 # to be returned. It still can be used for rules matching.
3098 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
3099 # altered by a detection rule if the finding meets the criteria specified by
3100 # the rule. Defaults to `VERY_LIKELY` if not specified.
3101 },
3102 ],
3103 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
3104 # included in the response; see Finding.quote.
3105 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
3106 # Exclusion rules contained in the set are executed last; other
3107 # rules are executed in the order they are specified for each info type.
3108 { # Rule set for modifying a set of infoTypes to alter behavior under certain
3109 # circumstances, depending on the specific details of the rules within the set.
3110 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
3111 { # A single inspection rule to be applied to infoTypes, specified in
3112 # `InspectionRuleSet`.
3113 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain proximity of hotwords. # Hotword-based detection rule.
3115 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
3116 # The total length of the window cannot exceed 1000 characters. Note that
3117 # the finding itself will be included in the window, so that hotwords may
3118 # be used to match substrings of the finding itself. For example, the
3119 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
3120 # adjusted upwards if the area code is known to be the local area code of
3121 # a company office using the hotword regex "\(xxx\)", where "xxx"
3122 # is the area code in question.
3124 "windowBefore": 42, # Number of characters before the finding to consider.
3125 "windowAfter": 42, # Number of characters after the finding to consider.
3126 },
3127 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
3128 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3129 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3130 # google/re2 repository on GitHub.
3131 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3132 # specified, the entire match is returned. No more than 3 may be included.
3133 42,
3134 ],
3135 },
3136 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as part of a detection rule. # Likelihood adjustment to apply to all matching findings.
3138 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
3139 # levels. For example, if a finding would be `POSSIBLE` without the
3140 # detection rule and `relative_likelihood` is 1, then it is upgraded to
3141 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
3142 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
3143 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
3144 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
3145 # a final likelihood of `LIKELY`.
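#
# A minimal Python sketch of the clamping behavior described above. It is
# illustrative only: `LEVELS` and `adjust` are hypothetical names, not part
# of the API, and the level ordering is taken from the description above.
#
#     LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
#
#     def adjust(likelihood, relative_likelihood):
#         # Shift by the requested number of levels, then clamp to the valid range.
#         index = LEVELS.index(likelihood) + relative_likelihood
#         return LEVELS[max(0, min(index, len(LEVELS) - 1))]
#
#     adjust("POSSIBLE", 1)                 # "LIKELY"
#     adjust(adjust("VERY_LIKELY", 1), -1)  # "LIKELY"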
3146 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
3147 },
3148 },
3149 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in `InspectionRuleSet` are removed from results. # Exclusion rule.
3151 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
3152 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3153 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3154 # google/re2 repository on GitHub.
3155 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3156 # specified, the entire match is returned. No more than 3 may be included.
3157 42,
3158 ],
3159 },
3160 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
3161 "infoTypes": [ # The infoType list in an ExclusionRule drops a finding when it overlaps or
3162 # is contained within a finding of an infoType from this list. For
3163 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
3164 # `exclusion_rule` containing `exclude_info_types.info_types` with
3165 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
3166 # with an EMAIL_ADDRESS finding.
3167 # As a result, "555-222-2222@example.org" generates only a single
3168 # finding, namely the email address.
3169 { # Type of information detected by the API.
3170 "name": "A String", # Name of the information type. Either a name of your choosing when
3171 # creating a CustomInfoType, or one of the names listed
3172 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3173 # a built-in type. InfoType names should conform to the pattern
3174 # `[a-zA-Z0-9_]{1,64}`.
3175 },
3176 ],
3177 },
3178 "dictionary": { # Dictionary which defines the rule. # Custom information type based on a dictionary of words or phrases. This can
3179 # be used to match sensitive information specific to the data, such as a list
3180 # of employee IDs or job titles.
3181 #
3182 # Dictionary words are case-insensitive and all characters other than letters
3183 # and digits in the unicode [Basic Multilingual
3184 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
3185 # will be replaced with whitespace when scanning for matches, so the
3186 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
3187 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
3188 # surrounding any match must be of a different type than the adjacent
3189 # characters within the word, so letters must be next to non-letters and
3190 # digits next to non-digits. For example, the dictionary word "jen" will
3191 # match the first three letters of the text "jen123" but will return no
3192 # matches for "jennifer".
3193 #
3194 # Dictionary words containing a large number of characters that are not
3195 # letters or digits may result in unexpected findings because such characters
3196 # are treated as whitespace. The
3197 # [limits](https://cloud.google.com/dlp/limits) page contains details about
3198 # the size limits of dictionaries. For dictionaries that do not fit within
3199 # these constraints, consider using `LargeCustomDictionaryConfig` in the
3200 # `StoredInfoType` API.
3201 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
3202 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
3203 # at least one phrase and every phrase must contain at least 2 characters
3204 # that are letters or digits. [required]
3205 "A String",
3206 ],
3207 },
3208 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
3209 # is accepted.
3210 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
3211 # Example: gs://[BUCKET_NAME]/dictionary.txt
3212 },
3213 },
3214 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
3215 },
3216 },
3217 ],
3218 "infoTypes": [ # List of infoTypes this rule set is applied to.
3219 { # Type of information detected by the API.
3220 "name": "A String", # Name of the information type. Either a name of your choosing when
3221 # creating a CustomInfoType, or one of the names listed
3222 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3223 # a built-in type. InfoType names should conform to the pattern
3224 # `[a-zA-Z0-9_]{1,64}`.
3225 },
3226 ],
3227 },
3228 ],
3229 "contentOptions": [ # List of options defining data content to scan.
3230 # If empty, text, images, and other content will be included.
3231 "A String",
3232 ],
3233 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
3234 # InfoType values returned by ListInfoTypes or listed at
3235 # https://cloud.google.com/dlp/docs/infotypes-reference.
3236 #
3237 # When no InfoTypes or CustomInfoTypes are specified in a request, the
3238 # system may automatically choose what detectors to run. By default this may
3239 # be all types, but may change over time as detectors are updated.
3240 #
3241 # If you need precise control and predictability as to which detectors are
3242 # run, specify the InfoTypes listed in the reference explicitly;
3243 # otherwise a default list will be used, which may change over time.
3244 { # Type of information detected by the API.
3245 "name": "A String", # Name of the information type. Either a name of your choosing when
3246 # creating a CustomInfoType, or one of the names listed
3247 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3248 # a built-in type. InfoType names should conform to the pattern
3249 # `[a-zA-Z0-9_]{1,64}`.
3250 },
3251 ],
3252 },
3253 "createTime": "A String", # Output only. The creation timestamp of an inspectTemplate.
3254 "name": "A String", # Output only. The template name.
3255 #
3256 # The template will have one of the following formats:
3257 # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR
3258 # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`.
3259 },
3260 "jobConfig": { # Controls what and how to inspect for findings. # Inspect config.
3261 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
3262 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage bucket. # Google Cloud Storage options.
3264 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
3265 # than this value then the rest of the bytes are omitted. Only one
3266 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
3267 "sampleMethod": "A String",
3268 "fileSet": { # Set of files to scan. # The set of one or more files to scan.
3269 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
3270 # `gs://&lt;bucket&gt;/&lt;path&gt;`. Trailing wildcard in the path is allowed.
3271 #
3272 # If the url ends in a trailing slash, the bucket or directory represented
3273 # by the url will be scanned non-recursively (content in sub-directories
3274 # will not be scanned). This means that `gs://mybucket/` is equivalent to
3275 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
3276 # `gs://mybucket/directory/*`.
3277 #
3278 # Exactly one of `url` or `regex_file_set` must be set.
3279 "regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or `regex_file_set` must be set. # Message representing a set of files in a Cloud Storage bucket. Regular
3281 # expressions are used to allow fine-grained control over which files in the
3282 # bucket to include.
3283 #
3284 # Included files are those that match at least one item in `include_regex` and
3285 # do not match any items in `exclude_regex`. Note that a file that matches
3286 # items from both lists will _not_ be included. For a match to occur, the
3287 # entire file path (i.e., everything in the url after the bucket name) must
3288 # match the regular expression.
3289 #
3290 # For example, given the input `{bucket_name: "mybucket", include_regex:
3291 # ["directory1/.*"], exclude_regex:
3292 # ["directory1/excluded.*"]}`:
3293 #
3294 # * `gs://mybucket/directory1/myfile` will be included
3295 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
3296 # across `/`)
3297 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
3298 # full path doesn't match any items in `include_regex`)
3299 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
3300 # matches an item in `exclude_regex`)
3301 #
3302 # If `include_regex` is left empty, it will match all files by default
3303 # (this is equivalent to setting `include_regex: [".*"]`).
3304 #
3305 # Some other common use cases:
3306 #
3307 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
3308 # files in `mybucket` except for .pdf files
3309 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
3310 # include all files directly under `gs://mybucket/directory/`, without matching
3311 # across `/`
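#
# A minimal Python sketch of the selection logic described above. It is
# illustrative only: `is_included` is a hypothetical helper, not part of the
# API, and Python's `re` module is assumed to agree with RE2 for these
# simple patterns.
#
#     import re
#
#     def is_included(path, include_regex, exclude_regex):
#         # An empty include list matches every file.
#         includes = include_regex or [".*"]
#         # The entire path after the bucket name must match a pattern.
#         if not any(re.fullmatch(p, path) for p in includes):
#             return False
#         # A file matching any exclude pattern is dropped.
#         return not any(re.fullmatch(p, path) for p in exclude_regex)
#
#     include, exclude = ["directory1/.*"], ["directory1/excluded.*"]
#     is_included("directory1/myfile", include, exclude)             # True
#     is_included("directory0/directory1/myfile", include, exclude)  # False
#     is_included("directory1/excludedfile", include, exclude)       # False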
3312 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
3313 # the bucket that match at least one of these regular expressions will be
3314 # excluded from the scan.
3315 #
3316 # Regular expressions use RE2
3317 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
3318 # under the google/re2 repository on GitHub.
3319 "A String",
3320 ],
3321 "bucketName": "A String", # The name of a Cloud Storage bucket. Required.
3322 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in
3323 # the bucket that match at least one of these regular expressions will be
3324 # included in the set of files, except for those that also match an item in
3325 # `exclude_regex`. Leaving this field empty will match all files by default
3326 # (this is equivalent to including `.*` in the list).
3327 #
3328 # Regular expressions use RE2
3329 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
3330 # under the google/re2 repository on GitHub.
3331 "A String",
3332 ],
3333 },
3334 },
3335 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
3336 # Number of files scanned is rounded down. Must be between 0 and 100,
3337 # inclusive. Both 0 and 100 mean no limit. Defaults to 0.
3338 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
3339 # number of bytes scanned is rounded down. Must be between 0 and 100,
3340 # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
3341 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
3342 "fileTypes": [ # List of file type groups to include in the scan.
3343 # If empty, all files are scanned and available data format processors
3344 # are applied. In addition, the binary content of the selected files
3345 # is always scanned as well.
3346 # Images are scanned only as binary if the specified region
3347 # does not support image inspection and no file_types were specified.
3348 # Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.
3349 "A String",
3350 ],
3351 },
3352 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.
3353 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
3354 # by project and namespace, however the namespace ID may be empty.
3357 #
3358 # A partition ID contains several dimensions:
3359 # project ID and namespace ID.
3360 "projectId": "A String", # The ID of the project to which the entities belong.
3361 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
3362 },
3363 "kind": { # A representation of a Datastore kind. # The kind to process.
3364 "name": "A String", # The name of the kind.
3365 },
3366 },
3367 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.
3368 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip
3369 # inspection of entire columns which you know have no findings.
3370 { # General identifier of a data field in a storage service.
3371 "name": "A String", # Name describing the field.
3372 },
3373 ],
3374 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
3375 # rest of the rows are omitted. If not set, or if set to 0, all rows will be
3376 # scanned. Only one of rows_limit and rows_limit_percent can be specified.
3377 # Cannot be used in conjunction with TimespanConfig.
3378 "sampleMethod": "A String",
3379 "identifyingFields": [ # Table fields that may uniquely identify a row within the table. When
3380 # `actions.saveFindings.outputConfig.table` is specified, the values of
3381 # columns specified here are available in the output table under
3382 # `location.content_locations.record_location.record_key.id_values`. Nested
3383 # fields such as `person.birthdate.year` are allowed.
3384 { # General identifier of a data field in a storage service.
3385 "name": "A String", # Name describing the field.
3386 },
3387 ],
3388 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
3389 # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
3390 # 100 mean no limit. Defaults to 0. Only one of rows_limit and
3391 # rows_limit_percent can be specified. Cannot be used in conjunction with
3392 # TimespanConfig.
3393 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
3394 # identified by its project_id, dataset_id, and table_name. Within a query
3395 # a table is often referenced with a string in the format of:
3396 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
3397 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
3398 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
3399 # If omitted, project ID is inferred from the API call.
3400 "tableId": "A String", # Name of the table.
3401 "datasetId": "A String", # Dataset ID of the table.
3402 },
3403 },
3404 "timespanConfig": { # Configuration of the timespan of the items to include in scanning.
3405 # Currently only supported when inspecting Google Cloud Storage and BigQuery.
3406 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
3407 # Used for data sources like Datastore and BigQuery.
3408 #
3409 # For BigQuery:
3410 # Required to filter out rows based on the given start and
3411 # end times. If not specified and the table was modified between the given
3412 # start and end times, the entire table will be scanned.
3413 # The valid data types of the timestamp field are: `INTEGER`, `DATE`,
3414 # `TIMESTAMP`, or `DATETIME` BigQuery column.
3415 #
3416 # For Datastore:
3417 # Valid data types of the timestamp field are: `TIMESTAMP`.
3418 # Datastore entity will be scanned if the timestamp property does not
3419 # exist or its value is empty or invalid.
3420 "name": "A String", # Name describing the field.
3421 },
3422 "endTime": "A String", # Exclude files or rows newer than this value.
3423 # If set to zero, no upper time limit is applied.
3424 "startTime": "A String", # Exclude files or rows older than this value.
3425 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger, a valid start_time is determined
3426 # automatically to avoid scanning files that have not been modified
3427 # since the last time the JobTrigger executed. This will be based on the
3428 # time of the execution of the last run of the JobTrigger.
3429 },
3430 "hybridOptions": { # Configuration to control jobs where the content being inspected is outside of Google Cloud Platform. # Hybrid inspection options.
3431 # This early access feature is in a pre-release state and might change or have
3432 # limited support. For more information, see
3433 # https://cloud.google.com/products#product-launch-stages.
3435 "tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings
3436 # meaningful such as the columns that are primary keys.
3437 "identifyingFields": [ # The columns that are the primary keys for table objects included in
3438 # ContentItem. A copy of this cell's value will be stored alongside
3439 # each finding so that the finding can be traced to the specific row it came
3440 # from. No more than 3 may be provided.
3441 { # General identifier of a data field in a storage service.
3442 "name": "A String", # Name describing the field.
3443 },
3444 ],
3445 },
3446 "labels": { # To organize findings, these labels will be added to each finding.
3447 #
3448 # Label keys must be between 1 and 63 characters long and must conform
3449 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
3450 #
3451 # Label values must be between 0 and 63 characters long and must conform
3452 # to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.
3453 #
3454 # No more than 10 labels can be associated with a given finding.
3455 #
3456 # Examples:
3457 # * `"environment" : "production"`
3458 # * `"pipeline" : "etl"`
3459 "a_key": "A String",
3460 },
3461 "requiredFindingLabelKeys": [ # These are labels that each inspection request must include within its
3462 # 'finding_labels' map. A request may contain other labels, but any request
3463 # missing one of these keys will be rejected.
3464 #
3465 # Label keys must be between 1 and 63 characters long and must conform
3466 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
3467 #
3468 # No more than 10 keys can be required.
3469 "A String",
3470 ],
3471 "description": "A String", # A short description of where the data is coming from. Will be stored once
3472 # in the job. 256 max length.
3473 },
3474 },
3475 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
3476 # When used with redactContent only info_types and min_likelihood are currently
3477 # used.
3478 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
3479 "limits": { # Configuration to control the number of findings returned.
3480 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
3481 # When set within `InspectContentRequest`, the maximum returned is 2000
3482 # even if this is set higher.
3483 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
3484 { # Max findings configuration per infoType, per content item or long
3485 # running DlpJob.
3486 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
3487 # info_type should be provided. If InfoTypeLimit does not have an
3488 # info_type, the DLP API applies the limit against all info_types that
3489 # are found but not specified in another InfoTypeLimit.
3490 "name": "A String", # Name of the information type. Either a name of your choosing when
3491 # creating a CustomInfoType, or one of the names listed
3492 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3493 # a built-in type. InfoType names should conform to the pattern
3494 # `[a-zA-Z0-9_]{1,64}`.
3495 },
3496 "maxFindings": 42, # Max findings limit for the given infoType.
3497 },
3498 ],
3499 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
3500 # When set within `InspectJobConfig`,
3501 # the maximum returned is 2000 even if this is set higher.
3502 # When set within `InspectContentRequest`, this field is ignored.
3503 },
3504 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
3505 # POSSIBLE.
3506 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
3507 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
3508 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
3509 { # Custom information type provided by the user. Used to find domain-specific
3510 # sensitive information configurable to the data in question.
3511 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
3512 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3513 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3514 # google/re2 repository on GitHub.
3515 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3516 # specified, the entire match is returned. No more than 3 may be included.
3517 42,
3518 ],
3519 },
3520 "surrogateType": { # Message for detecting output from deidentification transformations that
3521 # support reversing, such as
3523 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
3524 # These types of transformations are
3525 # those that perform pseudonymization, thereby producing a "surrogate" as
3526 # output. This should be used in conjunction with a field on the
3527 # transformation such as `surrogate_info_type`. This CustomInfoType does
3528 # not support the use of `detection_rules`.
3529 },
3530 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
3531 # infoType, when the name matches one of existing infoTypes and that infoType
3532 # is specified in `InspectContent.info_types` field. Specifying the latter
3533 # adds findings to the one detected by the system. If built-in info type is
3534 # not specified in `InspectContent.info_types` list then the name is treated
3535 # as a custom info type.
3536 "name": "A String", # Name of the information type. Either a name of your choosing when
3537 # creating a CustomInfoType, or one of the names listed
3538 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3539 # a built-in type. InfoType names should conform to the pattern
3540 # `[a-zA-Z0-9_]{1,64}`.
3541 },
3542 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
3543 # be used to match sensitive information specific to the data, such as a list
3544 # of employee IDs or job titles.
3545 #
3546 # Dictionary words are case-insensitive and all characters other than letters
3547 # and digits in the Unicode [Basic Multilingual
3548 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
3549 # will be replaced with whitespace when scanning for matches, so the
3550 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
3551 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
3552 # surrounding any match must be of a different type than the adjacent
3553 # characters within the word, so letters must be next to non-letters and
3554 # digits next to non-digits. For example, the dictionary word "jen" will
3555 # match the first three letters of the text "jen123" but will return no
3556 # matches for "jennifer".
3557 #
3558 # Dictionary words containing a large number of characters that are not
3559 # letters or digits may result in unexpected findings because such characters
3560 # are treated as whitespace. The
3561 # [limits](https://cloud.google.com/dlp/limits) page contains details about
3562 # the size limits of dictionaries. For dictionaries that do not fit within
3563 # these constraints, consider using `LargeCustomDictionaryConfig` in the
3564 # `StoredInfoType` API.
3565 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
3566 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
3567 # at least one phrase and every phrase must contain at least 2 characters
3568 # that are letters or digits. [required]
3569 "A String",
3570 ],
3571 },
3572 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
3573 # is accepted.
3574 "path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
3575 # Example: gs://[BUCKET_NAME]/dictionary.txt
3576 },
3577 },
3578 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
3579 # `InspectDataSource`. Not currently supported in `InspectContent`.
3580 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
3581 # `organizations/433245324/storedInfoTypes/432452342` or
3582 # `projects/project-id/storedInfoTypes/432452342`.
3583 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
3584 # inspection was created. Output-only field, populated by the system.
3585 },
3586 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
3587 # Rules are applied in the order in which they are specified. Not supported for the
3588 # `surrogate_type` CustomInfoType.
3589 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
3590 # `CustomInfoType` to alter behavior under certain circumstances, depending
3591 # on the specific details of the rule. Not supported for the `surrogate_type`
3592 # custom infoType.
3593 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
3594 # proximity of hotwords.
3595 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
3596 # The total length of the window cannot exceed 1000 characters. Note that
3597 # the finding itself will be included in the window, so that hotwords may
3598 # be used to match substrings of the finding itself. For example, the
3599 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
3600 # adjusted upwards if the area code is known to be the local area code of
3601 # a company office using the hotword regex "\(xxx\)", where "xxx"
3602 # is the area code in question.
3604 "windowBefore": 42, # Number of characters before the finding to consider.
3605 "windowAfter": 42, # Number of characters after the finding to consider.
3606 },
3607 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
3608 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3609 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3610 # google/re2 repository on GitHub.
3611 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3612 # specified, the entire match is returned. No more than 3 may be included.
3613 42,
3614 ],
3615 },
3616 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
3617 # part of a detection rule.
3618 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
3619 # levels. For example, if a finding would be `POSSIBLE` without the
3620 # detection rule and `relative_likelihood` is 1, then it is upgraded to
3621 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
3622 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
3623 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
3624 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
3625 # a final likelihood of `LIKELY`.
3626 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
3627 },
3628 },
3629 },
3630 ],
3631 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
3632 # to be returned. It still can be used for rules matching.
3633 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
3634 # altered by a detection rule if the finding meets the criteria specified by
3635 # the rule. Defaults to `VERY_LIKELY` if not specified.
3636 },
3637 ],
3638 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
3639 # included in the response; see Finding.quote.
3640 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
3641 # Exclusion rules contained in the set are executed at the end; other
3642 # rules are executed in the order they are specified for each info type.
3643 { # Rule set for modifying a set of infoTypes to alter behavior under certain
3644 # circumstances, depending on the specific details of the rules within the set.
3645 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
3646 { # A single inspection rule to be applied to infoTypes, specified in
3647 # `InspectionRuleSet`.
3648 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
3649 # proximity of hotwords.
3650 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
3651 # The total length of the window cannot exceed 1000 characters. Note that
3652 # the finding itself will be included in the window, so that hotwords may
3653 # be used to match substrings of the finding itself. For example, the
3654 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
3655 # adjusted upwards if the area code is known to be the local area code of
3656 # a company office using the hotword regex "\(xxx\)", where "xxx"
3657 # is the area code in question.
3659 "windowBefore": 42, # Number of characters before the finding to consider.
3660 "windowAfter": 42, # Number of characters after the finding to consider.
3661 },
3662 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
3663 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3664 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3665 # google/re2 repository on GitHub.
3666 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3667 # specified, the entire match is returned. No more than 3 may be included.
3668 42,
3669 ],
3670 },
3671 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
3672 # part of a detection rule.
3673 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
3674 # levels. For example, if a finding would be `POSSIBLE` without the
3675 # detection rule and `relative_likelihood` is 1, then it is upgraded to
3676 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
3677 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
3678 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
3679 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
3680 # a final likelihood of `LIKELY`.
3681 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
3682 },
3683 },
3684 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
3685 # `InspectionRuleSet` are removed from results.
3686 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
3687 "pattern": "A String", # Pattern defining the regular expression. Its syntax
3688 # (https://github.com/google/re2/wiki/Syntax) can be found under the
3689 # google/re2 repository on GitHub.
3690 "groupIndexes": [ # The index of the submatch to extract as findings. When not
3691 # specified, the entire match is returned. No more than 3 may be included.
3692 42,
3693 ],
3694 },
3695 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
3696 "infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps with or
3697 # is contained within a finding of an infoType from this list. For
3698 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
3699 # `exclusion_rule` containing `exclude_info_types.info_types` with
3700 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
3701 # with an EMAIL_ADDRESS finding.
3702 # As a result, "555-222-2222@example.org" generates only a single
3703 # finding, namely the email address.
3704 { # Type of information detected by the API.
3705 "name": "A String", # Name of the information type. Either a name of your choosing when
3706 # creating a CustomInfoType, or one of the names listed
3707 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3708 # a built-in type. InfoType names should conform to the pattern
3709 # `[a-zA-Z0-9_]{1,64}`.
3710 },
3711 ],
3712 },
3713 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
3714 # be used to match sensitive information specific to the data, such as a list
3715 # of employee IDs or job titles.
3716 #
3717 # Dictionary words are case-insensitive and all characters other than letters
3718 # and digits in the Unicode [Basic Multilingual
3719 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
3720 # will be replaced with whitespace when scanning for matches, so the
3721 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
3722 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
3723 # surrounding any match must be of a different type than the adjacent
3724 # characters within the word, so letters must be next to non-letters and
3725 # digits next to non-digits. For example, the dictionary word "jen" will
3726 # match the first three letters of the text "jen123" but will return no
3727 # matches for "jennifer".
3728 #
3729 # Dictionary words containing a large number of characters that are not
3730 # letters or digits may result in unexpected findings because such characters
3731 # are treated as whitespace. The
3732 # [limits](https://cloud.google.com/dlp/limits) page contains details about
3733 # the size limits of dictionaries. For dictionaries that do not fit within
3734 # these constraints, consider using `LargeCustomDictionaryConfig` in the
3735 # `StoredInfoType` API.
3736 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
3737 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
3738 # at least one phrase and every phrase must contain at least 2 characters
3739 # that are letters or digits. [required]
3740 "A String",
3741 ],
3742 },
3743 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
3744 # is accepted.
3745 "path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
3746 # Example: gs://[BUCKET_NAME]/dictionary.txt
3747 },
3748 },
3749 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
3750 },
3751 },
3752 ],
3753 "infoTypes": [ # List of infoTypes this rule set is applied to.
3754 { # Type of information detected by the API.
3755 "name": "A String", # Name of the information type. Either a name of your choosing when
3756 # creating a CustomInfoType, or one of the names listed
3757 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3758 # a built-in type. InfoType names should conform to the pattern
3759 # `[a-zA-Z0-9_]{1,64}`.
3760 },
3761 ],
3762 },
3763 ],
3764 "contentOptions": [ # List of options defining data content to scan.
3765 # If empty, text, images, and other content will be included.
3766 "A String",
3767 ],
3768 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
3769 # InfoType values returned by ListInfoTypes or listed at
3770 # https://cloud.google.com/dlp/docs/infotypes-reference.
3771 #
3772 # When no InfoTypes or CustomInfoTypes are specified in a request, the
3773 # system may automatically choose what detectors to run. By default this may
3774 # be all types, but may change over time as detectors are updated.
3775 #
3776 # If you need precise control and predictability as to what detectors are
3777 # run you should specify specific InfoTypes listed in the reference,
3778 # otherwise a default list will be used, which may change over time.
3779 { # Type of information detected by the API.
3780 "name": "A String", # Name of the information type. Either a name of your choosing when
3781 # creating a CustomInfoType, or one of the names listed
3782 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3783 # a built-in type. InfoType names should conform to the pattern
3784 # `[a-zA-Z0-9_]{1,64}`.
3785 },
3786 ],
3787 },
3788 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
3789 # `inspect_config` will be merged into the values persisted as part of the
3790 # template.
3791 "actions": [ # Actions to execute at the completion of the job.
3792 { # A task to execute on the completion of a job.
3793 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
3794 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
3795 # OutputStorageConfig. Only a single instance of this action can be
3796 # specified.
3797 # Compatible with: Inspect, Risk
3798 "outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.
3799 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
3800 # dataset. If table_id is not set a new one will be generated
3801 # for you with the following format:
3802 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
3803 # generating the date details.
3804 #
3805 # For Inspect, each column in an existing output table must have the same
3806 # name, type, and mode of a field in the `Finding` object.
3807 #
3808 # For Risk, an existing output table should be the output of a previous
3809 # Risk analysis job run on the same source table, with the same privacy
3810 # metric and quasi-identifiers. Risk jobs that analyze the same table but
3811 # compute a different privacy metric, or use different sets of
3812 # quasi-identifiers, cannot store their results in the same table.
3813 # identified by its project_id, dataset_id, and table_name. Within a query
3814 # a table is often referenced with a string in the format of:
3815 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
3816 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
3817 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
3818 # If omitted, project ID is inferred from the API call.
3819 "tableId": "A String", # Name of the table.
3820 "datasetId": "A String", # Dataset ID of the table.
3821 },
3822 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
3823 # used for Inspect and must be unspecified for Risk jobs. Columns are derived
3824 # from the `Finding` object. If appending to an existing table, any columns
3825 # from the predefined schema that are missing will be added. No columns in
3826 # the existing table will be deleted.
3827 #
3828 # If unspecified, then all available columns will be used for a new table or
3829 # an (existing) table with no schema, and no changes will be made to an
3830 # existing table that has a schema.
3831 # Only for use with external storage.
3832 },
3833 },
3834 "jobNotificationEmails": { # Enable email notification for project owners and editors on a job's
3835 # completion/failure.
3837 },
3838 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
3839 # Command Center (CSCC Alpha).
3840 # This action is only available for projects that are part of
3841 # an organization and whitelisted for the alpha Cloud Security Command
3842 # Center.
3843 # The action will publish the count of finding instances and their info types.
3844 # The summary of findings will be persisted in CSCC and is governed by CSCC
3845 # service-specific policy, see https://cloud.google.com/terms/service-terms
3846 # Only a single instance of this action can be specified.
3847 # Compatible with: Inspect
3848 },
3849 "publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.
3850 # will publish a metric to Stackdriver for each infoType requested and
3851 # how many findings were found for it. CustomDetectors will be bucketed
3852 # as 'Custom' under the Stackdriver label 'info_type'.
3853 },
3854 "publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
3855 # results of the DlpJob will be applied to the entry for the resource scanned
3856 # in Cloud Data Catalog. Any labels previously written by another DlpJob will
3857 # be deleted. InfoType naming patterns are strictly enforced when using this
3858 # feature. Note that the findings will be persisted in Cloud Data Catalog
3859 # storage and are governed by Data Catalog service-specific policy, see
3860 # https://cloud.google.com/terms/service-terms
3861 # Only a single instance of this action can be specified and only allowed if
3862 # all resources being scanned are BigQuery tables.
3863 # Compatible with: Inspect
3864 },
3865 "pubSub": { # Publish a message to the given Pub/Sub topic when the DlpJob has completed. The # Publish a notification to a Pub/Sub topic.
3866 # message contains a single field, `DlpJobName`, which is equal to the
3867 # finished job's
3868 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
3869 # Compatible with: Inspect, Risk
3870 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
3871 # publishing access rights to the DLP API service account executing
3872 # the long running DlpJob sending the notifications.
3873 # Format is projects/{project}/topics/{topic}.
3874 },
3875 },
3876 ],
3877 },
3878 },
3879 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job.
3880 "infoTypeStats": [ # Statistics of how many instances of each info type were found during
3881 # the inspect job.
3882 { # Statistics regarding a specific InfoType.
3883 "count": "A String", # Number of findings for this infoType.
3884 "infoType": { # Type of information detected by the API. # The type of finding this stat is for.
3885 "name": "A String", # Name of the information type. Either a name of your choosing when
3886 # creating a CustomInfoType, or one of the names listed
3887 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
3888 # a built-in type. InfoType names should conform to the pattern
3889 # `[a-zA-Z0-9_]{1,64}`.
3890 },
3891 },
3892 ],
3893 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process.
3894 "processedBytes": "A String", # Total size in bytes that were processed.
3895 "hybridStats": { # Statistics related to processing hybrid inspect requests. # Statistics related to the processing of hybrid inspect.
3896 # Early access feature is in a pre-release state and might change or have
3897 # limited support. For more information, see
3898 # https://cloud.google.com/products#product-launch-stages.
3899 "abortedCount": "A String", # The number of hybrid inspection requests aborted because the job ran
3900 # out of quota or was ended before they could be processed.
3901 "pendingCount": "A String", # The number of hybrid requests currently being processed. Only populated
3902 # when called via method `getDlpJob`.
3903 # A burst of traffic may cause hybrid inspect requests to be enqueued.
3904 # Processing will take place as quickly as possible, but resource limitations
3905 # may impact how long a request is enqueued for.
3906 "processedCount": "A String", # The number of hybrid inspection requests processed within this job.
3907 },
3908 },
3909 },
3910 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source.
3911 "numericalStatsResult": { # Result of the numerical stats computation. # Numerical stats result
3912 "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal
3913 # sized buckets.
3914 { # Set of primitive values supported by the system.
3915 # Note that for the purposes of inspection or transformation, the number
3916 # of bytes considered to comprise a 'Value' is based on its representation
3917 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
3918 # 123456789, the number of bytes would be counted as 9, even though an
3919 # int64 only holds up to 8 bytes of data.
3920 "floatValue": 3.14, # float
3921 "timestampValue": "A String", # timestamp
3922 "dayOfWeekValue": "A String", # day of week
3923 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
3924 # or are specified elsewhere. An API may choose to allow leap seconds. Related
3925 # types are google.type.Date and `google.protobuf.Timestamp`.
3926 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
3927 # to allow the value "24:00:00" for scenarios like business closing time.
3928 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
3929 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
3930 # allow the value 60 if it allows leap-seconds.
3931 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
3932 },
3933 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
3934 # and time zone are either specified elsewhere or are not significant. The date
3935 # is relative to the Proleptic Gregorian Calendar. This can represent:
3936 #
3937 # * A full date, with non-zero year, month and day values
3938 # * A month and day value, with a zero year, e.g. an anniversary
3939 # * A year on its own, with zero month and day values
3940 # * A year and month value, with a zero day, e.g. a credit card expiration date
3941 #
3942 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
3943 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
3944 # month and day.
3945 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
3946 # if specifying a year by itself or a year and month where the day is not
3947 # significant.
3948 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
3949 # a year.
3950 },
3951 "stringValue": "A String", # string
3952 "booleanValue": True or False, # boolean
3953 "integerValue": "A String", # integer
3954 },
3955 ],
3956 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column.
3957 # Note that for the purposes of inspection or transformation, the number
3958 # of bytes considered to comprise a 'Value' is based on its representation
3959 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
3960 # 123456789, the number of bytes would be counted as 9, even though an
3961 # int64 only holds up to 8 bytes of data.
3962 "floatValue": 3.14, # float
3963 "timestampValue": "A String", # timestamp
3964 "dayOfWeekValue": "A String", # day of week
3965 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
3966 # or are specified elsewhere. An API may choose to allow leap seconds. Related
3967 # types are google.type.Date and `google.protobuf.Timestamp`.
3968 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
3969 # to allow the value "24:00:00" for scenarios like business closing time.
3970 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
3971 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
3972 # allow the value 60 if it allows leap-seconds.
3973 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
3974 },
3975 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
3976 # and time zone are either specified elsewhere or are not significant. The date
3977 # is relative to the Proleptic Gregorian Calendar. This can represent:
3978 #
3979 # * A full date, with non-zero year, month and day values
3980 # * A month and day value, with a zero year, e.g. an anniversary
3981 # * A year on its own, with zero month and day values
3982 # * A year and month value, with a zero day, e.g. a credit card expiration date
3983 #
3984 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
3985 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
3986 # month and day.
3987 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
3988 # if specifying a year by itself or a year and month where the day is not
3989 # significant.
3990 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
3991 # a year.
3992 },
3993 "stringValue": "A String", # string
3994 "booleanValue": True or False, # boolean
3995 "integerValue": "A String", # integer
3996 },
3997 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column.
3998 # Note that for the purposes of inspection or transformation, the number
3999 # of bytes considered to comprise a 'Value' is based on its representation
4000 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4001 # 123456789, the number of bytes would be counted as 9, even though an
4002 # int64 only holds up to 8 bytes of data.
4003 "floatValue": 3.14, # float
4004 "timestampValue": "A String", # timestamp
4005 "dayOfWeekValue": "A String", # day of week
4006 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
4007 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4008 # types are google.type.Date and `google.protobuf.Timestamp`.
4009 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4010 # to allow the value "24:00:00" for scenarios like business closing time.
4011 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4012 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4013 # allow the value 60 if it allows leap-seconds.
4014 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4015 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004016 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004017 # and time zone are either specified elsewhere or are not significant. The date
4018 # is relative to the Proleptic Gregorian Calendar. This can represent:
4019 #
4020 # * A full date, with non-zero year, month and day values
4021 # * A month and day value, with a zero year, e.g. an anniversary
4022 # * A year on its own, with zero month and day values
4023 # * A year and month value, with a zero day, e.g. a credit card expiration date
4024 #
4025 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07004026 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4027 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004028 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4029 # if specifying a year by itself or a year and month where the day is not
4030 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07004031 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4032 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004033 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004034 "stringValue": "A String", # string
4035 "booleanValue": True or False, # boolean
4036 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004037 },
4038 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004039 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an # K-map result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004040 # estimation, not exact values.
4041 "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value
4042 # doesn't correspond to any such interval, the associated frequency is
4043 # zero. For example, the following records:
4044 # {min_anonymity: 1, max_anonymity: 1, frequency: 17}
4045 # {min_anonymity: 2, max_anonymity: 3, frequency: 42}
4046 # {min_anonymity: 5, max_anonymity: 10, frequency: 99}
 # mean that there are no records with an estimated anonymity of 4 or
 # larger than 10.
4049 { # A KMapEstimationHistogramBucket message with the following values:
4050 # min_anonymity: 3
4051 # max_anonymity: 5
4052 # frequency: 42
4053 # means that there are 42 records whose quasi-identifier values correspond
4054 # to 3, 4 or 5 people in the overlying population. An important particular
4055 # case is when min_anonymity = max_anonymity = 1: the frequency field then
4056 # corresponds to the number of uniquely identifiable records.
4057 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
4058 # number of classes returned per bucket is capped at 20.
4059 { # A tuple of values for the quasi-identifier columns.
4060 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values.
4061 "quasiIdsValues": [ # The quasi-identifier values.
4062 { # Set of primitive values supported by the system.
4063 # Note that for the purposes of inspection or transformation, the number
4064 # of bytes considered to comprise a 'Value' is based on its representation
4065 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4066 # 123456789, the number of bytes would be counted as 9, even though an
4067 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07004068 "floatValue": 3.14, # float
4069 "timestampValue": "A String", # timestamp
4070 "dayOfWeekValue": "A String", # day of week
4071 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004072 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4073 # types are google.type.Date and `google.protobuf.Timestamp`.
4074 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4075 # to allow the value "24:00:00" for scenarios like business closing time.
4076 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4077 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4078 # allow the value 60 if it allows leap-seconds.
4079 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4080 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004081 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004082 # and time zone are either specified elsewhere or are not significant. The date
4083 # is relative to the Proleptic Gregorian Calendar. This can represent:
4084 #
4085 # * A full date, with non-zero year, month and day values
4086 # * A month and day value, with a zero year, e.g. an anniversary
4087 # * A year on its own, with zero month and day values
4088 # * A year and month value, with a zero day, e.g. a credit card expiration date
4089 #
4090 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07004091 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4092 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004093 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4094 # if specifying a year by itself or a year and month where the day is not
4095 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07004096 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4097 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004098 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004099 "stringValue": "A String", # string
4100 "booleanValue": True or False, # boolean
4101 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004102 },
4103 ],
4104 },
4105 ],
4106 "minAnonymity": "A String", # Always positive.
4107 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
4108 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity.
4109 "bucketSize": "A String", # Number of records within these anonymity bounds.
4110 },
4111 ],
4112 },
"kAnonymityResult": { # Result of the k-anonymity computation. # K-anonymity result
  "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes.
    { # Histogram of k-anonymity equivalence classes.
      "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
        # classes returned per bucket is capped at 20.
        { # The set of columns' values that share the same k-anonymity value.
4119 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per
4120 # quasi-identifier column in the original KAnonymity metric message.
4121 # The order is always the same as the original request.
4122 { # Set of primitive values supported by the system.
4123 # Note that for the purposes of inspection or transformation, the number
4124 # of bytes considered to comprise a 'Value' is based on its representation
4125 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4126 # 123456789, the number of bytes would be counted as 9, even though an
4127 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07004128 "floatValue": 3.14, # float
4129 "timestampValue": "A String", # timestamp
4130 "dayOfWeekValue": "A String", # day of week
4131 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004132 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4133 # types are google.type.Date and `google.protobuf.Timestamp`.
4134 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4135 # to allow the value "24:00:00" for scenarios like business closing time.
4136 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4137 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4138 # allow the value 60 if it allows leap-seconds.
4139 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4140 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004141 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004142 # and time zone are either specified elsewhere or are not significant. The date
4143 # is relative to the Proleptic Gregorian Calendar. This can represent:
4144 #
4145 # * A full date, with non-zero year, month and day values
4146 # * A month and day value, with a zero year, e.g. an anniversary
4147 # * A year on its own, with zero month and day values
4148 # * A year and month value, with a zero day, e.g. a credit card expiration date
4149 #
4150 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07004151 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4152 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004153 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4154 # if specifying a year by itself or a year and month where the day is not
4155 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07004156 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4157 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004158 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004159 "stringValue": "A String", # string
4160 "booleanValue": True or False, # boolean
4161 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004162 },
4163 ],
4164 "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the
4165 # above set of values.
4166 },
4167 ],
4168 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
4169 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket.
4170 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket.
4171 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
4172 },
4173 ],
4174 },
"lDiversityResult": { # Result of the l-diversity computation. # L-diversity result
  "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies.
    { # Histogram of l-diversity equivalence class sensitive value frequencies.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004178 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
4179 # classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same l-diversity value.
4181 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class.
4182 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence
4183 # class. The order is always the same as the original request.
4184 { # Set of primitive values supported by the system.
4185 # Note that for the purposes of inspection or transformation, the number
4186 # of bytes considered to comprise a 'Value' is based on its representation
4187 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4188 # 123456789, the number of bytes would be counted as 9, even though an
4189 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07004190 "floatValue": 3.14, # float
4191 "timestampValue": "A String", # timestamp
4192 "dayOfWeekValue": "A String", # day of week
4193 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004194 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4195 # types are google.type.Date and `google.protobuf.Timestamp`.
4196 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4197 # to allow the value "24:00:00" for scenarios like business closing time.
4198 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4199 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4200 # allow the value 60 if it allows leap-seconds.
4201 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4202 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004203 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004204 # and time zone are either specified elsewhere or are not significant. The date
4205 # is relative to the Proleptic Gregorian Calendar. This can represent:
4206 #
4207 # * A full date, with non-zero year, month and day values
4208 # * A month and day value, with a zero year, e.g. an anniversary
4209 # * A year on its own, with zero month and day values
4210 # * A year and month value, with a zero day, e.g. a credit card expiration date
4211 #
4212 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07004213 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4214 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004215 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4216 # if specifying a year by itself or a year and month where the day is not
4217 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07004218 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4219 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004220 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004221 "stringValue": "A String", # string
4222 "booleanValue": True or False, # boolean
4223 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004224 },
4225 ],
4226 "topSensitiveValues": [ # Estimated frequencies of top sensitive values.
4227 { # A value of a field, including its frequency.
4228 "count": "A String", # How many times the value is contained in the field.
4229 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
4230 # Note that for the purposes of inspection or transformation, the number
4231 # of bytes considered to comprise a 'Value' is based on its representation
4232 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4233 # 123456789, the number of bytes would be counted as 9, even though an
4234 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07004235 "floatValue": 3.14, # float
4236 "timestampValue": "A String", # timestamp
4237 "dayOfWeekValue": "A String", # day of week
4238 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004239 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4240 # types are google.type.Date and `google.protobuf.Timestamp`.
4241 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4242 # to allow the value "24:00:00" for scenarios like business closing time.
4243 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4244 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4245 # allow the value 60 if it allows leap-seconds.
4246 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4247 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004248 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004249 # and time zone are either specified elsewhere or are not significant. The date
4250 # is relative to the Proleptic Gregorian Calendar. This can represent:
4251 #
4252 # * A full date, with non-zero year, month and day values
4253 # * A month and day value, with a zero year, e.g. an anniversary
4254 # * A year on its own, with zero month and day values
4255 # * A year and month value, with a zero day, e.g. a credit card expiration date
4256 #
4257 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07004258 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4259 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004260 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4261 # if specifying a year by itself or a year and month where the day is not
4262 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07004263 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4264 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004265 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004266 "stringValue": "A String", # string
4267 "booleanValue": True or False, # boolean
4268 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004269 },
4270 },
4271 ],
4272 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class.
4273 },
4274 ],
4275 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
4276 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
4277 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence
4278 # classes in this bucket.
4279 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence
4280 # classes in this bucket.
4281 },
4282 ],
4283 },
4284 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute.
Dan O'Mearadd494642020-05-01 07:42:23 -07004285 "numericalStatsConfig": { # Compute numerical stats over an individual column, including # Numerical stats
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004286 # min, max, and quantiles.
4287 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are
4288 # integer, float, date, datetime, timestamp, time.
4289 "name": "A String", # Name describing the field.
4290 },
4291 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004292 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what # k-map
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004293 # is called "journalist risk" in the literature, except the attack dataset is
4294 # statistically modeled instead of being perfectly known. This can be done
4295 # using publicly available data (like the US Census), or using a custom
4296 # statistical model (indicated as one or several BigQuery tables), or by
4297 # extrapolating from the distribution of values in the input dataset.
"regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
  # Set if no column is tagged with a region-specific InfoType (like
  # US_ZIP_5) or a region code.
"quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two columns can have the
  # same tag.
4303 { # A column with a semantic tag attached.
4304 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004305 "name": "A String", # Name describing the field.
4306 },
4307 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
4308 # indicate an auxiliary table that contains statistical information on
4309 # the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
4311 # dataset as a statistical model of population, if available. We
4312 # currently support US ZIP codes, region codes, ages and genders.
4313 # To programmatically obtain the list of supported InfoTypes, use
4314 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
4315 "name": "A String", # Name of the information type. Either a name of your choosing when
4316 # creating a CustomInfoType, or one of the names listed
4317 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
4318 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07004319 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004320 },
"inferred": { # A generic empty message that you can re-use to avoid defining duplicated
  # empty messages in your APIs. A typical example is to use it as the request
  # or the response type of an API method. For instance:
  #
  #     service Foo {
  #       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
  #     }
  #
  # The JSON representation for `Empty` is the empty JSON object `{}`.
  #
  # If no semantic tag is indicated, we infer the statistical model from
  # the distribution of values in the input data.
4331 },
4332 },
4333 ],
4334 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
4335 # used to tag a quasi-identifiers column must appear in exactly one column
4336 # of one auxiliary table.
4337 { # An auxiliary table contains statistical information on the relative
4338 # frequency of different quasi-identifiers values. It has one or several
4339 # quasi-identifiers columns, and one column that indicates the relative
4340 # frequency of each quasi-identifier tuple.
4341 # If a tuple is present in the data but not in the auxiliary table, the
4342 # corresponding relative frequency is assumed to be zero (and thus, the
4343 # tuple is highly reidentifiable).
Dan O'Mearadd494642020-05-01 07:42:23 -07004344 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Required. Auxiliary table location.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004345 # identified by its project_id, dataset_id, and table_name. Within a query
4346 # a table is often referenced with a string in the format of:
Dan O'Mearadd494642020-05-01 07:42:23 -07004347 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
4348 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004349 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
4350 # If omitted, project ID is inferred from the API call.
4351 "tableId": "A String", # Name of the table.
4352 "datasetId": "A String", # Dataset ID of the table.
4353 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004354 "quasiIds": [ # Required. Quasi-identifier columns.
4355 { # A quasi-identifier column has a custom_tag, used to know which column
4356 # in the data corresponds to which column in the statistical model.
4357 "field": { # General identifier of a data field in a storage service. # Identifies the column.
4358 "name": "A String", # Name describing the field.
4359 },
"customTag": "A String", # An auxiliary field.
4361 },
4362 ],
4363 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
4364 # between 0 and 1 (inclusive). Null values are assumed to be zero.
4365 "name": "A String", # Name describing the field.
4366 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004367 },
4368 ],
4369 },
Dan O'Mearadd494642020-05-01 07:42:23 -07004370 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. # l-diversity
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07004371 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value.
4372 "name": "A String", # Name describing the field.
4373 },
4374 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are
4375 # defined for the l-diversity computation. When multiple fields are
4376 # specified, they are considered a single composite key.
4377 { # General identifier of a data field in a storage service.
4378 "name": "A String", # Name describing the field.
4379 },
4380 ],
4381 },
"kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. # K-anonymity
  "entityId": { # An entity in a dataset is a field or set of fields that correspond to a
    # single person. For example, in medical records the `EntityId` might be a
    # patient identifier, or for financial records it might be an account
    # identifier. This message is used when generalizations or analysis must take
    # into account that multiple rows correspond to the same entity.
    #
    # Message indicating that multiple rows might be associated to a
    # single individual. If the same entity_id is associated to multiple
    # quasi-identifier tuples over distinct rows, we consider the entire
    # collection of tuples as the composite quasi-identifier. This collection
    # is a multiset: the order in which the different tuples appear in the
    # dataset is ignored, but their frequency is taken into account.
    #
    # Important note: a maximum of 1000 rows can be associated to a single
    # entity ID. If more rows are associated with the same entity ID, some
    # might be ignored.
4397 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier.
4398 "name": "A String", # Name describing the field.
4399 },
4400 },
4401 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are
4402 # specified, they are considered a single composite key. Structs and
4403 # repeated data types are not supported; however, nested fields are
4404 # supported so long as they are not structs themselves or nested within
4405 # a repeated field.
4406 { # General identifier of a data field in a storage service.
4407 "name": "A String", # Name describing the field.
4408 },
4409 ],
4410 },
"categoricalStatsConfig": { # Compute categorical stats over an individual column, including # Categorical stats
  # number of distinct values and value count distribution.
4413 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are
4414 # supported except for arrays and structs. However, it may be more
4415 # informative to use NumericalStats when the field type is supported,
4416 # depending on the data.
4417 "name": "A String", # Name describing the field.
4418 },
4419 },
4420 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to # delta-presence
4421 # figure out that one given individual appears in a de-identified dataset.
4422 # Similarly to the k-map metric, we cannot compute δ-presence exactly without
4423 # knowing the attack dataset, so we use a statistical model instead.
4424 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
4425 # Set if no column is tagged with a region-specific InfoType (like
4426 # US_ZIP_5) or a region code.
4427 "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two fields can have the
4428 # same tag.
4429 { # A column with a semantic tag attached.
4430 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
4431 "name": "A String", # Name describing the field.
4432 },
4433 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
4434 # indicate an auxiliary table that contains statistical information on
4435 # the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
4437 # dataset as a statistical model of population, if available. We
4438 # currently support US ZIP codes, region codes, ages and genders.
4439 # To programmatically obtain the list of supported InfoTypes, use
4440 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
4441 "name": "A String", # Name of the information type. Either a name of your choosing when
4442 # creating a CustomInfoType, or one of the names listed
4443 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
4444 # a built-in type. InfoType names should conform to the pattern
4445 # `[a-zA-Z0-9_]{1,64}`.
4446 },
"inferred": { # A generic empty message that you can re-use to avoid defining duplicated
  # empty messages in your APIs. A typical example is to use it as the request
  # or the response type of an API method. For instance:
  #
  #     service Foo {
  #       rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
  #     }
  #
  # The JSON representation for `Empty` is the empty JSON object `{}`.
  #
  # If no semantic tag is indicated, we infer the statistical model from
  # the distribution of values in the input data.
4457 },
4458 },
4459 ],
4460 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
4461 # used to tag a quasi-identifiers field must appear in exactly one
4462 # field of one auxiliary table.
4463 { # An auxiliary table containing statistical information on the relative
4464 # frequency of different quasi-identifiers values. It has one or several
4465 # quasi-identifiers columns, and one column that indicates the relative
4466 # frequency of each quasi-identifier tuple.
4467 # If a tuple is present in the data but not in the auxiliary table, the
4468 # corresponding relative frequency is assumed to be zero (and thus, the
4469 # tuple is highly reidentifiable).
4470 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
4471 # between 0 and 1 (inclusive). Null values are assumed to be zero.
4472 "name": "A String", # Name describing the field.
4473 },
4474 "quasiIds": [ # Required. Quasi-identifier columns.
4475 { # A quasi-identifier column has a custom_tag, used to know which column
4476 # in the data corresponds to which column in the statistical model.
4477 "field": { # General identifier of a data field in a storage service. # Identifies the column.
4478 "name": "A String", # Name describing the field.
4479 },
4480 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
4481 # indicate an auxiliary table that contains statistical information on
4482 # the possible values of this column (below).
4483 },
4484 ],
4485 "table": { # Required. Auxiliary table location. Message defining the location of a BigQuery table. A table is uniquely
4486 # identified by its project_id, dataset_id, and table_name. Within a query
4487 # a table is often referenced with a string in the format of:
4488 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
4489 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
4490 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
4491 # If omitted, project ID is inferred from the API call.
4492 "tableId": "A String", # Name of the table.
4493 "datasetId": "A String", # Dataset ID of the table.
4494 },
4495 },
4496 ],
4497 },
4498 },
4499 "categoricalStatsResult": { # Result of the categorical stats computation. # Categorical stats result
4500 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column.
4501 { # Histogram of value frequencies in the column.
4502 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of
4503 # values returned per bucket is capped at 20.
4504 { # A value of a field, including its frequency.
4505 "count": "A String", # How many times the value is contained in the field.
4506 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
4507 # Note that for the purposes of inspection or transformation, the number
4508 # of bytes considered to comprise a 'Value' is based on its representation
4509 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4510 # 123456789, the number of bytes would be counted as 9, even though an
4511 # int64 only holds up to 8 bytes of data.
4512 "floatValue": 3.14, # float
4513 "timestampValue": "A String", # timestamp
4514 "dayOfWeekValue": "A String", # day of week
4515 "timeValue": { # Time of day. Represents a time of day. The date and time zone are either not significant
4516 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4517 # types are google.type.Date and `google.protobuf.Timestamp`.
4518 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4519 # to allow the value "24:00:00" for scenarios like business closing time.
4520 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4521 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4522 # allow the value 60 if it allows leap-seconds.
4523 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4524 },
4525 "dateValue": { # Date. Represents a whole or partial calendar date, e.g. a birthday. The time of day
4526 # and time zone are either specified elsewhere or are not significant. The date
4527 # is relative to the Proleptic Gregorian Calendar. This can represent:
4528 #
4529 # * A full date, with non-zero year, month and day values
4530 # * A month and day value, with a zero year, e.g. an anniversary
4531 # * A year on its own, with zero month and day values
4532 # * A year and month value, with a zero day, e.g. a credit card expiration date
4533 #
4534 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
4535 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4536 # month and day.
4537 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4538 # if specifying a year by itself or a year and month where the day is not
4539 # significant.
4540 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4541 # a year.
4542 },
4543 "stringValue": "A String", # string
4544 "booleanValue": True or False, # boolean
4545 "integerValue": "A String", # integer
4546 },
4547 },
4548 ],
4549 "bucketValueCount": "A String", # Total number of distinct values in this bucket.
4550 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket.
4551 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket.
4552 "bucketSize": "A String", # Total number of values in this bucket.
4553 },
4554 ],
4555 },
4556 "deltaPresenceEstimationResult": { # Delta-presence result. Result of the δ-presence computation. Note that these results are an
4557 # estimation, not exact values.
4558 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a
4559 # value doesn't correspond to any such interval, the associated frequency
4560 # is zero. For example, the following records:
4561 # {min_probability: 0, max_probability: 0.1, frequency: 17}
4562 # {min_probability: 0.2, max_probability: 0.3, frequency: 42}
4563 # {min_probability: 0.3, max_probability: 0.4, frequency: 99}
4564 # mean that there are no records with an estimated probability in [0.1, 0.2)
4565 # or greater than or equal to 0.4.
4566 { # A DeltaPresenceEstimationHistogramBucket message with the following
4567 # values:
4568 # min_probability: 0.1
4569 # max_probability: 0.2
4570 # frequency: 42
4571 # means that there are 42 records for which δ is in [0.1, 0.2). An
4572 # important particular case is when min_probability = max_probability = 1:
4573 # then, every individual who shares this quasi-identifier combination is in
4574 # the dataset.
4575 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
4576 # number of classes returned per bucket is capped at 20.
4577 { # A tuple of values for the quasi-identifier columns.
4578 "quasiIdsValues": [ # The quasi-identifier values.
4579 { # Set of primitive values supported by the system.
4580 # Note that for the purposes of inspection or transformation, the number
4581 # of bytes considered to comprise a 'Value' is based on its representation
4582 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
4583 # 123456789, the number of bytes would be counted as 9, even though an
4584 # int64 only holds up to 8 bytes of data.
4585 "floatValue": 3.14, # float
4586 "timestampValue": "A String", # timestamp
4587 "dayOfWeekValue": "A String", # day of week
4588 "timeValue": { # Time of day. Represents a time of day. The date and time zone are either not significant
4589 # or are specified elsewhere. An API may choose to allow leap seconds. Related
4590 # types are google.type.Date and `google.protobuf.Timestamp`.
4591 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
4592 # to allow the value "24:00:00" for scenarios like business closing time.
4593 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
4594 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
4595 # allow the value 60 if it allows leap-seconds.
4596 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
4597 },
4598 "dateValue": { # Date. Represents a whole or partial calendar date, e.g. a birthday. The time of day
4599 # and time zone are either specified elsewhere or are not significant. The date
4600 # is relative to the Proleptic Gregorian Calendar. This can represent:
4601 #
4602 # * A full date, with non-zero year, month and day values
4603 # * A month and day value, with a zero year, e.g. an anniversary
4604 # * A year on its own, with zero month and day values
4605 # * A year and month value, with a zero day, e.g. a credit card expiration date
4606 #
4607 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
4608 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
4609 # month and day.
4610 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
4611 # if specifying a year by itself or a year and month where the day is not
4612 # significant.
4613 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
4614 # a year.
4615 },
4616 "stringValue": "A String", # string
4617 "booleanValue": True or False, # boolean
4618 "integerValue": "A String", # integer
4619 },
4620 ],
4621 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these
4622 # quasi-identifier values is in the dataset. This value, typically called
4623 # δ, is the ratio between the number of records in the dataset with these
4624 # quasi-identifier values, and the total number of individuals (inside
4625 # *and* outside the dataset) with these quasi-identifier values.
4626 # For example, if there are 15 individuals in the dataset who share the
4627 # same quasi-identifier values, and an estimated 100 people in the entire
4628 # population with these values, then δ is 0.15.
4629 },
4630 ],
4631 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
4632 "bucketSize": "A String", # Number of records within these probability bounds.
4633 "maxProbability": 3.14, # Always greater than or equal to min_probability.
4634 "minProbability": 3.14, # Between 0 and 1.
4635 },
4636 ],
4637 },
4638 "requestedSourceTable": { # Input dataset to compute metrics over. Message defining the location of a BigQuery table. A table is uniquely
4639 # identified by its project_id, dataset_id, and table_name. Within a query
4640 # a table is often referenced with a string in the format of:
4641 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
4642 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
4643 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
4644 # If omitted, project ID is inferred from the API call.
4645 "tableId": "A String", # Name of the table.
4646 "datasetId": "A String", # Dataset ID of the table.
4647 },
4648 },
4649 "state": "A String", # State of a job.
4650 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that
4651 # instantiated the job.
4652 "startTime": "A String", # Time when the job started.
4653 "endTime": "A String", # Time when the job finished.
4654 "type": "A String", # The type of job.
4655 "createTime": "A String", # Time when the job was created.
4656 }</pre>
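<pre>As a rough illustration of the deltaPresenceEstimationHistogram comment in the
response above, the sketch below looks up the frequency for a probability in a
list of half-open [min_probability, max_probability) intervals and returns zero
when no interval matches. The record layout mirrors the illustrative records
from that comment rather than the actual response fields, and `frequency_for`
is a hypothetical helper, not part of this API.

```python
def frequency_for(histogram, probability):
    """Return the frequency of the bucket whose interval holds probability."""
    for bucket in histogram:
        low = bucket["min_probability"]
        high = bucket["max_probability"]
        # Intervals are half-open: [min_probability, max_probability).
        in_interval = probability >= low and high > probability
        # Degenerate bucket, e.g. min_probability = max_probability = 1.
        is_point = low == high and probability == low
        if in_interval or is_point:
            return bucket["frequency"]
    # A value outside every interval has an associated frequency of zero.
    return 0


# The three example records from the histogram comment.
histogram = [
    {"min_probability": 0.0, "max_probability": 0.1, "frequency": 17},
    {"min_probability": 0.2, "max_probability": 0.3, "frequency": 42},
    {"min_probability": 0.3, "max_probability": 0.4, "frequency": 99},
]

print(frequency_for(histogram, 0.05))  # 17
print(frequency_for(histogram, 0.15))  # 0, nothing covers [0.1, 0.2)
print(frequency_for(histogram, 0.3))   # 99, 0.3 belongs to [0.3, 0.4)
print(frequency_for(histogram, 0.4))   # 0, nothing at or above 0.4

# Delta for one quasi-identifier tuple: records in the dataset divided by the
# estimated number of individuals in the population sharing the same values.
delta = 15 / 100
print(delta)  # 0.15
```

The lookup treats the upper bound as exclusive, which is why 0.3 falls into the
third record rather than the second.</pre>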
4657</div>
4658
4659<div class="method">
4660 <code class="details" id="list">list(parent, orderBy=None, pageSize=None, x__xgafv=None, pageToken=None, type=None, locationId=None, filter=None)</code>
4661 <pre>Lists DlpJobs that match the specified filter in the request.
4662See https://cloud.google.com/dlp/docs/inspecting-storage and
4663https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more.
4664
4665Args:
4666 parent: string, Required. The parent resource name, for example projects/my-project-id. (required)
4667 orderBy: string, Comma-separated list of fields to order by, each
4668followed by an `asc` or `desc` postfix. The list is case-insensitive,
4669the default sort order is ascending, and redundant space characters are
4670insignificant.
4671
4672Example: `name asc, end_time asc, create_time desc`
4673
4674Supported fields are:
4675
4676- `create_time`: corresponds to the time the job was created.
4677- `end_time`: corresponds to the time the job ended.
4678- `name`: corresponds to the job's name.
4679- `state`: corresponds to the job's state.
4680 pageSize: integer, The standard list page size.
4681 x__xgafv: string, V1 error format.
4682 Allowed values
4683 1 - v1 error format
4684 2 - v2 error format
4685 pageToken: string, The standard list page token.
4686 type: string, The type of job. Defaults to `DlpJobType.INSPECT`
4687 locationId: string, The geographic location where jobs will be retrieved from.
4688Use `-` for all locations. Reserved for future extensions.
4689 filter: string, Allows filtering.
4690
4691Supported syntax:
4692
4693* Filter expressions are made up of one or more restrictions.
4694* Restrictions can be combined by `AND` or `OR` logical operators. A
4695sequence of restrictions implicitly uses `AND`.
4696* A restriction has the form of `{field} {operator} {value}`.
4697* Supported fields/values for inspect jobs:
4698 - `state` - PENDING|RUNNING|CANCELED|FINISHED|FAILED
4699 - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY
4700 - `trigger_name` - The resource name of the trigger that created the job.
4701 - `end_time` - Corresponds to the time the job finished.
4702 - `start_time` - Corresponds to the time the job started.
4703* Supported fields for risk analysis jobs:
4704 - `state` - RUNNING|CANCELED|FINISHED|FAILED
4705 - `end_time` - Corresponds to the time the job finished.
4706 - `start_time` - Corresponds to the time the job started.
4707* The operator must be `=` or `!=`.
4708
4709Examples:
4710
4711* inspected_storage = cloud_storage AND state = done
4712* inspected_storage = cloud_storage OR inspected_storage = bigquery
4713* inspected_storage = cloud_storage AND (state = done OR state = canceled)
4714* end_time &gt; "2017-12-12T00:00:00+00:00"
4715
4716The length of this field should be no more than 500 characters.
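
A minimal sketch of assembling the `filter` and `orderBy` strings described
above. `build_filter` and `build_order_by` are hypothetical helpers, not part
of this API; the only rules they encode are the ones stated above (the
`{field} {operator} {value}` form, `AND`/`OR` combination, the 500-character
limit, and the `asc`/`desc` postfix).

```python
MAX_FILTER_LENGTH = 500  # limit stated in the filter description


def build_filter(restrictions, joiner="AND"):
    """Join (field, operator, value) triples into a single filter string."""
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be 'AND' or 'OR'")
    parts = ["%s %s %s" % (field, operator, value)
             for field, operator, value in restrictions]
    text = (" %s " % joiner).join(parts)
    if len(text) > MAX_FILTER_LENGTH:
        raise ValueError("filter is longer than %d characters" % MAX_FILTER_LENGTH)
    return text


def build_order_by(fields):
    """Turn (field, direction) pairs into an orderBy string."""
    parts = []
    for field, direction in fields:
        if direction not in ("asc", "desc"):
            raise ValueError("direction must be 'asc' or 'desc'")
        parts.append("%s %s" % (field, direction))
    return ", ".join(parts)


job_filter = build_filter([
    ("inspected_storage", "=", "cloud_storage"),
    ("state", "=", "done"),
])
order_by = build_order_by([("name", "asc"), ("create_time", "desc")])

print(job_filter)  # inspected_storage = cloud_storage AND state = done
print(order_by)    # name asc, create_time desc
```

The resulting strings would be passed as the `filter` and `orderBy` arguments
of `list(parent, ...)`; how the service object is built is left out here.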
4717
4718Returns:
4719 An object of the form:
4720
4721 { # The response message for listing DLP jobs.
4722 "nextPageToken": "A String", # The standard List next-page token.
4723 "jobs": [ # A list of DlpJobs that matches the specified filter in the request.
4724 { # Combines all of the information about a DLP job.
4725 "errors": [ # A stream of errors encountered running the job.
4726 { # Detailed information about an error encountered during job execution or
4727 # the results of an unsuccessful activation of the JobTrigger.
4728 "timestamps": [ # The times the error occurred.
4729 "A String",
4730 ],
4731 "details": { # Detailed error codes and messages. The `Status` type defines a logical error model that is suitable for
4732 # different programming environments, including REST APIs and RPC APIs. It is
4733 # used by [gRPC](https://github.com/grpc). Each `Status` message contains
4734 # three pieces of data: error code, error message, and error details.
4735 #
4736 # You can find out more about this error model and how to work with it in the
4737 # [API Design Guide](https://cloud.google.com/apis/design/errors).
4738 "message": "A String", # A developer-facing error message, which should be in English. Any
4739 # user-facing error message should be localized and sent in the
4740 # google.rpc.Status.details field, or localized by the client.
4741 "code": 42, # The status code, which should be an enum value of google.rpc.Code.
4742 "details": [ # A list of messages that carry the error details. There is a common set of
4743 # message types for APIs to use.
4744 {
4745 "a_key": "", # Properties of the object. Contains field @type with type URL.
4746 },
4747 ],
4748 },
4749 },
4750 ],
4751 "name": "A String", # The server-assigned name.
4752 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source.
4753 "requestedOptions": { # Snapshot of the inspection configuration. # The configuration used for this job.
4754 "snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of
4755 # this run. The inspectTemplate contains a configuration (set of types of sensitive data
4756 # to be detected) to be used anywhere you otherwise would normally specify
4757 # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates
4758 # to learn more.
4759 "updateTime": "A String", # Output only. The last update timestamp of an inspectTemplate.
4760 "displayName": "A String", # Display name (max 256 chars).
4761 "description": "A String", # Short description (max 256 chars).
4762 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process.
4763 # When used with redactContent only info_types and min_likelihood are currently
4764 # used.
4765 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
4766 "limits": { # Configuration to control the number of findings returned.
4767 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
4768 # When set within `InspectContentRequest`, the maximum returned is 2000
4769 # regardless of whether this is set higher.
4770 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
4771 { # Max findings configuration per infoType, per content item or long
4772 # running DlpJob.
4773 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
4774 # info_type should be provided. If InfoTypeLimit does not have an
4775 # info_type, the DLP API applies the limit against all info_types that
4776 # are found but not specified in another InfoTypeLimit.
4777 "name": "A String", # Name of the information type. Either a name of your choosing when
4778 # creating a CustomInfoType, or one of the names listed
4779 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
4780 # a built-in type. InfoType names should conform to the pattern
4781 # `[a-zA-Z0-9_]{1,64}`.
4782 },
4783 "maxFindings": 42, # Max findings limit for the given infoType.
4784 },
4785 ],
4786 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
4787 # When set within `InspectJobConfig`,
4788 # the maximum returned is 2000 regardless of whether this is set higher.
4789 # When set within `InspectContentRequest`, this field is ignored.
4790 },
4791 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
4792 # POSSIBLE.
4793 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
4794 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
4795 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
4796 { # Custom information type provided by the user. Used to find domain-specific
4797 # sensitive information configurable to the data in question.
4798 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
4799 "pattern": "A String", # Pattern defining the regular expression. Its syntax
4800 # (https://github.com/google/re2/wiki/Syntax) can be found under the
4801 # google/re2 repository on GitHub.
4802 "groupIndexes": [ # The index of the submatch to extract as findings. When not
4803 # specified, the entire match is returned. No more than 3 may be included.
4804 42,
4805 ],
4806 },
4807 "surrogateType": { # Message for detecting output from deidentification transformations that
4808 # support reversing. Message for detecting output from deidentification transformations
4809 # such as
4810 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
4811 # These types of transformations are
4812 # those that perform pseudonymization, thereby producing a "surrogate" as
4813 # output. This should be used in conjunction with a field on the
4814 # transformation such as `surrogate_info_type`. This CustomInfoType does
4815 # not support the use of `detection_rules`.
4816 },
4817 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
4818 # infoType, when the name matches one of the existing infoTypes and that infoType
4819 # is specified in the `InspectContent.info_types` field. Specifying the latter
4820 # adds findings to the ones detected by the system. If the built-in info type is
4821 # not specified in `InspectContent.info_types` list then the name is treated
4822 # as a custom info type.
4823 "name": "A String", # Name of the information type. Either a name of your choosing when
4824 # creating a CustomInfoType, or one of the names listed
4825 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
4826 # a built-in type. InfoType names should conform to the pattern
4827 # `[a-zA-Z0-9_]{1,64}`.
4828 },
4829 "dictionary": { # A list of phrases to detect as a CustomInfoType. Custom information type based on a dictionary of words or phrases. This can
4830 # be used to match sensitive information specific to the data, such as a list
4831 # of employee IDs or job titles.
4832 #
4833 # Dictionary words are case-insensitive and all characters other than letters
4834 # and digits in the unicode [Basic Multilingual
4835 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
4836 # will be replaced with whitespace when scanning for matches, so the
4837 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
4838 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
4839 # surrounding any match must be of a different type than the adjacent
4840 # characters within the word, so letters must be next to non-letters and
4841 # digits next to non-digits. For example, the dictionary word "jen" will
4842 # match the first three letters of the text "jen123" but will return no
4843 # matches for "jennifer".
4844 #
4845 # Dictionary words containing a large number of characters that are not
4846 # letters or digits may result in unexpected findings because such characters
4847 # are treated as whitespace. The
4848 # [limits](https://cloud.google.com/dlp/limits) page contains details about
4849 # the size limits of dictionaries. For dictionaries that do not fit within
4850 # these constraints, consider using `LargeCustomDictionaryConfig` in the
4851 # `StoredInfoType` API.
4852 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
4853 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
4854 # at least one phrase and every phrase must contain at least 2 characters
4855 # that are letters or digits. [required]
4856 "A String",
4857 ],
4858 },
4859 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
4860 # is accepted.
4861 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
4862 # Example: gs://[BUCKET_NAME]/dictionary.txt
4863 },
4864 },
4865 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
4866 # `InspectDataSource`. Not currently supported in `InspectContent`.
4867 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
4868 # `organizations/433245324/storedInfoTypes/432452342` or
4869 # `projects/project-id/storedInfoTypes/432452342`.
4870 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
4871 # inspection was created. Output-only field, populated by the system.
4872 },
4873 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
4874 # Rules are applied in order that they are specified. Not supported for the
4875 # `surrogate_type` CustomInfoType.
4876 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
4877 # `CustomInfoType` to alter behavior under certain circumstances, depending
4878 # on the specific details of the rule. Not supported for the `surrogate_type`
4879 # custom infoType.
4880 "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain
4881 # proximity of hotwords.
4882 "proximity": { # Proximity of the finding within which the entire hotword must reside.
4883 # The total length of the window cannot exceed 1000 characters. Note that
4884 # the finding itself will be included in the window, so that hotwords may
4885 # be used to match substrings of the finding itself. For example, the
4886 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
4887 # adjusted upwards if the area code is known to be the local area code of
4888 # a company office using the hotword regex "\(xxx\)", where "xxx"
4889 # is the area code in question.
4890 # Message for specifying a window around a finding to apply a detection rule.
4891 "windowBefore": 42, # Number of characters before the finding to consider.
4892 "windowAfter": 42, # Number of characters after the finding to consider.
4893 },
4894 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
4895 "pattern": "A String", # Pattern defining the regular expression. Its syntax
4896 # (https://github.com/google/re2/wiki/Syntax) can be found under the
4897 # google/re2 repository on GitHub.
4898 "groupIndexes": [ # The index of the submatch to extract as findings. When not
4899 # specified, the entire match is returned. No more than 3 may be included.
4900 42,
4901 ],
4902 },
4903 "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Message for specifying an adjustment to the likelihood of a finding as
4904 # part of a detection rule.
4905 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
4906 # levels. For example, if a finding would be `POSSIBLE` without the
4907 # detection rule and `relative_likelihood` is 1, then it is upgraded to
4908 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
4909 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
4910 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
4911 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
4912 # a final likelihood of `LIKELY`.
4913 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
4914 },
4915 },
4916 },
4917 ],
4918 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
4919 # to be returned. It still can be used for rules matching.
4920 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
4921 # altered by a detection rule if the finding meets the criteria specified by
4922 # the rule. Defaults to `VERY_LIKELY` if not specified.
4923 },
4924 ],
4925 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
4926 # included in the response; see Finding.quote.
4927 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
4928 # Exclusion rules contained in the set are executed last; other
4929 # rules are executed in the order they are specified for each info type.
4930 { # Rule set for modifying a set of infoTypes to alter behavior under certain
4931 # circumstances, depending on the specific details of the rules within the set.
4932 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
4933 { # A single inspection rule to be applied to infoTypes, specified in
4934 # `InspectionRuleSet`.
4935 "hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain
4936 # proximity of hotwords.
4937 "proximity": { # Proximity of the finding within which the entire hotword must reside.
4938 # The total length of the window cannot exceed 1000 characters. Note that
4939 # the finding itself will be included in the window, so that hotwords may
4940 # be used to match substrings of the finding itself. For example, the
4941 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
4942 # adjusted upwards if the area code is known to be the local area code of
4943 # a company office using the hotword regex "\(xxx\)", where "xxx"
4944 # is the area code in question.
4945 # Message for specifying a window around a finding to apply a detection rule.
4946 "windowBefore": 42, # Number of characters before the finding to consider.
4947 "windowAfter": 42, # Number of characters after the finding to consider.
4948 },
4949 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
4950 "pattern": "A String", # Pattern defining the regular expression. Its syntax
4951 # (https://github.com/google/re2/wiki/Syntax) can be found under the
4952 # google/re2 repository on GitHub.
4953 "groupIndexes": [ # The index of the submatch to extract as findings. When not
4954 # specified, the entire match is returned. No more than 3 may be included.
4955 42,
4956 ],
4957 },
4958 "likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Message for specifying an adjustment to the likelihood of a finding as
4959 # part of a detection rule.
4960 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
4961 # levels. For example, if a finding would be `POSSIBLE` without the
4962 # detection rule and `relative_likelihood` is 1, then it is upgraded to
4963 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
4964 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
4965 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
4966 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
4967 # a final likelihood of `LIKELY`.
4968 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
4969 },
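# Illustrative sketch only (not part of the API schema): the clamping behavior described for
# `relativeLikelihood` can be modeled in a few lines of Python. The `LEVELS` list and the
# `adjust` helper are hypothetical names introduced here, and the unspecified likelihood
# value is deliberately left out.

```python
# Likelihood levels in ascending order, as named in the description above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust(likelihood, relative_likelihood):
    """Shift a likelihood by a number of levels, clamping at both ends of the scale."""
    index = LEVELS.index(likelihood) + relative_likelihood
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# POSSIBLE moves up to LIKELY with +1 and down to UNLIKELY with -1.
print(adjust("POSSIBLE", 1))
print(adjust("POSSIBLE", -1))
# +1 on VERY_LIKELY is clamped, so the following -1 lands on LIKELY.
print(adjust(adjust("VERY_LIKELY", 1), -1))
```

# Because each step is clamped separately, applying +1 and then -1 is not always a no-op.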
4970 },
4971 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
4972 # `InspectionRuleSet` are removed from results.
4973 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
4974 "pattern": "A String", # Pattern defining the regular expression. Its syntax
4975 # (https://github.com/google/re2/wiki/Syntax) can be found under the
4976 # google/re2 repository on GitHub.
4977 "groupIndexes": [ # The index of the submatch to extract as findings. When not
4978 # specified, the entire match is returned. No more than 3 may be included.
4979 42,
4980 ],
4981 },
4982 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
4983 "infoTypes": [ # The ExclusionRule drops a finding when it overlaps with or is
4984 # contained within a finding of an infoType from this list. For
4985 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
4986 # `exclusion_rule` containing `exclude_info_types.info_types` with
4987 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
4988 # with an EMAIL_ADDRESS finding.
4989 # As a result, "555-222-2222@example.org" generates only a single
4990 # finding, namely the email address.
4991 { # Type of information detected by the API.
4992 "name": "A String", # Name of the information type. Either a name of your choosing when
4993 # creating a CustomInfoType, or one of the names listed
4994 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
4995 # a built-in type. InfoType names should conform to the pattern
4996 # `[a-zA-Z0-9_]{1,64}`.
4997 },
4998 ],
4999 },
5000 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
5001 # be used to match sensitive information specific to the data, such as a list
5002 # of employee IDs or job titles.
5003 #
5004 # Dictionary words are case-insensitive and all characters other than letters
5005 # and digits in the unicode [Basic Multilingual
5006 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
5007 # will be replaced with whitespace when scanning for matches, so the
5008 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
5009 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
5010 # surrounding any match must be of a different type than the adjacent
5011 # characters within the word, so letters must be next to non-letters and
5012 # digits next to non-digits. For example, the dictionary word "jen" will
5013 # match the first three letters of the text "jen123" but will return no
5014 # matches for "jennifer".
5015 #
5016 # Dictionary words containing a large number of characters that are not
5017 # letters or digits may result in unexpected findings because such characters
5018 # are treated as whitespace. The
5019 # [limits](https://cloud.google.com/dlp/limits) page contains details about
5020 # the size limits of dictionaries. For dictionaries that do not fit within
5021 # these constraints, consider using `LargeCustomDictionaryConfig` in the
5022 # `StoredInfoType` API.
5023 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
5024 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
5025 # at least one phrase and every phrase must contain at least 2 characters
5026 # that are letters or digits. [required]
5027 "A String",
5028 ],
5029 },
5030 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
5031 # is accepted.
5032 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
5033 # Example: gs://[BUCKET_NAME]/dictionary.txt
5034 },
5035 },
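# Illustrative sketch only (not part of the API schema): a rough Python model of the
# dictionary matching rules described above. The helpers `normalize`, `same_kind`, and
# `dictionary_matches` are hypothetical names, `str.isalnum` is assumed to be a close
# enough stand-in for "letters and digits", and the real service may differ in details.

```python
def normalize(text):
    """Lowercase the text and turn every non-letter, non-digit character into a space."""
    replaced = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    # Collapse runs of whitespace so "Sam, Johnson" and "Sam Johnson" compare equal.
    return " ".join(replaced.split())


def same_kind(a, b):
    """True when both characters are letters or both are digits."""
    return (a.isalpha() and b.isalpha()) or (a.isdigit() and b.isdigit())


def dictionary_matches(phrase, text):
    """Return True if the dictionary phrase occurs in text under the rules above."""
    needle = normalize(phrase)
    haystack = normalize(text)
    if not needle:
        return False
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        # Characters next to the match must be of a different type than the match edge.
        before_ok = start == 0 or not same_kind(haystack[start - 1], needle[0])
        after_ok = end == len(haystack) or not same_kind(needle[-1], haystack[end])
        if before_ok and after_ok:
            return True
        start = haystack.find(needle, start + 1)
    return False


for sample in ["sam johnson", "Sam, Johnson", "Sam (Johnson)"]:
    print(sample, dictionary_matches("Sam Johnson", sample))
print("jen123", dictionary_matches("jen", "jen123"))
print("jennifer", dictionary_matches("jen", "jennifer"))
```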
5036 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
5037 },
5038 },
5039 ],
5040 "infoTypes": [ # List of infoTypes this rule set is applied to.
5041 { # Type of information detected by the API.
5042 "name": "A String", # Name of the information type. Either a name of your choosing when
5043 # creating a CustomInfoType, or one of the names listed
5044 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5045 # a built-in type. InfoType names should conform to the pattern
5046 # `[a-zA-Z0-9_]{1,64}`.
5047 },
5048 ],
5049 },
5050 ],
5051 "contentOptions": [ # List of options defining data content to scan.
5052 # If empty, text, images, and other content will be included.
5053 "A String",
5054 ],
5055 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
5056 # InfoType values returned by ListInfoTypes or listed at
5057 # https://cloud.google.com/dlp/docs/infotypes-reference.
5058 #
5059 # When no InfoTypes or CustomInfoTypes are specified in a request, the
5060 # system may automatically choose what detectors to run. By default this may
5061 # be all types, but may change over time as detectors are updated.
5062 #
5063 # If you need precise control and predictability as to what detectors are
5064 # run you should specify specific InfoTypes listed in the reference,
5065 # otherwise a default list will be used, which may change over time.
5066 { # Type of information detected by the API.
5067 "name": "A String", # Name of the information type. Either a name of your choosing when
5068 # creating a CustomInfoType, or one of the names listed
5069 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5070 # a built-in type. InfoType names should conform to the pattern
5071 # `[a-zA-Z0-9_]{1,64}`.
5072 },
5073 ],
5074 },
5075 "createTime": "A String", # Output only. The creation timestamp of an inspectTemplate.
5076 "name": "A String", # Output only. The template name.
5077 #
5078 # The template will have one of the following formats:
5079 # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR
5080 # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`.
5081 },
5082 "jobConfig": { # Controls what and how to inspect for findings. # Inspect config.
5083 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
5084 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage bucket. # Google Cloud Storage options.
5086 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
5087 # than this value then the rest of the bytes are omitted. Only one
5088 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
5089 "sampleMethod": "A String",
5090 "fileSet": { # Set of files to scan. # The set of one or more files to scan.
5091 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
5092 # `gs://&lt;bucket&gt;/&lt;path&gt;`. Trailing wildcard in the path is allowed.
5093 #
5094 # If the url ends in a trailing slash, the bucket or directory represented
5095 # by the url will be scanned non-recursively (content in sub-directories
5096 # will not be scanned). This means that `gs://mybucket/` is equivalent to
5097 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
5098 # `gs://mybucket/directory/*`.
5099 #
5100 # Exactly one of `url` or `regex_file_set` must be set.
5101 "regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. # The regex-filtered set of files to scan. Exactly one of `url` or
5102 # `regex_file_set` must be set.
5103 # Regular expressions are used to allow fine-grained control over which files in the
5104 # bucket to include.
5105 #
5106 # Included files are those that match at least one item in `include_regex` and
5107 # do not match any items in `exclude_regex`. Note that a file that matches
5108 # items from both lists will _not_ be included. For a match to occur, the
5109 # entire file path (i.e., everything in the url after the bucket name) must
5110 # match the regular expression.
5111 #
5112 # For example, given the input `{bucket_name: "mybucket", include_regex:
5113 # ["directory1/.*"], exclude_regex:
5114 # ["directory1/excluded.*"]}`:
5115 #
5116 # * `gs://mybucket/directory1/myfile` will be included
5117 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
5118 # across `/`)
5119 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
5120 # full path doesn't match any items in `include_regex`)
5121 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
5122 # matches an item in `exclude_regex`)
5123 #
5124 # If `include_regex` is left empty, it will match all files by default
5125 # (this is equivalent to setting `include_regex: [".*"]`).
5126 #
5127 # Some other common use cases:
5128 #
5129 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
5130 # files in `mybucket` except for .pdf files
5131 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
5132 # include all files directly under `gs://mybucket/directory/`, without matching
5133 # across `/`
5134 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
5135 # the bucket that match at least one of these regular expressions will be
5136 # excluded from the scan.
5137 #
5138 # Regular expressions use RE2
5139 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
5140 # under the google/re2 repository on GitHub.
5141 "A String",
5142 ],
5143 "bucketName": "A String", # The name of a Cloud Storage bucket. Required.
5144 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in
5145 # the bucket that match at least one of these regular expressions will be
5146 # included in the set of files, except for those that also match an item in
5147 # `exclude_regex`. Leaving this field empty will match all files by default
5148 # (this is equivalent to including `.*` in the list).
5149 #
5150 # Regular expressions use RE2
5151 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
5152 # under the google/re2 repository on GitHub.
5153 "A String",
5154 ],
5155 },
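# Illustrative sketch only (not part of the API schema): the include/exclude semantics
# described for `regexFileSet` can be modeled as below. Python's `re` module stands in for
# RE2 here (the two agree on these simple patterns), and `is_included` is a hypothetical
# helper, not a function of the client library.

```python
import re


def is_included(path, include_regex=(), exclude_regex=()):
    """Return True if `path` (the part of the URL after the bucket name) is selected."""
    # An empty include list behaves like [".*"], so it matches every file.
    includes = list(include_regex) or [".*"]
    # A path matching any exclude pattern is dropped, even if it is also included.
    if any(re.fullmatch(pattern, path) for pattern in exclude_regex):
        return False
    # The entire path must match, so fullmatch is used rather than search.
    return any(re.fullmatch(pattern, path) for pattern in includes)


include = ["directory1/.*"]
exclude = ["directory1/excluded.*"]
for candidate in [
    "directory1/myfile",
    "directory1/directory2/myfile",
    "directory0/directory1/myfile",
    "directory1/excludedfile",
]:
    print(candidate, is_included(candidate, include, exclude))
```

# The four printed results follow the four bullet points in the `regexFileSet` description.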
5156 },
5157 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
5158 # Number of files scanned is rounded down. Must be between 0 and 100,
5159 # inclusive. Both 0 and 100 mean no limit. Defaults to 0.
5160 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
5161 # number of bytes scanned is rounded down. Must be between 0 and 100,
5162 # inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
5163 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
5164 "fileTypes": [ # List of file type groups to include in the scan.
5165 # If empty, all files are scanned and available data format processors
5166 # are applied. In addition, the binary content of the selected files
5167 # is always scanned as well.
5168 # Images are scanned only as binary if the specified region
5169 # does not support image inspection and no file_types were specified.
5170 # Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.
5171 "A String",
5172 ],
5173 },
5174 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.
5175 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
5176 # by project and namespace, however the namespace ID may be empty.
5179 #
5180 # A partition ID contains several dimensions:
5181 # project ID and namespace ID.
5182 "projectId": "A String", # The ID of the project to which the entities belong.
5183 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
5184 },
5185 "kind": { # A representation of a Datastore kind. # The kind to process.
5186 "name": "A String", # The name of the kind.
5187 },
5188 },
5189 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.
5190 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip
5191 # inspection of entire columns which you know have no findings.
5192 { # General identifier of a data field in a storage service.
5193 "name": "A String", # Name describing the field.
5194 },
5195 ],
5196 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
5197 # rest of the rows are omitted. If not set, or if set to 0, all rows will be
5198 # scanned. Only one of rows_limit and rows_limit_percent can be specified.
5199 # Cannot be used in conjunction with TimespanConfig.
5200 "sampleMethod": "A String",
5201 "identifyingFields": [ # Table fields that may uniquely identify a row within the table. When
5202 # `actions.saveFindings.outputConfig.table` is specified, the values of
5203 # columns specified here are available in the output table under
5204 # `location.content_locations.record_location.record_key.id_values`. Nested
5205 # fields such as `person.birthdate.year` are allowed.
5206 { # General identifier of a data field in a storage service.
5207 "name": "A String", # Name describing the field.
5208 },
5209 ],
5210 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
5211 # scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
5212 # 100 mean no limit. Defaults to 0. Only one of rows_limit and
5213 # rows_limit_percent can be specified. Cannot be used in conjunction with
5214 # TimespanConfig.
5215 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
5216 # identified by its project_id, dataset_id, and table_name. Within a query
5217 # a table is often referenced with a string in the format of:
5218 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
5219 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
5220 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
5221 # If omitted, project ID is inferred from the API call.
5222 "tableId": "A String", # Name of the table.
5223 "datasetId": "A String", # Dataset ID of the table.
5224 },
5225 },
5226 "timespanConfig": { # Configuration of the timespan of the items to include in scanning.
5227 # Currently only supported when inspecting Google Cloud Storage and BigQuery.
5228 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
5229 # Used for data sources like Datastore and BigQuery.
5230 #
5231 # For BigQuery:
5232 # Required to filter out rows based on the given start and
5233 # end times. If not specified and the table was modified between the given
5234 # start and end times, the entire table will be scanned.
5235 # Valid data types of the timestamp field are the BigQuery column types
5236 # `INTEGER`, `DATE`, `TIMESTAMP`, and `DATETIME`.
5237 #
5238 # For Datastore:
5239 # The only valid data type of the timestamp field is `TIMESTAMP`.
5240 # A Datastore entity will be scanned if the timestamp property does not
5241 # exist or its value is empty or invalid.
5241 # exist or its value is empty or invalid.
5242 "name": "A String", # Name describing the field.
5243 },
5244 "endTime": "A String", # Exclude files or rows newer than this value.
5245 # If set to zero, no upper time limit is applied.
5246 "startTime": "A String", # Exclude files or rows older than this value.
5247 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
5248 # a valid start_time to avoid scanning files that have not been modified
5249 # since the last time the JobTrigger executed. This will be based on the
5250 # time of the execution of the last run of the JobTrigger.
5251 },
5252 "hybridOptions": { # Configuration to control jobs where the content being inspected is outside of Google Cloud Platform. # Hybrid inspection options.
5253 # Early access feature is in a pre-release state and might change or have
5254 # limited support. For more information, see
5255 # https://cloud.google.com/products#product-launch-stages.
5257 "tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings
5258 # meaningful such as the columns that are primary keys.
5259 "identifyingFields": [ # The columns that are the primary keys for table objects included in
5260 # ContentItem. A copy of this cell's value will be stored alongside
5261 # each finding so that the finding can be traced to the specific row it came
5262 # from. No more than 3 may be provided.
5263 { # General identifier of a data field in a storage service.
5264 "name": "A String", # Name describing the field.
5265 },
5266 ],
5267 },
5268 "labels": { # To organize findings, these labels will be added to each finding.
5269 #
5270 # Label keys must be between 1 and 63 characters long and must conform
5271 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
5272 #
5273 # Label values must be between 0 and 63 characters long and must conform
5274 # to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.
5275 #
5276 # No more than 10 labels can be associated with a given finding.
5277 #
5278 # Examples:
5279 # * `"environment" : "production"`
5280 # * `"pipeline" : "etl"`
5281 "a_key": "A String",
5282 },
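# Illustrative sketch only (not part of the API schema): a client-side check of the label
# constraints described above, which may help catch bad labels before sending a request.
# `label_errors` and the constant names are hypothetical, not part of the client library.

```python
import re

# Patterns copied from the label description above.
KEY_PATTERN = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
VALUE_PATTERN = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10
MAX_LENGTH = 63


def label_errors(labels):
    """Return a list of human-readable problems; an empty list means the labels look valid."""
    errors = []
    if len(labels) not in range(MAX_LABELS + 1):
        errors.append("more than 10 labels")
    for key, value in labels.items():
        # Keys need 1 to 63 characters and must match the key pattern in full.
        if len(key) not in range(1, MAX_LENGTH + 1) or not KEY_PATTERN.fullmatch(key):
            errors.append("invalid key: " + repr(key))
        # Values may be empty, up to 63 characters, and must match the value pattern.
        if len(value) not in range(MAX_LENGTH + 1) or not VALUE_PATTERN.fullmatch(value):
            errors.append("invalid value: " + repr(value))
    return errors


print(label_errors({"environment": "production", "pipeline": "etl"}))
print(label_errors({"Environment": "production"}))
```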
5283 "requiredFindingLabelKeys": [ # Label keys that each inspection request must include in its
5284 # 'finding_labels' map. A request may contain other labels, but any request
5285 # missing one of these keys will be rejected.
5286 #
5287 # Label keys must be between 1 and 63 characters long and must conform
5288 # to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.
5289 #
5290 # No more than 10 keys can be required.
5291 "A String",
5292 ],
5293 "description": "A String", # A short description of where the data is coming from. Will be stored once
5294 # in the job. 256 max length.
5295 },
5296 },
5297 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
5298 # When used with redactContent only info_types and min_likelihood are currently
5299 # used.
5300 "excludeInfoTypes": True or False, # When true, excludes type information of the findings.
5301 "limits": { # Configuration to control the number of findings returned.
5302 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
5303 # When set within `InspectContentRequest`, the maximum returned is 2000
5304 # regardless of whether this is set higher.
5305 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
5306 { # Max findings configuration per infoType, per content item or long
5307 # running DlpJob.
5308 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
5309 # info_type should be provided. If InfoTypeLimit does not have an
5310 # info_type, the DLP API applies the limit against all info_types that
5311 # are found but not specified in another InfoTypeLimit.
5312 "name": "A String", # Name of the information type. Either a name of your choosing when
5313 # creating a CustomInfoType, or one of the names listed
5314 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5315 # a built-in type. InfoType names should conform to the pattern
5316 # `[a-zA-Z0-9_]{1,64}`.
5317 },
5318 "maxFindings": 42, # Max findings limit for the given infoType.
5319 },
5320 ],
5321 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
5322 # When set within `InspectJobConfig`,
5323 # the maximum returned is 2000 regardless of whether this is set higher.
5324 # When set within `InspectContentRequest`, this field is ignored.
5325 },
5326 "minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
5327 # POSSIBLE.
5328 # See https://cloud.google.com/dlp/docs/likelihood to learn more.
5329 "customInfoTypes": [ # CustomInfoTypes provided by the user. See
5330 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
5331 { # Custom information type provided by the user. Used to find domain-specific
5332 # sensitive information configurable to the data in question.
5333 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
5334 "pattern": "A String", # Pattern defining the regular expression. Its syntax
5335 # (https://github.com/google/re2/wiki/Syntax) can be found under the
5336 # google/re2 repository on GitHub.
5337 "groupIndexes": [ # The index of the submatch to extract as findings. When not
5338 # specified, the entire match is returned. No more than 3 may be included.
5339 42,
5340 ],
5341 },
5342 "surrogateType": { # Message for detecting output from deidentification transformations that
5343 # support reversing, such as
5345 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
5346 # These types of transformations are
5347 # those that perform pseudonymization, thereby producing a "surrogate" as
5348 # output. This should be used in conjunction with a field on the
5349 # transformation such as `surrogate_info_type`. This CustomInfoType does
5350 # not support the use of `detection_rules`.
5351 },
5352 "infoType": { # Type of information detected by the API. # A CustomInfoType can be either a new infoType or an extension of a built-in
5353 # infoType. It extends a built-in infoType when the name matches an existing infoType and that infoType
5354 # is specified in the `InspectContent.info_types` field; in that case its
5355 # findings are added to those detected by the system. If the built-in infoType is
5356 # not specified in the `InspectContent.info_types` list, the name is treated
5357 # as a custom infoType.
5358 "name": "A String", # Name of the information type. Either a name of your choosing when
5359 # creating a CustomInfoType, or one of the names listed
5360 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5361 # a built-in type. InfoType names should conform to the pattern
5362 # `[a-zA-Z0-9_]{1,64}`.
5363 },
5364 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
5365 # be used to match sensitive information specific to the data, such as a list
5366 # of employee IDs or job titles.
5367 #
5368 # Dictionary words are case-insensitive and all characters other than letters
5369 # and digits in the unicode [Basic Multilingual
5370 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
5371 # will be replaced with whitespace when scanning for matches, so the
5372 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
5373 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
5374 # surrounding any match must be of a different type than the adjacent
5375 # characters within the word, so letters must be next to non-letters and
5376 # digits next to non-digits. For example, the dictionary word "jen" will
5377 # match the first three letters of the text "jen123" but will return no
5378 # matches for "jennifer".
5379 #
5380 # Dictionary words containing a large number of characters that are not
5381 # letters or digits may result in unexpected findings because such characters
5382 # are treated as whitespace. The
5383 # [limits](https://cloud.google.com/dlp/limits) page contains details about
5384 # the size limits of dictionaries. For dictionaries that do not fit within
5385 # these constraints, consider using `LargeCustomDictionaryConfig` in the
5386 # `StoredInfoType` API.
5387 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
5388 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
5389 # at least one phrase and every phrase must contain at least 2 characters
5390 # that are letters or digits. [required]
5391 "A String",
5392 ],
5393 },
5394 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
5395 # is accepted.
5396 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
5397 # Example: gs://[BUCKET_NAME]/dictionary.txt
5398 },
5399 },
5400 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
5401 # `InspectDataSource`. Not currently supported in `InspectContent`.
5402 "name": "A String", # Resource name of the requested `StoredInfoType`, for example
5403 # `organizations/433245324/storedInfoTypes/432452342` or
5404 # `projects/project-id/storedInfoTypes/432452342`.
5405 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
5406 # inspection was created. Output-only field, populated by the system.
5407 },
5408 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
5409 # Rules are applied in the order in which they are specified. Not supported for the
5410 # `surrogate_type` CustomInfoType.
5411 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
5412 # `CustomInfoType` to alter behavior under certain circumstances, depending
5413 # on the specific details of the rule. Not supported for the `surrogate_type`
5414 # custom infoType.
5415 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
5416 # proximity of hotwords.
5417 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
5418 # The total length of the window cannot exceed 1000 characters. Note that
5419 # the finding itself will be included in the window, so that hotwords may
5420 # be used to match substrings of the finding itself. For example, the
5421 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
5422 # adjusted upwards if the area code is known to be the local area code of
5423 # a company office using the hotword regex "\(xxx\)", where "xxx"
5424 # is the area code in question.
5426 "windowBefore": 42, # Number of characters before the finding to consider.
5427 "windowAfter": 42, # Number of characters after the finding to consider.
5428 },
5429 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
5430 "pattern": "A String", # Pattern defining the regular expression. Its syntax
5431 # (https://github.com/google/re2/wiki/Syntax) can be found under the
5432 # google/re2 repository on GitHub.
5433 "groupIndexes": [ # The index of the submatch to extract as findings. When not
5434 # specified, the entire match is returned. No more than 3 may be included.
5435 42,
5436 ],
5437 },
5438 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
5439 # part of a detection rule.
5440 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
5441 # levels. For example, if a finding would be `POSSIBLE` without the
5442 # detection rule and `relative_likelihood` is 1, then it is upgraded to
5443 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
5444 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
5445 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
5446 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
5447 # a final likelihood of `LIKELY`.
5448 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
5449 },
5450 },
5451 },
5452 ],
5453 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
5454 # to be returned. It can still be used for rule matching.
5455 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
5456 # altered by a detection rule if the finding meets the criteria specified by
5457 # the rule. Defaults to `VERY_LIKELY` if not specified.
5458 },
5459 ],
5460 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
5461 # included in the response; see Finding.quote.
5462 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
5463 # Exclusion rules contained in the set are executed last; other
5464 # rules are executed in the order in which they are specified for each infoType.
5465 { # Rule set for modifying a set of infoTypes to alter behavior under certain
5466 # circumstances, depending on the specific details of the rules within the set.
5467 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
5468 { # A single inspection rule to be applied to infoTypes, specified in
5469 # `InspectionRuleSet`.
5470 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
5471 # proximity of hotwords.
5472 "proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
5473 # The total length of the window cannot exceed 1000 characters. Note that
5474 # the finding itself will be included in the window, so that hotwords may
5475 # be used to match substrings of the finding itself. For example, the
5476 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
5477 # adjusted upwards if the area code is known to be the local area code of
5478 # a company office using the hotword regex "\(xxx\)", where "xxx"
5479 # is the area code in question.
5481 "windowBefore": 42, # Number of characters before the finding to consider.
5482 "windowAfter": 42, # Number of characters after the finding to consider.
5483 },
5484 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
5485 "pattern": "A String", # Pattern defining the regular expression. Its syntax
5486 # (https://github.com/google/re2/wiki/Syntax) can be found under the
5487 # google/re2 repository on GitHub.
5488 "groupIndexes": [ # The index of the submatch to extract as findings. When not
5489 # specified, the entire match is returned. No more than 3 may be included.
5490 42,
5491 ],
5492 },
5493 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
5494 # part of a detection rule.
5495 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
5496 # levels. For example, if a finding would be `POSSIBLE` without the
5497 # detection rule and `relative_likelihood` is 1, then it is upgraded to
5498 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
5499 # Likelihood may never drop below `VERY_UNLIKELY` or exceed
5500 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an
5501 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
5502 # a final likelihood of `LIKELY`.
5503 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
5504 },
5505 },
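# Illustration (not part of the API schema): a minimal sketch of the clamping
# behaviour described for `relativeLikelihood` above. The level order and the
# helper name `adjust_likelihood` are assumptions made for this example; the
# service's actual implementation is not shown in this document.

```python
# Likelihood levels in ascending order, as named in the comments above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative_likelihood
    # Clamp so the result never leaves VERY_UNLIKELY..VERY_LIKELY.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


print(adjust_likelihood("POSSIBLE", 1))   # LIKELY
print(adjust_likelihood("POSSIBLE", -1))  # UNLIKELY
# +1 then -1 from VERY_LIKELY: the +1 is clamped, so the -1 lands on LIKELY.
print(adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1))  # LIKELY
```

# Clamping after every step, rather than once at the end, is what makes the
# +1/-1 sequence end at `LIKELY` instead of returning to `VERY_LIKELY`.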
5506 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
5507 # `InspectionRuleSet` are removed from results.
5508 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
5509 "pattern": "A String", # Pattern defining the regular expression. Its syntax
5510 # (https://github.com/google/re2/wiki/Syntax) can be found under the
5511 # google/re2 repository on GitHub.
5512 "groupIndexes": [ # The index of the submatch to extract as findings. When not
5513 # specified, the entire match is returned. No more than 3 may be included.
5514 42,
5515 ],
5516 },
5517 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
5518 "infoTypes": [ # An ExclusionRule drops a finding when it overlaps with or is
5519 # contained within a finding of an infoType from this list. For
5520 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
5521 # `exclusion_rule` containing `exclude_info_types.info_types` with
5522 # "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
5523 # with an EMAIL_ADDRESS finding.
5524 # As a result, "555-222-2222@example.org" generates only a single
5525 # finding, namely the email address.
5526 { # Type of information detected by the API.
5527 "name": "A String", # Name of the information type. Either a name of your choosing when
5528 # creating a CustomInfoType, or one of the names listed
5529 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5530 # a built-in type. InfoType names should conform to the pattern
5531 # `[a-zA-Z0-9_]{1,64}`.
5532 },
5533 ],
5534 },
5535 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
5536 # be used to match sensitive information specific to the data, such as a list
5537 # of employee IDs or job titles.
5538 #
5539 # Dictionary words are case-insensitive and all characters other than letters
5540 # and digits in the unicode [Basic Multilingual
5541 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
5542 # will be replaced with whitespace when scanning for matches, so the
5543 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
5544 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
5545 # surrounding any match must be of a different type than the adjacent
5546 # characters within the word, so letters must be next to non-letters and
5547 # digits next to non-digits. For example, the dictionary word "jen" will
5548 # match the first three letters of the text "jen123" but will return no
5549 # matches for "jennifer".
5550 #
5551 # Dictionary words containing a large number of characters that are not
5552 # letters or digits may result in unexpected findings because such characters
5553 # are treated as whitespace. The
5554 # [limits](https://cloud.google.com/dlp/limits) page contains details about
5555 # the size limits of dictionaries. For dictionaries that do not fit within
5556 # these constraints, consider using `LargeCustomDictionaryConfig` in the
5557 # `StoredInfoType` API.
5558 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
5559 "words": [ # Words or phrases defining the dictionary. The dictionary must contain
5560 # at least one phrase and every phrase must contain at least 2 characters
5561 # that are letters or digits. [required]
5562 "A String",
5563 ],
5564 },
5565 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
5566 # is accepted.
5567 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
5568 # Example: gs://[BUCKET_NAME]/dictionary.txt
5569 },
5570 },
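# Illustration (not part of the API schema): a rough sketch of the dictionary
# matching rules described above (case-insensitive, non-alphanumerics treated
# as whitespace, and a letter/digit type change required at both edges of a
# match). The helper names are hypothetical, and Python's `str.isalpha` and
# `str.isdigit` only approximate the character classes the service uses.

```python
def char_type(ch):
    """Classify a character as 'letter', 'digit', or 'other'."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def normalize(text):
    """Lowercase, turn non-alphanumerics into spaces, collapse whitespace runs."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def dictionary_matches(phrase, text):
    """Return True if `phrase` occurs in `text` under the boundary rule."""
    phrase, text = normalize(phrase), normalize(text)
    if not phrase:
        return False
    start = text.find(phrase)
    while start != -1:
        end = start + len(phrase)
        # Treat the edges of the text like whitespace.
        before = text[start - 1] if start != 0 else " "
        after = text[end] if end != len(text) else " "
        if (char_type(before) != char_type(phrase[0])
                and char_type(after) != char_type(phrase[-1])):
            return True
        start = text.find(phrase, start + 1)
    return False


print(dictionary_matches("Sam Johnson", "Sam (Johnson)"))  # True
print(dictionary_matches("jen", "jen123"))                 # True
print(dictionary_matches("jen", "jennifer"))               # False
```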
5571 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
5572 },
5573 },
5574 ],
5575 "infoTypes": [ # List of infoTypes this rule set is applied to.
5576 { # Type of information detected by the API.
5577 "name": "A String", # Name of the information type. Either a name of your choosing when
5578 # creating a CustomInfoType, or one of the names listed
5579 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5580 # a built-in type. InfoType names should conform to the pattern
5581 # `[a-zA-Z0-9_]{1,64}`.
5582 },
5583 ],
5584 },
5585 ],
5586 "contentOptions": [ # List of options defining data content to scan.
5587 # If empty, text, images, and other content will be included.
5588 "A String",
5589 ],
5590 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to
5591 # InfoType values returned by ListInfoTypes or listed at
5592 # https://cloud.google.com/dlp/docs/infotypes-reference.
5593 #
5594 # When no InfoTypes or CustomInfoTypes are specified in a request, the
5595 # system may automatically choose what detectors to run. By default this may
5596 # be all types, but may change over time as detectors are updated.
5597 #
5598 # If you need precise control and predictability as to what detectors are
5599 # run you should specify specific InfoTypes listed in the reference,
5600 # otherwise a default list will be used, which may change over time.
5601 { # Type of information detected by the API.
5602 "name": "A String", # Name of the information type. Either a name of your choosing when
5603 # creating a CustomInfoType, or one of the names listed
5604 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5605 # a built-in type. InfoType names should conform to the pattern
5606 # `[a-zA-Z0-9_]{1,64}`.
5607 },
5608 ],
5609 },
5610 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
5611 # `inspect_config` will be merged into the values persisted as part of the
5612 # template.
5613 "actions": [ # Actions to execute at the completion of the job.
5614 { # A task to execute on the completion of a job.
5615 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
5616 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
5617 # OutputStorageConfig. Only a single instance of this action can be
5618 # specified.
5619 # Compatible with: Inspect, Risk
5620 "outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.
5621 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
5622 # dataset. If table_id is not set a new one will be generated
5623 # for you with the following format:
5624 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
5625 # generating the date details.
5626 #
5627 # For Inspect, each column in an existing output table must have the same
5628 # name, type, and mode of a field in the `Finding` object.
5629 #
5630 # For Risk, an existing output table should be the output of a previous
5631 # Risk analysis job run on the same source table, with the same privacy
5632 # metric and quasi-identifiers. Risk jobs that analyze the same table but
5633 # compute a different privacy metric, or use different sets of
5634 # quasi-identifiers, cannot store their results in the same table.
5635 # identified by its project_id, dataset_id, and table_name. Within a query
5636 # a table is often referenced with a string in the format of:
5637 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
5638 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
5639 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
5640 # If omitted, project ID is inferred from the API call.
5641 "tableId": "A String", # Name of the table.
5642 "datasetId": "A String", # Dataset ID of the table.
5643 },
5644 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
5645 # used for Inspect and must be unspecified for Risk jobs. Columns are derived
5646 # from the `Finding` object. If appending to an existing table, any columns
5647 # from the predefined schema that are missing will be added. No columns in
5648 # the existing table will be deleted.
5649 #
5650 # If unspecified, then all available columns will be used for a new table or
5651 # an (existing) table with no schema, and no changes will be made to an
5652 # existing table that has a schema.
5653 # Only for use with external storage.
5654 },
5655 },
5656 "jobNotificationEmails": { # Enable email notification to project owners and editors on a job's
5657 # completion or failure.
5659 },
5660 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
5661 # Command Center (CSCC Alpha).
5662 # This action is only available for projects that are part of
5663 # an organization and whitelisted for the alpha Cloud Security Command
5664 # Center.
5665 # The action will publish the count of finding instances and their info types.
5666 # The summary of findings will be persisted in CSCC and is governed by CSCC
5667 # service-specific policy, see https://cloud.google.com/terms/service-terms
5668 # Only a single instance of this action can be specified.
5669 # Compatible with: Inspect
5670 },
5671 "publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.
5672 # will publish a metric to Stackdriver for each infoType requested, recording
5673 # how many findings were found for it. CustomDetectors will be bucketed
5674 # as 'Custom' under the Stackdriver label 'info_type'.
5675 },
5676 "publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
5677 # results of the DlpJob will be applied to the entry for the resource scanned
5678 # in Cloud Data Catalog. Any labels previously written by another DlpJob will
5679 # be deleted. InfoType naming patterns are strictly enforced when using this
5680 # feature. Note that the findings will be persisted in Cloud Data Catalog
5681 # storage and are governed by Data Catalog service-specific policy, see
5682 # https://cloud.google.com/terms/service-terms
5683 # Only a single instance of this action can be specified and only allowed if
5684 # all resources being scanned are BigQuery tables.
5685 # Compatible with: Inspect
5686 },
5687 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
5688 # message contains a single field, `DlpJobName`, which is equal to the
5689 # finished job's
5690 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
5691 # Compatible with: Inspect, Risk
5692 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
5693 # publishing access rights to the DLP API service account executing
5694 # the long running DlpJob sending the notifications.
5695 # Format is projects/{project}/topics/{topic}.
5696 },
5697 },
5698 ],
5699 },
5700 },
5701 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job.
5702 "infoTypeStats": [ # Statistics of how many instances of each info type were found during
5703 # inspect job.
5704 { # Statistics regarding a specific InfoType.
5705 "count": "A String", # Number of findings for this infoType.
5706 "infoType": { # Type of information detected by the API. # The type of finding this stat is for.
5707 "name": "A String", # Name of the information type. Either a name of your choosing when
5708 # creating a CustomInfoType, or one of the names listed
5709 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
5710 # a built-in type. InfoType names should conform to the pattern
5711 # `[a-zA-Z0-9_]{1,64}`.
5712 },
5713 },
5714 ],
5715 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process.
5716 "processedBytes": "A String", # Total size in bytes that were processed.
5717 "hybridStats": { # Statistics related to processing hybrid inspect requests. # Statistics related to the processing of hybrid inspect.
5718 # Early access feature is in a pre-release state and might change or have
5719 # limited support. For more information, see
5720 # https://cloud.google.com/products#product-launch-stages.
5721 "abortedCount": "A String", # The number of hybrid inspection requests aborted because the job ran
5722 # out of quota or was ended before they could be processed.
5723 "pendingCount": "A String", # The number of hybrid requests currently being processed. Only populated
5724 # when called via method `getDlpJob`.
5725 # A burst of traffic may cause hybrid inspect requests to be enqueued.
5726 # Processing will take place as quickly as possible, but resource limitations
5727 # may impact how long a request is enqueued for.
5728 "processedCount": "A String", # The number of hybrid inspection requests processed within this job.
5729 },
5730 },
5731 },
5732 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source.
5733 "numericalStatsResult": { # Result of the numerical stats computation. # Numerical stats result
5734 "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal
5735 # sized buckets.
5736 { # Set of primitive values supported by the system.
5737 # Note that for the purposes of inspection or transformation, the number
5738 # of bytes considered to comprise a 'Value' is based on its representation
5739 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
5740 # 123456789, the number of bytes would be counted as 9, even though an
5741 # int64 only holds up to 8 bytes of data.
5742 "floatValue": 3.14, # float
5743 "timestampValue": "A String", # timestamp
5744 "dayOfWeekValue": "A String", # day of week
5745 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
5746 # or are specified elsewhere. An API may choose to allow leap seconds. Related
5747 # types are google.type.Date and `google.protobuf.Timestamp`.
5748 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
5749 # to allow the value "24:00:00" for scenarios like business closing time.
5750 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
5751 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
5752 # allow the value 60 if it allows leap-seconds.
5753 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
5754 },
5755 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
5756 # and time zone are either specified elsewhere or are not significant. The date
5757 # is relative to the Proleptic Gregorian Calendar. This can represent:
5758 #
5759 # * A full date, with non-zero year, month and day values
5760 # * A month and day value, with a zero year, e.g. an anniversary
5761 # * A year on its own, with zero month and day values
5762 # * A year and month value, with a zero day, e.g. a credit card expiration date
5763 #
5764 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
5765 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
5766 # month and day.
5767 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
5768 # if specifying a year by itself or a year and month where the day is not
5769 # significant.
5770 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
5771 # a year.
5772 },
5773 "stringValue": "A String", # string
5774 "booleanValue": True or False, # boolean
5775 "integerValue": "A String", # integer
5776 },
5777 ],
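# Illustration (not part of the API schema): a minimal sketch of the byte
# counting rule stated for `Value` above, covering only the integer example
# from the comment. The helper name is hypothetical, and how the service
# renders floats, dates, or timestamps as strings is not specified here.

```python
def value_size_in_bytes(value):
    """Count the bytes of a primitive value's UTF-8 string representation."""
    return len(str(value).encode("utf-8"))


# Nine decimal digits encode to nine bytes, although an int64 occupies eight.
print(value_size_in_bytes(123456789))  # 9
```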
5778 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column.
5779 # Note that for the purposes of inspection or transformation, the number
5780 # of bytes considered to comprise a 'Value' is based on its representation
5781 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
5782 # 123456789, the number of bytes would be counted as 9, even though an
5783 # int64 only holds up to 8 bytes of data.
5784 "floatValue": 3.14, # float
5785 "timestampValue": "A String", # timestamp
5786 "dayOfWeekValue": "A String", # day of week
5787 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
5788 # or are specified elsewhere. An API may choose to allow leap seconds. Related
5789 # types are google.type.Date and `google.protobuf.Timestamp`.
5790 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
5791 # to allow the value "24:00:00" for scenarios like business closing time.
5792 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
5793 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
5794 # allow the value 60 if it allows leap-seconds.
5795 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
5796 },
5797 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
5798 # and time zone are either specified elsewhere or are not significant. The date
5799 # is relative to the Proleptic Gregorian Calendar. This can represent:
5800 #
5801 # * A full date, with non-zero year, month and day values
5802 # * A month and day value, with a zero year, e.g. an anniversary
5803 # * A year on its own, with zero month and day values
5804 # * A year and month value, with a zero day, e.g. a credit card expiration date
5805 #
5806 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
5807 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
5808 # month and day.
5809 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
5810 # if specifying a year by itself or a year and month where the day is not
5811 # significant.
5812 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
5813 # a year.
5814 },
5815 "stringValue": "A String", # string
5816 "booleanValue": True or False, # boolean
5817 "integerValue": "A String", # integer
5818 },
5819 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column.
5820 # Note that for the purposes of inspection or transformation, the number
5821 # of bytes considered to comprise a 'Value' is based on its representation
5822 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
5823 # 123456789, the number of bytes would be counted as 9, even though an
5824 # int64 only holds up to 8 bytes of data.
5825 "floatValue": 3.14, # float
5826 "timestampValue": "A String", # timestamp
5827 "dayOfWeekValue": "A String", # day of week
5828 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
5829 # or are specified elsewhere. An API may choose to allow leap seconds. Related
5830 # types are google.type.Date and `google.protobuf.Timestamp`.
5831 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
5832 # to allow the value "24:00:00" for scenarios like business closing time.
5833 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
5834 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
5835 # allow the value 60 if it allows leap-seconds.
5836 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
5837 },
5838 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
5839 # and time zone are either specified elsewhere or are not significant. The date
5840 # is relative to the Proleptic Gregorian Calendar. This can represent:
5841 #
5842 # * A full date, with non-zero year, month and day values
5843 # * A month and day value, with a zero year, e.g. an anniversary
5844 # * A year on its own, with zero month and day values
5845 # * A year and month value, with a zero day, e.g. a credit card expiration date
5846 #
5847 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
5848 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
5849 # month and day.
5850 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
5851 # if specifying a year by itself or a year and month where the day is not
5852 # significant.
5853 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
5854 # a year.
5855 },
5856 "stringValue": "A String", # string
5857 "booleanValue": True or False, # boolean
5858 "integerValue": "A String", # integer
5859 },
5860 },
5861 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an # K-map result
5862 # estimation, not exact values.
5863 "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value
5864 # doesn't correspond to any such interval, the associated frequency is
5865 # zero. For example, the following records:
5866 # {min_anonymity: 1, max_anonymity: 1, frequency: 17}
5867 # {min_anonymity: 2, max_anonymity: 3, frequency: 42}
5868 # {min_anonymity: 5, max_anonymity: 10, frequency: 99}
5869 # mean that there are no records with an estimated anonymity of 4 or
5870 # larger than 10.
5871 { # A KMapEstimationHistogramBucket message with the following values:
5872 # min_anonymity: 3
5873 # max_anonymity: 5
5874 # frequency: 42
5875 # means that there are 42 records whose quasi-identifier values correspond
5876 # to 3, 4 or 5 people in the overlying population. An important particular
5877 # case is when min_anonymity = max_anonymity = 1: the frequency field then
5878 # corresponds to the number of uniquely identifiable records.
5879 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
5880 # number of classes returned per bucket is capped at 20.
5881 { # A tuple of values for the quasi-identifier columns.
5882 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values.
5883 "quasiIdsValues": [ # The quasi-identifier values.
5884 { # Set of primitive values supported by the system.
5885 # Note that for the purposes of inspection or transformation, the number
5886 # of bytes considered to comprise a 'Value' is based on its representation
5887 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
5888 # 123456789, the number of bytes would be counted as 9, even though an
5889 # int64 only holds up to 8 bytes of data.
5890 "floatValue": 3.14, # float
5891 "timestampValue": "A String", # timestamp
5892 "dayOfWeekValue": "A String", # day of week
5893 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
5894 # or are specified elsewhere. An API may choose to allow leap seconds. Related
5895 # types are google.type.Date and `google.protobuf.Timestamp`.
5896 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
5897 # to allow the value "24:00:00" for scenarios like business closing time.
5898 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
5899 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
5900 # allow the value 60 if it allows leap-seconds.
5901 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
5902 },
5903 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
5904 # and time zone are either specified elsewhere or are not significant. The date
5905 # is relative to the Proleptic Gregorian Calendar. This can represent:
5906 #
5907 # * A full date, with non-zero year, month and day values
5908 # * A month and day value, with a zero year, e.g. an anniversary
5909 # * A year on its own, with zero month and day values
5910 # * A year and month value, with a zero day, e.g. a credit card expiration date
5911 #
5912 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
5913 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
5914 # month and day.
5915 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
5916 # if specifying a year by itself or a year and month where the day is not
5917 # significant.
5918 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
5919 # a year.
5920 },
5921 "stringValue": "A String", # string
5922 "booleanValue": True or False, # boolean
5923 "integerValue": "A String", # integer
5924 },
5925 ],
5926 },
5927 ],
5928 "minAnonymity": "A String", # Always positive.
5929 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
5930 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity.
5931 "bucketSize": "A String", # Number of records within these anonymity bounds.
5932 },
5933 ],
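# Illustration (not part of the API schema): a small sketch of how the
# non-overlapping [min_anonymity, max_anonymity] intervals described for
# `kMapEstimationHistogram` can be read. The bucket dicts reuse the notation
# of the comment above; an actual response uses the camelCase fields shown in
# this schema, with int64 values encoded as strings, so a real client would
# convert them first. `frequency_for` is a hypothetical helper.

```python
# Example buckets copied from the comment on kMapEstimationHistogram.
buckets = [
    {"min_anonymity": 1, "max_anonymity": 1, "frequency": 17},
    {"min_anonymity": 2, "max_anonymity": 3, "frequency": 42},
    {"min_anonymity": 5, "max_anonymity": 10, "frequency": 99},
]


def frequency_for(anonymity, buckets):
    """Return the frequency of the bucket covering `anonymity`, or 0 if none does."""
    for bucket in buckets:
        interval = range(bucket["min_anonymity"], bucket["max_anonymity"] + 1)
        if anonymity in interval:
            return bucket["frequency"]
    return 0


print(frequency_for(1, buckets))   # 17
print(frequency_for(4, buckets))   # 0, no interval covers 4
print(frequency_for(5, buckets))   # 99
print(frequency_for(11, buckets))  # 0, larger than every interval
```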
5934 },
5935 "kAnonymityResult": { # Result of the k-anonymity computation. # K-anonymity result
5936 "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes.
5937 { # Histogram of k-anonymity equivalence classes.
5938 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
5939 # classes returned per bucket is capped at 20.
5940 { # The set of columns' values that share the same ldiversity value
5941 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per
5942 # quasi-identifier column in the original KAnonymity metric message.
5943 # The order is always the same as the original request.
5944 { # Set of primitive values supported by the system.
5945 # Note that for the purposes of inspection or transformation, the number
5946 # of bytes considered to comprise a 'Value' is based on its representation
5947 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
5948 # 123456789, the number of bytes would be counted as 9, even though an
5949 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07005950 "floatValue": 3.14, # float
5951 "timestampValue": "A String", # timestamp
5952 "dayOfWeekValue": "A String", # day of week
5953 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005954 # or are specified elsewhere. An API may choose to allow leap seconds. Related
5955 # types are google.type.Date and `google.protobuf.Timestamp`.
5956 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
5957 # to allow the value "24:00:00" for scenarios like business closing time.
5958 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
5959 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
5960 # allow the value 60 if it allows leap-seconds.
5961 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
5962 },
Dan O'Mearadd494642020-05-01 07:42:23 -07005963 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005964 # and time zone are either specified elsewhere or are not significant. The date
5965 # is relative to the Proleptic Gregorian Calendar. This can represent:
5966 #
5967 # * A full date, with non-zero year, month and day values
5968 # * A month and day value, with a zero year, e.g. an anniversary
5969 # * A year on its own, with zero month and day values
5970 # * A year and month value, with a zero day, e.g. a credit card expiration date
5971 #
5972 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07005973 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
5974 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005975 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
5976 # if specifying a year by itself or a year and month where the day is not
5977 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07005978 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
5979 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005980 },
Dan O'Mearadd494642020-05-01 07:42:23 -07005981 "stringValue": "A String", # string
5982 "booleanValue": True or False, # boolean
5983 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005984 },
5985 ],
5986 "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the
5987 # above set of values.
5988 },
5989 ],
5990 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
5991 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket.
5992 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket.
5993 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
5994 },
5995 ],
5996 },
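A minimal sketch of reading a `kAnonymityResult` shaped like the structure above, assuming the response has already been parsed into a Python dict. The helper name `smallest_equivalence_class` and the sample data are hypothetical. Size fields are listed as "A String" in this reference, so the sketch converts them with `int()` before comparing.

```python
def smallest_equivalence_class(k_anonymity_result):
    """Return the smallest equivalenceClassSizeLowerBound across all buckets.

    Returns None when the result contains no histogram buckets.
    """
    buckets = k_anonymity_result.get("equivalenceClassHistogramBuckets", [])
    lower_bounds = [
        int(bucket["equivalenceClassSizeLowerBound"])  # sizes arrive as strings
        for bucket in buckets
        if "equivalenceClassSizeLowerBound" in bucket
    ]
    return min(lower_bounds) if lower_bounds else None


# Hypothetical sample data, not real API output.
sample_result = {
    "equivalenceClassHistogramBuckets": [
        {
            "equivalenceClassSizeLowerBound": "1",
            "equivalenceClassSizeUpperBound": "1",
            "bucketSize": "12",
        },
        {
            "equivalenceClassSizeLowerBound": "2",
            "equivalenceClassSizeUpperBound": "5",
            "bucketSize": "40",
        },
    ]
}
print(smallest_equivalence_class(sample_result))
```

The smallest lower bound is the k that the dataset satisfies under this reading of the histogram; a value of 1 would mean at least one equivalence class contains a single row.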
"lDiversityResult": { # Result of the l-diversity computation. # L-diversity result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07005998 "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies.
Dan O'Mearadd494642020-05-01 07:42:23 -07005999 { # Histogram of l-diversity equivalence class sensitive value frequencies.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006000 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
6001 # classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same l-diversity value.
6003 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class.
6004 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence
6005 # class. The order is always the same as the original request.
6006 { # Set of primitive values supported by the system.
6007 # Note that for the purposes of inspection or transformation, the number
6008 # of bytes considered to comprise a 'Value' is based on its representation
6009 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
6010 # 123456789, the number of bytes would be counted as 9, even though an
6011 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07006012 "floatValue": 3.14, # float
6013 "timestampValue": "A String", # timestamp
6014 "dayOfWeekValue": "A String", # day of week
6015 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006016 # or are specified elsewhere. An API may choose to allow leap seconds. Related
6017 # types are google.type.Date and `google.protobuf.Timestamp`.
6018 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
6019 # to allow the value "24:00:00" for scenarios like business closing time.
6020 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
6021 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
6022 # allow the value 60 if it allows leap-seconds.
6023 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
6024 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006025 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006026 # and time zone are either specified elsewhere or are not significant. The date
6027 # is relative to the Proleptic Gregorian Calendar. This can represent:
6028 #
6029 # * A full date, with non-zero year, month and day values
6030 # * A month and day value, with a zero year, e.g. an anniversary
6031 # * A year on its own, with zero month and day values
6032 # * A year and month value, with a zero day, e.g. a credit card expiration date
6033 #
6034 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07006035 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
6036 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006037 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
6038 # if specifying a year by itself or a year and month where the day is not
6039 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07006040 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
6041 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006042 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006043 "stringValue": "A String", # string
6044 "booleanValue": True or False, # boolean
6045 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006046 },
6047 ],
6048 "topSensitiveValues": [ # Estimated frequencies of top sensitive values.
6049 { # A value of a field, including its frequency.
6050 "count": "A String", # How many times the value is contained in the field.
6051 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
6052 # Note that for the purposes of inspection or transformation, the number
6053 # of bytes considered to comprise a 'Value' is based on its representation
6054 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
6055 # 123456789, the number of bytes would be counted as 9, even though an
6056 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07006057 "floatValue": 3.14, # float
6058 "timestampValue": "A String", # timestamp
6059 "dayOfWeekValue": "A String", # day of week
6060 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006061 # or are specified elsewhere. An API may choose to allow leap seconds. Related
6062 # types are google.type.Date and `google.protobuf.Timestamp`.
6063 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
6064 # to allow the value "24:00:00" for scenarios like business closing time.
6065 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
6066 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
6067 # allow the value 60 if it allows leap-seconds.
6068 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
6069 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006070 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006071 # and time zone are either specified elsewhere or are not significant. The date
6072 # is relative to the Proleptic Gregorian Calendar. This can represent:
6073 #
6074 # * A full date, with non-zero year, month and day values
6075 # * A month and day value, with a zero year, e.g. an anniversary
6076 # * A year on its own, with zero month and day values
6077 # * A year and month value, with a zero day, e.g. a credit card expiration date
6078 #
6079 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07006080 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
6081 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006082 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
6083 # if specifying a year by itself or a year and month where the day is not
6084 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07006085 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
6086 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006087 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006088 "stringValue": "A String", # string
6089 "booleanValue": True or False, # boolean
6090 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006091 },
6092 },
6093 ],
6094 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class.
6095 },
6096 ],
6097 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
6098 "bucketSize": "A String", # Total number of equivalence classes in this bucket.
6099 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence
6100 # classes in this bucket.
6101 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence
6102 # classes in this bucket.
6103 },
6104 ],
6105 },
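The `Value` objects that appear throughout these results set one of several typed fields. The sketch below shows one possible way to turn such a dict into a native Python value. The function name `value_to_python` is hypothetical, and the choice to return partial dates and time-of-day dicts unchanged is an assumption made to keep the example short.

```python
import datetime


def value_to_python(value):
    """Convert a DLP-style Value dict into a native Python value.

    Only one typed field is expected to be set. Returns None for an empty dict.
    """
    if "integerValue" in value:
        # integerValue is listed as a string in this reference.
        return int(value["integerValue"])
    if "floatValue" in value:
        return float(value["floatValue"])
    if "booleanValue" in value:
        return bool(value["booleanValue"])
    if "dateValue" in value:
        date = value["dateValue"]
        year = date.get("year", 0)
        month = date.get("month", 0)
        day = date.get("day", 0)
        if year and month and day:
            return datetime.date(year, month, day)
        # Partial date (a zero year, month or day): keep the raw dict.
        return date
    # Strings, timestamps, days of week and times of day are passed through.
    for key in ("stringValue", "timestampValue", "dayOfWeekValue", "timeValue"):
        if key in value:
            return value[key]
    return None


print(value_to_python({"integerValue": "123456789"}))
print(value_to_python({"dateValue": {"year": 2020, "month": 5, "day": 1}}))
```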
6106 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute.
Dan O'Mearadd494642020-05-01 07:42:23 -07006107 "numericalStatsConfig": { # Compute numerical stats over an individual column, including # Numerical stats
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006108 # min, max, and quantiles.
6109 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are
6110 # integer, float, date, datetime, timestamp, time.
6111 "name": "A String", # Name describing the field.
6112 },
6113 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006114 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what # k-map
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006115 # is called "journalist risk" in the literature, except the attack dataset is
6116 # statistically modeled instead of being perfectly known. This can be done
6117 # using publicly available data (like the US Census), or using a custom
6118 # statistical model (indicated as one or several BigQuery tables), or by
6119 # extrapolating from the distribution of values in the input dataset.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006120 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
Dan O'Mearadd494642020-05-01 07:42:23 -07006121 # Set if no column is tagged with a region-specific InfoType (like
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006122 # US_ZIP_5) or a region code.
Dan O'Mearadd494642020-05-01 07:42:23 -07006123 "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two columns can have the
6124 # same tag.
6125 { # A column with a semantic tag attached.
6126 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006127 "name": "A String", # Name describing the field.
6128 },
6129 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
6130 # indicate an auxiliary table that contains statistical information on
6131 # the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
6133 # dataset as a statistical model of population, if available. We
6134 # currently support US ZIP codes, region codes, ages and genders.
6135 # To programmatically obtain the list of supported InfoTypes, use
6136 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
6137 "name": "A String", # Name of the information type. Either a name of your choosing when
6138 # creating a CustomInfoType, or one of the names listed
6139 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
6140 # a built-in type. InfoType names should conform to the pattern
Dan O'Mearadd494642020-05-01 07:42:23 -07006141 # `[a-zA-Z0-9_]{1,64}`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006142 },
"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
#
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is empty JSON object `{}`.
6153 },
6154 },
6155 ],
6156 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
6157 # used to tag a quasi-identifiers column must appear in exactly one column
6158 # of one auxiliary table.
6159 { # An auxiliary table contains statistical information on the relative
6160 # frequency of different quasi-identifiers values. It has one or several
6161 # quasi-identifiers columns, and one column that indicates the relative
6162 # frequency of each quasi-identifier tuple.
6163 # If a tuple is present in the data but not in the auxiliary table, the
6164 # corresponding relative frequency is assumed to be zero (and thus, the
6165 # tuple is highly reidentifiable).
Dan O'Mearadd494642020-05-01 07:42:23 -07006166 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Required. Auxiliary table location.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006167 # identified by its project_id, dataset_id, and table_name. Within a query
6168 # a table is often referenced with a string in the format of:
Dan O'Mearadd494642020-05-01 07:42:23 -07006169 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
6170 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006171 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
6172 # If omitted, project ID is inferred from the API call.
6173 "tableId": "A String", # Name of the table.
6174 "datasetId": "A String", # Dataset ID of the table.
6175 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006176 "quasiIds": [ # Required. Quasi-identifier columns.
6177 { # A quasi-identifier column has a custom_tag, used to know which column
6178 # in the data corresponds to which column in the statistical model.
6179 "field": { # General identifier of a data field in a storage service. # Identifies the column.
6180 "name": "A String", # Name describing the field.
6181 },
"customTag": "A String", # An auxiliary field.
6183 },
6184 ],
6185 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
6186 # between 0 and 1 (inclusive). Null values are assumed to be zero.
6187 "name": "A String", # Name describing the field.
6188 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006189 },
6190 ],
6191 },
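The `kMapEstimationConfig` comments above state two constraints on custom tags: no two quasi-identifier columns share a tag, and each tag must appear in exactly one column of one auxiliary table. A minimal client-side check could look like the sketch below. The function name `check_custom_tags` and the sample field names are hypothetical, and the check covers only `customTag` values, not `infoType` tags.

```python
from collections import Counter


def check_custom_tags(k_map_config):
    """Return a list of problems with customTag usage; empty means no problem found."""
    problems = []

    # Tags attached to quasi-identifier columns of the input data.
    used_tags = [
        quasi_id["customTag"]
        for quasi_id in k_map_config.get("quasiIds", [])
        if "customTag" in quasi_id
    ]
    for tag, count in Counter(used_tags).items():
        if count > 1:
            problems.append(
                f"customTag {tag!r} is used by {count} quasi-identifier columns"
            )

    # Tags declared on auxiliary table columns, counted across all tables.
    aux_tag_counts = Counter(
        quasi_id["customTag"]
        for table in k_map_config.get("auxiliaryTables", [])
        for quasi_id in table.get("quasiIds", [])
        if "customTag" in quasi_id
    )
    for tag in sorted(set(used_tags)):
        if aux_tag_counts[tag] != 1:
            problems.append(
                f"customTag {tag!r} appears in {aux_tag_counts[tag]} "
                "auxiliary table columns, expected 1"
            )
    return problems


# Hypothetical configuration for illustration only.
sample_config = {
    "quasiIds": [
        {"field": {"name": "age"}, "customTag": "age_bucket"},
        {"field": {"name": "zip"}, "infoType": {"name": "US_ZIP_5"}},
    ],
    "auxiliaryTables": [
        {
            "table": {"projectId": "my-project", "datasetId": "stats", "tableId": "ages"},
            "quasiIds": [{"field": {"name": "age"}, "customTag": "age_bucket"}],
            "relativeFrequency": {"name": "freq"},
        }
    ],
}
print(check_custom_tags(sample_config))
```

Running the check before submitting a job is only a convenience; the server remains the authority on whether a configuration is valid.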
Dan O'Mearadd494642020-05-01 07:42:23 -07006192 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. # l-diversity
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006193 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value.
6194 "name": "A String", # Name describing the field.
6195 },
6196 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are
6197 # defined for the l-diversity computation. When multiple fields are
6198 # specified, they are considered a single composite key.
6199 { # General identifier of a data field in a storage service.
6200 "name": "A String", # Name describing the field.
6201 },
6202 ],
6203 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006204 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. # K-anonymity
"entityId": { # Message indicating that multiple rows might be associated with a
# single individual. If the same entity_id is associated with multiple
# quasi-identifier tuples over distinct rows, we consider the entire
# collection of tuples as the composite quasi-identifier. This collection
# is a multiset: the order in which the different tuples appear in the
# dataset is ignored, but their frequency is taken into account.
#
# Important note: a maximum of 1000 rows can be associated with a single
# entity ID. If more rows are associated with the same entity ID, some
# might be ignored.
#
# An entity in a dataset is a field or set of fields that correspond to a
# single person. For example, in medical records the `EntityId` might be a
# patient identifier, or for financial records it might be an account
# identifier. This message is used when generalizations or analysis must take
# into account that multiple rows correspond to the same entity.
6219 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier.
6220 "name": "A String", # Name describing the field.
6221 },
6222 },
6223 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are
6224 # specified, they are considered a single composite key. Structs and
6225 # repeated data types are not supported; however, nested fields are
6226 # supported so long as they are not structs themselves or nested within
6227 # a repeated field.
6228 { # General identifier of a data field in a storage service.
6229 "name": "A String", # Name describing the field.
6230 },
6231 ],
6232 },
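As an illustration of the `kAnonymityConfig` shape described above, the sketch below assembles the dict from plain column names. The helper `build_k_anonymity_config` and the column names are hypothetical; only the key names come from this reference.

```python
def build_k_anonymity_config(quasi_id_names, entity_id_field=None):
    """Build a privacy-metric dict containing a kAnonymityConfig.

    quasi_id_names: column names treated together as one composite key.
    entity_id_field: optional column that identifies the entity behind each row.
    """
    config = {"quasiIds": [{"name": name} for name in quasi_id_names]}
    if entity_id_field is not None:
        config["entityId"] = {"field": {"name": entity_id_field}}
    return {"kAnonymityConfig": config}


# Hypothetical column names for illustration only.
metric = build_k_anonymity_config(["age", "zip_code"], entity_id_field="patient_id")
print(metric)
```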
"categoricalStatsConfig": { # Compute categorical stats over an individual column, including # Categorical stats
6234 # number of distinct values and value count distribution.
6235 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are
6236 # supported except for arrays and structs. However, it may be more
6237 # informative to use NumericalStats when the field type is supported,
6238 # depending on the data.
6239 "name": "A String", # Name describing the field.
6240 },
6241 },
6242 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to # delta-presence
6243 # figure out that one given individual appears in a de-identified dataset.
6244 # Similarly to the k-map metric, we cannot compute δ-presence exactly without
6245 # knowing the attack dataset, so we use a statistical model instead.
6246 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
6247 # Set if no column is tagged with a region-specific InfoType (like
6248 # US_ZIP_5) or a region code.
6249 "quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two fields can have the
6250 # same tag.
6251 { # A column with a semantic tag attached.
6252 "field": { # General identifier of a data field in a storage service. # Required. Identifies the column.
6253 "name": "A String", # Name describing the field.
6254 },
6255 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
6256 # indicate an auxiliary table that contains statistical information on
6257 # the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
6259 # dataset as a statistical model of population, if available. We
6260 # currently support US ZIP codes, region codes, ages and genders.
6261 # To programmatically obtain the list of supported InfoTypes, use
6262 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
6263 "name": "A String", # Name of the information type. Either a name of your choosing when
6264 # creating a CustomInfoType, or one of the names listed
6265 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
6266 # a built-in type. InfoType names should conform to the pattern
6267 # `[a-zA-Z0-9_]{1,64}`.
6268 },
"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
#
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is empty JSON object `{}`.
6279 },
6280 },
6281 ],
6282 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
6283 # used to tag a quasi-identifiers field must appear in exactly one
6284 # field of one auxiliary table.
6285 { # An auxiliary table containing statistical information on the relative
6286 # frequency of different quasi-identifiers values. It has one or several
6287 # quasi-identifiers columns, and one column that indicates the relative
6288 # frequency of each quasi-identifier tuple.
6289 # If a tuple is present in the data but not in the auxiliary table, the
6290 # corresponding relative frequency is assumed to be zero (and thus, the
6291 # tuple is highly reidentifiable).
6292 "relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number
6293 # between 0 and 1 (inclusive). Null values are assumed to be zero.
6294 "name": "A String", # Name describing the field.
6295 },
6296 "quasiIds": [ # Required. Quasi-identifier columns.
6297 { # A quasi-identifier column has a custom_tag, used to know which column
6298 # in the data corresponds to which column in the statistical model.
6299 "field": { # General identifier of a data field in a storage service. # Identifies the column.
6300 "name": "A String", # Name describing the field.
6301 },
6302 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
6303 # indicate an auxiliary table that contains statistical information on
6304 # the possible values of this column (below).
6305 },
6306 ],
6307 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Required. Auxiliary table location.
6308 # identified by its project_id, dataset_id, and table_name. Within a query
6309 # a table is often referenced with a string in the format of:
6310 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
6311 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
6312 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
6313 # If omitted, project ID is inferred from the API call.
6314 "tableId": "A String", # Name of the table.
6315 "datasetId": "A String", # Dataset ID of the table.
6316 },
6317 },
6318 ],
6319 },
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006320 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006321 "categoricalStatsResult": { # Result of the categorical stats computation. # Categorical stats result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006322 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column.
Dan O'Mearadd494642020-05-01 07:42:23 -07006323 { # Histogram of value frequencies in the column.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006324 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of
6325 # values returned per bucket is capped at 20.
6326 { # A value of a field, including its frequency.
6327 "count": "A String", # How many times the value is contained in the field.
6328 "value": { # Set of primitive values supported by the system. # A value contained in the field in question.
6329 # Note that for the purposes of inspection or transformation, the number
6330 # of bytes considered to comprise a 'Value' is based on its representation
6331 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
6332 # 123456789, the number of bytes would be counted as 9, even though an
6333 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07006334 "floatValue": 3.14, # float
6335 "timestampValue": "A String", # timestamp
6336 "dayOfWeekValue": "A String", # day of week
6337 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006338 # or are specified elsewhere. An API may choose to allow leap seconds. Related
6339 # types are google.type.Date and `google.protobuf.Timestamp`.
6340 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
6341 # to allow the value "24:00:00" for scenarios like business closing time.
6342 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
6343 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
6344 # allow the value 60 if it allows leap-seconds.
6345 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
6346 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006347 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006348 # and time zone are either specified elsewhere or are not significant. The date
6349 # is relative to the Proleptic Gregorian Calendar. This can represent:
6350 #
6351 # * A full date, with non-zero year, month and day values
6352 # * A month and day value, with a zero year, e.g. an anniversary
6353 # * A year on its own, with zero month and day values
6354 # * A year and month value, with a zero day, e.g. a credit card expiration date
6355 #
6356 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
Dan O'Mearadd494642020-05-01 07:42:23 -07006357 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
6358 # month and day.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006359 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
6360 # if specifying a year by itself or a year and month where the day is not
6361 # significant.
Dan O'Mearadd494642020-05-01 07:42:23 -07006362 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
6363 # a year.
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006364 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006365 "stringValue": "A String", # string
6366 "booleanValue": True or False, # boolean
6367 "integerValue": "A String", # integer
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006368 },
6369 },
6370 ],
6371 "bucketValueCount": "A String", # Total number of distinct values in this bucket.
6372 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket.
6373 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket.
6374 "bucketSize": "A String", # Total number of values in this bucket.
6375 },
6376 ],
6377 },
Dan O'Mearadd494642020-05-01 07:42:23 -07006378 "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an # Delta-presence result
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006379 # estimation, not exact values.
6380 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a
6381 # value doesn't correspond to any such interval, the associated frequency
6382 # is zero. For example, the following records:
6383 # {min_probability: 0, max_probability: 0.1, frequency: 17}
6384 # {min_probability: 0.2, max_probability: 0.3, frequency: 42}
6385 # {min_probability: 0.3, max_probability: 0.4, frequency: 99}
# mean that there are no records with an estimated probability in [0.1, 0.2)
# or greater than or equal to 0.4.
6388 { # A DeltaPresenceEstimationHistogramBucket message with the following
6389 # values:
6390 # min_probability: 0.1
6391 # max_probability: 0.2
6392 # frequency: 42
6393 # means that there are 42 records for which δ is in [0.1, 0.2). An
6394 # important particular case is when min_probability = max_probability = 1:
6395 # then, every individual who shares this quasi-identifier combination is in
6396 # the dataset.
6397 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
6398 # number of classes returned per bucket is capped at 20.
6399 { # A tuple of values for the quasi-identifier columns.
6400 "quasiIdsValues": [ # The quasi-identifier values.
6401 { # Set of primitive values supported by the system.
6402 # Note that for the purposes of inspection or transformation, the number
6403 # of bytes considered to comprise a 'Value' is based on its representation
6404 # as a UTF-8 encoded string. For example, if 'integer_value' is set to
6405 # 123456789, the number of bytes would be counted as 9, even though an
6406 # int64 only holds up to 8 bytes of data.
Dan O'Mearadd494642020-05-01 07:42:23 -07006407 "floatValue": 3.14, # float
6408 "timestampValue": "A String", # timestamp
6409 "dayOfWeekValue": "A String", # day of week
6410 "timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
Bu Sun Kim715bd7f2019-06-14 16:50:42 -07006411 # or are specified elsewhere. An API may choose to allow leap seconds. Related
6412 # types are google.type.Date and `google.protobuf.Timestamp`.
6413 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
6414 # to allow the value "24:00:00" for scenarios like business closing time.
6415 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
6416 "seconds": 42, # Seconds of the minute. Must normally be from 0 to 59. An API may
6417 # allow the value 60 if it allows leap seconds.
6418 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
6419 },
6420 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
6421 # and time zone are either specified elsewhere or are not significant. The date
6422 # is relative to the Proleptic Gregorian Calendar. This can represent:
6423 #
6424 # * A full date, with non-zero year, month and day values
6425 # * A month and day value, with a zero year, e.g. an anniversary
6426 # * A year on its own, with zero month and day values
6427 # * A year and month value, with a zero day, e.g. a credit card expiration date
6428 #
6429 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
6430 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
6431 # month and day.
6432 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
6433 # if specifying a year by itself or a year and month where the day is not
6434 # significant.
6435 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
6436 # a year.
6437 },
6438 "stringValue": "A String", # string
6439 "booleanValue": True or False, # boolean
6440 "integerValue": "A String", # integer
6441 },
6442 ],
6443 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these
6444 # quasi-identifier values is in the dataset. This value, typically called
6445 # δ, is the ratio between the number of records in the dataset with these
6446 # quasi-identifier values, and the total number of individuals (inside
6447 # *and* outside the dataset) with these quasi-identifier values.
6448 # For example, if there are 15 individuals in the dataset who share the
6449 # same quasi-identifier values, and an estimated 100 people in the entire
6450 # population with these values, then δ is 0.15.
6451 },
6452 ],
6453 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
6454 "bucketSize": "A String", # Number of records within these probability bounds.
6455 "maxProbability": 3.14, # Always greater than or equal to min_probability.
6456 "minProbability": 3.14, # Between 0 and 1.
6457 },
6458 ],
6459 },
6460 "requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over.
6461 # identified by its project_id, dataset_id, and table_name. Within a query
6462 # a table is often referenced with a string in the format of:
6463 # `&lt;project_id&gt;:&lt;dataset_id&gt;.&lt;table_id&gt;` or
6464 # `&lt;project_id&gt;.&lt;dataset_id&gt;.&lt;table_id&gt;`.
6465 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
6466 # If omitted, project ID is inferred from the API call.
6467 "tableId": "A String", # Name of the table.
6468 "datasetId": "A String", # Dataset ID of the table.
6469 },
6470 },
6471 "state": "A String", # State of a job.
6472 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that
6473 # instantiated the job.
6474 "startTime": "A String", # Time when the job started.
6475 "endTime": "A String", # Time when the job finished.
6476 "type": "A String", # The type of job.
6477 "createTime": "A String", # Time when the job was created.
6478 },
6479 ],
6480 }</pre>
6481</div>
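<div class="method">
<p>The estimatedProbability (δ) and histogram-bucket descriptions in the response above can be illustrated with a small sketch. This is not part of the client library: the helper names <code>estimate_delta</code> and <code>in_bucket</code> are hypothetical, and the treatment of the min_probability = max_probability = 1 case as an exact match is an assumption drawn from the bucket description, not confirmed server behavior.</p>

```python
def estimate_delta(records_in_dataset, estimated_population):
    """Ratio of dataset records to the estimated total population (delta).

    Hypothetical helper: mirrors the ratio described for estimatedProbability.
    """
    if not estimated_population > 0:
        raise ValueError("estimated_population must be positive")
    return records_in_dataset / estimated_population


def in_bucket(min_probability, max_probability, delta):
    """Return True if delta falls in the bucket [min_probability, max_probability).

    Assumption: when both bounds equal 1, the bucket holds only delta == 1,
    since a half-open interval [1, 1) would otherwise be empty.
    """
    if min_probability == 1 and max_probability == 1:
        return delta == 1
    return delta >= min_probability and max_probability > delta


# The worked example from the field description: 15 records in the dataset,
# an estimated 100 people in the whole population.
delta = estimate_delta(15, 100)
print(delta, in_bucket(0.1, 0.2, delta))
```

<p>The upper bound is exclusive, so a δ of exactly 0.2 would belong to the next bucket rather than to [0.1, 0.2).</p>
</div>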
6482
6483<div class="method">
6484 <code class="details" id="list_next">list_next(previous_request, previous_response)</code>
6485 <pre>Retrieves the next page of results.
6486
6487Args:
6488 previous_request: The request for the previous page. (required)
6489 previous_response: The response from the request for the previous page. (required)
6490
6491Returns:
6492 A request object that you can call 'execute()' on to request the next
6493 page. Returns None if there are no more items in the collection.
6494 </pre>
6495</div>
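<div class="method">
<p>A minimal sketch of how list_next is typically combined with list to walk every page. The helper name <code>iter_items</code> is hypothetical, and the default response key <code>"jobs"</code> is an assumption about the list response; adjust it if your response uses a different field. The helper only relies on the documented contract: each request has <code>execute()</code>, and <code>list_next</code> returns None when there are no more pages.</p>

```python
def iter_items(collection, first_request, items_key="jobs"):
    """Yield every item across all pages of a list call.

    collection:    the resource object exposing list_next(), for example the
                   object returned by service.projects().dlpJobs().
    first_request: the request returned by collection.list(...).
    items_key:     name of the repeated field in each response (assumed "jobs").
    """
    request = first_request
    while request is not None:
        response = request.execute()
        for item in response.get(items_key, []):
            yield item
        # list_next returns None once the collection is exhausted.
        request = collection.list_next(
            previous_request=request, previous_response=response
        )


# Usage with a real client (the project name is a placeholder):
#   jobs = service.projects().dlpJobs()
#   for job in iter_items(jobs, jobs.list(parent="projects/my-project")):
#       print(job["name"], job["state"])
```

<p>Writing the loop as a generator keeps only one page in memory at a time and lets the caller stop early without fetching the remaining pages.</p>
</div>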
6496
6497</body></html>