Blame - docs/dyn/dlp_v2.projects.jobTriggers.html - platform/external/python/google-api-python-client

<h1><a href="dlp_v2.html">Cloud Data Loss Prevention (DLP) API</a> . <a href="dlp_v2.projects.html">projects</a> . <a href="dlp_v2.projects.jobTriggers.html">jobTriggers</a></h1>

<h2>Instance Methods</h2>
<p class="toc_element">
<code><a href="#activate">activate(name, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Activate a job trigger. Causes the immediate execution of a trigger</p>

Dan O'Meara

dd49464

2020-05-01 07:42:23 -0700

[diff] [blame^]

<p class="toc_element">
<code><a href="#create">create(parent, body=None, x__xgafv=None)</a></code></p>

Bu Sun Kim

715bd7f

2019-06-14 16:50:42 -0700

[diff] [blame]

<p class="firstline">Creates a job trigger to run DLP actions such as scanning storage for</p>
<p class="toc_element">
<code><a href="#delete">delete(name, x__xgafv=None)</a></code></p>
<p class="firstline">Deletes a job trigger.</p>
<p class="toc_element">
<code><a href="#get">get(name, x__xgafv=None)</a></code></p>
<p class="firstline">Gets a job trigger.</p>
<p class="toc_element">
<code><a href="#list">list(parent, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None, locationId=None, filter=None)</a></code></p>
<p class="firstline">Lists job triggers.</p>
<p class="toc_element">
<code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
<code><a href="#patch">patch(name, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Updates a job trigger.</p>
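<p>The <code>list</code> and <code>list_next</code> methods above are meant to be used together. The following is a minimal sketch of that pagination loop, not a definitive implementation. It assumes that <code>service</code> is a DLP v2 client object (for example, one built with <code>googleapiclient.discovery.build("dlp", "v2")</code>), that the list response carries its results under a <code>jobTriggers</code> key, and that <code>list_next</code> returns <code>None</code> when no pages remain. The helper name <code>iter_job_triggers</code> and the <code>page_size</code> default are illustrative only.</p>

```python
def iter_job_triggers(service, parent, page_size=50):
    """Yield every job trigger under `parent`, following pagination.

    `service` is assumed to be a DLP v2 client object; it is passed in
    so that this sketch has no import-time dependencies.
    """
    triggers = service.projects().jobTriggers()
    request = triggers.list(parent=parent, pageSize=page_size)
    while request is not None:
        response = request.execute()
        # Assumption: the result list is stored under the "jobTriggers" key.
        for trigger in response.get("jobTriggers", []):
            yield trigger
        # Assumption: list_next returns None once there are no further pages.
        request = triggers.list_next(
            previous_request=request, previous_response=response
        )
```

<p>A caller would typically loop over the generator, for example <code>for trigger in iter_job_triggers(service, "projects/my-project")</code>, where <code>my-project</code> is a placeholder project ID. Using a generator keeps only one page of results in memory at a time.</p>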

<h3>Method Details</h3>
<code class="details" id="activate">activate(name, body=None, x__xgafv=None)</code>
<pre>Activate a job trigger. Causes the immediate execution of a trigger
instead of waiting on the trigger event to occur.

Args:
name: string, Required. Resource name of the trigger to activate, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)
body: object, The request body.
The object takes the form of:

{ # Request message for ActivateJobTrigger.
}

x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format

Returns:
An object of the form:

{ # Combines all of the information about a DLP job.
"errors": [ # A stream of errors encountered running the job.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # Detailed error codes and messages.
# The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"name": "A String", # The server-assigned name.

"inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source.
"requestedOptions": { # Snapshot of the inspection configuration. # The configuration used for this job.
"snapshotInspectTemplate": { # If run with an InspectTemplate, a snapshot of its state at the time of
# this run.
# The inspectTemplate contains a configuration (set of types of sensitive data
# to be detected) to be used anywhere you otherwise would normally specify
# InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates
# to learn more.
"updateTime": "A String", # Output only. The last update timestamp of an inspectTemplate.
"displayName": "A String", # Display name (max 256 chars).
"description": "A String", # Short description (max 256 chars).
"inspectConfig": { # The core content of the template. Configuration of the scanning process.
# Configuration description of the scanning process.
# When used with redactContent only info_types and min_likelihood are currently
# used.

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": { # Configuration to control the number of findings returned.
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectJobConfig`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing.
# Message for detecting output from deidentification transformations
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
# infoType, when the name matches one of the existing infoTypes and that infoType
# is specified in the `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in info type is
# not specified in the `InspectContent.info_types` list, then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
"dictionary": { # A list of phrases to detect as a CustomInfoType.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# Message for specifying a window around a finding to apply a detection
# rule.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It can still be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# Message for specifying a window around a finding to apply a detection
# rule.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},

"exclusionRule": { # Exclusion rule.
# The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # The InfoType list in an ExclusionRule drops a finding when it overlaps or
# is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},

"dictionary": { # Dictionary which defines the rule.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},
],

"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# If you need precise control and predictability as to what detectors are
# run, you should specify specific InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},
"createTime": "A String", # Output only. The creation timestamp of an inspectTemplate.
"name": "A String", # Output only. The template name.
#
# The template will have one of the following formats:
# `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR
# `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`.
},

"jobConfig": { # Controls what and how to inspect for findings. # Inspect config.
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"cloudStorageOptions": { # Google Cloud Storage options.
# Options defining a file or a set of files within a Google Cloud Storage
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage URL of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the URL ends in a trailing slash, the bucket or directory represented
# by the URL will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.

"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},

Dan O'Meara

dd49464

2020-05-01 07:42:23 -0700

[diff] [blame^]

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# The number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

561

"fileTypes": [ # List of file type groups to include in the scan.

562

# If empty, all files are scanned and available data format processors

563

# are applied. In addition, the binary content of the selected files

564

# is always scanned as well.

565

# Images are scanned only as binary if the specified region

566

# does not support image inspection and no file_types were specified.

567

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

568

"A String",

569

],

570

},
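The percentage limits in these options (`filesLimitPercent`, `bytesLimitPerFilePercent`) share the same arithmetic: the result is rounded down, and both 0 and 100 mean "no limit". A minimal sketch of that rule follows; the function name `apply_percent_limit` is hypothetical, and the service's exact rounding behaviour is assumed to be plain integer floor division.

```python
def apply_percent_limit(total, percent=0):
    """Return how many of `total` items would be scanned (hypothetical helper)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100, inclusive")
    # 0 (the default) and 100 both mean "no limit".
    if percent in (0, 100):
        return total
    # Assumption: "rounded down" is integer floor division.
    return (total * percent) // 100
```

For instance, 50 percent of 7 files gives 3 files under this reading.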

571

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.

579

"projectId": "A String", # The ID of the project to which the entities belong.

580

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

581

},

582

"kind": { # A representation of a Datastore kind. # The kind to process.

583

"name": "A String", # The name of the kind.

584

},

585

},

586

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

587

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

588

# inspection of entire columns which you know have no findings.

589

{ # General identifier of a data field in a storage service.

590

"name": "A String", # Name describing the field.

591

},

592

],

593

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

594

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

595

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

596

# Cannot be used in conjunction with TimespanConfig.

597

"sampleMethod": "A String",

598

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

599

# `actions.saveFindings.outputConfig.table` is specified, the values of

600

# columns specified here are available in the output table under

601

# `location.content_locations.record_location.record_key.id_values`. Nested

602

# fields such as `person.birthdate.year` are allowed.

603

{ # General identifier of a data field in a storage service.

604

"name": "A String", # Name describing the field.

605

},

606

],

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.

"tableReference": { # Message defining the location of a BigQuery table. # Complete BigQuery table reference.
# A table is uniquely identified by its project_id, dataset_id, and
# table_name. Within a query a table is often referenced with a string
# in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.

617

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

618

# If omitted, project ID is inferred from the API call.

619

"tableId": "A String", # Name of the table.

620

"datasetId": "A String", # Dataset ID of the table.

621

},

622

},

623

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

624

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

625

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

626

# Used for data sources like Datastore and BigQuery.

627

#

628

# For BigQuery:

629

# Required to filter out rows based on the given start and

630

# end times. If not specified and the table was modified between the given

631

# start and end times, the entire table will be scanned.

632

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

633

# `TIMESTAMP`, or `DATETIME` BigQuery column.

634

#

635

# For Datastore:

636

# Valid data types of the timestamp field are: `TIMESTAMP`.

637

# Datastore entity will be scanned if the timestamp property does not

638

# exist or its value is empty or invalid.

639

"name": "A String", # Name describing the field.

640

},

641

"endTime": "A String", # Exclude files or rows newer than this value.

642

# If set to zero, no upper time limit is applied.

643

"startTime": "A String", # Exclude files or rows older than this value.

644

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

645

# a valid start_time to avoid scanning files that have not been modified

646

# since the last time the JobTrigger executed. This will be based on the

647

# time of the execution of the last run of the JobTrigger.

648

},
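The `startTime`/`endTime` filter can be read as a simple interval test. The sketch below is an assumption-laden illustration: `in_timespan` is a hypothetical helper, an unset bound is modelled as `None`, and an item whose timestamp equals a bound is assumed to be kept (the text only excludes strictly older or newer items).

```python
from datetime import datetime, timezone


def in_timespan(item_time, start_time=None, end_time=None):
    """Return True if an item with `item_time` falls inside the timespan."""
    # Exclude files or rows older than start_time.
    if start_time is not None and item_time < start_time:
        return False
    # Exclude files or rows newer than end_time; no bound means no upper limit.
    if end_time is not None and item_time > end_time:
        return False
    return True


# Example bounds (illustrative values only).
START = datetime(2020, 1, 1, tzinfo=timezone.utc)
END = datetime(2020, 6, 1, tzinfo=timezone.utc)
```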

"hybridOptions": { # Configuration to control jobs where the content being inspected is outside of Google Cloud Platform. # Hybrid inspection options.
# Early access feature is in a pre-release state and might change or have
# limited support. For more information, see
# https://cloud.google.com/products#product-launch-stages.

654

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

655

# meaningful such as the columns that are primary keys.

"identifyingFields": [ # The columns that are the primary keys for table objects included in
# ContentItem. A copy of this cell's value will be stored alongside
# each finding so that the finding can be traced to the specific row it came
# from. No more than 3 may be provided.

660

{ # General identifier of a data field in a storage service.

661

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

666

#

667

# Label keys must be between 1 and 63 characters long and must conform

668

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

669

#

670

# Label values must be between 0 and 63 characters long and must conform

671

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

672

#

673

# No more than 10 labels can be associated with a given finding.

674

#

675

# Examples:

676

# * `"environment" : "production"`

677

# * `"pipeline" : "etl"`

678

"a_key": "A String",

679

},
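The label constraints above (key and value patterns, length bounds, at most 10 labels) can be checked client-side before sending a request. The following is a sketch, not the service's own validation; `label_errors` is a hypothetical helper and the regular expressions are copied from the description above.

```python
import re

KEY_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
VALUE_RE = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10


def label_errors(labels):
    """Return a list of human-readable problems found in a labels dict."""
    errors = []
    if len(labels) > MAX_LABELS:
        errors.append("more than 10 labels")
    for key, value in labels.items():
        # Keys: 1 to 63 characters, matching the key pattern in full.
        if not (1 <= len(key) <= 63 and KEY_RE.fullmatch(key)):
            errors.append(f"invalid key: {key!r}")
        # Values: 0 to 63 characters, matching the value pattern in full.
        if not (len(value) <= 63 and VALUE_RE.fullmatch(value)):
            errors.append(f"invalid value for {key!r}: {value!r}")
    return errors
```

An empty list means the labels satisfy the documented constraints as modelled here.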

"requiredFindingLabelKeys": [ # Labels that each inspection request must include within its
# 'finding_labels' map. A request may contain other labels, but any request
# missing one of these will be rejected.

683

#

684

# Label keys must be between 1 and 63 characters long and must conform

685

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

686

#

687

# No more than 10 keys can be required.

688

"A String",

689

],
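A caller can check ahead of time that a request's `finding_labels` map carries every required key. This is a minimal sketch with a hypothetical helper name, `missing_required_keys`; the service performs its own check and rejects requests that fail it.

```python
def missing_required_keys(finding_labels, required_keys):
    """Return the required keys that are absent from `finding_labels`."""
    # Extra labels are allowed; only absent required keys are reported.
    return [key for key in required_keys if key not in finding_labels]
```

An empty result means the request includes all required keys.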

690

"description": "A String", # A short description of where the data is coming from. Will be stored once

691

# in the job. 256 max length.

692

},

693

},

694

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

695

# When used with redactContent only info_types and min_likelihood are currently

696

# used.

697

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# even if this is set higher.

702

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

703

{ # Max findings configuration per infoType, per content item or long

704

# running DlpJob.

705

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

706

# info_type should be provided. If InfoTypeLimit does not have an

707

# info_type, the DLP API applies the limit against all info_types that

708

# are found but not specified in another InfoTypeLimit.

709

"name": "A String", # Name of the information type. Either a name of your choosing when

710

# creating a CustomInfoType, or one of the names listed

711

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

712

# a built-in type. InfoType names should conform to the pattern

713

# `[a-zA-Z0-9_]{1,64}`.

714

},

715

"maxFindings": 42, # Max findings limit for the given infoType.

716

},

717

],

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectJobConfig`,
# the maximum returned is 2000 even if this is set higher.
# When set within `InspectContentRequest`, this field is ignored.

722

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.

725

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

726

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

727

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

728

{ # Custom information type provided by the user. Used to find domain-specific

729

# sensitive information configurable to the data in question.

730

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

731

"pattern": "A String", # Pattern defining the regular expression. Its syntax

732

# (https://github.com/google/re2/wiki/Syntax) can be found under the

733

# google/re2 repository on GitHub.

734

"groupIndexes": [ # The index of the submatch to extract as findings. When not

735

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},
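The effect of `groupIndexes` can be illustrated with ordinary capture groups. The sketch below uses Python's `re` module as a stand-in for RE2 (an assumption; the two dialects differ in places), and `extract_findings` is a hypothetical helper, not an API call.

```python
import re


def extract_findings(pattern, text, group_indexes=()):
    """Return matched strings, honouring an optional list of submatch indexes."""
    if len(group_indexes) > 3:
        raise ValueError("no more than 3 group indexes may be included")
    findings = []
    for match in re.finditer(pattern, text):
        if not group_indexes:
            # No indexes given: the entire match is the finding.
            findings.append(match.group(0))
        else:
            # Otherwise only the requested submatches are extracted.
            findings.extend(
                match.group(i) for i in group_indexes if match.group(i) is not None
            )
    return findings
```

For the pattern `ID-(\d+)` and the text `"ID-42 and ID-7"`, omitting the indexes yields the full matches, while `[1]` yields only the digits.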

"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that

# support reversing, such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

743

# These types of transformations are

744

# those that perform pseudonymization, thereby producing a "surrogate" as

745

# output. This should be used in conjunction with a field on the

746

# transformation such as `surrogate_info_type`. This CustomInfoType does

747

# not support the use of `detection_rules`.

748

},

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
# infoType, when the name matches one of the existing infoTypes and that infoType
# is specified in the `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in info type is
# not specified in the `InspectContent.info_types` list, then the name is treated
# as a custom info type.

755

"name": "A String", # Name of the information type. Either a name of your choosing when

756

# creating a CustomInfoType, or one of the names listed

757

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

758

# a built-in type. InfoType names should conform to the pattern

759

# `[a-zA-Z0-9_]{1,64}`.

760

},

761

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.

762

# be used to match sensitive information specific to the data, such as a list

763

# of employee IDs or job titles.

764

#

765

# Dictionary words are case-insensitive and all characters other than letters

766

# and digits in the unicode [Basic Multilingual

767

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

768

# will be replaced with whitespace when scanning for matches, so the

769

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

770

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

771

# surrounding any match must be of a different type than the adjacent

772

# characters within the word, so letters must be next to non-letters and

773

# digits next to non-digits. For example, the dictionary word "jen" will

774

# match the first three letters of the text "jen123" but will return no

775

# matches for "jennifer".

776

#

777

# Dictionary words containing a large number of characters that are not

778

# letters or digits may result in unexpected findings because such characters

779

# are treated as whitespace. The

780

# [limits](https://cloud.google.com/dlp/limits) page contains details about

781

# the size limits of dictionaries. For dictionaries that do not fit within

782

# these constraints, consider using `LargeCustomDictionaryConfig` in the

783

# `StoredInfoType` API.

784

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

785

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

786

# at least one phrase and every phrase must contain at least 2 characters

787

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

792

# is accepted.

793

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

794

# Example: gs://[BUCKET_NAME]/dictionary.txt

795

},

796

},
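The dictionary matching rules described above (case-insensitive, punctuation treated as whitespace, letters bounded by non-letters and digits by non-digits) can be approximated by comparing token sequences. This is a simplified sketch with hypothetical helper names; it ignores the Basic Multilingual Plane restriction and may differ from the service in edge cases.

```python
import re

# A token is a maximal run of letters or a maximal run of digits; everything
# else (punctuation, underscores, whitespace) acts as a separator.
_TOKEN = re.compile(r"[^\W\d_]+|\d+")


def tokens(text):
    """Split lower-cased text into letter runs and digit runs."""
    return _TOKEN.findall(text.lower())


def phrase_matches(phrase, text):
    """Return True if the dictionary `phrase` occurs in `text` (approximation)."""
    needle, haystack = tokens(phrase), tokens(text)
    if not needle:
        return False
    n = len(needle)
    # The phrase must appear as a contiguous run of whole tokens.
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))
```

Because "jen123" splits into a letter run and a digit run, the word "jen" matches there, while "jennifer" is a single longer letter run and does not match.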

797

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

798

# `InspectDataSource`. Not currently supported in `InspectContent`.

799

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

800

# `organizations/433245324/storedInfoTypes/432452342` or

801

# `projects/project-id/storedInfoTypes/432452342`.

802

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

803

# inspection was created. Output-only field, populated by the system.

804

},

805

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

806

# Rules are applied in order that they are specified. Not supported for the

807

# `surrogate_type` CustomInfoType.

808

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

809

# `CustomInfoType` to alter behavior under certain circumstances, depending

810

# on the specific details of the rule. Not supported for the `surrogate_type`

811

# custom infoType.

812

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

813

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.

823

"windowBefore": 42, # Number of characters before the finding to consider.

824

"windowAfter": 42, # Number of characters after the finding to consider.

825

},
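The proximity window can be pictured as a slice of the text that extends `windowBefore` characters before the finding and `windowAfter` characters after it, with the finding itself included. The sketch below is illustrative: `hotword_in_window` is a hypothetical helper, Python's `re` stands in for RE2, and the 1000-character cap is not enforced because the text does not say exactly how it is measured.

```python
import re


def hotword_in_window(text, finding_span, hotword_pattern,
                      window_before=0, window_after=0):
    """Return True if the whole hotword lies inside the finding's window."""
    start, end = finding_span
    # The window includes the finding plus the characters on either side.
    lo = max(0, start - window_before)
    hi = min(len(text), end + window_after)
    # Searching only the slice ensures the entire hotword is in the window.
    return re.search(hotword_pattern, text[lo:hi]) is not None


PHONE = r"\(\d{3}\) \d{3}-\d{4}"
sample = "office line: (650) 555-0100"
phone_span = re.search(PHONE, sample).span()
```

With both window sizes at 0, a hotword such as `\(650\)` still matches because the finding itself is part of the window.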

826

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

827

"pattern": "A String", # Pattern defining the regular expression. Its syntax

828

# (https://github.com/google/re2/wiki/Syntax) can be found under the

829

# google/re2 repository on GitHub.

830

"groupIndexes": [ # The index of the submatch to extract as findings. When not

831

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

836

# part of a detection rule.

837

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

838

# levels. For example, if a finding would be `POSSIBLE` without the

839

# detection rule and `relative_likelihood` is 1, then it is upgraded to

840

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

841

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

842

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

843

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

844

# a final likelihood of `LIKELY`.

845

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},
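The `relativeLikelihood` adjustment described above behaves like a clamped step along an ordered scale. A minimal sketch, assuming the five levels named in the text form the whole scale in this order; `adjust_likelihood` is a hypothetical helper.

```python
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(likelihood, relative_likelihood):
    """Shift `likelihood` by a number of levels, clamped to the scale's ends."""
    index = LEVELS.index(likelihood) + relative_likelihood
    # Likelihood never drops below VERY_UNLIKELY or exceeds VERY_LIKELY.
    return LEVELS[max(0, min(len(LEVELS) - 1, index))]
```

Because each step is clamped separately, applying +1 and then -1 to `VERY_LIKELY` ends at `LIKELY`, matching the example in the field description.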

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

851

# to be returned. It still can be used for rules matching.

852

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

853

# altered by a detection rule if the finding meets the criteria specified by

854

# the rule. Defaults to `VERY_LIKELY` if not specified.

855

},

856

],

857

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

858

# included in the response; see Finding.quote.

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.

862

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

863

# circumstances, depending on the specific details of the rules within the set.

864

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

865

{ # A single inspection rule to be applied to infoTypes, specified in

866

# `InspectionRuleSet`.

867

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

868

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.

878

"windowBefore": 42, # Number of characters before the finding to consider.

879

"windowAfter": 42, # Number of characters after the finding to consider.

880

},

881

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

882

"pattern": "A String", # Pattern defining the regular expression. Its syntax

883

# (https://github.com/google/re2/wiki/Syntax) can be found under the

884

# google/re2 repository on GitHub.

885

"groupIndexes": [ # The index of the submatch to extract as findings. When not

886

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

891

# part of a detection rule.

892

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

893

# levels. For example, if a finding would be `POSSIBLE` without the

894

# detection rule and `relative_likelihood` is 1, then it is upgraded to

895

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

896

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

897

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

898

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

899

# a final likelihood of `LIKELY`.

900

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

901

},

902

},

903

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.

904

# `InspectionRuleSet` are removed from results.

905

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

906

"pattern": "A String", # Pattern defining the regular expression. Its syntax

907

# (https://github.com/google/re2/wiki/Syntax) can be found under the

908

# google/re2 repository on GitHub.

909

"groupIndexes": [ # The index of the submatch to extract as findings. When not

910

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps
# or is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.

923

{ # Type of information detected by the API.

924

"name": "A String", # Name of the information type. Either a name of your choosing when

925

# creating a CustomInfoType, or one of the names listed

926

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

927

# a built-in type. InfoType names should conform to the pattern

928

# `[a-zA-Z0-9_]{1,64}`.

},

],

},
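The overlap rule for `excludeInfoTypes` can be sketched over findings represented as `(info_type, start, end)` tuples with half-open character offsets. This representation and the helper name `drop_overlapping` are assumptions made for illustration; they are not the API's `Finding` shape.

```python
def drop_overlapping(findings, target_types, excluding_types):
    """Drop findings of `target_types` that overlap a finding of `excluding_types`."""
    blockers = [f for f in findings if f[0] in excluding_types]

    def overlaps(a, b):
        # Half-open spans overlap when each starts before the other ends.
        return a[1] < b[2] and b[1] < a[2]

    return [
        f for f in findings
        if not (f[0] in target_types and any(overlaps(f, b) for b in blockers))
    ]


# "555-222-2222@example.org": the phone number sits inside the email address.
example = [("PHONE_NUMBER", 0, 12), ("EMAIL_ADDRESS", 0, 24)]
```

Applied to the example, only the email address finding remains.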

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

933

# be used to match sensitive information specific to the data, such as a list

934

# of employee IDs or job titles.

935

#

936

# Dictionary words are case-insensitive and all characters other than letters

937

# and digits in the unicode [Basic Multilingual

938

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

939

# will be replaced with whitespace when scanning for matches, so the

940

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

941

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

942

# surrounding any match must be of a different type than the adjacent

943

# characters within the word, so letters must be next to non-letters and

944

# digits next to non-digits. For example, the dictionary word "jen" will

945

# match the first three letters of the text "jen123" but will return no

946

# matches for "jennifer".

947

#

948

# Dictionary words containing a large number of characters that are not

949

# letters or digits may result in unexpected findings because such characters

950

# are treated as whitespace. The

951

# [limits](https://cloud.google.com/dlp/limits) page contains details about

952

# the size limits of dictionaries. For dictionaries that do not fit within

953

# these constraints, consider using `LargeCustomDictionaryConfig` in the

954

# `StoredInfoType` API.

955

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

956

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

957

# at least one phrase and every phrase must contain at least 2 characters

958

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

963

# is accepted.

964

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

965

# Example: gs://[BUCKET_NAME]/dictionary.txt

966

},

967

},

968

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

973

{ # Type of information detected by the API.

974

"name": "A String", # Name of the information type. Either a name of your choosing when

975

# creating a CustomInfoType, or one of the names listed

976

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

977

# a built-in type. InfoType names should conform to the pattern

978

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

984

# If empty, text, images, and other content will be included.

985

"A String",

986

],

987

"infoTypes": [ # Restricts what info_types to look for. The values must correspond to

988

# InfoType values returned by ListInfoTypes or listed at

989

# https://cloud.google.com/dlp/docs/infotypes-reference.

990

#

991

# When no InfoTypes or CustomInfoTypes are specified in a request, the

992

# system may automatically choose what detectors to run. By default this may

993

# be all types, but may change over time as detectors are updated.

994

#

# If you need precise control and predictability as to what detectors are
# run, you should specify specific InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.

998

{ # Type of information detected by the API.

999

"name": "A String", # Name of the information type. Either a name of your choosing when

1000

# creating a CustomInfoType, or one of the names listed

1001

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

1002

# a built-in type. InfoType names should conform to the pattern

1003

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

1008

# `inspect_config` will be merged into the values persisted as part of the

1009

# template.

1010

"actions": [ # Actions to execute at the completion of the job.

1011

{ # A task to execute on the completion of a job.

1012

# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.

1013

"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.

1014

# OutputStorageConfig. Only a single instance of this action can be

1015

# specified.

1016

# Compatible with: Inspect, Risk

1017

"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.

"table": { # Message defining the location of a BigQuery table. # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set, a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
#
# A table is uniquely identified by its project_id, dataset_id, and
# table_name. Within a query a table is often referenced with a string
# in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.

Bu Sun Kim

715bd7f

2019-06-14 16:50:42 -0700

[diff] [blame]

1036

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

1037

# If omitted, project ID is inferred from the API call.

1038

"tableId": "A String", # Name of the table.

1039

"datasetId": "A String", # Dataset ID of the table.

1040

},

"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
# Only for use with external storage.
},
},

"jobNotificationEmails": { # Enable email notification to project owners and editors on job's # Enable email notification for project owners and editors on job's
# completion/failure.
# completion/failure.
},

"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects that are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},

"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.
# will publish a metric to Stackdriver on each infotype requested and
# how many findings were found for it. CustomDetectors will be bucketed
# as 'Custom' under the Stackdriver label 'info_type'.
},

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
# results of the DlpJob will be applied to the entry for the resource scanned
# in Cloud Data Catalog. Any labels previously written by another DlpJob will
# be deleted. InfoType naming patterns are strictly enforced when using this
# feature. Note that the findings will be persisted in Cloud Data Catalog
# storage and are governed by Data Catalog service-specific policy, see
# https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified and only allowed if
# all resources being scanned are BigQuery tables.
# Compatible with: Inspect
},

"pubSub": { # Publish a message into a given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a Pub/Sub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have granted
# publishing access rights to the DLP API service account executing
# the long-running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
},

"result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job.

"infoTypeStats": [ # Statistics of how many instances of each info type were found during
# inspect job.
{ # Statistics regarding a specific InfoType.
"count": "A String", # Number of findings for this infoType.
"infoType": { # Type of information detected by the API. # The type of finding this stat is for.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
},
],
"totalEstimatedBytes": "A String", # Estimate of the number of bytes to process.
"processedBytes": "A String", # Total size in bytes that were processed.

"hybridStats": { # Statistics related to processing hybrid inspect requests. # Statistics related to the processing of hybrid inspect.
# Early access feature is in a pre-release state and might change or have
# limited support. For more information, see
# https://cloud.google.com/products#product-launch-stages.
"abortedCount": "A String", # The number of hybrid inspection requests aborted because the job ran
# out of quota or was ended before they could be processed.
"pendingCount": "A String", # The number of hybrid requests currently being processed. Only populated
# when called via method `getDlpJob`.
# A burst of traffic may cause hybrid inspect requests to be enqueued.
# Processing will take place as quickly as possible, but resource limitations
# may impact how long a request is enqueued for.
"processedCount": "A String", # The number of hybrid inspection requests processed within this job.
},
},
},

"riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source.
"numericalStatsResult": { # Result of the numerical stats computation. # Numerical stats result
"quantileValues": [ # List of 99 values that partition the set of field values into 100 equal
# sized buckets.

{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},
],

"maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},

"minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},
},

"kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an # K-map result
# estimation, not exact values.

"kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value
# doesn't correspond to any such interval, the associated frequency is
# zero. For example, the following records:
# {min_anonymity: 1, max_anonymity: 1, frequency: 17}
# {min_anonymity: 2, max_anonymity: 3, frequency: 42}
# {min_anonymity: 5, max_anonymity: 10, frequency: 99}
# mean that there are no records with an estimated anonymity of 4 or
# larger than 10.

{ # A KMapEstimationHistogramBucket message with the following values:
# min_anonymity: 3
# max_anonymity: 5
# frequency: 42
# means that there are 42 records whose quasi-identifier values correspond
# to 3, 4 or 5 people in the overlying population. An important particular
# case is when min_anonymity = max_anonymity = 1: the frequency field then
# corresponds to the number of uniquely identifiable records.
"bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
# number of classes returned per bucket is capped at 20.
{ # A tuple of values for the quasi-identifier columns.
"estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values.
"quasiIdsValues": [ # The quasi-identifier values.

{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},
],
},
],

"minAnonymity": "A String", # Always positive.

"bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
"maxAnonymity": "A String", # Always greater than or equal to min_anonymity.
"bucketSize": "A String", # Number of records within these anonymity bounds.
},
],
},

"kAnonymityResult": { # Result of the k-anonymity computation. # K-anonymity result
"equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes.
{ # Histogram of k-anonymity equivalence classes.
"bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
# classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same ldiversity value
"quasiIdsValues": [ # Set of values defining the equivalence class. One value per
# quasi-identifier column in the original KAnonymity metric message.
# The order is always the same as the original request.

{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},
],

"equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the
# above set of values.
},
],
"bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
"equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket.
"equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket.
"bucketSize": "A String", # Total number of equivalence classes in this bucket.
},
],
},

"lDiversityResult": { # Result of the l-diversity computation. # L-diversity result
"sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies.
{ # Histogram of l-diversity equivalence class sensitive value frequencies.
"bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
# classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same ldiversity value.
"numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class.
"quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence
# class. The order is always the same as the original request.

{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14, # float
"timestampValue": "A String", # timestamp
"dayOfWeekValue": "A String", # day of week
"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
},
"stringValue": "A String", # string
"booleanValue": True or False, # boolean
"integerValue": "A String", # integer
},
],

1445

"topSensitiveValues": [ # Estimated frequencies of top sensitive values.

1446

{ # A value of a field, including its frequency.

1447

"count": "A String", # How many times the value is contained in the field.

1448

"value": { # Set of primitive values supported by the system. # A value contained in the field in question.

1449

# Note that for the purposes of inspection or transformation, the number

1450

# of bytes considered to comprise a 'Value' is based on its representation

1451

# as a UTF-8 encoded string. For example, if 'integer_value' is set to

1452

# 123456789, the number of bytes would be counted as 9, even though an

1453

# int64 only holds up to 8 bytes of data.

Dan O'Meara

dd49464

2020-05-01 07:42:23 -0700

[diff] [blame^]

1454

"floatValue": 3.14, # float

1455

"timestampValue": "A String", # timestamp

1456

"dayOfWeekValue": "A String", # day of week

1457

"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day

Bu Sun Kim

715bd7f

2019-06-14 16:50:42 -0700

[diff] [blame]

1458

# or are specified elsewhere. An API may choose to allow leap seconds. Related

1459

# types are google.type.Date and `google.protobuf.Timestamp`.

1460

"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose

1461

# to allow the value "24:00:00" for scenarios like business closing time.

1462

"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.

1463

"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may

1464

# allow the value 60 if it allows leap-seconds.

1465

"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.

1466

},


1467

"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date


1468

# and time zone are either specified elsewhere or are not significant. The date

1469

# is relative to the Proleptic Gregorian Calendar. This can represent:

1470

#

1471

# * A full date, with non-zero year, month and day values

1472

# * A month and day value, with a zero year, e.g. an anniversary

1473

# * A year on its own, with zero month and day values

1474

# * A year and month value, with a zero day, e.g. a credit card expiration date

1475

#

1476

# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.


1477

"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a

1478

# month and day.


1479

"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0

1480

# if specifying a year by itself or a year and month where the day is not

1481

# significant.


1482

"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without

1483

# a year.


1484

},


1485

"stringValue": "A String", # string

1486

"booleanValue": True or False, # boolean

1487

"integerValue": "A String", # integer


},

},

],

"equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class.

1492

},

1493

],

1494

"bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.

1495

"bucketSize": "A String", # Total number of equivalence classes in this bucket.

1496

"sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence

1497

# classes in this bucket.

1498

"sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence

1499

# classes in this bucket.

},

],

},

"requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute.


1504

"numericalStatsConfig": { # Compute numerical stats over an individual column, including # Numerical stats


1505

# min, max, and quantiles.

1506

"field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are

1507

# integer, float, date, datetime, timestamp, time.

1508

"name": "A String", # Name describing the field.

1509

},

1510

},


1511

"kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what # k-map


1512

# is called "journalist risk" in the literature, except the attack dataset is

1513

# statistically modeled instead of being perfectly known. This can be done

1514

# using publicly available data (like the US Census), or using a custom

1515

# statistical model (indicated as one or several BigQuery tables), or by

1516

# extrapolating from the distribution of values in the input dataset.


1517

"regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.


1518

# Set if no column is tagged with a region-specific InfoType (like


1519

# US_ZIP_5) or a region code.


1520

"quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two columns can have the

1521

# same tag.

1522

{ # A column with a semantic tag attached.

1523

"field": { # General identifier of a data field in a storage service. # Required. Identifies the column.


1524

"name": "A String", # Name describing the field.

1525

},

1526

"customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must

1527

# indicate an auxiliary table that contains statistical information on

1528

# the possible values of this column (below).

1529

"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public

1530

# dataset as a statistical model of population, if available. We

1531

# currently support US ZIP codes, region codes, ages and genders.

1532

# To programmatically obtain the list of supported InfoTypes, use

1533

# ListInfoTypes with the supported_by=RISK_ANALYSIS filter.

1534

"name": "A String", # Name of the information type. Either a name of your choosing when

1535

# creating a CustomInfoType, or one of the names listed

1536

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

1537

# a built-in type. InfoType names should conform to the pattern


1538

# `[a-zA-Z0-9_]{1,64}`.


1539

},

1540

"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
#
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is empty JSON object `{}`.

},

},

],

"auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag

1554

# used to tag a quasi-identifiers column must appear in exactly one column

1555

# of one auxiliary table.

1556

{ # An auxiliary table contains statistical information on the relative

1557

# frequency of different quasi-identifiers values. It has one or several

1558

# quasi-identifiers columns, and one column that indicates the relative

1559

# frequency of each quasi-identifier tuple.

1560

# If a tuple is present in the data but not in the auxiliary table, the

1561

# corresponding relative frequency is assumed to be zero (and thus, the

1562

# tuple is highly reidentifiable).


1563

"table": { # Message defining the location of a BigQuery table. A table is uniquely # Required. Auxiliary table location.


1564

# identified by its project_id, dataset_id, and table_name. Within a query

1565

# a table is often referenced with a string in the format of:


1566

# `<project_id>:<dataset_id>.<table_id>` or

1567

# `<project_id>.<dataset_id>.<table_id>`.


1568

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

1569

# If omitted, project ID is inferred from the API call.

1570

"tableId": "A String", # Name of the table.

1571

"datasetId": "A String", # Dataset ID of the table.

1572

},


1573

"quasiIds": [ # Required. Quasi-identifier columns.

1574

{ # A quasi-identifier column has a custom_tag, used to know which column

1575

# in the data corresponds to which column in the statistical model.

1576

"field": { # General identifier of a data field in a storage service. # Identifies the column.

1577

"name": "A String", # Name describing the field.

1578

},

1579

"customTag": "A String", # An auxiliary field.

1580

},

1581

],

1582

"relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number

1583

# between 0 and 1 (inclusive). Null values are assumed to be zero.

1584

"name": "A String", # Name describing the field.

1585

},


1586

},

1587

],

1588

},


1589

"lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. # l-diversity


1590

"sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value.

1591

"name": "A String", # Name describing the field.

1592

},

1593

"quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are

1594

# defined for the l-diversity computation. When multiple fields are

1595

# specified, they are considered a single composite key.

1596

{ # General identifier of a data field in a storage service.

1597

"name": "A String", # Name describing the field.

1598

},

1599

],

1600

},


1601

"kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. # K-anonymity

1602

"entityId": { # Message indicating that multiple rows might be associated with a
# single individual. If the same entity_id is associated with multiple
# quasi-identifier tuples over distinct rows, we consider the entire
# collection of tuples as the composite quasi-identifier. This collection
# is a multiset: the order in which the different tuples appear in the
# dataset is ignored, but their frequency is taken into account.
#
# Important note: a maximum of 1000 rows can be associated with a single
# entity ID. If more rows are associated with the same entity ID, some
# might be ignored.
#
# An entity in a dataset is a field or set of fields that correspond to a
# single person. For example, in medical records the `EntityId` might be a
# patient identifier, or for financial records it might be an account
# identifier. This message is used when generalizations or analysis must take
# into account that multiple rows correspond to the same entity.

1616

"field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier.

1617

"name": "A String", # Name describing the field.

1618

},

1619

},

1620

"quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are

1621

# specified, they are considered a single composite key. Structs and

1622

# repeated data types are not supported; however, nested fields are

1623

# supported so long as they are not structs themselves or nested within

1624

# a repeated field.

1625

{ # General identifier of a data field in a storage service.

1626

"name": "A String", # Name describing the field.

1627

},

1628

],

1629

},


1630

"categoricalStatsConfig": { # Compute categorical stats over an individual column, including # Categorical stats

1631

# number of distinct values and value count distribution.

1632

"field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are

1633

# supported except for arrays and structs. However, it may be more

1634

# informative to use NumericalStats when the field type is supported,

1635

# depending on the data.

1636

"name": "A String", # Name describing the field.

1637

},

1638

},

1639

"deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to # delta-presence

1640

# figure out that one given individual appears in a de-identified dataset.

1641

# Similarly to the k-map metric, we cannot compute δ-presence exactly without

1642

# knowing the attack dataset, so we use a statistical model instead.

1643

"regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.

1644

# Set if no column is tagged with a region-specific InfoType (like

1645

# US_ZIP_5) or a region code.

1646

"quasiIds": [ # Required. Fields considered to be quasi-identifiers. No two fields can have the

1647

# same tag.

1648

{ # A column with a semantic tag attached.

1649

"field": { # General identifier of a data field in a storage service. # Required. Identifies the column.

1650

"name": "A String", # Name describing the field.

1651

},

1652

"customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must

1653

# indicate an auxiliary table that contains statistical information on

1654

# the possible values of this column (below).

1655

"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public

1656

# dataset as a statistical model of population, if available. We

1657

# currently support US ZIP codes, region codes, ages and genders.

1658

# To programmatically obtain the list of supported InfoTypes, use

1659

# ListInfoTypes with the supported_by=RISK_ANALYSIS filter.

1660

"name": "A String", # Name of the information type. Either a name of your choosing when

1661

# creating a CustomInfoType, or one of the names listed

1662

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

1663

# a built-in type. InfoType names should conform to the pattern

1664

# `[a-zA-Z0-9_]{1,64}`.

1665

},

1666

"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
#
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is empty JSON object `{}`.

},

},

],

"auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag

1680

# used to tag a quasi-identifiers field must appear in exactly one

1681

# field of one auxiliary table.

1682

{ # An auxiliary table containing statistical information on the relative

1683

# frequency of different quasi-identifiers values. It has one or several

1684

# quasi-identifiers columns, and one column that indicates the relative

1685

# frequency of each quasi-identifier tuple.

1686

# If a tuple is present in the data but not in the auxiliary table, the

1687

# corresponding relative frequency is assumed to be zero (and thus, the

1688

# tuple is highly reidentifiable).

1689

"relativeFrequency": { # General identifier of a data field in a storage service. # Required. The relative frequency column must contain a floating-point number

1690

# between 0 and 1 (inclusive). Null values are assumed to be zero.

1691

"name": "A String", # Name describing the field.

1692

},

1693

"quasiIds": [ # Required. Quasi-identifier columns.

1694

{ # A quasi-identifier column has a custom_tag, used to know which column

1695

# in the data corresponds to which column in the statistical model.

1696

"field": { # General identifier of a data field in a storage service. # Identifies the column.

1697

"name": "A String", # Name describing the field.

1698

},

1699

"customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must

1700

# indicate an auxiliary table that contains statistical information on

1701

# the possible values of this column (below).

1702

},

1703

],

1704

"table": { # Message defining the location of a BigQuery table. A table is uniquely # Required. Auxiliary table location.

1705

# identified by its project_id, dataset_id, and table_name. Within a query

1706

# a table is often referenced with a string in the format of:

1707

# `<project_id>:<dataset_id>.<table_id>` or

1708

# `<project_id>.<dataset_id>.<table_id>`.

1709

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

1710

# If omitted, project ID is inferred from the API call.

1711

"tableId": "A String", # Name of the table.

1712

"datasetId": "A String", # Dataset ID of the table.

},

},

],

},


1717

},


1718

"categoricalStatsResult": { # Result of the categorical stats computation. # Categorical stats result


1719

"valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column.


1720

{ # Histogram of value frequencies in the column.


1721

"bucketValues": [ # Sample of value frequencies in this bucket. The total number of

1722

# values returned per bucket is capped at 20.

1723

{ # A value of a field, including its frequency.

1724

"count": "A String", # How many times the value is contained in the field.

1725

"value": { # Set of primitive values supported by the system. # A value contained in the field in question.

1726

# Note that for the purposes of inspection or transformation, the number

1727

# of bytes considered to comprise a 'Value' is based on its representation

1728

# as a UTF-8 encoded string. For example, if 'integer_value' is set to

1729

# 123456789, the number of bytes would be counted as 9, even though an

1730

# int64 only holds up to 8 bytes of data.


1731

"floatValue": 3.14, # float

1732

"timestampValue": "A String", # timestamp

1733

"dayOfWeekValue": "A String", # day of week

1734

"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day


1735

# or are specified elsewhere. An API may choose to allow leap seconds. Related

1736

# types are google.type.Date and `google.protobuf.Timestamp`.

1737

"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose

1738

# to allow the value "24:00:00" for scenarios like business closing time.

1739

"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.

1740

"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may

1741

# allow the value 60 if it allows leap-seconds.

1742

"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.

1743

},


1744

"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date


1745

# and time zone are either specified elsewhere or are not significant. The date

1746

# is relative to the Proleptic Gregorian Calendar. This can represent:

1747

#

1748

# * A full date, with non-zero year, month and day values

1749

# * A month and day value, with a zero year, e.g. an anniversary

1750

# * A year on its own, with zero month and day values

1751

# * A year and month value, with a zero day, e.g. a credit card expiration date

1752

#

1753

# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.


1754

"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a

1755

# month and day.


1756

"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0

1757

# if specifying a year by itself or a year and month where the day is not

1758

# significant.


1759

"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without

1760

# a year.


1761

},


1762

"stringValue": "A String", # string

1763

"booleanValue": True or False, # boolean

1764

"integerValue": "A String", # integer


},

},

],

"bucketValueCount": "A String", # Total number of distinct values in this bucket.

1769

"valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket.

1770

"valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket.

1771

"bucketSize": "A String", # Total number of values in this bucket.

1772

},

1773

],

1774

},


1775

"deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an # Delta-presence result


1776

# estimation, not exact values.

1777

"deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a
# value doesn't correspond to any such interval, the associated frequency
# is zero. For example, the following records:
# {min_probability: 0, max_probability: 0.1, frequency: 17}
# {min_probability: 0.2, max_probability: 0.3, frequency: 42}
# {min_probability: 0.3, max_probability: 0.4, frequency: 99}
# mean that there are no records with an estimated probability in [0.1, 0.2)
# or greater than or equal to 0.4.

1785

{ # A DeltaPresenceEstimationHistogramBucket message with the following
# values:
# min_probability: 0.1
# max_probability: 0.2
# frequency: 42
# means that there are 42 records for which δ is in [0.1, 0.2). An
# important particular case is when min_probability = max_probability = 1:
# then, every individual who shares this quasi-identifier combination is in
# the dataset.

1794

"bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total

1795

# number of classes returned per bucket is capped at 20.

1796

{ # A tuple of values for the quasi-identifier columns.

1797

"quasiIdsValues": [ # The quasi-identifier values.

1798

{ # Set of primitive values supported by the system.

1799

# Note that for the purposes of inspection or transformation, the number

1800

# of bytes considered to comprise a 'Value' is based on its representation

1801

# as a UTF-8 encoded string. For example, if 'integer_value' is set to

1802

# 123456789, the number of bytes would be counted as 9, even though an

1803

# int64 only holds up to 8 bytes of data.


1804

"floatValue": 3.14, # float

1805

"timestampValue": "A String", # timestamp

1806

"dayOfWeekValue": "A String", # day of week

1807

"timeValue": { # Represents a time of day. The date and time zone are either not significant # time of day


1808

# or are specified elsewhere. An API may choose to allow leap seconds. Related

1809

# types are google.type.Date and `google.protobuf.Timestamp`.

1810

"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose

1811

# to allow the value "24:00:00" for scenarios like business closing time.

1812

"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.

1813

"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may

1814

# allow the value 60 if it allows leap-seconds.

1815

"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.

1816

},


1817

"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day # date


1818

# and time zone are either specified elsewhere or are not significant. The date

1819

# is relative to the Proleptic Gregorian Calendar. This can represent:

1820

#

1821

# * A full date, with non-zero year, month and day values

1822

# * A month and day value, with a zero year, e.g. an anniversary

1823

# * A year on its own, with zero month and day values

1824

# * A year and month value, with a zero day, e.g. a credit card expiration date

1825

#

1826

# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.


1827

"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a

1828

# month and day.


1829

"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0

1830

# if specifying a year by itself or a year and month where the day is not

1831

# significant.


1832

"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without

1833

# a year.


1834

},


1835

"stringValue": "A String", # string

1836

"booleanValue": True or False, # boolean

1837

"integerValue": "A String", # integer


1838

},

1839

],

1840

"estimatedProbability": 3.14, # The estimated probability that a given individual sharing these

1841

# quasi-identifier values is in the dataset. This value, typically called

1842

# δ, is the ratio between the number of records in the dataset with these

1843

# quasi-identifier values, and the total number of individuals (inside

1844

# *and* outside the dataset) with these quasi-identifier values.

1845

# For example, if there are 15 individuals in the dataset who share the

1846

# same quasi-identifier values, and an estimated 100 people in the entire

1847

# population with these values, then δ is 0.15.

1848

},

1849

],

1850

"bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.

1851

"bucketSize": "A String", # Number of records within these probability bounds.

1852

"maxProbability": 3.14, # Always greater than or equal to min_probability.

1853

"minProbability": 3.14, # Between 0 and 1.

},

],

},

"requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over.

1858

# identified by its project_id, dataset_id, and table_name. Within a query

1859

# a table is often referenced with a string in the format of:


1860

# `<project_id>:<dataset_id>.<table_id>` or

1861

# `<project_id>.<dataset_id>.<table_id>`.


1862

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

1863

# If omitted, project ID is inferred from the API call.

1864

"tableId": "A String", # Name of the table.

1865

"datasetId": "A String", # Dataset ID of the table.

1866

},

1867

},

1868

"state": "A String", # State of a job.

1869

"jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that

1870

# instantiated the job.

1871

"startTime": "A String", # Time when the job started.

1872

"endTime": "A String", # Time when the job finished.

1873

"type": "A String", # The type of job.

1874

"createTime": "A String", # Time when the job was created.

}</pre>

</div>
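The histogram in `deltaPresenceEstimationResult` above can be read back with a few lines of Python. The following is a minimal sketch, not part of the client library: the helper names `summarize_delta_presence` and `records_at_or_above` are hypothetical, the input is assumed to be a plain dict shaped like the documented object, and the sample assumes that the `frequency` values in the histogram description correspond to the `bucketSize` field.

```python
def summarize_delta_presence(result):
    """Return sorted (min_probability, max_probability, record_count) tuples.

    `result` is assumed to be a dict shaped like the
    deltaPresenceEstimationResult object in the response schema.
    """
    rows = []
    for bucket in result.get("deltaPresenceEstimationHistogram", []):
        rows.append(
            (
                float(bucket.get("minProbability", 0.0)),
                float(bucket.get("maxProbability", 0.0)),
                # bucketSize is shown as "A String" in the schema, so convert it.
                int(bucket.get("bucketSize", "0")),
            )
        )
    return sorted(rows)


def records_at_or_above(result, threshold):
    """Count records in buckets whose lower probability bound is >= threshold."""
    return sum(
        size
        for low, _high, size in summarize_delta_presence(result)
        if low >= threshold
    )


# Hypothetical sample mirroring the three records in the histogram description.
sample_result = {
    "deltaPresenceEstimationHistogram": [
        {"minProbability": 0.2, "maxProbability": 0.3, "bucketSize": "42"},
        {"minProbability": 0.0, "maxProbability": 0.1, "bucketSize": "17"},
        {"minProbability": 0.3, "maxProbability": 0.4, "bucketSize": "99"},
    ]
}

print(summarize_delta_presence(sample_result))
print(records_at_or_above(sample_result, 0.2))
```

Because the intervals are half-open and do not overlap, a gap such as [0.1, 0.2) simply has no bucket, and the helper treats it as contributing zero records.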


<code class="details" id="create">create(parent, body=None, x__xgafv=None)</code>


<pre>Creates a job trigger to run DLP actions such as scanning storage for
sensitive information on a set schedule.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.

Args:


1885

parent: string, Required. The parent resource name, for example projects/my-project-id. (required)

1886

body: object, The request body.
The object takes the form of:

{ # Request message for CreateJobTrigger.

1890

"triggerId": "A String", # The trigger id can contain uppercase and lowercase letters,
# numbers, hyphens, and underscores; that is, it must match the regular
# expression: `[a-zA-Z\\d-_]+`. The maximum length is 100
# characters. Can be empty to allow the system to generate one.


1894

"locationId": "A String", # The geographic location to store the job trigger. Reserved for

1895

# future extensions.

1896

"jobTrigger": { # Contains a configuration to make DLP API calls on a repeating basis. # Required. The JobTrigger to create.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.


1898

"status": "A String", # Required. A status for this trigger.

1899

"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.

1900

"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.


1906

"timestamps": [ # The times the error occurred.

1907

"A String",

1908

],


1909

"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.

# different programming environments, including REST APIs and RPC APIs. It is

1911

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

1912

# three pieces of data: error code, error message, and error details.

1913

#

1914

# You can find out more about this error model and how to work with it in the

1915

# [API Design Guide](https://cloud.google.com/apis/design/errors).

1916

"message": "A String", # A developer-facing error message, which should be in English. Any

1917

# user-facing error message should be localized and sent in the

1918

# google.rpc.Status.details field, or localized by the client.

1919

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

1920

"details": [ # A list of messages that carry the error details. There is a common set of

1921

# message types for APIs to use.

1922

{

1923

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

1930

"description": "A String", # User provided description (max 256 chars)

"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

1932

# needs to trigger for a job to be started. The list may contain only

1933

# a single Schedule trigger and must have at least one object.

1934

{ # What event needs to occur for a new job to be started.

"manual": { # Job trigger option for hybrid jobs. Jobs must be manually created and finished.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

1940

},

1941

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For

1943

# example: every day (86400 seconds).

#

# A scheduled start time will be skipped if the previous

1946

# execution has not ended when its scheduled time occurs.

1947

#

1948

# This value must be set to a time duration greater than or equal

1949

# to 1 day and can be no longer than 60 days.

},
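The schedule constraints above (at least 1 day, at most 60 days) can be enforced when the trigger is built. This is a sketch under stated assumptions: `recurrence_period` is a hypothetical helper, and the value is written as a count of seconds with an `s` suffix, which is the usual JSON encoding of a duration (for example `86400s` for one day).

```python
SECONDS_PER_DAY = 86400
MIN_DAYS = 1   # documented lower bound
MAX_DAYS = 60  # documented upper bound


def recurrence_period(days: int) -> str:
    """Return a duration string for a schedule that recurs every `days` days.

    Hypothetical helper: raises ValueError outside the documented 1-60 day range.
    """
    if not MIN_DAYS <= days <= MAX_DAYS:
        raise ValueError(f"recurrence must be between {MIN_DAYS} and {MAX_DAYS} days")
    return f"{days * SECONDS_PER_DAY}s"


# One entry for the "triggers" list of a JobTrigger: run once a week.
weekly_trigger = {"schedule": {"recurrencePeriodDuration": recurrence_period(7)}}
```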

},

1952

],

1953

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

1954

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

"cloudStorageOptions": { # Google Cloud Storage options. Options defining a file or a set of files within a Google Cloud Storage

# bucket.

1957

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

1958

# than this value then the rest of the bytes are omitted. Only one

1959

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

1960

"sampleMethod": "A String",

1961

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

1962

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format

# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.

#

1965

# If the url ends in a trailing slash, the bucket or directory represented

1966

# by the url will be scanned non-recursively (content in sub-directories

1967

# will not be scanned). This means that `gs://mybucket/` is equivalent to

1968

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

1969

# `gs://mybucket/directory/*`.

1970

#

1971

# Exactly one of `url` or `regex_file_set` must be set.

"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or

# `regex_file_set` must be set.

#

# Message representing a set of files in a Cloud Storage bucket. Regular

# expressions are used to allow fine-grained control over which files in the

# bucket to include.

1976

#

1977

# Included files are those that match at least one item in `include_regex` and

1978

# do not match any items in `exclude_regex`. Note that a file that matches

1979

# items from both lists will _not_ be included. For a match to occur, the

1980

# entire file path (i.e., everything in the url after the bucket name) must

1981

# match the regular expression.

1982

#

1983

# For example, given the input `{bucket_name: "mybucket", include_regex:

1984

# ["directory1/.*"], exclude_regex:

1985

# ["directory1/excluded.*"]}`:

1986

#

1987

# * `gs://mybucket/directory1/myfile` will be included

1988

# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches

1989

# across `/`)

1990

# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the

1991

# full path doesn't match any items in `include_regex`)

1992

# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path

1993

# matches an item in `exclude_regex`)

1994

#

1995

# If `include_regex` is left empty, it will match all files by default

1996

# (this is equivalent to setting `include_regex: [".*"]`).

1997

#

1998

# Some other common use cases:

1999

#

2000

# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all

2001

# files in `mybucket` except for .pdf files

2002

# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will

2003

# include all files directly under `gs://mybucket/directory/`, without matching

2004

# across `/`

2005

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

2006

# the bucket that match at least one of these regular expressions will be

2007

# excluded from the scan.

2008

#

2009

# Regular expressions use RE2

2010

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

2011

# under the google/re2 repository on GitHub.

2012

"A String",

2013

],

2014

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

2015

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

2016

# the bucket that match at least one of these regular expressions will be

2017

# included in the set of files, except for those that also match an item in

2018

# `exclude_regex`. Leaving this field empty will match all files by default

2019

# (this is equivalent to including `.*` in the list).

2020

#

2021

# Regular expressions use RE2

2022

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

2023

# under the google/re2 repository on GitHub.

"A String",

],

},
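The include/exclude behaviour described for `regexFileSet` can be previewed locally before a job is created. The sketch below is an approximation only: `is_included` is a hypothetical helper, and it uses Python's `re` module as a stand-in for RE2, which is what the service uses. For the simple patterns in the examples above the two behave the same. The `path` argument is everything in the URL after the bucket name.

```python
import re
from typing import Sequence


def is_included(
    path: str,
    include_regex: Sequence[str] = (),
    exclude_regex: Sequence[str] = (),
) -> bool:
    """Approximate the documented include/exclude rules for one file path."""
    # An empty include list matches every file, like include_regex: [".*"].
    includes = list(include_regex) or [".*"]
    # The entire path must match, so use fullmatch rather than search.
    if any(re.fullmatch(pattern, path) for pattern in exclude_regex):
        return False  # a file matching both lists is not included
    return any(re.fullmatch(pattern, path) for pattern in includes)
```

With `include_regex=["directory1/.*"]` and `exclude_regex=["directory1/excluded.*"]`, the path `directory1/directory2/myfile` is included because `.*` matches across `/`, while `directory1/excludedfile` is not.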

},

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.

2029

# Number of files scanned is rounded down. Must be between 0 and 100,

2030

# inclusive. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The

2032

# number of bytes scanned is rounded down. Must be between 0 and 100,

2033

# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one

2034

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"fileTypes": [ # List of file type groups to include in the scan.

2036

# If empty, all files are scanned and available data format processors

2037

# are applied. In addition, the binary content of the selected files

2038

# is always scanned as well.

# Images are scanned only as binary if the specified region

2040

# does not support image inspection and no file_types were specified.

2041

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

"A String",

2043

],

2044

},

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

2046

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always

2047

# by project and namespace, however the namespace ID may be empty.

2050

#

2051

# A partition ID contains several dimensions:

2052

# project ID and namespace ID.

2053

"projectId": "A String", # The ID of the project to which the entities belong.

2054

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

2055

},

2056

"kind": { # A representation of a Datastore kind. # The kind to process.

2057

"name": "A String", # The name of the kind.

2058

},

2059

},

2060

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

2061

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

2062

# inspection of entire columns which you know have no findings.

2063

{ # General identifier of a data field in a storage service.

2064

"name": "A String", # Name describing the field.

2065

},

2066

],

2067

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

2068

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

2069

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

2070

# Cannot be used in conjunction with TimespanConfig.

2071

"sampleMethod": "A String",

2072

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

2073

# `actions.saveFindings.outputConfig.table` is specified, the values of

2074

# columns specified here are available in the output table under

2075

# `location.content_locations.record_location.record_key.id_values`. Nested

2076

# fields such as `person.birthdate.year` are allowed.

2077

{ # General identifier of a data field in a storage service.

2078

"name": "A String", # Name describing the field.

2079

},

2080

],

2081

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows

2082

# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and

# 100 mean no limit. Defaults to 0. Only one of rows_limit and

2084

# rows_limit_percent can be specified. Cannot be used in conjunction with

2085

# TimespanConfig.

"tableReference": { # Complete BigQuery table reference. Message defining the location of a BigQuery table. A table is uniquely

2087

# identified by its project_id, dataset_id, and table_name. Within a query

2088

# a table is often referenced with a string in the format of:

2089

# `<project_id>:<dataset_id>.<table_id>` or

2090

# `<project_id>.<dataset_id>.<table_id>`.

2091

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

2092

# If omitted, project ID is inferred from the API call.

2093

"tableId": "A String", # Name of the table.

2094

"datasetId": "A String", # Dataset ID of the table.

2095

},
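The two string formats mentioned above for referring to a table can be converted into the `tableReference` object shown here. This is an illustrative sketch: `parse_table_reference` is a hypothetical helper, and it assumes that none of the three IDs contains a dot.

```python
def parse_table_reference(ref: str) -> dict:
    """Split "<project_id>:<dataset_id>.<table_id>" or
    "<project_id>.<dataset_id>.<table_id>" into a tableReference dict."""
    if ":" in ref:
        project, _, rest = ref.partition(":")
        parts = [project] + rest.split(".")
    else:
        parts = ref.split(".")
    # Expect exactly three non-empty components.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a complete table reference: {ref!r}")
    project_id, dataset_id, table_id = parts
    return {"projectId": project_id, "datasetId": dataset_id, "tableId": table_id}
```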

2096

},

2097

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

2098

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

2099

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

2100

# Used for data sources like Datastore and BigQuery.

2101

#

2102

# For BigQuery:

2103

# Required to filter out rows based on the given start and

2104

# end times. If not specified and the table was modified between the given

2105

# start and end times, the entire table will be scanned.

2106

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

2107

# `TIMESTAMP`, or `DATETIME` BigQuery column.

2108

#

2109

# For Datastore:

2110

# Valid data types of the timestamp field are: `TIMESTAMP`.

2111

# Datastore entity will be scanned if the timestamp property does not

2112

# exist or its value is empty or invalid.

2113

"name": "A String", # Name describing the field.

2114

},

2115

"endTime": "A String", # Exclude files or rows newer than this value.

2116

# If set to zero, no upper time limit is applied.

2117

"startTime": "A String", # Exclude files or rows older than this value.

2118

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

2119

# a valid start_time to avoid scanning files that have not been modified

2120

# since the last time the JobTrigger executed. This will be based on the

2121

# time of the execution of the last run of the JobTrigger.

2122

},

"hybridOptions": { # Hybrid inspection options. Configuration to control jobs where the content being inspected is outside of Google Cloud Platform.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

2128

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

2129

# meaningful such as the columns that are primary keys.

2130

"identifyingFields": [ # The columns that are the primary keys for table objects included in

2131

# ContentItem. A copy of this cell's value will be stored alongside

2132

# each finding so that the finding can be traced to the specific row it came

2133

# from. No more than 3 may be provided.

2134

{ # General identifier of a data field in a storage service.

2135

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

2140

#

2141

# Label keys must be between 1 and 63 characters long and must conform

2142

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

2143

#

2144

# Label values must be between 0 and 63 characters long and must conform

2145

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

2146

#

2147

# No more than 10 labels can be associated with a given finding.

2148

#

2149

# Examples:

2150

# * `"environment" : "production"`

2151

# * `"pipeline" : "etl"`

2152

"a_key": "A String",

2153

},
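The label rules above (key and value patterns, length limits, and the limit of 10 labels) lend themselves to a client-side check. The following is a minimal sketch: `validate_labels` is a hypothetical helper, and the patterns are copied from the description above.

```python
import re
from typing import Dict, List

# Patterns copied from the field description.
_KEY_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
_VALUE_RE = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10
MAX_LENGTH = 63


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the labels look valid."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        # Keys: 1 to 63 characters, must match the key pattern in full.
        if not (1 <= len(key) <= MAX_LENGTH and _KEY_RE.fullmatch(key)):
            problems.append(f"invalid key: {key!r}")
        # Values: 0 to 63 characters, may be empty.
        if not (len(value) <= MAX_LENGTH and _VALUE_RE.fullmatch(value)):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems
```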

2154

"requiredFindingLabelKeys": [ # These are labels that each inspection request must include within their

2155

# 'finding_labels' map. Request may contain others, but any missing one of

2156

# these will be rejected.

2157

#

2158

# Label keys must be between 1 and 63 characters long and must conform

2159

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

2160

#

2161

# No more than 10 keys can be required.

2162

"A String",

2163

],

2164

"description": "A String", # A short description of where the data is coming from. Will be stored once

2165

# in the job. 256 max length.

2166

},

},

2168

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

2169

# When used with redactContent only info_types and min_likelihood are currently

2170

# used.

2171

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned. # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.

2174

# When set within `InspectContentRequest`, the maximum returned is 2000

2175

# regardless if this is set higher.

2176

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

2177

{ # Max findings configuration per infoType, per content item or long

2178

# running DlpJob.

2179

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

2180

# info_type should be provided. If InfoTypeLimit does not have an

2181

# info_type, the DLP API applies the limit against all info_types that

2182

# are found but not specified in another InfoTypeLimit.

2183

"name": "A String", # Name of the information type. Either a name of your choosing when

2184

# creating a CustomInfoType, or one of the names listed

2185

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

2186

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

2189

"maxFindings": 42, # Max findings limit for the given infoType.

2190

},

2191

],

2192

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.

# When set within `InspectJobConfig`,

# the maximum returned is 2000 regardless if this is set higher.

2195

# When set within `InspectContentRequest`, this field is ignored.

2196

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is

2198

# POSSIBLE.

2199

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

2200

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

2201

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

2202

{ # Custom information type provided by the user. Used to find domain-specific

2203

# sensitive information configurable to the data in question.

2204

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

2205

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2206

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2207

# google/re2 repository on GitHub.

2208

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2209

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations that support reversing,

# such as

2216

# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

2217

# These types of transformations are

2218

# those that perform pseudonymization, thereby producing a "surrogate" as

2219

# output. This should be used in conjunction with a field on the

2220

# transformation such as `surrogate_info_type`. This CustomInfoType does

2221

# not support the use of `detection_rules`.

2222

},

2223

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in

2224

# infoType, when the name matches one of existing infoTypes and that infoType

2225

# is specified in `InspectContent.info_types` field. Specifying the latter

2226

# adds findings to the one detected by the system. If built-in info type is

2227

# not specified in `InspectContent.info_types` list then the name is treated

2228

# as a custom info type.

2229

"name": "A String", # Name of the information type. Either a name of your choosing when

2230

# creating a CustomInfoType, or one of the names listed

2231

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

2232

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

"dictionary": { # A list of phrases to detect as a CustomInfoType. Custom information type based on a dictionary of words or phrases. This can

2236

# be used to match sensitive information specific to the data, such as a list

2237

# of employee IDs or job titles.

2238

#

2239

# Dictionary words are case-insensitive and all characters other than letters

2240

# and digits in the unicode [Basic Multilingual

2241

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

2242

# will be replaced with whitespace when scanning for matches, so the

2243

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

2244

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

2245

# surrounding any match must be of a different type than the adjacent

2246

# characters within the word, so letters must be next to non-letters and

2247

# digits next to non-digits. For example, the dictionary word "jen" will

2248

# match the first three letters of the text "jen123" but will return no

2249

# matches for "jennifer".

2250

#

2251

# Dictionary words containing a large number of characters that are not

2252

# letters or digits may result in unexpected findings because such characters

2253

# are treated as whitespace. The

2254

# [limits](https://cloud.google.com/dlp/limits) page contains details about

2255

# the size limits of dictionaries. For dictionaries that do not fit within

2256

# these constraints, consider using `LargeCustomDictionaryConfig` in the

2257

# `StoredInfoType` API.

2258

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

2259

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

2260

# at least one phrase and every phrase must contain at least 2 characters

2261

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

2266

# is accepted.

2267

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

2268

# Example: gs://[BUCKET_NAME]/dictionary.txt

2269

},

2270

},

2271

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

2272

# `InspectDataSource`. Not currently supported in `InspectContent`.

2273

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

2274

# `organizations/433245324/storedInfoTypes/432452342` or

2275

# `projects/project-id/storedInfoTypes/432452342`.

2276

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

2277

# inspection was created. Output-only field, populated by the system.

2278

},

2279

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

2280

# Rules are applied in order that they are specified. Not supported for the

2281

# `surrogate_type` CustomInfoType.

2282

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

2283

# `CustomInfoType` to alter behavior under certain circumstances, depending

2284

# on the specific details of the rule. Not supported for the `surrogate_type`

2285

# custom infoType.

"hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain

2287

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

2300

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

2301

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2302

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2303

# google/re2 repository on GitHub.

2304

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2305

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Message for specifying an adjustment to the likelihood of a finding as

2310

# part of a detection rule.

2311

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

2312

# levels. For example, if a finding would be `POSSIBLE` without the

2313

# detection rule and `relative_likelihood` is 1, then it is upgraded to

2314

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

2315

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

2316

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

2317

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

2318

# a final likelihood of `LIKELY`.

2319

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},
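The clamping behaviour described for `relativeLikelihood` can be illustrated with a small local model. This is a sketch for intuition only, not the service's implementation: `adjust_likelihood` is a hypothetical helper, and the level ordering is taken from the description above.

```python
# Likelihood levels in increasing order, as named in the description above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base: str, relative_likelihood: int) -> str:
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative_likelihood
    # Never drop below VERY_UNLIKELY or exceed VERY_LIKELY.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]
```

Because each adjustment is clamped as it is applied, `adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1)` yields `LIKELY` rather than returning to `VERY_LIKELY`.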

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

2325

# to be returned. It still can be used for rules matching.

2326

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

2327

# altered by a detection rule if the finding meets the criteria specified by

2328

# the rule. Defaults to `VERY_LIKELY` if not specified.

2329

},

2330

],

2331

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

2332

# included in the response; see Finding.quote.

2333

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.

2334

# Exclusion rules contained in the set are executed at the end; other

2335

# rules are executed in the order they are specified for each info type.

2336

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

2337

# circumstances, depending on the specific details of the rules within the set.

2338

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

2339

{ # A single inspection rule to be applied to infoTypes, specified in

2340

# `InspectionRuleSet`.

"hotwordRule": { # Hotword-based detection rule. The rule that adjusts the likelihood of findings within a certain

2342

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

2355

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

2356

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2357

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2358

# google/re2 repository on GitHub.

2359

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2360

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings. Message for specifying an adjustment to the likelihood of a finding as

2365

# part of a detection rule.

2366

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

2367

# levels. For example, if a finding would be `POSSIBLE` without the

2368

# detection rule and `relative_likelihood` is 1, then it is upgraded to

2369

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

2370

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

2371

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

2372

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

2373

# a final likelihood of `LIKELY`.

2374

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

2375

},

2376

},

"exclusionRule": { # Exclusion rule. The rule that specifies the conditions under which findings of infoTypes specified in

2378

# `InspectionRuleSet` are removed from results.

2379

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

2380

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2381

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2382

# google/re2 repository on GitHub.

2383

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2384

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # The infoType list in an ExclusionRule drops a finding when it overlaps or

# is contained within a finding of an infoType from this list. For

# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and

# `exclusion_rule` containing `exclude_info_types.info_types` with

# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap

# with an EMAIL_ADDRESS finding.

# As a result, "555-222-2222@example.org" generates only a single

# finding, namely the email address.

{ # Type of information detected by the API.

"name": "A String", # Name of the information type. Either a name of your choosing when

# creating a CustomInfoType, or one of the names listed

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

# be used to match sensitive information specific to the data, such as a list

# of employee IDs or job titles.

#

# Dictionary words are case-insensitive and all characters other than letters

# and digits in the unicode [Basic Multilingual

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

# will be replaced with whitespace when scanning for matches, so the

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

# surrounding any match must be of a different type than the adjacent

# characters within the word, so letters must be next to non-letters and

# digits next to non-digits. For example, the dictionary word "jen" will

# match the first three letters of the text "jen123" but will return no

# matches for "jennifer".

#

# Dictionary words containing a large number of characters that are not

# letters or digits may result in unexpected findings because such characters

# are treated as whitespace. The

# [limits](https://cloud.google.com/dlp/limits) page contains details about

# the size limits of dictionaries. For dictionaries that do not fit within

# these constraints, consider using `LargeCustomDictionaryConfig` in the

# `StoredInfoType` API.

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

# at least one phrase and every phrase must contain at least 2 characters

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

# is accepted.

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

# Example: gs://[BUCKET_NAME]/dictionary.txt

},

},

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

{ # Type of information detected by the API.

"name": "A String", # Name of the information type. Either a name of your choosing when

# creating a CustomInfoType, or one of the names listed

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

# If empty, text, images, and other content will be included.

"A String",

],

"infoTypes": [ # Restricts what info_types to look for. The values must correspond to

# InfoType values returned by ListInfoTypes or listed at

# https://cloud.google.com/dlp/docs/infotypes-reference.

#

# When no InfoTypes or CustomInfoTypes are specified in a request, the

# system may automatically choose what detectors to run. By default this may

# be all types, but may change over time as detectors are updated.

#

# If you need precise control and predictability as to what detectors are

# run you should specify specific InfoTypes listed in the reference,

# otherwise a default list will be used, which may change over time.

{ # Type of information detected by the API.

"name": "A String", # Name of the information type. Either a name of your choosing when

# creating a CustomInfoType, or one of the names listed

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

# `inspect_config` will be merged into the values persisted as part of the

# template.

"actions": [ # Actions to execute at the completion of the job.

{ # A task to execute on the completion of a job.

# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.

"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.

# OutputStorageConfig. Only a single instance of this action can be

# specified.

# Compatible with: Inspect, Risk

"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.

"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing

# dataset. If table_id is not set a new one will be generated

# for you with the following format:

# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for

# generating the date details.

#

# For Inspect, each column in an existing output table must have the same

# name, type, and mode of a field in the `Finding` object.

#

# For Risk, an existing output table should be the output of a previous

# Risk analysis job run on the same source table, with the same privacy

# metric and quasi-identifiers. Risk jobs that analyze the same table but

# compute a different privacy metric, or use different sets of

# quasi-identifiers, cannot store their results in the same table.

# identified by its project_id, dataset_id, and table_name. Within a query

# a table is often referenced with a string in the format of:

# `<project_id>:<dataset_id>.<table_id>` or

# `<project_id>.<dataset_id>.<table_id>`.

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

# If omitted, project ID is inferred from the API call.

"tableId": "A String", # Name of the table.

"datasetId": "A String", # Dataset ID of the table.

},

"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only

# used for Inspect and must be unspecified for Risk jobs. Columns are derived

# from the `Finding` object. If appending to an existing table, any columns

# from the predefined schema that are missing will be added. No columns in

# the existing table will be deleted.

#

# If unspecified, then all available columns will be used for a new table or

# an (existing) table with no schema, and no changes will be made to an

# existing table that has a schema.

# Only for use with external storage.

},

},

"jobNotificationEmails": { # Enable email notification to project owners and editors on job's # Enable email notification for project owners and editors on job's

# completion/failure.

# completion/failure.

},

"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).

# Command Center (CSCC Alpha).

# This action is only available for projects which are part of

# an organization and whitelisted for the alpha Cloud Security Command

# Center.

# The action will publish the count of finding instances and their info types.

# The summary of findings will be persisted in CSCC and is governed by CSCC

# service-specific policy, see https://cloud.google.com/terms/service-terms

# Only a single instance of this action can be specified.

# Compatible with: Inspect

},

"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.

# will publish a metric to Stackdriver for each infotype requested and

# how many findings were found for it. CustomDetectors will be bucketed

# as 'Custom' under the Stackdriver label 'info_type'.

},

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.

# results of the DlpJob will be applied to the entry for the resource scanned

# in Cloud Data Catalog. Any labels previously written by another DlpJob will

# be deleted. InfoType naming patterns are strictly enforced when using this

# feature. Note that the findings will be persisted in Cloud Data Catalog

# storage and are governed by Data Catalog service-specific policy, see

# https://cloud.google.com/terms/service-terms

# Only a single instance of this action can be specified and only allowed if

# all resources being scanned are BigQuery tables.

# Compatible with: Inspect

},

"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.

# message contains a single field, `DlpJobName`, which is equal to the

# finished job's

# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).

# Compatible with: Inspect, Risk

"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given

# publishing access rights to the DLP API service account executing

# the long running DlpJob sending the notifications.

# Format is projects/{project}/topics/{topic}.

},

},

],

},

"lastRunTime": "A String", # Output only. The timestamp of the last time this trigger executed.

"createTime": "A String", # Output only. The creation timestamp of a triggeredJob.

"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the

# triggeredJob is created, for example

# `projects/dlp-test-project/jobTriggers/53234423`.

},

}
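
The `relativeLikelihood` comment in the request body above describes a clamped shift along the likelihood scale. A minimal sketch of that arithmetic follows, assuming the five levels named in the comment are ordered as shown; `LIKELIHOOD_LEVELS` and `adjust_likelihood` are hypothetical names for illustration and are not part of the client library.

```python
# Likelihood levels in ascending order, using the names from the field comment.
# The list and the helper below are illustrative only, not part of the DLP client.
LIKELIHOOD_LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LIKELIHOOD_LEVELS.index(base) + relative_likelihood
    # Likelihood never drops below VERY_UNLIKELY or exceeds VERY_LIKELY.
    index = max(0, min(index, len(LIKELIHOOD_LEVELS) - 1))
    return LIKELIHOOD_LEVELS[index]


# +1 on VERY_LIKELY is clamped, so the following -1 lands on LIKELY.
final = adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1)
```

Because each adjustment is clamped as it is applied, a +1 followed by a -1 does not always return to the starting level, which is the case the comment calls out.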

x__xgafv: string, V1 error format.

Allowed values

1 - v1 error format

2 - v2 error format
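
The matching rules stated in the `dictionary` comment of the request body (case-insensitive, non-alphanumeric characters treated as whitespace, and a change of character type required at the match boundaries) can be approximated locally. The sketch below is an assumption-laden illustration only: it handles ASCII letters and digits rather than the full Basic Multilingual Plane, uses Python's `re` module, and the helper names are hypothetical. It is not how the service is implemented.

```python
import re


def dictionary_phrase_regex(phrase):
    """Build an approximate regex for one dictionary phrase (ASCII only)."""
    # Split the phrase into runs of letters and digits; everything else
    # is treated like whitespace.
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    if not words:
        raise ValueError("phrase needs at least one letter or digit")
    body = r"[^A-Za-z0-9]+".join(re.escape(word) for word in words)
    first, last = words[0][0], words[-1][-1]
    # The neighbouring characters must differ in type from the edge characters:
    # letters next to non-letters, digits next to non-digits.
    before = r"(?<![A-Za-z])" if first.isalpha() else r"(?<![0-9])"
    after = r"(?![A-Za-z])" if last.isalpha() else r"(?![0-9])"
    return re.compile(before + body + after, re.IGNORECASE)


def dictionary_matches(phrase, text):
    """Return True if `phrase` is found in `text` under the approximated rules."""
    return dictionary_phrase_regex(phrase).search(text) is not None
```

Under these assumptions, `dictionary_matches("jen", "jen123")` is true while `dictionary_matches("jen", "jennifer")` is false, mirroring the example in the comment.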

Returns:

An object of the form:

{ # Contains a configuration to make DLP API calls on a repeating basis.

# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.

"status": "A String", # Required. A status for this trigger.

"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.

"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated

# errors may result in the JobTrigger automatically being paused.

# Will return the last 100 errors. Whenever the JobTrigger is modified

# this list will be cleared.

{ # Detailed information about an error encountered during job execution or

# the results of an unsuccessful activation of the JobTrigger.

"timestamps": [ # The times the error occurred.

"A String",

],

"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.

# different programming environments, including REST APIs and RPC APIs. It is

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

# three pieces of data: error code, error message, and error details.

#

# You can find out more about this error model and how to work with it in the

# [API Design Guide](https://cloud.google.com/apis/design/errors).

"message": "A String", # A developer-facing error message, which should be in English. Any

# user-facing error message should be localized and sent in the

# google.rpc.Status.details field, or localized by the client.

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

"details": [ # A list of messages that carry the error details. There is a common set of

# message types for APIs to use.

{

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

"description": "A String", # User provided description (max 256 chars)

"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

# needs to trigger for a job to be started. The list may contain only

# a single Schedule trigger and must have at least one object.

{ # What event needs to occur for a new job to be started.

"manual": { # Job trigger option for hybrid jobs. Jobs must be manually created # For use with hybrid jobs. Jobs must be manually created and finished.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

# and finished.

},

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For

# example: every day (86400 seconds).

#

# A scheduled start time will be skipped if the previous

# execution has not ended when its scheduled time occurs.

#

# This value must be set to a time duration greater than or equal

# to 1 day and can be no longer than 60 days.

},

},

],

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options.

# bucket.

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

# than this value then the rest of the bytes are omitted. Only one

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"sampleMethod": "A String",

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format

# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.

#

# If the url ends in a trailing slash, the bucket or directory represented

# by the url will be scanned non-recursively (content in sub-directories

# will not be scanned). This means that `gs://mybucket/` is equivalent to

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

# `gs://mybucket/directory/*`.

#

# Exactly one of `url` or `regex_file_set` must be set.

"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or

# `regex_file_set` must be set.

# expressions are used to allow fine-grained control over which files in the

# bucket to include.

#

# Included files are those that match at least one item in `include_regex` and

# do not match any items in `exclude_regex`. Note that a file that matches

# items from both lists will _not_ be included. For a match to occur, the

# entire file path (i.e., everything in the url after the bucket name) must

# match the regular expression.

#

# For example, given the input `{bucket_name: "mybucket", include_regex:

# ["directory1/.*"], exclude_regex:

# ["directory1/excluded.*"]}`:

#

# * `gs://mybucket/directory1/myfile` will be included

# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches

# across `/`)

# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the

# full path doesn't match any items in `include_regex`)

# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path

# matches an item in `exclude_regex`)

#

# If `include_regex` is left empty, it will match all files by default

# (this is equivalent to setting `include_regex: [".*"]`).

#

# Some other common use cases:

#

# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all

# files in `mybucket` except for .pdf files

# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will

# include all files directly under `gs://mybucket/directory/`, without matching

# across `/`

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

# the bucket that match at least one of these regular expressions will be

# excluded from the scan.

#

# Regular expressions use RE2

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

# under the google/re2 repository on GitHub.

"A String",

],

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

# the bucket that match at least one of these regular expressions will be

# included in the set of files, except for those that also match an item in

# `exclude_regex`. Leaving this field empty will match all files by default

# (this is equivalent to including `.*` in the list).

#

# Regular expressions use RE2

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

# under the google/re2 repository on GitHub.

"A String",

],

},

},

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.

# Number of files scanned is rounded down. Must be between 0 and 100,

# inclusively. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The

# number of bytes scanned is rounded down. Must be between 0 and 100,

# inclusively. Both 0 and 100 mean no limit. Defaults to 0. Only one

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"fileTypes": [ # List of file type groups to include in the scan.

# If empty, all files are scanned and available data format processors

# are applied. In addition, the binary content of the selected files

# is always scanned as well.

# Images are scanned only as binary if the specified region

# does not support image inspection and no file_types were specified.

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

"A String",

],

},

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always

# by project and namespace, however the namespace ID may be empty.

#

# A partition ID contains several dimensions:

# project ID and namespace ID.

"projectId": "A String", # The ID of the project to which the entities belong.

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

},

"kind": { # A representation of a Datastore kind. # The kind to process.

"name": "A String", # The name of the kind.

},

},

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

# inspection of entire columns which you know have no findings.

{ # General identifier of a data field in a storage service.

"name": "A String", # Name describing the field.

},

],

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

# Cannot be used in conjunction with TimespanConfig.

"sampleMethod": "A String",

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

# `actions.saveFindings.outputConfig.table` is specified, the values of

# columns specified here are available in the output table under

# `location.content_locations.record_location.record_key.id_values`. Nested

# fields such as `person.birthdate.year` are allowed.

{ # General identifier of a data field in a storage service.

"name": "A String", # Name describing the field.

},

],

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows

# scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and

# 100 mean no limit. Defaults to 0. Only one of rows_limit and

# rows_limit_percent can be specified. Cannot be used in conjunction with

# TimespanConfig.

"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.

# identified by its project_id, dataset_id, and table_name. Within a query

# a table is often referenced with a string in the format of:

# `<project_id>:<dataset_id>.<table_id>` or

# `<project_id>.<dataset_id>.<table_id>`.

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

# If omitted, project ID is inferred from the API call.

"tableId": "A String", # Name of the table.

"datasetId": "A String", # Dataset ID of the table.

},

},

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

# Used for data sources like Datastore and BigQuery.

#

# For BigQuery:

# Required to filter out rows based on the given start and

# end times. If not specified and the table was modified between the given

# start and end times, the entire table will be scanned.

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

# `TIMESTAMP`, or `DATETIME` BigQuery column.

#

# For Datastore:

# Valid data types of the timestamp field are: `TIMESTAMP`.

# Datastore entity will be scanned if the timestamp property does not

# exist or its value is empty or invalid.

"name": "A String", # Name describing the field.

},

"endTime": "A String", # Exclude files or rows newer than this value.

# If set to zero, no upper time limit is applied.

"startTime": "A String", # Exclude files or rows older than this value.

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

# a valid start_time to avoid scanning files that have not been modified

# since the last time the JobTrigger executed. This will be based on the

# time of the execution of the last run of the JobTrigger.

},

"hybridOptions": { # Configuration to control jobs where the content being inspected is outside # Hybrid inspection options.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

# of Google Cloud Platform.

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

# meaningful such as the columns that are primary keys.

"identifyingFields": [ # The columns that are the primary keys for table objects included in

# ContentItem. A copy of this cell's value will be stored alongside

# each finding so that the finding can be traced to the specific row it came

# from. No more than 3 may be provided.

{ # General identifier of a data field in a storage service.

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

#

# Label keys must be between 1 and 63 characters long and must conform

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

#

# Label values must be between 0 and 63 characters long and must conform

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

#

# No more than 10 labels can be associated with a given finding.

#

# Examples:

# * `"environment" : "production"`

# * `"pipeline" : "etl"`

"a_key": "A String",

},

"requiredFindingLabelKeys": [ # These are labels that each inspection request must include within their

# 'finding_labels' map. Requests may contain others, but any request missing one of

# these will be rejected.

#

# Label keys must be between 1 and 63 characters long and must conform

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

#

# No more than 10 keys can be required.

"A String",

],

"description": "A String", # A short description of where the data is coming from. Will be stored once

# in the job. 256 max length.

},

},

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

# When used with redactContent only info_types and min_likelihood are currently

# used.

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned. # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.

# When set within `InspectContentRequest`, the maximum returned is 2000

# regardless of whether this is set higher.

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

{ # Max findings configuration per infoType, per content item or long

# running DlpJob.

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

# info_type should be provided. If InfoTypeLimit does not have an

# info_type, the DLP API applies the limit against all info_types that

# are found but not specified in another InfoTypeLimit.

"name": "A String", # Name of the information type. Either a name of your choosing when

# creating a CustomInfoType, or one of the names listed

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

2877

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

2880

"maxFindings": 42, # Max findings limit for the given infoType.

2881

},

2882

],

2883

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.

# When set within `InspectJobConfig`,
# the maximum returned is 2000 regardless of whether this is set higher.

2886

# When set within `InspectContentRequest`, this field is ignored.

2887

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.

2890

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

2891

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

2892

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

2893

{ # Custom information type provided by the user. Used to find domain-specific

2894

# sensitive information configurable to the data in question.

2895

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

2896

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2897

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2898

# google/re2 repository on GitHub.

2899

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2900

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing, such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.

2913

},

2914

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in

2915

# infoType, when the name matches one of existing infoTypes and that infoType

2916

# is specified in `InspectContent.info_types` field. Specifying the latter

2917

# adds findings to the one detected by the system. If built-in info type is

2918

# not specified in `InspectContent.info_types` list then the name is treated

2919

# as a custom info type.

2920

"name": "A String", # Name of the information type. Either a name of your choosing when

2921

# creating a CustomInfoType, or one of the names listed

2922

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

2923

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

2926

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.

2927

# be used to match sensitive information specific to the data, such as a list

2928

# of employee IDs or job titles.

2929

#

2930

# Dictionary words are case-insensitive and all characters other than letters

2931

# and digits in the unicode [Basic Multilingual

2932

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

2933

# will be replaced with whitespace when scanning for matches, so the

2934

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

2935

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

2936

# surrounding any match must be of a different type than the adjacent

2937

# characters within the word, so letters must be next to non-letters and

2938

# digits next to non-digits. For example, the dictionary word "jen" will

2939

# match the first three letters of the text "jen123" but will return no

2940

# matches for "jennifer".

2941

#

2942

# Dictionary words containing a large number of characters that are not

2943

# letters or digits may result in unexpected findings because such characters

2944

# are treated as whitespace. The

2945

# [limits](https://cloud.google.com/dlp/limits) page contains details about

2946

# the size limits of dictionaries. For dictionaries that do not fit within

2947

# these constraints, consider using `LargeCustomDictionaryConfig` in the

2948

# `StoredInfoType` API.

2949

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

2950

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

2951

# at least one phrase and every phrase must contain at least 2 characters

2952

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

2957

# is accepted.

2958

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

2959

# Example: gs://[BUCKET_NAME]/dictionary.txt

2960

},

2961

},

2962

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

2963

# `InspectDataSource`. Not currently supported in `InspectContent`.

2964

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

2965

# `organizations/433245324/storedInfoTypes/432452342` or

2966

# `projects/project-id/storedInfoTypes/432452342`.

2967

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

2968

# inspection was created. Output-only field, populated by the system.

2969

},

2970

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

2971

# Rules are applied in order that they are specified. Not supported for the

2972

# `surrogate_type` CustomInfoType.

2973

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

2974

# `CustomInfoType` to alter behavior under certain circumstances, depending

2975

# on the specific details of the rule. Not supported for the `surrogate_type`

2976

# custom infoType.

2977

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

2978

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.

},

2991

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

2992

"pattern": "A String", # Pattern defining the regular expression. Its syntax

2993

# (https://github.com/google/re2/wiki/Syntax) can be found under the

2994

# google/re2 repository on GitHub.

2995

"groupIndexes": [ # The index of the submatch to extract as findings. When not

2996

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

3001

# part of a detection rule.

3002

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

3003

# levels. For example, if a finding would be `POSSIBLE` without the

3004

# detection rule and `relative_likelihood` is 1, then it is upgraded to

3005

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

3006

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

3007

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

3008

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

3009

# a final likelihood of `LIKELY`.

3010

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

3016

# to be returned. It still can be used for rules matching.

3017

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

3018

# altered by a detection rule if the finding meets the criteria specified by

3019

# the rule. Defaults to `VERY_LIKELY` if not specified.

3020

},

3021

],

3022

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

3023

# included in the response; see Finding.quote.

3024

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.

3025

# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.

3027

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

3028

# circumstances, depending on the specific details of the rules within the set.

3029

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

3030

{ # A single inspection rule to be applied to infoTypes, specified in

3031

# `InspectionRuleSet`.

3032

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

3033

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.

},

3046

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

3047

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3048

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3049

# google/re2 repository on GitHub.

3050

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3051

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

3056

# part of a detection rule.

3057

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

3058

# levels. For example, if a finding would be `POSSIBLE` without the

3059

# detection rule and `relative_likelihood` is 1, then it is upgraded to

3060

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

3061

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

3062

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

3063

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

3064

# a final likelihood of `LIKELY`.

3065

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

3066

},

3067

},

3068

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.

3069

# `InspectionRuleSet` are removed from results.

3070

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

3071

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3072

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3073

# google/re2 repository on GitHub.

3074

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3075

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # InfoType list in ExclusionRule drops a finding when it overlaps or
# is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.

3088

{ # Type of information detected by the API.

3089

"name": "A String", # Name of the information type. Either a name of your choosing when

3090

# creating a CustomInfoType, or one of the names listed

3091

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3092

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

3098

# be used to match sensitive information specific to the data, such as a list

3099

# of employee IDs or job titles.

3100

#

3101

# Dictionary words are case-insensitive and all characters other than letters

3102

# and digits in the unicode [Basic Multilingual

3103

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

3104

# will be replaced with whitespace when scanning for matches, so the

3105

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

3106

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

3107

# surrounding any match must be of a different type than the adjacent

3108

# characters within the word, so letters must be next to non-letters and

3109

# digits next to non-digits. For example, the dictionary word "jen" will

3110

# match the first three letters of the text "jen123" but will return no

3111

# matches for "jennifer".

3112

#

3113

# Dictionary words containing a large number of characters that are not

3114

# letters or digits may result in unexpected findings because such characters

3115

# are treated as whitespace. The

3116

# [limits](https://cloud.google.com/dlp/limits) page contains details about

3117

# the size limits of dictionaries. For dictionaries that do not fit within

3118

# these constraints, consider using `LargeCustomDictionaryConfig` in the

3119

# `StoredInfoType` API.

3120

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

3121

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

3122

# at least one phrase and every phrase must contain at least 2 characters

3123

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

3128

# is accepted.

3129

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

3130

# Example: gs://[BUCKET_NAME]/dictionary.txt

3131

},

3132

},

3133

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

3138

{ # Type of information detected by the API.

3139

"name": "A String", # Name of the information type. Either a name of your choosing when

3140

# creating a CustomInfoType, or one of the names listed

3141

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3142

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

3149

# If empty, text, images, and other content will be included.

3150

"A String",

3151

],

3152

"infoTypes": [ # Restricts what info_types to look for. The values must correspond to

3153

# InfoType values returned by ListInfoTypes or listed at

3154

# https://cloud.google.com/dlp/docs/infotypes-reference.

3155

#

3156

# When no InfoTypes or CustomInfoTypes are specified in a request, the

3157

# system may automatically choose what detectors to run. By default this may

3158

# be all types, but may change over time as detectors are updated.

3159

#

# If you need precise control and predictability as to what detectors are
# run, specify the InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.

{ # Type of information detected by the API.

3164

"name": "A String", # Name of the information type. Either a name of your choosing when

3165

# creating a CustomInfoType, or one of the names listed

3166

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3167

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

3173

# `inspect_config` will be merged into the values persisted as part of the

3174

# template.

3175

"actions": [ # Actions to execute at the completion of the job.

3176

{ # A task to execute on the completion of a job.

3177

# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.

3178

"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.

3179

# OutputStorageConfig. Only a single instance of this action can be

3180

# specified.

3181

# Compatible with: Inspect, Risk

"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.
"table": { # Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
#
# Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

3202

# If omitted, project ID is inferred from the API call.

3203

"tableId": "A String", # Name of the table.

3204

"datasetId": "A String", # Dataset ID of the table.

3205

},

3206

"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only

3207

# used for Inspect and must be unspecified for Risk jobs. Columns are derived

3208

# from the `Finding` object. If appending to an existing table, any columns

3209

# from the predefined schema that are missing will be added. No columns in

3210

# the existing table will be deleted.

3211

#

3212

# If unspecified, then all available columns will be used for a new table or

3213

# an (existing) table with no schema, and no changes will be made to an

3214

# existing table that has a schema.

# Only for use with external storage.

},

3217

},

"jobNotificationEmails": { # Enable email notification for project owners and editors on job's completion/failure.

3221

},

3222

"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).

3223

# Command Center (CSCC Alpha).

3224

# This action is only available for projects which are parts of

3225

# an organization and whitelisted for the alpha Cloud Security Command

3226

# Center.

3227

# The action will publish count of finding instances and their info types.

3228

# The summary of findings will be persisted in CSCC and are governed by CSCC

3229

# service-specific policy, see https://cloud.google.com/terms/service-terms

3230

# Only a single instance of this action can be specified.

3231

# Compatible with: Inspect

3232

},

"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This
# will publish a metric to Stackdriver on each infoType requested and
# how many findings were found for it. CustomDetectors will be bucketed
# as 'Custom' under the Stackdriver label 'info_type'.

3237

},

3238

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.

3239

# results of the DlpJob will be applied to the entry for the resource scanned

3240

# in Cloud Data Catalog. Any labels previously written by another DlpJob will

3241

# be deleted. InfoType naming patterns are strictly enforced when using this

3242

# feature. Note that the findings will be persisted in Cloud Data Catalog

3243

# storage and are governed by Data Catalog service-specific policy, see

3244

# https://cloud.google.com/terms/service-terms

3245

# Only a single instance of this action can be specified and only allowed if

3246

# all resources being scanned are BigQuery tables.

3247

# Compatible with: Inspect

3248

},


"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.

3250

# message contains a single field, `DlpJobName`, which is equal to the

3251

# finished job's

3252

# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).

3253

# Compatible with: Inspect, Risk

"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must grant
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.

},

},

],

},

"lastRunTime": "A String", # Output only. The timestamp of the last time this trigger executed.
"createTime": "A String", # Output only. The creation timestamp of a triggeredJob.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/jobTriggers/53234423`.

}</pre>

</div>
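The `relativeLikelihood` description above says that adjustments move a finding up or down by a number of levels and never leave the range `VERY_UNLIKELY` to `VERY_LIKELY`. The following is a minimal sketch of that clamping behavior, not the service's implementation. The names `LIKELIHOOD_LEVELS`, `adjust_likelihood`, and `apply_adjustments` are hypothetical, and the level ordering is inferred from the examples in the field description.

```python
# Levels ordered from least to most likely. The ordering is an assumption
# inferred from the relativeLikelihood examples (POSSIBLE + 1 -> LIKELY,
# POSSIBLE - 1 -> UNLIKELY); any "unspecified" value is deliberately omitted.
LIKELIHOOD_LEVELS = [
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LIKELIHOOD_LEVELS.index(base) + relative_likelihood
    index = max(0, min(index, len(LIKELIHOOD_LEVELS) - 1))
    return LIKELIHOOD_LEVELS[index]


def apply_adjustments(base, adjustments):
    """Apply several relative adjustments in order, clamping after each step."""
    level = base
    for step in adjustments:
        level = adjust_likelihood(level, step)
    return level
```

Because each step is clamped before the next one is applied, `apply_adjustments("VERY_LIKELY", [1, -1])` ends at `LIKELY`, which matches the example given in the field description.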

<code class="details" id="delete">delete(name, x__xgafv=None)</code>

<pre>Deletes a job trigger.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.

Args:
name: string, Required. Resource name of the project and the triggeredJob, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)

3278

x__xgafv: string, V1 error format.

Allowed values

1 - v1 error format

2 - v2 error format

Returns:

An object of the form:

{ # A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is the empty JSON object `{}`.

}</pre>

</div>
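As a rough illustration of the `delete(name, x__xgafv=None)` call documented above, the sketch below builds the resource name in the documented `projects/.../jobTriggers/...` form and issues the request. The helper names `job_trigger_name` and `delete_job_trigger` are hypothetical. The `service` argument is assumed to be a DLP v2 client object, for example one obtained from `googleapiclient.discovery.build("dlp", "v2")` with suitable credentials; it is passed in rather than created here so the snippet stays self-contained.

```python
def job_trigger_name(project_id, trigger_id):
    """Return the resource name format shown in the `name` argument docs."""
    return "projects/{}/jobTriggers/{}".format(project_id, trigger_id)


def delete_job_trigger(service, project_id, trigger_id):
    """Delete one job trigger and return the parsed response.

    `service` is assumed to expose projects().jobTriggers().delete(name=...),
    as listed in the instance methods of this reference page. A successful
    call returns the empty message, i.e. an empty dict.
    """
    name = job_trigger_name(project_id, trigger_id)
    request = service.projects().jobTriggers().delete(name=name)
    return request.execute()
```

Keeping the name construction in its own function makes it reusable for `get` and `patch`, which take the same `name` argument.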

<code class="details" id="get">get(name, x__xgafv=None)</code>

<pre>Gets a job trigger.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.

Args:
name: string, Required. Resource name of the project and the triggeredJob, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)

3306

x__xgafv: string, V1 error format.

Allowed values

1 - v1 error format

2 - v2 error format

Returns:

An object of the form:

{ # Contains a configuration to make DLP API calls on a repeating basis.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # Required. A status for this trigger.
"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.
"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.

"timestamps": [ # The times the error occurred.

3325

"A String",

3326

],

"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.

# different programming environments, including REST APIs and RPC APIs. It is

3329

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

3330

# three pieces of data: error code, error message, and error details.

3331

#

3332

# You can find out more about this error model and how to work with it in the

3333

# [API Design Guide](https://cloud.google.com/apis/design/errors).

3334

"message": "A String", # A developer-facing error message, which should be in English. Any

3335

# user-facing error message should be localized and sent in the

3336

# google.rpc.Status.details field, or localized by the client.

3337

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

3338

"details": [ # A list of messages that carry the error details. There is a common set of

3339

# message types for APIs to use.

3340

{

3341

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

3348

"description": "A String", # User provided description (max 256 chars)


"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

3350

# needs to trigger for a job to be started. The list may contain only

3351

# a single Schedule trigger and must have at least one object.

3352

{ # What event needs to occur for a new job to be started.

"manual": { # Job trigger option for hybrid jobs. Jobs must be manually created and finished.
# Early access feature is in a pre-release state and might change or have
# limited support. For more information, see
# https://cloud.google.com/products#product-launch-stages.

3358

},

3359

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For

3361

# example: every day (86400 seconds).

#

# A scheduled start time will be skipped if the previous

3364

# execution has not ended when its scheduled time occurs.

3365

#

3366

# This value must be set to a time duration greater than or equal

3367

# to 1 day and can be no longer than 60 days.

},

},

3370

],
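
As a rough illustration of the schedule constraint described above (a period of at least 1 day and at most 60 days), the following sketch builds a `triggers` list with one schedule entry. The helper names are hypothetical, and the `"86400s"` string form is assumed from the standard JSON encoding of a protobuf Duration; it is not stated in this listing.

```python
SECONDS_PER_DAY = 86400  # "every day (86400 seconds)", as in the field comment


def recurrence_period_duration(days: int) -> str:
    """Return a duration string for a whole number of days.

    Assumption: the API accepts the JSON Duration form "<seconds>s".
    """
    if not 1 <= days <= 60:
        raise ValueError("recurrence period must be between 1 and 60 days")
    return f"{days * SECONDS_PER_DAY}s"


def schedule_trigger(days: int) -> dict:
    """Build one entry for the `triggers` list (hypothetical helper)."""
    return {"schedule": {"recurrencePeriodDuration": recurrence_period_duration(days)}}


# The list must contain at least one trigger and at most one schedule trigger.
triggers = [schedule_trigger(1)]
print(triggers)
```

Validating on the client side only saves a round trip; the service remains the authority on which values it accepts.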

3371

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

3372

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

"cloudStorageOptions": { # Google Cloud Storage options.

# Options defining a file or a set of files within a Google Cloud Storage

# bucket.

3375

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

3376

# than this value then the rest of the bytes are omitted. Only one

3377

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

3378

"sampleMethod": "A String",

3379

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

3380

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format

# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.

#

3383

# If the url ends in a trailing slash, the bucket or directory represented

3384

# by the url will be scanned non-recursively (content in sub-directories

3385

# will not be scanned). This means that `gs://mybucket/` is equivalent to

3386

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

3387

# `gs://mybucket/directory/*`.

3388

#

3389

# Exactly one of `url` or `regex_file_set` must be set.

"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or

# `regex_file_set` must be set.

#

# Message representing a set of files in a Cloud Storage bucket. Regular

# expressions are used to allow fine-grained control over which files in the

# bucket to include.

3394

#

3395

# Included files are those that match at least one item in `include_regex` and

3396

# do not match any items in `exclude_regex`. Note that a file that matches

3397

# items from both lists will _not_ be included. For a match to occur, the

3398

# entire file path (i.e., everything in the url after the bucket name) must

3399

# match the regular expression.

3400

#

3401

# For example, given the input `{bucket_name: "mybucket", include_regex:

3402

# ["directory1/.*"], exclude_regex:

3403

# ["directory1/excluded.*"]}`:

3404

#

3405

# * `gs://mybucket/directory1/myfile` will be included

3406

# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches

3407

# across `/`)

3408

# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the

3409

# full path doesn't match any items in `include_regex`)

3410

# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path

3411

# matches an item in `exclude_regex`)

3412

#

3413

# If `include_regex` is left empty, it will match all files by default

3414

# (this is equivalent to setting `include_regex: [".*"]`).

3415

#

3416

# Some other common use cases:

3417

#

3418

# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all

3419

# files in `mybucket` except for .pdf files

3420

# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will

3421

# include all files directly under `gs://mybucket/directory/`, without matching

3422

# across `/`

3423

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

3424

# the bucket that match at least one of these regular expressions will be

3425

# excluded from the scan.

3426

#

3427

# Regular expressions use RE2

3428

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

3429

# under the google/re2 repository on GitHub.

3430

"A String",

3431

],

3432

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

3433

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

3434

# the bucket that match at least one of these regular expressions will be

3435

# included in the set of files, except for those that also match an item in

3436

# `exclude_regex`. Leaving this field empty will match all files by default

3437

# (this is equivalent to including `.*` in the list).

3438

#

3439

# Regular expressions use RE2

3440

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

3441

# under the google/re2 repository on GitHub.

"A String",

],

},

},
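
The include/exclude semantics described for `regexFileSet` can be sketched locally as follows. This is only an approximation under stated assumptions: it uses Python's `re` module rather than RE2 (the two agree on the simple patterns used here), and `is_included` is a hypothetical helper, not part of the client library.

```python
import re


def is_included(path, include_regex=(), exclude_regex=()):
    """Approximate the documented file-selection rule for one object path.

    `path` is everything in the URL after the bucket name. A path is selected
    when it fully matches at least one include pattern and no exclude pattern.
    An empty include list behaves like [".*"].
    """
    includes = list(include_regex) or [".*"]
    matches_include = any(re.fullmatch(p, path) for p in includes)
    matches_exclude = any(re.fullmatch(p, path) for p in exclude_regex)
    return matches_include and not matches_exclude


include = ["directory1/.*"]
exclude = ["directory1/excluded.*"]
for candidate in (
    "directory1/myfile",
    "directory1/directory2/myfile",
    "directory0/directory1/myfile",
    "directory1/excludedfile",
):
    print(candidate, is_included(candidate, include, exclude))
```

The use of `re.fullmatch` mirrors the requirement that the entire file path must match the regular expression, not just a prefix or substring.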

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.

3447

# Number of files scanned is rounded down. Must be between 0 and 100,

3448

# inclusive. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The

3450

# number of bytes scanned is rounded down. Must be between 0 and 100,

3451

# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one

3452

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"fileTypes": [ # List of file type groups to include in the scan.

3454

# If empty, all files are scanned and available data format processors

3455

# are applied. In addition, the binary content of the selected files

3456

# is always scanned as well.

# Images are scanned only as binary if the specified region

3458

# does not support image inspection and no file_types were specified.

3459

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

"A String",

3461

],

3462

},

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

3464

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always

3465

# by project and namespace, however the namespace ID may be empty.

3468

#

3469

# A partition ID contains several dimensions:

3470

# project ID and namespace ID.

3471

"projectId": "A String", # The ID of the project to which the entities belong.

3472

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

3473

},

3474

"kind": { # A representation of a Datastore kind. # The kind to process.

3475

"name": "A String", # The name of the kind.

3476

},

3477

},

3478

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

3479

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

3480

# inspection of entire columns which you know have no findings.

3481

{ # General identifier of a data field in a storage service.

3482

"name": "A String", # Name describing the field.

3483

},

3484

],

3485

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

3486

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

3487

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

3488

# Cannot be used in conjunction with TimespanConfig.

3489

"sampleMethod": "A String",

3490

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

3491

# `actions.saveFindings.outputConfig.table` is specified, the values of

3492

# columns specified here are available in the output table under

3493

# `location.content_locations.record_location.record_key.id_values`. Nested

3494

# fields such as `person.birthdate.year` are allowed.

3495

{ # General identifier of a data field in a storage service.

3496

"name": "A String", # Name describing the field.

3497

},

3498

],

3499

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows

3500

# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and

# 100 mean no limit. Defaults to 0. Only one of rows_limit and

3502

# rows_limit_percent can be specified. Cannot be used in conjunction with

3503

# TimespanConfig.

"tableReference": { # Complete BigQuery table reference.

# Message defining the location of a BigQuery table. A table is uniquely

# identified by its project_id, dataset_id, and table_name. Within a query

3506

# a table is often referenced with a string in the format of:

3507

# `<project_id>:<dataset_id>.<table_id>` or

3508

# `<project_id>.<dataset_id>.<table_id>`.

3509

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

3510

# If omitted, project ID is inferred from the API call.

3511

"tableId": "A String", # Name of the table.

3512

"datasetId": "A String", # Dataset ID of the table.

3513

},

3514

},
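
The two string forms mentioned for table references can be turned into a `tableReference` object with a small parser. This is a minimal sketch: `parse_table_reference` is a hypothetical helper, and it assumes project, dataset, and table IDs that contain neither `:` nor `.` (so domain-scoped project IDs are out of scope).

```python
import re

# <project_id>:<dataset_id>.<table_id>  or  <project_id>.<dataset_id>.<table_id>
_TABLE_RE = re.compile(
    r"(?P<project>[^:.]+)[:.](?P<dataset>[^:.]+)\.(?P<table>[^:.]+)"
)


def parse_table_reference(text: str) -> dict:
    """Split a table string into the fields of a `tableReference` object."""
    match = _TABLE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized table reference: {text!r}")
    return {
        "projectId": match["project"],
        "datasetId": match["dataset"],
        "tableId": match["table"],
    }


print(parse_table_reference("my-project:my_dataset.my_table"))
```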

3515

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

3516

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

3517

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

3518

# Used for data sources like Datastore and BigQuery.

3519

#

3520

# For BigQuery:

3521

# Required to filter out rows based on the given start and

3522

# end times. If not specified and the table was modified between the given

3523

# start and end times, the entire table will be scanned.

3524

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

3525

# `TIMESTAMP`, or `DATETIME` BigQuery column.

3526

#

3527

# For Datastore:

# The only valid data type of the timestamp field is `TIMESTAMP`.

# A Datastore entity will be scanned if the timestamp property does not

# exist or its value is empty or invalid.

3531

"name": "A String", # Name describing the field.

3532

},

3533

"endTime": "A String", # Exclude files or rows newer than this value.

3534

# If set to zero, no upper time limit is applied.

3535

"startTime": "A String", # Exclude files or rows older than this value.

3536

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

3537

# a valid start_time to avoid scanning files that have not been modified

3538

# since the last time the JobTrigger executed. This will be based on the

3539

# time of the execution of the last run of the JobTrigger.

3540

},

"hybridOptions": { # Hybrid inspection options.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

#

# Configuration to control jobs where the content being inspected is outside

# of Google Cloud Platform.

3546

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

3547

# meaningful such as the columns that are primary keys.

3548

"identifyingFields": [ # The columns that are the primary keys for table objects included in

3549

# ContentItem. A copy of this cell's value will be stored alongside

3550

# each finding so that the finding can be traced to the specific row it came

3551

# from. No more than 3 may be provided.

3552

{ # General identifier of a data field in a storage service.

3553

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

3558

#

3559

# Label keys must be between 1 and 63 characters long and must conform

3560

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

3561

#

3562

# Label values must be between 0 and 63 characters long and must conform

3563

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

3564

#

3565

# No more than 10 labels can be associated with a given finding.

3566

#

3567

# Examples:

3568

# * `"environment" : "production"`

3569

# * `"pipeline" : "etl"`

3570

"a_key": "A String",

3571

},
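
The label constraints listed above (key and value patterns, length limits, and the cap of 10 labels) can be checked before a request is sent. The following is a minimal sketch; `validate_finding_labels` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
import re

_KEY_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
_VALUE_RE = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10
MAX_LENGTH = 63


def validate_finding_labels(labels: dict) -> list:
    """Return a list of problems found; an empty list means the labels look valid."""
    errors = []
    if len(labels) > MAX_LABELS:
        errors.append(f"too many labels: {len(labels)} > {MAX_LABELS}")
    for key, value in labels.items():
        # Keys: 1 to 63 characters, matching the key pattern in full.
        if not (1 <= len(key) <= MAX_LENGTH and _KEY_RE.fullmatch(key)):
            errors.append(f"invalid key: {key!r}")
        # Values: 0 to 63 characters, matching the value pattern in full.
        if not (len(value) <= MAX_LENGTH and _VALUE_RE.fullmatch(value)):
            errors.append(f"invalid value for {key!r}: {value!r}")
    return errors


print(validate_finding_labels({"environment": "production", "pipeline": "etl"}))
```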

"requiredFindingLabelKeys": [ # Label keys that each inspection request must include within its

# 'finding_labels' map. Requests may contain other keys, but any request

# missing one of these keys will be rejected.

3575

#

3576

# Label keys must be between 1 and 63 characters long and must conform

3577

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

3578

#

3579

# No more than 10 keys can be required.

3580

"A String",

3581

],

3582

"description": "A String", # A short description of where the data is coming from. Will be stored once

3583

# in the job. 256 max length.

3584

},

},

3586

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

3587

# When used with redactContent only info_types and min_likelihood are currently

3588

# used.

3589

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.

3592

# When set within `InspectContentRequest`, the maximum returned is 2000

# regardless of whether this is set higher.

3594

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

3595

{ # Max findings configuration per infoType, per content item or long

3596

# running DlpJob.

3597

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

3598

# info_type should be provided. If InfoTypeLimit does not have an

3599

# info_type, the DLP API applies the limit against all info_types that

3600

# are found but not specified in another InfoTypeLimit.

3601

"name": "A String", # Name of the information type. Either a name of your choosing when

3602

# creating a CustomInfoType, or one of the names listed

3603

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3604

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

3607

"maxFindings": 42, # Max findings limit for the given infoType.

3608

},

3609

],

3610

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.

# When set within `InspectJobConfig`,

# the maximum returned is 2000 regardless of whether this is set higher.

# When set within `InspectContentRequest`, this field is ignored.

3614

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is

3616

# POSSIBLE.

3617

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

3618

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

3619

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

3620

{ # Custom information type provided by the user. Used to find domain-specific

3621

# sensitive information configurable to the data in question.

3622

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

3623

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3624

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3625

# google/re2 repository on GitHub.

3626

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3627

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that

# support reversing, such as

# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

3635

# These types of transformations are

3636

# those that perform pseudonymization, thereby producing a "surrogate" as

3637

# output. This should be used in conjunction with a field on the

3638

# transformation such as `surrogate_info_type`. This CustomInfoType does

3639

# not support the use of `detection_rules`.

3640

},

"infoType": { # Type of information detected by the API. # A CustomInfoType can be either a new infoType or an extension of a built-in

# infoType, when the name matches one of the existing infoTypes and that infoType

# is specified in the `InspectContent.info_types` field. Specifying the latter

# adds findings to the ones detected by the system. If the built-in info type is

# not specified in the `InspectContent.info_types` list, then the name is treated

# as a custom info type.

3647

"name": "A String", # Name of the information type. Either a name of your choosing when

3648

# creating a CustomInfoType, or one of the names listed

3649

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3650

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

3653

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.

3654

# be used to match sensitive information specific to the data, such as a list

3655

# of employee IDs or job titles.

3656

#

3657

# Dictionary words are case-insensitive and all characters other than letters

3658

# and digits in the unicode [Basic Multilingual

3659

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

3660

# will be replaced with whitespace when scanning for matches, so the

3661

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

3662

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

3663

# surrounding any match must be of a different type than the adjacent

3664

# characters within the word, so letters must be next to non-letters and

3665

# digits next to non-digits. For example, the dictionary word "jen" will

3666

# match the first three letters of the text "jen123" but will return no

3667

# matches for "jennifer".

3668

#

3669

# Dictionary words containing a large number of characters that are not

3670

# letters or digits may result in unexpected findings because such characters

3671

# are treated as whitespace. The

3672

# [limits](https://cloud.google.com/dlp/limits) page contains details about

3673

# the size limits of dictionaries. For dictionaries that do not fit within

3674

# these constraints, consider using `LargeCustomDictionaryConfig` in the

3675

# `StoredInfoType` API.

3676

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

3677

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

3678

# at least one phrase and every phrase must contain at least 2 characters

3679

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

3684

# is accepted.

3685

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

3686

# Example: gs://[BUCKET_NAME]/dictionary.txt

3687

},

3688

},

3689

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

3690

# `InspectDataSource`. Not currently supported in `InspectContent`.

3691

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

3692

# `organizations/433245324/storedInfoTypes/432452342` or

3693

# `projects/project-id/storedInfoTypes/432452342`.

3694

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

3695

# inspection was created. Output-only field, populated by the system.

3696

},

3697

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

3698

# Rules are applied in order that they are specified. Not supported for the

3699

# `surrogate_type` CustomInfoType.

3700

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

3701

# `CustomInfoType` to alter behavior under certain circumstances, depending

3702

# on the specific details of the rule. Not supported for the `surrogate_type`

3703

# custom infoType.

3704

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

3705

# proximity of hotwords.

"proximity": { # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

#

# Message for specifying a window around a finding to apply a detection rule.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

3718

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

3719

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3720

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3721

# google/re2 repository on GitHub.

3722

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3723

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.

# Message for specifying an adjustment to the likelihood of a finding as

# part of a detection rule.

3729

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

3730

# levels. For example, if a finding would be `POSSIBLE` without the

3731

# detection rule and `relative_likelihood` is 1, then it is upgraded to

3732

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

3733

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

3734

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

3735

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

3736

# a final likelihood of `LIKELY`.

3737

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

3743

# to be returned. It still can be used for rules matching.

3744

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

3745

# altered by a detection rule if the finding meets the criteria specified by

3746

# the rule. Defaults to `VERY_LIKELY` if not specified.

3747

},

3748

],

3749

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

3750

# included in the response; see Finding.quote.

3751

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.

3752

# Exclusion rules contained in the set are executed at the end; other

# rules are executed in the order they are specified for each info type.

3754

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

3755

# circumstances, depending on the specific details of the rules within the set.

3756

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

3757

{ # A single inspection rule to be applied to infoTypes, specified in

3758

# `InspectionRuleSet`.

3759

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

3760

# proximity of hotwords.

"proximity": { # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

#

# Message for specifying a window around a finding to apply a detection rule.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

3773

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

3774

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3775

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3776

# google/re2 repository on GitHub.

3777

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3778

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.

# Message for specifying an adjustment to the likelihood of a finding as

# part of a detection rule.

3784

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

3785

# levels. For example, if a finding would be `POSSIBLE` without the

3786

# detection rule and `relative_likelihood` is 1, then it is upgraded to

3787

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

3788

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

3789

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

3790

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

3791

# a final likelihood of `LIKELY`.

3792

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

3793

},

3794

},
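
The clamping behaviour described for `relativeLikelihood` can be sketched as follows. The level ordering is taken from the comment text above; `adjust_likelihood` is a hypothetical helper that only models the documented arithmetic, not the service's implementation.

```python
# Likelihood levels from lowest to highest, as named in the field comment.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base: str, relative: int) -> str:
    """Shift `base` by `relative` levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


print(adjust_likelihood("POSSIBLE", 1))   # one level up
print(adjust_likelihood("POSSIBLE", -1))  # one level down

# +1 is clamped at VERY_LIKELY, so the following -1 lands on LIKELY.
print(adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1))
```

Because each adjustment is clamped on its own, the order of adjustments matters at the ends of the scale, which is the case the field comment calls out.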

3795

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.

3796

# `InspectionRuleSet` are removed from results.

3797

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

3798

"pattern": "A String", # Pattern defining the regular expression. Its syntax

3799

# (https://github.com/google/re2/wiki/Syntax) can be found under the

3800

# google/re2 repository on GitHub.

3801

"groupIndexes": [ # The index of the submatch to extract as findings. When not

3802

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # An ExclusionRule drops a finding when it overlaps with, or is

# contained within, a finding of an infoType from this list. For

# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and

# `exclusion_rule` containing `exclude_info_types.info_types` with

# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap

# with an EMAIL_ADDRESS finding.

# As a result, "555-222-2222@example.org" generates only a single

# finding, namely the email address.

3815

{ # Type of information detected by the API.

3816

"name": "A String", # Name of the information type. Either a name of your choosing when

3817

# creating a CustomInfoType, or one of the names listed

3818

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

3819

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

# is accepted.
"path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.

},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# If you need precise control and predictability as to what detectors are
# run, you should specify specific InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.

"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing

# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
# Only for use with external storage.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on job's # Enable email notification for project owners and editors on job's
# completion/failure.
# completion/failure.

},

"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects that are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This # Enable Stackdriver metric dlp.googleapis.com/finding_count.
# will publish a metric to Stackdriver on each infotype requested and
# how many findings were found for it. CustomDetectors will be bucketed
# as 'Custom' under the Stackdriver label 'info_type'.
},
"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
# results of the DlpJob will be applied to the entry for the resource scanned
# in Cloud Data Catalog. Any labels previously written by another DlpJob will
# be deleted. InfoType naming patterns are strictly enforced when using this
# feature. Note that the findings will be persisted in Cloud Data Catalog
# storage and are governed by Data Catalog service-specific policy, see
# https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified and only allowed if
# all resources being scanned are BigQuery tables.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have granted
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.

},

},

],

},

"lastRunTime": "A String", # Output only. The timestamp of the last time this trigger executed.
"createTime": "A String", # Output only. The creation timestamp of a triggeredJob.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/jobTriggers/53234423`.

}</pre>
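The matching rules given for the `dictionary` field in the schema above can be approximated in a few lines of Python. This is a rough sketch, not the service's implementation: `dictionary_matches` and its helpers are hypothetical names, collapsing runs of whitespace is an assumption, and `str.isalpha` / `str.isdigit` stand in for the Basic Multilingual Plane letter and digit classes.

```python
import re


def _kind(ch: str) -> str:
    """Classify a character as 'letter', 'digit', or 'other'."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def _normalize(text: str) -> str:
    """Lower-case the text and replace non-alphanumerics with spaces."""
    return "".join(ch.lower() if ch.isalnum() else " " for ch in text)


def dictionary_matches(phrase: str, text: str) -> bool:
    """Approximate the documented dictionary matching rules (assumption-laden)."""
    words = _normalize(phrase).split()
    if not words:
        return False
    # Any run of whitespace may separate the words of the phrase (assumed).
    pattern = r"\s+".join(re.escape(word) for word in words)
    norm = _normalize(text)
    for m in re.finditer(pattern, norm):
        # Treat the start and end of the text as a non-letter, non-digit neighbor.
        before = norm[max(m.start() - 1, 0):m.start()] or " "
        after = norm[m.end():m.end() + 1] or " "
        # Surrounding characters must differ in type from the adjacent
        # characters inside the match.
        if (_kind(before) != _kind(norm[m.start()])
                and _kind(after) != _kind(norm[m.end() - 1])):
            return True
    return False
```

With this sketch, the phrase "Sam Johnson" matches "Sam, Johnson" and "Sam (Johnson)", and "jen" matches "jen123" but not "jennifer", mirroring the documented examples.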

</div>
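The include/exclude rule of `regexFileSet`, documented in the JobTrigger schema, can be sketched as follows. This is an illustration only: `is_included` is a hypothetical helper, and Python's `re` module stands in for RE2, an assumption that holds for the simple patterns used in the documented examples but not for every RE2 expression.

```python
import re
from typing import Sequence


def is_included(path: str,
                include_regex: Sequence[str] = (),
                exclude_regex: Sequence[str] = ()) -> bool:
    """Return True if `path` (everything after the bucket name) would be scanned.

    A file is included when its entire path matches at least one include
    pattern and no exclude pattern. An empty include list behaves like [".*"].
    """
    includes = list(include_regex) or [".*"]
    included = any(re.fullmatch(p, path) for p in includes)
    excluded = any(re.fullmatch(p, path) for p in exclude_regex)
    return included and not excluded
```

Using `re.fullmatch` reflects the requirement that the whole path must match, which is why `directory0/directory1/myfile` is rejected by the include pattern `directory1/.*`.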

<code class="details" id="list">list(parent, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None, locationId=None, filter=None)</code>

<pre>Lists job triggers.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.

Args:
  parent: string, Required. The parent resource name, for example `projects/my-project-id`. (required)
  orderBy: string, Comma-separated list of triggeredJob fields to order by,
followed by `asc` or `desc` postfix. This list is case-insensitive,
default sorting order is ascending, redundant space characters are
insignificant.

Example: `name asc,update_time, create_time desc`

Supported fields are:

- `create_time`: corresponds to time the JobTrigger was created.
- `update_time`: corresponds to time the JobTrigger was last updated.
- `last_run_time`: corresponds to the last time the JobTrigger ran.
- `name`: corresponds to JobTrigger's name.
- `display_name`: corresponds to JobTrigger's display name.
- `status`: corresponds to JobTrigger's status.
  pageSize: integer, Size of the page; the server may limit it.
  pageToken: string, Page token to continue retrieval. Comes from previous call
to ListJobTriggers. `order_by` field must not
change for subsequent calls.
  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format
  locationId: string, The geographic location where job triggers will be retrieved from.
Use `-` for all locations. Reserved for future extensions.
  filter: string, Allows filtering.

Supported syntax:

* Filter expressions are made up of one or more restrictions.
* Restrictions can be combined by `AND` or `OR` logical operators. A
sequence of restrictions implicitly uses `AND`.
* A restriction has the form of `{field} {operator} {value}`.
* Supported fields/values for inspect jobs:
    - `status` - HEALTHY|PAUSED|CANCELLED
    - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY
    - `last_run_time` - RFC 3339 formatted timestamp, surrounded by
    quotation marks. Nanoseconds are ignored.
    - `error_count` - Number of errors that have occurred while running.
* The operator must be `=` or `!=` for status and inspected_storage.

Examples:

* inspected_storage = cloud_storage AND status = HEALTHY
* inspected_storage = cloud_storage OR inspected_storage = bigquery
* inspected_storage = cloud_storage AND (status = PAUSED OR status = HEALTHY)
* last_run_time > "2017-12-12T00:00:00+00:00"

The length of this field should be no more than 500 characters.
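
A minimal sketch of combining `filter`, `orderBy`, and `list_next` to walk every page of results. `build_filter` and `iter_job_triggers` are hypothetical helpers, not part of the client library; `job_triggers` is assumed to be the resource object returned by `service.projects().jobTriggers()`.

```python
MAX_FILTER_LENGTH = 500  # documented limit on the filter string


def build_filter(restrictions, operator="AND"):
    """Join `{field} {operator} {value}` restrictions into one filter string."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    expression = f" {operator} ".join(restrictions)
    if len(expression) > MAX_FILTER_LENGTH:
        raise ValueError("filter must be no more than 500 characters")
    return expression


def iter_job_triggers(job_triggers, parent, **list_kwargs):
    """Yield every job trigger under `parent`, following pages via list_next."""
    request = job_triggers.list(parent=parent, **list_kwargs)
    while request is not None:
        response = request.execute()
        for trigger in response.get("jobTriggers", []):
            yield trigger
        # list_next returns None once there are no further pages.
        request = job_triggers.list_next(
            previous_request=request, previous_response=response)


# Example usage (requires credentials, so it is left commented out):
# from googleapiclient.discovery import build
# triggers = build("dlp", "v2").projects().jobTriggers()
# healthy_gcs = build_filter(
#     ["inspected_storage = cloud_storage", "status = HEALTHY"])
# for trigger in iter_job_triggers(
#         triggers, "projects/my-project-id",
#         filter=healthy_gcs, orderBy="create_time desc"):
#     print(trigger["name"])
```

Keeping `orderBy` and `filter` inside `list_kwargs` means the same values are carried into every page request, which matches the requirement that `order_by` not change between calls.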

Returns:
  An object of the form:

{ # Response message for ListJobTriggers.

4058

"nextPageToken": "A String", # If the next page is available then the next page token to be used

4059

# in following ListJobTriggers request.

4060

"jobTriggers": [ # List of triggeredJobs, up to page_size in ListJobTriggersRequest.

4061

{ # Contains a configuration to make dlp api calls on a repeating basis.

4062

# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.

"status": "A String", # Required. A status for this trigger.

4064

"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.

4065

"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated

# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.

"timestamps": [ # The times the error occurred.

4072

"A String",

4073

],

"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.

# different programming environments, including REST APIs and RPC APIs. It is

4076

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

4077

# three pieces of data: error code, error message, and error details.

4078

#

4079

# You can find out more about this error model and how to work with it in the

4080

# [API Design Guide](https://cloud.google.com/apis/design/errors).

4081

"message": "A String", # A developer-facing error message, which should be in English. Any

4082

# user-facing error message should be localized and sent in the

4083

# google.rpc.Status.details field, or localized by the client.

4084

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

4085

"details": [ # A list of messages that carry the error details. There is a common set of

4086

# message types for APIs to use.

4087

{

4088

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

4095

"description": "A String", # User provided description (max 256 chars)

"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

4097

# needs to trigger for a job to be started. The list may contain only

4098

# a single Schedule trigger and must have at least one object.

4099

{ # What event needs to occur for a new job to be started.

4100

"manual": { # Job trigger option for hybrid jobs. Jobs must be manually created # For use with hybrid jobs. Jobs must be manually created and finished.

4101

# Early access feature is in a pre-release state and might change or have

4102

# limited support. For more information, see

4103

# https://cloud.google.com/products#product-launch-stages.

4104

# and finished.

4105

},

4106

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.

},

},

4117

],

4118

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

4119

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

4120

"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options.

# bucket.

4122

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

4123

# than this value then the rest of the bytes are omitted. Only one

4124

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

4125

"sampleMethod": "A String",

4126

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

4127

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format

# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.

#

4130

# If the url ends in a trailing slash, the bucket or directory represented

4131

# by the url will be scanned non-recursively (content in sub-directories

4132

# will not be scanned). This means that `gs://mybucket/` is equivalent to

4133

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

4134

# `gs://mybucket/directory/*`.

4135

#

4136

# Exactly one of `url` or `regex_file_set` must be set.

4137

"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or

4138

# `regex_file_set` must be set.

4139

# expressions are used to allow fine-grained control over which files in the

4140

# bucket to include.

4141

#

4142

# Included files are those that match at least one item in `include_regex` and

4143

# do not match any items in `exclude_regex`. Note that a file that matches

4144

# items from both lists will _not_ be included. For a match to occur, the

4145

# entire file path (i.e., everything in the url after the bucket name) must

4146

# match the regular expression.

4147

#

4148

# For example, given the input `{bucket_name: "mybucket", include_regex:

4149

# ["directory1/.*"], exclude_regex:

4150

# ["directory1/excluded.*"]}`:

4151

#

4152

# * `gs://mybucket/directory1/myfile` will be included

4153

# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches

4154

# across `/`)

4155

# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the

4156

# full path doesn't match any items in `include_regex`)

4157

# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path

4158

# matches an item in `exclude_regex`)

4159

#

4160

# If `include_regex` is left empty, it will match all files by default

4161

# (this is equivalent to setting `include_regex: [".*"]`).

4162

#

4163

# Some other common use cases:

4164

#

4165

# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all

4166

# files in `mybucket` except for .pdf files

4167

# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will

4168

# include all files directly under `gs://mybucket/directory/`, without matching

4169

# across `/`

4170

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

4171

# the bucket that match at least one of these regular expressions will be

4172

# excluded from the scan.

4173

#

4174

# Regular expressions use RE2

4175

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

4176

# under the google/re2 repository on GitHub.

4177

"A String",

4178

],

4179

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

4180

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

4181

# the bucket that match at least one of these regular expressions will be

4182

# included in the set of files, except for those that also match an item in

4183

# `exclude_regex`. Leaving this field empty will match all files by default

4184

# (this is equivalent to including `.*` in the list).

4185

#

4186

# Regular expressions use RE2

4187

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

4188

# under the google/re2 repository on GitHub.

"A String",

],

},

},

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"fileTypes": [ # List of file type groups to include in the scan.

4201

# If empty, all files are scanned and available data format processors

4202

# are applied. In addition, the binary content of the selected files

4203

# is always scanned as well.

# Images are scanned only as binary if the specified region
# does not support image inspection and no file_types were specified.
# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

"A String",

4208

],

4209

},

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

4211

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always

4212

# by project and namespace, however the namespace ID may be empty.

4213

# A partition ID identifies a grouping of entities. The grouping is always

4214

# by project and namespace, however the namespace ID may be empty.

4215

#

4216

# A partition ID contains several dimensions:

4217

# project ID and namespace ID.

4218

"projectId": "A String", # The ID of the project to which the entities belong.

4219

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

4220

},

4221

"kind": { # A representation of a Datastore kind. # The kind to process.

4222

"name": "A String", # The name of the kind.

4223

},

4224

},

4225

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

4226

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

4227

# inspection of entire columns which you know have no findings.

4228

{ # General identifier of a data field in a storage service.

4229

"name": "A String", # Name describing the field.

4230

},

4231

],

4232

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

4233

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

4234

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

4235

# Cannot be used in conjunction with TimespanConfig.

4236

"sampleMethod": "A String",

4237

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

4238

# `actions.saveFindings.outputConfig.table` is specified, the values of

4239

# columns specified here are available in the output table under

4240

# `location.content_locations.record_location.record_key.id_values`. Nested

4241

# fields such as `person.birthdate.year` are allowed.

4242

{ # General identifier of a data field in a storage service.

4243

"name": "A String", # Name describing the field.

4244

},

4245

],

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.

"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.

4252

# identified by its project_id, dataset_id, and table_name. Within a query

4253

# a table is often referenced with a string in the format of:

4254

# `<project_id>:<dataset_id>.<table_id>` or

4255

# `<project_id>.<dataset_id>.<table_id>`.

4256

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

4257

# If omitted, project ID is inferred from the API call.

4258

"tableId": "A String", # Name of the table.

4259

"datasetId": "A String", # Dataset ID of the table.

4260

},

4261

},

4262

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

4263

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

4264

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

4265

# Used for data sources like Datastore and BigQuery.

4266

#

4267

# For BigQuery:

4268

# Required to filter out rows based on the given start and

4269

# end times. If not specified and the table was modified between the given

4270

# start and end times, the entire table will be scanned.

4271

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

4272

# `TIMESTAMP`, or `DATETIME` BigQuery column.

4273

#

4274

# For Datastore:

4275

# Valid data types of the timestamp field are: `TIMESTAMP`.

4276

# Datastore entity will be scanned if the timestamp property does not

4277

# exist or its value is empty or invalid.

4278

"name": "A String", # Name describing the field.

4279

},

4280

"endTime": "A String", # Exclude files or rows newer than this value.

4281

# If set to zero, no upper time limit is applied.

4282

"startTime": "A String", # Exclude files or rows older than this value.

4283

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

4284

# a valid start_time to avoid scanning files that have not been modified

4285

# since the last time the JobTrigger executed. This will be based on the

4286

# time of the execution of the last run of the JobTrigger.

4287

},

4288

"hybridOptions": { # Configuration to control jobs where the content being inspected is outside # Hybrid inspection options.

4289

# Early access feature is in a pre-release state and might change or have

4290

# limited support. For more information, see

4291

# https://cloud.google.com/products#product-launch-stages.

4292

# of Google Cloud Platform.

4293

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

4294

# meaningful such as the columns that are primary keys.

"identifyingFields": [ # The columns that are the primary keys for table objects included in
# ContentItem. A copy of this cell's value will be stored alongside
# each finding so that the finding can be traced to the specific row it came
# from. No more than 3 may be provided.

{ # General identifier of a data field in a storage service.

4300

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

4305

#

4306

# Label keys must be between 1 and 63 characters long and must conform

4307

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

4308

#

4309

# Label values must be between 0 and 63 characters long and must conform

4310

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

4311

#

4312

# No more than 10 labels can be associated with a given finding.

4313

#

4314

# Examples:

4315

# * `"environment" : "production"`

4316

# * `"pipeline" : "etl"`

4317

"a_key": "A String",

4318

},

4319

"requiredFindingLabelKeys": [ # These are labels that each inspection request must include within their

4320

# 'finding_labels' map. Request may contain others, but any missing one of

4321

# these will be rejected.

4322

#

4323

# Label keys must be between 1 and 63 characters long and must conform

4324

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

4325

#

4326

# No more than 10 keys can be required.

4327

"A String",

4328

],

4329

"description": "A String", # A short description of where the data is coming from. Will be stored once

4330

# in the job. 256 max length.

4331

},

},

4333

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

4334

# When used with redactContent only info_types and min_likelihood are currently

4335

# used.

4336

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned. # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.

# When set within `InspectContentRequest`, the maximum returned is 2000

# regardless of whether this is set higher.

4341

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

4342

{ # Max findings configuration per infoType, per content item or long

4343

# running DlpJob.

4344

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

4345

# info_type should be provided. If InfoTypeLimit does not have an

4346

# info_type, the DLP API applies the limit against all info_types that

4347

# are found but not specified in another InfoTypeLimit.

4348

"name": "A String", # Name of the information type. Either a name of your choosing when

4349

# creating a CustomInfoType, or one of the names listed

4350

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

4351

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

4354

"maxFindings": 42, # Max findings limit for the given infoType.

4355

},

4356

],

4357

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.

# When set within `InspectJobConfig`,

# the maximum returned is 2000 regardless of whether this is set higher.

4360

# When set within `InspectContentRequest`, this field is ignored.

4361

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is

# POSSIBLE.

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

4365

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

4366

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

4367

{ # Custom information type provided by the user. Used to find domain-specific

4368

# sensitive information configurable to the data in question.

4369

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

4370

"pattern": "A String", # Pattern defining the regular expression. Its syntax

4371

# (https://github.com/google/re2/wiki/Syntax) can be found under the

4372

# google/re2 repository on GitHub.

4373

"groupIndexes": [ # The index of the submatch to extract as findings. When not

4374

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations

# such as

# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

# These types of transformations are

# those that perform pseudonymization, thereby producing a "surrogate" as

# output. This should be used in conjunction with a field on the

# transformation such as `surrogate_info_type`. This CustomInfoType does

# not support the use of `detection_rules`.

#

# Message for detecting output from deidentification transformations that

# support reversing.

4387

},

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType or an extension of a built-in

# infoType, when the name matches one of the existing infoTypes and that infoType

# is specified in the `InspectContent.info_types` field. Specifying the latter

# adds findings to the ones detected by the system. If the built-in info type is

# not specified in the `InspectContent.info_types` list, then the name is treated

# as a custom info type.

4394

"name": "A String", # Name of the information type. Either a name of your choosing when

4395

# creating a CustomInfoType, or one of the names listed

4396

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

4397

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

4400

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.

4401

# be used to match sensitive information specific to the data, such as a list

4402

# of employee IDs or job titles.

4403

#

4404

# Dictionary words are case-insensitive and all characters other than letters

4405

# and digits in the unicode [Basic Multilingual

4406

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

4407

# will be replaced with whitespace when scanning for matches, so the

4408

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

4409

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

4410

# surrounding any match must be of a different type than the adjacent

4411

# characters within the word, so letters must be next to non-letters and

4412

# digits next to non-digits. For example, the dictionary word "jen" will

4413

# match the first three letters of the text "jen123" but will return no

4414

# matches for "jennifer".

4415

#

4416

# Dictionary words containing a large number of characters that are not

4417

# letters or digits may result in unexpected findings because such characters

4418

# are treated as whitespace. The

4419

# [limits](https://cloud.google.com/dlp/limits) page contains details about

4420

# the size limits of dictionaries. For dictionaries that do not fit within

4421

# these constraints, consider using `LargeCustomDictionaryConfig` in the

4422

# `StoredInfoType` API.

4423

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

4424

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

4425

# at least one phrase and every phrase must contain at least 2 characters

4426

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

4431

# is accepted.

4432

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

4433

# Example: gs://[BUCKET_NAME]/dictionary.txt

4434

},

4435

},

4436

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

4437

# `InspectDataSource`. Not currently supported in `InspectContent`.

4438

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

4439

# `organizations/433245324/storedInfoTypes/432452342` or

4440

# `projects/project-id/storedInfoTypes/432452342`.

4441

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

4442

# inspection was created. Output-only field, populated by the system.

4443

},

4444

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

4445

# Rules are applied in order that they are specified. Not supported for the

4446

# `surrogate_type` CustomInfoType.

4447

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

4448

# `CustomInfoType` to alter behavior under certain circumstances, depending

4449

# on the specific details of the rule. Not supported for the `surrogate_type`

4450

# custom infoType.

4451

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

4452

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

4465

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

4466

"pattern": "A String", # Pattern defining the regular expression. Its syntax

4467

# (https://github.com/google/re2/wiki/Syntax) can be found under the

4468

# google/re2 repository on GitHub.

4469

"groupIndexes": [ # The index of the submatch to extract as findings. When not

4470

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

4475

# part of a detection rule.

4476

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

4477

# levels. For example, if a finding would be `POSSIBLE` without the

4478

# detection rule and `relative_likelihood` is 1, then it is upgraded to

4479

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

4480

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

4481

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

4482

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

4483

# a final likelihood of `LIKELY`.

4484

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

4490

# to be returned. It still can be used for rules matching.

4491

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

4492

# altered by a detection rule if the finding meets the criteria specified by

4493

# the rule. Defaults to `VERY_LIKELY` if not specified.

4494

},

4495

],

4496

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

4497

# included in the response; see Finding.quote.

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.

# Exclusion rules contained in the set are executed last; other

# rules are executed in the order they are specified for each info type.

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

4502

# circumstances, depending on the specific details of the rules within the set.

4503

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

4504

{ # A single inspection rule to be applied to infoTypes, specified in

4505

# `InspectionRuleSet`.

4506

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

4507

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

4520

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

4521

"pattern": "A String", # Pattern defining the regular expression. Its syntax

4522

# (https://github.com/google/re2/wiki/Syntax) can be found under the

4523

# google/re2 repository on GitHub.

4524

"groupIndexes": [ # The index of the submatch to extract as findings. When not

4525

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

4530

# part of a detection rule.

4531

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

4532

# levels. For example, if a finding would be `POSSIBLE` without the

4533

# detection rule and `relative_likelihood` is 1, then it is upgraded to

4534

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

4535

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

4536

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

4537

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

4538

# a final likelihood of `LIKELY`.

4539

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

4540

},

4541

},

4542

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.

4543

# `InspectionRuleSet` are removed from results.

4544

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

4545

"pattern": "A String", # Pattern defining the regular expression. Its syntax

4546

# (https://github.com/google/re2/wiki/Syntax) can be found under the

4547

# google/re2 repository on GitHub.

4548

"groupIndexes": [ # The index of the submatch to extract as findings. When not

4549

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # An ExclusionRule with this list drops a finding when it overlaps with or is

# contained within a finding of an infoType from this list. For

# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and

# `exclusion_rule` containing `exclude_info_types.info_types` with

# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap

# with an EMAIL_ADDRESS finding.

# As a result, "555-222-2222@example.org" generates only a single

# finding, namely the email address.

{ # Type of information detected by the API.

4563

"name": "A String", # Name of the information type. Either a name of your choosing when

4564

# creating a CustomInfoType, or one of the names listed

4565

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

4566

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

4572

# be used to match sensitive information specific to the data, such as a list

4573

# of employee IDs or job titles.

4574

#

4575

# Dictionary words are case-insensitive and all characters other than letters

4576

# and digits in the unicode [Basic Multilingual

4577

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

4578

# will be replaced with whitespace when scanning for matches, so the

4579

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

4580

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

4581

# surrounding any match must be of a different type than the adjacent

4582

# characters within the word, so letters must be next to non-letters and

4583

# digits next to non-digits. For example, the dictionary word "jen" will

4584

# match the first three letters of the text "jen123" but will return no

4585

# matches for "jennifer".

4586

#

4587

# Dictionary words containing a large number of characters that are not

4588

# letters or digits may result in unexpected findings because such characters

4589

# are treated as whitespace. The

4590

# [limits](https://cloud.google.com/dlp/limits) page contains details about

4591

# the size limits of dictionaries. For dictionaries that do not fit within

4592

# these constraints, consider using `LargeCustomDictionaryConfig` in the

4593

# `StoredInfoType` API.

4594

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

4595

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

4596

# at least one phrase and every phrase must contain at least 2 characters

4597

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

4602

# is accepted.

4603

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

4604

# Example: gs://[BUCKET_NAME]/dictionary.txt

4605

},

4606

},

4607

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

4612

{ # Type of information detected by the API.

4613

"name": "A String", # Name of the information type. Either a name of your choosing when

4614

# creating a CustomInfoType, or one of the names listed

4615

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

4616

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

4623

# If empty, text, images, and other content will be included.

4624

"A String",

4625

],

4626

"infoTypes": [ # Restricts what info_types to look for. The values must correspond to

4627

# InfoType values returned by ListInfoTypes or listed at

4628

# https://cloud.google.com/dlp/docs/infotypes-reference.

4629

#

4630

# When no InfoTypes or CustomInfoTypes are specified in a request, the

4631

# system may automatically choose what detectors to run. By default this may

4632

# be all types, but may change over time as detectors are updated.

4633

#

# If you need precise control and predictability as to what detectors are

# run you should specify specific InfoTypes listed in the reference,

# otherwise a default list will be used, which may change over time.

{ # Type of information detected by the API.

4638

"name": "A String", # Name of the information type. Either a name of your choosing when

4639

# creating a CustomInfoType, or one of the names listed

4640

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

4641

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

4647

# `inspect_config` will be merged into the values persisted as part of the

4648

# template.

4649

"actions": [ # Actions to execute at the completion of the job.

4650

{ # A task to execute on the completion of a job.

4651

# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.

4652

"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.

4653

# OutputStorageConfig. Only a single instance of this action can be

4654

# specified.

4655

# Compatible with: Inspect, Risk

"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.

"table": { # Message defining the location of a BigQuery table. A table is uniquely

# identified by its project_id, dataset_id, and table_name. Within a query

# a table is often referenced with a string in the format of:

# `<project_id>:<dataset_id>.<table_id>` or

# `<project_id>.<dataset_id>.<table_id>`.

#

# Store findings in an existing table or a new table in an existing

# dataset. If table_id is not set a new one will be generated

# for you with the following format:

# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for

# generating the date details.

#

# For Inspect, each column in an existing output table must have the same

# name, type, and mode of a field in the `Finding` object.

#

# For Risk, an existing output table should be the output of a previous

# Risk analysis job run on the same source table, with the same privacy

# metric and quasi-identifiers. Risk jobs that analyze the same table but

# compute a different privacy metric, or use different sets of

# quasi-identifiers, cannot store their results in the same table.

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

4676

# If omitted, project ID is inferred from the API call.

4677

"tableId": "A String", # Name of the table.

4678

"datasetId": "A String", # Dataset ID of the table.

4679

},

4680

"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only

4681

# used for Inspect and must be unspecified for Risk jobs. Columns are derived

4682

# from the `Finding` object. If appending to an existing table, any columns

4683

# from the predefined schema that are missing will be added. No columns in

4684

# the existing table will be deleted.

4685

#

4686

# If unspecified, then all available columns will be used for a new table or

4687

# an (existing) table with no schema, and no changes will be made to an

4688

# existing table that has a schema.

# Only for use with external storage.

},

4691

},

"jobNotificationEmails": { # Enable email notification to project owners and editors on job's completion/failure. # Enable email notification for project owners and editors on job's completion/failure.

},

4696

"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).

4697

# Command Center (CSCC Alpha).

# This action is only available for projects which are part of

# an organization and whitelisted for the alpha Cloud Security Command

# Center.

# The action will publish the count of finding instances and their info types.

# The summary of findings will be persisted in CSCC and is governed by CSCC

# service-specific policy, see https://cloud.google.com/terms/service-terms

# Only a single instance of this action can be specified.

4705

# Compatible with: Inspect

4706

},

"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This

# will publish a metric to Stackdriver for each infoType requested and

# how many findings were found for it. CustomDetectors will be bucketed

# as 'Custom' under the Stackdriver label 'info_type'.

#

# Enable Stackdriver metric dlp.googleapis.com/finding_count.

},

4712

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.

4713

# results of the DlpJob will be applied to the entry for the resource scanned

4714

# in Cloud Data Catalog. Any labels previously written by another DlpJob will

4715

# be deleted. InfoType naming patterns are strictly enforced when using this

4716

# feature. Note that the findings will be persisted in Cloud Data Catalog

4717

# storage and are governed by Data Catalog service-specific policy, see

4718

# https://cloud.google.com/terms/service-terms

4719

# Only a single instance of this action can be specified and only allowed if

4720

# all resources being scanned are BigQuery tables.

4721

# Compatible with: Inspect

4722

},


"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.

4724

# message contains a single field, `DlpJobName`, which is equal to the

4725

# finished job's

4726

# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).

4727

# Compatible with: Inspect, Risk

"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The DLP API service account

# executing the long running DlpJob that sends the notifications must have

# been granted publishing access rights on the topic.

# Format is projects/{project}/topics/{topic}.

},

},

],

},

"lastRunTime": "A String", # Output only. The timestamp of the last time this trigger executed.

"createTime": "A String", # Output only. The creation timestamp of a triggeredJob.

"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the

# triggeredJob is created, for example

# `projects/dlp-test-project/jobTriggers/53234423`.

},

],

}</pre>
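The `relativeLikelihood` comments above describe clamped arithmetic on the likelihood scale. The following is a minimal sketch of that arithmetic, assuming the five-level ordering from `VERY_UNLIKELY` to `VERY_LIKELY` named in those comments. `adjust_likelihood` is a hypothetical helper for illustration only; it is not part of the client library, and the service applies the real adjustment on its side.

```python
# Hypothetical illustration of the clamping described for relativeLikelihood.
# The ordering below is an assumption taken from the field comments.
LIKELIHOOD_LEVELS = [
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the scale."""
    index = LIKELIHOOD_LEVELS.index(base) + relative_likelihood
    index = max(0, min(index, len(LIKELIHOOD_LEVELS) - 1))
    return LIKELIHOOD_LEVELS[index]
```

Applying `adjust_likelihood` with 1 and then with -1 to a `VERY_LIKELY` base ends at `LIKELY`, because the first step is clamped at the top of the scale.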

</div>
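The `labels` comments above state length limits and regular expressions for label keys and values. A client-side pre-check along those lines might look like the sketch below. `validate_finding_labels` is a hypothetical helper, not something the client library provides, and it assumes the documented patterns must match the whole string.

```python
import re

# Patterns copied from the field comments; whole-string matching is an assumption.
LABEL_KEY_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
LABEL_VALUE_RE = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10  # "No more than 10 labels can be associated with a given finding."


def validate_finding_labels(labels):
    """Return a list of problems found in a labels dict (empty if none)."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append("too many labels: %d" % len(labels))
    for key, value in labels.items():
        if not (1 <= len(key) <= 63) or not LABEL_KEY_RE.fullmatch(key):
            problems.append("invalid key: %r" % (key,))
        if len(value) > 63 or not LABEL_VALUE_RE.fullmatch(value):
            problems.append("invalid value for %r: %r" % (key, value))
    return problems
```

Running the check before building a request body surfaces malformed labels locally; the service remains the authority on what it accepts.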

<code class="details" id="list_next">list_next(previous_request, previous_response)</code>

<pre>Retrieves the next page of results.

Args:

previous_request: The request for the previous page. (required)

previous_response: The response from the request for the previous page. (required)

Returns:

A request object that you can call 'execute()' on to request the next

page. Returns None if there are no more items in the collection.

</pre>

</div>
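The `list` and `list_next` methods are typically combined in a loop that stops when `list_next` returns None. The sketch below shows that loop under a few assumptions: `iter_job_triggers` is a hypothetical helper, its `job_triggers` argument is the collection object that exposes `list` and `list_next` as documented on this page, and each response page is assumed to list its triggers under the `jobTriggers` key.

```python
def iter_job_triggers(job_triggers, parent, page_size=None):
    """Yield every job trigger under `parent`, one page at a time.

    `job_triggers` is any object exposing list() and list_next() with the
    signatures documented above (assumption: the jobTriggers collection).
    """
    kwargs = {"parent": parent}
    if page_size is not None:
        kwargs["pageSize"] = page_size
    request = job_triggers.list(**kwargs)
    while request is not None:
        response = request.execute()
        # Assumption: triggers are returned under the "jobTriggers" key.
        for trigger in response.get("jobTriggers", []):
            yield trigger
        # list_next() returns None when there are no more pages.
        request = job_triggers.list_next(
            previous_request=request, previous_response=response
        )
```

With a real client, the collection would normally be obtained from an authenticated service object, for example `service.projects().jobTriggers()`, which requires credentials and network access.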

<code class="details" id="patch">patch(name, body=None, x__xgafv=None)</code>

<pre>Updates a job trigger.

See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.

Args:

name: string, Required. Resource name of the project and the triggeredJob, for example

`projects/dlp-test-project/jobTriggers/53234423`. (required)

body: object, The request body.

The object takes the form of:

{ # Request message for UpdateJobTrigger.

4772

"jobTrigger": { # Contains a configuration to make dlp api calls on a repeating basis. # New JobTrigger value.

4773

# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.

"status": "A String", # Required. A status for this trigger.

"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.

"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated

# errors may result in the JobTrigger automatically being paused.

# Will return the last 100 errors. Whenever the JobTrigger is modified

# this list will be cleared.

{ # Detailed information about an error encountered during job execution or

# the results of an unsuccessful activation of the JobTrigger.

"timestamps": [ # The times the error occurred.

4783

"A String",

4784

],

"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.

# different programming environments, including REST APIs and RPC APIs. It is

4787

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

4788

# three pieces of data: error code, error message, and error details.

4789

#

4790

# You can find out more about this error model and how to work with it in the

4791

# [API Design Guide](https://cloud.google.com/apis/design/errors).

4792

"message": "A String", # A developer-facing error message, which should be in English. Any

4793

# user-facing error message should be localized and sent in the

4794

# google.rpc.Status.details field, or localized by the client.

4795

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

4796

"details": [ # A list of messages that carry the error details. There is a common set of

4797

# message types for APIs to use.

4798

{

4799

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

4806

"description": "A String", # User provided description (max 256 chars)

"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

4808

# needs to trigger for a job to be started. The list may contain only

4809

# a single Schedule trigger and must have at least one object.

4810

{ # What event needs to occur for a new job to be started.

"manual": { # Job trigger option for hybrid jobs. Jobs must be manually created and finished. # For use with hybrid jobs. Jobs must be manually created and finished.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

4816

},

4817

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For

4819

# example: every day (86400 seconds).

#

# A scheduled start time will be skipped if the previous

4822

# execution has not ended when its scheduled time occurs.

4823

#

4824

# This value must be set to a time duration greater than or equal

4825

# to 1 day and can be no longer than 60 days.

},
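The bounds on `recurrencePeriodDuration` described above (at least 1 day, at most 60 days) can be checked client-side before a trigger body is sent. The following is a minimal sketch, not part of the generated reference: the helper name `schedule_trigger` is hypothetical, and the `"<seconds>s"` string encoding is an assumption based on the usual JSON form of a protobuf duration.

```python
ONE_DAY_SECONDS = 86400
MAX_PERIOD_DAYS = 60


def schedule_trigger(period_seconds):
    """Build one entry for the `triggers` list (hypothetical helper)."""
    # The reference requires a period of at least 1 day and at most 60 days.
    if not ONE_DAY_SECONDS <= period_seconds <= MAX_PERIOD_DAYS * ONE_DAY_SECONDS:
        raise ValueError("recurrence period must be between 1 and 60 days")
    # Assumption: durations are sent as a decimal number of seconds plus "s".
    return {"schedule": {"recurrencePeriodDuration": f"{period_seconds}s"}}
```

For a daily schedule, `schedule_trigger(86400)` yields a dictionary that could be placed in the `triggers` list of a job trigger body.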

},

4828

],

4829

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

4830

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

4831

"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options.

# bucket.

4833

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

4834

# than this value then the rest of the bytes are omitted. Only one

4835

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

4836

"sampleMethod": "A String",

4837

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

4838

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format

# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.

#

4841

# If the url ends in a trailing slash, the bucket or directory represented

4842

# by the url will be scanned non-recursively (content in sub-directories

4843

# will not be scanned). This means that `gs://mybucket/` is equivalent to

4844

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

4845

# `gs://mybucket/directory/*`.

4846

#

4847

# Exactly one of `url` or `regex_file_set` must be set.

4848

"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or

4849

# `regex_file_set` must be set.

4850

# expressions are used to allow fine-grained control over which files in the

4851

# bucket to include.

4852

#

4853

# Included files are those that match at least one item in `include_regex` and

4854

# do not match any items in `exclude_regex`. Note that a file that matches

4855

# items from both lists will _not_ be included. For a match to occur, the

4856

# entire file path (i.e., everything in the url after the bucket name) must

4857

# match the regular expression.

4858

#

4859

# For example, given the input `{bucket_name: "mybucket", include_regex:

4860

# ["directory1/.*"], exclude_regex:

4861

# ["directory1/excluded.*"]}`:

4862

#

4863

# * `gs://mybucket/directory1/myfile` will be included

4864

# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches

4865

# across `/`)

4866

# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the

4867

# full path doesn't match any items in `include_regex`)

4868

# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path

4869

# matches an item in `exclude_regex`)

4870

#

4871

# If `include_regex` is left empty, it will match all files by default

4872

# (this is equivalent to setting `include_regex: [".*"]`).

4873

#

4874

# Some other common use cases:

4875

#

4876

# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all

4877

# files in `mybucket` except for .pdf files

4878

# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will

4879

# include all files directly under `gs://mybucket/directory/`, without matching

4880

# across `/`

4881

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

4882

# the bucket that match at least one of these regular expressions will be

4883

# excluded from the scan.

4884

#

4885

# Regular expressions use RE2

4886

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

4887

# under the google/re2 repository on GitHub.

4888

"A String",

4889

],

4890

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

4891

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

4892

# the bucket that match at least one of these regular expressions will be

4893

# included in the set of files, except for those that also match an item in

4894

# `exclude_regex`. Leaving this field empty will match all files by default

4895

# (this is equivalent to including `.*` in the list).

4896

#

4897

# Regular expressions use RE2

4898

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

4899

# under the google/re2 repository on GitHub.

"A String",

],

},
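The include/exclude semantics of `regexFileSet` described above can be approximated locally to preview which object paths a configuration would select. This is a rough sketch only: the function name `is_included` is hypothetical, and it uses Python's `re` module rather than RE2, which is assumed to behave the same for simple patterns like the ones in the examples.

```python
import re


def is_included(path, include_regex=(), exclude_regex=()):
    """Return True if `path` (the part of the URL after the bucket name) is selected."""
    # An empty include list is treated like [".*"], i.e. it matches all files.
    includes = list(include_regex) or [".*"]
    # The entire path must match, so fullmatch is used instead of search.
    matches_include = any(re.fullmatch(p, path) for p in includes)
    matches_exclude = any(re.fullmatch(p, path) for p in exclude_regex)
    # A file matching items from both lists is not included.
    return matches_include and not matches_exclude
```

With `include_regex=["directory1/.*"]` and `exclude_regex=["directory1/excluded.*"]`, the path `directory1/myfile` is selected while `directory1/excludedfile` and `directory0/directory1/myfile` are not, which mirrors the examples in the description.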

},

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.

4905

# Number of files scanned is rounded down. Must be between 0 and 100,

4906

# inclusive. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The

4908

# number of bytes scanned is rounded down. Must be between 0 and 100,

4909

# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one

4910

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

"fileTypes": [ # List of file type groups to include in the scan.

4912

# If empty, all files are scanned and available data format processors

4913

# are applied. In addition, the binary content of the selected files

4914

# is always scanned as well.

# Images are scanned only as binary if the specified region

4916

# does not support image inspection and no file_types were specified.

4917

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.

"A String",

4919

],

4920

},

"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

4922

"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always

4923

# by project and namespace, however the namespace ID may be empty.

4924

# A partition ID identifies a grouping of entities. The grouping is always

4925

# by project and namespace, however the namespace ID may be empty.

4926

#

4927

# A partition ID contains several dimensions:

4928

# project ID and namespace ID.

4929

"projectId": "A String", # The ID of the project to which the entities belong.

4930

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

4931

},

4932

"kind": { # A representation of a Datastore kind. # The kind to process.

4933

"name": "A String", # The name of the kind.

4934

},

4935

},

4936

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

4937

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

4938

# inspection of entire columns which you know have no findings.

4939

{ # General identifier of a data field in a storage service.

4940

"name": "A String", # Name describing the field.

4941

},

4942

],

4943

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

4944

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

4945

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

4946

# Cannot be used in conjunction with TimespanConfig.

4947

"sampleMethod": "A String",

4948

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

4949

# `actions.saveFindings.outputConfig.table` is specified, the values of

4950

# columns specified here are available in the output table under

4951

# `location.content_locations.record_location.record_key.id_values`. Nested

4952

# fields such as `person.birthdate.year` are allowed.

4953

{ # General identifier of a data field in a storage service.

4954

"name": "A String", # Name describing the field.

4955

},

4956

],

4957

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows

4958

# scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and

4959

# 100 means no limit. Defaults to 0. Only one of rows_limit and

4960

# rows_limit_percent can be specified. Cannot be used in conjunction with

4961

# TimespanConfig.

4962

"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.

4963

# identified by its project_id, dataset_id, and table_name. Within a query

4964

# a table is often referenced with a string in the format of:

4965

# `<project_id>:<dataset_id>.<table_id>` or

4966

# `<project_id>.<dataset_id>.<table_id>`.

4967

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

4968

# If omitted, project ID is inferred from the API call.

4969

"tableId": "A String", # Name of the table.

4970

"datasetId": "A String", # Dataset ID of the table.

4971

},

4972

},

4973

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

4974

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

4975

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

4976

# Used for data sources like Datastore and BigQuery.

4977

#

4978

# For BigQuery:

4979

# Required to filter out rows based on the given start and

4980

# end times. If not specified and the table was modified between the given

4981

# start and end times, the entire table will be scanned.

4982

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

4983

# `TIMESTAMP`, or `DATETIME` BigQuery column.

4984

#

4985

# For Datastore:

4986

# Valid data types of the timestamp field are: `TIMESTAMP`.

4987

# Datastore entity will be scanned if the timestamp property does not

4988

# exist or its value is empty or invalid.

4989

"name": "A String", # Name describing the field.

4990

},

4991

"endTime": "A String", # Exclude files or rows newer than this value.

4992

# If set to zero, no upper time limit is applied.

4993

"startTime": "A String", # Exclude files or rows older than this value.

4994

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

4995

# a valid start_time to avoid scanning files that have not been modified

4996

# since the last time the JobTrigger executed. This will be based on the

4997

# time of the execution of the last run of the JobTrigger.

4998

},

"hybridOptions": { # Configuration to control jobs where the content being inspected is outside of Google Cloud Platform. # Hybrid inspection options.

# Early access feature is in a pre-release state and might change or have

# limited support. For more information, see

# https://cloud.google.com/products#product-launch-stages.

5004

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

5005

# meaningful such as the columns that are primary keys.

5006

"identifyingFields": [ # The columns that are the primary keys for table objects included in

5007

# ContentItem. A copy of this cell's value will be stored alongside

5008

# each finding so that the finding can be traced to the specific row it came

5009

# from. No more than 3 may be provided.

5010

{ # General identifier of a data field in a storage service.

5011

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

5016

#

5017

# Label keys must be between 1 and 63 characters long and must conform

5018

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

5019

#

5020

# Label values must be between 0 and 63 characters long and must conform

5021

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

5022

#

5023

# No more than 10 labels can be associated with a given finding.

5024

#

5025

# Examples:

5026

# * `"environment" : "production"`

5027

# * `"pipeline" : "etl"`

5028

"a_key": "A String",

5029

},
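The label constraints listed above (key and value patterns, length limits, and at most 10 labels) lend themselves to a local pre-check. The sketch below is illustrative only; `validate_finding_labels` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
import re

# Patterns copied from the field description above.
LABEL_KEY_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")
LABEL_VALUE_RE = re.compile(r"([a-z]([-a-z0-9]*[a-z0-9])?)?")
MAX_LABELS = 10


def validate_finding_labels(labels):
    """Return a list of problems; an empty list means the labels look valid."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append("too many labels")
    for key, value in labels.items():
        # Keys: 1 to 63 characters, whole string must match the key pattern.
        if not (1 <= len(key) <= 63 and LABEL_KEY_RE.fullmatch(key)):
            problems.append(f"invalid key: {key!r}")
        # Values: 0 to 63 characters, whole string must match the value pattern.
        if not (len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value)):
            problems.append(f"invalid value: {value!r}")
    return problems
```

Calling `validate_finding_labels({"environment": "production", "pipeline": "etl"})` returns an empty list, while a key such as `"Env"` is reported because uppercase letters are outside the allowed pattern.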

5030

"requiredFindingLabelKeys": [ # These are labels that each inspection request must include within their

5031

# 'finding_labels' map. Request may contain others, but any missing one of

5032

# these will be rejected.

5033

#

5034

# Label keys must be between 1 and 63 characters long and must conform

5035

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

5036

#

5037

# No more than 10 keys can be required.

5038

"A String",

5039

],

5040

"description": "A String", # A short description of where the data is coming from. Will be stored once

5041

# in the job. 256 max length.

5042

},

},

5044

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

5045

# When used with redactContent only info_types and min_likelihood are currently

5046

# used.

5047

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.

"limits": { # Configuration to control the number of findings returned. # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.

5050

# When set within `InspectContentRequest`, the maximum returned is 2000

5051

# regardless if this is set higher.

5052

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

5053

{ # Max findings configuration per infoType, per content item or long

5054

# running DlpJob.

5055

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

5056

# info_type should be provided. If InfoTypeLimit does not have an

5057

# info_type, the DLP API applies the limit against all info_types that

5058

# are found but not specified in another InfoTypeLimit.

5059

"name": "A String", # Name of the information type. Either a name of your choosing when

5060

# creating a CustomInfoType, or one of the names listed

5061

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5062

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

5065

"maxFindings": 42, # Max findings limit for the given infoType.

5066

},

5067

],

5068

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.

# When set within `InspectJobConfig`,

# the maximum returned is 2000 regardless if this is set higher.

5071

# When set within `InspectContentRequest`, this field is ignored.

5072

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is

5074

# POSSIBLE.

5075

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

5076

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

5077

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

5078

{ # Custom information type provided by the user. Used to find domain-specific

5079

# sensitive information configurable to the data in question.

5080

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

5081

"pattern": "A String", # Pattern defining the regular expression. Its syntax

5082

# (https://github.com/google/re2/wiki/Syntax) can be found under the

5083

# google/re2 repository on GitHub.

5084

"groupIndexes": [ # The index of the submatch to extract as findings. When not

5085

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that support reversing.

5091

# such as

5092

# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

5093

# These types of transformations are

5094

# those that perform pseudonymization, thereby producing a "surrogate" as

5095

# output. This should be used in conjunction with a field on the

5096

# transformation such as `surrogate_info_type`. This CustomInfoType does

5097

# not support the use of `detection_rules`.

5098

},

5099

"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in

5100

# infoType, when the name matches one of existing infoTypes and that infoType

5101

# is specified in `InspectContent.info_types` field. Specifying the latter

5102

# adds findings to the ones detected by the system. If a built-in info type is

5103

# not specified in `InspectContent.info_types` list then the name is treated

5104

# as a custom info type.

5105

"name": "A String", # Name of the information type. Either a name of your choosing when

5106

# creating a CustomInfoType, or one of the names listed

5107

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5108

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.

},

5111

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.

5112

# be used to match sensitive information specific to the data, such as a list

5113

# of employee IDs or job titles.

5114

#

5115

# Dictionary words are case-insensitive and all characters other than letters

5116

# and digits in the unicode [Basic Multilingual

5117

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

5118

# will be replaced with whitespace when scanning for matches, so the

5119

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

5120

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

5121

# surrounding any match must be of a different type than the adjacent

5122

# characters within the word, so letters must be next to non-letters and

5123

# digits next to non-digits. For example, the dictionary word "jen" will

5124

# match the first three letters of the text "jen123" but will return no

5125

# matches for "jennifer".

5126

#

5127

# Dictionary words containing a large number of characters that are not

5128

# letters or digits may result in unexpected findings because such characters

5129

# are treated as whitespace. The

5130

# [limits](https://cloud.google.com/dlp/limits) page contains details about

5131

# the size limits of dictionaries. For dictionaries that do not fit within

5132

# these constraints, consider using `LargeCustomDictionaryConfig` in the

5133

# `StoredInfoType` API.

5134

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

5135

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

5136

# at least one phrase and every phrase must contain at least 2 characters

5137

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

5142

# is accepted.

5143

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

5144

# Example: gs://[BUCKET_NAME]/dictionary.txt

5145

},

5146

},

5147

"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in

5148

# `InspectDataSource`. Not currently supported in `InspectContent`.

5149

"name": "A String", # Resource name of the requested `StoredInfoType`, for example

5150

# `organizations/433245324/storedInfoTypes/432452342` or

5151

# `projects/project-id/storedInfoTypes/432452342`.

5152

"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for

5153

# inspection was created. Output-only field, populated by the system.

5154

},

5155

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.

5156

# Rules are applied in the order that they are specified. Not supported for the

5157

# `surrogate_type` CustomInfoType.

5158

{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a

5159

# `CustomInfoType` to alter behavior under certain circumstances, depending

5160

# on the specific details of the rule. Not supported for the `surrogate_type`

5161

# custom infoType.

5162

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

5163

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

5176

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

5177

"pattern": "A String", # Pattern defining the regular expression. Its syntax

5178

# (https://github.com/google/re2/wiki/Syntax) can be found under the

5179

# google/re2 repository on GitHub.

5180

"groupIndexes": [ # The index of the submatch to extract as findings. When not

5181

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

5186

# part of a detection rule.

5187

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

5188

# levels. For example, if a finding would be `POSSIBLE` without the

5189

# detection rule and `relative_likelihood` is 1, then it is upgraded to

5190

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

5191

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

5192

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

5193

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

5194

# a final likelihood of `LIKELY`.

5195

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

},

},

},

],

"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding

5201

# to be returned. It still can be used for rules matching.

5202

"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be

5203

# altered by a detection rule if the finding meets the criteria specified by

5204

# the rule. Defaults to `VERY_LIKELY` if not specified.

5205

},

5206

],

5207

"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is

5208

# included in the response; see Finding.quote.

5209

"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.

5210

# Exclusion rules contained in the set are executed last; other

5211

# rules are executed in the order they are specified for each info type.

5212

{ # Rule set for modifying a set of infoTypes to alter behavior under certain

5213

# circumstances, depending on the specific details of the rules within the set.

5214

"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.

5215

{ # A single inspection rule to be applied to infoTypes, specified in

5216

# `InspectionRuleSet`.

5217

"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.

5218

# proximity of hotwords.

"proximity": { # Message for specifying a window around a finding to apply a detection rule. # Proximity of the finding within which the entire hotword must reside.

# The total length of the window cannot exceed 1000 characters. Note that

# the finding itself will be included in the window, so that hotwords may

# be used to match substrings of the finding itself. For example, the

# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be

# adjusted upwards if the area code is known to be the local area code of

# a company office using the hotword regex "\(xxx\)", where "xxx"

# is the area code in question.

"windowBefore": 42, # Number of characters before the finding to consider.

"windowAfter": 42, # Number of characters after the finding to consider.

},

5231

"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.

5232

"pattern": "A String", # Pattern defining the regular expression. Its syntax

5233

# (https://github.com/google/re2/wiki/Syntax) can be found under the

5234

# google/re2 repository on GitHub.

5235

"groupIndexes": [ # The index of the submatch to extract as findings. When not

5236

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.

5241

# part of a detection rule.

5242

"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of

5243

# levels. For example, if a finding would be `POSSIBLE` without the

5244

# detection rule and `relative_likelihood` is 1, then it is upgraded to

5245

# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.

5246

# Likelihood may never drop below `VERY_UNLIKELY` or exceed

5247

# `VERY_LIKELY`, so applying an adjustment of 1 followed by an

5248

# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in

5249

# a final likelihood of `LIKELY`.

5250

"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.

5251

},
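The `relativeLikelihood` arithmetic described above, including the clamping at `VERY_UNLIKELY` and `VERY_LIKELY`, can be illustrated with a few lines of Python. This is a sketch of the described behavior, not the service's implementation; `adjust_likelihood` is a hypothetical name.

```python
# Likelihood levels in ascending order, as named in the description above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative_likelihood
    # Likelihood never drops below VERY_UNLIKELY or exceeds VERY_LIKELY.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]
```

Because each step is clamped, applying `+1` and then `-1` to `VERY_LIKELY` ends at `LIKELY` rather than returning to the starting level, as the field description notes.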

5252

},

5253

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.

5254

# `InspectionRuleSet` are removed from results.

5255

"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.

5256

"pattern": "A String", # Pattern defining the regular expression. Its syntax

5257

# (https://github.com/google/re2/wiki/Syntax) can be found under the

5258

# google/re2 repository on GitHub.

5259

"groupIndexes": [ # The index of the submatch to extract as findings. When not

5260

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.

"infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps with or

# is contained within a finding of an infoType from this list. For

# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and

# `exclusion_rule` containing `exclude_info_types.info_types` with

# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap

# with an EMAIL_ADDRESS finding.

# As a result, "555-222-2222@example.org" generates only a single

# finding, namely the email address.

5273

{ # Type of information detected by the API.

5274

"name": "A String", # Name of the information type. Either a name of your choosing when

5275

# creating a CustomInfoType, or one of the names listed

5276

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5277

# a built-in type. InfoType names should conform to the pattern

# `[a-zA-Z0-9_]{1,64}`.


},

],

},
"dictionary": { # Dictionary which defines the rule. Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.

5305

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

5306

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

5307

# at least one phrase and every phrase must contain at least 2 characters

5308

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

5313

# is accepted.

5314

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

5315

# Example: gs://[BUCKET_NAME]/dictionary.txt

5316

},

5317

},

5318

"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.

},

},

],

"infoTypes": [ # List of infoTypes this rule set is applied to.

5323

{ # Type of information detected by the API.

5324

"name": "A String", # Name of the information type. Either a name of your choosing when

5325

# creating a CustomInfoType, or one of the names listed

5326

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5327

# a built-in type. InfoType names should conform to the pattern


# `[a-zA-Z0-9_]{1,64}`.


},

],

},

],

"contentOptions": [ # List of options defining data content to scan.

5334

# If empty, text, images, and other content will be included.

5335

"A String",

5336

],

5337

"infoTypes": [ # Restricts what info_types to look for. The values must correspond to

5338

# InfoType values returned by ListInfoTypes or listed at

5339

# https://cloud.google.com/dlp/docs/infotypes-reference.

5340

#

5341

# When no InfoTypes or CustomInfoTypes are specified in a request, the

5342

# system may automatically choose what detectors to run. By default this may

5343

# be all types, but may change over time as detectors are updated.

5344

#


# If you need precise control and predictability as to what detectors are

5346

# run you should specify specific InfoTypes listed in the reference,

5347

# otherwise a default list will be used, which may change over time.


{ # Type of information detected by the API.

5349

"name": "A String", # Name of the information type. Either a name of your choosing when

5350

# creating a CustomInfoType, or one of the names listed

5351

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5352

# a built-in type. InfoType names should conform to the pattern


# `[a-zA-Z0-9_]{1,64}`.


},

],

},

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

5358

# `inspect_config` will be merged into the values persisted as part of the

5359

# template.

5360

"actions": [ # Actions to execute at the completion of the job.

5361

{ # A task to execute on the completion of a job.

5362

# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.

5363

"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.

5364

# OutputStorageConfig. Only a single instance of this action can be

5365

# specified.

5366

# Compatible with: Inspect, Risk


"outputConfig": { # Cloud repository for storing output. # Location to store findings outside of DLP.

"table": { # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
#
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.


"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

5387

# If omitted, project ID is inferred from the API call.

5388

"tableId": "A String", # Name of the table.

5389

"datasetId": "A String", # Dataset ID of the table.

5390

},

5391

"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only

5392

# used for Inspect and must be unspecified for Risk jobs. Columns are derived

5393

# from the `Finding` object. If appending to an existing table, any columns

5394

# from the predefined schema that are missing will be added. No columns in

5395

# the existing table will be deleted.

5396

#

5397

# If unspecified, then all available columns will be used for a new table or

5398

# an (existing) table with no schema, and no changes will be made to an

5399

# existing table that has a schema.


# Only for use with external storage.


},

5402

},

"jobNotificationEmails": { # Enable email notification for project owners and editors on the job's
# completion or failure.

5406

},

"publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha). Publish the result summary of a DlpJob to the Cloud Security
# Command Center (CSCC Alpha).
# This action is only available for projects that are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect

5417

},

"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count.
# This publishes a metric to Stackdriver for each infoType requested, recording
# how many findings were found for it. CustomDetectors will be bucketed
# as 'Custom' under the Stackdriver label 'info_type'.

5422

},

5423

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.

5424

# results of the DlpJob will be applied to the entry for the resource scanned

5425

# in Cloud Data Catalog. Any labels previously written by another DlpJob will

5426

# be deleted. InfoType naming patterns are strictly enforced when using this

5427

# feature. Note that the findings will be persisted in Cloud Data Catalog

5428

# storage and are governed by Data Catalog service-specific policy, see

5429

# https://cloud.google.com/terms/service-terms

5430

# Only a single instance of this action can be specified and only allowed if

5431

# all resources being scanned are BigQuery tables.

5432

# Compatible with: Inspect

5433

},

"pubSub": { # Publish a notification to a pubsub topic. Publish a message into a given Pub/Sub topic when the DlpJob has completed. The
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.

},

},

],

},


"lastRunTime": "A String", # Output only. The timestamp of the last time this trigger executed.

5448

"createTime": "A String", # Output only. The creation timestamp of a triggeredJob.


"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the

5450

# triggeredJob is created, for example


# `projects/dlp-test-project/jobTriggers/53234423`.


},

5453

"updateMask": "A String", # Mask to control which fields get updated.

5454

}
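A minimal sketch of how a `patch` request body might be assembled in Python. It assumes the top-level key that sits next to `updateMask` is `jobTrigger` (that key opens before this excerpt), that `updateMask` accepts a comma-separated list of field paths, and that durations are written as a number of seconds followed by `s`. The helper name `build_patch_body` and all sample values are hypothetical; the length and recurrence checks mirror the limits stated in the JobTrigger schema.

```python
def build_patch_body(display_name, description, recurrence_seconds):
    """Build a patch body that updates only the listed fields (illustrative helper)."""
    if len(display_name) > 100:
        raise ValueError("displayName is limited to 100 characters")
    if len(description) > 256:
        raise ValueError("description is limited to 256 characters")
    day = 86400  # seconds in one day
    if not day <= recurrence_seconds <= 60 * day:
        raise ValueError("recurrence period must be between 1 and 60 days")
    return {
        "jobTrigger": {  # assumed key name for the trigger being updated
            "displayName": display_name,
            "description": description,
            "triggers": [
                {"schedule": {"recurrencePeriodDuration": f"{recurrence_seconds}s"}}
            ],
        },
        # Assumed format: comma-separated paths of the fields to overwrite.
        "updateMask": "displayName,description,triggers",
    }


body = build_patch_body("Nightly scan", "Scans the staging bucket", 86400)

# With an authorized `service` object, the body would be passed like this
# (not executed here):
# service.projects().jobTriggers().patch(
#     name="projects/dlp-test-project/jobTriggers/53234423", body=body
# ).execute()
```

Validating on the client side is optional; the service remains the authority on what it accepts.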


x__xgafv: string, V1 error format.

Allowed values

1 - v1 error format

2 - v2 error format
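The include/exclude rules documented for `regexFileSet` in the JobTrigger schema can be approximated locally before a trigger is saved. The sketch below is an illustration only: it uses Python's `re` module rather than RE2 (the two differ for some constructs, though not for the patterns shown), and the function name `is_included` is hypothetical. Paths are the portion of the URL after the bucket name.

```python
import re


def is_included(path, include_regex=(), exclude_regex=()):
    """Return True if `path` would be selected under the documented rules."""
    # An empty include list is documented as equivalent to [".*"].
    includes = list(include_regex) or [".*"]
    # The entire path must match, hence fullmatch rather than search.
    matches_include = any(re.fullmatch(p, path) for p in includes)
    matches_exclude = any(re.fullmatch(p, path) for p in exclude_regex)
    # A path matching both lists is not included.
    return matches_include and not matches_exclude


include = ["directory1/.*"]
exclude = ["directory1/excluded.*"]
for path in [
    "directory1/myfile",
    "directory1/directory2/myfile",
    "directory0/directory1/myfile",
    "directory1/excludedfile",
]:
    print(path, is_included(path, include, exclude))
```

The four sample paths are the ones used in the schema's own example, so the printed results can be compared against the documented outcomes.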

Returns:

An object of the form:


{ # Contains a configuration to make dlp api calls on a repeating basis.

5465

# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.


"status": "A String", # Required. A status for this trigger.

5467

"updateTime": "A String", # Output only. The last update timestamp of a triggeredJob.

5468

"errors": [ # Output only. A stream of errors encountered when the trigger was activated. Repeated


# errors may result in the JobTrigger automatically being paused.

5470

# Will return the last 100 errors. Whenever the JobTrigger is modified


# this list will be cleared.


{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.


"timestamps": [ # The times the error occurred.

5475

"A String",

5476

],


"details": { # The `Status` type defines a logical error model that is suitable for # Detailed error codes and messages.


# different programming environments, including REST APIs and RPC APIs. It is

5479

# used by [gRPC](https://github.com/grpc). Each `Status` message contains

5480

# three pieces of data: error code, error message, and error details.

5481

#

5482

# You can find out more about this error model and how to work with it in the

5483

# [API Design Guide](https://cloud.google.com/apis/design/errors).

5484

"message": "A String", # A developer-facing error message, which should be in English. Any

5485

# user-facing error message should be localized and sent in the

5486

# google.rpc.Status.details field, or localized by the client.

5487

"code": 42, # The status code, which should be an enum value of google.rpc.Code.

5488

"details": [ # A list of messages that carry the error details. There is a common set of

5489

# message types for APIs to use.

5490

{

5491

"a_key": "", # Properties of the object. Contains field @type with type URL.

},

],

},

},

],

"displayName": "A String", # Display name (max 100 chars)

5498

"description": "A String", # User provided description (max 256 chars)


"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list

5500

# needs to trigger for a job to be started. The list may contain only

5501

# a single Schedule trigger and must have at least one object.

5502

{ # What event needs to occur for a new job to be started.

"manual": { # For use with hybrid jobs. Jobs must be manually created and finished.
# Early access feature is in a pre-release state and might change or have
# limited support. For more information, see
# https://cloud.google.com/products#product-launch-stages.

5508

},

5509

"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.

"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For
# example: every day (86400 seconds).


#


# A scheduled start time will be skipped if the previous

5514

# execution has not ended when its scheduled time occurs.

5515

#

5516

# This value must be set to a time duration greater than or equal

5517

# to 1 day and can be no longer than 60 days.


},


},

5520

],

5521

"inspectJob": { # Controls what and how to inspect for findings. # For inspect jobs, a snapshot of the configuration.

5522

"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.

5523

"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options.


# bucket.

5525

"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger

5526

# than this value then the rest of the bytes are omitted. Only one

5527

# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.

5528

"sampleMethod": "A String",

5529

"fileSet": { # Set of files to scan. # The set of one or more files to scan.

5530

"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format


# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.


#

5533

# If the url ends in a trailing slash, the bucket or directory represented

5534

# by the url will be scanned non-recursively (content in sub-directories

5535

# will not be scanned). This means that `gs://mybucket/` is equivalent to

5536

# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to

5537

# `gs://mybucket/directory/*`.

5538

#

5539

# Exactly one of `url` or `regex_file_set` must be set.

"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
#
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`

5573

"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in

5574

# the bucket that match at least one of these regular expressions will be

5575

# excluded from the scan.

5576

#

5577

# Regular expressions use RE2

5578

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

5579

# under the google/re2 repository on GitHub.

5580

"A String",

5581

],

5582

"bucketName": "A String", # The name of a Cloud Storage bucket. Required.

5583

"includeRegex": [ # A list of regular expressions matching file paths to include. All files in

5584

# the bucket that match at least one of these regular expressions will be

5585

# included in the set of files, except for those that also match an item in

5586

# `exclude_regex`. Leaving this field empty will match all files by default

5587

# (this is equivalent to including `.*` in the list).

5588

#

5589

# Regular expressions use RE2

5590

# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found

5591

# under the google/re2 repository on GitHub.

"A String",

],

},

},

"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.

"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.


"fileTypes": [ # List of file type groups to include in the scan.

5604

# If empty, all files are scanned and available data format processors

5605

# are applied. In addition, the binary content of the selected files

5606

# is always scanned as well.


# Images are scanned only as binary if the specified region

5608

# does not support image inspection and no file_types were specified.

5609

# Image inspection is restricted to 'global', 'us', 'asia', and 'europe'.


"A String",

5611

],

5612

},


"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options.

"partitionId": { # Datastore partition ID. A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace; however, the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.

5621

"projectId": "A String", # The ID of the project to which the entities belong.

5622

"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.

5623

},

5624

"kind": { # A representation of a Datastore kind. # The kind to process.

5625

"name": "A String", # The name of the kind.

5626

},

5627

},

5628

"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options.

5629

"excludedFields": [ # References to fields excluded from scanning. This allows you to skip

5630

# inspection of entire columns which you know have no findings.

5631

{ # General identifier of a data field in a storage service.

5632

"name": "A String", # Name describing the field.

5633

},

5634

],

5635

"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the

5636

# rest of the rows are omitted. If not set, or if set to 0, all rows will be

5637

# scanned. Only one of rows_limit and rows_limit_percent can be specified.

5638

# Cannot be used in conjunction with TimespanConfig.

5639

"sampleMethod": "A String",

5640

"identifyingFields": [ # Table fields that may uniquely identify a row within the table. When

5641

# `actions.saveFindings.outputConfig.table` is specified, the values of

5642

# columns specified here are available in the output table under

5643

# `location.content_locations.record_location.record_key.id_values`. Nested

5644

# fields such as `person.birthdate.year` are allowed.

5645

{ # General identifier of a data field in a storage service.

5646

"name": "A String", # Name describing the field.

5647

},

5648

],

"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.

5654

"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.

5655

# identified by its project_id, dataset_id, and table_name. Within a query

5656

# a table is often referenced with a string in the format of:

5657

# `<project_id>:<dataset_id>.<table_id>` or

5658

# `<project_id>.<dataset_id>.<table_id>`.

5659

"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.

5660

# If omitted, project ID is inferred from the API call.

5661

"tableId": "A String", # Name of the table.

5662

"datasetId": "A String", # Dataset ID of the table.

5663

},

5664

},

5665

"timespanConfig": { # Configuration of the timespan of the items to include in scanning.

5666

# Currently only supported when inspecting Google Cloud Storage and BigQuery.

5667

"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.

5668

# Used for data sources like Datastore and BigQuery.

5669

#

5670

# For BigQuery:

5671

# Required to filter out rows based on the given start and

5672

# end times. If not specified and the table was modified between the given

5673

# start and end times, the entire table will be scanned.

5674

# The valid data types of the timestamp field are: `INTEGER`, `DATE`,

5675

# `TIMESTAMP`, or `DATETIME` BigQuery column.

5676

#

5677

# For Datastore:

5678

# Valid data types of the timestamp field are: `TIMESTAMP`.

5679

# Datastore entity will be scanned if the timestamp property does not

5680

# exist or its value is empty or invalid.

5681

"name": "A String", # Name describing the field.

5682

},

5683

"endTime": "A String", # Exclude files or rows newer than this value.

5684

# If set to zero, no upper time limit is applied.

5685

"startTime": "A String", # Exclude files or rows older than this value.

5686

"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out

5687

# a valid start_time to avoid scanning files that have not been modified

5688

# since the last time the JobTrigger executed. This will be based on the

5689

# time of the execution of the last run of the JobTrigger.

5690

},

"hybridOptions": { # Hybrid inspection options. Configuration to control jobs where the content being inspected is outside
# of Google Cloud Platform.
# Early access feature is in a pre-release state and might change or have
# limited support. For more information, see
# https://cloud.google.com/products#product-launch-stages.

5696

"tableOptions": { # Instructions regarding the table content being inspected. # If the container is a table, additional information to make findings

5697

# meaningful such as the columns that are primary keys.

"identifyingFields": [ # The columns that are the primary keys for table objects included in
# ContentItem. A copy of this cell's value will be stored alongside
# each finding so that the finding can be traced to the specific row it came
# from. No more than 3 may be provided.

5702

{ # General identifier of a data field in a storage service.

5703

"name": "A String", # Name describing the field.

},

],

},

"labels": { # To organize findings, these labels will be added to each finding.

5708

#

5709

# Label keys must be between 1 and 63 characters long and must conform

5710

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

5711

#

5712

# Label values must be between 0 and 63 characters long and must conform

5713

# to the regular expression `([a-z]([-a-z0-9]*[a-z0-9])?)?`.

5714

#

5715

# No more than 10 labels can be associated with a given finding.

5716

#

5717

# Examples:

5718

# * `"environment" : "production"`

5719

# * `"pipeline" : "etl"`

5720

"a_key": "A String",

5721

},

"requiredFindingLabelKeys": [ # These are labels that each inspection request must include within its
# 'finding_labels' map. Requests may contain other labels, but any request
# missing one of these will be rejected.

5725

#

5726

# Label keys must be between 1 and 63 characters long and must conform

5727

# to the following regular expression: `[a-z]([-a-z0-9]*[a-z0-9])?`.

5728

#

5729

# No more than 10 keys can be required.

5730

"A String",

5731

],

5732

"description": "A String", # A short description of where the data is coming from. Will be stored once

5733

# in the job. 256 max length.

5734

},


},

5736

"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.

5737

# When used with redactContent only info_types and min_likelihood are currently

5738

# used.

5739

"excludeInfoTypes": True or False, # When true, excludes type information of the findings.


"limits": { # Configuration to control the number of findings returned. # Configuration to control the number of findings returned.

"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.

5744

"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.

5745

{ # Max findings configuration per infoType, per content item or long

5746

# running DlpJob.

5747

"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per

5748

# info_type should be provided. If InfoTypeLimit does not have an

5749

# info_type, the DLP API applies the limit against all info_types that

5750

# are found but not specified in another InfoTypeLimit.

5751

"name": "A String", # Name of the information type. Either a name of your choosing when

5752

# creating a CustomInfoType, or one of the names listed

5753

# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying

5754

# a built-in type. InfoType names should conform to the pattern


# `[a-zA-Z0-9_]{1,64}`.


},

5757

"maxFindings": 42, # Max findings limit for the given infoType.

5758

},

5759

],

"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectJobConfig`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.

5764

},

"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.

5767

# See https://cloud.google.com/dlp/docs/likelihood to learn more.

5768

"customInfoTypes": [ # CustomInfoTypes provided by the user. See

5769

# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.

5770

{ # Custom information type provided by the user. Used to find domain-specific

5771

# sensitive information configurable to the data in question.

5772

"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.

5773

"pattern": "A String", # Pattern defining the regular expression. Its syntax

5774

# (https://github.com/google/re2/wiki/Syntax) can be found under the

5775

# google/re2 repository on GitHub.

5776

"groupIndexes": [ # The index of the submatch to extract as findings. When not

5777

# specified, the entire match is returned. No more than 3 may be included.

42,

],

},

"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that

5782

# support reversing.

5783

# such as

5784

# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).

5785

# These types of transformations are

5786

# those that perform pseudonymization, thereby producing a "surrogate" as

5787

# output. This should be used in conjunction with a field on the

5788

# transformation such as `surrogate_info_type`. This CustomInfoType does

5789

# not support the use of `detection_rules`.

5790

},

"infoType": { # Type of information detected by the API.
# CustomInfoType can either be a new infoType, or an extension of a built-in
# infoType, when the name matches one of the existing infoTypes and that infoType
# is specified in the `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in info type is
# not specified in the `InspectContent.info_types` list, then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
#
# A list of phrases to detect as a CustomInfoType.
"wordList": { # Message defining a list of words or phrases to search for in the data.
# List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage.
# Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A URL representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
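The dictionary matching rules described above (case-insensitivity, punctuation treated as whitespace, and the boundary rule for letters and digits) can be approximated with a short sketch. This is not the service's implementation: `dictionary_matches` and its helpers are hypothetical, whitespace runs are collapsed as a simplifying assumption, and the Basic Multilingual Plane restriction is ignored.

```python
def _normalize(text):
    """Lowercase the text and replace every non-letter, non-digit with a space."""
    spaced = "".join(c if c.isalnum() else " " for c in text.lower())
    # Collapse runs of whitespace (an assumption made for this sketch).
    return " ".join(spaced.split())


def _kind(ch):
    """Classify a character as a letter, a digit, or something else."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def dictionary_matches(phrase, text):
    """Return True if `phrase` occurs in `text` under the rules sketched above."""
    needle = _normalize(phrase)
    haystack = _normalize(text)
    if not needle:
        return False
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        # The character on each side of the match must differ in kind from the
        # adjacent character inside the match (letter vs. non-letter, etc.).
        before_ok = start == 0 or _kind(haystack[start - 1]) != _kind(needle[0])
        after_ok = end == len(haystack) or _kind(haystack[end]) != _kind(needle[-1])
        if before_ok and after_ok:
            return True
        start = haystack.find(needle, start + 1)
    return False
```

With this sketch, "Sam Johnson" matches "Sam, Johnson" and "Sam (Johnson)", and "jen" matches "jen123" but not "jennifer", mirroring the examples in the comments.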

"storedType": { # A reference to a StoredInfoType to use with scanning.
# Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},

"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
# Hotword-based detection rule.
"proximity": { # Message for specifying a window around a finding to apply a detection
# rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression.
# Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
# Likelihood adjustment to apply to all matching findings.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It can still be used for rule matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.

{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
# Hotword-based detection rule.
"proximity": { # Message for specifying a window around a finding to apply a detection
# rule.
# Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
"windowBefore": 42, # Number of characters before the finding to consider.
"windowAfter": 42, # Number of characters after the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression.
# Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
# Likelihood adjustment to apply to all matching findings.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
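The clamping behaviour of `relativeLikelihood` described above can be sketched in a few lines. The helper name `adjust_likelihood` is hypothetical, and the level ordering is taken from the comment text; the unspecified enum value is deliberately left out of this sketch.

```python
# Likelihood levels in ascending order, as named in the comments above.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range."""
    index = LEVELS.index(base) + relative_likelihood
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


upgraded = adjust_likelihood("POSSIBLE", 1)
downgraded = adjust_likelihood("POSSIBLE", -1)
# +1 on VERY_LIKELY is clamped, so the following -1 lands on LIKELY.
clamped = adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1)
```

Because each adjustment is clamped as it is applied, a +1 followed by a -1 does not cancel out at the top of the scale, which is the case the comment calls out.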

"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
# Exclusion rule.
"regex": { # Message defining a custom regular expression.
# Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes.
# Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps or is
# contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},
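The phone number and email example above can be illustrated with a small sketch of overlap-based exclusion. Everything here is an assumption made for illustration: findings are modelled as plain dicts with half-open `start`/`end` character offsets, and `drop_excluded` is a hypothetical helper rather than anything the API exposes.

```python
def overlaps(a, b):
    """True if two findings with half-open [start, end) offsets share a character."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def drop_excluded(findings, rule_info_types, exclude_info_types):
    """Drop findings of `rule_info_types` that overlap a finding of `exclude_info_types`."""
    excluders = [f for f in findings if f["infoType"] in exclude_info_types]
    return [
        f
        for f in findings
        if not (
            f["infoType"] in rule_info_types
            and any(e is not f and overlaps(f, e) for e in excluders)
        )
    ]


text = "555-222-2222@example.org"
findings = [
    {"infoType": "PHONE_NUMBER", "start": 0, "end": len("555-222-2222")},
    {"infoType": "EMAIL_ADDRESS", "start": 0, "end": len(text)},
]
remaining = drop_excluded(findings, {"PHONE_NUMBER"}, {"EMAIL_ADDRESS"})
```

After the call, `remaining` holds only the EMAIL_ADDRESS finding, matching the single-finding outcome described in the comment.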

"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.

5975

# be used to match sensitive information specific to the data, such as a list

5976

# of employee IDs or job titles.

5977

#

5978

# Dictionary words are case-insensitive and all characters other than letters

5979

# and digits in the unicode [Basic Multilingual

5980

# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)

5981

# will be replaced with whitespace when scanning for matches, so the

5982

# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",

5983

# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters

5984

# surrounding any match must be of a different type than the adjacent

5985

# characters within the word, so letters must be next to non-letters and

5986

# digits next to non-digits. For example, the dictionary word "jen" will

5987

# match the first three letters of the text "jen123" but will return no

5988

# matches for "jennifer".

5989

#

5990

# Dictionary words containing a large number of characters that are not

5991

# letters or digits may result in unexpected findings because such characters

5992

# are treated as whitespace. The

5993

# [limits](https://cloud.google.com/dlp/limits) page contains details about

5994

# the size limits of dictionaries. For dictionaries that do not fit within

5995

# these constraints, consider using `LargeCustomDictionaryConfig` in the

5996

# `StoredInfoType` API.

5997

"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.

5998

"words": [ # Words or phrases defining the dictionary. The dictionary must contain

5999

# at least one phrase and every phrase must contain at least 2 characters

6000

# that are letters or digits. [required]

"A String",

],

},

"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file

6005

# is accepted.

6006

"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.

6007

# Example: gs://[BUCKET_NAME]/dictionary.txt

6008

},

6009

},

"matchingType": "A String", # How the rule is applied; see the MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# If you need precise control and predictability as to what detectors are
# run, you should specify specific InfoTypes listed in the reference;
# otherwise a default list will be used, which may change over time.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# `[a-zA-Z0-9_]{1,64}`.
},
],
},
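The infoType name pattern quoted throughout this schema, `[a-zA-Z0-9_]{1,64}`, can be checked client-side before a request is built. The following is a minimal sketch; `is_valid_info_type_name` is a hypothetical helper, and since the comments only say names "should" conform, the service may be more or less strict than this check.

```python
import re

# Pattern copied from the "name" field comments above.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")


def is_valid_info_type_name(name):
    """Return True if the whole name matches the documented pattern."""
    return INFO_TYPE_NAME.fullmatch(name) is not None


# Example inspectConfig fragment using one built-in and one custom-style name.
inspect_config = {
    "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "EMPLOYEE_ID_2020"}],
}
all_valid = all(is_valid_info_type_name(t["name"]) for t in inspect_config["infoTypes"])
```

`fullmatch` is used rather than `match` so that names with trailing invalid characters, or names longer than 64 characters, are rejected.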

"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.

# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
# Save resulting findings in a provided location.
"outputConfig": { # Cloud repository for storing output.
# Location to store findings outside of DLP.
"table": { # Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
#
# Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
# Only for use with external storage.
},
},

"jobNotificationEmails": { # Enable email notification for project owners and editors on a job's
# completion or failure.
},
"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security
# Command Center (CSCC Alpha).
# This action is only available for projects that are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
# Publish summary to Cloud Security Command Center (Alpha).
},
"publishToStackdriver": { # Enable Stackdriver metric dlp.googleapis.com/finding_count. This
# will publish a metric to Stackdriver on each infotype requested and
# how many findings were found for it. CustomDetectors will be bucketed
# as 'Custom' under the Stackdriver label 'info_type'.
},

"publishFindingsToCloudDataCatalog": { # Publish findings of a DlpJob to Cloud Data Catalog. Labels summarizing the # Publish findings to Cloud Datahub.
# results of the DlpJob will be applied to the entry for the resource scanned
# in Cloud Data Catalog. Any labels previously written by another DlpJob will
# be deleted. InfoType naming patterns are strictly enforced when using this
# feature. Note that the findings will be persisted in Cloud Data Catalog
# storage and are governed by Data Catalog service-specific policy, see
# https://cloud.google.com/terms/service-terms