Sensitive Data Definitions

Sensitive Data Definitions provide the ability to search for sensitive data types using criteria that you specify. The Sensitive Data Definition allows the combination of AnyFind and Sensitive Data Types to specify the results that are matched in a location. With a Sensitive Data Definition, you can locate only the data that matches a specific set of custom criteria.

Where Sensitive Data Definitions are used

Sensitive Data Definitions are used in Sensitive Data Types.

Sensitive Data Definitions

A Sensitive Data Definition generally consists of three fields, although in some cases, depending on the operator selected, a fourth field displays. These fields are:

Field Description
Sensitive Data Type The Sensitive Data Type specifies the type of data that you are filtering on. "Social Security Number" and "Password" are examples of Data Types. Any sensitive data types that you have created also display here as available selections.
Operator The Operator defines how results are found based upon the Data Type. "Equals" and "Require" are examples of operators. For detailed information, please refer to Operator Types.
Value field The Value Field specifies the value used to qualify the data. This can range from a secondary Sensitive Data Type to a predetermined list of values from which you can select. In some cases, depending upon the Operator that you selected, both a secondary data type and a predetermined list of values display together. For example, this occurs when you choose the "Near with Distance" or "Far with Distance" operator.

Operator Types

When creating a filter for a sensitive data definition, you first select a Sensitive Data Type and then you select an operator. There are 12 types of operators available:

Operator Description
Near Tests that sensitive data types are within a certain distance of each other. Used as NEAR(A,B,C), where A and B are sensitive data types and C is the distance configured in the policy setting Settings\SensitiveDataEngine\NearDistance. The value is the maximum allowable number of characters between sensitive data types; the default value is 50. Configuring the distance in a policy lets every filter that uses the Near operator share the same distance value, so you do not need to specify the distance each time you use the operator in a sensitive data definition. For example, if an SSN has a match at location 3 with length 9, and a CCN has a match at location 20 with length 16, then NEAR(SSN,CCN,10) is True but NEAR(SSN,CCN,7) is False. To specify the distance independently of the policy setting, use the Near with Distance operator.

Near with Distance Tests that sensitive data types are within a certain distance of each other. Used as NEAR(A,B,C), where A and B are sensitive data types and C is the maximum allowable distance between them, specified in the definition itself. When you select Near with Distance, a value field is presented that lets you set the distance. For example, if an SSN has a match at location 3 with length 9, and a CCN has a match at location 20 with length 16, then NEAR(SSN,CCN,10) is True but NEAR(SSN,CCN,7) is False. If you do not want to specify the distance in the definition itself, use the Near operator.
Far Tests that sensitive data types are at least a certain distance away from each other. Used as FAR(A,B,C), where A and B are sensitive data types and C is the distance configured in the policy setting Settings\SensitiveDataEngine\FarDistance. The value is the minimum allowable number of characters between sensitive data types; the default value is 50. Configuring the distance in a policy lets every filter that uses the Far operator share the same distance value, so you do not need to specify the distance each time you use the operator in a sensitive data definition. For example, if an SSN has a match at location 3 with length 9, and a CCN has a match at location 20 with length 16, then FAR(SSN,CCN,10) is False but FAR(SSN,CCN,7) is True. To specify the distance independently of the policy setting, use the Far with Distance operator.

Far with Distance Tests that sensitive data types are at least a certain distance away from each other. Used as FAR(A,B,C), where A and B are sensitive data types and C is the minimum allowable distance between them, specified in the definition itself. When you select Far with Distance, a value field is presented that lets you set the distance. If you do not want to specify the distance in the definition itself, use the Far operator.
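Equals Tests whether the unique match count of a type is equal to the specified value.
Does Not Equal Tests whether the unique match count of a type differs from the specified value.
Less Than Tests whether the unique match count of a type is less than the specified value.
Less Than or Equals Tests whether the unique match count of a type is less than or equal to the specified value.
Greater Than Tests whether the unique match count of a type is greater than the specified value.
Greater Than or Equals Tests whether the unique match count of a type is greater than or equal to the specified value.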
Require Tests that a minimum number of sensitive data types are present. When this operator is selected, the value field changes to "at least this many Sensitive Data Types:". The numerical value cannot exceed the number of Sensitive Data Types selected; if you selected two Sensitive Data Types, for instance, you cannot select a value greater than 2. For example, if you select Credit Card Number, Password, Drivers License, and Passport Number as the Sensitive Data Types, Require as the Operator, and 3 as the Value, the definition returns True only when at least 3 of the selected Sensitive Data Types are present in a location. If 2 or fewer of those Sensitive Data Types are present, it returns False.
Allow Tests that a maximum number of sensitive data types are present. When this operator is selected, the value field changes to "no more than this many Sensitive Data Types:". The numerical value cannot be less than the number of Sensitive Data Types selected; if you selected three Sensitive Data Types, for instance, you cannot select a value less than 3. For example, if you select Credit Card Number, Password, Drivers License, and Passport Number as the Sensitive Data Types, Allow as the Operator, and 5 as the Value, the definition returns True only when up to 5 of the selected Sensitive Data Types are present in a location. If 6 or more of those Sensitive Data Types are present, it returns False.
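
As a rough illustration of the Require and Allow rows above, the following Python sketch counts how many of the selected Sensitive Data Types have at least one match in a location and compares that count with the configured value. The function names and the dictionary layout are hypothetical and are not part of the product, the sample match values are invented, and the range limits that the dialog applies to the value field are not modeled.

```python
from typing import Dict, Iterable, Set


def types_present(found: Dict[str, Set[str]], selected: Iterable[str]) -> int:
    """Count the selected data types that have at least one match."""
    return sum(1 for name in selected if found.get(name))


def require(found: Dict[str, Set[str]], selected: Iterable[str], minimum: int) -> bool:
    """True when at least `minimum` of the selected types are present."""
    return types_present(found, selected) >= minimum


def allow(found: Dict[str, Set[str]], selected: Iterable[str], maximum: int) -> bool:
    """True when no more than `maximum` of the selected types are present."""
    return types_present(found, selected) <= maximum


selected = ["Credit Card Number", "Password", "Drivers License", "Passport Number"]

# Hypothetical scan result for one location: data type -> unique matches.
location = {
    "Credit Card Number": {"4111111111111111"},
    "Password": {"hunter2"},
    "Passport Number": {"X1234567"},
}

print(require(location, selected, 3))  # three of the four types are present
print(allow(location, selected, 5))
```

With three of the four selected types present, Require with a value of 3 holds, and it would stop holding if the value were raised to 4.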

Note: Each of the operators that specify a match count (Equals, Does Not Equal, Less Than, Less Than or Equals, Greater Than, Greater Than or Equals) applies to unique match counts. When you create a sensitive data definition and select a Keyword as the Sensitive Data Type, the Value must be 0 or 1, because a location can contain only a single unique instance of a specific Keyword. The result for a keyword is always the keyword itself, so a keyword can never have more than one unique match, whereas a Regular Expression or Dictionary can have many unique matches. For example, a sensitive data definition of "Keyword > 3" is never True, because one location cannot contain more than one unique instance of the same keyword. A sensitive data definition of "Keyword <= 1", on the other hand, can be True. A Regular Expression can have multiple unique matches per definition because it does not specify a single identity. The following is a list of operators and their valid numerical values when the Data Type = Keyword:

Operator and Value Description
Equals 0 Returns True only if there are no keyword matches found.
Equals 1 Returns True if there is one unique instance of a specific keyword.
Does Not Equal 0 Returns True if there is one unique instance of a specific keyword.
Does Not Equal 1 Returns True only if there are no keyword matches found.
Less Than 1 Returns True only if there are no keyword matches found.
Less Than or Equals 1 Returns True if there are no matches or there is one unique instance of a specific keyword.
Greater Than 0 Returns True if there is one unique instance of a specific keyword.
Greater Than or Equals 0 Returns True if there are no matches or there is one unique instance of a specific keyword.
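
The unique-count behavior in the table above can be sketched in Python as follows. This is only an illustration under the assumption that a unique match count is the number of distinct matched strings in a location; the names COMPARISONS, unique_count, and compare are hypothetical, and the sample matches are invented.

```python
import operator
from typing import Callable, Dict, Iterable

# Map each count operator named in this article to a Python comparison.
COMPARISONS: Dict[str, Callable[[int, int], bool]] = {
    "Equals": operator.eq,
    "Does Not Equal": operator.ne,
    "Less Than": operator.lt,
    "Less Than or Equals": operator.le,
    "Greater Than": operator.gt,
    "Greater Than or Equals": operator.ge,
}


def unique_count(matches: Iterable[str]) -> int:
    """Number of distinct matched strings (assumed meaning of 'unique count')."""
    return len(set(matches))


def compare(matches: Iterable[str], operator_name: str, value: int) -> bool:
    """Apply a count operator to the unique match count of one data type."""
    return COMPARISONS[operator_name](unique_count(matches), value)


# A keyword that occurs five times is still one unique match.
keyword_hits = ["confidential"] * 5
# A regular expression can produce several distinct matches.
regex_hits = ["123-45-6789", "987-65-4321", "123-45-6789"]

print(unique_count(keyword_hits))                         # 1
print(compare(keyword_hits, "Greater Than", 3))           # False
print(compare(regex_hits, "Greater Than or Equals", 2))   # True
```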

Note: For database searches, the Near, Far, Near with Distance and Far with Distance operators are only relevant to data within a cell, not across cells. For files it is per file. For email it is the content of the email.
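
The SSN and CCN example given for the Near and Far operators can be reproduced with a short Python sketch. It assumes that the distance is the number of characters between the end of the earlier match and the start of the later one, and that both matches come from the same location (one cell, one file, or one email body). The Match class and the gap, near, and far functions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    start: int   # offset of the first character of the match
    length: int  # number of characters in the match

    @property
    def end(self) -> int:
        """Offset just past the last character of the match."""
        return self.start + self.length


def gap(a: Match, b: Match) -> int:
    """Characters between two matches in the same location (0 if they overlap)."""
    first, second = sorted((a, b), key=lambda m: m.start)
    return max(0, second.start - first.end)


def near(a: Match, b: Match, distance: int) -> bool:
    """True when the matches are at most `distance` characters apart."""
    return gap(a, b) <= distance


def far(a: Match, b: Match, distance: int) -> bool:
    """True when the matches are at least `distance` characters apart."""
    return gap(a, b) >= distance


ssn = Match(start=3, length=9)
ccn = Match(start=20, length=16)

print(gap(ssn, ccn))                              # 8
print(near(ssn, ccn, 10), near(ssn, ccn, 7))      # True False
print(far(ssn, ccn, 10), far(ssn, ccn, 7))        # False True
```

Under this assumption the SSN ends at offset 12 and the CCN starts at offset 20, leaving 8 characters between them, which agrees with the True and False results stated in the operator descriptions.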

Split Button

The Split button is used to add additional filters, allowing you to build your sensitive data definition on multiple criteria.

Click the gray plus button and a new filter appears below the current filter.

Each time you click the plus button, it adds another filter, but it also adds an Operator. When there are multiple filters there must be at least one operator, either And or Or. When you click a plus button to the right of a filter, an additional operator is added directly above that filter and a new filter is added below it. In the example above, clicking the plus button on the bottom filter adds a new operator directly above that filter and a new filter directly below it forming a new operator group as is shown below.

And/Or/Not Operators

The And/Or/Not operators are used to create complex filters by filtering records based on more than one condition. To change the condition from And to Or (or vice versa), click the word And or Or.

The And operator displays a record if both the first condition AND the second condition are True.

The Or operator displays a record if either the first condition OR the second condition is True.

To change the condition to the Not operator, click the gray split button to the right of the filter so that two lines of filters display, click the gray x button to the left of one of the filters, and then click the word And or Or to see the Not operator.

The Not operator is used to negate an expression. It displays a record if the condition is false.

If you want to add additional filters under the same operator click the gray plus button to the right of the operator and a new filter is added at the bottom of that group. To add a third filter to the example below, click the gray plus button to the right of the And operator.

The new filter displays at the bottom of the group.

Clicking the gray x button to the left of an operator removes that operator and all filters within that operator group.
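
One way to picture how operator groups combine filters is as small composable predicates. The Python sketch below is an illustration only: it assumes a location is summarized as a mapping from data type name to unique match count, and the helper names (count_at_least, and_group, or_group, not_) are hypothetical rather than anything exposed by the product.

```python
from typing import Callable, Dict

Location = Dict[str, int]            # data type name -> unique match count
Filter = Callable[[Location], bool]  # one filter line or one operator group


def count_at_least(data_type: str, minimum: int) -> Filter:
    """A single filter: the data type has at least `minimum` unique matches."""
    return lambda location: location.get(data_type, 0) >= minimum


def and_group(*filters: Filter) -> Filter:
    """And group: every filter in the group must be True."""
    return lambda location: all(f(location) for f in filters)


def or_group(*filters: Filter) -> Filter:
    """Or group: at least one filter in the group must be True."""
    return lambda location: any(f(location) for f in filters)


def not_(condition: Filter) -> Filter:
    """Not: True when the wrapped condition is False."""
    return lambda location: not condition(location)


has_password = count_at_least("Password", 1)
has_ccn = count_at_least("Credit Card Number", 1)

both = and_group(has_password, has_ccn)
either = or_group(has_password, has_ccn)
no_password = not_(has_password)

location = {"Password": 2}  # hypothetical counts for one location
print(both(location), either(location), no_password(location))  # False True False
```

Because each group is itself a filter, groups can be nested inside other groups, which mirrors adding a new operator group beneath an existing filter.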

Policy Settings

Sensitive Data Definitions can be enabled and configured using the following policy settings in a policy applied to the endpoint.

Examples

Validating HR data with multiple And/Or operators, a Dictionary, a Regular Expression and multiple AnyFinds

In this example, we create a sensitive data definition that searches for and validates HR Data. A location is considered a match if it contains a Social Security Number OR (a Drivers License AND a match on the ICD9 Dictionary AND a match on the MRN Regular Expression).

  1. First create your Customer ICD9 Dictionary and your MRN Regular Expression and add them as Sensitive Data Types.
  2. Now build the Sensitive Data Definition. To do this, click the Add ribbon button.
  3. Select 'Sensitive Data Definition' from the Data Type drop-down and enter HR Data in the Name field. Click the browse button (the ...) next to the Definition field to open the Edit Sensitive Data Definition dialog.
  4. Click the browse button (the ...) next to the Sensitive Data Types field to open the Select Sensitive Data Types dialog.
  5. The Select Sensitive Data Types dialog contains all of the sensitive data types. Select the Social Security Number AnyFind from the column on the left by left-clicking it, click the green right arrow to add it to the column on the right, and then click OK.
  6. Select the Operator drop-down and select 'Greater Than or Equals'. The field to the right changes to a numerical field.
  7. In the numerical field, click the up arrow until the field displays '1'. At this point the definition means that a location is considered a match if it contains 1 or more Social Security Numbers.
  8. Next, add additional criteria to the sensitive data definition filter so that a location must match either the first condition or a second condition of the definition in order to be considered a match. To do this, click the gray Split button to the right of the filter. A new filter appears below the current filter and an 'And' operator appears above it.
  9. Change the And/Or Operator to 'Or' by clicking the operator.
  10. Add an 'And' Operator by clicking the gray split button to the right of the second filter.
  11. Click the gray split button to the right of the 'And' operator and another filter appears at the bottom. The definition now means that a location is considered a match if it meets the condition of the first filter (it contains 1 or more Social Security Numbers) or all three conditions of the second filter group, which we define next.
  12. In the first filter under the And operator, click the browse button to open the Select Sensitive Data Types dialog, select the 'Drivers License' AnyFind from the column on the left, add it to the column on the right by clicking the right arrow, and then click OK. Change the operator to 'Greater Than or Equals' and in the numerical field click the up arrow until the field displays '1'.
  13. In the second filter under the And operator, click the browse button to open the Select Sensitive Data Types dialog, select the 'ICD9 Data' Dictionary that you added earlier from the column on the left, add it to the column on the right by clicking the right arrow, and then click OK. Change the operator to 'Greater Than or Equals' and in the numerical field click the up arrow until the field displays '1'.
  14. In the third filter under the And operator, click the browse button to open the Select Sensitive Data Types dialog, select the 'MRN' Regular Expression that you added earlier from the column on the left, add it to the column on the right by clicking the right arrow, and then click OK. Change the operator to 'Greater Than or Equals', click the up arrow in the numerical field until the field displays '1', and then click OK.
  15. The Definition field now shows that a definition has been defined. Click OK to save the definition.
  16. The sensitive data definition that we just created appears in the sensitive data types view.


    This Sensitive Data Definition does not have any effect on searches unless it is assigned to a policy. For detailed information on how to do this, please refer to the Sensitive Data Types section of Policies.
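
The logic of the HR Data definition built in this example can be summarized as a boolean expression. The following Python sketch is only a model of that logic, assuming each location is reduced to a mapping from Sensitive Data Type name to unique match count; the function name is_hr_data and the sample counts are hypothetical.

```python
from typing import Dict


def is_hr_data(counts: Dict[str, int]) -> bool:
    """Social Security Number >= 1 OR (Drivers License >= 1 AND ICD9 Data >= 1 AND MRN >= 1)."""

    def at_least_one(data_type: str) -> bool:
        # Mirrors a 'Greater Than or Equals 1' filter on one data type.
        return counts.get(data_type, 0) >= 1

    return at_least_one("Social Security Number") or (
        at_least_one("Drivers License")
        and at_least_one("ICD9 Data")
        and at_least_one("MRN")
    )


# Hypothetical unique match counts for three locations.
print(is_hr_data({"Social Security Number": 2}))                        # True
print(is_hr_data({"Drivers License": 1, "ICD9 Data": 3, "MRN": 1}))     # True
print(is_hr_data({"Drivers License": 1, "ICD9 Data": 3}))               # False
```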

 

Validating Medical Billing Records with an And operator, multiple Regular Expressions, a Dictionary and an AnyFind

In this example, we create a Sensitive Data Definition that searches for Medical Billing records by using a Customer ID Regular Expression, a Pricing Regular Expression, an ICD10 Dictionary, and a Personal Address Sensitive Data Type. A location is considered a match if it contains a match for the Customer ID Regular Expression that is no more than 40 characters away from a Personal Address AND (there is a match on our Pricing Regular Expression AND our ICD10 Dictionary).

  1. First create your Customer ID Regular Expression, your Pricing Regular Expression, and your ICD10 Dictionary, then add them as Sensitive Data Types.
  2. Next configure the Settings\SensitiveDataEngine\NearDistance policy setting to specify the distance for the Near operator. Double-click the setting to open the Edit dialog, change the value to 40, and then click OK to save the setting.
  3. Now build the Sensitive Data Definition. To do this, click the Add ribbon button.
  4. Select 'Sensitive Data Definition' from the Data Type drop-down and enter Medical Billing Records in the Name field. Click the browse button (the ...) next to the Definition field to open the Edit Sensitive Data Definition dialog.
  5. Click the browse button (the ...) next to the Sensitive Data Types field to open the Select Sensitive Data Types dialog.
  6. The Select Sensitive Data Types dialog contains all of the sensitive data types. Select the 'Customer ID' Regular Expression that you added earlier from the column on the left by left-clicking it, click the green right arrow to add it to the column on the right, and then click OK.
  7. Click the Operator drop-down and select 'Near'.
  8. Next select the second sensitive data type that the Near operator tests against. Click the browse button (the ...) in the field on the right to open the Select Sensitive Data Types dialog, select the 'Personal Address' AnyFind from the column on the left by left-clicking it, click the green right arrow to add it to the column on the right, and then click OK.
  9. At this point the definition means that a location is considered a match if there is a Customer ID no more than 40 characters away from a Personal Address. To add additional criteria to the sensitive data definition filter, click the gray Split button to the right of the filter and a new filter appears below the current filter. Leave the And/Or Operator as 'And' because the definition requires both the first condition AND the second condition to be True in order for the location to be considered a match.
  10. Select the Operator drop-down and select 'Require'. The field to the right changes to 'at least this many Data Types'. The 'Require' operator allows us to select multiple sensitive data types.
  11. Click the browse button to open the Select Sensitive Data Types dialog, select the 'Pricing' Regular Expression that you added earlier from the column on the left, and add it to the column on the right by clicking the right arrow. Then select the 'ICD10' Dictionary, add it to the column on the right as well, and click OK.
  12. In the 'at least this many Data Types' field, click the up arrow until the field displays '2' and then click OK.
  13. The Definition field now shows that a definition has been defined. Click OK to save the definition.
  14. The sensitive data definition that we just created appears in the sensitive data types view.


    This Sensitive Data Definition does not have any effect on searches unless it is assigned to a policy. For detailed information on how to do this, please refer to the Sensitive Data Types section of Policies.
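
The Medical Billing Records definition combines a Near test with a Require test, which can be modeled roughly as shown below. This Python sketch rests on several assumptions that the article does not confirm: matches are represented as (start offset, length) pairs, the distance is the number of characters between two matches, and Near is satisfied when any Customer ID match is close enough to any Personal Address match. All function names and the sample offsets are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]           # (start offset, length) of one match
Matches = Dict[str, List[Span]]  # data type name -> matches in one location

NEAR_DISTANCE = 40  # value configured for the NearDistance setting in this example


def gap(a: Span, b: Span) -> int:
    """Characters between two matches (assumed distance measure)."""
    first, second = sorted((a, b))
    return max(0, second[0] - (first[0] + first[1]))


def near(matches: Matches, type_a: str, type_b: str, distance: int) -> bool:
    """True when any pair of matches of the two types is within `distance`."""
    pairs = product(matches.get(type_a, []), matches.get(type_b, []))
    return any(gap(a, b) <= distance for a, b in pairs)


def require(matches: Matches, selected: Sequence[str], minimum: int) -> bool:
    """True when at least `minimum` of the selected types have a match."""
    return sum(1 for name in selected if matches.get(name)) >= minimum


def is_medical_billing_record(matches: Matches) -> bool:
    """NEAR(Customer ID, Personal Address) AND Require(Pricing, ICD10) >= 2."""
    return near(matches, "Customer ID", "Personal Address", NEAR_DISTANCE) and require(
        matches, ["Pricing", "ICD10"], 2
    )


# Hypothetical match offsets for one location.
record = {
    "Customer ID": [(0, 8)],
    "Personal Address": [(30, 25)],
    "Pricing": [(70, 6)],
    "ICD10": [(90, 5)],
}
print(is_medical_billing_record(record))
```

In the sample record the Customer ID ends at offset 8 and the address starts at offset 30, so the two are 22 characters apart and the Near condition holds; removing either the Pricing or the ICD10 match would make the Require condition fail.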