In this blog post, we look at the methods Microsoft 365 offers for data classification.
Identifying and classifying the sensitive items under your organization's control is the first step in the Information Protection process. Microsoft 365 provides three ways of identifying items so that they can be classified:
1) Manually by users
This method requires human judgement and action. An admin can either use the pre-existing labels and sensitive information types or create their own, and then publish them. Users and admins then apply them to content as they encounter or create it. Once content is labeled, you can protect and manage it.
2) Automated pattern recognition, like sensitive information types
This category of classification mechanisms includes finding content by:
- Keywords or metadata values (Keyword Query Language).
- Using previously identified patterns of sensitive information, such as Social Security, credit card, or bank account numbers (sensitive information type entity definitions).
- Recognizing an item because it's a variation on a template (document fingerprinting). Currently, this detection method applies to Exchange Online only.
- Using the presence of exact strings (exact data match, or EDM). Note that EDM is available only with the following licenses: Office 365 E5, Microsoft 365 E5, Microsoft 365 E5 Compliance, and Microsoft 365 E5/A5 Information Protection and Governance.
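To make the pattern-based approach more concrete, here is a minimal Python sketch of how a detector for credit card numbers might combine a pattern, a checksum, and supporting keywords. It is only an illustration of the idea, not how Microsoft 365 implements sensitive information types: the regular expression, the keyword list, the 50-character window, and the "high"/"low" confidence labels are all assumptions made for this example.

```python
import re

# Assumed pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")
# Assumed supporting keywords that raise confidence when found near a match.
KEYWORDS = ("credit card", "card number", "visa", "mastercard")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str, window: int = 50):
    """Return (number, confidence) pairs for likely card numbers in text."""
    results = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(digits):
            continue  # looks like a card number but fails the checksum
        start = max(0, match.start() - window)
        nearby = lowered[start:match.end() + window]
        confidence = "high" if any(k in nearby for k in KEYWORDS) else "low"
        results.append((digits, confidence))
    return results


# 4111 1111 1111 1111 is a widely used test card number.
print(find_card_numbers("Card number: 4111 1111 1111 1111 expires soon"))
```

The checksum step filters out digit runs that merely look like card numbers, and the keyword check separates matches with supporting evidence from those without it.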
3) Machine learning or Classifiers
This method trains a classifier to recognize a type of content from examples, rather than matching fixed patterns. When you publish the classifier, it sorts through items in locations like SharePoint Online, Exchange, and OneDrive, and classifies the content. After you publish the classifier, you can continue to train it using a feedback process similar to the initial training process.
Classifiers work only with items that are unencrypted and written in English.
Microsoft 365 comes with pre-trained classifiers, including:
- Offensive language
- Source Code
When the pre-trained classifiers do not meet your needs, you can create and train your own. Creating your own involves significantly more work, but the result is much better tailored to your organization's needs.
For example, you could create trainable Classifiers for:
- Legal documents: such as attorney-client privilege, closing sets, statements of work.
- Strategic business documents: like press releases, merger and acquisition deals, business or marketing plans, intellectual property, patents, design docs.
- Pricing information: like invoices, price quotes, work orders, bidding documents.
- Financial information: such as organizational investments, quarterly or annual results.
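To give a feel for what training a classifier from examples and then refining it with feedback means, here is a small Python sketch. It uses a naive Bayes model over word counts, which is a stand-in chosen for brevity and not the algorithm Microsoft 365 uses. The class name `TinyTextClassifier`, the labels, and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class TinyTextClassifier:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of examples
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def train(self, text, label):
        """Add one labeled example; the same call serves for later feedback."""
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the most likely label, or None if nothing has been trained."""
        words = self.tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Initial training with hypothetical sample documents.
clf = TinyTextClassifier()
clf.train("invoice total amount due price quote", "pricing")
clf.train("price quote for work order and bidding", "pricing")
clf.train("team lunch schedule and meeting notes", "other")
clf.train("holiday party meeting agenda", "other")
print(clf.predict("please review the attached price quote"))

# Feedback after publishing: a reviewer's correction is one more training example.
clf.train("bidding documents and work orders attached", "pricing")
```

A real trainable classifier needs far more examples than this, but the loop has the same shape: seed with labeled samples, check predictions, and feed corrections back in.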