Business Rules
Matching Algorithms
The PPR uses deterministic and probabilistic algorithms to link distinct records for a given Provider Person or Location, received from different sources, under a single PPR-assigned Enterprise Provider Identifier (EPID). PPR also uses both deterministic and probabilistic algorithms to identify candidates that match the criteria provided by the PPR consumer in PPR queries. Deterministic algorithms are generally used for querying by identifier. The probabilistic algorithm is generally used to identify candidates where the PPR consumer has provided demographic data as the query criteria (e.g. Name, Address, Role).
Overview
The PPR linking and matching algorithm:
Optimizes demographic and identifier data through data normalization for statistical comparison
Finds all potential matches using threshold analysis (e.g. uses real data to define weights associated with the attributes and values)
Scores via deterministic and probabilistic algorithm (refer to the sub-sections below)
Returns the candidates in descending score order (e.g. highest to lowest)
Deterministic Scoring
Deterministic scoring takes into account the presence or absence of a value to derive a matching score against the PPR. Holistically, it represents an ‘all or nothing’ static score based on the value in question matching the value in PPR exactly.
PPR performs ‘exact match’ deterministic scoring (for linking records on add/update and identifying candidates on queries) on the following attributes (which, for matching, must be consistently supplied across different sources for the same provider to be used in the linking or matching score):
UPI (Person & Location)
License Number (Person)
Stakeholder Number (Person & Location)
Laboratory Services License Number (Location)
Pharmacy Accreditation Number (Location)
Facility Number (Location)
Master Number (Location)
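The ‘all or nothing’ behaviour described above can be sketched as follows. This is a minimal illustration rather than the PPR implementation: the function name and the static weight `EXACT_MATCH_SCORE` are hypothetical, since this section does not publish the configured weights.

```python
from typing import Optional

# Hypothetical static weight; the configured PPR weights are not given here.
EXACT_MATCH_SCORE = 10.0


def deterministic_score(query_value: Optional[str], stored_value: Optional[str]) -> float:
    """Return the full static weight on an exact match, otherwise zero."""
    if not query_value or not stored_value:
        # An absent value on either side contributes nothing to the score.
        return 0.0
    return EXACT_MATCH_SCORE if query_value == stored_value else 0.0
```

Under these assumptions, a License Number that differs by a single character scores the same as one that is missing entirely, which is why the identifier must be supplied consistently across sources.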
Probabilistic Scoring
PPR performs probabilistic scoring (often referred to as ‘fuzzy’ matching) for matching candidates on queries. The following string value attributes are used for probabilistic scoring and are expected to be supplied consistently across all source records for a provider in order to be used in the linking or matching score. The engine normalizes the data for comparison and scoring (e.g. dashes, area codes in phone numbers, and special characters in names are stripped for comparative purposes):
Name
1. Person - FirstName, MiddleName, LastName, AliasFirstName, AliasMiddleName, AliasLastName
2. Location - LegalName, Location Known Name
Address (Person & Location) including:
1. Street Address
2. Municipality (Person & Location)
Telephone (Person & Location)
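The normalization step mentioned above can be sketched as follows. This is an assumption-laden illustration, not the engine's actual rules: the function names are hypothetical, names are upper-cased for comparison, and phone numbers are assumed to end in a 7-digit local number with no extension.

```python
import re


def normalize_name(name: str) -> str:
    """Drop punctuation, collapse whitespace, and upper-case a name."""
    without_punctuation = re.sub(r"[^\w\s]", "", name)
    return " ".join(without_punctuation.split()).upper()


def normalize_phone(phone: str) -> str:
    """Keep digits only, then keep the trailing 7-digit local number.

    Assumes the input carries no extension; area and country codes
    are discarded so that only the local number is compared.
    """
    digits = re.sub(r"\D", "", phone)
    return digits[-7:]
```

With these helpers, `(416) 538-3937` and `1-416-538-3937` reduce to the same comparison value, so formatting differences between sources do not affect the score.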
Probabilistic scoring takes into consideration the frequency of occurrence of a data value within the distribution of an attribute when matching against the existing data in PPR. Holistically, it represents a variable match score based on the value in question matching the value in PPR to a measured degree, as per the following example criteria and thresholds:
Table - Probabilistic Scoring
Concepts | Description |
---|---|
Frequency | Scores based on the frequency of an attribute value and not just whether there is a match (e.g. “John” would score less than “Heinrich” because John is a more common name in our jurisdiction). |
Phonetics | Indexes words by sound/pronunciation to identify those that are similar phonetically (e.g. Purdey, Purdie, Purdy) |
Equivalence (Nicknames) | Evaluates based on equivalency (e.g. for names; Liz, Libby, Beth = Elizabeth, for municipalities; Agincourt = Scarborough) |
Field Transposition | Accounts for errors where name (First/Middle/Last) or address (StLine1/Stline2) sub-elements are transposed into the incorrect fields (e.g. Last + First vs. First + Last) |
Edit Distance | Accounts for typos due to character transposition at the attribute value level (e.g. “Micheal”) |
Historical Values | Performs queries against now inactive source data (e.g. Name, Address, Phone) to match on potential candidates based on historical data (e.g. name change as a result of marriage) |
False Positive Filters | Prevents improper linkages from occurring when certain conditions are met (e.g. father & son) |
Trusted Source | Identifies the regulatory colleges as trusted sources that contain authoritative source data attributes |
OR vs. AND Conditions | In general, PPR treats additional criteria as “OR” conditions rather than “AND” conditions, as the goal of the MDM algorithm is to find a suitable match. Additional attributes provided in a query cast a wider search net, with the highest scoring candidates returned at the top of the candidate list (i.e. in descending order). Refer to the Minimum Score and Maximum Results sections for additional context. |
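The Frequency concept in the table above can be illustrated with a small sketch. The negative log of relative frequency used here is a common textbook weighting and is an assumption, not the formula PPR is confirmed to use; the function name and the sample counts are invented for illustration.

```python
import math
from typing import Mapping


def frequency_weight(value: str, counts: Mapping[str, int]) -> float:
    """Return a weight that grows as the value becomes rarer."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    # Treat unseen values as if they had been observed once (an assumption).
    observed = max(counts.get(value, 0), 1)
    return -math.log2(observed / total)


# Invented sample distribution of first names, for illustration only.
SAMPLE_COUNTS = {"JOHN": 900, "MARY": 850, "HEINRICH": 3}
```

Under this weighting, a match on “HEINRICH” contributes more than a match on “JOHN”, mirroring the example in the Frequency row.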
For candidate search queries, PPR totals the deterministic and probabilistic search scores together to generate the candidate confidence score. Note that negative scores may be configured for certain elements. Based on the degree of mismatch for the value, the negative score is applied to reduce the overall match score.
Refer to the Record Linking section for scoring scenarios.
Record Linking
PPR uses its deterministic and probabilistic scoring algorithm and record linking thresholds to determine whether a contributed record should be linked to an existing entity (and assigned that entity’s EPID) or not.
The question “Does this record match an existing entity?” can be answered with Yes, No or Maybe. In MDM, a defined range of scores identifies whether the answer should be Yes, No or Maybe. These ranges are bounded by two thresholds, “Clerical Review” and “Auto-Link”, and the answer is determined as follows:
No = Match Score < Clerical Review Threshold
Maybe = Clerical Review Threshold <= Match Score < Auto-Link Threshold
Yes = Match Score >= Auto-Link Threshold
These scenarios are dictated by a linkage threshold scoring range for the member record’s total match score (deterministic + probabilistic scores) when compared against all the other member records in the PPR. Scoring is attached for every record added or updated in the PPR. Based on the match score, one of the following scenarios will occur:
No Linkage (No Match): Leaves a record in its existing singleton entity based on a ‘low’ matching score below the “Clerical Review” threshold. The demographics on the record are ‘thin’ or ‘sufficiently different’ enough that PPR cannot auto-link or potentially link it to an existing person entity.
Potential Linkage (Maybe Match): Flags a record by creating a “Potential Linkage” task based on a ‘medium’ matching score at or above the “Clerical Review” threshold and below the “Auto-Link” threshold. The record in question retains its current entity ID value, either with other linked records or on its own as a singleton, until the Potential Linkage task is manually resolved by a Data Integrity analyst. The Data Integrity analyst will either link the record to another existing person entity or leave it as-is, based on a defined investigation process to determine whether the two records are in fact for the same provider.
Automatic Linkage (Yes Match): Automatically links a record to an existing entity based on a ‘high’ matching score, i.e. at or above the “Auto-Link” threshold.
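The three outcomes above can be expressed as a small classification function. This is a sketch only: the names are hypothetical and the threshold values are passed in as parameters because the configured PPR thresholds are not stated in this section.

```python
from enum import Enum


class LinkDecision(Enum):
    NO_LINKAGE = "No"
    POTENTIAL_LINKAGE = "Maybe"
    AUTOMATIC_LINKAGE = "Yes"


def classify_match(
    match_score: float,
    clerical_review_threshold: float,
    auto_link_threshold: float,
) -> LinkDecision:
    """Map a total match score onto the No / Maybe / Yes ranges."""
    if clerical_review_threshold > auto_link_threshold:
        raise ValueError("Clerical Review threshold must not exceed Auto-Link threshold")
    if match_score >= auto_link_threshold:
        return LinkDecision.AUTOMATIC_LINKAGE
    if match_score >= clerical_review_threshold:
        return LinkDecision.POTENTIAL_LINKAGE
    return LinkDecision.NO_LINKAGE
```

For example, with illustrative thresholds of 8.0 and 12.0, a score of 9.5 would raise a Potential Linkage task, while a score of exactly 12.0 would auto-link.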
Supported Queries
The PPR uses the following rules to rank possible matches based on query parameters received from consumers within the PPR.
Search Provider Person
PPR supports searching for Practitioner candidates using Practitioner demographics as defined in Table 11 below. In each case:
• A maximum of 25 search results (candidate practitioners) will be returned for a single Practitioner EMPI Match request
Typically, consumers with limited demographic information (e.g. name and city) available will:
Submit a Practitioner EMPI Match to return a list of potential matching Practitioners from the PPR.
Use a Unique Provider Identifier (UPI) or another identifier (e.g. Licence Number) to submit a Practitioner Search to retrieve the full details for the desired Practitioner.
Note that query criteria values provided in the message are ‘ORed’ together by the PPR and scored by the matching algorithm defined above to identify the list of candidate Practitioners.
Table-Search Provider Person Query Criteria
| FHIR Operation | Query Criteria | Use |
|---|---|---|
| Practitioner EMPI Match – Option 1 Name & Municipality Search | Provider Last Name | Mandatory |
| | Provider First Name | Optional |
| | Profession code | Optional |
| | Profession Status | Optional |
| | Specialty /Sub specialty code | Optional |
| | Address (Municipality) | Mandatory |
| | Provider Language | Optional |
| Practitioner EMPI Match – Option 2 Profession & Municipality Search | Profession code | Conditional: Profession Must be provided at a minimum, however both Profession and Specialty MAY be provided |
| | Profession Status | Optional |
| | Specialty /Sub specialty code | Conditional: Profession Must be provided at a minimum, however both Profession and Specialty MAY be provided |
| | Gender | Optional |
| | Address (Municipality) | Mandatory |
| | Provider Language Code | Optional |
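A consumer may wish to check the mandatory criteria in the table above before submitting a Practitioner EMPI Match. The sketch below is one way to do that; the dictionary keys (`last_name`, `municipality`, `profession_code`) are hypothetical local names, not FHIR search parameter names.

```python
from typing import Mapping, Optional


def practitioner_match_option(criteria: Mapping[str, str]) -> Optional[int]:
    """Return 1 or 2 for the search option the criteria satisfy, else None."""
    if not criteria.get("municipality"):
        # Municipality is mandatory for both options.
        return None
    if criteria.get("last_name"):
        return 1  # Option 1: Name & Municipality
    if criteria.get("profession_code"):
        return 2  # Option 2: Profession & Municipality
    return None
```

A return value of `None` signals that the request does not meet the minimum criteria for either option and should not be sent.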
Get Provider Person Details
PPR supports retrieving the details of a specific practitioner by a defined Practitioner ID as defined in Table 12 below. If the Practitioner UPI/Practitioner Identifier (e.g. Licence Number) in question is not available in the local consumer system, the consumer can execute a Practitioner EMPI Match to query on various criteria to return the UPI/Practitioner Identifier of the provider.
If the UPI/Practitioner Identifier is known, the consumer system can directly execute any of the queries below:
Table – Get Provider Person Details Query Criteria
| FHIR Operation | Query Criteria | Use |
|---|---|---|
| Practitioner Search – Option 1 UPI | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | Unique Identifier Value (i.e. UPI) | Mandatory |
| Practitioner Search – Option 2 Licence Number | | |
| | ID Issuer (i.e. Regulatory College) | Mandatory |
| | Provider Person Primary Identifier Value (i.e. License Number) | Mandatory |
| Practitioner Search – Option 3 EPID | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | EPID Value | Mandatory |
| Practitioner Read | ID Issuer (i.e. Ontario Health) | Mandatory |
| | EPID Value | Mandatory |
Search Provider Location
PPR supports searching for provider location candidates using location demographic information as defined in the table below. In each case:
• A maximum of 100 search results (candidate providers) will be returned for Location EMPI Match
Typically, consumers with limited information available will:
Submit a Location EMPI Match query to return a list of organizations/locations from the PPR.
Use a Unique Provider Identifier (UPI) from the search results to submit a Location Search query to retrieve the details of the desired location.
Note that query criteria values provided in the message are ‘ORed’ together and scored by the PPR as per the algorithm definition above to identify and rank the candidate providers.
Table - Search Provider Location Query Criteria
| Search Option | Query Criteria | Use |
|---|---|---|
| Location EMPI Match – Option 1 Name & Address (Municipality) | | |
| | Location Name | Mandatory |
| | Location Address (Municipality) | Mandatory |
| | Location Postal Code | Optional |
| | Location Status | Optional |
| | Location Phone | Optional |
| | Location Fax | Optional |
| | Location Address (lines) | Optional |
| Location EMPI Match – Option 2 Role & Address (Municipality) Search | | |
| | Location Role | Mandatory |
| | Location Name | Optional |
| | Location Address (Municipality) | Mandatory |
| | Location Address (Postal Code) | Optional |
| | Location Status | Optional |
| | Location Phone | Optional |
| | Location Fax | Optional |
| | Location Address | Optional |
| Location EMPI Match – Option 3 LHIN Code & Role Search | | |
| | LHIN Code | Mandatory |
| | Location Role | Mandatory |
| | Location Status | Optional |
Get Provider Location Details
PPR supports retrieving a specific Location as defined in the table below. If the Location UPI/Location Identifier in question is not available in the local consumer system, the consumer can execute a Provider Location EMPI Match query on various criteria to return the UPI/Provider Location Identifier of the provider.
If the UPI or one of the Location Identifiers listed below is available, the consumer system can execute the Location Search using any of the following criteria:
Table - Get Provider Location Details Query Criteria
| FHIR Operation | Query Criteria | Use |
|---|---|---|
| Location Search – Option 1 UPI | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | Provider Location Primary Identifier Value (i.e. UPI) | Mandatory |
| Location Search – Option 2 Laboratory Service License Number | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | Provider Location Primary Identifier Value (i.e. Laboratory Services License Number) | Mandatory |
| Location Search – Option 3 Pharmacy Accreditation Number | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | Provider Location Primary Identifier Value (i.e. Pharmacy Accreditation Number) | Mandatory |
| Location Search – Option 4 EPID | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | EPID Value | Mandatory |
| Location Read | | |
| | ID Issuer (i.e. Ontario Health) | Mandatory |
| | EPID Value | Mandatory |
Practitioner Queries
Overview
Consuming systems can query for Practitioners in the PPR via:
Get Queries: PPR EPID or Definitional Identifier (e.g. License Number) to ‘get’ the current demographic information and IDs associated with the provider.
Search Queries: Demographics (e.g. Last Name + Municipality) to ‘search’ for candidate providers, returning their current demographics and ID list for each candidate (to support selection of the appropriate candidate from a ‘pick list’).
The following sections detail PPR get vs. search queries based on input criteria and output results. Note that output data in the PPR is dependent on the data density for a given provider as contributed across all sources of practitioner information at the attribute level.
Get Query
Retrieve all demographics and identifiers for a given provider by definitional identifier.
Table - Rules: Get Query
Summary | Get by Provider ID (Definitional ID or EPID) to retrieve all demographics and IDs for a given provider. |
---|---|
Operation | HL7 FHIR – Practitioner Read (by EPID) or; HL7 FHIR – Practitioner Search (by Primary Identifier UPI or EPID) |
Input (Criteria) | A primary ID Issuer & corresponding ID value must be provided in the request. |
For Practitioner Read: | |
1. PPR Enterprise Provider ID (EPID) | |
For Practitioner Search: | |
1. PPR Enterprise Provider ID (EPID), or 2. Authoritative Source Identifier (e.g. College Source URI+ License Number) or 3. PPR Unique Provider Identifier (eHealth Source URI + UPI) |
|
Note: If a given adopter only has demographic information available for the provider, they must perform a Practitioner EMPI Match. | |
Output (Results) | Summary: |
- Based on exact matching the PPR EPID or Definitional ID or UPI for a given active or merged member record, pull the entity view for a given provider. | |
- Response includes the entity view for that provider when they have at least one active member record in the entity. | |
- Returns the complete set of provincial demographics and list of identifiers as the ‘golden record’ for a given provider. | |
Details: | |
Returns the following list of provider person attributes: | |
PPR EPID | |
Unique Provider Identifier (UPI) | |
License Number | |
Stakeholder Number | |
Official Name | |
Alias Name | |
Gender | |
Communication Language | |
Telephone | |
Fax | |
Email Address | |
Web Address | |
Profession Classification Type | |
Profession Code | |
Profession Start Date | |
License Effective Date | |
Profession Active Indicator | |
Specialty / Sub-Specialty Code | |
Specialty / Sub-Specialty Start Date | |
Address | |
LHIN Number | |
LHIN Name | |
Telephone | |
Fax | |
Affiliation Classification Code | |
Affiliation Type Code | |
Affiliation Unique Provider Identifier (UPI) | |
Training Type Code | |
Training Institution Name |
Search Query
Retrieves a summary of provincial demographics and identifiers for a list of candidates that match the provided demographic data.
Table - Rules: Search Query
Summary | Search by demographics to retrieve practitioner demographics based on the search criteria provided. Provides the Practitioner profiles for up to 25 matched candidates for the consumer to present to an end user. |
---|---|
Operation | HL7 FHIR – Practitioner EMPI Match |
Input (Criteria) | Search by Demographics (minimum criteria): |
1. Last Name + Municipality, or: | |
2. Profession + Municipality | |
Note: If a given adopter has a definitional identifier available for the provider person (e.g. License Number), they must perform a Practitioner Search to get the desired information in a single call. | |
Output (Results) | Summary: |
• Based on probabilistic matching of the input parameters for a given active or merged member record, pull the composite view for a given provider | |
• Response includes the composite view for that provider when they have at least one active member record in the entity. | |
• Returns the complete set of demographics and identifiers as the ‘golden record’ for a given provider. | |
Details: | |
Returns the following list of provider person attributes: | |
PPR EPID | |
Unique Provider Identifier (UPI) | |
License Number | |
Stakeholder Number | |
Official Name | |
Alias Name | |
Gender | |
Communication Language | |
Telephone (s) | |
Fax (es) | |
Email Address (es) | |
Web Address | |
Profession Classification Type | |
Profession Code | |
Profession Start Date | |
License Effective Date | |
Profession Active Indicator | |
Specialty / Sub-Specialty Code | |
Specialty / Sub-Specialty Start Date | |
Address(es) | |
LHIN Number | |
LHIN Name | |
Telephone | |
Fax | |
Affiliation Classification Code | |
Affiliation Type Code | |
Affiliation Unique Provider Identifier (UPI) | |
Training Type Code | |
Training Institution Name |
Location Queries
Overview
Consuming systems can query for provider Locations in the PPR via:
- Get Queries: PPR EPID or Definitional Identifier (e.g. Pharmacy Accreditation Number) to ‘get’ the current Location details and IDs
- Search Queries: Demographics (e.g. Location Name (legal or Known) + Municipality) to ‘search’ for candidate Locations, returning their current Location details and ID list for each matched candidate (i.e. to support selection of the appropriate candidate from a ‘pick list’).
The following sections detail PPR Organization/location get vs. search queries based on input criteria and output results. Note that output data in the PPR is dependent on the data density for a given organization as contributed across sources at the attribute level.
Get Query
Retrieve all Location details and identifiers for a given Location by definitional identifier.
Summary | Get by Provider ID (EPID or Definitional ID) to retrieve all provincial demographics and IDs for a given provider. |
---|---|
Operation | HL7 FHIR – Location Read (by EPID), or; HL7 FHIR – Location Search (by Primary Identifier, UPI or EPID) |
Input (Criteria) | A primary ID Issuer & corresponding ID value must be provided in the request. For Location Read: 1. PPR Enterprise Provider ID (EPID). For Location Search: 1. PPR Enterprise Provider ID (EPID), or 2. Authoritative Source Identifier (e.g. OCP Source URI + Pharmacy Accreditation Number), or 3. PPR Unique Provider Identifier (eHealth Source URI + UPI). Note: If a given adopter only has demographic information available for the Location, they must perform a Location EMPI Match. |
Output (Results) | Summary: |
• Based on exact matching the PPR EPID or Definitional ID or UPI for a given active or merged member record, pull the entity view for a given provider. | |
• Response includes the entity view for that provider when they have at least one active member record in the entity. | |
• Returns the complete set of demographics and list of identifiers as the ‘golden record’ for a given provider. | |
Details: | |
Returns the following list of provider Location attributes: | |
PPR EPID | |
Unique Provider Identifier (UPI) | |
Primary Identifiers (e.g. Stakeholder Number, Accreditation Number for Pharmacies) | |
Facility Number (FCN) | |
Master Number (MNI) | |
Location Legal Name | |
Location Common Name | |
Communication Language | |
Location Abbreviated Name | |
Location Type Code | |
Location Active Indicator | |
Location Operational Status Code | |
Location Operational Status Reason Code | |
Location Address | |
Location Telephone | |
Location Fax | |
Location Email Address | |
Location Web Address | |
Site ID | |
Address | |
LHIN Number | |
LHIN Name | |
Telephone | |
Fax | |
Email Address | |
Web Address | |
Affiliation Classification Code | |
Affiliation Type Code | |
Affiliation Unique Provider Identifier (UPI) |
Search Query
Retrieves Location details and identifiers for a list of candidates that match the provided search criteria.
Table - Rules: Search Query
Summary | Search to retrieve Location details and ID list for up to 100 matched candidates to present to a user to select a match. |
---|---|
Operation | HL7 FHIR – Location EMPI Match |
Input (Criteria) | Search by Demographics (minimum criteria): |
Location Name (legal or Known) + Municipality or | |
Role + Municipality or LHIN + Role | |
Note: If a given adopter has a definitional identifier available for the provider Location (e.g. UPI or OCP Accreditation Number), they must perform a Location Search to get the desired information in a single call. | |
Output (Results) | Summary: |
Based on probabilistic matching of the input parameters for a given active or merged member record, pull the Composite view for a given provider. | |
Response includes the entity view for that provider when they have at least one active member record in the entity. | |
Returns the complete set of Location details and identifiers as the ‘golden record’ for a given provider. | |
Details: | |
Returns the following list of provider Location attributes: | |
PPR EPID | |
Unique Provider Identifier (UPI) | |
Primary Identifiers (e.g. Stakeholder Number, Accreditation Number for Pharmacies) | |
Facility Number (FCN) | |
Master Number (MNI) | |
Location Legal Name | |
Location Common Name | |
Location Abbreviated Name | |
Location Type Code | |
Location Active Indicator | |
Location Operational Status Code | |
Location Operational Status Reason Code | |
Location Address | |
Location Telephone | |
Location Fax | |
Location Email Address | |
Location Web Address | |
Site ID | |
Address (es) | |
LHIN Number | |
LHIN Name | |
Telephone | |
Fax | |
Email Address | |
Web Address | |
Affiliation Classification Code | |
Affiliation Type Code | |
Affiliation Unique Provider Identifier (UPI) |
Data Consumption
Agreements
PPR data consumption is bound to the following, as per direction from Ontario Health Privacy:
The following attributes will be masked for consumption for specific roles due to personal information:
Death Indicator and Death Date.
Role Status
License restrictions
Mailing address
Confidential addresses and contact information.
Definition of these attributes in the specification is with the understanding that the values will be masked in PPR query responses.
Enterprise Provider ID (EPID)
In addition to the business attributes scoped per the agreement language above, PPR EPID must not be displayed on user/clinician/provider facing user interface screens or reports.
PPR adopters MUST also consider EPID as a point-in-time value to avoid unexpected results due to dynamic PPR record linking. The EPID must only be used to resolve a provider as part of a ‘real-time’ integration (e.g. a single session or a get query immediately following selection of a search result). Historical (static) EPID values must not be used for future integration, given they may change.
EPID values should not be persisted outside of the PPR for any reason.
Minimum Search Criteria
Minimum search criteria are implemented to provide optimum performance for PPR query response times.
Maximum Results
Maximum results are enforced to provide optimum performance for PPR query response times. A maximum of 25 persons or 100 Locations can be returned in a given search.
Minimum Score
PPR is currently configured to return candidates matched on any positive score (i.e. greater than zero).
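Taken together, the Maximum Results and Minimum Score rules amount to a filter, a sort and a cap. The sketch below shows that shape under stated assumptions: the `Candidate` type, its field names and the `rank_candidates` function are hypothetical, and the total score is taken as the sum of deterministic and probabilistic scores as described in the Matching Algorithms section.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Caps stated in the Maximum Results section.
MAX_RESULTS = {"person": 25, "location": 100}


@dataclass(frozen=True)
class Candidate:
    candidate_id: str      # hypothetical local handle for the candidate
    deterministic: float   # deterministic score component
    probabilistic: float   # probabilistic score component (may be negative)

    @property
    def score(self) -> float:
        return self.deterministic + self.probabilistic


def rank_candidates(candidates: Sequence[Candidate], kind: str) -> List[Candidate]:
    """Keep positive scores, order highest first, and apply the result cap."""
    positive = [c for c in candidates if c.score > 0]
    ranked = sorted(positive, key=lambda c: c.score, reverse=True)
    return ranked[: MAX_RESULTS[kind]]
```

Because any positive score qualifies, the cap rather than the score is usually what limits a broad search, so consumers may want to supply additional criteria to push the intended provider toward the top of the list.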
Merged Records
Performing a get query on a definitional identifier (e.g. get by License Number) for a record merged in the PPR will return the surviving provider. Note that the surviving provider details may differ from the query criteria used to match on the merged record.
Phone Number Convention
Given the PPR linking and matching algorithm does not score on special characters (e.g. dashes) or spaces, and only scores on the digits in the local number, PPR currently does not enforce a specific phone number convention or format at the product level. As a result, and depending on the source provider data, phone numbers in PPR are split across database elements, or combined within, per the illustrative examples below:
Table - Phone Number Examples
Provider Identification Segment - Phone Elements | Example #1 | Example #2 | Example #3 |
---|---|---|---|
Phone Number | |||
Telephone Number | |||
Telecom Use Code | PRN | WPN | WPN |
Telecomm Type | Fax | Phone | Phone |
Email Address | |||
Country Code | 1 | ||
Area/City Code | 416 | ||
Local Number | 5555555 | (416)538-3937 | 14165383937 Ex: 4873 |
Extension | 1234 | Ext4873 |
PPR consumers must therefore implement logic to reconstruct and/or reformat the phone number across all the PPR returned phone elements. For example, a consumer must consolidate Country Code + Area Code + Local Number + Extension, or leverage the consolidated phone number field for FHIR consumption, and apply local rules accordingly. The logic to reassemble and/or format the phone number to assign in the local system will differ across adopters and their corresponding data standards. For example, for systems that only store digits without formatting, the rule would be to remove any non-numeric formatting characters from the concatenated phone string as only the phone digits will be persisted in the local system.
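As an illustration of the reconstruction logic described above, the following sketch consolidates the phone elements into a digits-only number and a separate extension. It is one possible local rule, not a PPR requirement: the function names and the extension markers it recognises (`ext`, `ex`, `x`) are assumptions based on the three examples in the table.

```python
import re
from typing import Tuple

# Matches a trailing extension marker embedded in the local number, e.g. "Ex: 4873".
_EMBEDDED_EXTENSION = re.compile(r"\s*(?:ext|ex|x)\.?\s*:?\s*(\d+)\s*$", re.IGNORECASE)


def digits_only(text: str) -> str:
    """Remove every non-digit character."""
    return re.sub(r"\D", "", text or "")


def consolidate_phone(
    country: str = "", area: str = "", local: str = "", extension: str = ""
) -> Tuple[str, str]:
    """Return (number_digits, extension_digits) from the PPR phone elements."""
    embedded = _EMBEDDED_EXTENSION.search(local)
    if embedded:
        # Prefer an explicit Extension element; fall back to the embedded one.
        extension = extension or embedded.group(1)
        local = local[: embedded.start()]
    number = digits_only(country) + digits_only(area) + digits_only(local)
    return number, digits_only(extension)
```

Applied to the three examples in the table, this yields `("14165555555", "1234")`, `("4165383937", "4873")` and `("14165383937", "4873")` respectively. Note that Example #2 carries no country code, so any rule that requires one would have to be added locally.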
Address Convention
Provider Identification Segment - Address Elements | Example #1 | Example #2 | Example #3 |
---|---|---|---|
Street Address (aka Street Line1) | 777 Bay Street | 777 BAY ST. | 777 Bay Street |
Other Designation (aka Street Line2) | 3rd Floor | 3RD FLOOR | 3rd Floor |
City | Toronto | TORONTO | Singapore |
State or Province | ON | ON | |
Zip or Postal Code | M5W 7E6 | M5W7E6 | 270021 |
Country | CAN | CAN | SGP |
Address Type | P | P | P |
PPR consumers must therefore implement logic to reconstruct and/or reformat addresses across all the PPR returned address elements. The logic to reassemble and/or format addresses to assign in the local system will differ across adopters and their corresponding data standards. For example, a consumer requiring no space between Canadian postal code forward sortation area (e.g. M5W) and local delivery unit (e.g. 7E6) must strip out any spaces returned by PPR in the postal code element. Note that PPR currently only maps standardized state/province codes for Canada and the USA. International addresses may return null values for State/Province.
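The postal code example above could be handled with a helper along these lines. This is a sketch for a consumer that stores Canadian postal codes without a space; the function name is hypothetical, the `CAN` country value follows the examples in the table, and non-Canadian codes are deliberately returned unchanged.

```python
import re
from typing import Optional


def compact_postal_code(postal_code: Optional[str], country: Optional[str]) -> Optional[str]:
    """Strip whitespace from Canadian postal codes; leave others untouched."""
    if postal_code is None:
        return None
    if country == "CAN":
        # Remove the space between the forward sortation area and the
        # local delivery unit, and upper-case for consistency.
        return re.sub(r"\s+", "", postal_code).upper()
    return postal_code
```

Both `M5W 7E6` and `M5W7E6` reduce to `M5W7E6`, while the Singapore code `270021` passes through as returned by PPR.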
‘User-at-the-Keyboard’ Identification
To support privacy inquiries into the disclosure of provider PI, the End User’s user name or mnemonic ID MUST be included in the PPR query message (via SAML) to identify the user who initiated the query. This is only required when a query is initiated by an actual user, as opposed to a system-to-system interaction (i.e. no disclosure to an individual user). Depending on the PPR interface implementation, this MUST be satisfied at the transport and/or functional message level(s).