Description

This algorithm merges all people in an NSF database who have the same name, ignoring case and punctuation.

Menu path

Data Preparation > Database > NSF > Merge Identical NSF People

Outputs
  • A database where the identified identical authors have been merged.
  • The merging table used to merge the identical authors. It can be edited and passed to Merge Entities to rerun the merge manually, for example to correct errors.
  • A Merge Report as a text file. It gives a simple description of all the people who were merged, identified by their FORMATTED_FULL_NAME.
Implementation Details

The merging is performed as described in Merge Entities. The algorithm merges on the FORMATTED_FULL_NAME in the PERSON (NSF) table, which is the fully formatted form of a person's name (last name, first initial, and possibly middle initial).

To identify identical entities, the algorithm compares the "normalized" values of FORMATTED_FULL_NAME. Here, "normalized" means that the value has been converted to lower case and every character that is not a digit, a letter, or a single space has been removed. If two normalized values are the same, the people being compared are assumed to be identical.

There is some limited sanity checking: the algorithm looks for merge groups in which the people being merged were authors on the same award, on the assumption that this case should not occur. If it finds such a merge group, it warns the user on the console, reporting each person's ORIGINAL_INPUT_NAME and primary key and the award's AWARD_NUMBER and primary key, but it still continues with the merge.
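
The normalization, grouping, and same-award check described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the algorithm's actual source: the function names (`normalize`, `group_identical`, `shared_awards`) and the in-memory data shapes are hypothetical, and it assumes that "a single space" means ordinary space characters are kept as they are.

```python
from collections import defaultdict


def normalize(name):
    """Lower-case the name and keep only letters, decimal digits, and spaces.

    Assumption: space characters are kept unchanged; all other characters
    (punctuation, tabs, and so on) are dropped.
    """
    return "".join(
        ch for ch in name.lower()
        if ch.isalpha() or ch.isdecimal() or ch == " "
    )


def group_identical(people):
    """Group person keys by normalized FORMATTED_FULL_NAME.

    `people` is a hypothetical iterable of (primary_key, formatted_full_name).
    Only groups with more than one member are returned, since a group of one
    has nothing to merge.
    """
    groups = defaultdict(list)
    for primary_key, formatted_full_name in people:
        groups[normalize(formatted_full_name)].append(primary_key)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def shared_awards(person_ids, awards_by_person):
    """Return awards that list more than one person from the same merge group.

    `awards_by_person` is a hypothetical mapping from a person's primary key
    to the set of award numbers that person appears on. A non-empty result is
    the suspicious case that would trigger a console warning.
    """
    seen = defaultdict(list)
    for person_id in person_ids:
        for award in awards_by_person.get(person_id, ()):
            seen[award].append(person_id)
    return {award: ids for award, ids in seen.items() if len(ids) > 1}
```

For example, under these assumptions "Smith, J. A." and "SMITH J A" both normalize to the same key and would fall into one merge group, and if both people appear on the same award the check would flag that group before the merge proceeds.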
