Removes duplicate publications from ISI records, based on the ISI Unique ID (UID) attribute. When two records share a UID, the following criteria are evaluated in turn to decide which one to keep: If one record has a title and the other does not, keep the record with the title. If one record has a higher citation count than the other, keep that record. If the total length of all the fields in one record is greater than in the other, keep the longer record. If none of these criteria favors either record, the first one is kept arbitrarily.
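
The selection rules described above can be sketched in Python as follows. This is only an illustrative sketch, not the tool's actual implementation: the record layout (plain dictionaries of strings) and the field names `uid`, `title`, and `times_cited` are assumptions made for the example, as are the helper names.

```python
from typing import Dict, List

# Hypothetical record layout: one dict of string fields per publication.
Record = Dict[str, str]


def has_title(rec: Record) -> bool:
    """True if the (assumed) 'title' field is present and non-blank."""
    return bool((rec.get("title") or "").strip())


def citation_count(rec: Record) -> int:
    """Parse the (assumed) 'times_cited' field; missing or invalid counts as 0."""
    try:
        return int(rec.get("times_cited") or 0)
    except ValueError:
        return 0


def total_length(rec: Record) -> int:
    """Total number of characters across all field values."""
    return sum(len(str(value)) for value in rec.values())


def keep_better(first: Record, second: Record) -> Record:
    """Apply the criteria in turn; fall back to the first record on a full tie."""
    if has_title(first) != has_title(second):
        return first if has_title(first) else second
    if citation_count(first) != citation_count(second):
        return first if citation_count(first) > citation_count(second) else second
    if total_length(first) != total_length(second):
        return first if total_length(first) > total_length(second) else second
    return first


def remove_duplicates(records: List[Record], uid_field: str = "uid") -> List[Record]:
    """Keep one record per UID, in order of each UID's first appearance."""
    kept: Dict[str, Record] = {}
    for rec in records:
        uid = rec[uid_field]
        kept[uid] = keep_better(kept[uid], rec) if uid in kept else rec
    return list(kept.values())
```

Calling `remove_duplicates(records)` on a list of such dictionaries returns one record per UID. Because the comparison is pairwise and falls back to the earlier record, ties resolve to whichever duplicate appeared first in the input.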

Pros & Cons

Does a good, straightforward job of removing duplicate records. Its criteria for choosing which duplicate to keep may not match your own. It only removes "duplicate" records that have the same UID; records that represent the same paper but have different UIDs (which is conceivable, depending on how clean the ISI data is) are not detected. Only works on ISI data.


Useful when you combine the results of two ISI queries into one. For instance, you can combine the results for "Grey Squirrel" with the results for "Red Squirrel" in a single file, and remove the duplicate publications using this algorithm.
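
To see why combining two query results produces duplicates, the short Python sketch below finds the UIDs that occur in both result sets. It is a hedged illustration only: the dictionary layout, the `uid` field name, the `shared_uids` helper, and the sample UID values are all made up for the example and do not come from real ISI data.

```python
from typing import Dict, List, Set

# Hypothetical record layout: one dict of string fields per publication.
Record = Dict[str, str]


def shared_uids(first: List[Record], second: List[Record],
                uid_field: str = "uid") -> Set[str]:
    """Return the UIDs that appear in both result sets."""
    return {r[uid_field] for r in first} & {r[uid_field] for r in second}


# Made-up sample results for the two queries.
grey_squirrel = [{"uid": "UID-1"}, {"uid": "UID-2"}]
red_squirrel = [{"uid": "UID-2"}, {"uid": "UID-3"}]

combined = grey_squirrel + red_squirrel          # 4 records, one UID repeated
overlap = shared_uids(grey_squirrel, red_squirrel)
```

Here `overlap` contains the single UID returned by both queries, which is exactly the kind of record the duplicate remover would collapse to one entry in the combined file.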

Usage Hints

This algorithm is applied indirectly when you choose "Load" from the File menu with the 'ISI flat format' option. See here.

See Also

