Sci2 Manual : How Full Names are Parsed
This page last changed on Dec 07, 2010 by barbosaa.
Full names appear in two basic forms:
(The only real difference is the comma.) To parse a full name, the loader first splits it into tokens. The first token is always treated as the last name (PERSON.FAMILY_NAME, though in practice that field is always filled from the parsing results of the "AU" or "ED" ISI fields). The second token, if present, is treated as the first name (PERSON.PERSONAL_NAME). All tokens after the second are concatenated, separated by single space characters, and treated as the middle name (PERSON.ADDITIONAL_NAME). Every value that this loader stores in PERSON.FULL_NAME is converted to a canonical form, which is of the form:
where <Middle Names> is all tokens after the second, concatenated as described above.
Note: Titles such as Mr., Ms., and Mrs. are not directly supported; there are no special parsing rules to handle them.
See CISHELL:AUTHORS, CISHELL:EDITORS, CISHELL:PERSON, CISHELL:REFERENCE, and CISHELL:How Abbreviated Names are Parsed.
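The token rules described above can be illustrated with a minimal Python sketch. This is not the Sci2 loader's actual code: the function name parse_full_name, the ParsedName type, and the choice to split on commas and whitespace are assumptions made for illustration, since this page does not specify how tokens are delimited. The sketch covers only the assignment of tokens to fields and does not build the canonical PERSON.FULL_NAME value.

```python
import re
from typing import NamedTuple


class ParsedName(NamedTuple):
    """Hypothetical container mirroring the three PERSON fields."""
    family_name: str      # PERSON.FAMILY_NAME
    personal_name: str    # PERSON.PERSONAL_NAME
    additional_name: str  # PERSON.ADDITIONAL_NAME


def parse_full_name(full_name: str) -> ParsedName:
    """Assign tokens to name fields following the rules on this page."""
    # Assumption: tokens are delimited by commas and/or whitespace.
    tokens = [t for t in re.split(r"[,\s]+", full_name.strip()) if t]

    # First token: last name. Second token, if any: first name.
    family = tokens[0] if tokens else ""
    personal = tokens[1] if len(tokens) > 1 else ""

    # Remaining tokens: middle name, joined by single spaces.
    additional = " ".join(tokens[2:])
    return ParsedName(family, personal, additional)
```

Under these assumptions, parse_full_name("Smith, John Quincy Adams") and parse_full_name("Smith John Quincy Adams") give the same result. Because there are no special rules for titles, an input such as "Mr. Smith John" would place "Mr." in the last-name field.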
Document generated by Confluence on May 31, 2011 15:16