ANCESTRY.COM, BEHINDTHENAME.COM, AND WERELATE.ORG ANNOUNCE AN IMPROVED APPROACH FOR FINDING VARIANT NAMES
Ancestry.com, BehindTheName.com, and WeRelate.org
announce an improved approach to finding variant names in genealogy
searches. Up to now, most genealogy websites have had to rely upon
Soundex to return variant names in response to searches. This
approach often misses variants that should be returned, or includes
variants that aren't very similar.
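To make both failure modes concrete, here is a minimal sketch of the standard American Soundex coding in Python. It is an illustration only and is not taken from the project's source code; the example name pairs are chosen for demonstration and do not come from the Ancestry.com test set.

```python
# Minimal sketch of American Soundex (illustration only, not the project's code).
SOUNDEX_GROUPS = (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
)
CODES = {ch: digit for letters, digit in SOUNDEX_GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code for a name, or "" if it has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset the previous code to ""
    return (result + "000")[:4]


# A missed variant: the silent K changes the leading letter of the code.
print(soundex("Knight"), soundex("Night"))        # K523 N230

# A questionable match: quite different names share one code.
print(soundex("Catherine"), soundex("Cotroneo"))  # C365 C365
```

Because Soundex always keeps the first letter and discards vowels, a search for "Night" would not return "Knight", while a search for "Catherine" would return "Cotroneo".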
Ancestry.com, BehindTheName.com, and WeRelate.org
have created an open-source database of name variants that is free for
any website or genealogy software developer to use. Tested against pairs
of names provided by Ancestry.com, it reduces the number of missed name variants by over 25% in comparison with Soundex.
How you can help: A large portion of genealogical expertise involves
learning variant spellings for the surnames in your tree. Why not share
your knowledge with others? By adding your variant spellings to the
database, searches on any website that uses it will include your variant
spellings automatically. You can review and add variant spellings here:
http://www.werelate.org/wiki/Special:Names
In addition, we need people to review the changes that others have made to
the database, to make sure that we have multiple pairs of eyes
reviewing the names that are being added and removed. You can review
changes that others have made here: http://www.werelate.org/wiki/Special:NamesLog
If you are a website or software developer: The database and source code are available at: https://github.com/DallanQ/Names
In addition to the database of name variants, the source code also
includes a function to return the similarity score between any two
names. This function has been found useful in duplicate detection.
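The announcement does not describe how the project's similarity function works, so the following is only a rough sketch of what a name similarity score can look like. It assumes a normalized Levenshtein edit distance, and the names `edit_distance`, `name_similarity`, and the 0.8 threshold are hypothetical, not part of the published source code.

```python
# Hypothetical sketch: similarity as normalized edit distance.
# This is NOT the project's function; its actual algorithm may differ.

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means identical after case and whitespace folding."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


# Assumed threshold for flagging possible duplicates (illustrative value).
DUPLICATE_THRESHOLD = 0.8

for left, right in [("Smith", "Smyth"), ("Smith", "Jones")]:
    score = name_similarity(left, right)
    print(left, right, round(score, 2), score >= DUPLICATE_THRESHOLD)
```

For duplicate detection, a score like this would typically be compared against a threshold tuned on known duplicate and non-duplicate records.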