added some medical suffixes #88

Open
wants to merge 4 commits into
base: master

Prismacolor

I'd like to suggest adding a few medical acronyms as suffixes: bn, np, and rn. I added them to my version because we sometimes deal with medical professionals.

added some nurse suffixes: bn, rn, and np.
removed king and queen from the titles as these are sometimes used as names
added a few prefixes: el, van, mc, mac
Owner

@derek73 derek73 left a comment

This pull request has some good additions, but it would need a few tweaks before I can merge it. (Sorry it's been so long since your pull request; it's been a while since I could focus on this.)

'san',
'santa',
'st',
'ste',
'van',
'vel',
'van',
Owner

Van is sometimes a first name, so including it in prefixes would break parsing for all the Vans of the world. Skimming the US birth names database, there do appear to be people named Van, e.g. 183 people born in 1983.

% python tests.py "Van middle last"
<HumanName : [
	title: '' 
	first: 'Van middle last' 
	middle: '' 
	last: '' 
	suffix: ''
	nickname: ''
]>
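
To illustrate why a prefix entry can swallow a first name like this, here is a toy sketch. It is not nameparser's actual code: `toy_parse` and its rule that a prefix absorbs every word after it are simplifying assumptions, chosen only to reproduce the shape of the output above.

```python
# Toy illustration only; this is NOT how nameparser is implemented.
# Assumption: a prefix word is glued to every word that follows it.

# Prefixes visible in the diff context, before and after adding 'van'.
PREFIXES_BEFORE = {"san", "santa", "st", "ste", "vel"}
PREFIXES_AFTER = PREFIXES_BEFORE | {"van"}


def toy_parse(full_name, prefixes):
    """Split a name into first/middle/last, joining prefixed pieces."""
    words = full_name.split()
    pieces = []
    for i, word in enumerate(words):
        if word.lower() in prefixes:
            # The prefix and everything after it become one piece.
            pieces.append(" ".join(words[i:]))
            break
        pieces.append(word)

    first = pieces[0] if pieces else ""
    last = pieces[-1] if len(pieces) > 1 else ""
    middle = " ".join(pieces[1:-1])
    return {"first": first, "middle": middle, "last": last}


# Without 'van' as a prefix, the name splits into three parts.
print(toy_parse("Van middle last", PREFIXES_BEFORE))
# With 'van' as a prefix, the whole string collapses into the first name.
print(toy_parse("Van middle last", PREFIXES_AFTER))
```

In this sketch the second call returns the whole string as `first`, which mirrors the broken result shown above. The same risk would apply to any prefix that doubles as a given name.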

The same applies to Mac: I went to school with a guy named Mac.

Mc is fine because there's no vowel, so it can't be a first name. I guess it could be a title abbreviation, though (Master of Ceremonies), and I'm not sure how that would play out.

El is an article in Spanish, so I'd kinda like to know how it is used in a name. Is it used as the Spanish article in a title like el senator, or as a prefix like del?

@@ -7,12 +7,10 @@
'brother',
'dame',
'father',
'king',
Owner

Both king and queen are included in the set of titles that indicate first names when placed before a single name, e.g. King David and Queen Mary, so this pull request will break some tests. In 2005 there were 148 people born in the US named King, so maybe the name is a more useful case to handle than the title. I know people have used this parser on datasets that include kings and queens before, though, but I guess we can let them customize the titles constant to pick them up.

We should update the test cases that include "king" to use one of the other titles in that set.
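
The trade-off can be sketched with a toy parser. This is not nameparser's implementation or API: `toy_parse`, the `mr` entry, and the two sets below are hypothetical stand-ins, with `brother`, `dame`, `father`, `king`, and `queen` taken from the diff and the comment above.

```python
# Toy illustration only; names and rules here are assumptions, not the library's.

# Titles that imply the single name after them is a FIRST name.
FIRST_NAME_TITLES = {"brother", "dame", "father", "king", "queen"}
# All recognized titles; 'mr' is added here as an ordinary title for contrast.
TITLES = FIRST_NAME_TITLES | {"mr"}


def toy_parse(full_name, titles, first_name_titles):
    """Return title/first/last for names of at most a title plus a few words."""
    words = full_name.split()
    title = ""
    # Only treat the leading word as a title if something follows it.
    if len(words) > 1 and words[0].lower().rstrip(".") in titles:
        title = words.pop(0)

    first = last = ""
    if len(words) == 1:
        if not title or title.lower().rstrip(".") in first_name_titles:
            first = words[0]   # "King David" -> first name David
        else:
            last = words[0]    # "Mr. Smith" -> last name Smith
    elif words:
        first, last = words[0], words[-1]
    return {"title": title, "first": first, "last": last}


# Current behaviour: 'king' is a title, so David is the first name.
print(toy_parse("King David", TITLES, FIRST_NAME_TITLES))

# After the proposed removal, King is parsed as a first name instead.
trimmed = TITLES - {"king", "queen"}
trimmed_first = FIRST_NAME_TITLES - {"king", "queen"}
print(toy_parse("King David", trimmed, trimmed_first))

# Users with royalty in their data could opt back in by extending the sets.
custom = trimmed | {"king", "queen"}
custom_first = trimmed_first | {"king", "queen"}
print(toy_parse("Queen Mary", custom, custom_first))
```

Under these assumptions, removing the two entries flips "King David" from a title plus first name to a first plus last name, and adding them back to a custom set restores the old behaviour for datasets that need it.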

2 participants