phone numbers with two digit area code not recognized #10

Open
cod3licious opened this issue Aug 24, 2020 · 5 comments

Comments

@cod3licious

This number is correctly identified as a phone number: +1 123 1548690. This one, with a two-digit country code, is not: +49 123 1548690

@cod3licious
Author

At the top here are some nice regexes, including this one for phone numbers:

    r"""
    (?:
      (?:            # (international)
        \+?[01]
        [ *\-.\)]*
      )?
      (?:            # (area code)
        [\(]?
        \d{3}
        [ *\-.\)]*
      )?
      \d{3}          # exchange
      [ *\-.\)]*
      \d{4}          # base
    )"""

maybe this fixes it?
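
One way to check the question above is to compile the quoted pattern with `re.VERBOSE` and try both numbers from the issue. This is only a sketch: the variable names are made up for illustration, and the pattern is copied as quoted in the comment, not taken from this project's code.

```python
import re

# Pattern copied from the comment above. re.VERBOSE makes Python ignore the
# layout whitespace and the inline "# ..." comments inside the pattern.
QUOTED_PHONE_RE = re.compile(
    r"""
    (?:
      (?:            # (international)
        \+?[01]
        [ *\-.\)]*
      )?
      (?:            # (area code)
        [\(]?
        \d{3}
        [ *\-.\)]*
      )?
      \d{3}          # exchange
      [ *\-.\)]*
      \d{4}          # base
    )""",
    re.VERBOSE,
)

us_number = "+1 123 1548690"
de_number = "+49 123 1548690"

# fullmatch: does the pattern cover the whole string?
us_full = QUOTED_PHONE_RE.fullmatch(us_number)
de_full = QUOTED_PHONE_RE.fullmatch(de_number)
# search: which part of the string does the pattern pick up?
de_partial = QUOTED_PHONE_RE.search(de_number)

print("US fully matched:", us_full is not None)
print("DE fully matched:", de_full is not None)
print("DE partial match:", de_partial.group(0) if de_partial else None)
```

The international part of this pattern is `\+?[01]`, which accepts only a single leading 0 or 1. Taken as quoted, it therefore leaves the `+49` prefix outside the match, so it would not cover two-digit country codes without a change.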

@cod3licious
Author

OK, I think this might work:

    r"(?:^|(?<=[^\w)]))(((\+?[01])|(\+\d{2}))[ .-]?)?(\(?\d{3}\)?[ .-]?)?(\d{3}[ .-]?\d{4})(\s?(?:ext\.?|[#x-])\s?\d{2,6})?(?:$|(?=\W))"
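
A quick way to try the proposed regex is to run it against the two numbers from the original report, plus one number embedded in a sentence. The helper name `find_phone` and the sample sentence are assumptions made for this sketch; only the pattern comes from the comment above.

```python
import re

# Regex from the comment above, split across two string literals only to keep
# the lines short; the pattern itself is unchanged.
PROPOSED_PHONE_RE = re.compile(
    r"(?:^|(?<=[^\w)]))(((\+?[01])|(\+\d{2}))[ .-]?)?(\(?\d{3}\)?[ .-]?)?"
    r"(\d{3}[ .-]?\d{4})(\s?(?:ext\.?|[#x-])\s?\d{2,6})?(?:$|(?=\W))"
)


def find_phone(text):
    """Return the first substring matched by the proposed regex, or None."""
    match = PROPOSED_PHONE_RE.search(text)
    return match.group(0) if match else None


for sample in ("+1 123 1548690", "+49 123 1548690", "Call +49 123 1548690 now"):
    print(repr(sample), "->", repr(find_phone(sample)))
```

The added `(\+\d{2})` alternative is what lets `+49` through. It accepts exactly two digits after the plus sign, so country codes of other lengths would still need a different prefix rule.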

@AssassinTee

    phone_numbers = [
        "2404 9099130",
        "024049099130",
        "02404 9099130",
        "02404/9099130",
        "+492404 9099130",
        "+4924049099130",
        "+492404/9099130",
        "0160 123456789",
        "0160/123456789",
        "+32160 123456789",
        "Tel.: 0160 123456789"
    ]

    for i, number in enumerate(phone_numbers):
        print(f"{i}: {text_cleaner.transform(number)}")

Output:

    0: 2404 <phone>
    1: 024049099130
    2: 02404 <phone>
    3: 02404/<phone>
    4: +492404 <phone>
    5: +4924049099130
    6: +492404/<phone>
    7: 0160 123456789
    8: 0160/123456789
    9: +32160 123456789
    10: tel.: 0160 123456789

:(
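
The list above can double as a small regression check for any candidate pattern. The sketch below uses plain `re` instead of `text_cleaner`, whose setup is not shown in the comment, and uses the regex proposed earlier in this thread as a stand-in candidate. The expected spans are an assumption about what should count as the phone number in each sample.

```python
import re

# Stand-in candidate: the regex proposed earlier in this thread. This is an
# assumption for the sketch; the pattern behind text_cleaner is not shown.
CANDIDATE_RE = re.compile(
    r"(?:^|(?<=[^\w)]))(((\+?[01])|(\+\d{2}))[ .-]?)?(\(?\d{3}\)?[ .-]?)?"
    r"(\d{3}[ .-]?\d{4})(\s?(?:ext\.?|[#x-])\s?\d{2,6})?(?:$|(?=\W))"
)

# (text, expected phone span) pairs built from the samples above.
CASES = [
    ("2404 9099130", "2404 9099130"),
    ("024049099130", "024049099130"),
    ("02404 9099130", "02404 9099130"),
    ("02404/9099130", "02404/9099130"),
    ("+492404 9099130", "+492404 9099130"),
    ("+4924049099130", "+4924049099130"),
    ("+492404/9099130", "+492404/9099130"),
    ("0160 123456789", "0160 123456789"),
    ("0160/123456789", "0160/123456789"),
    ("+32160 123456789", "+32160 123456789"),
    ("Tel.: 0160 123456789", "0160 123456789"),
]


def is_covered(pattern, text, expected):
    """True if the first match in `text` is exactly the expected span."""
    match = pattern.search(text)
    return match is not None and match.group(0) == expected


results = [(text, is_covered(CANDIDATE_RE, text, expected)) for text, expected in CASES]
for text, ok in results:
    print(f"{'ok  ' if ok else 'MISS'} {text}")
```

With this candidate, none of the samples is covered as a whole number. The pattern assumes a 3-digit area code followed by a 3+4 digit subscriber number and does not allow `/` as a separator, which does not fit the formats in this list.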

@jfilter
Owner

jfilter commented Oct 15, 2020

Thanks @cod3licious for providing the regex and thanks @AssassinTee for the test cases. I adapted the regex to make it work with all the provided phone numbers.

@rhnfzl

rhnfzl commented Sep 4, 2023

The regex doesn't match phone numbers such as:

001-504-724-7835x2050
001-687-915-1144
001-507-783-9793x4107
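
These numbers combine a `001` international prefix with an optional `x` extension. As a rough illustration of one pattern shape that would accept them, here is a hypothetical regex. It is not the one used by this project, and the name `RELAXED_PHONE_RE` is made up for the example.

```python
import re

# Hypothetical pattern for illustration only; NOT the regex used by this
# project. It accepts "+CC" or "00CC" prefixes with 1 to 3 digit codes.
RELAXED_PHONE_RE = re.compile(
    r"(?:(?:\+|00)\d{1,3}[ .-]?)?"      # optional international prefix
    r"(?:\(?\d{3}\)?[ .-]?)?"           # optional 3-digit area code
    r"\d{3}[ .-]?\d{4}"                 # exchange and base
    r"(?:\s?(?:ext\.?|x)\s?\d{2,6})?"   # optional extension
)

REPORTED = [
    "001-504-724-7835x2050",
    "001-687-915-1144",
    "001-507-783-9793x4107",
]

# Map each reported number to whether the pattern covers the whole string.
matched = {number: RELAXED_PHONE_RE.fullmatch(number) is not None for number in REPORTED}
for number, ok in matched.items():
    print(number, "->", "matched" if ok else "not matched")
```

This sketch still assumes a 3-digit area code and a 3+4 digit subscriber number, so it would not help with the German-format samples listed earlier in the thread.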

@jfilter jfilter reopened this Sep 14, 2023