I have mobile numbers in the format below, with no "+" sign in front. How can I get the country from numbers in this format? I checked the documentation, and the "+" sign is required. Is there any way to add the '+' sign manually before the number is parsed, to avoid the parsing exception?

Mobile_Number: 9687655xxxx
Mobile_Number: 6142499xxxx
Mobile_Number: 20109811xxxx

Python script:

import phonenumbers
from phonenumbers import geocoder

query = phonenumbers.parse("96650072xxxx", None)
print(geocoder.description_for_number(query, "en"))
print(query.country_code)

Error:
<>@ubuntu:~/elk$ python3 a.py
Traceback (most recent call last):
  File "a.py", line 4, in <module>
    query = phonenumbers.parse("96650072xxxx", None)
  File "/home/<>/.local/lib/python3.6/site-packages/phonenumbers/phonenumberutil.py", line 2855, in parse
    "Missing or invalid default region.")
phonenumbers.phonenumberutil.NumberParseException: (0) Missing or invalid default region.

Output after adding the '+' sign:

<>@ubuntu:~/<..>$ python3 a.py
Saudi Arabia
966

Reference: https://pypi.org/project/phonenumbers/

Divyank
  • Can't you just prepend the '+' sign to the "96650072xxxx" string? – Ivan Vnucec Dec 02 '21 at 05:49
  • Don't use the snippet in questions if you're not using html/js/css – FLAK-ZOSO Dec 02 '21 at 05:53
  • Hi @IvanVnucec, we have a field named "Mobile_Number" with values like "96650072xxxx" in an Elasticsearch index. We have millions of numbers, so adding the "+" sign is not feasible. I omitted this info just to keep my question simple. – Divyank Dec 02 '21 at 06:11
  • @Divyank I think you might add that the performance is an issue because you are getting answers that are useless to you. – Ivan Vnucec Dec 02 '21 at 07:21
  • The performance issue will persist: only 400 docs are enriched with the country and code fields in 3 minutes. We have 10m+ docs in an index, and the pipeline ingests new docs every hour, so the Python script will be scheduled accordingly. There may be a delay in enriching docs; I will check other, faster ways. Let's see how it goes! – Divyank Dec 02 '21 at 07:37

2 Answers

You could define a function that checks whether the string starts with '+' and, if not, adds it before parsing.

def parse_phone_number(phone_number: str) -> phonenumbers.PhoneNumber:
    """Prepend '+' sign if required, then parse phone number"""

    # A leading '+' lets parse() read the country code without a default region
    if not phone_number.startswith('+'):
        phone_number = '+' + phone_number
    return phonenumbers.parse(phone_number, None)

You would then just change this line:

query = phonenumbers.parse("96650072xxxx", None)

to:

query = parse_phone_number("96650072xxxx")
ljdyer


If your source dataset is only missing the leading +, you can add it directly in the parse call:

original_phonenumber = "96650072xxxx"
query = phonenumbers.parse(f"+{original_phonenumber}")

If you have a mixed dataset, you first need to check whether the phone number already starts with +:

original_phonenumber = "96650072xxxx"
if not original_phonenumber.startswith("+"):
    original_phonenumber = f"+{original_phonenumber}"
query = phonenumbers.parse(original_phonenumber)

However, patching numbers at parse time is bad practice, so I would advise you to fix your source dataset instead. Are you sure that only the leading + is missing, and not the entire country code?
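
One rough way to probe that question before touching the dataset is a shape check on the raw strings. The sketch below assumes the stored values are meant to be plain digit strings; the function name `e164_shape_problem` and the wording of the returned reasons are hypothetical. It uses two facts about E.164 numbers: they have at most 15 digits, and country codes never start with 0, so a value with a leading 0 is likely in national format with the country code missing.

```python
import re
from typing import Optional

MAX_E164_DIGITS = 15  # E.164 caps a full international number at 15 digits


def e164_shape_problem(raw: str) -> Optional[str]:
    """Return why `raw` cannot be 'country code + number', or None if it could be.

    This only checks the shape; it does not prove the number is valid.
    """
    digits = raw.strip()
    if digits.startswith("+"):
        digits = digits[1:]
    if not re.fullmatch(r"[0-9]+", digits):
        return "empty or contains non-digit characters"
    if digits.startswith("0"):
        # Country codes start with 1-9, so this looks like a national format
        return "starts with 0, so the country code is probably missing"
    if len(digits) > MAX_E164_DIGITS:
        return "longer than 15 digits"
    return None
```

Values that pass this check can still be wrong, so running `phonenumbers.is_valid_number` on the parsed result is a stronger follow-up test.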

vinzenz