PHP phone number formatting

    I needed to format phone numbers automatically as country code, (city code), number, so the first thing I did was look at existing solutions.
    Unfortunately, every solution I found simply fits the string to a fixed custom format; each has a limited scope and produces errors outside it.

    First, I'll give an overview of the solutions I found. If that does not interest you, scroll down to the heading “Phone Number Formats”: my version of number parsing, with a link to the code, is presented there.

    All-Destructive Primitive

    (Solution found. Mine is below)
    The first thing I came across was forum posts and script collections offering solutions along these lines:
    <?php
    function phone_number($sPhone) {
        // Keep digits only (preg_replace stands in for ereg_replace, which was removed in PHP 7)
        $sPhone = preg_replace('/[^0-9]/', '', $sPhone);
        if (strlen($sPhone) != 10) return false;
        $sArea = substr($sPhone, 0, 3);
        $sPrefix = substr($sPhone, 3, 3);
        $sNumber = substr($sPhone, 6, 4);
        $sPhone = "(" . $sArea . ")" . $sPrefix . "-" . $sNumber;
        return $sPhone;
    }
    ?>

    This is one of the simplest ways to format phone numbers quickly, but every such solution targets numbers from one specific region (here, 10-digit numbers in the (XXX)XXX-XXXX style) and does not solve the general problem.

    Formatting with sscanf

    (Solution found. Mine below)
    function formatPhone($phone) {
        if (empty($phone)) return "";
        if (strlen($phone) == 7) {
            sscanf($phone, "%3s%4s", $prefix, $exchange);
        } else if (strlen($phone) == 10) {
            sscanf($phone, "%3s%3s%4s", $area, $prefix, $exchange);
        } else if (strlen($phone) > 10) {
            if (substr($phone, 0, 1) == '1') {
                // leading 1 is treated as the country code
                sscanf($phone, "%1s%3s%3s%4s", $country, $area, $prefix, $exchange);
            } else {
                // anything after the first 10 digits is treated as an extension
                sscanf($phone, "%3s%3s%4s%s", $area, $prefix, $exchange, $extension);
            }
        } else {
            return "unknown phone format: $phone";
        }
        $out = "";
        $out .= isset($country) ? $country . ' ' : '';
        $out .= isset($area) ? '(' . $area . ')' : '';
        $out .= $prefix . '-' . $exchange;
        $out .= isset($extension) ? 'x' . $extension : '';
        return $out;
    }

    Despite its simplicity, this function can already format numbers of 7, 10, or more digits, but give it a number from the Russian hinterland and it will choke and return a wrong result.

    Symfony, lib/helpers/PhoneHelper.php, format_phone

    (Solution found. Mine below)
    <?php
    function format_phone($phone = '', $convert = false, $trim = true)
    {
        // If we have not entered a phone number just return empty
        if (empty($phone)) {
            return '';
        }

        // Strip out any extra characters that we do not need; only keep letters and numbers
        $phone = preg_replace("/[^0-9A-Za-z]/", "", $phone);

        // Do we want to convert phone numbers with letters to their number equivalent?
        // Samples are: 1-800-TERMINIX, 1-800-FLOWERS, 1-800-Petmeds
        if ($convert == true) {
            $replace = array('2' => array('a', 'b', 'c'),
                             '3' => array('d', 'e', 'f'),
                             '4' => array('g', 'h', 'i'),
                             '5' => array('j', 'k', 'l'),
                             '6' => array('m', 'n', 'o'),
                             '7' => array('p', 'q', 'r', 's'),
                             '8' => array('t', 'u', 'v'),
                             '9' => array('w', 'x', 'y', 'z'));

            // Replace each letter with a number
            // Notice this is case insensitive with the str_ireplace instead of str_replace
            foreach ($replace as $digit => $letters) {
                $phone = str_ireplace($letters, $digit, $phone);
            }
        }

        // If trimming is enabled, cut anything longer than 11 characters down to 11
        if ($trim == true && strlen($phone) > 11) {
            $phone = substr($phone, 0, 11);
        }

        // Perform phone number formatting here
        if (strlen($phone) == 7) {
            return preg_replace('/([0-9a-zA-Z]{3})([0-9a-zA-Z]{4})/', '$1-$2', $phone);
        } elseif (strlen($phone) == 10) {
            return preg_replace('/([0-9a-zA-Z]{3})([0-9a-zA-Z]{3})([0-9a-zA-Z]{4})/', '($1) $2-$3', $phone);
        } elseif (strlen($phone) == 11) {
            return preg_replace('/([0-9a-zA-Z]{1})([0-9a-zA-Z]{3})([0-9a-zA-Z]{3})([0-9a-zA-Z]{4})/', '$1 ($2) $3-$4', $phone);
        }

        // Return original phone if not 7, 10 or 11 digits long
        return $phone;
    }
    ?>

    The function not only formats numbers as XXX-XXXX, (XXX) XXX-XXXX and X (XXX) XXX-XXXX, but can also convert numbers written in letters. Since it only handles numbers of 7, 10, or 11 characters, however, it does not fit the task either.

    Phone Number Formats

    The wiki article shows that there is no simple and convenient pattern for quickly formatting all numbers. Country codes are centrally registered, much like domain zones, while city codes are left to each country's discretion.

    In other words, calls are routed by prefix, starting with the country code: a call sent to a specific country is then routed further according to the codes of the region, city, district, and so on, reading digits from the left, until the last link hands it to a specific telephone or fax machine. The problem is further complicated by the fact that city codes are not standardized across countries; in the worst case, correct formatting requires a two-dimensional array of country codes and their city codes.
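To make the prefix idea concrete, here is a minimal sketch of picking the country code from the leading digits. The function name matchCountryCode and the three-entry table are illustrative assumptions, not part of the article's code. Trying longer prefixes first is my own choice, based on country calling codes being one to three digits long.

```php
<?php
// Hypothetical mini-table: country calling code => country name.
$countries = array(
    '7'   => 'Russia',
    '44'  => 'United Kingdom',
    '380' => 'Ukraine',
);

/**
 * Return the country code that prefixes $digits, or null if none matches.
 * Longer prefixes are tried first so that a 3-digit code is never
 * mistaken for a shorter one.
 */
function matchCountryCode($digits, array $countries)
{
    for ($len = 3; $len >= 1; $len--) {
        $prefix = substr($digits, 0, $len);
        if (isset($countries[$prefix])) {
            return $prefix;
        }
    }
    return null;
}

echo matchCountryCode('380442262204', $countries), "\n"; // 380
echo matchCountryCode('79263874814', $countries), "\n";  // 7
```

The input is assumed to be already stripped down to digits, with any leading plus or 00 removed.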

    In practice, things turned out to be less scary. In each country, the city codes can be split into two groups: those that share the most common length, and all the rest. This is enough to drastically shrink the set of codes to compare against. That is, you can build a data array for each country of the form:
    <?php
    $data = array(
        'Country code' => array(
            'name'           => 'Country name', // for convenience only; not used
            'cityCodeLength' => regular_city_code_length_for_this_country,
            'exceptions'     => array(exception_city_codes),
        ),
    );
    ?>
    Then pre-process the data, adding fields that narrow the search: exceptions_max and exceptions_min, the maximum and minimum length of the exception city codes, respectively. Countries whose city codes start with 0 also have to be accounted for; this “feature” is reflected by the zeroHack field. As an example:
    <?php
    $data = array(
        '886' => array(
            'name'           => 'Taiwan',
            'cityCodeLength' => 1,
            'zeroHack'       => false,
            'exceptions'     => array(89, 90, 91, 92, 93, 96, 60, 70, 94, 95),
            'exceptions_max' => 2,
            'exceptions_min' => 2
        ),
    );
    ?>
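The pre-processing step can be sketched roughly as follows. The helper name addExceptionBounds is hypothetical, and using 0 for both bounds when a country has no exceptions is my assumption, chosen to fit a check of the form exceptions_max != 0.

```php
<?php
/**
 * Add 'exceptions_max' and 'exceptions_min' to every country entry:
 * the longest and shortest digit count among its exception city codes
 * (both 0 when the country has no exceptions).
 */
function addExceptionBounds(array $codes)
{
    foreach ($codes as $countryCode => $entry) {
        $lengths = array();
        foreach ($entry['exceptions'] as $cityCode) {
            $lengths[] = strlen((string) $cityCode);
        }
        $codes[$countryCode]['exceptions_max'] = $lengths ? max($lengths) : 0;
        $codes[$countryCode]['exceptions_min'] = $lengths ? min($lengths) : 0;
    }
    return $codes;
}

// The Taiwan record from the article, before pre-processing.
$raw = array(
    '886' => array(
        'name'           => 'Taiwan',
        'cityCodeLength' => 1,
        'zeroHack'       => false,
        'exceptions'     => array(89, 90, 91, 92, 93, 96, 60, 70, 94, 95),
    ),
);

$prepared = addExceptionBounds($raw);
echo $prepared['886']['exceptions_max'], ' ', $prepared['886']['exceptions_min'], "\n"; // 2 2
```

Running this once and storing the result avoids recomputing the bounds on every formatting call.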
    After that, take the appropriate code sections from the solutions above and make the formatting function:
    <?php
    function phone($phone = '', $convert = true, $trim = true)
    {
        global $phoneCodes; // just for example! When implementing, get rid of the global variable.
        if (empty($phone)) {
            return '';
        }
        // strip the junk, remembering whether the number started with a plus
        $phone = trim($phone);
        $plus = ($phone[0] == '+');
        $phone = preg_replace("/[^0-9A-Za-z]/", "", $phone);
        $OriginalPhone = $phone;

        // convert letters in the number to digits
        if ($convert == true && !is_numeric($phone)) {
            $replace = array('2' => array('a', 'b', 'c'),
            '3' => array('d', 'e', 'f'),
            '4' => array('g', 'h', 'i'),
            '5' => array('j', 'k', 'l'),
            '6' => array('m', 'n', 'o'),
            '7' => array('p', 'q', 'r', 's'),
            '8' => array('t', 'u', 'v'),
            '9' => array('w', 'x', 'y', 'z'));

            foreach ($replace as $digit => $letters) {
                $phone = str_ireplace($letters, $digit, $phone);
            }
        }

        // replace 00 at the beginning of the number with +
        if (substr($phone, 0, 2) == "00") {
            $phone = substr($phone, 2, strlen($phone) - 2);
            $plus = true;
        }

        // if the phone is longer than 7 characters, start the search for the country
        if (strlen($phone) > 7)
        foreach ($phoneCodes as $countryCode => $data)
        {
            $codeLen = strlen($countryCode);
            if (substr($phone, 0, $codeLen) == $countryCode)
            {
                // as soon as a country is found, cut the phone down to the city code level
                $phone = substr($phone, $codeLen, strlen($phone) - $codeLen);
                $zero = false;
                // check for a leading zero in the city code
                if ($data['zeroHack'] && $phone[0] == '0')
                {
                    $zero = true;
                    $phone = substr($phone, 1, strlen($phone) - 1);
                }

                $cityCode = NULL;
                // first compare with exception cities, longest codes first
                if ($data['exceptions_max'] != 0)
                for ($cityCodeLen = $data['exceptions_max']; $cityCodeLen >= $data['exceptions_min']; $cityCodeLen--)
                if (in_array(intval(substr($phone, 0, $cityCodeLen)), $data['exceptions']))
                {
                    $cityCode = ($zero ? "0" : "") . substr($phone, 0, $cityCodeLen);
                    $phone = substr($phone, $cityCodeLen, strlen($phone) - $cityCodeLen);
                    break;
                }
                // if no exception matched, cut the city code at the default length
                if (is_null($cityCode))
                {
                    $cityCode = substr($phone, 0, $data['cityCodeLength']);
                    $phone = substr($phone, $data['cityCodeLength'], strlen($phone) - $data['cityCodeLength']);
                }
                // return the result
                return ($plus ? "+" : "") . $countryCode . '(' . $cityCode . ')' . phoneBlocks($phone);
            }
        }
        // return the result without the country and city codes
        return ($plus ? "+" : "") . phoneBlocks($phone);
    }
     
    // turns any number into a string of the form XX-XX-... or XXX-XX-XX-..., depending on whether the digit count is even or odd
    function phoneBlocks($number) {
        $add = '';
        if (strlen($number) % 2)
        {
            // odd length: the first digit joins the first block, or stands alone for short numbers
            $add = $number[0];
            $add .= (strlen($number) <= 5 ? "-" : "");
            $number = substr($number, 1, strlen($number) - 1);
        }
        return $add . implode("-", str_split($number, 2));
    }
     
    // tests
    echo phone("+38 (044) 226-22-04") . "\n";
    echo phone("0038 (044) 226-22-04") . "\n";
    echo phone("+79263874814") . "\n";
    echo phone("4816145") . "\n";
    echo phone("+44 (0) 870 770 5370") . "\n";
    echo phone("0044 (0) 870 770 5370") . "\n";
    echo phone("+436764505509") . "\n";
    echo phone("(+38-048) 784-15-46") . "\n";
    echo phone("(38-057) 706-34-03") . "\n";
    echo phone("+38 (044) 244 12 01") . "\n";
    ?>

    Here global $phoneCodes; refers to the array with information for all countries.

    This outputs:
    +380(44)226-22-04
    +380(44)226-22-04
    +7(926)387-48-14
    481-61-45
    +44(0870)770-53-70
    +44(0870)770-53-70
    +43(6764)50-55-09
    380(4878)415-46
    380(5770)634-03
    +380(44)244-12-01

    The function solves the task.
    Its drawbacks: the slow sections have not been profiled or optimized, and numbers that carry a city code but no country code are not handled (in that case, just split the number into blocks with phoneBlocks or use one of the solutions above). In any real implementation, replace the global variable with a reference passed as a parameter; you can also refine or replace the output format, which is the responsibility of the phoneBlocks function.
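One way to drop the global, sketched here under assumptions, is to pass a country's record in as an argument. The helper name splitCityCode and the sample national numbers are hypothetical; the Taiwan record is the one shown earlier. The exceptions-first logic mirrors the article's function, including its integer comparison of candidate codes.

```php
<?php
/**
 * Split the national part of a number into array(cityCode, subscriberNumber).
 * $entry is one country's record, passed in explicitly instead of being
 * read from a global. Exception codes are tried first, longest first;
 * otherwise the country's regular city-code length is used.
 */
function splitCityCode($national, array $entry)
{
    for ($len = $entry['exceptions_max']; $len >= $entry['exceptions_min'] && $len > 0; $len--) {
        $candidate = substr($national, 0, $len);
        // Exceptions are stored as integers, so compare as integers.
        if (in_array((int) $candidate, $entry['exceptions'], true)) {
            return array($candidate, substr($national, $len));
        }
    }
    $len = $entry['cityCodeLength'];
    return array(substr($national, 0, $len), substr($national, $len));
}

$taiwan = array(
    'name'           => 'Taiwan',
    'cityCodeLength' => 1,
    'zeroHack'       => false,
    'exceptions'     => array(89, 90, 91, 92, 93, 96, 60, 70, 94, 95),
    'exceptions_max' => 2,
    'exceptions_min' => 2,
);

// Made-up national numbers, used only to exercise both branches.
list($city, $rest) = splitCityCode('912345678', $taiwan);
echo $city, ' ', $rest, "\n"; // 91 2345678
list($city, $rest) = splitCityCode('223456789', $taiwan);
echo $city, ' ', $rest, "\n"; // 2 23456789
```

With the record passed in, the caller decides where the table comes from (a file, a cache, a test fixture), and the function can be tested without any global state.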

    The most interesting part

    Using information from the sites:
    http://www.mtt.ru/info/def/index.wbp
    http://www.hella.ru/code/codeuro.htm
    http://www.scross.ru/guide/phone-global/
    I have collected a data array for all the countries listed there, including exception cities, zeroHack flags, and mobile network codes. The code can be downloaded here.

    Performance

    Contrary to the most pessimistic expectations, the code processes 10,000 numbers in less than 2 seconds.
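A measurement like this can be reproduced with a small timing harness. The sketch below is an assumption about how one might measure it, not the author's benchmark: timeFormatter and the stand-in formatter are hypothetical, and the generated numbers are synthetic. Results will vary with hardware and PHP version.

```php
<?php
/**
 * Time how long $formatter takes over $count calls; returns seconds.
 */
function timeFormatter(callable $formatter, $count = 10000)
{
    $start = microtime(true);
    for ($i = 0; $i < $count; $i++) {
        // Build a different synthetic test number on each iteration.
        $formatter('+7926' . str_pad((string) $i, 7, '0', STR_PAD_LEFT));
    }
    return microtime(true) - $start;
}

// Stand-in formatter so the sketch runs on its own; pass 'phone' instead
// once the article's function and its $phoneCodes table are loaded.
$standIn = function ($number) {
    return preg_replace('/[^0-9]/', '', $number);
};

printf("%.3f s for 10000 numbers\n", timeFormatter($standIn));
```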

    UPD: Amendments in preparation:
    1. support for formatting patterns adopted within specific countries ("locally accepted" standards for displaying numbers);
    2. a flag indicating which country's conventions to use when formatting the number;
    3. a parameter specifying the output format (for personal preferences and exceptions);
    4. support for numbers written in non-Latin letters;
    5. detection of mobile numbers and replacement of brackets with spaces.
    UPD: The archive disappeared from the server, so it is now posted at https://github.com/mrXCray/PhoneCodes. An update covering the amendments above, plus a bonus, is coming soon.
