8 useful regexps with visual analysis

    Much has been written about the power and flexibility of regular expressions, and they have long been a standard tool for all kinds of text operations. Perhaps their most common use is validating input data: here there is practically no alternative to them other than cumbersome loop-based parsing with a pile of non-obvious checks. Let's start with the simplest:

    1. URL slug (part of a human-friendly URL)


    Essentially a word of lowercase letters and digits that may contain hyphens.

    Pattern: /^[a-z0-9-]+$/
    short_url
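    As a minimal sketch of how the slug pattern might be used, here is a small check written with Python's `re` module. Python is an assumption (the article names no language), and the helper name `is_slug` is made up for illustration.

```python
import re

# The slug pattern from the article, without the /.../ delimiters.
SLUG_RE = re.compile(r"^[a-z0-9-]+$")

def is_slug(text: str) -> bool:
    """Return True if text contains only lowercase letters, digits and hyphens."""
    # fullmatch also rejects a trailing newline, which a bare `$` tolerates in Python.
    return SLUG_RE.fullmatch(text) is not None

print(is_slug("my-first-post"))  # True
print(is_slug("My First Post"))  # False
```

    Note that the pattern also accepts strings such as "-" or "--", so a stricter check may be needed if leading or repeated hyphens matter.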



    2. Username


    Letters, numbers, hyphens and underscores, from 3 to 16 characters.

    Pattern: /^[a-z0-9_-]{3,16}$/
    username
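    The length limits are easiest to see at the boundaries. The sketch below assumes Python's `re` module; `is_username` is a hypothetical helper name.

```python
import re

# The username pattern from the article: 3 to 16 allowed characters.
USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,16}$")

def is_username(text: str) -> bool:
    """Return True if text is 3 to 16 lowercase letters, digits, hyphens or underscores."""
    return USERNAME_RE.fullmatch(text) is not None

# Probe both length boundaries and one uppercase candidate.
for candidate in ["ab", "abc", "john_doe-42", "a" * 16, "a" * 17, "John"]:
    print(f"{candidate!r}: {is_username(candidate)}")
```

    Only the 3-character, mixed and 16-character candidates pass. "John" fails because the class lists lowercase letters only; compiling with `re.IGNORECASE` would be one way to relax that.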

    3. Password


    The same as the username, only 6 to 18 characters long.

    Pattern: /^[a-z0-9_-]{6,18}$/
    password
    My own note: a shorter form is /^[\w_]{6,18}$/. The same applies to the username.
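    The two password patterns are not quite interchangeable, which a quick comparison makes visible. This is a sketch assuming Python's `re` module; the `re.ASCII` flag is my addition so that `\w` means `[a-zA-Z0-9_]`, as it does by default in many other regex flavors, and `check` is a hypothetical helper.

```python
import re

# The original pattern and the shorter \w-based variant.
ORIGINAL_RE = re.compile(r"^[a-z0-9_-]{6,18}$")
# re.ASCII limits \w to [a-zA-Z0-9_]; without it Python's \w also matches non-ASCII letters.
SHORT_RE = re.compile(r"^[\w_]{6,18}$", re.ASCII)

def check(text: str):
    """Return (matches original pattern, matches shorter pattern)."""
    return (ORIGINAL_RE.fullmatch(text) is not None,
            SHORT_RE.fullmatch(text) is not None)

print(check("secret_123"))  # (True, True)
print(check("my-secret"))   # (True, False): \w does not include the hyphen
print(check("MySecret1"))   # (False, True): \w also admits uppercase letters
```

    Since `\w` already contains the underscore, the explicit `_` in the shorter class is redundant.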

    4. Hexadecimal color


    An optional # character, then a word of letters a to f and digits, either 3 or 6 characters long.

    Pattern: /^#?([a-f0-9]{6}|[a-f0-9]{3})$/
    hex
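    The capturing group makes it easy to pull out the digits once a value has been validated. The sketch below assumes Python's `re` module; `normalize_hex` is a hypothetical helper, and the expansion of the 3-digit shorthand follows the CSS convention of doubling each digit.

```python
import re

# The hex color pattern from the article; group 1 holds the digits without '#'.
HEX_RE = re.compile(r"^#?([a-f0-9]{6}|[a-f0-9]{3})$")

def normalize_hex(text: str):
    """Return the color as six hex digits, or None if text does not match."""
    match = HEX_RE.fullmatch(text)
    if match is None:
        return None
    digits = match.group(1)
    if len(digits) == 3:
        # Shorthand such as "0af" expands by doubling each digit: "00aaff".
        digits = "".join(ch * 2 for ch in digits)
    return digits

print(normalize_hex("#0af"))    # 00aaff
print(normalize_hex("1a2b3c"))  # 1a2b3c
print(normalize_hex("#12345"))  # None
print(normalize_hex("#FFF"))    # None
```

    The last call returns None because the pattern lists lowercase letters only; a case-insensitive flag would be needed to accept "#FFF".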

    5. XML tag


    After the opening bracket < comes a word of letters, the element name; then there may be attributes: any characters except the closing bracket >. Next comes either a closing bracket, any text (the content) and the closing tag, i.e. </name>, or at least one space, a slash and a closing bracket (a self-closing tag).

    Pattern: /^<([a-z]+)([^>]+)*(?:>(.*)<\/\1>|\s+\/>)$/
    xml_tag
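    The backreference \1 is what ties the closing tag to the opening one. Here is a minimal sketch of that behavior, assuming Python's `re` module; `parse_tag` is a hypothetical helper that returns the element name (group 1) and the content (group 3).

```python
import re

# The tag pattern from the article; \1 must repeat the name captured by group 1.
TAG_RE = re.compile(r"^<([a-z]+)([^>]+)*(?:>(.*)<\/\1>|\s+\/>)$")

def parse_tag(text: str):
    """Return (name, content) for a matching tag, or None.

    content is None for a self-closing tag, where group 3 takes no part in the match.
    """
    match = TAG_RE.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(3)

print(parse_tag('<a href="x">link</a>'))  # ('a', 'link')
print(parse_tag("<br />"))                # ('br', None)
print(parse_tag("<b>text</i>"))           # None: closing name differs from opening name
```

    This validates a single tag on one line and is not a substitute for an XML parser. The nested quantifier in `([^>]+)*` can also backtrack heavily on long non-matching input, so it is best kept away from untrusted data.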

    6. Email


    General form: login@subdomain.domain. The login, like the subdomain, is a word of letters, digits, underscores, hyphens and periods. The domain (meaning the top-level one) is 2 to 6 letters and periods.

    Pattern: /^([a-z0-9_\.-]+)@([a-z0-9_\.-]+)\.([a-z\.]{2,6})$/
    email
    My own note: this can be shorter: /^([\w\._]+)@\1\.([a-z]{2,6}\.?)$/. It is also slightly more correct, since a period in a top-level domain can occur only once and only at the end.
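    The three capturing groups of the first pattern split an address into login, subdomain and top-level domain. The sketch below assumes Python's `re` module; `split_email` is a hypothetical helper, and `re.ASCII` is added to the shorter variant so that `\w` stays ASCII-only. It also shows how the shorter variant behaves as written here: `\1` is a backreference to the text captured by group 1, not a repeat of its character class.

```python
import re

# The email pattern from the article and the shorter variant, as given in the text.
EMAIL_RE = re.compile(r"^([a-z0-9_\.-]+)@([a-z0-9_\.-]+)\.([a-z\.]{2,6})$")
SHORT_RE = re.compile(r"^([\w\._]+)@\1\.([a-z]{2,6}\.?)$", re.ASCII)

def split_email(text: str):
    """Return (login, subdomain, tld) for a matching address, or None."""
    match = EMAIL_RE.fullmatch(text)
    return match.groups() if match else None

print(split_email("john_doe@example.com"))  # ('john_doe', 'example', 'com')
print(split_email("john@example"))          # None

# With \1, the part after @ must repeat the login exactly.
print(SHORT_RE.fullmatch("john@example.com") is not None)  # False
print(SHORT_RE.fullmatch("john@john.com") is not None)     # True
```

    To make the shorter variant accept arbitrary subdomains, the second part would need its own `[\w\._]+` group in place of the backreference.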

    7. URL


    First comes an optional protocol (http:// or https://), then a sequence of letters, digits, hyphens, underscores and periods (the domains above the top level), then the top-level domain (2 to 6 letters and periods) and, finally, the file structure: a set of words of letters, digits, hyphens, underscores and periods, each with a slash at the end. The whole thing may end with one more slash.

    Pattern: /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
    url
    My own note: might this be better? /^(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?$/
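    The first three groups of the URL pattern can be read off directly after a match. This is a sketch assuming Python's `re` module; `split_url` is a hypothetical helper, and `re.ASCII` is my addition to keep `\d` and `\w` ASCII-only. The last two lines compare the two patterns on a host name that contains a hyphen.

```python
import re

# The URL pattern from the article and the proposed alternative.
URL_RE = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$", re.ASCII
)
SHORT_URL_RE = re.compile(
    r"^(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?$", re.ASCII
)

def split_url(text: str):
    """Return (protocol, host, tld) for a matching URL, or None."""
    match = URL_RE.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)

print(split_url("https://www.example.com/docs/index.html"))  # ('https://', 'www.example', 'com')
print(split_url("example.org/"))                             # (None, 'example', 'org')
print(split_url("not a url"))                                # None

# The alternative drops the hyphen from the host class, so hyphenated hosts fail.
print(URL_RE.fullmatch("my-site.com") is not None)        # True
print(SHORT_URL_RE.fullmatch("my-site.com") is not None)  # False
```

    Neither pattern handles query strings, ports or fragments. For real URL handling a dedicated parser such as Python's `urllib.parse` is the safer choice.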

    8. IP address


    Four groups of digits (1 to 3 digits each) separated by dots. If a group has 3 digits, the first is 1 or 2. If it is 1, the remaining two are 0 to 9; if it is 2, the second is 0 to 5: when the second digit is 0 to 4, the third is 0 to 9, and when the second is 5, the third is 0 to 5. If a group has 2 digits, the first is 1 to 9 and the second is 0 to 9. In a one-digit group, the digit can be from 1 to 9.

    Pattern: /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
    ip
    My own note: in my opinion, this is more correct: /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/.
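    A few samples show where the pattern draws the line, and that the `\d` spelling accepts the same strings. The sketch assumes Python's `re` module; `is_ip` is a hypothetical helper, and `re.ASCII` is added so that `\d` means `[0-9]` only.

```python
import re

# The IP pattern from the article, split across two lines for readability.
IP_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)
# The \d spelling of the same pattern.
IP_SHORT_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$",
    re.ASCII,
)

def is_ip(text: str) -> bool:
    """Return True if text matches the article's IPv4 pattern."""
    return IP_RE.fullmatch(text) is not None

samples = ["192.168.0.1", "255.255.255.255", "256.1.1.1", "1.2.3", "0.0.0.0", "01.02.03.04"]
for sample in samples:
    # Both spellings agree on every sample.
    assert is_ip(sample) == (IP_SHORT_RE.fullmatch(sample) is not None)
    print(sample, is_ip(sample))
```

    The pattern is slightly more permissive than the prose description: `[01]?[0-9][0-9]?` also accepts a lone 0 and leading zeros, so "0.0.0.0" and "01.02.03.04" both match, while "256.1.1.1" and "1.2.3" do not.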

    Taken from here
