Automatic gender detection by name

    Continuing our series on the technologies behind our email marketing service Pechkin-mail.ru, we simply have to mention automatic detection of a subscriber's gender by name. Back in 2007, while developing an SMS mailing service, we really wanted to substitute the endings of adjectives such as “Уважаемый” / “Уважаемая” (“Dear”, masculine / feminine) automatically. Usually such a substitution relies on an additional field in the client's address database. However, we think that is a poor approach.
    There are 3 reasons for this:
    1. forcing subscribers to state their gender is a bad idea (the more fields a form has, the less likely it is to be filled in)
    2. determining it manually takes a long time, which means it is expensive
    3. a human is no more immune to mistakes than a machine.


    But we are not linguists, so implementing such a module ourselves would have been too difficult, and the feature is not “essential”. Recently, however, while working on the declension of text in endless acts, contracts and other legal documents, we remembered the Morfer service, which we had been using for 2 years. This is an excellent linguistic service that specializes in Russian and lets you decline words, whole sentences and numerals, and also spell out numbers as text. All in all, it is an amazing piece of work by a single person, Sergey Slepov.

    So, some time later we open his site and discover a convenient, simple module for PHP that determines the gender of a noun. That is, by passing in a first name and surname together, you can get a fairly accurate gender (male or female). Super! The implementation did not take long. Everything is done through templates in the mailing text.
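    To illustrate why passing the first name and surname together helps, here is a minimal Python sketch. It is not how the Morfer module works internally (the article does not show that); the name sets, the surname endings and the function `guess_gender` are all hypothetical, chosen only to show the idea on a few common Russian forms.

```python
# Toy data for illustration only; a real detector needs a full dictionary.
MALE_NAMES = {"Иван", "Сергей"}
FEMALE_NAMES = {"Маргарита", "Ольга"}

# Typical Russian surname endings (an assumption of this sketch, not a full rule set).
FEMALE_SURNAME_ENDINGS = ("ова", "ева", "ёва", "ина", "ая")
MALE_SURNAME_ENDINGS = ("ов", "ев", "ёв", "ин", "ий", "ой")


def guess_gender(first_name, surname=""):
    """Return 'm', 'f', or None when neither word gives a hint."""
    if first_name in MALE_NAMES:
        return "m"
    if first_name in FEMALE_NAMES:
        return "f"
    # Ambiguous or unknown first name: fall back to the surname ending.
    s = surname.lower()
    if s.endswith(FEMALE_SURNAME_ENDINGS):
        return "f"
    if s.endswith(MALE_SURNAME_ENDINGS):
        return "m"
    return None
```

    With this toy data, `guess_gender("Саша", "Петрова")` returns `"f"` and `guess_gender("Саша", "Петров")` returns `"m"`, while `guess_gender("Саша")` alone returns `None`, because the diminutive is used for both men and women.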

    Using a cunning declension construct:
    {Уважаем{ый|ая}}/{%ИМЯ% %ФАМИЛИЯ%}
    

    As a result, the output will be either “Уважаемый” or “Уважаемая” (“Dear”, masculine or feminine).

    The contents of the construct may be arbitrary. Let us explain using the example above:
    • Уважаем - this is the “root” of the declined word.
    • ый|ая - these are the possible endings, in the order {masculine | feminine | neuter | plural}. In the full version it may look like this: “ый|ая|ое|ые”
    • %ИМЯ% %ФАМИЛИЯ% - these are the tags from the address database (first name and surname) that make up the expression from which the gender is determined. You can use just %ИМЯ%, but then you risk “losing” some gender-ambiguous names such as Sasha, Vasya, Valya and so on.
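    The three parts described above can be sketched in Python as follows. This is only an illustration of the template idea, not our production code: `detect_gender` is a hypothetical stand-in for the real detector with a tiny lookup table, and the regular expressions assume the construct is written exactly as in the example.

```python
import re

# Hypothetical stand-in for the real gender detector; the lookup table is toy data.
KNOWN_NAMES = {"Иван": "m", "Маргарита": "f"}


def detect_gender(full_name):
    """Return 'm' or 'f' from the first recognised word; default to 'm'."""
    for word in full_name.split():
        if word in KNOWN_NAMES:
            return KNOWN_NAMES[word]
    return "m"


# {root{ending|ending|...}}/{expression used to detect the gender}
CONSTRUCT = re.compile(r"\{([^{}]*)\{([^{}]*)\}\}/\{([^{}]*)\}")
ENDING_INDEX = {"m": 0, "f": 1}  # order: masculine | feminine | neuter | plural


def fill_tags(text, subscriber):
    """Replace %TAG% placeholders with subscriber fields (empty if missing)."""
    return re.sub(r"%([^%{}]+)%", lambda m: subscriber.get(m.group(1), ""), text)


def expand(template, subscriber):
    """Expand every declension construct in the template for one subscriber."""
    def repl(match):
        root, endings, expr = match.groups()
        gender = detect_gender(fill_tags(expr, subscriber))
        options = endings.split("|")
        idx = ENDING_INDEX[gender]
        # Fall back to the first ending if the list is shorter than expected.
        return root + (options[idx] if idx < len(options) else options[0])
    return CONSTRUCT.sub(repl, template)
```

    For a subscriber `{"ИМЯ": "Маргарита", "ФАМИЛИЯ": "Петрова"}`, `expand("{Уважаем{ый|ая}}/{%ИМЯ% %ФАМИЛИЯ%}", ...)` yields “Уважаемая”; for `{"ИМЯ": "Иван", "ФАМИЛИЯ": "Петров"}` it yields “Уважаемый”.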


    Here is another construction example:
    {Дорог{ой|ая}}/{%ИМЯ% %ФАМИЛИЯ%} %ИМЯ{клиент}%
    

    => the result will be “Дорогой Иван” (“Dear Ivan”), “Дорогая Маргарита” (“Dear Margarita”), or “Dear client” if no name is specified
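    The fallback part of this construct, %ИМЯ{клиент}%, can be sketched in Python like this. The function name `fill_tags` and the regular expression are assumptions made for this illustration; the sketch only shows the "use the field, otherwise the default in braces" behaviour and leaves the declension part aside.

```python
import re

# %TAG% or %TAG{default}%: the optional part in braces is used when the field is empty.
TAG = re.compile(r"%([^%{}]+)(?:\{([^{}]*)\})?%")


def fill_tags(text, subscriber):
    """Substitute tags from the subscriber record, falling back to defaults."""
    def repl(match):
        name, default = match.group(1), match.group(2) or ""
        # A missing or empty field falls back to the default text.
        return subscriber.get(name) or default
    return TAG.sub(repl, text)
```

    With this sketch, `fill_tags("%ИМЯ{клиент}%", {"ИМЯ": "Иван"})` gives “Иван”, and the same call with an empty record gives “клиент”. Note that a template containing ordinary percent signs would need escaping, which this sketch does not handle.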

    For those who use our online editor, it is even easier: the control panel offers convenient links, opened by clicking the little-man icon at the top right, as shown in the screenshot below.
    [image: screenshot of the online editor control panel]

    Writing such a wrapper is really not difficult. The module is fast enough that gender detection and template substitution can run on the fly while newsletters are being sent, without any loss of speed. And the advantages for our customers are obvious:
    • No need to ask the subscriber to set the gender and store it in an additional field in the address database
    • No need to manually determine the gender of your subscribers by their names
    • Detection is even more accurate than manual processing (the script does not get tired by the second thousand names)


    So if you need to work with “live” text, declining words and phrases or handling numerals, Sergey's library will help you! Many thanks to him for it!

    PS Our service has exclusive rights to use this library among email marketing services in Russia. We will be glad if this feature proves really necessary and useful for our customers.
