How to make a racist AI without really trying

Original author: Robyn Speer
  • Translation
  • Tutorial
A cautionary lesson

Let's make a sentiment classifier!

Sentiment analysis is a very common task in natural language processing (NLP), and that's not surprising. It's important for businesses to know whether the opinions people express are positive or negative. Sentiment analysis is used to monitor social media and customer feedback, and even in algorithmic stock trading (with the result that bots buy shares of Berkshire Hathaway when Anne Hathaway gets good reviews for her latest film).

The approach is sometimes oversimplified, but it's one of the easiest ways to get measurable results from text. Just feed in text and you get positive and negative scores out, with no need to deal with parse trees, graphs, or any other complex representation.

That's what we'll do here. We'll follow the path of least resistance and build the simplest possible classifier, one that should look very familiar to anyone involved in current NLP work. For example, a model like this appears in the paper on Deep Averaging Networks (Iyyer et al., 2015). We aren't trying to challenge their results or criticize their model; we're just using a well-known way of representing words.

Work plan:

  • Use a typical way of representing words as vectors that capture their meanings.
  • Get training and test data from standard lists of positive and negative words.
  • Train a classifier, using gradient descent, to recognize other positive and negative words based on their vector representations.
  • Use this classifier to compute sentiment scores for sentences of text.
  • Behold the monster that we have created.

And that's how we'll see "how to make a racist AI without really trying." Of course, we can't leave the system in such a monstrous state, so after that we will:

  • Measure the problem statistically, so that we can track progress as we fix it.
  • Improve the data to get a semantic model that is both more accurate and less racist.

Software dependencies


This guide is written in Python and relies on a typical Python machine-learning stack: numpy and scipy for numeric computation, pandas for managing data, and scikit-learn for machine learning. Toward the end we also use matplotlib and seaborn for plotting.

In principle, scikit-learn could be replaced with TensorFlow or Keras or something similar, as they can also train a classifier with gradient descent. But we don't need their abstractions, because here training happens in a single step.

import numpy as np
import pandas as pd
import matplotlib
import seaborn
import re
import statsmodels.formula.api
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Configuration for displaying plots
%matplotlib inline
seaborn.set_context('notebook', rc={'figure.figsize': (10, 6)}, font_scale=1.5)

Step 1. Word vector representations


Word vectors (embeddings) are a common way to work with text input. Words become vectors in a multidimensional space, where nearby vectors represent similar meanings. With vector representations, you can compare words by (roughly) what they mean, not just by exact matches.

Training good word vectors requires hundreds of gigabytes of text. Fortunately, various research groups have already done this work and published pre-trained word vector models that are available for download.

The two best-known sets of pre-trained English word vectors are word2vec (trained on Google News text) and GloVe (trained on Common Crawl web pages). Either would give similar results; we'll take the GloVe model because it has a more transparent data source.

GloVe comes in three sizes, trained on 6 billion, 42 billion, and 840 billion tokens. The largest model is the most powerful, but it takes significant resources to work with. The 42-billion version is quite good, and its vocabulary is neatly trimmed to 1 million words. Since we're on the path of least resistance, we'll take the 42-billion version.

- Why is it so important to use a “well-known” model?

- I'm glad you asked, hypothetical reader! At each step we're trying to do something as typical as possible, and for better or worse, no single word-vector model has yet been settled on as the best. I hope this article encourages the use of modern, high-quality models, especially those that take algorithmic bias into account and try to correct it. But more on that later.

Download glove.42B.300d.zip from the GloVe site and extract it as data/glove.42B.300d.txt. Next, we define a function for reading the vectors in their simple text format.

def load_embeddings(filename):
    """
    Load a DataFrame from the simple text format used by word2vec, GloVe,
    fastText, and ConceptNet Numberbatch. Their main difference is whether
    there is an initial line giving the dimensions of the matrix.
    """
    labels = []
    rows = []
    with open(filename, encoding='utf-8') as infile:
        for i, line in enumerate(infile):
            items = line.rstrip().split(' ')
            if len(items) == 2:
                # This is a header row giving the shape of the matrix
                continue
            labels.append(items[0])
            values = np.array([float(x) for x in items[1:]], 'f')
            rows.append(values)
    arr = np.vstack(rows)
    return pd.DataFrame(arr, index=labels, dtype='f')
embeddings = load_embeddings('data/glove.42B.300d.txt')
embeddings.shape

(1917494, 300)
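To make "nearby vectors represent similar meanings" concrete, here is a minimal sketch (not part of the original walkthrough) that compares two words by the cosine similarity of their rows in the embeddings frame. The example words are just assumed to be in the GloVe vocabulary.

def cosine_similarity(word1, word2):
    # Look up the two word vectors and compare their directions
    v1 = embeddings.loc[word1]
    v2 = embeddings.loc[word2]
    return v1.dot(v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

cosine_similarity('cheerful', 'happy')    # related words: relatively high
cosine_similarity('cheerful', 'granite')  # unrelated words: much lower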

Step 2. A gold-standard sentiment lexicon


Now we need to know which words are considered positive and which negative. There are many sentiment lexicons, but we'll go with a very straightforward one: the lexicon of Hu and Liu (2004), which is also used in the Deep Averaging Networks paper.

We download the lexicon from Bing Liu's website and extract the data into data/positive-words.txt and data/negative-words.txt.

Next, we define a function for reading these files and load them into the variables pos_words and neg_words:

def load_lexicon(filename):
    """
    Load the Bing Liu sentiment lexicon
    (https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html)
    of English words in Latin-1 encoding.
    One file contains the list of positive words and the other the negative
    words. The files contain comment lines marked with ';' and blank lines,
    which should be skipped.
    """
    lexicon = []
    with open(filename, encoding='latin-1') as infile:
        for line in infile:
            line = line.rstrip()
            if line and not line.startswith(';'):
                lexicon.append(line)
    return lexicon
pos_words = load_lexicon('data/positive-words.txt')
neg_words = load_lexicon('data/negative-words.txt')

Step 3. Train a model to predict word sentiment


Using the lists of positive and negative words, we look up their vector representations with the Pandas .loc[] operation.

Some of the words are missing from the GloVe vocabulary, most often misspellings such as "fancinating". They show up as rows full of NaN, indicating a missing vector, so we remove them with .dropna(). We then build the input array (the vector representations) and the output array (1 for positive words, -1 for negative), and we keep the word labels attached to the vectors so that we can interpret the results.

pos_vectors = embeddings.loc[pos_words].dropna()
neg_vectors = embeddings.loc[neg_words].dropna()




vectors = pd.concat([pos_vectors, neg_vectors])
targets = np.array([1 for entry in pos_vectors.index] + [-1 for entry in neg_vectors.index])
labels = list(pos_vectors.index) + list(neg_vectors.index)


- Wait a minute. Some words are neither positive nor negative; they're neutral. Shouldn't there be a third class for neutral words?

- I think a neutral class would indeed be useful. Later we'll see what problems arise from assigning sentiment to neutral words. If we could reliably identify neutral words, it would be worth increasing the complexity of the classifier to three classes. But first we'd need a lexicon of neutral words, and Liu's lexicon has only positive and negative words.

I did try a version of this with 800 examples of neutral words and a high weight on predicting neutrality, but the end results were not very different from what you're about to see.

- How does this list distinguish between positive and negative words? Doesn't it depend on the context?

- Good question. General-purpose sentiment analysis isn't as simple as it seems, and the boundary is fairly arbitrary in places. In this list, "impudent" is marked as bad and "ambitious" as good. "Comical" is bad and "funny" is good. "Refund" is good, even though it usually comes up in a bad context, when you owe someone money or someone owes you.

Everyone understands that sentiment depends on context, but in a simple model we have to ignore context and hope that the average sentiment comes out right.

Using the train_test_split function, we simultaneously split the input vectors, output values, and labels into training and test data, holding out 10% for testing.

train_vectors, test_vectors, train_targets, test_targets, train_labels, test_labels = \
    train_test_split(vectors, targets, labels, test_size=0.1, random_state=0)

Now we create the classifier and run the vectors through 100 iterations of training. We use a logistic loss function, so that the resulting classifier can output the probability that a word is positive or negative.

model = SGDClassifier(loss='log', random_state=0, n_iter=100)
model.fit(train_vectors, train_targets)
SGDClassifier(alpha=0.0001, average=False, class_weight=None, epsilon=0.1,
       eta0=0.0, fit_intercept=True, l1_ratio=0.15,
       learning_rate='optimal', loss='log', n_iter=100, n_jobs=1,
       penalty='l2', power_t=0.5, random_state=0, shuffle=True, verbose=0,
       warm_start=False)

We evaluate the classifier on the test vectors; it shows 95% accuracy. Not bad. Next, we define a function that predicts the sentiment of given words, and apply it to some examples from the test data.

accuracy_score(model.predict(test_vectors), test_targets)
0.95022624434389136




def vecs_to_sentiment(vecs):
    # predict_log_proba gives the log-probability of each class
    predictions = model.predict_log_proba(vecs)
    # To reduce the positive and negative classifications to a single number,
    # subtract the log-probability of negative sentiment from the positive one.
    return predictions[:, 1] - predictions[:, 0]

def words_to_sentiment(words):
    vecs = embeddings.loc[words].dropna()
    log_odds = vecs_to_sentiment(vecs)
    return pd.DataFrame({'sentiment': log_odds}, index=vecs.index)

# Show 20 examples from the test set
words_to_sentiment(test_labels).ix[:20]

                 sentiment
fidget           -9.931679
interrupt        -9.634706
bravely           1.466919
imaginary        -2.989215
taxation          0.468522
world-famous      6.908561
inexpensive       9.237223
disappointment   -8.737182
totalitarian    -10.851580
warlike          -8.328674
freezes          -8.456981
sin              -7.839670
fragile          -4.018289
fooled           -4.309344
unsolved         -2.816172
cleverly          2.339609
demonizes        -2.102152
carefree          8.747150
unpopular        -7.887475
sympathize        1.790899

We can see that the classifier works: it has learned to generalize sentiment to words outside its training data.

Step 4. Get a sentiment score for text


There are many ways to combine per-word sentiments into an overall score. Again, we follow the path of least resistance and just take the average.

import re
TOKEN_RE = re.compile(r"\w.*?\b")
# The regex above finds tokens that start with a word character (\w) and
# continue matching characters (.*?) until a word boundary (\b). It's a
# relatively simple expression for extracting words from text.

def text_to_sentiment(text):
    tokens = [token.casefold() for token in TOKEN_RE.findall(text)]
    sentiments = words_to_sentiment(tokens)
    return sentiments['sentiment'].mean()

There is a lot here that could be improved:

  • Weight words inversely to their frequency, so that common words like prepositions don't dominate the sentiment.
  • Damp the scores so that short sentences don't end up with extreme sentiment values.
  • Count phrases, not just single words.
  • Use a more robust word-segmentation algorithm that isn't thrown off by apostrophes.
  • Account for negations such as "not satisfied" (a rough sketch of this appears after the example sentences below).

But all of that would require extra code without fundamentally changing the results. At least we can now roughly compare different sentences:

text_to_sentiment("this example is pretty cool")
3.889968926086298

text_to_sentiment("this example is okay")
2.7997773492425186

text_to_sentiment("meh, this example sucks")
-1.1774475917460698
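As promised above, here is an illustrative sketch, not from the original article, of a crude way to handle negations: flip the sign of a word's score when it follows a negation token. It reuses TOKEN_RE and words_to_sentiment defined earlier; the NEGATIONS set is just an assumption.

NEGATIONS = {'not', 'no', 'never', "n't"}

def text_to_sentiment_with_negation(text):
    tokens = [token.casefold() for token in TOKEN_RE.findall(text)]
    # Map each known word to its sentiment score
    word_scores = words_to_sentiment(tokens)['sentiment'].to_dict()
    scores = []
    negated = False
    for token in tokens:
        if token in NEGATIONS:
            negated = True
            continue
        if token in word_scores:
            score = word_scores[token]
            # Flip the score of the word right after a negation
            scores.append(-score if negated else score)
        negated = False
    return np.mean(scores) if scores else 0.0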

Step 5. Behold the monster we created


Not every sentence carries clear sentiment. Let's see what happens with some neutral sentences:

text_to_sentiment("Let's go get Italian food")
2.0429166109408983

text_to_sentiment("Let's go get Chinese food")
1.4094033658140972

text_to_sentiment("Let's go get Mexican food")
0.38801985560121732

I had already run into this phenomenon when analyzing restaurant reviews using word embeddings: for no apparent reason, all the Mexican restaurants ended up with lower ratings.

Word vectors capture subtle differences in meaning from context. Because of that, they also reflect the prejudices of our society.

Here are some more neutral sentences:

text_to_sentiment("My name is Emily")
2.2286179364745311

text_to_sentiment("My name is Heather")
1.3976291151079159

text_to_sentiment("My name is Yvette")
0.98463802132985556

text_to_sentiment("My name is Shaniqua")
-0.47048131775890656

Well, damn ...

The system has associated wildly different sentiments with people's names. Looking at these and many other examples, you can see that sentiment is usually higher for stereotypically white names and lower for stereotypically black names.

This is the kind of test used by Caliskan, Bryson, and Narayanan in their paper published in Science in April 2017, which shows that the semantics learned from language corpora contain the prejudices of society. We'll use their approach here.

Step 6. Assessing the problem


We want to understand how to avoid mistakes like this. Let's feed more data through the classifier and statistically measure how biased it is.

Here we have four lists of names that reflect different ethnic backgrounds, mostly in a US context. The first two are lists of predominantly "white" and "black" names, adapted from the appendix of Caliskan et al. I also added Hispanic names, as well as Muslim names from Arabic and Urdu.

This data is used to test for bias during the ConceptNet build process; it can be found in the module conceptnet5.vectors.evaluation.bias. There is an idea to expand it to other ethnic backgrounds, which may require looking at surnames as well as given names.

Here are the lists:

NAMES_BY_ETHNICITY = {
    # The first two lists are from the appendix of the Caliskan et al. paper.
    'White': [
        'Adam', 'Chip', 'Harry', 'Josh', 'Roger', 'Alan', 'Frank', 'Ian', 'Justin',
        'Ryan', 'Andrew', 'Fred', 'Jack', 'Matthew', 'Stephen', 'Brad', 'Greg', 'Jed',
        'Paul', 'Todd', 'Brandon', 'Hank', 'Jonathan', 'Peter', 'Wilbur', 'Amanda',
        'Courtney', 'Heather', 'Melanie', 'Sara', 'Amber', 'Crystal', 'Katie',
        'Meredith', 'Shannon', 'Betsy', 'Donna', 'Kristin', 'Nancy', 'Stephanie',
        'Bobbie-Sue', 'Ellen', 'Lauren', 'Peggy', 'Sue-Ellen', 'Colleen', 'Emily',
        'Megan', 'Rachel', 'Wendy'
    ],
    'Black': [
        'Alonzo', 'Jamel', 'Lerone', 'Percell', 'Theo', 'Alphonse', 'Jerome',
        'Leroy', 'Rasaan', 'Torrance', 'Darnell', 'Lamar', 'Lionel', 'Rashaun',
        'Tyree', 'Deion', 'Lamont', 'Malik', 'Terrence', 'Tyrone', 'Everol',
        'Lavon', 'Marcellus', 'Terryl', 'Wardell', 'Aiesha', 'Lashelle', 'Nichelle',
        'Shereen', 'Temeka', 'Ebony', 'Latisha', 'Shaniqua', 'Tameisha', 'Teretha',
        'Jasmine', 'Latonya', 'Shanise', 'Tanisha', 'Tia', 'Lakisha', 'Latoya',
        'Sharise', 'Tashika', 'Yolanda', 'Lashandra', 'Malika', 'Shavonn',
        'Tawanda', 'Yvette'
    ],
    # The list of Hispanic names is compiled from US census data.
    'Hispanic': [
        'Juan', 'José', 'Miguel', 'Luís', 'Jorge', 'Santiago', 'Matías', 'Sebastián',
        'Mateo', 'Nicolás', 'Alejandro', 'Samuel', 'Diego', 'Daniel', 'Tomás',
        'Juana', 'Ana', 'Luisa', 'María', 'Elena', 'Sofía', 'Isabella', 'Valentina',
        'Camila', 'Valeria', 'Ximena', 'Luciana', 'Mariana', 'Victoria', 'Martina'
    ],
    # The following list conflates religion and ethnicity, I know. So do
    # the names themselves.
    #
    # It was compiled from baby-name sites for Muslim parents, with the names
    # in their English spellings. I didn't draw a line between Arabic, Urdu,
    # and other languages.
    #
    # I'd be glad to update the list with more authoritative data.
    'Arab/Muslim': [
        'Mohammed', 'Omar', 'Ahmed', 'Ali', 'Youssef', 'Abdullah', 'Yasin', 'Hamza',
        'Ayaan', 'Syed', 'Rishaan', 'Samar', 'Ahmad', 'Zikri', 'Rayyan', 'Mariam',
        'Jana', 'Malak', 'Salma', 'Nour', 'Lian', 'Fatima', 'Ayesha', 'Zahra', 'Sana',
        'Zara', 'Alya', 'Shaista', 'Zoya', 'Yasmin'
    ]
}

With the help of Pandas, we build a table of names, their predominant ethnic background, and their sentiment score:

def name_sentiment_table():
    frames = []
    for group, name_list in sorted(NAMES_BY_ETHNICITY.items()):
        lower_names = [name.lower() for name in name_list]
        sentiments = words_to_sentiment(lower_names)
        sentiments['group'] = group
        frames.append(sentiments)
    # Combine the data from all the groups into one big table
    return pd.concat(frames)
name_sentiments = name_sentiment_table()

Sample data:

name_sentiments.ix[::25]
           sentiment        group
mohammed    0.834974  Arab/Muslim
alya        3.916803  Arab/Muslim
terryl     -2.858010        Black
josé        0.432956     Hispanic
luciana     1.086073     Hispanic
hank        0.391858        White
megan       2.158679        White

Let's plot the distribution of sentiment for each group of names:

plot = seaborn.swarmplot(x='group', y='sentiment', data=name_sentiments)
plot.set_ylim([-10, 10])

(-10, 10)



Or as a bar plot with 95% confidence intervals on the group means:

plot = seaborn.barplot(x='group', y='sentiment', data=name_sentiments, capsize=.1)



Finally, we bring in statsmodels, a serious statistics package, to measure how large the bias effect is (along with a bunch of other statistics).
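The code that fits the regression is not shown at this point; it is presumably the same call used later inside retrain_model:

ols_model = statsmodels.formula.api.ols('sentiment ~ group', data=name_sentiments).fit()
ols_model.summary()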


                            OLS Regression Results
==============================================================================
Dep. Variable:              sentiment   R-squared:                       0.208
Model:                            OLS   Adj. R-squared:                  0.192
Method:                 Least Squares   F-statistic:                     13.04
Date:                Thu, 13 Jul 2017   Prob (F-statistic):           1.31e-07
Time:                        11:31:17   Log-Likelihood:                -356.78
No. Observations:                 153   AIC:                             721.6
Df Residuals:                     149   BIC:                             733.7
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================

The F-statistic is the ratio of the variation between groups to the variation within groups, and we can take it as an overall measure of the bias.

Right below it is the probability of seeing an F-statistic at least this large under the null hypothesis, that is, if there were no real difference between the groups being compared. That probability is very, very low. In a scientific paper we would call the result "highly statistically significant."

This F-value is the number we want to improve: the lower, the better.

ols_model.fvalue
13.041597745167659


Step 7. Try other data.


Now we have a way to measure the model's harmful bias numerically. Let's try to reduce it. To do that, we'll need to repeat a bunch of things that so far were just separate steps in a Python notebook.

If I were writing good, maintainable code, I wouldn't be using global variables such as model and embeddings. But the current spaghetti code lets us examine each step and understand what's going on. We reuse parts of the code above and at least define a function for repeating the steps:

def retrain_model(new_embs):
    """
    Repeat the steps above with a new set of word vectors.
    """
    global model, embeddings, name_sentiments
    embeddings = new_embs
    pos_vectors = embeddings.loc[pos_words].dropna()
    neg_vectors = embeddings.loc[neg_words].dropna()
    vectors = pd.concat([pos_vectors, neg_vectors])
    targets = np.array([1 for entry in pos_vectors.index] + [-1 for entry in neg_vectors.index])
    labels = list(pos_vectors.index) + list(neg_vectors.index)
    train_vectors, test_vectors, train_targets, test_targets, train_labels, test_labels = \
        train_test_split(vectors, targets, labels, test_size=0.1, random_state=0)
    model = SGDClassifier(loss='log', random_state=0, n_iter=100)
    model.fit(train_vectors, train_targets)
    accuracy = accuracy_score(model.predict(test_vectors), test_targets)
    print("Accuracy of sentiment: {:.2%}".format(accuracy))
    name_sentiments = name_sentiment_table()
    ols_model = statsmodels.formula.api.ols('sentiment ~ group', data=name_sentiments).fit()
    print("F-value of bias: {:.3f}".format(ols_model.fvalue))
    print("Probability given null hypothesis: {:.3}".format(ols_model.f_pvalue))
    # Plot the results on a chart with a consistent y-axis
    plot = seaborn.swarmplot(x='group', y='sentiment', data=name_sentiments)
    plot.set_ylim([-10, 10])

We try word2vec


One might assume that the problem is specific to GloVe. The Common Crawl corpus probably contains plenty of dubious sites, plus at least 20 copies of Urban Dictionary. Maybe things would be better on a different corpus: how about good old word2vec, trained on Google News?

The most authoritative source for the word2vec data seems to be this file on Google Drive. Download it and save it as data/word2vec-googlenews-300.bin.gz.

# Use a ConceptNet function to load word2vec into a Pandas frame from its binary format
from conceptnet5.vectors.formats import load_word2vec_bin
w2v = load_word2vec_bin('data/word2vec-googlenews-300.bin.gz', nrows=2000000)
# The word2vec model is case-sensitive
w2v.index = [label.casefold() for label in w2v.index]
# Remove the less frequent of the resulting duplicates
w2v = w2v.reset_index().drop_duplicates(subset='index', keep='first').set_index('index')
retrain_model(w2v)

Accuracy of sentiment: 94.30%
F-value of bias: 15.573
Probability given null hypothesis: 7.43e-09


So word2vec turns out to be even more biased, with an F-value above 15.

In retrospect, it was probably naive to expect news text to be better protected from bias.

We try ConceptNet Numberbatch


Finally, I get to talk about my own word-embedding project.

ConceptNet, the knowledge graph I work on, comes with its own word embeddings. When those embeddings are built, a normalization step identifies and removes some sources of algorithmic racism and sexism. This way of correcting bias is based on the paper "Debiasing Word Embeddings" by Bolukbasi et al., generalized to address several kinds of bias at once. As far as I know, it's the only semantic system with anything like this built in.

From time to time we export precomputed vectors from ConceptNet; these releases are called ConceptNet Numberbatch. The April 2017 release was the first to include this bias correction, so we'll load its English-only vectors and retrain our model.

We download numberbatch-en-17.04b.txt.gz, save it in the data/ directory, and retrain the model:

retrain_model(load_embeddings('data/numberbatch-en-17.04b.txt'))

Accuracy of sentiment: 97.46%
F-value of bias: 3.805
Probability given null hypothesis: 0.0118




So did ConceptNet Numberbatch completely eliminate the problem? Is the algorithmic racism gone? No.

Has the racism been reduced a lot? Definitely.

The sentiment ranges for the ethnic groups overlap much more than they did with the GloVe or word2vec vectors. Compared to GloVe, the F-value dropped by more than a factor of three, and compared to word2vec, by more than a factor of four. In general, we see much smaller differences in sentiment across names, which is how it should be: names really shouldn't affect the result of sentiment analysis.

But a slight correlation remains. I could probably choose data and training parameters that make the problem look solved, but that would be the wrong move, because in fact the problem would remain: in ConceptNet we have identified and compensated for only some of the causes of algorithmic racism. Still, it's a good start.

No pitfalls


Note that with the switch to ConceptNet Numberbatch, the accuracy of sentiment prediction actually improved.

Some might have expected that correcting algorithmic racism would make the results worse in some other way. It doesn't. You can have data that is both better and less racist; the data genuinely improves with this correction. The racism that word2vec and GloVe pick up from people contributes nothing to the accuracy of the algorithm.

Other approaches


Of course, this is only one way to do sentiment analysis, and the details could be implemented differently.

Instead of, or in addition to, changing the word vectors, you could try to fix the problem directly in the output: for example, refuse to assign sentiment scores to names and groups of people at all. A hypothetical sketch of that idea follows.
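This sketch is not from the original article: it simply drops person names from the text before scoring, reusing the NAMES_BY_ETHNICITY lists defined earlier. A real system would want a proper named-entity recognizer rather than a fixed list.

# Hypothetical: treat every name from the lists above as a word to ignore
ALL_NAMES = {name.lower() for names in NAMES_BY_ETHNICITY.values() for name in names}

def text_to_sentiment_ignoring_names(text):
    tokens = [token.casefold() for token in TOKEN_RE.findall(text)]
    # Leave names out of the average entirely
    tokens = [token for token in tokens if token not in ALL_NAMES]
    sentiments = words_to_sentiment(tokens)
    return sentiments['sentiment'].mean()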

Another option is to stop computing sentiment for arbitrary words altogether and count only the words that appear in the list. This is probably the most common form of sentiment analysis, the kind with no machine learning at all. Its results will be no more biased than the author of the list. But giving up machine learning means lower recall, and the only way to adapt the model to a data set is to edit the list by hand. A minimal sketch of this approach follows.
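A minimal sketch of that lexicon-only approach, assuming we just reuse pos_words and neg_words from Step 2; words that are in neither list simply don't count.

POS_SET = set(pos_words)
NEG_SET = set(neg_words)

def lexicon_only_sentiment(text):
    tokens = [token.casefold() for token in TOKEN_RE.findall(text)]
    # +1 for each positive word, -1 for each negative word, ignore the rest
    hits = [1 if token in POS_SET else -1
            for token in tokens if token in POS_SET or token in NEG_SET]
    return sum(hits) / len(hits) if hits else 0.0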

As a hybrid approach, you could produce a large number of inferred sentiment scores for words and have a person patiently edit them, building a list of exceptions whose sentiment should be set to zero. The downside is the extra work; the upside is that you actually see what your model is doing. I think that's something worth aiming for in any case.
