
Valid data on sturgeon freshness

The end of 2012 turned out to be generous with projects involving foreign partners. Beyond expanding the geography of our portfolio, working with them allowed me to rethink some of our habitual approaches and techniques.
In usability tests, we record numerical indicators. We used to log task time, success rate, and problem frequency; formulas were built on top of these data in the hope of deriving some integral indicator of problem criticality. Some even dreamed of finding (or inventing) a universal usability metric applicable to any system.
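To make the idea concrete, here is a minimal sketch of what such an "integral criticality" formula might look like. The weights, the aggregation rule, and the function name are all invented for illustration; the text does not specify the actual formula we used.

```python
# Hypothetical sketch of an "integral criticality" formula of the kind the
# text describes. The weights and the aggregation rule are invented for
# illustration only, not the actual formula used in our reports.

def criticality(frequency, success_rate, avg_time_s, expected_time_s):
    """Combine per-problem indicators into a single 0..1 score.

    frequency       - share of participants who hit the problem (0..1)
    success_rate    - share of participants who still completed the task (0..1)
    avg_time_s      - mean time spent on the task, in seconds
    expected_time_s - time an unhindered user would need, in seconds
    """
    # Relative slowdown, clamped to 0..1 so one slow session cannot dominate.
    time_penalty = min(1.0, max(0.0, (avg_time_s - expected_time_s) / expected_time_s))
    # Arbitrary weights: frequency counts most, then failures, then slowdown.
    return 0.5 * frequency + 0.3 * (1.0 - success_rate) + 0.2 * time_penalty

# Example: 3 of 8 participants hit the problem, 6 of 8 finished the task,
# and the task took 90 s on average against an expected 60 s.
score = criticality(frequency=3/8, success_rate=6/8, avg_time_s=90, expected_time_s=60)
print(score)
```

The trouble the article goes on to describe is visible even in this toy: the number depends entirely on the invented weights, so with a sample of eight people a shift of one participant moves the "criticality" more than the problem itself does.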
At some point we stopped recording time: too many conditions made this indicator unreliable. For example, in think-aloud tests the method itself introduces a fair margin of error: as long as a person explains everything that crosses his mind, a lot of time passes, and everyone thinks aloud at a different pace.
Nevertheless, we clung to the mathematics. And in the name of mathematics we strictly followed the test script: if you need to calculate an average or a percentage, every respondent must go through every step of the script, regardless of how well that corresponds to the actual user experience.
Sometimes I felt simply ridiculous: it happened that a participant had already reached the goal by some unorthodox but perfectly effective route, and instead of congratulating him on his victory, I had to beg him to follow the path prescribed in the script. "Well, you found the right song / changed the tariff / saved the contract, but let's try to do it differently," I would ask, purely for the sake of data validity. Fortunately, no one ever told me where to go in response to such a request.
World harmony and the tear of a child
One of the foreign partners who ordered testing from us sent a sample report that contained practically no numerical data. There was only one table, with the list of test participants. Problem criticality was indicated, but it was derived not by formula but by expert judgment. Probably out of ingrained habit, I nevertheless prepared our standard tabular summary of the test results, although I cut it down considerably, to a ranking of problems by frequency. "Thank you," the partner wrote to me, "but better not show this data to the gas company. We know that a problem is a problem, and that with a sample like this it doesn't matter whether 2 or 3 people ran into it. But in a large corporation these numbers may become grounds for dictating what is important and what is not."
In fact, the colleague (it should be clarified here that the customer in this case was itself a usability company; we were their local contractor) had voiced to me exactly what our own company preaches: conduct qualitative, not quantitative, research. That is, the main thing is to identify the problem, not to measure and weigh it. The colleague was right about something else too: if you show the business indicators that are too low or too high, the most obvious course of action will seem to be changing those indicators, rather than solving the underlying problem.

Comic: xkcd.com
In September 2012 our company hosted the UX Masterclass in Moscow. Among the speakers was Gavin Lee of User Centric, who, among other things, defended the often-disputed idea that usability can be measured. As an example, he showed a poorly designed dialog box from a medical application and asked the audience to estimate how long it would take to answer it, and what the probability of an error was.

Of course, everything can be measured. Including the cost of a patient who died as the result of an error, and even whether all of world harmony is worth the tear of one tortured child.
Of course, here I will rightly be reminded of the scarcity of resources available for fixing an interface; in my opinion, measurements are taken precisely to distribute those resources, not for the sake of the interface itself. I have no answer to this objection. I cannot say whether it is worth spending more or less on oncology than on cardiovascular disease on the grounds that cancer patients suffer more, or that there are more heart patients.
Can the truth be learned from invalid data?
Another foreign client was represented by a whole team: a developer, a manager, and a UX specialist. When discussing the work, the developer told us sternly, almost rudely: "You people in this industry don't understand a damn thing, so give us the raw data, without any assessments, rankings, or conclusions. We will figure it out ourselves." This client attended every test and actively participated in them, for example, by asking mid-test to change the conditions of a task. Our UX colleague sighed: no valid data remained. I almost flinched when I heard this, because the wording repeated almost verbatim the words I had heard at the very beginning of my own path in usability.
But the developer's rudeness stirred more positive feelings in me than the colleague's sigh. It was evident that this was a person rooting for his product and wanting to make it ideal. He did not care how many times this or that problem appeared: if there is a problem, the product needs to be improved. Criticality scores, it seems, interested him as little as the exact degree of a sturgeon's freshness interests a demanding cook. It should be fresh, and that's it. The accuracy with which you measure rottenness does not matter at all when a dinner party is at stake.
I have no doubt that the developer set priorities for himself and perhaps even used numbers. After all, in an expert review (perhaps the second most common method after usability testing) the expert determines criticality himself, and no one reproaches that with invalidity. The more problems we identify, even minor or entirely accidental ones, the more complete a vision of the product we will have.
There was one more foreign voice. True, its owner was neither our client nor our partner, but he is an authority in UX: Jared Spool. First, he spoke of "death by a thousand cuts," when a multitude of seemingly insignificant problems adds up to serious dissatisfaction with a product. Second, he criticized the rigidly scripted approach to testing because it yields the result we ourselves programmed, rather than real information about the user experience.
Instead, Spool proposes improvisation: not following the script rigidly, but adapting to the user's expectations and goals. Graphs and charts can, in that case, go to Woland; in my opinion, in most cases that is exactly where they belong. It is precisely when looking at graphs and diagrams that the focus of our discipline blurs: it begins to seem that the problem lies where an indicator rose or fell past a certain mark, and not where the user, instead of doing his actual work, is forced to wage war with the interface. This sacrifice (the impossibility of collecting valid data) is worth it if in return we learn the truth about the real user experience.
Posted by Anton Alyabyev, Analyst, UIDesign Group.
PS By the way, this week our company turns 10. There is something to tell, so we decided to describe the entire history of UIDesign Group (from moonlighting on weekends to entering the international market) in a series of articles that we will post on our "regular" blog (not on the hub, because it's not quite the right format). If anyone is interested, you can read the first article at this link.