
What do grades at a technical university depend on?
Hi Habr!
A while ago I got curious: how fair is the grading system at most of our technical universities? And what actually affects a student's grade?
After all, it often happens that a student who attended all semester, took lecture notes and handed in the labs on time gets a "satisfactory" on the exam, while a lucky slacker walks away with a five.
How random is all this? Do you get a random grade no matter how you study, or not? And if you are a pretty girl in a short miniskirt, what are your chances compared to the guys? (Purely a figure of speech, no sexism intended.)
Under the cut you will find the results of my study, in which I tried to answer these and a few other questions. The guinea pigs were several thousand students of my home university, Bauman Moscow State Technical University (MSTU).
I apologize in advance if I chose the wrong hub, or if this should not have been posted on Habr at all. But I wanted to share.
I should also say right away that I do not claim to be rigorously scientific, and I will be glad to have my inaccuracies, poor choices of criteria and other mistakes pointed out, because statistics and sociology are not my profession but rather a small hobby.
Here I present only some of the hypotheses I tested, so as not to bloat the article. In the course of the study I also built a regression model that predicts a student's grade from various parameters of the student, but I will not cram it in here either; the article is long enough as it is. Maybe next time, if anyone is interested.
So, let's go.
Factors
The factors whose influence I studied are, of course, not exhaustive, and they do not claim to cover the whole range of reasons why a student might get one grade or another. So why these factors? Because they could be obtained from official documents without resorting to labor-intensive methods such as questionnaires and interviews, for which I had neither the time nor the resources.
So, the list of factors:
1) Sex of the student
2) Faculty
3) Whether the subject is taught by the student's own department
4) Semester
5) Attendance
Sample
For all factors except the last one (attendance), I took the data from the Electronic University system (hereinafter "EU"), which stores the results of all exam sessions since 2007. The sample therefore coincides with the whole population, which makes it representative automatically.
Attendance is harder. Ideally it should also be entered into the EU, though only for the first two years of study. But after talking to a friend of the deputy dean it became completely clear that this cannot be counted on: the data is either not entered at all or entered any which way. Which is sad.
However, attendance is such an important factor that leaving it out was not an option. I had to dig the information out of the attendance journals, which are kept by the group monitors and handed in to the dean's office at the end of the semester. Since these journals do not exist in electronic form, I had to scan several hundred pages and then process them by hand.
Having estimated how much time it would take to process the data for the whole university, I decided to limit myself to a single faculty (the one dear to me), "Computer Science and Control Systems" (hereinafter "IU"), and only to the last academic year.
Of course, the journals do not reflect reality 100%, because the monitors fill them in themselves, and they can make mistakes or cover for their classmates by "not noticing" absences. But there is nothing better available. At least the proportions are preserved: the "nerd" will still have no absences in the journal, and the slacker, even if half of his absence marks are left out, will still have the worst attendance.
The "IU" faculty has 2,931 students. The number of results is 1,550. According to Paniotto and Maksimenko, the sample is representative.
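For a rough feel of what these two numbers mean, here is a minimal sketch of the textbook margin of error for a proportion with a finite population correction. This is not the Paniotto and Maksimenko procedure itself. It rests on two assumptions: that the 1,550 results can be treated as a simple random sample of the 2,931 students (the journals were not drawn at random), and the worst-case proportion of 0.5. The function name is mine.

```python
import math

def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction.

    Assumes simple random sampling; z=1.96 corresponds to ~95% confidence.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # The correction shrinks the error when the sample is a large share
    # of the whole population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

# Numbers from the article: 2,931 students, 1,550 results.
moe = margin_of_error(sample_size=1550, population_size=2931)
print(f"margin of error: {moe:.2%}")
```

Under these assumptions the error comes out a little under two percentage points, so sheer sample size is hardly the weak spot here; the accuracy of the journals matters more.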
Data processing
Of course, there are many specialized statistics packages, but I know them only by hearsay and had no strong desire to get to know them better.
I wanted not only to compute the coefficients and record them, but also to load the student data into a database, so that later I could add to it and keep a history, test other hypotheses that were not part of the original study, build nice graphs with flexible settings, and so on.
Besides, I wanted to build something of my own, tailored 100% to my task.
Since I work at a 1C franchisee, 1C is the development tool I know best. So the choice was obvious (and among tools for building a database with a web interface, 1C is the only one I know thoroughly, so the choice was doubly obvious).
I downloaded the data from the EU as an HTML page, wrote a parser and loaded the data into my database.
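My parser was written in 1C, but the idea is easy to sketch in Python with only the standard library. Everything about the page layout below is an assumption made for illustration: the real EU page is not shown here, and the column names and the sample rows are made up.

```python
from html.parser import HTMLParser

class GradeTableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical page fragment: the real layout and columns are unknown.
SAMPLE_HTML = """
<table>
  <tr><th>student</th><th>subject</th><th>semester</th><th>grade</th></tr>
  <tr><td>A-001</td><td>Physics</td><td>1</td><td>4</td></tr>
  <tr><td>A-002</td><td>Physics</td><td>1</td><td>5</td></tr>
</table>
"""

def parse_grades(html):
    """Return one dict per data row, keyed by the header row."""
    parser = GradeTableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]

records = parse_grades(SAMPLE_HTML)
print(records)
```

Each resulting dict can then be written to whatever database is at hand.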
The attendance data (from the journals) had to be entered by hand, although at first I honestly tried to write a program to recognize the data from the journal scans. But the image quality was poor, so the program did not speed things up.
Finally, having entered all the data and implemented in my database all the algorithms needed to test the statistical hypotheses, I, trembling with impatience, started building graphs and computing coefficients. And here is what came out...
Results
So, the most interesting part: the results my program produced.
I tested the hypotheses with the chi-square test and measured the strength of association with the Spearman coefficient. Tables of numbers are probably not very interesting to readers, so here I will show only graphs. If anyone is really interested, I will publish the numbers.
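For readers who have not met these two tools, here is a minimal pure-Python sketch of both: the Pearson chi-square statistic for a contingency table and the Spearman coefficient computed as the correlation of ranks. It is an illustration, not my 1C code, and all the numbers in it are made up.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the two rank sequences."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up 2x2 table: rows are two groups, columns are "low" / "high" grades.
demo_table = [[10, 20], [20, 10]]
statistic, dof = chi_square(demo_table)
# With 1 degree of freedom the 5% critical value is about 3.84.
print(f"chi-square = {statistic:.2f}, dof = {dof}")

# Made-up attendance fractions and grades for four students.
rho = spearman([0.2, 0.5, 0.9, 1.0], [3, 3, 4, 5])
print(f"Spearman rho = {rho:.2f}")
```

The chi-square test only says whether a relationship exists; a coefficient such as Spearman's is needed to say how strong it is.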
1) The first factor, as you recall, is the student's gender

Wow, the girls leave the guys no chance! Probably because they try harder and attend classes more? Let's look at attendance:

Judging by this graph, attendance hardly depends on gender (the calculations confirm this). So girls get higher grades with the same attendance as the guys.
2) Faculty
It is widely believed among techies that there are "easy" faculties, such as economics or the humanities, and "hard" ones, such as engineering.

The graph shows only two faculties: the one with the highest average grade (over all years of study) and the one with the lowest. Law is the Faculty of Law (yes, Baumanka has one), OE is optoelectronics. As the graph shows, the differences are substantial.

However, if you plot a histogram of the average grade for all faculties, the difference is not so obvious.
3) Whether the subject is taught by the student's own department.
For subjects taught by the student's own department (the "cathedral" subjects), the student's reputation can play a role; a teacher from another department is unlikely to know it. It is believed that the home department treats its students more leniently. In the senior years there are almost no subjects from other departments.

4) Semester.
As they say, the main thing is to survive the first two years, that is, four semesters, and the graph generally confirms this.

5) One of the most interesting factors: attendance
Attendance was measured on a scale from 0 to 1, where 1 means 100% attendance at all lectures and seminars and 0 means not a single lecture or seminar attended.

Everything seems to fit: the better the attendance, the higher the grade (the only thing I cannot explain is the dip at 20% attendance, other than as a quirk of the sample). But now we need to check whether this relationship is equally strong in all years of study.
The strength of association was measured with the Cramér coefficient.
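As a reminder of what this coefficient is: Cramér's V rescales the chi-square statistic to the range from 0 (no association) to 1 (complete association), V = sqrt(chi2 / (n * (min(rows, columns) - 1))). A minimal sketch follows; the two tables are made-up illustrations, not my data.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (total * (smaller_dim - 1)))

# Made-up tables: rows are "low" / "high" attendance, columns are "low" / "high" grades.
moderate = [[10, 20], [20, 10]]
complete = [[10, 0], [0, 10]]
print(f"moderate association: V = {cramers_v(moderate):.2f}")
print(f"complete association: V = {cramers_v(complete):.2f}")
```

Because V does not grow with the sample size, it can be compared across years of study with different numbers of students.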

As we can see, the strength of association decreases in the senior years and is at its maximum in the junior years. I grouped the years in pairs because that gives a more linear graph, and also because the teaching system in years 1-2 differs greatly from that in years 3-6.
That is, attendance really matters only in the first two years, where the so-called "modular system" is used: grades there depend less on the exam and more on the student's performance and attendance during the semester.
Conclusions
One cannot help noticing the strangely higher grades of girls at the same level of attendance as guys. But perhaps the grades are not inflated and the girls are simply smarter? After all, I did not measure the mental abilities of the students (and is that even possible?).
It is also worth noting that attendance strongly affects the grade only in the first two years.
A picture emerges that confirms the prevailing opinion (at least at MSTU): starting from year 3-4, students begin to neglect their studies and take jobs at various companies, in their field or not. Attendance falls, and in the senior years more and more grades are given "for nothing", because the teachers are from the students' own department and know them, while the quality of knowledge... However, that is a topic for a separate study.
Whether this picture is normal or not is for everyone to decide. It matches my personal experience and observations. And frankly, it does not inspire positive emotions.
I hope someone found this article useful or interesting. Thanks for your attention.