hose314 January 25, 2017 at 17:00

An attempt to teach students to program, and where this process can and should be automated


This quote by Steve Jobs greets visitors to cs.betlabs.ru; it can be seen as an attempt to motivate students to work harder on their homework and lab assignments. Unfortunately, I have no metrics for quantifying how motivation from a teacher affects student performance. Moreover, I believe that a competitive environment within the study group is a much more important factor in overall performance. For now this is just a hypothesis, and testing it lies outside my research interests.

Background

Like any graduate student, I am required by the curriculum to spend 100 hours teaching at the university. Specifically, I was assigned to run laboratory classes for the courses “Computer Science and the Basics of Algorithmization” and “Object-Oriented Programming” in C# for first-year business informatics students. It is important to note that the vast majority of these students believed, and perhaps still believe, that they will not need programming in their current or future work.

First year

In the first semester of my first year of teaching, I had two study groups with a total of 58 students. My tasks were to run tests, check individual homework, and give a semester mark on a 5-point scale. The mark I give is not final; the final mark is determined at the lecturer's exam.

The workload was heavy and there was a lot to do, so I very often heard students remark that they would never need this subject in life. I quite often explained that one's cognitive abilities need training and that programming is an excellent tool for preparing the brain for systematic and, at the same time, creative thinking. I do not think my words resonated with most of them, but it seems to me that a number of students thought it over and found programming useful for themselves.

Software and Services

Google Sheets for keeping records of homework and test results
Google Forms for recording test results and collecting feedback
BitLy.com, a link-shortening service

How tests and homework submission were handled

I was only just getting to know how laboratory classes are run, so I did as the lecturer recommended. Tests were written on paper. All the lab time went to checking homework. I checked whether the programs worked “by eye”.

Summary

Visit statistics for the page with the list of homework tasks. In the fall semester, students start studying actively in mid-November.

At the end of the semester, I ran an anonymous survey with the classic questions about the course and the usefulness of the subject. On a five-point scale, the subject's usefulness was rated 3.76 on average, and how engaging it was, 3.95.

The exam results and the marks I predicted coincided in most cases (precision 90%). One of the groups showed the best result among all the study groups of the entire external stream. In my opinion, one of the factors could be the competitive environment that formed in the group. I compared the exam results of the students in both groups: the mean and the median are approximately the same, but the number of fives on the exam differs markedly.
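As a rough illustration of the comparison described above, here is a minimal sketch. It assumes that “precision” means the share of students whose predicted mark equals the exam mark; the marks are made up for the example and are not my real data, and the helper names `agreement` and `summarize` are hypothetical.

```python
from statistics import mean, median

# Illustrative marks only, not real data.
predicted = [5, 4, 4, 3, 5, 2, 4, 3, 5, 4]
exam = [5, 4, 3, 3, 5, 2, 4, 3, 5, 4]


def agreement(predicted_marks, exam_marks):
    """Share of students whose predicted mark equals the exam mark."""
    if len(predicted_marks) != len(exam_marks):
        raise ValueError("mark lists must have the same length")
    matches = sum(p == e for p, e in zip(predicted_marks, exam_marks))
    return matches / len(predicted_marks)


def summarize(marks):
    """Mean, median and number of top marks (fives) for one group."""
    return {"mean": mean(marks), "median": median(marks), "fives": marks.count(5)}


print(agreement(predicted, exam))
print(summarize(exam))
```

Running `summarize` on each group separately gives the three numbers compared in the paragraph above: the mean, the median and the count of fives.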

Second year

It's time for productivity and automation!

The tasks were the same and the students were freshmen again, but this time I decided to raise KPIs for how engaging, comprehensible and useful the course is, rather than for the final exam mark. My students were schoolchildren only yesterday; most of them did not pass the exam in computer science, and in general they were absolute beginners in programming. Somewhat inspired by the Harvard CS50 course, I decided that the routine should be automated, freeing up time for students' questions and a detailed explanation of the course material.

Software and Services

Dropbox Paper for lab notes and summaries (“squeezes”)
HackerRank for automatic checking of tests and homework
Google Sheets for keeping records of homework and test results
Google Forms for recording test results and collecting feedback
Trello: students' emails to me with questions automatically become cards on my ToDo board

How tests and homework submission were handled

The desire to automate task checking led me to think that I would have to teach the students to use git. What I actually needed was a ready-made, almost perfect boxed solution, and that solution turned out to be the HackerRank service. This service + aaa (additional functionality I wrote using the official API) let me check tasks automatically and spot copied work (a plagiarism detector). Preparing tasks and writing tests takes more time, but it only has to be done once.

It looks like someone copied and is not shy about it :(
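To give an idea of what spotting copied work can look like, here is a minimal sketch. It is not the HackerRank detector and not my aaa tool: it assumes the submissions are already downloaded as plain strings and simply compares every pair with `difflib.SequenceMatcher` from the standard library. The student names, the sample code and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Similarity ratio in [0, 1] between two source texts."""
    return SequenceMatcher(None, a, b).ratio()


def suspicious_pairs(submissions, threshold=0.9):
    """Return (student_a, student_b, ratio) for pairs at or above the threshold."""
    pairs = []
    for (name_a, code_a), (name_b, code_b) in combinations(sorted(submissions.items()), 2):
        ratio = similarity(code_a, code_b)
        if ratio >= threshold:
            pairs.append((name_a, name_b, round(ratio, 2)))
    return pairs


# Hypothetical submissions for one task (assumption, not real student code).
loop_solution = "int sum = 0;\nfor (int i = 0; i < n; i++) sum += a[i];\nConsole.WriteLine(sum);"
submissions = {
    "student_a": loop_solution,
    "student_b": loop_solution,
    "student_c": "Console.WriteLine(a.Sum());",
}

for name_a, name_b, ratio in suspicious_pairs(submissions):
    print(name_a, name_b, ratio)
```

A character-level ratio like this is easy to fool by renaming variables, so it only works as a first filter before looking at the flagged pairs by hand.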

Setting semester marks

I would like to dwell on this part in more detail. I promise not to use complex terms and to explain everything in simple words.

First, I wanted to remove the teacher's subjectivity from the assessment of student performance. Second, I did not want to think about who has which marks, how to compute the final mark, what the scale should be, and so on. Third, I wanted almost every “sneeze” of a student (attendance, activity in class, etc.) to count as a factor in the semester mark.

What data did I collect?

Attendance: present or absent on a given day (a binary feature)
Work in class: a specific score for solving a specific problem (a numeric feature)
Test results (a numeric feature)
Homework results (a numeric feature)

All of this is merged into one table, scaled and fed as input to the K-Means algorithm. The result is a mapping between each student and the cluster he or she belongs to (marks 2, 3, 4, 5).
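The merge-and-scale step described above could look roughly like the sketch below. The column names and the four sample students are invented for illustration, and `StandardScaler` is an assumption: the text only says the table “is scaled”, not how.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical per-student records; names and values are illustrative only.
attendance = pd.DataFrame(
    {
        "student": ["s1", "s2", "s3", "s4"],
        "day_1": [1, 1, 0, 1],  # binary feature: present or absent
        "day_2": [1, 0, 0, 1],
    }
)
scores = pd.DataFrame(
    {
        "student": ["s1", "s2", "s3", "s4"],
        "classwork": [3.0, 1.0, 0.0, 2.0],  # numeric features
        "tests": [9.0, 6.0, 2.0, 8.0],
        "homework": [10.0, 7.0, 1.0, 9.0],
    }
)

# Merge everything into one table keyed by student.
features = attendance.merge(scores, on="student").set_index("student")

# Scale each column to zero mean and unit variance so that no single
# feature dominates the Euclidean distances used by K-Means.
processed_data = StandardScaler().fit_transform(features)

print(features.columns.tolist())
print(processed_data.shape)
```

Without scaling, a homework score out of 10 would outweigh a 0/1 attendance flag simply because its numbers are larger.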

We ask the computer to divide the set into 4 subsets in whatever way seems “more objective” to it.

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.decomposition import PCA

# processed_data is the merged and scaled feature table, one row per student

# Fit K-Means with four clusters, one per mark (2, 3, 4, 5)
clu = KMeans(n_clusters=4, random_state=240)
clu.fit(processed_data)

# Cluster label assigned to each student
labels = pd.DataFrame(clu.labels_)

# Reduce the feature space to three dimensions for plotting
pca = PCA(n_components=3)
pca_processed_data = pca.fit_transform(processed_data)

Using the simplest machine learning methods does not require complicated preparation
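One detail worth spelling out: the labels K-Means returns are arbitrary integers, so turning clusters into marks needs an extra ordering step. The sketch below shows one possible way, under the assumption that every feature is oriented so that larger means better; clusters are then ranked by the mean of their centroid coordinates. The helper `clusters_to_marks` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def clusters_to_marks(data, marks=(2, 3, 4, 5), random_state=240):
    """Cluster students and map each cluster to a mark by centroid strength."""
    model = KMeans(n_clusters=len(marks), random_state=random_state, n_init=10)
    labels = model.fit_predict(data)
    # Rank clusters by the mean of their centroid coordinates. This assumes
    # that a larger value of every feature means better performance.
    order = np.argsort(model.cluster_centers_.mean(axis=1))
    mark_of_cluster = {int(cluster): marks[rank] for rank, cluster in enumerate(order)}
    return np.array([mark_of_cluster[int(label)] for label in labels])


# Synthetic, well-separated data: four groups of five students, three features.
rng = np.random.default_rng(0)
levels = np.repeat([0.0, 1.0, 2.0, 3.0], 5)
data = levels[:, None] + rng.normal(scale=0.05, size=(20, 3))

student_marks = clusters_to_marks(data)
print(student_marks)
```

On real data the clusters will not be this clean, so the ranking rule deserves a sanity check against a few students whose performance is well known.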

This is how the marks and clusters look in the three-dimensional feature space. Someone clearly deserves a two, and someone an automatic pass :)

Student Clustering by Performance (KMeans)

At first glance, the resulting marks match my impression of how specific students perform. If this approach were applied everywhere and training data were collected, it would become possible to predict the end-of-semester mark from academic performance in the first month of study (a hypothesis). This would make it possible to identify “problem” students in time and offer them extra help with the material.
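If such training data existed, the hypothesis could be tried out with something as simple as the sketch below. Everything in it is an assumption: the data is synthetic, the two first-month features (attendance share and homework score) are invented, and logistic regression is just one convenient classifier, not a method I have validated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a past cohort: first-month features
# (attendance share, homework score) and the final semester mark.
rng = np.random.default_rng(1)
final_marks = np.repeat([2, 3, 4, 5], 10)
first_month = np.column_stack(
    [
        np.clip(0.25 * (final_marks - 1) + rng.normal(scale=0.05, size=40), 0, 1),
        2.0 * final_marks + rng.normal(scale=0.5, size=40),
    ]
)

# Scale the features, then fit a multiclass classifier on the past cohort.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(first_month, final_marks)

# A new student after the first month: low attendance, weak homework.
new_student = np.array([[0.3, 4.5]])
predicted_mark = int(model.predict(new_student)[0])
at_risk = predicted_mark <= 3  # hypothetical cut-off for offering extra help

print(predicted_mark, at_risk)
```

The cut-off for “at risk” is a teaching decision rather than a modelling one, and it would need to be checked against real cohorts before anyone acts on it.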

Instead of a conclusion

Labs are now fully devoted to explaining the material and solving problems. Whether this will raise the chosen KPIs remains an open question.

The HackerRank service described above is not a very convenient tool for educational purposes: it lacks some very important and convenient features. For this there is the classic Market Research → Customer Development → MVP → pre-seed → seed ... well, you know.
