vovaekb90 September 6, 2015 at 11:32

Vision and Sports Summer School 2015 in Prague: how it was

This summer, during my vacation, I was lucky enough to take part in the Vision and Sports Summer School 2015 (VS3 2015) in Prague. In this article I want to share my impressions and perhaps motivate someone to apply to the school.

First, a few words about myself. For the past year I have been a doctoral student at the Brno University of Technology, in the Department of Computer Graphics and Multimedia. My research is related to computer vision and service robotics. Specifically, I study algorithms for recognizing objects in a 3D scene using so-called "point clouds" obtained from cameras such as the Microsoft Kinect. For me, taking part in the school was a good opportunity to deepen my knowledge of the main aspects and applications of computer vision in various areas of life, since I am just starting to study this area of IT. I learned about the summer school this spring from an announcement by one of its organizers, who gave a lecture at our faculty. The organizers send out invitations to participate to universities (at least
Among the participants were people from different countries: the UK, France, Ukraine, Slovenia and the Czech Republic. They were mostly students specializing in the field, although there were also undergraduates. There were participants from Russia as well.

About Summer School

VS3 is organized, now for the fifth year, by the Center for Machine Perception, which is part of the Department of Cybernetics of the Faculty of Electrical Engineering at the Czech Technical University (CTU) in Prague. Specifically, the organizers of the school are Ondrej Chum, Jiri Matas (both from CTU) and Vittorio Ferrari (University of Edinburgh, UK).

Pictured: Vittorio Ferrari on the left in a black T-shirt, Ondrej Chum in the center.
The school ran for a week (more precisely, six days), from August 17 to 22, on the CTU campus. Lectures were held in the building of the CTU Faculty of Civil Engineering. All the main venues of the event can be viewed here.

The purpose of the school is to acquaint people who work with computer vision in one way or another with the latest achievements and open problems in the area. The school is also a pleasant opportunity to meet well-known experts and practitioners in the field and to compete with them in sporting events. Information about the school can be found on the official page.
The lecturers were computer vision and machine learning specialists invited from well-known universities around the world:

Jiri Matas, Czech Technical University (Prague, Czech Republic);
Krystian Mikolajczyk, University of Surrey (Guildford, UK);
Vittorio Ferrari, School of Informatics of the University of Edinburgh (UK);
Raquel Urtasun, University of Toronto (Toronto, Canada);
Christoph Lampert, Institute of Science and Technology Austria (IST Austria);
Carsten Rother, TU Dresden (Dresden, Germany);
Daniel Cremers, TU München (Munich, Germany);
Ondrej Chum, Czech Technical University (Prague, Czech Republic).

The school is organized as daily lectures and several practical exercises in the morning, followed by sporting events in the afternoon. A workshop was held on Saturday, in which each of the lecturers presented current achievements and trends in their field (benchmarks). As a result, every day of the summer school was busy from 9 a.m. to 6 p.m.; only on Saturday did it end at 2 p.m. This year's full school program is available here.
In addition to the lectures and sporting events, a barbecue was organized on Wednesday outside the CTU campus.

Registration and participation

To register for the school, you must fill out the registration form in the Registration section of the website and pay the registration fee after the organizers confirm your participation. As a PhD student, I paid 10,175 Czech crowns. The registration and payment deadlines are listed on the official page of the school. The current year's registration form can be found here, along with the amount of the registration fee.
You can ask your faculty to support your participation by reimbursing the registration fee. Several of my faculty colleagues and I took this opportunity.
The summer school offers several hotel accommodation options for the duration of the event. You can see this year's accommodation options here, or choose your own (fortunately, hotel and hostel prices in this part of Prague are reasonable).

How the training went

The first day's program began with registration at 8 o'clock. Each participant received a confirmation of participation, the school program for every day, lunch coupons for the CTU canteen and an individual branded water bottle (for the sporting events).

The lectures began with opening remarks by the organizers Ondrej Chum and Vittorio Ferrari. After that, a series of lectures on a variety of computer vision topics followed throughout the week.
Krystian Mikolajczyk spoke about extracting and matching local features to solve various computer vision tasks, from object recognition and panorama stitching to SLAM-based localization in robotics. Special attention was paid to the invariance of detectors to various types of transformation, in particular affine transformations and scaling.

Ondrej Chum spoke about searching for similar images based on a bag-of-words representation of the image, and in particular about using k-means clustering in feature space. At the end of the lecture, he showed an interesting project developed by his colleagues at CTU, which searches for images similar to a given image fragment. The application lets the user mark an interesting detail in an image with a frame (for example, a sculpture on the facade of a cathedral) and returns all relevant images that may contain this detail, whether from the same perspective, from different viewpoints, at different scales, or even in greater detail. The developers also managed to perform 3D reconstruction of architectural objects from a collection of images, for a better and "smarter" similarity search.
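To make the idea of a k-means visual vocabulary more concrete, here is a minimal sketch in plain Python. It is only an illustration and not the code shown in the lecture: the function names (build_vocabulary, bag_of_words) and the toy 2D "descriptors" are my own assumptions, and the initial centers are simply picked at even intervals from the input list, whereas real systems use better initialization and much higher-dimensional descriptors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def build_vocabulary(descriptors, k, iterations=20):
    """Cluster descriptors with k-means; the k centers act as visual words.

    Assumption for this sketch: initial centers are taken at even
    intervals from the input list (simple and deterministic).
    """
    centers = [descriptors[i * len(descriptors) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centers)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return centers


def bag_of_words(descriptors, centers):
    """Histogram counting how many descriptors fall on each visual word."""
    histogram = [0] * len(centers)
    for d in descriptors:
        histogram[nearest(d, centers)] += 1
    return histogram


# Toy data: two well-separated groups of 2D "descriptors".
training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
vocabulary = build_vocabulary(training, k=2)
image_histogram = bag_of_words([(0.2, 0.1), (0.5, 0.4), (10.2, 9.9)],
                               vocabulary)
```

Each image is then described only by its histogram of visual words, which is what makes searching large collections feasible.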

Carsten Rother, in the lecture MRF / CRF for Computer Vision, spoke about random fields (Markov random fields and conditional random fields) and their application to interactive image segmentation, denoising (reducing noise in an image) and stereo matching.

Christoph Lampert talked about structured prediction models and described standard regression, probabilistic graphical models such as factor graphs, probabilistic inference, and structural SVMs.
Vittorio Ferrari talked about using weakly supervised learning (WSL) techniques to train visual models for semantic segmentation and about using convolutional neural networks in WSL, and compared weakly supervised learning with fully supervised learning.
Daniel Cremers, in a lecture on Variational methods & Geometric reconstruction, described how variational methods can be used to optimize the solution of some computer vision problems, using the examples of segmenting an object in an image, 3D reconstruction, and building terrain maps with techniques such as SLAM.

Jiri Matas talked about visual tracking in video, various methods for detecting the object to be tracked and for the tracking itself, as well as techniques for learning during tracking.
Raquel Urtasun talked about the basics of deep structured learning, described the concept of convolutional neural networks (CNNs) and their application to classification, object localization and semantic segmentation, and also talked about using graphical models (CRF, MRF) in combination with CNNs.
Those who wish can read the program of the Saturday workshop here. The talks I remember best are those by Krystian Mikolajczyk and Raquel Urtasun. Krystian Mikolajczyk spoke about automatically annotating tennis games by tracking the trajectory of the ball and recognizing the players' actions. Raquel Urtasun spoke about the latest projects in autonomous driving: car localization, route planning and 3D reconstruction of city streets from stereo cameras. Here are some photos from the workshop lectures.

As for practice, two practical exercises were held.
The first was devoted to the topic of Carsten Rother's lecture, MRF / CRF for Computer Vision. The lesson was held in a computer room on machines running Windows Server 2012, using MATLAB. There were two tasks. Task 1 was devoted to interactive image segmentation. We had to study the logic of an image segmentation script and examine how various parameters of the algorithm affect the segmentation result. The script took an annotated image in which the background and the object were marked with blue and red brush strokes, respectively (the algorithm used the colors of the pixels under the strokes). Task 1 also required modifying the script to improve the segmentation result. Task 2 was a denoising problem and required finding the parameter values that gave the best result. The exercise let us not only get a first grasp of how random field theory is applied to practical problems, but also gain experience with MATLAB.
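To give a feel for how the parameters of a random field model affect denoising, here is a small sketch in plain Python. It is not the MATLAB script from the exercise. As an assumption for illustration, it uses a binary image, a simple Ising-style energy (a data term plus a smoothness term over 4-connected neighbors) and iterated conditional modes (ICM), a basic greedy minimizer that I swapped in for simplicity. The names denoise_icm, data_weight and smooth_weight are hypothetical.

```python
def local_energy(labels, noisy, row, col, value, data_weight, smooth_weight):
    """Energy of pixel (row, col) if it takes the label `value`."""
    # Data term: penalty for disagreeing with the observed noisy pixel.
    energy = data_weight * (value != noisy[row][col])
    # Smoothness term: penalty for each 4-connected neighbor that differs.
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < len(labels) and 0 <= c < len(labels[0]):
            energy += smooth_weight * (value != labels[r][c])
    return energy


def denoise_icm(noisy, data_weight=1.0, smooth_weight=1.0, max_sweeps=10):
    """Greedy ICM: flip a pixel only if that strictly lowers its energy."""
    labels = [list(row) for row in noisy]
    for _ in range(max_sweeps):
        changed = False
        for row in range(len(labels)):
            for col in range(len(labels[0])):
                current = labels[row][col]
                flipped = 1 - current
                e_current = local_energy(labels, noisy, row, col, current,
                                         data_weight, smooth_weight)
                e_flipped = local_energy(labels, noisy, row, col, flipped,
                                         data_weight, smooth_weight)
                if e_flipped < e_current:
                    labels[row][col] = flipped
                    changed = True
        if not changed:  # converged: a full sweep changed nothing
            break
    return labels


# A 5x5 image of ones with a single noisy pixel in the center.
clean = [[1] * 5 for _ in range(5)]
noisy = [row[:] for row in clean]
noisy[2][2] = 0

strong_smoothing = denoise_icm(noisy, smooth_weight=1.0)
weak_smoothing = denoise_icm(noisy, smooth_weight=0.2)
```

With smooth_weight=1.0 the isolated pixel is overruled by its four neighbors and removed; with smooth_weight=0.2 the data term wins and the noisy pixel stays, which is the kind of trade-off the exercise asked us to explore.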

The second practical exercise was devoted to extracting features and using them to search for similar images. The lesson was led by Ondrej Chum. Two tasks were given. The first concerned bag of words and inverted files, which had been presented at the lecture on Monday. We had to implement our own script to search for images in a given database, represented as a matrix: each row of the matrix is the bag of words of one document (the attribute values), and each column corresponds to one word (an attribute).
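As an illustration of the inverted file idea, here is a minimal Python sketch (the exercise itself was done in MATLAB). It assumes the database layout described above, with one row per document and one column per visual word. The function names build_inverted_file and candidates are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(bow_matrix):
    """Map each visual word to the documents that contain it.

    bow_matrix[d][w] is the number of times word w occurs in document d.
    """
    inverted = defaultdict(list)
    for doc_id, row in enumerate(bow_matrix):
        for word_id, count in enumerate(row):
            if count > 0:
                inverted[word_id].append((doc_id, count))
    return inverted


def candidates(inverted, query_words):
    """Documents that share at least one visual word with the query."""
    found = set()
    for word_id in query_words:
        for doc_id, _count in inverted.get(word_id, []):
            found.add(doc_id)
    return sorted(found)


# Toy database: 3 documents, 3 visual words.
database = [
    [2, 0, 1],
    [0, 3, 0],
    [1, 1, 0],
]
index = build_inverted_file(database)
matches = candidates(index, query_words=[1])
```

The point of the structure is that a query only touches the documents listed under its own words instead of scanning every row of the matrix.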

The assignment gave instructions on which steps to perform in sequence and which MATLAB functions and data types to use. The participants were also given the lecture slides for reference. First, we had to build a matrix for the database from the existing data structure and the weights of all words. An interesting point was computing the idf parameter, the weight of each visual word, according to the formula:

idf(X) = log(# documents / # documents containing X)

Here we had to count the number of documents containing the word X. After building the database matrix, we had to run a query to search for images similar to a given fragment.

[Image: result of the query]
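The idf weighting and the similarity query from this exercise can be sketched roughly as follows in Python. This is my own reconstruction under assumptions and not the MATLAB solution from the class: I assume the natural logarithm, L2 normalization of the weighted vectors, and cosine similarity for ranking. The names idf_weights and rank_documents are hypothetical.

```python
import math


def idf_weights(bow_matrix):
    """idf(X) = log(# documents / # documents containing X) for each word."""
    n_docs = len(bow_matrix)
    weights = []
    for word_id in range(len(bow_matrix[0])):
        containing = sum(1 for row in bow_matrix if row[word_id] > 0)
        # A word that occurs nowhere gets weight 0 (avoids division by zero).
        weights.append(math.log(n_docs / containing) if containing else 0.0)
    return weights


def weighted_unit_vector(row, idf):
    """Apply idf weights to word counts and scale to unit length."""
    vector = [count * weight for count, weight in zip(row, idf)]
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector


def rank_documents(bow_matrix, query_row):
    """Return (doc_id, score) pairs sorted by decreasing cosine similarity."""
    idf = idf_weights(bow_matrix)
    query = weighted_unit_vector(query_row, idf)
    scores = []
    for doc_id, row in enumerate(bow_matrix):
        document = weighted_unit_vector(row, idf)
        score = sum(q * d for q, d in zip(query, document))
        scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Toy database: 3 documents, 3 visual words; word 2 occurs everywhere.
database = [
    [2, 0, 1],
    [0, 3, 1],
    [1, 1, 1],
]
ranking = rank_documents(database, query_row=[1, 0, 0])
```

Note that a word present in every document (word 2 in the toy data) gets idf = log(1) = 0 and therefore does not influence the ranking at all, which is exactly why the weighting is useful.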

Thus, the tasks required not only running the scripts and studying the logic of the algorithms, but also some algorithmic skill to find a way to solve the problem.

Sporting events

In the last month before the school, the organizers sent each participant a message asking them to choose the sports they were interested in. When drawing up the timetable, the organizers divided the participants into groups for each day, so that several sports were played every day. My plan was: Monday, badminton; Tuesday, archery; Wednesday, table tennis; Thursday, volleyball; Friday, soccer (football).
Pictured: the sports hall used for football, badminton and volleyball.

Conclusion

The school is over, but plenty of emotions and memories will stay with me for a long time. What would I like to say in conclusion about the summer school? As the lecture program showed, machine learning is becoming a promising trend in computer vision: from graphical models (CRF and MRF) to deep learning, with the rapidly growing popularity of convolutional networks. I was particularly pleased by the growing number of developments in stereo vision, for example 3D reconstruction and visual navigation for autonomous cars. In my opinion, there were not enough practical exercises. Nevertheless, the ones that were held introduced us to the rather rich MATLAB programming language, which has very powerful and practical features such as building sparse matrices. I also learned about some good CV books:

David A. Forsyth, Jean Ponce - Computer Vision: A Modern Approach
Kenichi Kanatani - Understanding Geometric Algebra: Hamilton, Grassmann, and Clifford for Computer Vision and Graphics
Richard Szeliski - Computer Vision: Algorithms and Applications

Every participant finds something useful in this school, and I am sure that taking part in it is not a waste of time. Information about the organization of the school becomes available every year in April. As soon as information about Vision and Sports Summer School 2016 appears, I will write a short announcement about the upcoming school. Thank you for your attention, and good luck to everyone who decides to apply to future VS3 schools!

PS. There may be some inaccuracies in my summaries of the lectures, since I am not well versed in machine learning techniques and could have misunderstood something in the lecturers' presentations.
