
How to enroll in a PhD machine learning program
1. Introduction
This text is a short summary of my experience applying to Computer Science PhD programs in North America, with a focus on machine learning. I tried to collect my mistakes in this guide (it is better to learn from the mistakes of others), along with more or less universal advice that is useful to everyone. Still, keep in mind that this is a rather individual experience, so your personal strategy may differ, for example in choosing universities and academic advisers or in writing a statement of purpose. Your starting conditions (grades, articles, recommendations) may also differ from mine.
Keep in mind that the main part of the guide was written before I received my results, because I wanted to avoid “survivorship bias” and analyze my experience regardless of whether I got in or not. My results are at the end of the guide: I was admitted to 2 of the 11 universities I applied to. In my opinion, you should still avoid the mistakes I describe here. You also need to understand that there is a lot of noise in the ML PhD application process, so you can do everything well and still be rejected, and probably vice versa.

1. Introduction
2. Why do I need PhD?
3. Choosing a country
4. Evaluating expenses
5. Preparing for GRE
6. Preparing for TOEFL
7. Grades at the university (s)
8. Recommendations
9. Articles
10. Choosing a university and academic adviser
11. Writing a statement of purpose / personal history
12. CV and cherries on the cake
13. Application process
14. Waiting and response timelines
15. How to deal with refusals
16. My results and a few words about US visas
17. Conclusion and thanks
Be prepared for the PhD application to take from two to six months, depending on your starting level and how you organize your work. It took me around two months, and it was stressful. If you have no scientific articles, it may make sense to spend a year or two writing them. As for money: $400 (GRE + TOEFL) + $70-150 for each application submitted + $150 (GRE / TOEFL preparation through Magoosh). Please note that these figures are as of the end of 2017.
Briefly, the PhD application process is structured as follows: you prepare for and take the GRE / TOEFL, choose universities and researchers, write a statement of purpose / personal history, write to potential academic advisers, fill out applications, wait for answers, go through interviews (in some cases you are admitted without them), arrive, do awesome research, and get hired as a professor at Stanford or a researcher at Google (but this last part is not guaranteed). Each chapter in this guide describes one part of this process. At the end of each chapter I also collected useful links that I came across while preparing, because my experience is neither the first nor the last.
- A good short guide on applying to a PhD, with important notes, especially the part about admissions being a crapshoot
- A post by professors from Carnegie Mellon on how they select students
- Another good and honest guide, this time with a focus on Computer Science: http://www.pgbovine.net/grad-school-app-tips.htm
- A guide with a focus on psychology
- Short tips with a focus on California realities
- A preparation timeline covering up to eight months
- A very comprehensive guide written in 1997 and updated in 2017
- And one more good guide (I know you are already tired of these links)
2. Why do I need a PhD?
This is the main question you need to answer for yourself before getting involved in all this. Applying costs time, money and, most importantly, nerves. Yes, in the process you will learn something about yourself and get a somewhat better picture of how research in this area is organized, but the same knowledge can be gained under less stressful conditions, and the time could be spent on something far more useful.

In my opinion, there is only one good answer to the question “Why do I need a PhD?”: you want to do research in this area. If you see a PhD as a way to get into Google / Facebook / Amazon, there are plenty of other, more reliable and interesting ways. A PhD takes 4 to 6 years, and in that time it is quite possible to build a solid career as a data scientist or data engineer. Moreover, if your PhD goes wrong, you will find yourself at a serious disadvantage compared to the people who were working while you were suffering through it.
Essentially, a PhD is a license to do research (though not the only way to do it). If you do not know what you would do with this license, it is better not to get involved.
- An important text on why you should NOT do a PhD
- Another good post about why you should NOT do a PhD
- A guide about economics PhDs, but with plenty of useful and universal ideas
- A guide for newly minted PhDs on choosing between academia and industry. It is useful to read before applying, to understand what awaits you
- A Quora thread about choosing between a PhD and industry
3. Country selection
Initially this section was not in the guide, but I decided to add it because of the visa situation. The harsh truth is that in the current geopolitical situation (2018) it has become more difficult for many foreigners to obtain study visas to the United States, especially if they work on dual-use technologies: atomic physics, computer science, chemistry, and so on. It is almost certain that when applying for a visa you will run into something called administrative processing, which used to take around three weeks and can now take three months or more.
The second problem with the American study visa is that it will likely be issued for only one year. This means you will either be stuck in the USA (you can stay there with an expired visa as long as your other documents are in order), or you will have to renew your visa every year if you want to travel to conferences outside the USA (a visa gives you the right to enter the country, not to stay in it). If geographic mobility is important to you or you want to visit relatives regularly, you should seriously consider applying outside the USA, for example to Canada or Europe.
It is also important to understand how countries differ in their approach to the PhD. In Europe, a PhD usually requires a master's degree and lasts 3-4 years, during which you work on a specific project. In Canada and the United States, people usually enter a PhD program right after undergraduate studies, spend the first two years on coursework, choose a supervisor and a topic, and ultimately defend their dissertation 5-6 years after starting. You can apply for a PhD in the USA with a master’s degree, but it is not the first thing most universities look at.
- Commentary on the situation with PhD applications from foreigners in the USA
4. Cost Estimation
This mainly applies to American / Canadian universities, almost all of which require you to pay an application fee ($70-125 per university) and to send them official score reports ($27 for GRE + $19 for TOEFL). As a result, one application costs $100-150. There are also fixed costs for taking the GRE and TOEFL, approximately $200 each. In other words, if you want to apply to 10 American universities, it will cost you about $2000. The calculation is as of the end of 2017.
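The arithmetic above can be written down as a small sketch. The function name and parameters are hypothetical; the default prices are the end-of-2017 figures quoted in this guide and should be replaced with current ones.

```python
def application_cost(n_universities, application_fee,
                     gre_report=27, toefl_report=19,
                     gre_exam=200, toefl_exam=200, prep=0):
    """Rough total cost in USD (hypothetical helper, 2017 prices from the text)."""
    # Variable part: paid once per university.
    per_university = application_fee + gre_report + toefl_report
    # Fixed part: the two exams plus optional preparation materials.
    fixed = gre_exam + toefl_exam + prep
    return fixed + n_universities * per_university


# Ten universities at the cheapest and the most expensive application fee.
low = application_cost(10, application_fee=70)
high = application_cost(10, application_fee=125)
print(f"10 universities: ${low}-{high}")
```

With these inputs the range brackets the "about $2000" estimate; passing prep=150 adds the Magoosh subscription.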
The second important component of the cost is time. It took me about two months: one to prepare for the GRE and another to search for supervisors, write a statement of purpose, and fill out applications. In my opinion, this is the absolute minimum; it is not worth going below it. This was not full-time, because I was working in a research lab at the same time, so if you have more free time you may manage faster. If you prefer to minimize stress, it is better to start 3-6 months before the application deadlines.
- FAQ on GRE, including cost
- TOEFL cost
- Preparing for GRE and TOEFL on Magoosh
5. Preparation for GRE
5.1 General
The GRE is a 3 hour 45 minute test that checks your basic math skills (quant, Q), your ability to analyze texts and sentences together with your vocabulary (verbal, V), and your ability to write analytical essays (AWA). The test itself is described in detail in plenty of places, so here I will share my impressions and tricks.

The GRE is, in my opinion, a strange thing. If you do very well on it, that gives no particular advantage, because most strong candidates do well. But if you do badly, it can do a lot of harm. This makes preparing for it tedious, since such a setup is not motivating at all (you need to run as hard as you can just to stay in place). I used a few mental tricks to make the process more enjoyable and effective.
Set a goal for yourself. My goal was 165Q, 155V. I did not set an AWA goal, and that was a mistake. In the end I scored 169Q, 159V and 3.0 AWA, where the first two scores are very good for my field (96th and 83rd percentile) and the last is extremely mediocre (18th percentile). If I had set a specific AWA goal, my preparation would have been more effective.
Look at the GRE as an opportunity to learn something. For math, I refreshed some school knowledge and learned a few estimation tricks. For verbal, I significantly expanded my vocabulary and learned words I would never have learned otherwise. Without this trick, preparing for the GRE is terribly boring.
Understand the meta of the test. GRE questions are not always formulated as clearly as possible, and this is done on purpose. The test writers know well the conditions under which you take the test and sometimes try to confuse you within the rules. You need to understand how these traps are built so as not to fall into them. Magoosh is very useful for this (see below).
Use www.magoosh.com. A six-month subscription costs $150 and is worth it. Magoosh has tons of short, clear videos that explain how the GRE works, cover the test writers' basic tricks and traps, and help refresh the math you forgot. Plus, there are about a thousand quant and verbal problems, as well as convenient, clear statistics and a way to track the sections where you make the most mistakes.
Estimate the time you need to prepare. A rule of thumb that is written everywhere and that I agree with: on average, it takes 40 hours to improve your score in one section (for example, quant) by 5 points. For example, if your first attempt was 160Q / 155V, you need 80 hours to raise it to 165Q / 160V. But it is important to account for your individual circumstances. For example, if you are sure your scores are depressed by nerves, you may need less or more time to develop your test-taking strategy.
Set a training routine based on your priorities and available time. I had exactly one month to prepare, so my routine was 40 quant questions and 40 verbal questions daily. I did not have an AWA routine, and that was a mistake.
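The rule of thumb for preparation time (40 hours per 5 points per section) is easy to turn into a rough planner. This is only a sketch: the function name is hypothetical, and the linear scaling between scores is an assumption, since the rule itself is a coarse average.

```python
HOURS_PER_5_POINTS = 40  # rule of thumb quoted in the text


def prep_hours(current, target):
    """Estimate total study hours needed to move from current to target scores.

    Both arguments are dicts mapping a section name ("Q", "V") to a score.
    Sections already at or above the target contribute zero hours.
    """
    total = 0.0
    for section, goal in target.items():
        gap = max(0, goal - current[section])
        total += gap / 5 * HOURS_PER_5_POINTS
    return total


# The example from the text: 160Q / 155V raised to 165Q / 160V.
hours = prep_hours({"Q": 160, "V": 155}, {"Q": 165, "V": 160})
print(hours)  # 80.0
```

Dividing the result by the number of hours you can study per day gives a first guess at how early to start.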
5.2 Quant
It is important to understand that GRE Quant tests not only your knowledge of basic mathematics, but also your attention and concentration. At the beginning of my preparation I rated myself on these three points (excellent / normal / poor) and built my training accordingly. In my case, math was excellent, attention was poor, and concentration was excellent. Concentration here means the ability to work under tight time pressure.

Every day I solved at least 40 questions on Magoosh in quiz mode, where you answer all the questions and only then see the answers. I would not use practice mode at all, where you see the correct answer immediately after giving yours. Preparation in quiz mode is closer to the conditions of the real test. Plus, it is easier and better to analyze errors in batches.
In addition, while writing this text, I was recommended Crunchprep: it is said to be convenient as well and to show you what needs improving.
5.3 Verbal
GRE Verbal is first of all about vocabulary and second about understanding how the most common traps in reading tasks work. To do reasonably well on Verbal, it is enough to carefully watch all the Magoosh videos about verbal (there are fewer of them than for math) and to work on vocabulary constantly. For the latter, quizlet.com helped me a lot (there is also memrise.com): you can make word lists and then start a training session in which the site feeds you the words in a clever order. I got into the habit of writing down every unfamiliar word I met in Magoosh questions and in the texts I read. I recorded words in packs of 50, and toward the end of my preparation I tried to work through one pack every 2-3 days. For reading, in my opinion, it is enough to solve all the related questions on Magoosh. The most important trick I pulled out is
5.4 AWA
I screwed up this part a bit: I got 3.0 out of 6.0, which is pretty bad. The ideal way to prepare, as I understand it in hindsight, is to find people who can give feedback on your writing and to write 3-4 essays a week. The main problem with AWA for me was that it was hard to write meaningful things under tight time pressure. Magoosh offers a good outline: an introduction, 3-4 paragraphs that each make one point, and a conclusion. It was useful to me because it lets you stop thinking about structure and focus on content.
While writing this text, I was also recommended a resource that gives a rough, semi-automatic score for an essay.
5.5 The skill of passing the test itself
To do reasonably well on the GRE, in my opinion it is very important to reduce your stress level during the test. For example, be familiar with the test interface. It is also very important to manage time properly: for example, do not get stuck on difficult questions, and return to them in the remaining time. For this I recommend taking as many mock tests as possible (Magoosh has this option, and a list of free tests can be found here). In addition, GRE provides two PowerPrep tests when you book your test date. Be sure to take them to get an idea of the interface.
Personally, over the last 10 days of preparation I took six tests: two PowerPrep and four Magoosh. This helped me a lot during the real test. For example, in the quant section I got a very cleverly worded probability question that I got stuck on. But since I had practiced this, I skipped the question, calmly returned to it at the end, and it turned out that the question was simple, just worded with a catch.
5.6 Time reservation
The latest comfortable time to take the GRE and TOEFL is the first week of November, if you plan only one attempt. If you want several, add a month for each additional GRE attempt. October / November is the busiest testing period, so it is better to book at least a month in advance, or even earlier, to get a convenient time of day.
For example, I am a night owl, and I initially booked the test for 8 am because I booked at the last moment. Then I had to watch for a convenient slot and spend $50 to move the test to four in the afternoon. In hindsight, I believe this was the right decision, because I took the simpler TOEFL at 8 in the morning and felt that my brain had not quite switched on yet. If you are an early bird, the opposite may be true for you.
5.7 Retake GRE / TOEFL
If you are not confident in your abilities, plan the tests so that you have time for one or two retakes. You can take the GRE up to five times a year with a minimum interval of 21 days; you can retake the TOEFL as many times as you want with an interval of 12 days. In practice, this means it is better to add a month for each GRE retake and two weeks for each TOEFL retake.
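Working backwards from the last comfortable test date gives the latest date for each attempt. The sketch below assumes a hypothetical final date of 7 November 2017 (standing in for "the first week of November") and uses the retake intervals quoted above; check the current rules before relying on them.

```python
from datetime import date, timedelta

GRE_MIN_GAP = timedelta(days=21)    # minimum interval quoted in the text
TOEFL_MIN_GAP = timedelta(days=12)  # minimum interval quoted in the text


def latest_attempt_dates(last_attempt, attempts, spacing, min_gap):
    """Latest date for each attempt, earliest first, ending at last_attempt."""
    if spacing < min_gap:
        raise ValueError("spacing is shorter than the minimum retake interval")
    return [last_attempt - spacing * i for i in reversed(range(attempts))]


final_day = date(2017, 11, 7)  # hypothetical "first week of November" date

# Three GRE attempts a month apart, two TOEFL attempts two weeks apart.
gre_plan = latest_attempt_dates(final_day, 3, timedelta(days=30), GRE_MIN_GAP)
toefl_plan = latest_attempt_dates(final_day, 2, timedelta(days=14), TOEFL_MIN_GAP)
print(gre_plan[0], toefl_plan[0])  # 2017-09-08 2017-10-24
```

The first element of each list is the date by which the first attempt should be booked if you want to keep all retakes available.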
6. Preparation for TOEFL
TOEFL consists of four parts: speaking, writing, listening, reading. For each you can get a maximum of 30 points. As a rule, universities require your total score to be above a certain threshold, most often 80 or 100. Some universities also set section minimums. For example, I did not apply to Cornell because they had a speaking cut-off of 22 (I got 20). In general, speaking is usually the most important part when a university sets separate section minimums, so you should pay special attention to it (see below).

If you prepared properly for GRE Verbal and AWA, you are automatically ready for reading / writing, because they are simplified versions of what the GRE tests. Listening should also not be a problem if you can watch TV shows / movies without subtitles and understand most of what is going on. If not, that is a good way to prepare. The main difficulty with listening during the test is that several people take the test in the same room, so you may be doing listening while someone else is answering speaking questions out loud. You need to be mentally prepared for this and not get thrown off.
The hardest part for me was speaking. I thought I was ready for it by default, but the test has an important nuance: the time limit. You have 45-60 seconds, and sometimes even less, to answer the question clearly. This requires some practice. Magoosh has a TOEFL preparation service ($50 per month). I bought it, but in fact almost never used it. If I were preparing for the test now, I would definitely work through a few dozen speaking questions.
7. Grades at the university (s)
There are two important components: undergraduate grades (bachelor's / specialist degree) and graduate grades (master's). Grade requirements vary from university to university. Some are interested in your grades across the whole undergraduate program, others only in the last two years, including a master's program (if you did one). I was in a rather bad situation: I had very poor grades, even though I graduated from a very good university with a very good program.
Depending on the university and program, high grades will increase your chance of passing the initial selection, but most likely will not affect the final decision. Bad grades reduce the likelihood that you pass the pre-filters and make your profile a little less competitive: you will have many competitors with a near-perfect GPA. At the same time, judging by what I read, there is not much difference between a GPA of 3.8 and 4.0. My feeling is that if the other parts of your application are strong, a GPA > 3.5 is perfectly fine.

Here I took the path of damage minimization: if you have a good reason why your grades were poor, it is worth mentioning it in the statement of purpose, but without overdoing it and in a positive way. In addition, if you have academic advisers who taught you, you can ask them to write something like “his undergraduate grades are poor, but that means nothing.” Whether this works depends on the university and the program, but it is not something you can greatly influence, so you should not stress much about it (although I stressed anyway).
If you have poor grades, it is doubly important to do well on the GRE and to be very sensible in choosing where to apply. For example, I did not apply to MIT because they are known for caring a lot about GPA. And the same MIT writes explicitly that the GRE means nothing to them. You can probably get into MIT with poor grades, but the probability is not very high, and my task was to maximize the probability of getting into a PhD program, provided that I liked the potential university and academic advisers. A little more about this in the section on choosing a university and a potential academic adviser.
8. Recommendations
For most universities you will need 2-3 recommendations from teachers / supervisors / people who know you in a scientific or professional capacity. Two problems arise here: how to find such people and what they should write.
8.1 How to choose a recommender
Since you are applying for a research position, the recommendations should ideally come from researchers in your area of interest who can speak to your ability to do independent research. I would aim for at least two recommendations from academia. The recommender's standing also matters: if they are well known, there is a higher chance that their recommendation will be heeded.

Since most of us cannot get a recommendation from Bengio, Hinton or LeCun, here are several possible sources of recommendations. First, your thesis supervisor is an almost mandatory option, especially if you did a master's degree. Second, someone from the dean’s office who knows you well and thinks well of you. Third, if you have done interesting research projects or a summer internship, the project / internship supervisor is suitable. Fourth, your immediate manager at work, if you have worked somewhere long enough and are proud of what you did there.
The general principle for choosing a recommender: a good recommendation from a less prominent person who thinks well of you is better than a generic one from a prominent person. The ideal option is both at once, but in that case you most likely do not need this guide.
8.2 How to write recommendations
There is a chance that a recommender will ask you to draft the recommendation yourself so that they can then edit it. This is a strange experience, because on the one hand you want to write well about yourself, and on the other hand objectively. Since I did not write recommendations for myself, I can only give some general advice.
Avoid recommendations of the “did well in class” kind. Universities receive such recommendations by the thousands, and they are useless. If the recommendation is written by someone who taught you a course, ask them to describe in more detail what you are good at, what interesting project you did, and how you compare with the other students they have taught.
The recommendation should demonstrate how independent and capable of research you are. Professors are usually terribly busy people, so they appreciate those who require less of their precious time. If the recommendation shows that you can do research on your own (but without disappearing!), that is a good sign. Ideally, it should contain concrete examples of projects and what you did in them.
The recommendation should make it at least roughly clear how pleasant you are to work with. If you are a unique genius, this part is probably not very important, but if not, then, all else being equal, preference goes to someone who is pleasant to work with. It does not have to say much about this, but if it is clear from the recommendation that you are a nice and decent person, it certainly will not hurt.
8.3 How to make life easier for recommenders
The more prominent your recommenders, the less free time they have. Your task is to make writing and submitting recommendations as painless as possible for them. Personally, I made a table for them in Google Sheets listing all the universities I applied to, their deadlines, and the status of each recommendation (request sent / not sent, recommendation received / not received, and whether a recommendation from this person is needed). It also does not hurt, as the earliest deadline approaches, to remind the recommenders that the first deadline is X weeks / days away.
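The same tracking table can be kept in code if you prefer that to a spreadsheet. This is a minimal sketch, not what I actually used: the class, its field names, and the example rows are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RecommendationRequest:
    """One row of the tracking table (all field names are hypothetical)."""
    university: str
    recommender: str
    deadline: date
    request_sent: bool = False
    letter_received: bool = False


def reminders_due(requests, today, days_ahead):
    """Rows with no letter yet whose deadline falls within days_ahead days."""
    due = [
        r for r in requests
        if not r.letter_received
        and 0 <= (r.deadline - today).days <= days_ahead
    ]
    return sorted(due, key=lambda r: r.deadline)


# Made-up example rows.
table = [
    RecommendationRequest("University A", "Prof. X", date(2017, 12, 1), True, True),
    RecommendationRequest("University B", "Prof. X", date(2017, 12, 15), True, False),
    RecommendationRequest("University C", "Prof. Y", date(2018, 1, 5), False, False),
]
for row in reminders_due(table, today=date(2017, 12, 1), days_ahead=14):
    print(f"Remind {row.recommender}: {row.university} is due {row.deadline}")
```

Only University B is printed here: A already has its letter and C is more than two weeks away.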
9. Articles
This is a very important part of the application, because the main part of scientific work is writing articles. Even if you do not plan to do science after the PhD, you will have to write articles during it and also defend your dissertation. To understand what you are signing up for, it is a good idea to try this in advance and see how interesting you find it.

If you are an undergraduate or master's student, everything is relatively simple: look for people and laboratories at your university who do what you are interested in and try to do research projects with them. Feel free to contact professors - everyone needs hardworking students and extra hands (especially free ones). At the same time, you should not set the bar too high and aim for NIPS or ICLR right away, but it would definitely be good to submit to English-language conferences or workshops. Even if your article was not accepted but you like it, post it on arXiv - it is better than nothing. Nobody expects a NIPS article from you: it is very difficult, and one of the goals of a PhD is to learn how to write such articles.
If you have worked in industry and want to enter a PhD directly, without doing a master's first, it is more difficult. Here I can offer only one recipe: get a research assistant position in a laboratory and take part in several projects.
I was lucky and got a job in a neuroscience laboratory in the USA, where I wrote several of my articles. If you think my story is unique, here is an example of a person who worked in the USA as a very highly paid lawyer and then entered NYU after first working for a year in a research laboratory. The moral of this story: even if you have achieved a lot in your old profession / industry, you will most likely have to sacrifice time / money for a PhD.
There is, admittedly, another problem: in Russia there are not many groups and universities doing world-class ML research. In my opinion, there are four: HSE, Moscow Institute of Physics and Technology, Moscow State University and Skoltech. I do not want to recommend specific names, but it is quite easy to find people at these universities who publish at international conferences. How to get into such a group is a separate question, and here, unfortunately, I cannot advise anything.
Finally, another way to gain research experience is to reproduce a well-known article from scratch. This will let you understand how capable you are of doing what the authors did. Moreover, ICLR has a Reproducibility Challenge, in which the organizers encourage reproducing articles from the previous year. This is also a good way to show that you are capable of research in this area, as well as to get a quasi-publication for your PhD application.
10. Choice of university and supervisor
10.1 General
In theory, this section should come right after “Why do I need a PhD?”, but it comes near the end for one simple reason. The United States and Canada have a huge number of good universities and even more good professors. Reviewing them carefully takes a lot of time. The items above (GRE, articles, TOEFL, GPA) impose restrictions on your choice of universities. For example, if your grades are so-so, universities like MIT are most likely closed to you. Or your GRE may not reach the officially stated threshold (some universities state one). This means that if you postpone choosing universities until the end, you can save time by using your results as additional filters.
In my opinion, before you start preparing for a PhD you should choose a few dream schools: places you want to apply to despite the odds, just to try. After you take the tests, you can add several more realistic candidates to this list based on your results.
It is also important to understand that the USA and Canada have a lot of good universities, of which you most likely know only the 5-10 most famous (say Stanford, Berkeley, Harvard, Yale, Carnegie Mellon, MIT and Caltech). It is very hard to get into these universities, because everyone knows them and a huge number of people apply every year. Personally, I aimed for a university in the top 50.
10.2 Supervisor Search
For myself, I decided that a school's ranking is not very important to me: there are many rankings (QS, TIMES, US NEWS and so on), they can disagree, and it is often unclear how they are compiled. So first of all I looked for professors who do interesting research and seem like nice people. The last part should not be underestimated: you will spend several years with your supervisor, and if you dislike them from the very beginning, this is unlikely to be a pleasant time.

I used CSrankings.org to search for researchers. It is a convenient, minimalistic site where you can select different areas of CS / AI / ML and see universities sorted by the number of publications at the leading conferences in those areas. More importantly, a breakdown by professor is provided for each university. I simply chose the areas that interest me, set the period to the last five years, and went through the list of people at each university. As a rule, I filtered out professors with fewer than 10 publications, because I was looking for people who are actively working.
For each professor I evaluated three things. The first is the Google Scholar profile. There I looked not only at the most cited articles, but also at the breadth of the professor's interests and at their latest articles. I tried to avoid specialists who were too narrow or too broad, as well as pure theorists (there are quite a lot of them) and purely applied researchers (there are few of them, because applied articles are harder to publish). I was looking for people who are strong on fundamentals and use that knowledge to solve applied problems. This eliminated about half of the professors (very subjectively).
The second is the personal site. This is the best (albeit very imperfect) available approximation of the professor's personality if you do not know them. In my observation, a good professor's site is not overloaded with titles or self-promotion, clearly states what the person does in general and right now, highlights key publications, and ideally has notes for prospective students. In addition, the site often says whether they are taking students. Things that worried me: an abundance of self-promotion and / or titles (you are a professor, it is clear that you are good), lack of updates, and few or no students.
The third is social networks. This is optional, but an active Twitter / Facebook account is a big plus for a professor. From it you can understand how they think, what interests them and what kind of person they are. There are not many such professors, but I think there will be more and more over the years, so this advice will become more relevant.
It is important to understand that my way of choosing a researcher is strongly biased toward prominent people. If a professor actively publishes at the best conferences, chances are they work at a good university that is harder to get into. On the other hand, if you do not like a potential supervisor even on paper, there is a chance that working with them will be hard for you.
10.3 University selection

Since we live in an imperfect world, an ideal researcher may turn out to be at an imperfect university. The problem may be the location, the admission criteria, or simply that the researcher is not taking students this year. So after filtering the researchers, I filtered the universities. The criteria were as follows.
The number of potential supervisors. I did not apply to universities where I could not find at least three potential supervisors I liked. This is about maximizing the return on invested resources: you pay for each application, so it is risky to bet on a single supervisor. Plus, many universities ask you to name three potential supervisors.
How well the selection criteria match my scores. For example, my TOEFL speaking score was not very high (20), and Cornell was closed to me by this criterion. Other universities, like MIT, look very closely at GPA. Still others have a GRE cutoff, explicit or implicit. The explicit kind is clear; the implicit kind usually shows up as the university publishing the scores of students admitted in different years (Duke University does this, for example). If your scores are significantly lower, think twice.
Funding opportunities. Most universities describe how they fund their PhD students. Usually this is work as a teaching / research assistant. If this is not clearly stated on the university’s website, it may be a warning sign, because there is a chance you will have problems with funding. That is, they may admit you, but without funding, which for me personally was equivalent to a rejection, because a PhD program in the USA, like all education there in general, is very expensive.
How many universities to apply to depends on your time and money, as well as on the relative strength of your application. If you think your application is strong for the universities you are targeting, you can apply to a small number (<7); if it is relatively weak, it may be worth casting a wider net. Keep in mind that you may overestimate the relative strength of your application, so it is worth leaving a safety margin.
I know several people who applied either at the same time as me or a year earlier. The first, from the United States, with a very strong resume, applied to ~10 top universities, more than half of which accepted him, and he is now at Stanford. The second, from Russia, with not very good undergraduate grades, applied to six programs at five universities and was accepted by two universities, one of them in the US News top 10. The third, from China, applied to ~20 places, was accepted by one or two universities, and eventually went to a top-25 university. All of them applied for biomedical engineering.
Personally, I applied to 11 universities (8 in the USA, 2 in Canada, 1 in Europe) in Computer Science, nine of which charged an application fee. In my opinion, more than that is too many. Each university requires its own application (and the formats usually differ), so expect to spend about two hours on each one (registering on the site, filling in many fields, checking the information). The total time grows linearly with the number of universities.
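To make the linear growth concrete, here is a small back-of-the-envelope sketch. It only plugs in figures already mentioned in this guide (11 applications, nine of them with a fee, $70-150 per fee, about two hours per form); the function name `application_budget` is made up for illustration, and your own numbers will differ.

```python
# Rough budget for a batch of applications, using the figures from this guide.
N_APPLICATIONS = 11          # universities applied to
N_WITH_FEE = 9               # of which charged an application fee
FEE_RANGE_USD = (70, 150)    # per-application fee range (end of 2017)
HOURS_PER_FORM = 2           # approximate time to fill out one form


def application_budget(n_total, n_with_fee, fee_range, hours_per_form):
    """Return (min_fees, max_fees, total_hours) for a batch of applications."""
    low, high = fee_range
    return n_with_fee * low, n_with_fee * high, n_total * hours_per_form


min_fees, max_fees, hours = application_budget(
    N_APPLICATIONS, N_WITH_FEE, FEE_RANGE_USD, HOURS_PER_FORM
)
print(f"Fees: ${min_fees}-${max_fees}, form filling: ~{hours} hours")
```

The estimate covers only fees and form filling; exam costs and the time spent on the statement of purpose come on top of it.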
11. Writing a statement of purpose / personal history
Statement of Purpose (SoP) is a two-page text about who you are, why you need a PhD, what you want to do, and what relevant experience you have. This description already makes the main difficulty clear: you have to fit a huge amount of information into a very short text. Depending on your profile, aspirations and character, you will have to sacrifice some parts and write more about others.
Estimates of the importance of the SoP vary widely. Some guides say it is the most important part of a PhD application; others say it is a more or less formal part (after all, someone else can write it for the candidate). In my opinion, the SoP matters more if your profile is not ideal and you are no longer a bachelor's / specialist student at the time of application. Personally, I spent a lot of time writing it and formulated several important principles for myself, listed below. Important note: I remind you once again that this guide is very individual, and this part is doubly so. You may well arrive at a scheme of your own.

Rewrite your SoP again and again. I am still ashamed of the SoP that I wrote in a hurry for one famous European laboratory. It was smug, stupid and overloaded with unnecessary details. Be prepared to write several drafts in order to throw out everything unnecessary and state the important things vividly and briefly.
Understand what you want to do, at least in general terms. I formulated the scientific question that interests me even before I decided to apply for a PhD. This helped me in finding supervisors and writing the SoP. If you find it difficult to formulate a scientific question, even in general terms, then in my opinion this is a warning sign (see the section "Why do I need PhD?"). On the other hand, be prepared that you will not end up doing the things described in your SoP. Such is the silly duality.
Show your SoP to everyone you trust. It does not matter whether the person works in science or industry; the main thing is that they care about you. Your task is to gauge the range of reactions to your text: do you come across as ingratiating or smug? Is it clear what kind of person you are? Is it clear what you want? You want to reduce the likelihood of an extreme reaction to the text, because many different people will read it. For example, a couple of very good and painfully accurate pieces of advice came from a friend who has no higher education but understands people very well. A couple more good tips came from a person who has read a lot of cover letters.
Write to the point. The SoP is not a showcase of achievements or a CV, but a text about why you need a PhD and what exactly you want to do. All your achievements should be presented in the context of your ability to survive a PhD and do research. Everything else is better placed in the CV or the statement of personal history. What matters here is quality, not quantity, so it is better to choose two or three striking achievements and describe them well than to write a text version of your resume. Feedback from people will help you with this (see the paragraph above).
Show what kind of person you are. When I first read my friend's SoP, it seemed weak to me, because in my opinion it lacked focus and determination. After writing my own SoP, I realized that my friend's text had an important merit: it was clear from it that the author is a good person. Universities are looking not only for stars, but also for people who are pleasant to work with. Showing who you are in a short text is hard, but if you succeed, you increase the chance of finding a supervisor who is close to you in spirit (or of filtering out those who do not suit you).
Proofread for errors. This advice looks obvious right up to the moment you realize that you misspelled the name of a potential supervisor at your dream school, after you have already applied. It happened to me, and it was very unpleasant. Do not repeat my mistake.
12. CV and cherries on the cake
In this section I describe what, in my opinion, can strengthen your resume. None of the things below guarantees anything on its own (though a strong GitHub can help a lot), but each can make your profile a little stronger and help you stand out from the other candidates.
12.1 Live GitHub
Most likely you already have a GitHub account. If not, you need to create one urgently and learn how to use it, because it is a daily tool at many universities. Most likely, though, there is not much of interest on it. How do you fill it? The best option is to reproduce well-known ML / DL papers in a popular framework like TF / PyTorch / Keras. I did not have anything like that, but I have repeatedly seen this advice from people like Bengio, so do not repeat my mistake. Keep in mind that you are unlikely to bring a GitHub to life in a couple of months, so start working on it as early as possible. If you have scientific papers and can publish the code, do it, because it is the best demonstration of your code. Another option is clean code from ML competitions, even if you did not win prizes.

12.2 ML Competition Experience
If you are interested in ML, you most likely take part in Kaggle competitions. For me, this is a great way to get out of my comfort zone and try new tasks. Be aware that Kaggle requires a lot of time and mental toughness. As a rule, all the obvious things have already been done by others or described in public kernels, so you constantly have to come up with something new. That is why it is so useful. A good habit (which I never developed) is to clean up the code after the competition and post a documented solution on GitHub.
If you are in Moscow, there is a great ML training group at Yandex, where people regularly earn fresh gold / high silver medals. They also have a YouTube channel with recordings of the talks.
The downsides of Kaggle: it takes a lot of time, requires a lot of computing resources, and some of the skills needed to win are very specific. But in my opinion the pros outweigh the cons, especially if you try to generalize from your experience rather than just stack public kernels (which I myself did a little more often than I should have).
12.3 Personal website
Before applying, it would be good to have a small site that describes who you are, your projects and your aspirations. Most university applications let you include a link to your site. I built my site after submitting my PhD applications, so they contained no link to it. The best I could do at that point was to add the link to LinkedIn, GitHub, and Google Scholar. The main reason I did not make the site right away was that I chose an overly complex engine that I did not fully understand. As soon as I found a simpler and more minimalistic engine, I made the site in a couple of days. Again: do not repeat my mistake, and make the site in advance.
12.4 Google Scholar
If you have articles, then you need Google Scholar. It's that simple.
12.5 Coursera
Having Coursera courses on your resume is better than not having them, but judging by the resumes of accepted students that I have seen, they are quite common.
13. Application process
Be prepared that filling out applications is a time-consuming procedure that will take 2-3 hours per university at best. Each university has its own application system, and they can differ a lot. For example, universities have different requirements for uploading documents: one requires a single PDF of no more than 2 MB, another requires each transcript page as a separate file, a third asks you to enter key courses by hand. Or a university may require two registrations at once: one on the university's website and another on the website of the department you are applying to. For each university there are also moving parts to keep an eye on: whether the GRE and TOEFL results and the recommendations have arrived. In addition, if you write a separate statement of purpose for each university, it is good to keep links to all of them in one place.

I organized the application process in a Google Doc where I stored all the necessary information: university name, application deadline, potential academic advisers, link to the login page, the login itself, GRE and TOEFL status, the status of all three recommenders, the university's response status (useful while waiting), a link to the statement of purpose, and so on. In addition, I kept a simplified table for the recommenders, so that they could easily keep track of when and where to send recommendations. This system worked very well for me.
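If you prefer code to a spreadsheet, the same tracker can be sketched in a few lines of Python. This is only an illustration of the idea, not the table I actually used: the class `Application`, the function `todo_by_deadline` and the sample entries (universities, professors, recommenders) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Application:
    university: str
    deadline: date
    advisors: list[str]
    gre_received: bool = False
    toefl_received: bool = False
    # recommender name -> whether the letter has arrived
    recommendations: dict[str, bool] = field(default_factory=dict)
    response: str = "pending"  # e.g. "pending", "accepted", "rejected"

    def missing_items(self) -> list[str]:
        """List everything the university has not yet received."""
        missing = []
        if not self.gre_received:
            missing.append("GRE")
        if not self.toefl_received:
            missing.append("TOEFL")
        missing += [f"letter from {name}"
                    for name, done in self.recommendations.items() if not done]
        return missing


def todo_by_deadline(apps):
    """Return (university, deadline, missing items) for incomplete applications, soonest first."""
    incomplete = [a for a in apps if a.missing_items()]
    return [(a.university, a.deadline, a.missing_items())
            for a in sorted(incomplete, key=lambda a: a.deadline)]


# Hypothetical sample entries, for illustration only.
apps = [
    Application("University B", date(2017, 12, 15), ["Prof. X"],
                gre_received=True, toefl_received=True,
                recommendations={"Recommender 1": True}),
    Application("University A", date(2017, 12, 1), ["Prof. Y"],
                gre_received=True,
                recommendations={"Recommender 1": True, "Recommender 2": False}),
]

for university, deadline, missing in todo_by_deadline(apps):
    print(university, deadline.isoformat(), ", ".join(missing))
```

Sorting by deadline puts the most urgent gaps first, which is what you will be checking most often in the last weeks before submission.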
14. Waiting and response timelines
So, you have sent all the applications and made sure that the recommendation requests reached your recommenders and that they submitted the letters on time. Time to relax, right? If you are one of those people who answered "yes", you may skip this section. If, like me, you find waiting and uncertainty hard, read on.
The deadline for university responses to applications in the United States is April 15th. The problem is that the distribution of response dates varies a lot from university to university. Some start sending answers in late January or February, others not before March or April. Guessing this is very hard, so I took a different route.
There is a site, thegradcafe.com, where applicants themselves post their applications and their status. I parsed the entries from the past five years and built decision timelines, both averaged over all universities and for the universities that interested me. The timeline for all universities looks something like this (a larger image is available by link):

You can find timelines for specific universities in this album. They show that in most cases there is no need to worry if you have no answers before the beginning of March. If there are still no answers by mid-March, it most likely means that if you are on the short lists, you are near the bottom of them. But keep in mind that you can receive an offer as late as early April (as happened with one of mine).
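The aggregation behind such timelines can be sketched roughly as follows. The sketch assumes the entries have already been collected into (university, decision, date) tuples; it does not show how to fetch them from the site, and the rows below are made up purely for illustration. The function names are hypothetical as well.

```python
from collections import Counter
from datetime import date

# Hypothetical, already-parsed records: (university, decision, decision date).
# These rows are invented for illustration; they are not real results.
records = [
    ("University A", "accepted", date(2017, 2, 3)),
    ("University A", "rejected", date(2017, 2, 20)),
    ("University B", "accepted", date(2017, 3, 10)),
    ("University B", "rejected", date(2017, 4, 5)),
    ("University A", "accepted", date(2016, 2, 10)),
]


def decisions_per_month(rows, university=None):
    """Count decisions per calendar month, pooled over all years."""
    months = [d.month for uni, _, d in rows
              if university is None or uni == university]
    return Counter(months)


def share_decided_by(rows, month, day, university=None):
    """Fraction of decisions that arrived on or before month/day of their year."""
    dates = [d for uni, _, d in rows if university is None or uni == university]
    if not dates:
        return 0.0
    early = sum((d.month, d.day) <= (month, day) for d in dates)
    return early / len(dates)


print(decisions_per_month(records))
print(share_decided_by(records, 3, 1))
print(share_decided_by(records, 3, 15, university="University B"))
```

Pooling by calendar month works here because decisions fall between January and April of a single year; a per-week breakdown would give a smoother picture if you have enough entries.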
15. How to deal with refusals

Rejections are very unpleasant and hurtful. And the more you want to get into a particular university, the more its rejection hurts. I used several mental techniques to reduce the pain.
I reminded myself that universities physically cannot take all the good students. Machine learning is now an insanely competitive field, so a university receives thousands of applications for a few dozen PhD places, and most of those applications are strong in one way or another. In these conditions you can be a strong candidate and still not get in, simply because you were not chosen. The two rejections that came to me mentioned the number of people who applied to the CS PhD at that university: about 1,500 in one case and about 2,000 in the other. This means that your chances are very low, even if you are a strong candidate.
I started a rejection latte tradition. Every time I got a rejection, I went to a nearby coffee shop, bought myself a big tasty latte and drank it slowly. It worked surprisingly well: even a rejection came with a small reward. In total I received nine rejections, so that came to about five liters of coffee.
I whined to a couple of close people. In such a stressful situation it is important to have someone to talk to. This does not mean that you should pour out your sorrows to them day and night, but it is good to have someone who will listen and joke along with you or in reply.
2. A discussion on academia.stackexchange about how to live with PhD rejections
16. My results and a few words about US visas
As I wrote above, I applied to 11 universities (10 in the USA and Canada, one in Europe). About half of them were very popular and famous, such as Berkeley, UT Austin or NYU. The other half were simply good universities that, in my opinion, fly under the radar. In the end I was accepted by two good under-the-radar universities (one of which is in the top 10 of CS Rankings), which I consider a success.

Obtaining a student visa in Moscow took 2.5 months, and I had to postpone the start of my studies to the spring semester. This is further evidence of how important the choice of country is when choosing a university: all your efforts may be complicated by political tension between the countries or by the immigration policy of the country where you were admitted.
17. Conclusion and thanks
Regardless of how the situation with my visa turns out, I believe this was a useful experience. If only because I wrote this guide, which will help you avoid my mistakes (and make your own). If you still decide to apply for a PhD, good luck!
Many thanks to Pavel Nesterov (mephistopheies), Ekaterina Arkhangelskaya, Gleb Posobin, Maxim Artemyev, Yulia Denisova and Anvar Kurmukov for the comments and help in writing this guide.