
Internship at JetBrains and how I almost managed to get on it
- Tutorial

Like many young developers, when there is a desire to find a job / internship, I look in the direction of cool IT companies.
Recently I tried to get into the ranks of JetBrains and under the cut I am ready to share my experience.
Why did “almost” succeed?
Surely you immediately have such a question.
In my opinion, I have a good resume with a bunch of achievements and a good skill, which I have been improving the last 8-9 years day after day.
I completed the test task (and it seems to me good), previously visited the office of JB, which is located in my city, talked to HH and some developers of the company and, as a result, was refused an internship without any comments.
Most likely, the reason lies in the fact that JetBrains selects students exclusively for the internship, and at the moment I just graduated from the 11th and pass exams one after another.
Well, this is an occasion for another whole year to pull yourself up and apply for the next year.
Analysis of the test task
The deadlines for submitting applications for internships and testing test assignments are over, which means that everyone who solved them, including me, can post an analysis of these assignments so that next year any interested student could familiarize themselves with the approximate level of assignments before starting JB internships, with which he will have to face and in which case to pull up his knowledge.
I applied for an internship with the Corotin debugger development team for Kotlin.
The task of this team during an internship for those who hit it this year will be to finalize this part of the debugger and its integration with the IDE.
The task was a little expected - to write a debugger for a small PL.
I would not say that it is complex, rather the opposite. It does not require any in-depth knowledge of the theory of constructing translators and a cool skill. But nevertheless, those who apply for an internship in this area should at least have these basics and cope with this task without any problems. I was surprised when I decided to search on github for keywords of the solutions of my “competitors” and found 1-2 more or less working-looking solutions against about 6-7 empty repositories or with a couple of pieces of code after which people gave up. Maybe I was looking badly, but nevertheless, the results did not please me. If this post will be read by people who abandoned this task - no need to do this in the future. In an extreme case, it was enough to sit on the task for a couple of days and I am sure that you would deal with it.
The text of the quest itself
Objective: implement step-by-step code execution for the trivial programming language Guu.
Attention: in the description below, some significant points are deliberately omitted. As a rule, they remain at your discretion. If it is completely incomprehensible, write to (here is the mail that I decided to remove).
A Guu program consists of a set of procedures. Each procedure begins with the line sub (subname) and ends with the declaration of another procedure (or the end of the file if the procedure in the file is the last). Execution begins with sub main.
The body of a procedure is a set of instructions, each of which is on a separate line. Insignificant tabs or spaces may occur at the beginning of a line. Blank lines are ignored. There are no comments on Guu.
Guu has only three operators: - set (varname) (new value) - setting a new integer value for the variable. - call (subname) - call the procedure. Calls can be recursive. - print (varname) - print the value of the variable on the screen.
Variables in Guu have a global scope. The program below will display the line a = 2.
sub main
set a 1
call foo
print a
sub foo
set a 2
But the simplest program with infinite recursion:
sub main
call main You
need to write a step-by-step interpreter for Guu. When it starts, the debugger should stop on the line with the first instruction in sub main and wait for commands from the user. Minimum required set of debugger commands:
i - step into, the debugger goes inside call (subname).
o - step over, the debugger does not go inside the call.
trace - print stack trace execution with line numbers starting with main ...
var - print the values of all declared variables.
The user’s communication format with the debugger is left to the discretion above. You can choose either a minimalistic GDB-like interface or a console or graphical UI. The names of the debugger commands can be changed if desired.
To solve this problem, you can use any programming language from TIOBE TOP 50 and an open-source compiler / interpreter.
When evaluating the work will be evaluated:
The overall performance of the program;
The quality of the source code and the availability of tests;
Easy to expand functionality (for example, support for new language statements or debugger instructions).
A solution with instructions for building it must be published to the Git repository (for example, on GitHub or BitBucket). In the response you need to specify a link to the repository. A link to a private GitHub repository is also suitable, only you will need to add me to it.
Attention: in the description below, some significant points are deliberately omitted. As a rule, they remain at your discretion. If it is completely incomprehensible, write to (here is the mail that I decided to remove).
A Guu program consists of a set of procedures. Each procedure begins with the line sub (subname) and ends with the declaration of another procedure (or the end of the file if the procedure in the file is the last). Execution begins with sub main.
The body of a procedure is a set of instructions, each of which is on a separate line. Insignificant tabs or spaces may occur at the beginning of a line. Blank lines are ignored. There are no comments on Guu.
Guu has only three operators: - set (varname) (new value) - setting a new integer value for the variable. - call (subname) - call the procedure. Calls can be recursive. - print (varname) - print the value of the variable on the screen.
Variables in Guu have a global scope. The program below will display the line a = 2.
sub main
set a 1
call foo
print a
sub foo
set a 2
But the simplest program with infinite recursion:
sub main
call main You
need to write a step-by-step interpreter for Guu. When it starts, the debugger should stop on the line with the first instruction in sub main and wait for commands from the user. Minimum required set of debugger commands:
i - step into, the debugger goes inside call (subname).
o - step over, the debugger does not go inside the call.
trace - print stack trace execution with line numbers starting with main ...
var - print the values of all declared variables.
The user’s communication format with the debugger is left to the discretion above. You can choose either a minimalistic GDB-like interface or a console or graphical UI. The names of the debugger commands can be changed if desired.
To solve this problem, you can use any programming language from TIOBE TOP 50 and an open-source compiler / interpreter.
When evaluating the work will be evaluated:
The overall performance of the program;
The quality of the source code and the availability of tests;
Easy to expand functionality (for example, support for new language statements or debugger instructions).
A solution with instructions for building it must be published to the Git repository (for example, on GitHub or BitBucket). In the response you need to specify a link to the repository. A link to a private GitHub repository is also suitable, only you will need to add me to it.
I write in C ++, Java and Object Pascal.
At first there were thoughts to write everything in my same MPS, but I thought that it would not be very convenient for JB to check it, and I submitted the application 2 days before the closing of the submission (exams all the same ...), and it was already evening outside the window - I decided to quickly write everything in more well-known languages.
In my opinion, Pascal is most suitable for solving the problem, at least because of the most convenient implementation of strings ...
At least for me. In addition, it is in TIOBE TOP 50, so I boldly launched the IDE, namely Lazarus, because he is not commercial :) and set about solving the problem.
Despite the fact that they give JB as much as 7 days, it took me about an hour to complete the project, and the project turned out to be about 500 lines of code.
Where to begin?
First of all, you need to imagine how code debugging will work in the end.
We need to implement step-by-step code execution - that means each instruction should be presented in the form of a structure / class and, in general, the instructions should look like a list of these classes or, as in my implementation, refer to each other forming a sequence (I will write down why I did it later).
To get this sequence, our debugger needs to process the code in the proposed language, which means we also need to implement a small parser, as well as syntactic and semantic analysis of the code.
Let's start with the implementation of the parser. Because Since the Guu language consists of a set of tokens, separated by a space, it is logical to write a small and simple tokenizer first:
function GetToken(s: string; tokenNum: word): string;
var
p: word;
begin
s := Trim(s);
s := StringReplace(s, ' ', ' ', [rfReplaceAll]);
while tokenNum > 1 do
begin
p := Pos(' ', s);
if p > 0 then
Delete(s, 1, p)
else
begin
s := '';
break;
end;
dec(tokenNum);
end;
p := Pos(' ', s);
if p > 0 then
Delete(s, p, Length(s));
Result := s;
end;
Next, declare enum from tokens:
type
TGuuToken = (opSub, opSet, opCall, opPrint, opUnknown);
const
GuuToken: array[opSub..opPrint] of string = (
'sub', 'set', 'call', 'print'
);
And the instruction class itself, into which we will parse the lines of code:
type
TGuuOp = class
public
OpType : TGuuToken;
OpArgs : TStringList;
OpLine : Cardinal;
OpUnChangedLine: string;
NextOp : TGuuOp;
OpReg : Pointer;
function Step(StepInto: boolean; CallBacks: TList; Trace: TStringList): TGuuOp;
constructor Create(LineNum: Cardinal; Line:string);
destructor Destroy; override;
end;
In OpType the instruction will be stored, in OpArgs - the rest of the construction.
OpLine, OpUnChangedLine - information for the debugger.
NextOp is a pointer to the next statement. If it is equal to nil (null in Pascal), then there are no further instructions and you need to complete the code or return via the callback stack.
OpReg is a small pointer-register, which will be used later for a small optimization of code execution.
After the class declaration was written - I decided that the most compact and beautiful solution would be to add the parser and a little parsing in its constructor, which I did next:
constructor TGuuOp.Create(LineNum: Cardinal; Line:string);
(*
* That method parse code line.
*)
var
s: string;
w: word;
begin
inherited Create;
OpArgs := TStringList.Create;
OpLine := LineNum;
OpUnChangedLine := Line;
NextOp := nil;
OpReg := nil;
s := GetToken(Line, 1);
OpType := TGuuToken(AnsiIndexStr(s, GuuToken));
case OpType of
opSub : begin // sub
s := GetToken(Line, 2);
if Length(s) > 0 then
OpArgs.Add(s)
else
begin
writeln('[Syntax error]: Invalid construction "sub" at line ', OpLine, '.');
halt;
end;
if Length(GetToken(Line, 3)) > 0 then
begin
writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.');
halt;
end;
end;
opSet : begin // set
OpArgs.Add(GetToken(Line, 2));
OpArgs.Add(GetToken(Line, 3));
w := 1;
while w < Length(OpArgs[1]) + 1 do
begin
if not (OpArgs[1][w] in ['0'..'9']) then
begin
writeln('[Syntax error]: Invalid variable assigment "', Line, '" at line ', OpLine, '.');
halt;
end;
inc(w);
end;
if (Length(OpArgs[0]) = 0) or (Length(OpArgs[1]) = 0) or
(Length(GetToken(Line, 4)) > 0) then
begin
writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.');
halt;
end
end;
opCall : begin // call
s := GetToken(Line, 2);
if Length(s) > 0 then
OpArgs.Add(s)
else
begin
writeln('[Syntax error]: Invalid construction "call" at line ', OpLine, '.');
halt;
end;
if Length(GetToken(Line, 3)) > 0 then
begin
writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.');
halt;
end;
end;
opPrint: begin // print
s := GetToken(Line, 2);
if Length(s) > 0 then
OpArgs.Add(s)
else
begin
writeln('[Syntax error]: Invalid construction "print" at line ', OpLine, '.');
halt;
end;
if Length(GetToken(Line, 3)) > 0 then
begin
writeln('[Syntax error]: Invalid construction "', Line, '" at line ', OpLine, '.');
halt;
end;
end;
else
begin
writeln('[Syntax error]: Invalid token "', s, '" at line ', OpLine, '.');
halt;
end;
end;
end;
destructor TGuuOp.Destroy;
begin
FreeAndNil(OpArgs);
inherited;
end;
Here we essentially check the beginning of the construction (i.e. the first word), and then look at the remaining tokens and their number. If something is clearly wrong with the code, we display an error.
In the main piece of code, we simply read the code in the TStringList from the file, call the TGuuOp constructors line by line, and save the pointers to the class instances in GuuOps: TList.
Ads:
var
LabelNames: TStringList;
GuuOps, GuuVars: TList;
SubMain: TGuuOp = nil;
Together with code parsing, it would be nice to perform a couple more actions:
procedure ParseNext(LineNum: Cardinal; Line: string);
(*
* Parsing code lines and define variables and labels.
*)
var
Op: TGuuOp;
GV: TGuuVar;
c: cardinal;
begin
if Trim(Line) <> '' then
begin
Op := TGuuOp.Create(LineNum, Line);
GuuOps.Add(Op);
case Op.OpType of
opSet: begin // define variable and/or optimisation var calling
GV := nil;
c := 0;
while c < GuuVars.Count do
begin
if TGuuVar(GuuVars[c]).gvName = Op.OpArgs[0] then
begin
GV := TGuuVar(GuuVars[c]);
break;
end;
inc(c);
end;
if GV = nil then
begin
GV := TGuuVar.Create(Op.OpArgs[0]);
GuuVars.Add(GV);
end;
Op.OpReg := GV;
end;
opSub: begin // Check for label dublicade declaration
if Op.OpArgs[0] = 'main' then
SubMain := Op;
if LabelNames.IndexOf(Op.OpArgs[0]) <> -1 then
begin
writeln('[Error]: Dublicate sub "', Op.OpArgs[0], '" declaration at line ', Op.OpLine, '.');
halt;
end
else
LabelNames.Add(Op.OpArgs[0]);
end;
end;
end;
end;
At this stage, you can check the entry points at the time of redefinition and think of OpReg - I used it to store a pointer to a Guu variable.
Speaking of variables, I took this small piece of code into a separate unit:
unit uVars;
{$mode objfpc}{$H+}
interface
uses
Classes, SysUtils;
type
TGuuVar = class
public
gvName: string;
gvVal: variant;
constructor Create(VarName: string);
end;
implementation
constructor TGuuVar.Create(VarName: string);
begin
inherited Create;
gvName := VarName;
gvVal := 0;
end;
end.
Now we have parsed code that seems to be correct in syntax. It remains to analyze it and you can begin to perform the most important thing - debugging.
Next, you need to implement a little semantic analysis and simultaneously prepare everything for code execution and debugging:
procedure CheckSemantic;
(*
* Semantic analyse and calls optimisation.
*)
var
c, x: cardinal;
op: TGuuOp;
begin
if GuuOps.Count > 0 then
begin
if TGuuOp(GuuOps[0]).OpType <> opSub then
begin
writeln('[Error]: Operation outside sub at line ', TGuuOp(GuuOps[0]).OpLine, '.');
halt;
end;
c := 0;
while c < GuuOps.Count do
begin
case TGuuOp(GuuOps[c]).OpType of
opSub:;
opCall: begin
TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]);
x := 0;
op := nil;
while x < GuuOps.Count do
begin
if TGuuOp(GuuOps[x]).OpType = opSub then
if TGuuOp(GuuOps[x]).OpArgs[0] = TGuuOp(GuuOps[c]).OpArgs[0] then
begin
op := TGuuOp(GuuOps[x]);
break;
end;
inc(x);
end;
if op <> nil then
TGuuOp(GuuOps[c]).OpReg := op
else
begin
writeln('[Error]: Calling to not exist sub "', TGuuOp(GuuOps[c]).OpArgs[0],
'" at line ', TGuuOp(GuuOps[c]).OpLine, '.');
halt;
end;
end;
opPrint: begin
TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]);
x := 0;
while x < GuuVars.Count do
begin
if TGuuVar(GuuVars[x]).gvName = TGuuOp(GuuOps[c]).OpArgs[0] then
begin
TGuuOp(GuuOps[c]).OpReg := TGuuVar(GuuVars[x]);
break;
end;
inc(x);
end;
if TGuuOp(GuuOps[c]).OpReg = nil then
begin
writeln('[Error]: Variable "', TGuuOp(GuuOps[c]).OpArgs[0],
'" for print doesn''t exist at line ', TGuuOp(GuuOps[c]).OpLine, '.');
end;
end;
else
TGuuOp(GuuOps[c - 1]).NextOp := TGuuOp(GuuOps[c]);
end;
inc(c);
end;
end;
end;
In TGuuOp.NextOp of each token, write a pointer to the next token.
For the call opcode, we’re doing it tricky and simple - in NextOp we write a pointer to the called entry point.
We also check the output variables through the print statement ...
Maybe they were not declared before the output?
Now you need to implement code execution. Go back to the TGuuOp class and implement the Step method:
function TGuuOp.Step(StepInto: boolean; CallBacks: TList; Trace: TStringList): TGuuOp;
(*
* That method execute instruction.
*)
var
Op: TGuuOp;
CBSize: Cardinal;
begin
case OpType of
opSub: begin
Trace.Add('-> Sub "' + OpArgs[0] + '"');
Result := NextOp;
end;
opCall: begin
if StepInto then
begin
if NextOp <> nil then
CallBacks.Add(NextOp);
Result := TGuuOp(OpReg);
end
else
begin
Op := TGuuOp(OpReg);
CBSize := CallBacks.Count;
while ((Op <> nil) or (CallBacks.Count > CBSize)) and (Trace.Count < STACK_SIZE) do
begin
if Op = nil then
begin
Op := TGuuOp(CallBacks[CallBacks.Count - 1]);
CallBacks.Delete(CallBacks.Count - 1);
Trace.Delete(Trace.Count - 1);
end;
Op := Op.Step(StepInto, CallBacks, Trace);
end;
Result := NextOp;
end;
end;
opPrint: begin
writeln(TGuuVar(OpReg).gvName, ' = ', TGuuVar(OpReg).gvVal);
Result := NextOp;
end;
opSet: begin
TGuuVar(OpReg).gvVal := OpArgs[1];
Result := NextOp;
end;
end;
end;
To avoid access violation in the event of a loop - it is better to limit the stack, which I did.
The constant STACK_SIZE = 2048, declared above is just responsible for this.
Now it’s finally time to write the main code of our debugger:
var
code: TStringList;
c: Cardinal;
cmd: string;
CallBacks: TList;
Trace: TStringList;
DebugMode: boolean = true;
begin
if ParamCount > 0 then
begin
// Initialisation
if not FileExists(ParamStr(1)) then
begin
writeln('[Error]: Can''t open file "', ParamStr(1), '".');
halt;
end;
if ParamCount > 1 then
if LowerCase(ParamStr(2)) = '/run' then
DebugMode := false;
code := TStringList.Create;
code.LoadFromFile(ParamStr(1));
GuuOps := TList.Create;
GuuVars := TList.Create;
// Parsing and preparing
LabelNames := TStringList.Create;
c := 0;
while c < code.Count do
begin
ParseNext(c + 1, Trim(code[c]));
inc(c);
end;
FreeAndNil(LabelNames);
CheckSemantic;
if SubMain = nil then
begin
writeln('[Error]: Sub "main" doesn''t exist!');
halt;
end;
// Start code execution
CurrentOp := SubMain;
CallBacks := TList.Create;
Trace := TStringList.Create;
if DebugMode then
begin
//Out code and features
ClrScr;
writeln('Code for debugging:');
writeln('.....');
c := 0;
while c < code.Count do
begin
writeln(FillSpaces(IntToStr(c + 1), 4), '| ', code[c]);
inc(c);
end;
writeln('"""""');
FreeAndNil(code);
writeln(sLineBreak,
'Features:', sLineBreak,
'* i - step into.', sLineBreak,
'* o - step over.', sLineBreak,
'* trace - print stack trace.', sLineBreak,
'* var - print variables list.', sLineBreak,
'* x - exit.', sLineBreak);
// Execution loop
while ((CurrentOp <> nil) or (CallBacks.Count > 0)) and (Trace.Count < STACK_SIZE) do
begin
write('Line ', CurrentOp.OpLine, ' ~> ');
readln(cmd);
// Execute commands
if cmd = 'i' then
CurrentOp := CurrentOp.Step(true, CallBacks, Trace)
else
if cmd = 'o' then
CurrentOp := CurrentOp.Step(false, CallBacks, Trace)
else
if cmd = 'trace' then
begin
writeln('| Trace:');
c := 0;
while c < Trace.Count do
begin
writeln('| ', Trace[c]);
inc(c);
end;
writeln('| -> Line ', CurrentOp.OpLine, ': "', CurrentOp.OpUnChangedLine, '".')
end
else
if cmd = 'var' then
begin
writeln('| Variables list:');
c := 0;
while c < GuuVars.Count do
begin
writeln('| ', TGuuVar(GuuVars[c]).gvName, ' = ', TGuuVar(GuuVars[c]).gvVal);
inc(c);
end;
end
else
if cmd = 'x' then
halt;
// Check for method end & make callback
if (CurrentOp = nil) and (CallBacks.Count > 0) then
begin
CurrentOp := TGuuOp(CallBacks[CallBacks.Count - 1]);
CallBacks.Delete(CallBacks.Count - 1);
Trace.Delete(Trace.Count - 1);
end;
end;
end
else
begin
// Only run mode (/run)
FreeAndNil(code);
while ((CurrentOp <> nil) or (CallBacks.Count > 0)) and (Trace.Count < STACK_SIZE) do
begin
CurrentOp := CurrentOp.Step(false, CallBacks, Trace);
if (CurrentOp = nil) and (CallBacks.Count > 0) then
begin
CurrentOp := TGuuOp(CallBacks[CallBacks.Count - 1]);
CallBacks.Delete(CallBacks.Count - 1);
Trace.Delete(Trace.Count - 1);
end;
end;
end;
if Trace.Count >= STACK_SIZE then
writeln('[Runtime error]: Stack overflow!');
FreeAndNil(CallBacks);
FreeAndNil(Trace);
end
else
writeln(
'Guu debugger v1.0.', sLineBreak,
'Author: Pavel Shiryaev (@RoPi0n).', sLineBreak,
'Run: svmc guu_debugger.vmc [arg]', sLineBreak,
'Args:', sLineBreak,
' /run - Run Guu code.'
);
end.
By the condition of the job, the interface can be implemented as you like.
It would be possible to implement a full-fledged UI, fasten SynEdit to the project, but in my opinion it’s a waste of work that will not reflect the skill, and also, which will not be paid for :)
So I limited myself to a small console UI.
The code above is not complicated, so you can leave it without comment. In it, we take ready-made TGuuOp's and call them Step.
Screenshots of the solved problem:


Displaying error information:


Link to the repository of my solution: click
Summary
There are no particular results. I’ll have to devote most of the summer to a busy vacation and search for a university (well, in case I pass the exam well, of course), instead of two months of work and training in the JetBrains team.
Perhaps next year a new post will appear on Habré, already describing the internship process in JB or in another company interesting to me :)