Part 3/3. The ideal VM compiler for ICFPC 2009, on Haskell, with popularizing comments
Ending. Previous parts: 1 and 2
What else is left? We missed the place where two adjacent opcodes turn into one. What did we have? According to the specification, the following operations took place: Examining binary files for how these constructs are used, we realized that they came with a pair of Cmp, then Phi. Surely, because it was compiled by the organizers from some of their original formulas by some kind of compiler. Well, if so, then we glue them a little in one operation:
Now, we determine that for an empty list, nothing needs to be done. For a list starting with our two teams we are looking for, we replace them with two new teams, and then we process the rest of the list. And if the list starts with everything else (which corresponds to the third line, the rest is given the name “x”), then we leave this rest unchanged, and process the rest of the list “xs”.
Now the main part of the program. Haskell entry point - main function
This we read the file (see above)
Conclusion
This "compiler" is written at the level of a novice Haskelist, of course. But even at this level, I became noticeably different from Java, in which I write every day at work.
Firstly, the absence of any “utility functions” and classes required in java to implement all kinds of patterns there that would probably be used by any authoritative javist.
Secondly, the distribution of time. I spent a lot of time researching working with Ptr - because I chose this particular method of converting 8 bytes to Double, and I have not done this before. Then, about two-thirds of the time it took to write the decoding of the file, creating the AST and its basic serialization in Haskell-code (linear). And only a third of the time I spent actually on the algorithm for translating flat opcodes into a tree-like AST, and the algorithm turned out by itself.
In short, I experienced a feeling of deep satisfaction. It is a pity there was no time for this at IFCPC.
What else is left? We missed the place where two adjacent opcodes turn into one. What did we have? According to the specification, the following operations took place: Examining binary files for how these constructs are used, we realized that they came with a pair of Cmp, then Phi. Surely, because it was compiled by the organizers from some of their original formulas by some kind of compiler. Well, if so, then we glue them a little in one operation:
flag = m20 > 0
if (flag) m222 = m3 else m222 = m4
-- convert 2-op conditional operator to 1-op
removePhi [] = []
removePhi ((Cmpz cond condr1):(Phi addr r1 r2):xs) =
Noop addr : If cond condr1 addr r1 r2 : removePhi xs
removePhi (x:xs) = x:removePhi xs
Here, the list is managed as follows: a list of opcodes is supplied to the input of the function; at the output, we have a list in which this pair [Cmp, Phi] is replaced by [Noop, If]. Noop is inserted in order to leave the addresses of all commands unchanged (i.e. so that their number and position do not change, because their position in memory is one of their operands). Now, we determine that for an empty list, nothing needs to be done. For a list starting with our two teams we are looking for, we replace them with two new teams, and then we process the rest of the list. And if the list starts with everything else (which corresponds to the third line, the rest is given the name “x”), then we leave this rest unchanged, and process the rest of the list “xs”.
Now the main part of the program. Haskell entry point - main function
main = do
(len,buf) <- readMyFile
This we read the file (see above)
let nframes = fromInteger $ len `div` 12 -- посчитали количество опкодов и размер памяти.
-- opcode/data memory in tuples
ids <- mapM (\i -> instruction (plusPtr buf $ i*12) i) [0..nframes-1]
Here, each i (from 0 to nframes-1] was mapped to the result - a pair (instructon, data), which came from adding to the pointer (buf) the corresponding offset (i * 12), and interpreting the command at this address (calling instruction with a pointer where it lies and with the instruction number, for even-even sampling).-- code AST
let code = removePhi $ map disasm $ zip (map fst ids) [O..]
Now we (read on the right), took from the list of pairs (ids) all (map) first (fst) elements (instructions only), made pairs (zip) with numbers from zero to as many as needed ([0 ..] ), then for all these pairs (instruction, address) passed (map) through disasm, got a list of Ops, then removed Phi from them, making If instead (using removePhi). -- print code
putStrLn $ produceCode code (map snd ids)
Further, having code and data (via map snd ids - take the second half of the pairs "(instruction, data)", they called produceCode with two parameters and printed the result (putStrLn).free buf
And this line definitely does not need comment. Conclusion
This "compiler" is written at the level of a novice Haskelist, of course. But even at this level, I became noticeably different from Java, in which I write every day at work.
Firstly, the absence of any “utility functions” and classes required in java to implement all kinds of patterns there that would probably be used by any authoritative javist.
Secondly, the distribution of time. I spent a lot of time researching working with Ptr - because I chose this particular method of converting 8 bytes to Double, and I have not done this before. Then, about two-thirds of the time it took to write the decoding of the file, creating the AST and its basic serialization in Haskell-code (linear). And only a third of the time I spent actually on the algorithm for translating flat opcodes into a tree-like AST, and the algorithm turned out by itself.
In short, I experienced a feeling of deep satisfaction. It is a pity there was no time for this at IFCPC.