
I want to work at Google! Telephone Interview (Part 3, Nutritional)
From the comments on the previous article, in addition to a bunch of useful information, discussion of the shortcomings of my code, I also made a strategic decision - by hook or by crook to avoid C / C ++ programming in the next interview. Affects the lack of practice writing programs. For more than 4 years he has not been touched and the python was enough for any statistical calculations and data visualization. But I will definitely return to the classic textbooks next week. Comrades TheHorse and 0leGG ashamed, I have a second state, and AxisPod scored the last nail in the coffin of my hopes, that will go to the old knowledge. Therefore, shifting the emphasis towards the beloved Python, let's look at possible tasks.
And more news from the fronts: after discussionthe question raised by danmiru , and the call from my recruiter who found out about these publications on Habré (the world is still small), I made a strong-willed decision not to publish the specific tasks that I had in the interview. The list of topics for training and standard tasks taken from public sources do not cause big problems for, I hope, my future employer.
Well and again, I am very grateful to the habro-audience for such an active discussion of these topics. Having filtered out the comments “I would have put the author to the wall”, and proceeding immediately to the proposals in which the code is discussed, I learned a lot of useful things for myself. Many thanks!!!
Well, let's try to look at data structures and slightly more complex algorithms implemented in python. The tasks that we will consider here are very much inspired by the book Types of coding questions Google asks: Programming Interviews Exposed; Secrets to Landing Your Next Job (Programmer to Programmer) ” by John Mongan, Noah Suojanen and Eric Giguere. But I hope I do not violate copyright, because these are common programming problems, and I give sources where else they were discussed; I choose only those that I liked a lot; plus the tasks in the book are solved in C, I'm trying to write them in python.
Remarkably, this data structure is implemented in C, which I so bravely decided not to touch.
Last time, I won’t! There are no per se pointers in Python , but any object can be assigned to a variable. And therefore, we can make a chain of objects by assigning the next object from the chain to the property of the previous link. There is no particular practical value in such a design, but for the sake of practice ... (much can be justified by this).
That is, we will build such a construction:
It’s more correct to make a class for the sequence itself. To add a new element to the beginning of the chain, find the link you need:
In python, the garbage collector tracks the state of objects by the number of links to this object. If all links have been deleted or reassigned, the object will be deleted. Here, prior to the 2.0 version of python, there was a potential problem if the object referred to itself - it would never have been deleted, so they introduced a regular check for looping and now even such a design will be freed after some time:
But this is such a lyrical digression. In general, this kind of design is very artificial. In python, this type of data is easily replaced by regular sheets (lists, tuples).
But if the task is to write this kind of data construction, you can generally implement a lot through the lambda functions, as was done here :
The functions are a little confused, but you can figure it out by looking at what they do.
Next, we look at the task that can be given for this structure
To those two classes that we described earlier, you need to add two methods
Unfortunately, it turns out like last time - I spend a lot of time writing 20% of the text at the beginning, and then realizing that there is not that much time, I’ll add the rest 80 headlong. But what to do, this already looks like a tradition. So let's go, so my high-speed 80% started.
We use the whole subject with related sequences. This task tests the C programming skills well, as you have to work with pointers and specifically monitor code execution errors (lack of memory, incorrect input data). But for python, such a task will do.
The idea is that instead of storing data about the entire sequence and running along the first m elements, if this is possible (if not, return None, although you can also raise an exception). And then we reach the end of the sheet simultaneously with the second pointer, which is offset from the first by m elements. This method saves memory - we only need to keep two pointers in memory. And this is the O (n) algorithm, in the worst case, and this is the best that can be achieved for a one-way connected sequence.
Inert tongue on the brink of fiction. This example was discussed, for example, here , using a recursive method. You can implement it all through iterations. The idea of the method is as follows:
we go through the sequence first, until we come across an empty element (end of the sequence)
if we see that the current element has a child - throw it at the end of the sequence, tie the tail with the child and vice versa
find a new tail, running with two pointers (tail runs twice as fast as cyclecheck), if there is a cycle - pointers will ever coincide.
The constructor of this data structure is very simple, but it was done to simply create such a data structure. The class simply creates a double linked sequence. And a stand-alone function connects two elements, like relatives. To avoid evil languages, I say in advance, this was done on purpose. In production, hands need to be torn off.
For practice, you can restore the data structure (deFlatten) back. Because Since we did not delete information about children, we can run in sequence and when we find a link to a child - remove the connection between the tail and the child, and the child with the tail; recursively call a function for a child.
It is possible to achieve partitioning iteratively. But running in sequence is better from the end. When a child is found, cut it off from the previous element (this will be a new tail), remove the link to the child from the new tail; follow further in sequence. Everything will work fine (specialists-algorithmists have comments?), If there were no cycles in the structure. But we checked for this condition when we created a flat structure.
One of the issues on the list of topics (it’s good that I was allowed to publish them) was the types of binary tree traversal. If you do not remember, I’m talking about preorder (KLP), inorder (LKP) and postorder (KLP). This just indicates in what sequence we will look at the root (K), the left (L) branch and the right (P) branch. This question is quite common and has already been written about here and on stackoverflow .
To iterate through the tree, you need to store information about the branches that you need to visit. A stack is suitable for a preorder traversal - this is a traversal in depth. I had to sit longer on the iterative inorder bypass. But the same idea as last time. Just to get the leftmost element of the tree, we go down to it until we come across an empty sheet. We take out the previous value from the stack, this will be the sheet itself, display its value and look to the right. If there is nothing there, then we will get from the stack the parent who already has the right element. Etc. The crawl is in the order of the leftmost, root, rightmost element.
In python, dictionaries and sets use a hash table to store and find values / keys. The algorithm shown below will be at least O (n) (pass along the entire line), since it will take some more time to add new characters to the dictionary.
Will there be ideas how to increase speed yet? Here then everything was done through the normal sequence. Checking whether a character is in a sequence is not done in constant time O (1), as for dictionaries and sets. So I think the use of dictionaries is slightly better.
Well, the last for today:
The second method is not only shorter, but also faster:
There will be no problems from my interview, because I was very much asked not to publish them. I apologize.
Everything, start kicking!
Part 1
Part 2
And more news from the fronts: after discussionthe question raised by danmiru , and the call from my recruiter who found out about these publications on Habré (the world is still small), I made a strong-willed decision not to publish the specific tasks that I had in the interview. The list of topics for training and standard tasks taken from public sources do not cause big problems for, I hope, my future employer.
Well and again, I am very grateful to the habro-audience for such an active discussion of these topics. Having filtered out the comments “I would have put the author to the wall”, and proceeding immediately to the proposals in which the code is discussed, I learned a lot of useful things for myself. Many thanks!!!
Well, let's try to look at data structures and slightly more complex algorithms implemented in python. The tasks that we will consider here are very much inspired by the book Types of coding questions Google asks: Programming Interviews Exposed; Secrets to Landing Your Next Job (Programmer to Programmer) ” by John Mongan, Noah Suojanen and Eric Giguere. But I hope I do not violate copyright, because these are common programming problems, and I give sources where else they were discussed; I choose only those that I liked a lot; plus the tasks in the book are solved in C, I'm trying to write them in python.
Linked list
Remarkably, this data structure is implemented in C, which I so bravely decided not to touch.
typedef struct IntElement {
struct IntElement * next;
int data;
} IntElement;
Last time, I won’t! There are no per se pointers in Python , but any object can be assigned to a variable. And therefore, we can make a chain of objects by assigning the next object from the chain to the property of the previous link. There is no particular practical value in such a design, but for the sake of practice ... (much can be justified by this).
That is, we will build such a construction:
class Element:
def __init __ (self, data = None):
self.next = None
self.data = data
It’s more correct to make a class for the sequence itself. To add a new element to the beginning of the chain, find the link you need:
class LinkedList:
def __init __ (self, data = None):
self.head = None
if data:
# initializing the sheet, we immediately want to
stuff the data newElement = Element (data)
self.head = newElement
def insertFront (self, data):
newElement = Element (data)
newElement.next = self.head
self.head = newElement
def insertBack (self, data):
pos = self.head
while (not pos.next): pos = pos.next
pos.next = Element ( data)
def find (self, data):
next = self.head
while (not next and data! = next.data): next = next.next
return next
def delete (self, delElement):
# special case with the beginning of the sequence
if delElement == self.head:
self.head = self.head.next
# gc in python will delete an object if no one refers to it
# (reference count )
# or: del delElement - what will the experts say?
return True
pos = self.head
while (not pos.next):
# tidy with the last element. pos - will change to
# the penultimate
if delElement == pos.next:
pos.next = delElement.next
# if we delete the last element, then delElement.next == None
# again, is del delElement necessary here?
return True
pos = pos.next
return False
def deleteAll (self):
while (not self.head.next):
delElement = self.head
self.head = self.head.next
del delElement # or enough delElement.next = None
self .head = None
In python, the garbage collector tracks the state of objects by the number of links to this object. If all links have been deleted or reassigned, the object will be deleted. Here, prior to the 2.0 version of python, there was a potential problem if the object referred to itself - it would never have been deleted, so they introduced a regular check for looping and now even such a design will be freed after some time:
class Untouchable:
def __init __ (self):
self.me = self
But this is such a lyrical digression. In general, this kind of design is very artificial. In python, this type of data is easily replaced by regular sheets (lists, tuples).
But if the task is to write this kind of data construction, you can generally implement a lot through the lambda functions, as was done here :
w = sys.stdout.write
cons = lambda el, lst: (el, lst)
mklist = lambda * args: reduce (lambda lst, el: cons (el, lst), reversed (args), None)
car = lambda lst : lst [0] if lst else lst
cdr = lambda lst: lst [1] if lst else lst
nth = lambda n, lst: nth (n-1, cdr (lst)) if n> 0 else car (lst)
length = lambda lst, count = 0: length (cdr (lst), count + 1) if lst else count
begin = lambda * args: args [-1]
display = lambda lst: begin (w ("% s"% car ( lst)), display (cdr (lst))) if lst else w ("niln")
The functions are a little confused, but you can figure it out by looking at what they do.
Next, we look at the task that can be given for this structure
Write a stack using a linked sequence of objects
To those two classes that we described earlier, you need to add two methods
push, pop
class Stack (LinkedList):
def push (self, data): insertFront (data)
def pop (self):
tmp = self.head.data
delete (self.head)
return tmp
Unfortunately, it turns out like last time - I spend a lot of time writing 20% of the text at the beginning, and then realizing that there is not that much time, I’ll add the rest 80 headlong. But what to do, this already looks like a tradition. So let's go, so my high-speed 80% started.
Return the mth element from the end of the sequence (m = 0 - the last element)
We use the whole subject with related sequences. This task tests the C programming skills well, as you have to work with pointers and specifically monitor code execution errors (lack of memory, incorrect input data). But for python, such a task will do.
def mFromTheEnd (self, m):
cur = self.head
if not cur: return None # check for an empty sequence
for i in range (m):
if cur.next: cur = cur.next
else: return None
mcur = self. head
while (cur.next):
mcur = mcur.next
cur = cur.next
return mcur
The idea is that instead of storing data about the entire sequence and running along the first m elements, if this is possible (if not, return None, although you can also raise an exception). And then we reach the end of the sheet simultaneously with the second pointer, which is offset from the first by m elements. This method saves memory - we only need to keep two pointers in memory. And this is the O (n) algorithm, in the worst case, and this is the best that can be achieved for a one-way connected sequence.
Spread double-linked sequences with children
Inert tongue on the brink of fiction. This example was discussed, for example, here , using a recursive method. You can implement it all through iterations. The idea of the method is as follows:
we go through the sequence first, until we come across an empty element (end of the sequence)
if we see that the current element has a child - throw it at the end of the sequence, tie the tail with the child and vice versa
find a new tail, running with two pointers (tail runs twice as fast as cyclecheck), if there is a cycle - pointers will ever coincide.
The constructor of this data structure is very simple, but it was done to simply create such a data structure. The class simply creates a double linked sequence. And a stand-alone function connects two elements, like relatives. To avoid evil languages, I say in advance, this was done on purpose. In production, hands need to be torn off.
#! / usr / bin / env python
# - * - coding: utf-8 - * -
import sys
class Element:
def __init __ (self, data = None):
self.next = None
self.prev = None
self.data = data
self.child = None
class DoubleLinkedList:
def __init __ (self, data = None):
self.head = None
self.tail = None
if data:
for dat in data:
newEl = Element (dat)
if self.tail: self. tail.next = newEl
newEl.prev = self.tail
if not self.head: self.head = newEl
self.tail = newEl
def addChild (parent, child):
parent.child = child
# no loopback test
def flat (lst):
cur = lst.head
while (cur):
if cur.child:
lst.tail.next = cur.child
cur. child.prev = lst.tail
cyclecheck = lst.tail
while (lst.tail.next and lst.tail.next.next):
cyclecheck = cyclecheck.next
lst.tail = lst.tail.next.next
# There is an error in the book this algorithm since the first step is not done
# and the initial conditions are the same, i.e. if is always executed
if cyclecheck == lst.tail:
print ("Looping in the data structure")
sys.exit ()
if lst.tail.next: lst.tail = lst.tail.next
cur = cur.next
def main ():
st1 = DoubleLinkedList (range (6))
st2 = DoubleLinkedList (range (4))
st3 = DoubleLinkedList (range (3))
addChild (st1.head, st2.head)
addChild (st1.tail, st3.head)
# addChild (st2.tail, st3.head) - loop .
flat (st1)
cur = st1.head
while (cur):
print (cur.data)
cur = cur.next
if __name__ == '__main__':
main ()
For practice, you can restore the data structure (deFlatten) back. Because Since we did not delete information about children, we can run in sequence and when we find a link to a child - remove the connection between the tail and the child, and the child with the tail; recursively call a function for a child.
It is possible to achieve partitioning iteratively. But running in sequence is better from the end. When a child is found, cut it off from the previous element (this will be a new tail), remove the link to the child from the new tail; follow further in sequence. Everything will work fine (specialists-algorithmists have comments?), If there were no cycles in the structure. But we checked for this condition when we created a flat structure.
Iterative binary tree traversal
One of the issues on the list of topics (it’s good that I was allowed to publish them) was the types of binary tree traversal. If you do not remember, I’m talking about preorder (KLP), inorder (LKP) and postorder (KLP). This just indicates in what sequence we will look at the root (K), the left (L) branch and the right (P) branch. This question is quite common and has already been written about here and on stackoverflow .
#! / usr / bin / env python
# - * - coding: utf-8 - * -
class N:
def __init __ (self, data, left = None, right = None):
self.data = data
self.left = left
self.right = right
def __unicode __ (self):
return '% s'% self.data
# a great way to print the root value
def inorder (t):
if t:
for x in inorder (t.left):
yield x
yield t. data
for x in inorder (t.right):
yield x
def iterate_preorder (root):
stack = []
stack.append (root)
while stack:
cur = stack.pop ()
yield cur.data
if cur.right: stack.append (cur.right)
if cur.left: stack.append (cur.left)
def iterate_inorder (root):
stack = []
cur = root
while stack or cur:
if cur :
stack.append (cur)
cur = cur.left
else:
if stack:
cur = stack.pop ()
yield cur.data
cur = cur.right
def main ():
bt = N (4, N (2, N ( 1), N (3)), N (6, N (5), N (7, None, N (8))))
print (list (iterate_preorder (bt)))
print (list (iterate_inorder (bt)) )
return 0
if __name__ == '__main__':
main ()
To iterate through the tree, you need to store information about the branches that you need to visit. A stack is suitable for a preorder traversal - this is a traversal in depth. I had to sit longer on the iterative inorder bypass. But the same idea as last time. Just to get the leftmost element of the tree, we go down to it until we come across an empty sheet. We take out the previous value from the stack, this will be the sheet itself, display its value and look to the right. If there is nothing there, then we will get from the stack the parent who already has the right element. Etc. The crawl is in the order of the leftmost, root, rightmost element.
Find the first non-duplicate character in a string
In python, dictionaries and sets use a hash table to store and find values / keys. The algorithm shown below will be at least O (n) (pass along the entire line), since it will take some more time to add new characters to the dictionary.
#! / usr / bin / env python
# - * - coding: utf-8 - * -
def nonRepeated (st):
repeated = dict ()
for l in st:
if l in repeated: repeated [l] = 0
else: repeated [l] = 1
for l in st:
if repeated [l]: return l
def main ():
print (nonRepeated ('test'))
return 0
if __name__ == '__main__':
main ()
Will there be ideas how to increase speed yet? Here then everything was done through the normal sequence. Checking whether a character is in a sequence is not done in constant time O (1), as for dictionaries and sets. So I think the use of dictionaries is slightly better.
Well, the last for today:
Flip the word order in a sentence
reverseOrder = lambda str: '' .join ((x [:: - 1] for x in str [:: - 1] .split ()))
rev_order = lambda s: '' .join (s.split () [ :: - 1]) # shorter version from reputable printf
The second method is not only shorter, but also faster:
>>> from timeit import Timer
>>> Timer ("reverseOrder ('1 2 3 45')", "from __main__ import reverseOrder"). timeit ()
3.4366888999938965
>>> Timer ("rev_order ('1 2 3 45' ) "," from __main__ import rev_order "). timeit ()
0.9511728286743164
There will be no problems from my interview, because I was very much asked not to publish them. I apologize.
Everything, start kicking!
Part 1
Part 2