Pavel's BlogA selfish personal technical blog, that I use entirely for my personal growth. Stiinta |
Comenteaza
We're hiring again!
Our team is expanding an we're growing again.We need to fill 9 positions. If you're interested (or know anyone interested) send me your resume at pavel@fusu.usAbout things we do:http://www.allthingsdistributed.com/2014/11/apollo-amazon-deployment-engine.html
Apollo-based CodeDeploy is out!
We're proud to announce our new service CodeDeploy, based on our internal tool Apollo:http://www.allthingsdistributed.com/2014/11/apollo-amazon-deployment-engine.html
We're hiring!
Update:Position has been filled but we'll be hiring again soon!Builder Tools team @ Amazon is hiring:http://www.amazon.com/gp/jobs/254131Do you think you have what it takes? We need you!Apply online or send me your resume and we'll get you an interview.
Finding The Median In Large Sets Of Numbers Split Across 1000 Servers
How would you find the median across a thousand servers with a billion of numbers each?This is a question that involves lots of discussion because it may be quite vague and may require taking some assumptions. Obviously we can't do any kind of in-memory sorting because we don't have enough memory. We may possibly fit one set at a time, which would be under 1 GB of memory.The first obvious solution is an external merge sort and then a look up of the n/2 element (or the average of n/2 and n/2 + 1 on even n's). This solution would be n log n, but you may look for an O (n) solution, which someone may find impossible, since you need a sorted set to determine the median that involves element comparison which leads to the n log n complexity. But, who says we need sort by comparison.. ?Another solution comes from our observation that we need to find the n/2 element in a sorted array, which can be generalized to the order statistics algorithm of finding the kth smallest number in an unsorted array. This algorithm is also called quick select or the selection algorithm. I talked about an implementation of this algorithm in this article. The problem is that this is an in-memory algorithm and we would not have enough memory to maintain all the sets. So we will need a distributed version of this algorithm which will basically need a routine to swap numbers across servers. We have to assume that this is an optimized and efficient routine, because there's just too many factors that can affect the efficiency of this routine (network latency, connection problems). This is a O(n) solution and the distributed version of kth smallest may look like this:Another solution comes from the second observation that we shouldn't use a comparison sort. We could use a histogram of the counts of the numbers across the servers, like in the counting sort. But here we would have to assume a few more things. First we have to know something about the range of the numbers. If we have ranges of order of billions we could store an array of a few billions cells or at least one billion since the problem statement allows us to process a billion numbers in memory, which is the second assumption. Again this is a O(n) solution because we compute the histogram by going once through all the numbers and then find the kth number (the median) by going through the elements of the histogram. In code it may look similar to this: We could think of more solutions if we could assume even more facts. Let's say we know that the numbers across all servers are uniformly distributed. Having this large amount of numbers we could say the median is also the average of all numbers. So if we also know the range of distribution, say 0.. 1 billion, then computing the median is a O(1) operation and all we need to do is to compute 0 + (1 billion - 0) / 2 which is 500 million. If we don't know the range we can compute the median of medians in O(n) by using order statistics on each server.If the distribution is not uniform but we know some information about it we could still calculate the median with some probability. Of course we can find other different solutions if we take various assumptions, but the above solutions can probably satisfy any interviewer or whoever asked this question.
Printing Outside Nodes (Boundary) Of A Binary Tree
Contrary to many solutions I found online for this problem - I'm going to say a breadth first search (while printing the first and the last element) is not appropriate here and it won't help in cases where the tree has leafs at levels closer to the root, and those algorithms would miss that. The solution here is a depth first pre-order algorithm, but let me explain the problem statement first. So having the tree below (it is a binary search tree), print the boundary nodes of the tree anticlockwise. Printing the boundary nodes for our tree should result in the following sequence: [60, 41, 16, 25, 42, 55, 62, 64, 70, 65, 74].
Fig. 1 - Simple Binary Search TreeA pre-order traversal is more appropriate here because it traverses branches in the way we need; for example, it will traverse [60, 41, 16, 25] first which is what we need and we print them. The next sequences are of no interest to us, so we could add a flag that determines whether nodes should be printed out or not (ex: print only leaf nodes). So out of [53, 46, 42, 55, 74, 65, 63, 62, 64, 70] we would print only leaf nodes [42, 55, 62, 64, 70]. We're left with right branch only, where we have to print all the non-leaf nodes. For this we could have a separate stack where, while traversing we add the non-leaf nodes. In code this may look like this:
Fig. 1 - Simple Binary Search TreeA pre-order traversal is more appropriate here because it traverses branches in the way we need; for example, it will traverse [60, 41, 16, 25] first which is what we need and we print them. The next sequences are of no interest to us, so we could add a flag that determines whether nodes should be printed out or not (ex: print only leaf nodes). So out of [53, 46, 42, 55, 74, 65, 63, 62, 64, 70] we would print only leaf nodes [42, 55, 62, 64, 70]. We're left with right branch only, where we have to print all the non-leaf nodes. For this we could have a separate stack where, while traversing we add the non-leaf nodes. In code this may look like this:
Matching An Input String With A Given Dictionary
Questions about string matching are pretty often in interviews and this is one of them. I've seen two variations online: one to split the string in meaningful words and another one to determine all possible meaningful words that can be produced out of sub-strings of the input.First one seems easy - you just go once through the string and output what you've matched, thus O(n) where n is the number of the input string's characters. Second one would involve trying all possible sub-words, which would be O(nk) where k is the length of the longest word.Now another problem is how would you represent a dictionary. A hash map won't work here because there will be too many collisions and the hash map would loose its efficiency. The first solution here would be a trie and that's what I used in both mentioned problems. An example of an implementation of a trie can be found here. Now can it be done faster? Yes, we could use the Aho-Corasick string matching algorithm which uses a finite state machine that resembles to a trie. That will improve our time in the second problem from O(nk) to O(n), but don't forget that we have the pre-processing time. This may not be asked to be implemented in an interview, but it's good to mention.
Re-creating A Binary Search Tree Given Its Pre-order Traversal
As a reminder a pre-order traversal consists of visiting the root first then the left and right child. Thus, the first element in a pre-order traversal will always be the root of the tree. It may be easier to find a quick solution by looking at this post where I showed how to validate a tree. Our interest lays in the recursive solution. We could traverse the array and create the tree in the same manner it is traversed in the above mentioned solution, by maintaining a minimum and maximum value.Note: Using the external index here is because passing it as an integer argument will not work, since Java passes arguments by value. To avoid using the external index, we could have a mutable integer class or an array of ints, where the first cell is the index value: Implementing a mutable integer class is fairly easy but it is mostly re-inventing the wheel - Apache Commons has an implementation of this class that could be used here.
Fast Fibonacci
I was looking for a faster Fibonacci algorithm and stumbled across this one. The doubling method assumes we know F(n) and F(n+1) to produce F(2n) and F(2n+1) out of the following formulas:F(2n) = F(n) * (2*F(n+1) - F(n))F(2n+1) = F(n+1)2 + F(n)2The formulas can be extracted from the matrix exponentiation algorithm mentioned in the above article and reduced of redundant calculations.Note:- Integer.numberOfLeadingZeros is a Java shortcut to get to the highest set bit. It may be implemented in some other way in different languages. It is needed here to avoid a few iterations; in our case considering we have 32 iterations and we can get an n up to 92, we can save up to 24 iterations.
Verifying A Sudoku Solution
This one may be easy for Sudoku fans. I used to play it so it wasn't difficult to come up with the rules that would define the algorithm. For those who haven't played, Sudoku is a number game where you have a board (usually 9 by 9) divided in 9 sub-grids (blocks 3 by 3). Every cell of the board can contain a number from 1 to 9. The goal is to fill the board with numbers in such a way that numbers are unique column-wise, row-wise and block-wise. The board comes with some numbers filled in. For more information check the Wikipedia page. The above Sudoku game was of order 3, but we may want our problem to be a little less restrictive and accept higher orders too. Thus if it would accept a Sudoku board of order n, the board would have n2 columns and n2 rows; the blocks would be n by n, and we would have to fill in numbers from 1 to n2.Our algorithm will have at least O(n2) time complexity since we have to check each element on the board. It sounds like we could one table traversal and check each element's validity. But how would you check one element when it depends on other elements that may be on the same row/column/block? We could use a Set array of size n2 for columns, another array of same size for rows and a bi-dimensional array of size n for blocks. Thus, while traversing the board we note each element in the associated set or if it is already there then we stop the algorithm since we have an invalid solution.In code looks like this:It seems like we could improve the code a little by using some bit patterns instead of sets. We could set a bit for each existing number. For example if we already have 3 and 1 our bit pattern should look like this: 101. We would need some simple bit manipulation logic to check and set a bit. I used longs so that I can allow Sudoku boards of orders up to 8:Helper initialize method and test data for convenience:
Array Rotation By A Particular Amount
John Bentley's "Programming Pearls" again, with the rotated array problem. So we have a one-dimensional array of N elements that we have to rotate left by I positions. We have to do this in time proportional to N and in O(1) space. So for example if we have the array [1 2 3 4 5 6 7 8 9] and we want to rotate it by 3 positions we should result in this array: [4 5 6 7 8 9 1 2 3]. Note that the array is not necessarily sorted.There's plenty of solutions that may come in our minds at first but they probably require additional temporary space. For example we could copy the I elements into a temporary vector, then move to the left all the rest in the original array by I positions and finally copy the elements from the temporary array to the end of the original array. As you see this is too expensive space wise.To solve this problem we will require a different view of the problem and some time until you hit that aha! moment. We basically need to transform AB to BA, where A and B are sub-arrays of the original array. Imagine we have a routine that can reverse the elements in a specific portion of the original array (A, B). So we reverse elements in A first resulting in ARB and then reverse the elements in B resulting in ARBR. And finally we need to reverse ARBR to produce (ARBR)R which is in fact the array that we were looking for BA.This may be easier to understand with an example. So we have our array arr = [1, 2, 3, 4, 5, 6, 7, 8, 9] that we need to rotate left by 3. We need to first rotate the sub-array [1, 2, 3] resulting in [3, 2, 1, 4, 5, 6, 7, 8, 9] and then we rotate [4, 5, 6, 7, 8, 9] resulting in [3, 2, 1, 9, 8, 7, 6, 5, 4]. And finally we reverse the whole array resulting in [4, 5, 6, 7, 8, 9, 1, 2, 3] which is the array we were looking for.In code it looks even easier to understand, considering that we already have the routine that does the reversal of the elements in an array:The routine for reversing is just a for loop that is of O(n/2) complexity, where n is the number of elements it has to reverse. The routine may look like this:Note:- the swap routine is trivial and may look like this:
The Subset Problem
There are many variations of this problem, I will stay on the general problem of finding all subsets of a set. For example if our set is [1, 2, 3] - we would have 8 (2 to the power of 3) subsets: {[], [1], [2], [3], [1, 2], [1, 3], [1, 2, 3], [2, 3]}. So basically our algorithm can't be faster than O(2^n) since we need to go through all possible combinations.There's a few ways of doing this. I'll mention two ways here - the recursive way, that we've been taught in high schools; and using a bit string.Using a bit string involves some bit manipulation but the final code can be found easy to understand. The idea is that all the numbers from 0 to 2^n are represented by unique bit strings of n bit width that can be translated into a subset. So for example in the above mentioned array we would have 8 numbers from 0 to 7 inclusive that would have a bit representation that is translated using the bit index as the index of the array.NrBitsCombination0000{}1001{1}2010{2}3011{1, 2}4100{3}5101{1, 3}6110{2, 3}7111{1, 2, 3}It may look easier in the code:Note:- The Math.pow call computes 2 to the power of n- The findLowestBitSet is a method I mentioned earlier. In Java it could look like this:Or it could be optimized to look like this:For comparison find below the code for the recursive algorithm:
Generalized Tower of Hanoi - K Pegs, N Disks
I recently found online that Facebook is giving a fast track job opportunity for those who would solve a specific problem which may be a little more complex than a simple interview question. The process happens online - you login with your Facebook id and have a specific amount of time to write a solution in the language you desire. Here's the url in case you want to try it. They also provide a sample problem and an allotted 45 minutes to solve it. The problem is the well known Tower of Hanoi puzzle but generalized to k pegs and n disks. Given an initial and final configuration of the pegs you have to output the minimal number of steps needed to reach the final configuration. To make it easier they bound the k and n to 1 <= n <= 8, 3 <= k <= 5 and let you assume that there's a solution that takes at least 7 steps.One basic approach here is brute forcing all possible configurations which results in an undirected graph. And how do you find something in a graph ? Using a search strategy - in my case I chose Breath-First Search.Before going into the details of this approach let's see what other approaches we can find. Brute forcing will generate in the worst case 5^8 (390625) different configurations and then we'll have to search through them. Is there any other way where we wouldn't do a brute force, but go through the minimal path the first time? It seems like we could use different strategies of navigating through the pegs of a Hanoi Tower. There's basically 3 configuration cases that will take different approaches in navigation. Those are n < k, n=k and n > k. The point is there's a specific move pattern for each of these cases, that you could do to achieve the desired configuration. This article goes into much detail on that. There's also the way of multi-threading - we could parallelize this in some way - for example we could start from both ends.I chose the brute force this time. While coding this approach a few problems may arise; mainly the problem of how would you represent the pegs. I chose to use a byte array, where each cell represents a peg and each bit of a peg represents a disk (if set).Code looks like this: Note:The findLowestBitSet method basically returns the number of trailing zeros. In Java there's a util method for that: The method could be optimized to look like this:
Bitmap Sort
I was looking through John Bentley's "Programming Pearls" and I found a problem that I encountered before and decided to make note of it here. So you have a large amount of numbers that you're supposed to sort, but there's not enough memory to do that in. Obvious answer is to look for methods of sorting on disk aka external sorting. There's a few methods known that include taking multiple passes through the input or parallelism. But this article is not about any of those - it should be faster than any of those. Consider that you have a set of 48000 unique numbers in the range of 1..64000. Does this make a difference? Also consider that you have 10K available memory, so you can't load all the numbers in memory. The idea here is that we can represent the input as a bit sequence (bit map) - the only thing left is to think of a data type. We could use a number of integers, and map our input to the integers bits. In Java an integer takes 4 bytes or 32 bits, this means we could map 32 input numbers to each integer. So, for example an input number 16 will map to bit 16 of integer 1, where an input number 37 will map to bit 5 of integer 2. To represent all possible input numbers we'll need 64000 / 32 integers - 2000 which will take 2000 * 4 = 8K, which fits into our limitations. Second step would be to output the sorted array. Since mapping the integers to a specific bit, makes the sequence sorted, we just need to traverse the bits and output every bit that has been turned on. In the end we get a linear time algorithm that takes O(n) or precisely O(64000) and it looks like this: Note:- The IO operations are there just to simulate input and output, instead of loading array into memory. Those are trivial, and implementation depends upon language.- BUCKET_SIZE = 32 in our caseI will go through the code a little to make sure it is clear. So first step is the createBitMap method which creates a bitmap out of the input. On line 11 we decide which bit to set, whereas in line 12 we decide which bucket (integer) this bit should go to. It also does get set on line 12, through an OR operation.Second step is going through the created bitmap and showing the actual integer values. On line 24 we go through each bit of an integer and check if it's set or not (line 25). If set we convert to the actual integer values on line 27.
Unsorted Array Converted To "E1 < E2 > E3 < E4..." Format
I had a question once where given an unsorted array of integers, you have to produce an array in the "E1 < E2 > E3 < E4..." format. For example {3, 2, 6, 4, 9, 7, 1, 10, 8, 5} would produce {2, 6, 3, 9, 4, 7, 1, 10, 5, 8} and this is because 2 < 6 > 3 < 9 > 4 < 7 > 1 < 10 > 5 < 8. This is a tricky question where if asked by an interviewer you have to ask follow-up questions! I'm saying this out of my experience, my first solutions that I had in mind when I first had this question asked were involving sorting or priority queues, thus at least O(n log n) running time and involving additional space. But when I asked about the desired running time I was told that it should be faster, so I revisited my thoughts. It is in fact a very simple problem where you have to traverse the array just once resulting in O(n) running time. Note that you also have to ask about how to handle duplicates, in this case I considered an array of unique numbers.The idea is that while traversing the array, you maintain a flag that indicates whether a "less than" or a "greater than" comparison should succeed. If it does not succeed you swap the previous index with the current index. This will be easier to understand through an example. Let's say we have the above array {3, 2, 6, 4, 9, 7, 1, 10, 8, 5}. We traverse the array starting at the second element (2), since we'll be comparing the current with the previous element. We also have to maintain the above mentioned flag, say a flag named lessThan which will be defaulted to true since our format definition indicates that we have to start from a less than sign. So we have our element "2" and previous element "3". Is element "3" less than element "2" ? No, then we swap the elements which results in the array {2, 3, 6, 4, 9, 7, 1, 10, 8, 5}. We also have to switch the lessThan flag and then go to the next index. Now the flag tells us that we have to use the "greater than" sign. Is element "3" greater than element "6" ? No, then swap the numbers. But wait, what happens to the previous comparison ? The second index has to stay greater than the first one, but note that this did get verified by the condition that we just imposed by comparing "6" with "3". "6" is greater than "3", which is greater than "2" - this means "6" is greater than "2", thus it's safe swapping "6" with "3". This results in {2, 6, 3, 4, 9, 7, 1, 10, 8, 5}. This goes on until the end of the array. Here's a table that may add more sense to how the algorithm works:IterationArrayLessThan flagSwapped02, 3, 6, 4, 9, 7, 1, 10, 8, 5truetrue12, 6, 3, 4, 9, 7, 1, 10, 8, 5falsetrue22, 6, 3, 4, 9, 7, 1, 10, 8, 5truefalse32, 6, 3, 9, 4, 7, 1, 10, 8, 5falsetrue42, 6, 3, 9, 4, 7, 1, 10, 8, 5truefalse52, 6, 3, 9, 4, 7, 1, 10, 8, 5falsefalse62, 6, 3, 9, 4, 7, 1, 10, 8, 5truefalse62, 6, 3, 9, 4, 7, 1, 10, 8, 5falsefalse72, 6, 3, 9, 4, 7, 1, 10, 5, 8truetrueThe code is pretty simple. See it below: Note:- The swap function is trivial and it may look like in the code below.
Validating a Binary Search Tree
This is one of those problems that deceives you into thinking that you got it and if you're not careful enough you'll make a mistake. The problem is that when deciding if a tree is a binary search tree or not, you not only have to make sure that the left child is lesser than the root and right child larger; you also have to make sure that the left child is lesser than all its ancestors and right child is larger than all its ancestors. To do this we'll at least need to go through all the nodes. Thus we need a traversal algorithm either recursive or iterative. Both the recursive and the iterative will take up to O(n) space (the call stack for recursive and a stack data structure for iterative).I'll start with the iterative algorithm, and I'll take as an example the tree below.
Fig. 1 - Simple Binary Search TreeThe idea of the iterative one is that a binary search tree traversed in-order will always output a ordered sequence. So the above tree traversed in-order will look like this: 16, 25, 41, 42, 46, 53, 55, 60, 62, 63, 64, 65, 70, 74. And, as a reminder, this happens because an in-order traversal visits the left branch first, then the root, and only in the end the right child - which, according to the binary search tree property where the left branch is less than the root, and the right one is larger than the root; leads us into a sorted sequence. Having this said, it would be enough for us to maintain the last visited node's value and keep traversing while the current node's value is larger than the node that we traversed last. If not - this is not a sorted sequence, thus definitely not a binary search tree. The recursive version may be slightly difficult to understand, but it may be clearer in a way of showing how the ancestors are checked against the binary search property. This means that the children of a specific node are always limited by an upper and lower limit, so that a binary search tree can happen. For example, the children on the left branch of node "65" in the picture above have to be less than 65 and greater than 60 in order to conform to the binary search property. But what are the children of the left/right branches of root node? Well, left children of the root node, have to be less than 60 and greater than -infinity, and children on the right, have to be greater than 60 and less than infinity. I still prefer the iterative version over the recursive one - it's clean enough and besides in a production environment it may be a better idea to use the iterative version. But anyways here's the code below.
Fig. 1 - Simple Binary Search TreeThe idea of the iterative one is that a binary search tree traversed in-order will always output a ordered sequence. So the above tree traversed in-order will look like this: 16, 25, 41, 42, 46, 53, 55, 60, 62, 63, 64, 65, 70, 74. And, as a reminder, this happens because an in-order traversal visits the left branch first, then the root, and only in the end the right child - which, according to the binary search tree property where the left branch is less than the root, and the right one is larger than the root; leads us into a sorted sequence. Having this said, it would be enough for us to maintain the last visited node's value and keep traversing while the current node's value is larger than the node that we traversed last. If not - this is not a sorted sequence, thus definitely not a binary search tree. The recursive version may be slightly difficult to understand, but it may be clearer in a way of showing how the ancestors are checked against the binary search property. This means that the children of a specific node are always limited by an upper and lower limit, so that a binary search tree can happen. For example, the children on the left branch of node "65" in the picture above have to be less than 65 and greater than 60 in order to conform to the binary search property. But what are the children of the left/right branches of root node? Well, left children of the root node, have to be less than 60 and greater than -infinity, and children on the right, have to be greater than 60 and less than infinity. I still prefer the iterative version over the recursive one - it's clean enough and besides in a production environment it may be a better idea to use the iterative version. But anyways here's the code below.
Coin
Flipping
and
Die
Rolls
Given a function that outputs a coin toss (a random number from 1 to 2) describe an algorithm that outputs a die roll (a random number from 1 to 6). Each outcome should be equally likely.One solution is tossing the coin 3 times and using the outcome as bits of a three-bit number. So our outcomes would be 000, 001, 010, 011, 100, 101, 110, 111 (0 to 7). If our outcome is 110 or 111 we would have to discard it and toss again. I found this solution widely accepted on the web, but it seems that discarding 3 tosses is a waste. It seems like we could reuse the lastly tossed number (in the case of 110 and 111 - 0 and 1), and we would need only 2 additional tosses instead of 3. This would change the algorithm to something like this: The toss function could reuse the random number method mentioned earlier.
Minimum Stack
Another common interview question is implementing a stack like data structure that besides supporting the traditional push() and pop() operations would provide a O(1) operation able to find the minimum element in the stack. The question is easy but I still decided to write it down because it seemed elegant to me.So the solution consists in maintaining 2 stacks: one for performing the traditional operations as before, and another one that will always have the minimum on top for fast O(1) access. Imagine you have a set of numbers - 5, 2, 8, 5, 7, 13, 3, 11, 1, 6 and you start pushing them on the stack. Every time you push one number on the stack you compare it with the number on top of the other stack, and if it's smaller, you push it on the second stack too. This way after pushing all the numbers on both stacks you should have the second stack looking like this: 5, 2, 1 and the first stack looking as mentioned before like this 5, 2, 8, 5, 7, 13, 3, 11, 1, 6.We also need to maintain the minimum when pop-ing elements from our new data structure. We pop from the first stack unconditionally, but from the second stack we check first if the element we just popped from the first stack is actually the element on top of the second stack; if so - we pop it from the second too.This is how it would look if we pop all elements from the stack:IterationStack 1Stack 2Current minimum05, 2, 8, 5, 7, 13, 3, 11, 1, 65, 2, 1115, 2, 8, 5, 7, 13, 3, 11, 15, 2, 1125, 2, 8, 5, 7, 13, 3, 115, 2235, 2, 8, 5, 7, 13, 35, 2285, 25, 22955510In code it looks very simple and short: Note:- It is not necessary to use 2 stacks here. Another way was making MinimumStack a child of Stack. I just like symmetry and encapsulation. Besides it seems easier to show the idea this way.
Shortest Path in a Binary Search Tree
The shortest path in a binary search tree is not much more different than the shortest path in a binary tree that I handled here. The difference consists in reusing the binary search tree property where the left branch nodes are less than the root, and the right branch nodes are larger than the root. As in the binary tree shortest path, the procedure consists of finding the lowest common ancestor which I described here and then "drawing" the path from the ancestor to the nodes. Let's use the below tree as an example.
Fig. 1 - Simple Binary Search TreeLet's say we want to find the smallest path from "25" to "46". It is obvious that it has to go through the lowest common ancestor that we identify as "41". Now we have 2 basic cases: either one of the nodes we're looking for is the LCA and the other is in either of the branches; or each of the nodes is in a separate branch. Now if we don't have the first case, we can look for the minimum of the two nodes in the left branch and for the maximum in the right branch. And again, this is because of the binary search tree property - that the lower element goes on the left and higher on the right. So we go down on the left binary searching for our node "25" and same with "46". I say binary searching because at each step we compare the current node with the node we're looking for and decide whether to go to the right or to the left. Managing the path is tricky, since the path for the second element will come inverted with recursion. In my implementation I used a stack to manage that as I did when looking for the shortest path in a simple binary tree: Note:- First stack is not really needed, a simple list would do. I just like symmetry.- For the lowestCommonAncestor method see here.
Fig. 1 - Simple Binary Search TreeLet's say we want to find the smallest path from "25" to "46". It is obvious that it has to go through the lowest common ancestor that we identify as "41". Now we have 2 basic cases: either one of the nodes we're looking for is the LCA and the other is in either of the branches; or each of the nodes is in a separate branch. Now if we don't have the first case, we can look for the minimum of the two nodes in the left branch and for the maximum in the right branch. And again, this is because of the binary search tree property - that the lower element goes on the left and higher on the right. So we go down on the left binary searching for our node "25" and same with "46". I say binary searching because at each step we compare the current node with the node we're looking for and decide whether to go to the right or to the left. Managing the path is tricky, since the path for the second element will come inverted with recursion. In my implementation I used a stack to manage that as I did when looking for the shortest path in a simple binary tree: Note:- First stack is not really needed, a simple list would do. I just like symmetry.- For the lowestCommonAncestor method see here.
Shortest Path in a Binary Tree
This one may be from the tougher category, but it becomes easy when you understand the idea. So the question is "what is the shortest route between 2 unique nodes in a binary tree". I'll handle the same problem but in a binary search tree in a separate article. I knew the answer before, since I reviewed the lowest common ancestor in a binary tree first. Yes, the idea is to find first the lowest common ancestor of the two nodes and then "draw the path" from the LCA to the 2 nodes. I explained finding the LCA in a binary tree in this article, so I'll go straight to the second part: identifying the path. Let's use the below tree as an example.
Fig. 1 - A Simple Binary TreeLet's say we want to find the shortest path between nodes "2" (last one) and "11". Having identified the LCA as "7", we can now assume that the nodes are either located each in its separate branch or one node is the LCA and the second is either in the left or right branch. Second option is not the case, so we traverse the left and right branch (recursion possible) until we find any of the two nodes by comparing the branch node with the nodes. First one is easy we go onto the left branch and we find the "2" node right away. Then we go onto the right branch - "6", not our node, recurse on the left and right branch of "6". "5" is not our node, "11" is, so we add it to our path. The final path is "2, 7, 6, 11". How would we manage to maintain the path in code though? If we use recursion we'll get a path like "2, 7, 11, 6", so we need some little adjustments. You may notice that the first path is ok up to the root, then we have an inverted path. How do we handle this? There's probably multiple ways, I chose having a stack to invert the list as you can see in the code below. Note:- First stack is not really needed, a simple list would do - I just like symmetry.- For the lowestCommonAncestor method see here.
Fig. 1 - A Simple Binary TreeLet's say we want to find the shortest path between nodes "2" (last one) and "11". Having identified the LCA as "7", we can now assume that the nodes are either located each in its separate branch or one node is the LCA and the second is either in the left or right branch. Second option is not the case, so we traverse the left and right branch (recursion possible) until we find any of the two nodes by comparing the branch node with the nodes. First one is easy we go onto the left branch and we find the "2" node right away. Then we go onto the right branch - "6", not our node, recurse on the left and right branch of "6". "5" is not our node, "11" is, so we add it to our path. The final path is "2, 7, 6, 11". How would we manage to maintain the path in code though? If we use recursion we'll get a path like "2, 7, 11, 6", so we need some little adjustments. You may notice that the first path is ok up to the root, then we have an inverted path. How do we handle this? There's probably multiple ways, I chose having a stack to invert the list as you can see in the code below. Note:- First stack is not really needed, a simple list would do - I just like symmetry.- For the lowestCommonAncestor method see here.
Deck Shuffling
How would you shuffle a deck of cards or generally an array of distinct integers? The shuffle should happen in such a way that every possible reordering is equally likely. Of course it sounds like we should use a random number, but how to ensure the equal probability of each reordering?There's probably multiple solutions, but the one I liked is a solution that takes O(n) time and no additional space. It is the Fisher-Yates shuffle. What we have to do is traverse the array once and swap each element with a random element that doesn't appear before the current index.The code is very simple and looks like this: Note:- The random method is a method that generates a random number within the specified range. This is trivial and could look like this:- The swap method is a method that swaps 2 elements in an array with the specified index, the implementation is also trivial. It may look like this:
Order Statistics. Kth Smallest or Largest Element in an Un-ordered Array
Finding the largest or smallest element in an un-ordered array in O(n) is a trivial and obvious task, but what about finding the third largest or sixth or abstractly speaking kth? In statistics this is called Order Statistic. One way of solving this is using the QuickSort partitioning technique. To remind you how this works: you choose a pivot element, and then move all elements less than the pivot element to the left and all elements that are larger than the pivot element to the right. After this operation the pivot element is on the right spot and if the k you're looking for equals to the pivot's index, you don't have to sort anymore and just return the pivot.As an example, let's say I have the following array:[42, 92, 31, 73, 46, 11, 29, 53, 64, 4]I chose the first element as the pivot (or we could use a random element) - 42.Now we iterate from the start and end of the array and swap the elements in-place as they don't conform to our property (less than the pivot on the left and larger on the right). Is 92 less than 42 ? No, whom do we switch it with ? Let's look at the elements at the end of the array, since in the end we want to have an ascending order sorted array (if we need to). Is 4 larger than 42 ? No! Swap this with 92. Is 31 less than 42? Good. 73? No, then swap, and so on.After the first operation the array should look like this:[11, 4, 31, 29, 42, 46, 73, 53, 64, 92]Now, 42 is in the right spot and it's the 5th number (index 4). Let's say we're looking for the 2nd smallest (index 1) so 5 is no good to us. What we do now, is doing the same thing just over the smaller array that we have: [11, 4, 31, 29]Again, chose a pivot - 11, and move all lesser values to the left and larger to the right:[4, 11, 31, 29]11 is the element with index number 1 thus it is the second smallest element so we can stop the algorithm here.Computing the kth largest element is the same task, with the only difference that we have to translate the k value, since the largest elements are located at the end of the array. So we'll have to translate k using this formula: k = array.length - k + 1.Here's the algorithm in code:
Lowest Common Ancestor in a Binary Search Tree
For the LCA in a simple binary tree go here.Finding the lowest common ancestor (LCA) in a binary search tree may be easy, but coming up with an elegant and quick solution may take some thought.A binary search tree is a ordered binary tree where the nodes on the left are always less than the root node and the nodes on the right are always larger than the root node.Let's use the tree below as an example.
Fig.1 - Simple Binary Search TreeNow keeping in mind that property that makes the binary search tree different that a binary tree we can come up with something that is faster than finding the LCA for a simple binary tree, precisely we'll make it O(h) - where h is the height of the tree. This means that in the tree above it will take maximum 5 iterations to find the LCA.Let's say we're looking for the nodes 62 and 70.We take the maximum of these two and compare it to the root, if the maximum is less than the root then our values are in the left branch, and we can forget about the right branch at all. Again, we can do this because of the binary search tree property that all the nodes on the left of the root are less than the root. Since the maximum of our two numbers is not less than the root we can move to the second case.Now, we take the minimum of those two and compare it with the root. If the minimum is greater than the root, then we go on with the right branch. You may ask: why minimum (or maximum) ? The answer is that the nodes we're looking for might be each other's parent. So if our comparison would not be greater than the root, we would go to our third case which i'll take note later. Since this is the case the minimum of these two is 62 and it is greater the the root - we go further to 74. We keep doing our previous conditions, until both of them don't succeed:- max is less than 74, we go on the left to 65- max is not less than 65, then is min greater than 65 - not even that!Here we go to our third case: if both previous cases failed, this could not mean anything but that the root is the LCA of nodes we're looking for, which is 65.Again in code this is nothing but a binary search: Note:- Math.max and Math.min = determines the maximum and minimum of two numbers- In a balanced binary search tree h = log n, so then we have O(log n)For the node representation we'll use the following class:
Fig.1 - Simple Binary Search TreeNow keeping in mind that property that makes the binary search tree different that a binary tree we can come up with something that is faster than finding the LCA for a simple binary tree, precisely we'll make it O(h) - where h is the height of the tree. This means that in the tree above it will take maximum 5 iterations to find the LCA.Let's say we're looking for the nodes 62 and 70.We take the maximum of these two and compare it to the root, if the maximum is less than the root then our values are in the left branch, and we can forget about the right branch at all. Again, we can do this because of the binary search tree property that all the nodes on the left of the root are less than the root. Since the maximum of our two numbers is not less than the root we can move to the second case.Now, we take the minimum of those two and compare it with the root. If the minimum is greater than the root, then we go on with the right branch. You may ask: why minimum (or maximum) ? The answer is that the nodes we're looking for might be each other's parent. So if our comparison would not be greater than the root, we would go to our third case which i'll take note later. Since this is the case the minimum of these two is 62 and it is greater the the root - we go further to 74. We keep doing our previous conditions, until both of them don't succeed:- max is less than 74, we go on the left to 65- max is not less than 65, then is min greater than 65 - not even that!Here we go to our third case: if both previous cases failed, this could not mean anything but that the root is the LCA of nodes we're looking for, which is 65.Again in code this is nothing but a binary search: Note:- Math.max and Math.min = determines the maximum and minimum of two numbers- In a balanced binary search tree h = log n, so then we have O(log n)For the node representation we'll use the following class:
Lowest Common Ancestor in a Binary Tree
Another popular interview problem and the start of my tree review is finding the LCA in a binary tree. Now, pay attention - not a binary search tree, that is a different case and will be handled in a separate article; now will use a simple binary tree. For those who don't remember a binary search tree is a sorted binary tree where the left child is always smaller than the root and the right child is always larger. A simple binary tree, as in our case, doesn't have that property.Let's use the tree below as an example.
Fig.1 - Simple Binary TreeNow let's say we want to find the LCA for nodes 4 and 9, we will need to traverse the whole tree to compare each node so that we can locate the nodes. Now considering that we start the traversal with the left branch (pre-order traversal) - we have the worst case here with O(n) running time.Traversing the tree we compare the current node with both of the nodes and if one of them match, it means that one is the LCA on the respective branch. Let's say after traversing the above tree in pre-order the first node that matches our nodes is 9 (2, 7, 2, 6, 5, 11, 5, 9). So the first obvious thought is that the 4 must be a child of 9, since we're already on the right child of node 5 and the pre-order traversal looks at the node first, then the left child and lastly the right child. Then we note node 9 as the LCA and we don't have to look further anymore.Let's use another case, say we're looking for the LCA of 7 and 9. The first node in our pre-order traversal (2, 7, 2, 6, 5, 11, 5, 9, 4) is 7. Now here we can say that the LCA for the left branch is 7 because again, if the second node is in the same branch, independently of where and how deep it will be in this branch, the LCA will still be 7; thus we don't have to look in this branch anymore. But we still did not look at the right branch, so we keep traversing in a pre-order manner, but now omitting the other nodes: 2, 7, 5, 9. Now we can say that the LCA for that branch is 9. We can also affirm that the LCA for the branch with the root in node 5 is also 9. And in the end we have our nodes both in separate branches, which means that the LCA is the root of those branches - node 2.The algorithm looks as a modified version of a pre-order tree traversal :For the node representation we'll use the following class:
Fig.1 - Simple Binary TreeNow let's say we want to find the LCA for nodes 4 and 9, we will need to traverse the whole tree to compare each node so that we can locate the nodes. Now considering that we start the traversal with the left branch (pre-order traversal) - we have the worst case here with O(n) running time.Traversing the tree we compare the current node with both of the nodes and if one of them match, it means that one is the LCA on the respective branch. Let's say after traversing the above tree in pre-order the first node that matches our nodes is 9 (2, 7, 2, 6, 5, 11, 5, 9). So the first obvious thought is that the 4 must be a child of 9, since we're already on the right child of node 5 and the pre-order traversal looks at the node first, then the left child and lastly the right child. Then we note node 9 as the LCA and we don't have to look further anymore.Let's use another case, say we're looking for the LCA of 7 and 9. The first node in our pre-order traversal (2, 7, 2, 6, 5, 11, 5, 9, 4) is 7. Now here we can say that the LCA for the left branch is 7 because again, if the second node is in the same branch, independently of where and how deep it will be in this branch, the LCA will still be 7; thus we don't have to look in this branch anymore. But we still did not look at the right branch, so we keep traversing in a pre-order manner, but now omitting the other nodes: 2, 7, 5, 9. Now we can say that the LCA for that branch is 9. We can also affirm that the LCA for the branch with the root in node 5 is also 9. And in the end we have our nodes both in separate branches, which means that the LCA is the root of those branches - node 2.The algorithm looks as a modified version of a pre-order tree traversal :For the node representation we'll use the following class:
Finding If N Is Palindrome
Today I've decided that I want to review my basic computer science knowledge. And since I have a feeling that I may want to review this in the future again - I thought of making out of it a blogging experience. This way I kill two birds with one stone: blogging and review. Also even though there are lots of solutions and code sites on the web, this still may be of help to other colleagues of mine that plan to review.The way I'm going to do the review is by solving popular programming problems which may be also a good preparation for a job interview. For my review I'm going to be using the Java programming language, but it should not matter for experienced programmers, since I'm going to use basic language structures and no external libraries.Today I'll start with a classic popular problem: the palindromeFor those who do not know what a palindrome is - it is a sequence of characters that can be interpreted in the same way independently of what direction is used to read them. For example "abba" is a palindrome because you can interpret this sequence in the same way independently of the direction you read the sequence.One known problem is determining if a number is a palindrome. It is usually limited to no extra memory. Let's say we have two numbers: 1781 and 42124. The latter one is the palindrome, and the way we do it programmatically would be by comparing the first and last digit and moving on to the second and n-1 digit and so on. Now the main problem is determining the first and last digit which consists of numerical operations. Last digit is easy - just get the reminder of the division by 10 (modulus) but first digit requires a little bit of thought. So in our number 1781, for the first digit we basically need to divide it by 1000, and we get 1 (781 reminder). How do you find this 1000 number ? - just go through the digits and multiply 1 with 10 n-1 times. (Line 7-9). Now as we have the first and last number we can compare them and fail right away if they're not equal. Truncating the number from the last and first digit is fairly easy : 1781 % 1000 reminder is 781, and then just divide by 10, and we have the 78 number. (Line 17). Lastly decrease div by 2 digits (Line 18). All in all you get an O(n) solution.
What Dreams May Come
“To accomplish great things, we must not only act, but also dream; not only plan, but also believe.”Anatole FranceDream and Believe – these two words have been around me for the last five years. They have helped me reach goals, which years before I had never thought they could have been even possible for me to achieve. They have helped me transform “Setting and reaching goals” into a challenge, which finally made me understand my way of being and employed me in a continuously growing environment.And the best part about this is when you accomplish that goal, which seemed to be impossible to reach and when you get that feeling which makes you say “I can do everything” or “Nothing is impossible” and challenge yourself to new and more difficult to achieve goals, which might seem impossible at all for other people, but not for you. Cause you… you can do everything.
First, it was the dream.I love New York, I’m loving it since I was 14 years old. I was collecting images and movie clips about New York; I had a lot of souvenirs. I really loved it, and I never realized I could ever visit it. Victor Hugo said “There is nothing like a dream to create the future”; I had a dream, you might think it’s a small one; for me it was the most wanted and the most beautiful dream I ever had till that day. I didn’t realize that my dream is happening even when I was landing in J.F.Kennedy International Airport, in New York. I only realized it, when I was in that yellow Cab, heading to New York City, going up a hill. We were on the top of the hill, and I was talking to my friends, when all of the sudden everyone got quiet. I looked out of the window, and I was stunned. At that moment all the pictures that I had collected, the clips I had seen, seemed to be put together like a jigsaw puzzle...and here it was: the unbelievable - New York City. What I saw printed itself in my memory forever. All that immensely tall skyscrapers that make you feel so small and enclosed; the neon lights, and all that gorgeous bridges... I could almost see the busy streets, people running to work or getting stuck in the traffic. It was breathtaking. There is so much history hidden behind those places. At times it felt like all I wanted to do was to get out of the car and cry out of happiness. I have to admit that it gave me chills and a sense of accomplishment that I had never felt before. At that moment I realized: dreams may become reality, if you truly believe in them and do anything in your power to make that happen.After that experience, I became confident that if there is something I can put my mind up to, I can achieve it. I have to admit that it’s not as easy as it sounds. We have to go through experiences of trials, disappointment and face mistakes, but the idea behind it is that these are the things that make us stronger. These are the facts that made me feel confident about what I want, and when things went wrong, all I had to do was to find a better way to get there.Dream – everything starts with a dream. Don’t be afraid to dream big. So many people struggle to free their imagination. Einstein said “Imagination is more important than knowledge.” Napoleon said “Imagination rules the world.” We have a tremendous power in our dreams and our imagination and our creativity. Everybody needs a dream for life. It is psychologically healthy. Start living your dream. Make it a challenge. I dare you to dream! Be who you were designed to be.What Dreams May Come.Last year, I had a dream. I enjoyed visiting the States, and afterwards I wanted to study there; I wanted something different for my education, something that would improve my skills, and something that would offer me a better future. I wanted to do a Master degree in the United States. I believed I have what it takes to get there. I knew I can make things possible; all I had to do is keep dreaming and doing my best to get there. That was my dream, and everything starts with a dream.
First, it was the dream.I love New York, I’m loving it since I was 14 years old. I was collecting images and movie clips about New York; I had a lot of souvenirs. I really loved it, and I never realized I could ever visit it. Victor Hugo said “There is nothing like a dream to create the future”; I had a dream, you might think it’s a small one; for me it was the most wanted and the most beautiful dream I ever had till that day. I didn’t realize that my dream is happening even when I was landing in J.F.Kennedy International Airport, in New York. I only realized it, when I was in that yellow Cab, heading to New York City, going up a hill. We were on the top of the hill, and I was talking to my friends, when all of the sudden everyone got quiet. I looked out of the window, and I was stunned. At that moment all the pictures that I had collected, the clips I had seen, seemed to be put together like a jigsaw puzzle...and here it was: the unbelievable - New York City. What I saw printed itself in my memory forever. All that immensely tall skyscrapers that make you feel so small and enclosed; the neon lights, and all that gorgeous bridges... I could almost see the busy streets, people running to work or getting stuck in the traffic. It was breathtaking. There is so much history hidden behind those places. At times it felt like all I wanted to do was to get out of the car and cry out of happiness. I have to admit that it gave me chills and a sense of accomplishment that I had never felt before. At that moment I realized: dreams may become reality, if you truly believe in them and do anything in your power to make that happen.After that experience, I became confident that if there is something I can put my mind up to, I can achieve it. I have to admit that it’s not as easy as it sounds. We have to go through experiences of trials, disappointment and face mistakes, but the idea behind it is that these are the things that make us stronger. These are the facts that made me feel confident about what I want, and when things went wrong, all I had to do was to find a better way to get there.Dream – everything starts with a dream. Don’t be afraid to dream big. So many people struggle to free their imagination. Einstein said “Imagination is more important than knowledge.” Napoleon said “Imagination rules the world.” We have a tremendous power in our dreams and our imagination and our creativity. Everybody needs a dream for life. It is psychologically healthy. Start living your dream. Make it a challenge. I dare you to dream! Be who you were designed to be.What Dreams May Come.Last year, I had a dream. I enjoyed visiting the States, and afterwards I wanted to study there; I wanted something different for my education, something that would improve my skills, and something that would offer me a better future. I wanted to do a Master degree in the United States. I believed I have what it takes to get there. I knew I can make things possible; all I had to do is keep dreaming and doing my best to get there. That was my dream, and everything starts with a dream.
Generat în 0.417 secunde.