• OK, it's on.
  • Please note that many, many Email Addresses used for spam, are not accepted at registration. Select a respectable Free email.
  • Done now. Domine miserere nobis.

A problem from oxford interview

Valentas

Well-Known Member
Local time
Today 12:54 AM
Joined
Jun 20, 2012
Messages
506
---
Okay, basically I've spot this problem, I believe that it may be interesting to @Architect

Missing numbers. Imagine you are given a list of slightly less than 1,000,000 numbers, all different, and each between 0 and 999,999 inclusive. How could you find (in a reasonable time) a number between 0 and 999,999 that is not on the list?

My notion is that there is only one number missing because it is explicitly stated in the end.

I think firstly it is important to cut the list in half so I'd make an array of numbers {"0", "1", "2", ... "499999"} and check its length. If it has all numbers than we proceed to the other half....maybe someone could show how to solve this problem for example in Java? :D
 

Architect

Professional INTP
Local time
Yesterday 5:54 PM
Joined
Dec 25, 2010
Messages
6,691
---
Sounds like a search or sort problem. Search for the missing number, or sort into two lists (one with all the numbers and the other with the missing numbers). Blind numerical search is worst case N (1,000,000) if the missing number is at the end. Hmmm, off the cuff most of the searches I'm thinking are N worst case. Maybe a trick is in order, vectorize it.

Create a hash signature of blocks of numbers, say in 100k blocks. Compare to the hash of the first 100k, second 100k, etc. First time you find a mismatch you know which block contains it.

EDIT: Carefully reading the problem statement I see that numbers can be arbitrary, the list could be [22, 22, 4, 47887, ...] . OK so it's a sort problem then. I'd sort the list first, remove dups, then do the hash algorithm I mentioned second. The hash algorithm protects against worst case the list is 1 to 999,999 (monotonic with the end digit missing).

Interesting.
 

EyeSeeCold

lust for life
Local time
Yesterday 4:54 PM
Joined
Aug 12, 2010
Messages
7,828
---
Location
California, USA
1) First, check the length of the list to make sure there is indeed a missing number(s).
2) Build a separate list of numbers, all of 0-999,999,999
3) Check/compare if the numbers in the new list are in the old, if not print the number(s).

Not sure of the coding, but I think it could be done.
 
Local time
Today 12:54 AM
Joined
Aug 29, 2012
Messages
705
---
1) First, check the length of the list to make sure there is indeed a missing number(s).
2) Build a separate list of numbers, all of 0-999,999,999
3) Check/compare if the numbers in the new list are in the old, if not print the number(s).

Not sure of the coding, but I think it could be done.

Jesus that is the most inefficient thing I've ever seen in my life. Why don't you just put a random for loop in there that calculates 30 digits of pi every time it checks an even number, 50 digits of e^1 when it checks an odd number.
 

EyeSeeCold

lust for life
Local time
Yesterday 4:54 PM
Joined
Aug 12, 2010
Messages
7,828
---
Location
California, USA
Jesus that is the most inefficient thing I've ever seen in my life. Why don't you just put a random for loop in there that calculates 30 digits of pi every time it checks an even number, 50 digits of e^1 when it checks an odd number.

Why are you telling me this? I'm not the one asking.

If you think you've got a genius solution, then direct it to Valentas/Oxford.
 
Local time
Today 12:54 AM
Joined
Aug 29, 2012
Messages
705
---
Why are you telling me this? I'm not the one asking.

If you think you've got a genius solution, then direct it to Valentas/Oxford.

I'm not saying I have a genius solution, I'm just saying your solution needs some improvement, because obviously you want it to run in the slowest time possible.
 

EyeSeeCold

lust for life
Local time
Yesterday 4:54 PM
Joined
Aug 12, 2010
Messages
7,828
---
Location
California, USA
I'm not saying I have a genius solution, I'm just saying your solution needs some improvement, because obviously you want it to run in the slowest time possible.

Yeah.. but I'm not the one asking so there's no point in telling me that.
 

Architect

Professional INTP
Local time
Yesterday 5:54 PM
Joined
Dec 25, 2010
Messages
6,691
---
Can't you just run the thing through two for loops?

@intpz yes but that is the brute force approach which finds the answer in N time. Vectorizing it amends the problem for physical computers, it can be run on a supercomputer, or your 8 thread Intel box.

The problem statement said

in a reasonable time

which implies they're looking for a optimized solution.

If taking the linear approach I'm not sure why you'd need two loops, in that case I'd just sort them, then do a single loop looking for a gap.
 

intpz

Banned
Local time
Today 12:54 AM
Joined
Jun 15, 2011
Messages
1,568
---
@intpz yes but that is the brute force approach which finds the answer in N time. Vectorizing it amends the problem for physical computers, it can be run on a supercomputer, or your 8 thread Intel box.

The problem statement said



which implies they're looking for a optimized solution.

If taking the linear approach I'm not sure why you'd need two loops, in that case I'd just sort them, then do a single loop looking for a gap.

Right, reasonable time. Then I'm out of ideas, your post fits the criteria significantly better. Brute force is useful in the class though, though teachers hate your ass for it.
 

Architect

Professional INTP
Local time
Yesterday 5:54 PM
Joined
Dec 25, 2010
Messages
6,691
---
Compiles and works by quick chec. Smooze some items - the hashSet needs to be initialized with the precomputed hash for 100k blocks. I also faked the byte comparisons, the result hash needs to be a combined byte from the block (details). But it perfectly illustrates the solution.

crud returns the block that contains the missing digit. The client can easily then do a check of that block by a foreach loop. The astute will notice this is a form of a QuickSort (divide and conquer). The even more astute will savor my use of Lambda functions.

Code:
        static byte[] hashSet = { 123, 234, 234, 234 }; // ... TODO: precompute

        static int crud(int[] data)
        {
            System.Security.Cryptography.MD5 md5 = System.Security.Cryptography.MD5Cng.Create();

            int[] sorted = (data.OrderBy((x) => x)).ToArray<int>();
            HashSet<int> unique = new HashSet<int>(sorted);
            byte[] bytearr = unique.Select(x => (byte)x).ToArray();

            int counter = 0;
    
            for (int i = 0; i < 1000000; i*=100000) {
                byte[] hash = md5.ComputeHash(bytearr, i, 100000);
                if (hash[0] == hashSet[counter]) return counter; // TODO: Fix comparison
                counter++;
            }
            return -1;
        }
 

Cogwulf

Is actually an INTJ
Local time
Today 12:54 AM
Joined
Aug 21, 2009
Messages
1,544
---
Location
England
I'm not approaching this from a programming perspective, but the simplest approach I can see is to add all the numbers together. If there is a single missing number, then it will be the difference between that and 999,999.
If there are multiple missing numbers, then doing this will have made a brute-force approach easier as you will now know the maximum possible value so you will not have to use the entire list.
 

Valentas

Well-Known Member
Local time
Today 12:54 AM
Joined
Jun 20, 2012
Messages
506
---
I imagine that this problem asks to find a NUMBER not numbers which are missing...Thus simple arithmetic progression(sum of the list) - (given list of numbers) would result in missing number...Ofcourse, the second summing would be impossible to do by human because we don't know the number missing. Thus we would be forced to look for that number manually anyway...:D

So the first sum is possible to do by human while second(with missing number) must be done by a computer...@Architect my Java knowledge is too small to comprehend everything you've written but I understood what you wanted to propose. You go through blocks of elements, your program finds the one which lacks one element and then looping will produce that number because incrementing by one will stop somewhere...I hope I understood correctly.
 

Architect

Professional INTP
Local time
Yesterday 5:54 PM
Joined
Dec 25, 2010
Messages
6,691
---
Folks, that's C# :)
 

Coolydudey

You could say that.
Local time
Today 2:54 AM
Joined
May 21, 2012
Messages
1,039
---
Location
Pensive-land.....
For those who proposed summing the numbers: you need to look up and sum each number separately, so you might as well do a brute force search.
@Architect, doesn't hashing need you to look through all the data to produce the hash? If this is the case, you might as well look through the data until the missing value is found.
 

Architect

Professional INTP
Local time
Yesterday 5:54 PM
Joined
Dec 25, 2010
Messages
6,691
---
doesn't hashing need you to look through all the data to produce the hash? If this is the case, you might as well look through the data until the missing value is found.

Very astute! I didn't bother saying because I figured nobody would notice that :rolleyes:

What you would do is create a hash based on a block of the numbers as binary data. In C# you could stream them in, or you could treat it as a block of vector memory or something. Along the lines of the UNIX command line md5 proggy, which can compute a MD5 on a large file in a second or two. So you're viewing the data as a large binary block rather than integers, which by the example I just gave can be quick.

Of course as with any optimization problem the answer is; don't. Or at least until after you do the brute force solution. Who knows how much faster, if any, my solution is. You know writing the loops and checks in assembly would probably be the fastest.

But these questions are to see how smart you are, not find the right answer. So I came up with an answer as quick as I could (a few minutes after posting), and got interested to see how it looked when written up. I think it is a faster solution, but regardless it should impress an interviewer with your acumen. :D
 
Top Bottom