2015年2月28日土曜日

Lesson 6: EquiLeader (Equi Leader)





As my solution utilizes the O(N) algorithm to find a leader described in the reading material that Codility provides, so first read it briefly.
Open reading material (PDF)


The definition of `equi leader' is given as below in the problem.


``An equi leader is an index S such that 0 ≤ S < N − 1 and two sequences A[0], A[1], ..., A[S] and A[S + 1], A[S + 2], ..., A[N − 1] have leaders of the same value.''

So let's think of this. 

Assume the value X is the leader of the whole array A and the total number of its occurrence is cntX. 

Now we have the value Y and the total number of its occurrence is cntY. The value Y can be any value other than X.

Since X is the leader, it can be said that cntX > N / 2. 

Then, it is always cntY < N / 2, since cntX + cntY <= N (because cntX > N / 2)

Since cntX > N / 2 and N / 2 > cntY, we can say cntX > cntY


The question is if this value Y can have an equi leader. The answer is NO, which means ``The leaders of the same value that makes an equip leader is always the leader of the whole array''.

Why? Assume that the value Y can have an equi leader at the index S.
This means, for A[0], A[1], ...., A[S], Y occurs more than X.

Suppose the number of occurrences of X and Y on the left side of the array divided by the index S are cntX_L, cntY_L respectively.

Then for A[n] (n <= S), 
it must be cntY_L > cntX_L  as Y is supposed to occur more than X to become the leader on the left side of the array divided at the index S.

Now let's think if Y can be also a leader of the right side of the array divided by the index S.

The total number of the occurrences of X and Y on the right side is cntX - cntX_L and cntY - cntY_L respectively.

so if the Y can also become the leader of the right side, it has to be cntX - cntX_L < cntY - cntY_L.

However, the assumption above leads to a contradiction.

cntX - cntX_L < cntY - cntY_L

This obviously contradicts since we assumed 

cntX   > cntY
cntX_L < cntY_L

This can only result as below.

cntX - cntX_L > cntY - cntY_L


This means that only the leader of the whole array can have an equi leader. 


Based on this condition, to solve this problem, we first find the leader of the whole array. If there is no leader, there won't be no equal leader. So just return 0.

After finding a leader (if any), we then scan the whole array again,  counting the occurrence of the leader at the current scanning position, comparing if the position can be an equi leader or not.
If we find a new equi leader, we increment the counter for the answer.

This strategy will result in the 100% score as below.










#include <alloca.h>

int solution(int A[], int N) {
    
    //first find the leader and count its occurances.
    //as all the values on the stack will be the same,
    //we only have to keep one value.
    int sp = 0;
    int candidate = 0;
    
    int i;
    for (i = 0; i < N; i++){
        if (sp == 0){
            sp++;
            candidate = A[i];
            continue;
        }
        
        if (candidate == A[i]){
            sp++;
        }
        else {
            sp--;
        }
    }
    
    //if there is no leader, there is no equi leader.
    if (sp == 0){
        return 0;
    }
    
    //now we check if the candidate value is really a leader
    int cnt = 0;
    for (i = 0; i < N; i++){
        if (A[i] == candidate){
            cnt++;
        }
    }
    
    //if there is no leader, there is no equi leader.
    if (cnt <= N / 2){
        return 0;
    }
    
    //now we have a leader.
    int leader = candidate;
  
    
    //let's count the number of equi leader.
    int lcnt = 0; //leader appeard until the index.
    int ans  = 0; //the number of equi leaders.
    for (i = 0; i < N; i++){
        if (A[i] == leader){
            lcnt++;
        }
        //check if the current index is a equi leader
        int lelems = i + 1;
        if ((lcnt > lelems / 2) && 
            ((cnt - lcnt) > (N - lelems) / 2)){
            ans++;
        }
    }
    
    return ans;
}

Lesson 5: StoneWall (Stone Wall)

Lesson 5: StoneWall
https://codility.com/programmers/lessons/5

Indeed, the Codility blog post explain the solution to this problem.
http://blog.codility.com/2012/06/sigma-2012-codility-programming.html


Here is the simplest solution that gives the 100% score.










#include <alloca.h>

int solution(int H[], int N) {
    
    int *stack = (int*)alloca(sizeof(int) * N);
    int sp = 0;

    int cnt = 0;
    
    
    int i;
    for (i = 0; i < N; i++){
        
        //if there is no stone on the left, we need a new stone.
        if (sp == 0){
            cnt++;
            stack[sp++] = H[i];
            //check the next height.
            continue;
        }
        
        //there is at least one stone on the left
        
        //the same hight, we don't need a new stone.
        if (stack[sp - 1] == H[i]){
            //check the next height.
            continue;
        }
        
        //the hight goes up, we need a new stone
        if (stack[sp -1] < H[i]){
            cnt++;
            stack[sp++] = H[i];
            //check the next height.
            continue;
        }
        
        //the hight goes down, try rewinding the stack
        while(1){
                    
            //the stone on the left is still heigher.
            //try rewind the stack.
            if (stack[sp - 1] > H[i]){
                sp--;
                
                //reached the bottom of the stack.
                if (sp == 0){
                    //we need a new stone.
                    cnt++;
                    stack[sp++] = H[i];
                    break;        
                }
                continue;
            }
            
            //there is a stone with the same height on the left.
            //this can continue grow to the right.
            if (stack[sp - 1] == H[i]){
                break;
            }
            
            //the stone on the left is lower than the new height.
            //there is need for a new stone.
            stack[sp++] = H[i];
            cnt++;
            break;
        }
        
    }
    
    return cnt;
}


Indeed, by combining the conditional branches, this solution can be
shorten as seen in the solution by the Codility blog. Unlike the above solution, the modified solution tries to rewind the stack first. (Though shortening the code doesn't always mean the comprehensive code...). The below code also gives the 100% score.









#include <alloca.h>

int solution(int H[], int N) {
    
    int *stack = (int*)alloca(sizeof(int) * N);
    int sp = 0;

    int cnt = 0;
    
    int i;
    for (i = 0; i < N; i++){
        
        //try rewind the stack first.
        while (sp > 0 && stack[sp - 1] > H[i]){
            sp--;
        }
        
        //when the code reaches here, it is always
        //sp == 0 and/or stack[sp - 1] < H[i].
        
        //if there is the stone with the same height, 
        //simply keep on using it!
        if (sp > 0 && stack[sp - 1] == H[i]){
            continue;
        }

        //if not, we need a new stone.
        stack[sp++] = H[i];
        cnt++;
    }
    
    return cnt;
}



2015年2月2日月曜日

Lesson 5: Fish (Fish)

Lesson 5: Fish
https://codility.com/programmers/lessons/5

The simplest solution for this problem is to use the stack.
We prepare a stack only for those fish going downstream.

The arrays of the fish size/direction are scanned from the beginning to the end. 

If the fish at the current scanning position is going downstream, we put it onto the stack. 

When the fish at the current position is going upstream, it starts eating up the fish going downstream from the stack. If it mets a bigger fish than itself, it dies then. If there is no more fish on the stack, it means it can reach the upper end of the river. This means it survives. We count up the surviver counter.

The total number of the survived fish is the survivors (heading upstream) and those on the stack (heading downstream).

This solution gives the 100% score as below.









#include <alloca.h>

int solution(int A[], int B[], int N) 
{
    int memsize = sizeof(int) * N;

    //stack (memory & index)
    int* down   = (int*)alloca(memsize);
    
    int d = 0;

    //the counter for the fish that reached to the end of the river.    
    int cnt_reached = 0;

    int i;
    for (i = 0; i < N; i++){        
        
        //if the fish is going upstream,
        //check if it can reach the upper end.
        if (B[i] == 0){
            while(1){
                //if there is no more fish against it,
                //it reaches the upper end.
                if (d == 0){
                    cnt_reached++;
                    break;
                }
                //met a bigger one, they it is eaten..
                if (down[d - 1] > A[i]){
                    break;
                }
                //we keep on eating the fish...
                d--;
            }
            continue;
        }
        //the fish is going downstream. push it to the stack.
        down[d] = A[i];
        d++;
    }
    
    return cnt_reached + d;
}

Lesson 5 : Nesting (Nesting)

Lesson 5 : Nesting
https://codility.com/programmers/lessons/5

This question requires the space complexity of O(1).

The trick to achieve this is just to count the number of '(' and ')'. To add more, the number of ')' can not exceed the number of '(' otherwise the string is invalid. The string can also be empty.

This solution gives the 100% score.









int solution(char *S) 
{
    int lnum = 0;
    int rnum = 0;
    
    char *p = S;
    
    //the string can be empty. no need to scan.
    if (*p == NULL){
        return 1;
    }
    
    while(*p != NULL){
        if (*p == '('){
            lnum++;
        }
        else {
            //the number of ')' can never exceed the number of '('
            rnum++;
            if (rnum > lnum){
                return 0;
            }
        }
        p++;
    }
    
    return lnum == rnum;
}

Lesson 5 : Brackets (Brackets)

Lesson 5 : Brackets
https://codility.com/programmers/lessons/5

This is a simple problem to check the matchings of brackets. 
Here is the simplest solution. 

Simply, we put the left bracket characters onto the stack when any of them is found. When any right bracket is found, we check if the character (left bracket) on the stack top matches to the right bracket.

One thing that should not be forgotten is see if the string is correctly terminated with no left bracket on the stack.

This solution gives the 100% score.











#include <alloca.h>

int solution(char *S) {
    
    int len = strlen(S);
    
    if (len == 0){
        return 1;
    }
    
    //I always wonder why thye don't tell anything about
    //what we shoud do when memory allocation failed 
    //in codility problems...
    char*   stack = (char*)alloca(len);
  
  
    int idx = 0;  //this is a stack pointer.
    
    int i;
    //scan the character one-by-one
    for (i = 0; i < len; i++){
        char c = S[i];
        
        switch(c){
            //if the character is '(', '[', '{',
            //just push it onto the stack.
            case '(':
            case '[':
            case '{':
                stack[idx] = c;
                idx++;
                break;
                
            //if the character is '(', '[', '{',
            //check if the last character matches with each
            case ')':
                idx--;
                if (stack[idx] != '('){
                    return 0;
                }
                break;
            case ']':
                idx--;
                if (stack[idx] != '['){
                    return 0;
                }
                break;
            case '}':
                idx--;
                if (stack[idx] != '{'){
                    return 0;
                }
                break;
            default:
                return 0;
        }
    }
    
    //we need to see if the string is terminted collectly
    if (idx != 0){
        return 0;
    }
    
    return 1;
}

2015年2月1日日曜日

Lesson 4: NumberOfDiscIntersections (Number of DiscIntersections)

Lesson 4: Number Of Disc Intersections
https://codility.com/programmers/lessons/4

I was little busy for some other issues but now have got some time to write this blog.

As for this problem, I think this one is quite tough, and some blogs seem to have wrong explanations. 

Since there are many ways to answer this problem, let's work on this step-by-step, from the simplest solution to the best solution so that we can understand how it works.


1. a very simple solution

One of the easiest solutions would be to check each disc one-by-one, seeing how many other discs have intersection. 

Of course, this give a very bad score.








This gives 'detected time complexity: O (N ** 2)' (and indeed, it is.
//the source code of 
int solution(int A[], int N) 
{   
    int cnt = 0;     int i, j;     
    for (i = 0; i < N - 1; i++){         long long curl = (long long)i - A[i];         long long curr = (long long)i + A[i];         for (j = i + 1; j < N; j++){             long long posl = (long long)j - A[j];              long long posr = (long long)j + A[j];              
            if (posl <= curr && curl <= posr){                  cnt++;                  if (cnt > 10000000){                      return -1                              }           }     }   return cnt; }

2. a better solution

Let's think of this problem graphically. Here is a pictorial representation of the example data given in the solution.





















One might notice this kind of a representation seems a bit like what we see in a compiler course. Indeed, we may have some impression that it is sort of a liveness analysis problem, as we seen in register allocation; consider each element of the array A is like a variable. To get how many registers are required for a program, we need to see what is the maximum number of variables that are alive simultaneously. If the actual number of the CPU's registers is not enough, one has to generate the code to escape some variables onto the memory at least temporarily.

So this problem is about liveness. Let's consider the left side of a disc is the beginning of liveness and the right side of a disc is the end of liveness.

Checking how many other discs are 'live' between this regions gives the number of the intersections the disc has with the other discs. 

However, this time in this problem, we only need the number of intersections and do not need to enumerate all the combinations. We only have to check the number of the 'live' discs.

If we sort the left limits and right limits of the given discs, we only have to scan the array only once after that.

Here is the score of this approach, the detected time complexity is 'O(N * log(N)) or log(N)', since we have two qsort 'O(N * log(N))' and one for loop to scan the array 'O(N)' (so O(2 * N * log(N) + N), taking the bigger time complexity, it will be O(N * log(N)) ).









#include <alloca.h>

int compare(const void* a, const void *b)
{
    long long l = *((long long*)a);
    long long r = *((long long*)b);

    if (l == r){
        return 0;
    }

    return l < r ? -1 : 1;
}

int solution(int A[], int N) {

    int memsize = sizeof(long long) * N;
    long long* activate = (long long*)alloca(memsize);
    long long* deactivate = (long long*)alloca(memsize);

    long long i = 0;
    for (   ; i < N; i++){
        activate[i] = i - A[i];
        deactivate[i] = i + A[i];
    }

    qsort(activate, N, sizeof(long long), compare);
    qsort(deactivate, N, sizeof(long long), compare);

    long long total = 0;

    long long currentActive = 0;
    long long activatedIndex = 0;
    long long deactivatedIndex = 0;

    for (i = 0; i < N; i++){

        while (activatedIndex < N &&
               activate[activatedIndex] <= deactivate[deactivatedIndex]){
            activatedIndex++;
            currentActive++;
        }

        currentActive--;
        total += currentActive;
        if (total > 10000000){
            return -1;
        }
        deactivatedIndex++;
    }

    return total;
}


This approach may require some explanation to understand. Let's see what is done graphically. First, as shown in the code, both upper(right) limit and lower(left) limit are computed and sorted, but sorted in different arrays. We don't even pair these two values for each disc, as we don't have to enumerate the combinations of discs that intersect this time.



























The smallest number for the upper limit is 'one' as above. So let's focus on this number now. How many discs are 'live' before this A[0] disc 'dies'? It can be easily obtained by checking the lower limit numbers less than 'one'. As the lower limits are also sorted, we only have to slide the current index linearly to get the next smallest value. 

Note that we don't need to know the value of a lower limit belong to which disc. What we are checking now is 'the smallest upper limit', which means all the other upper limit values are larger than it. Then, any disc with the lower limit value below 'the smallest upper limit' always has an intersection.






















So four discs was 'activated' to live including A[0] itself. Considering this, the number of the active 'intersection' at 'one' is 3. And after A[0] is 'deactivated' to die, there are 3 discs still alive here. As we have checked A[0] already, we can now forget  about it.

Then let's focus, the next smallest upper limit (A[2] or A[3]. Both would result the same answer anyway).



















Before reaching 'four' (A[2]), one more disc is activated at 'two' (A[3]). so the live discs when we are checking 'four' (A[2]) increases to 4. So the number of discs that intersects with this time is 3 (4 - 1, as we don't want to count the A[2] disc itself').

One may consider A[0] is still alive indeed, but we already have checked that the combination of A[0] and A[2] in the previous phase. 
So lets' count up the number of the intersection we checked so far (3 in the first phase + 3 in this phase = 6, now).

Let's check A[3] now, as it has 'four' the next smallest number.


















No new disc is activated in this phase, so as the number of the current live discs are three, the number of the new intersection at this point is 2. As we have seen in the previous phases, we don't have to check if A[0] or A[2] intersects with A[3], since they are already checked. We have checked 3 + 3 + 2 = 8 interactions.




















Now let's look at 5 of A[5]. (Indeed, it may need to be emphasized we really don't have to know if this is A[5] or not. We just need to know 'how many discs are 'alive' at a certain upper limit point.)

Right when A[5] is activated to live, it is deactivated to die. So the number of the live discs before the death is 3, and the number of new intersections is 2. We have checked 3 + 3 + 2 + 2 = 10 interactions.





















Now let's check the next smallest upper limit 'six' (of A[1]). We have two discs alive before 6, thus the number of new intersections is '1'. We have checked 3 + 3 + 2 + 2 + 1 = 11 interactions.



















We reached the biggest upper limit value 'eight'. Only one disc is alive and thus there is no intersection. The final answer we have is 11.

I hope the above graphical representations help the understanding of the solution. It should also be noted again that what we just need was 'the number of the active discs' and so we only have to care about the 'activate' and 'deactivate' events and do not have to care about to which disc each event belong.

Considering this fact, there can be found a solution even better than this.


3. the best solution

I found this solution in the stack overflow.
http://stackoverflow.com/questions/4801242/algorithm-to-calculate-number-of-intersecting-discs

This solution is O(N) and does not require any sorting. It also gives 100% score.










To fully understand the solution, we begin from a less efficient solution.

As described in the above solution, note this again: what we are interested in the number of the intersections, and it is not required to enumerate the combinations. 

So we care about the liveness; we checked how many discs are still alive when any of them dies. 

Considering this, we can prepare two arrays similar as the solution we have seen already, but we use the array to contain the number of activation events and deactivation events at the index [i].

Scanning such arrays from the beginning, we can track how many discs are alive at the index [i] and how many dies there. When we found any disc dies at the index [i], we count up the number of intersection by the_number_of_the_current_active_discs - 1 (which is the number of the intersections between the dying discs and other active discs), and the decrement the number of the active discs as one dies. If two or more discs are dying at the index [i], we repeat the above until there is no more disc that dies at the index.

The code is as below.

#include <memory.h>
#include <alloca.h>

int solution(int A[], int N) {

    
    int memsize = sizeof(int) * N;
    int* activated   = (int*)alloca(memsize);
    int* deactivated = (int*)alloca(memsize);
    
    memset(activated    , 0x00, memsize);
    memset(deactivated  , 0x00, memsize);
    
    int i;
    //counter the number of activation/deactivation at the index.
    for (i = 0; i < N; i++){
        //if the lower limit is below 0, consider it as 0.
        //this won't affect the number of activated discs at the index.
        long long lower_lim = i - (long long)A[i]; 
        lower_lim = lower_lim < 0 ? 0 : lower_lim;
        activated[lower_lim]++;
        
        //the same to the upper limit.
        long long upper_lim = i + (long long)A[i];
        upper_lim  = upper_lim > N - 1 ? N - 1 : upper_lim;
        deactivated[upper_lim]++;
    }
    
    //let's scan the activated/deactivated arrays.
    int total = 0;
    int active = 0;
    for (i = 0;  i < N; i++){
        active += activated[i];
        if (active == 0 || deactivated[i] == 0){
            continue;
        }
        
        
        //we have at least one deactivated disc at the index 'i'.
        while (deactivated[i] > 0){
            //active - 1is the number of the new intersection.
            total += active - 1;   
            if (total > 10000000){
                return -1;
            }
            //one disc is dead.
            //so decrement the number of active discs.
            active--;
            deactivated[i]--;
        }
    }
    
    return total;
}




One might notice what we do is the internal while loop is indeed computing the sequence of numbers with common difference, which can be replaced by a simple formula; think of this, if there is a sequence such as [10, 9, 8, 7,...], we can compute its sum by Sn = n (a + l) / 2, where 'Sn' is the sum, 'n' is the number of terms, and 'a' and 'l' are the first term and the last term respectively.

Modifying the code above in this manner, the best solution can be given.









#include <memory.h>
#include <alloca.h>

int solution(int A[], int N) {
   
    int memsize = sizeof(int) * N;
    int* activated   = (int*)alloca(memsize);
    int* deactivated = (int*)alloca(memsize);
   
    memset(activated    , 0x00, memsize);
    memset(deactivated  , 0x00, memsize);
   
    int i;
    //counter the number of activation/deactivation at the index.
    for (i = 0; i < N; i++){
        //if the lower limit is below 0, consider it as 0.
        //this won't affect the number of activated discs at the index.
        long long lower_lim = i - (long long)A[i];
        lower_lim = lower_lim < 0 ? 0 : lower_lim;
        activated[lower_lim]++;
       
        //the same to the upper limit.
        long long upper_lim = i + (long long)A[i];
        upper_lim  = upper_lim > N - 1 ? N - 1 : upper_lim;
        deactivated[upper_lim]++;
    }
   
    //let's scan the activated/deactivated arrays.
    int total = 0;
    int active = 0;
    for (i = 0;  i < N; i++){
        active += activated[i];
        if (active == 0 || deactivated[i] == 0){
            continue;
        }

        //Sn = n * (a + l) / 2
        //  where   Sn  is the sum of the sequence 9, 8, 7, 6, 5 ...
        //          n   is the number of terms  (= deactivated[i])
        //          a   is the first term       (= active - 1)
        //          n   is the last term        (= active - deactivated[i])
        //
        //int sn = deactivated[i] * ( (active - 1) + (active - deactivated[i]) / 2
        int sn = deactivated[i] * (2 * active - 1 - deactivated[i]) / 2;
        total += sn;
        if (total > 10000000){
            return -1;
        }
        active -= deactivated[i];
    }
   
    return total;

}


While this problem is very tough and interesting, I still don't have any idea when such a solution can be useful. When you do linear register allocation, you actually have to track each variables and the set of live variables so to decide which variable(s) should be escaped to memory when the number of registers are not enough...