2015年3月12日木曜日

Lesson 8: Flags (Flags)

Lesson 8: Flags
https://codility.com/programmers/lessons/8

This is a though problem. I couldn't hit on the idea to achieve the O(N) complexity, so I googled and found the solution by Codility.

https://codility.com/media/train/solution-flags.pdf


But let's do it step-by-step to understand this problem better.

First of all, it should be noted that what is required as the answer is the maximum number of flags that can be set; we don't have to enumerate all the possible combinations how the flags can be set.

Then, the question is what we can suppose about the peek where the first flag can be set? Indeed, we can always start from the 1st peek.

Let's think such a case as follows; the first peek is at A[1], the second is at A[3] and the third is at A[7], and the distance between two flags must be larger than 3. Obviously you can't put flags both onto A[1] and A[3], while both (A[1], A[7]) and (A[3],A[7]) satisfies this condition. 

Indeed, since the first peak always has more distance to the third peek than the second peek (as the first peek is always on the left side of the array form the second peek, this is obvious), we can always start from the first peek found in the array.

What about the other peeks? The same can be said. For instance, the distance between A[13] and A[20] is always larger than the A[15] and A[20], by the definition given in the problem.


Possibly the easiest way to imagine this situation is to slide the flags from right to left. Suppose any arbitrary valid combinations of the flags and peeks. 

The first flag can be moved to the first peek to the left, as this only makes the first flag more distant to the second flag.  Then second flag to the left, a peek that is the closest peek to the first flat that does not violate the distance constraint given to the first flag and the second flag.

In this situation, from the first flag, the second flag is seen far enough, since we shifted the second flag to the left without violating the distance constraint. From the third flag, the second flag just became more far, so this doesn't cause the distance  constraint to be violated. 

We can shift all the flags to the left if possible in this manner. And we just shifted the flags without changing the number of flags. 

Then, this leeds to the conclusion; as long as we can always start from the 1st peek and then pick up the nearest peek that satisfies the given constraint, it is enough to check if k flags can be set or not. 


Now, let's solve this problem step-by-step.

(1) the simplest solution

One of the simplest solutions would be like below. While this code gives the 100% correctness score, the performance score is obviously bad.









#include <alloca.h>

int check(int k, int* peeks, int peek_cnt)
{
    //if we have to put only one flag, it is always possible..
    if (k == 1){
        return 1;
    }
    
    //the first flag can be set.
    int prev = peeks[0];
    int rest = k - 1;
    
    //check if this is true
    int i = 1;
    while (i < peek_cnt){
        int curr = peeks[i];
        if (curr - prev >= k){
            prev = curr;
            rest--;
            //we consumed the all the flag.
            if (rest == 0){
                return 1;
            }
        }
        i++;
    }
    
    return 0;
}

int solution(int A[], int N) 
{
    //the max number of the peeks is less than N / 2.
    int* peeks = (int*)alloca(sizeof(int) * (N / 2) );
    int peek_cnt = 0;
    
    //let's find the peeks first.
    int i;
    for (i = 1; i < N - 1; i++){
        if (A[i - 1] < A[i] && A[i] > A[i + 1]){
            peeks[peek_cnt] = i;
            peek_cnt++;
        }
    }
    
    //we don't have to check when there is no peek found.
    if (peek_cnt == 0){
        return 0;
    }
    
    //check how many flags can be set...
    int max_flags = 0;
    for (int i = peek_cnt; i > 0; i--){
        int ret = check(i, peeks, peek_cnt);
        if (ret != 0){
            max_flags = i;
            break;
        }
    }
    
    return max_flags;
}



(2) The bisection solution.

Now, we have to deal with the performance problem. What we can say from the problem is that if K flags can be set (K > 1), we can always put K - 1 flag(s), and if K flag(s) can not be set (K > 0), K  + 1 flag(s) can not be set either.

This means that we can use the bisection algorithm for this problem.

Now let's think of the upper limit for K. When we put K flags, the minimum distance between the first flag and the last flag is (K - 1) * K. As the left most cell and the right most cell of the array A can not be peeks by the definition, we can say (K - 1) * K + 2 must be less than N. (The reason why we don't say (K - 1) * K + 2 <= N is because the array A is an zero-indexed array.)


Then,
(K - 1) * K + 2 < N 
(K - 1) * K     < N - 2
      K - 1     < ((N - 2) / K) 
          K     < ((N - 2) / K) + 1;

When K is sqrt(N - 2), K is equal to ((N - 2)/ K), since K * K = N - 2, then K = (N - 2) / K and the above inequation `K < ((N - 2)/ K) + 1' is true.

While K must be a positive integer value, sqrt(N - 2) may not be integer. Yet, also when K is smaller than sqrt(N - 2), the above inequation is still true, since if K becomes smaller (N - 2) / K will be bigger. 

Can we increment K a bit more? Since K must be a positive integer, the next possible value will be floor(sqrt(N - 2)) + 1. Let's use the value X, which is equal to floor(sqrt(N - 2)), and replace K with X + 1, and let's check if the below is true or false.

X + 1 < ((N - 2) / (X + 1)) + 1

We multiply both sides by X + 1, then

(X + 1) ^ 2  < (N - 2) + (X + 1)
X^2 + 2X + 1 < (N - 2) + (X + 1)

Now, subtract (X + 1) from the both sides.

X^2 + X < N - 2

As we first assumed X + 1 > sqrt(N - 2) (because floor(sqrt(N - 2)) + 1 > sqrt(N - 2)) and then X^2 is larger than N - 2, this is obviously false.

This means that the upper limit to check K should be set to sqrt(N - 2) (but excluding sqrt(N - 2)). So in the following solution, the upper limit for K to use when checking if K flags can be set, is written as ceil(sqrt(N - 2)) (as K must be an integer, we has to chose K bigger than sqrt(N - 2) when sqrt(N - 2) is not an integer.)


If the number of peeks found is smaller than ceil(sqrt(N - 2)), we can use the number of peeks as the upper limit. As the loop index we use is integer in the below code, we use ceil(sqrt(N - 2)) when sort(N - 2) is not integer, and use sqrt(N - 2) - 1 when sqrt(N - 2) is integer (to make it correct upper limit).


The code below implements such a strategy. 

I don't know why the 100% score is given for this solution, since this is obviously the O(N log n) solution due to the bisection algorithm  while the problem requires O(N) solution. 







#include <math.h>
#include <alloca.h>

int check(int k, int* peeks, int peek_cnt)
{
    //if we have to put only one flag, it is always possible..
    if (k == 1){
        return 1;
    }
    
    //the first flag can be set.
    int prev = peeks[0];
    int rest = k - 1;
    
    //check if this is true
    int i = 1;
    while (i < peek_cnt){
        int curr = peeks[i];
        if (curr - prev >= k){
            prev = curr;
            rest--;
            //we consumed the all the flag.
            if (rest == 0){
                return 1;
            }
        }
        i++;
    }
    
    return 0;
}

int solution(int A[], int N) 
{   
    //there is no peek if N < 3.
    if (N < 3){
        return 0;
    }
    
    //the max number of the peeks is always less than N / 2.
    //Indeed, the max number of the peeks, (N - 3) / 2 + 1. 
    //Think of [0, 1] and then
    //add [0, 1] on the left hand side and [0] on the right hand side...
    int* peeks = (int*)alloca(sizeof(int) * (N / 2) );
    int peek_cnt = 0;
    
    //let's find the peeks first.
    int i;
    for (i = 1; i < N - 1; i++){
        if (A[i - 1] < A[i] && A[i] > A[i + 1]){
            peeks[peek_cnt] = i;
            peek_cnt++;
        }
    }
    
    //we don't have to check when there is no peek found.
    if (peek_cnt == 0){
        return 0;
    }
    
    //check how many flags can be set...
    //this time we use the bisection algorithm
    int max_flags   = 0;    
    
    
    int min = 1; //as we have at least one peek, min can be 1.

    //we need to round up the sqrt(N - 2) to get the upper limit
    double  tmp1 = sqrt(N - 2);
    int     tmp2 = (int)tmp1;
    
    //the belo code is to handle when N == 3. we don' want tmp to be 0.
    int tmp = 1;
    if (tmp1 > 1){
        tmp = (tmp1 == tmp2 ? tmp2 - 1: tmp2 + 1); 
    }
    
    int max = peek_cnt < tmp ? peek_cnt : tmp;
    
    while(min <= max){        
        int k = (max + min) / 2;
        
        int ret = check(k, peeks, peek_cnt);

        //we failed to put k flags, then try to check a smaller value.
        if (ret == 0){
            max = k - 1;
        }
        //we succeed to put k flags, update the max_flags and try a larger value.
        else {
            min = k + 1;
            max_flags = k;
        }
    }
    
    return max_flags;
}


(3) The O(N) solution


This is something described in https://codility.com/media/train/solution-flags.pdf.

As I didn't hit on a good idea for the O(N) time complexity solution, I checked the solution by Codiliy.

First of all, instead of retaining the indices of the peeks as above solutions, we create an array that holds the next peek on the right side of the array at each index. For instance, if there is no index between A[2] and A[5] and A[8] is a peek, the next_peek[3] to next_peek[6] array contains the value of 5 and 8 respectively.

Using this next_peek array, the location of the next peek in the array A can be found in a constant time by looking up this next_peek array.

Here is the solution I coded after seen the Codility's solution.

This gives the 100% score.




#include <alloca.h>

int solution(int A[], int N) {

    //for N < 3, there can be no peak.
    if (N < 3){
        return 0;
    }
    
    int* next_peek = (int*)alloca(sizeof(int) * N);
    

    //-1 stands for there is no more peek.
    next_peek[N - 1] = -1;
    
    //scan backwards
    int peek_cnt = 0;
    int i;
    for (i = N - 2; i > 0; i--){
        if (A[i - 1] < A[i] && A[i] > A[i + 1]){
            next_peek[i] = i;
            peek_cnt++;
        }
        else {
            next_peek[i] = next_peek[i + 1];
        }
    }
    next_peek[0] = next_peek[1];
        
    //now let's check 
    int max_flags = 0;
    for (i = 1; (i - 1) * i + 2 < N && i <= peek_cnt; i++){
        int pos = 0;
        int num = 0;
        while (pos < N && num < i){
            pos = next_peek[pos];
            //we don't have any more peek. exit this loop.
            if (pos == -1){
                break;
            }
            //we can set a flag at the pos. 
            //increment pos to seek the next peek
            num++;
            pos += i;
        }
    
        max_flags = max_flags < num ? num : max_flags;
    }
    
    return max_flags;
}

(4) Summary

What I found after reading the solution by Codility and by other people on the Internet, there are lots of not-so-precise code.

For instance, consider sqrt(N) as the upper limit for the K instead of sqrt(N - 2) is frequently seen in the official solution by Codility...

Also the code in the (3) O(N) solution section is indeed can be improved by using bisection algorithm, as we don't really have to check all the possible 'i' for this problem.

Here I show some example code to utilize the bisection algorithm with the next_peek strategy in the (3) O(N) algorithm...

#include <alloca.h>

int solution(int A[], int N) {

    //for N < 3, there can be no peak.
    if (N < 3){
        return 0;
    }
    
    int* next_peek = (int*)alloca(sizeof(int) * N);
    

    //-1 stands for there is no more peek.
    next_peek[N - 1] = -1;
    
    //scan backwards
    int peek_cnt = 0;
    int i;
    for (i = N - 2; i > 0; i--){
        if (A[i - 1] < A[i] && A[i] > A[i + 1]){
            next_peek[i] = i;
            peek_cnt++;
        }
        else {
            next_peek[i] = next_peek[i + 1];
        }
    }
    next_peek[0] = next_peek[1];
        
    //now let's check 
    int max_flags = 0;
    
    //use the bisection algorithm
    int min = 1; //as we have at least one peek, min can be 1.

    //we need to round up the sqrt(N - 2) to get the upper limit
    double  tmp1 = sqrt(N - 2);
    int     tmp2 = (int)tmp1;
    
    //the belo code is to handle when N == 3. we don' want tmp to be 0.
    int tmp = 1;
    if (tmp1 > 1){
        tmp = (tmp1 == tmp2 ? tmp2 - 1: tmp2 + 1); 
    }
    
    int max = peek_cnt < tmp ? peek_cnt : tmp;
    
    while (min <= max){
        int k = (max + min) / 2;
        
        //let's check if k flags can be set. 
        int pos = 0;
        int num = 0;
        while (pos < N && num < k){
            pos = next_peek[pos];
            if (pos == -1){
                break;
            }
            num++;
            pos += k;
        }
        
        //we failed to put k flags, then try to check a smaller value.
        if (num < k){
            max = k - 1;
        }
        //we succeed to put k flags, update the max_flags and try a larger value.
        else {
            min = k + 1;
            max_flags = k;
        }
    }
    
    return max_flags;
}

2 件のコメント:

  1. Great explanations, thanks a lot.

    返信削除
  2. Thank you very much for your explanation

    I don't think we should keep a peak array with N element. The next peak is not necessary. Please find below for my implementation



    bool checkMid(std::vector peaks, int mid)
    {
    int count = 1;
    int prev = peaks[0];

    for (unsigned int i = 0; i < peaks.size(); i++)
    {
    if (peaks[i] - prev >= mid)
    {
    count++;
    prev = peaks[i];
    }
    }

    return (count >= mid) ? true : false;
    }

    int solution(vector &A) {

    std::vector peaks;

    if (A.size() < 3)
    return 0;

    for (unsigned int i = 1; i < A.size() - 1; i++)
    {
    if (A[i] > A[i-1] && A[i] > A[i+1])
    peaks.push_back(i);
    }

    if (peaks.size() == 0)
    return 0;

    int left = 0;
    int right = peaks.size();
    int posibility = -1;

    while (left <= right)
    {
    int mid = (left + right) / 2;

    if (checkMid(peaks, mid) == true)
    {
    left = mid + 1;
    posibility = mid;
    }
    else
    right = mid - 1;
    }

    return posibility;
    }

    返信削除