Monthly Archives: May 2015

Least Perfect Square

This problem is from careercup. http://www.careercup.com/question?id=5725584103571456

Given a number “n”, find the least number of perfecet square numbers sum needed to get “n”.
Exmaple:
n=12, return 3 (4+4+4)=(2^2+2^2+2^2) NOT (3^2+1+1^2+1^2)
n=6, return 3(4+1+1)=(2^2+1^2+1^2)

At first sight, I thought this could be resolved by greedy method:
round(sqrt(12))=3, remainder=3;
round(sqrt(3))=1, remainder=2;
round(sqrt(2))=1, remainder=1;
But this won’t give out a optimized answer.

This problem is actually a dynamic programming problem. It is very similar to Fibonacci. Let’s assume table[i] stores the answer for least perfect square. Let’s think about how to get table[12].
12 = 11+1^2
12 = 8+2^2
12 = 3+ 3^2
In this way, we know that table[12]=min{table[3]+1, table[8]+1, table[11]+1}

We can compute table[3], table[8], table[11] accordingly like this. We just need to find the numbers before each one, which add a power number and can reach it. To understandy the solution, I wrote below code. It helps to understand. But it is not efficient.

/** Inefficient version */
static int leastPerfectSquare1(int n){
    if(n==1){
        return 1;
    }
    else if(n==0){
        return 0;
    }
    int maxSquare = (int)Math.sqrt(n);
    int result = n;
    for(int i=1;i<=maxSquare; i++){
        int tmpResult = leastPerfectSquare1(n - (int) Math.pow(i, 2)) + 1;
        result = (tmpResult < result) ? tmpResult : result;
    }
    return result;
}

Below version stores the sub solutions.

/** Recursion storing sub solutions **/
static int leastPerfectSquare2(int n){
    int[] storedResult = new int[n + 1];
    storedResult[1] = 1;
    return leastPerfectSquare2Helper(n, storedResult);
}

static int leastPerfectSquare2Helper(int n, int[] storedResult){
    if(n==0){
        return 0;
    }
    int maxSquare = (int)Math.sqrt(n);
    int result = n;
    for(int i=1;i<=maxSquare; i++){
        int power = (int)Math.pow(i, 2);
        int tmpResult = 0;
        if(storedResult[n - power]!=0){
            tmpResult = storedResult[n - power] + 1;
        }
        else {
            tmpResult = leastPerfectSquare2Helper(n - power, storedResult) + 1;
        }
        result = (tmpResult < result) ? tmpResult : result;
    }
    storedResult[n] = result;
    return result;
}

Below version is the effective one compared to above 2 versions. It only uese loop, takes O(n) time. It loop all the potentional squre number, and see what position of number can use it to compute the next one.(hard to explain, reading the code would be helpful.)

/** Non-recursion solution **/
static int leastPerfectSquare3(int n){
    int[] table = new int[n+1];
    for(int i=0;iint maxSquare = (int) Math.sqrt(n);
    for(int i=2; i<=maxSquare; i++){
        int power = (int) Math.pow(i, 2);
        int boundary = n - power;
        for(int j=0;j<=boundary; j++){
            if(table[j] + 1 < table[j+power] ){
                table[j+power] = table[j] + 1;
            }
        }
    }
    return table[n];
}

Translate query from mysql to oracle

IF
Mysql: select if(field1=‘ok’,1,0) AS ‘DR‘
Oracle:
select
case
when field1=‘ok’ then
1
else
0
end AS DR

Date in condition
Mysql: where creation_date>=’2014-06-01‘
Oracle: where trunc(creation_date)>= trunc(to_date(‘2014-06-01’, ‘rrrr-mm-dd’))
In Oracle, we should use trunc besides to_date to compare. For example trunc(sysdate+1).

CONCAT
Mysql: concat(field1, field2, field3)
Oracle: concat(field1, concat(field2, field3)) or field1||field2||field3

GROUP_CONCAT
Mysql: select GROUP_CONCAT(field1 SEPARATOR ‘,’)
Oracle: select listagg(field1, ‘,’)  WITHIN group (order by null)

GRUOP BY, Add aggregator in those fields which are not specified in gorup by:
Mysql: select field1, field2, field3 from t1 group by field1
Oracle: select field1, max(field2), max(field3) from t1 group by field1

UNION,
Mysql: select ‘abc’ as field1 union select field1 from t1
Oracle: select ‘abc’ as field1 from dual union select field1 from t1

Other things we need to pay attention between mysql and oracle

Select date in Oracle
In Oracle select clause, use: select to_char(creation_date, ‘mm/dd/rrrr‘), not: select to_date(creation_date, ‘mm/dd/rrrr’)

Oracle select result are all uppercase
Mysql: select field1 as f1 from t1. The result will be:f1
Oracle: select field1 as f1 from t1. The result will be:F1

Select same field for more than one time:
Mysql: select field1, field1 from t1. The result will be:field1, field1
Oracle: select field1, field1 from t1. The result will be:field1, field1_1

TO_DAYS
Mysql: select to_days(curdate()) from dual.
Oracle: select TRUNC(sysdate) – DATE’0000-01-03′ from dual

Weighted Reservoir Sampling

Based on last Reservoir Sampling problem, I will develop it a little bit more. What if each element has a weight. Let’s say we have below elements. The struture is: [id, weight]:
e0=[0, 2], e1=[1, 1], e2=[2, 1], e3=[3, 1], e4=[4, 1].

And we want to sample 2 of them. Let’s compute the possibilite that e0 will be sampled. Out of 5 elements, the possibility that first time we choose e0 is 2/(2+1+1+1+1) = 2/6. And we have chance that the first time we didn’t choose e0, it is 4/6. And the possibility that we choose e0 at 2nd time is (4/6)*(2/5). So the total possibility we choose e0 is 2/6 + (4/6)*(2/5).

Above should be the result of the sampling. But how do we handle this problem with weight?  Actually, this is a weighted reservoir sampling problem which is described in this paper http://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf.

The idea is that we compute a value ki of element i, by combining weight wi and a random value ui. The ki will be between (0,…,1):

And we read from stream, keep top m ki value of element by using heap.

package com.pli.project.algorithm.probability;

import java.util.*;

/**
 * Created by lipeng on 2015/5/25.
 */
public class WeightedReservoirSampling {

    public static int[] weightedReservoirSampling(int[][] items, int m){
        if(items==null || m > items.length){
            System.out.println("incorrect input.");
            return null;
        }
        PriorityQueue<WrsElement> heap = new PriorityQueue<WrsElement>(10);
        /** Transform weight into a double between (0,1). Put it in min heap. **/
        for(int i=0; i < items.length ; i++){
            double key = Math.pow((Math.random()), 1/(double)items[i][1]);
            WrsElement element = new WrsElement(items[i][0], key);
            heap.add(element);
            if(heap.size() > m){
                heap.poll();
            }
        }
        /** Output result. **/
        int[] result = new int[m];
        for(int i=0;i<m; i++){
            result[i] = heap.poll().value;
        }
        return result;
    }

    static class WrsElement implements Comparable<WrsElement>{

        int value;
        double key;

        public WrsElement(int value, double key){
            this.value = value;
            this.key = key;
        }

        public int compareTo(WrsElement wrsElement) {
            return Double.compare(this.key, wrsElement.key);
        }

        @Override
        public String toString() {
            return "WrsElement{" +
                    "value=" + value +
                    ", key=" + key +
                    '}';
        }
    }

    public static void main(String[] args) {
        /** each item = {id, weight} **/
        int[][] items = {
                {1, 1}, {2, 5}, {3, 20}, {4, 5}, {5, 10}, {6, 3}, {7, 3}, {8, 3}
        };
        int[] result = weightedReservoirSampling(items, 3);
        System.out.println(Arrays.toString(result));
    }

    public static void test(){
        int[][] items = {{0, 2}, {1, 1}, {2, 1}, {3, 1}, {4, 1} };
        int m = 2;
        int[] probabilities = new int[items.length];
        for(int i=0; i< 1000; i++) {
            int[] result = weightedReservoirSampling(items, m);
            for(int j = 0; j < m; j++){
                probabilities[result[j]]++;
            }
        }
        System.out.println(Arrays.toString(probabilities));
    }
}

Here are some other referene to learn:
http://gregable.com/2007/10/reservoir-sampling.html
http://stackoverflow.com/questions/17117872/is-there-an-algorithm-for-weighted-reservoir-sampling

Reservoir Sampling

I have a post about how to return a value by it’s weight by inputing a stream. http://allenlipeng47.com/PersonalPage/index/view/129/nkey

Based on that, I will develop this to another problem: Reservoir Sampling.

The probelm of Reservoir Sampling is like this. There are elements inputted by stream, which lenght is n. How can we uniformly choose m out of n elements. We assume n is too large to fit in memory.

The solution is similar to the previous post. We define an ResultArray of length m. At beginning, we push [0,…,m-1] elements to ResultArray. Then loop element from [m,…,n-1]. For element i, we generate a random number with range [0,…,i-1]. If random number is less than m, then we put element i to ResultArray[i]. Otherwise, we continue the loop.

package com.pli.project.algorithm.probability;

import java.util.Arrays;

/**
 * Created by lipeng on 2015/5/25.
 */
public class ReservoirSampling {

    /**
     * Suppose arr is a very large array, can't be stored in memory.
     * Randomly select m element among arr.
     * @param arr, the input array. Very large, can't be stored in memory.
     * @param m, the size of reservoir
     * @return, the sampling result.
     */
    public static int[] reservoirSampling(int[] arr, int m){
        if(arr==null || m > arr.length){
            return null;
        }
        int[] result = Arrays.copyOf(arr, m);
        for(int i = m; i < arr.length; i++){
            int randPos = (int)(Math.random() * (i + 1));
            if(randPos < m){
                result[randPos] = arr[i];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] arr = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        int[] result = reservoirSampling(arr, 3);
        System.out.println(Arrays.toString(result));
    }

}

Water jug problem

And I found this water jug problem when I’m reading Number Theory. It is interesting. Let me start by defining the problem:

Assume we have 2 jugs. Jug1 can contains 3 galons water, jug2 can contains 5 galons water. We can pour water into jug1 and jug2, empty 2 jugs, or we can pour water between 2 jugs. The question is how can we get a 4 galons water by pouring water between these 2 jugs.

Assume (x, y) is one of state of 2 jugs, which jug1 has x galons of water, jug2 has y galons of water. A possible movement is like this: (0, 5) -> (3, 2) -> (0, 2) -> (2, 0) -> (2, 5) -> (3, 4)

It turns out it is an AI problem. It has many states. Each state can transit to another state. And we find the goal state.

The transistion of a state can be described as below:
(x, y) -> (a, y) if x < a, fill first jug
(x, y) -> (x, b) if b < y, fill second jug
(x, y) -> (0, y) if x > 0, empty first jug
(x, y) -> (x, 0) if y > 0, empty second jug
(x, y) -> (a, y – (a – x)) if x + y > a, pour water from second jug to first jug
(x, y) -> (x – (b – y), b) if x + y > b, pour water from first jug to second jug
(x, y) -> (x + y, 0) if x + y <= a, pour water from second jug to first jug
(x, y) -> (0, x + y) if x + y <= b, pour water from first jug to second jug

During the search, we find if x or y becomes the goal volume we want. Below my code is a DFS implementation. It is a basic solution for water jug. The result may not be optimized.

package com.pli.project.algorithm.ai;

import java.math.BigInteger;
import java.util.HashSet;
import java.util.Stack;

/**
 * Created by lipeng on 2015/5/23.
 */
public class WaterJug {

    public static void waterJug(int a, int b, int target){
        /** Eliminate unproper input. */
        if(a <= 0 || b <= 0 || target > b){
            System.out.println("no solution for " + a + ", " + b + ", " + target);
            return;
        }
        BigInteger bigA = new BigInteger(String.valueOf(a));
        BigInteger bigB = new BigInteger(String.valueOf(b));
        int gcd = bigA.gcd(bigB).intValue();
        if(target % gcd != 0){
            System.out.println("no solution for " + a + ", " + b + ", " + target);
            return;
        }
        if(a > b){
            System.out.println("b is supposed to greater than a.");
            return;
        }
        HashSet stateSpace = new HashSet();   //stores all the states.
        WjState firstState = new WjState(0, 0);
        Stack stateStack = new Stack();
        stateStack.add(firstState);
        WjState curState = firstState;
        /** DFS to find target result. Result is not guaranteed to be optimized **/
        while(stateStack.size() > 0 && curState.x != target && curState.y != target){
            curState = stateStack.peek();
            WjState nextState = new WjState(0, 0, curState.step + 1, curState);
            if(curState.x < a && !stateSpace.contains(nextState.setPos(a, curState.y))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.y < b && !stateSpace.contains(nextState.setPos(curState.x, b))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.x > 0 && !stateSpace.contains(nextState.setPos(0, curState.y))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.y > 0 && !stateSpace.contains(nextState.setPos(curState.x, 0))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.x + curState.y > a && !stateSpace.contains(nextState.setPos(a, curState.y - a + curState.x))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.x + curState.y > b && !stateSpace.contains(nextState.setPos(curState.x - b + curState.y, b))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.x + curState.y <=a && !stateSpace.contains(nextState.setPos(curState.x + curState.y, 0))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else if(curState.x + curState.y <=b && !stateSpace.contains(nextState.setPos(0, curState.x + curState.y))){
                stateSpace.add(nextState);
                stateStack.add(nextState);
            }
            else{
                stateStack.pop();
            }
        }// while
        /** Output result */
        StringBuffer output = new StringBuffer();
        while(curState != null){
            output.insert(0, " -> " + curState);
            curState = curState.pre;
        }
        output.delete(0, 4);
        System.out.println(output);
    }


    public static class WjState{
        int x, y;
        int step;   //How many steps is needed in order to reach the state
        WjState pre;    //to define the state before it.

        public boolean posEquals(WjState state){
            if(this.x==state.x&&this.y==state.y){
                return true;
            }
            return false;
        }

        public WjState(int x, int y){
            this.x = x;
            this.y = y;
        }

        public WjState(int x, int y, int step, WjState pre){
            this.x = x;
            this.y = y;
            this.step = step;
            this.pre = pre;
        }

        public WjState setPos(int x, int y){
            this.x = x;
            this.y = y;
            return this;
        }

        public int hashCode() {
            int hash = 1 ;
            hash = hash * 31 + x;
            hash = hash * 31 + y;
            return hash;
        }

        public String toString(){
            return "(" + this.x + ", " + this.y + ")";
        }

        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            WjState wjState = (WjState) o;
            if (x != wjState.x) return false;
            return y == wjState.y;
        }
    }

    public static void main(String[] args) {
        waterJug(3, 5, 4);
    }
}

gcd

Euclid’s algorithm: gcd of 2 number equals gcd of smaller number and their difference.

Use Euclid’s algorithm to calculate gcd(a, b).

package com.pli.project.algorithm.gcd;

import java.math.BigInteger;

/**
 * Created by lipeng on 2015/5/23.
 */
public class Gcd {

    /**
     * Keep a is the max value.
     * gcd{a,b}=gcd{−a,b}=gcd{a,−b}=gcd{−a,−b}
     * https://proofwiki.org/wiki/GCD_for_Negative_Integers
     */
    public static int gcd(int a, int b){
        a = Math.abs(a);
        b = Math.abs(b);
        if(b > a){
            int tmp = b; b = a; a = tmp;
        }
        if(b==0){
            return a;
        }
        int remainder = 0;
        do {
            remainder = a % b;
            a = b;
            b = remainder;
        } while(remainder!=0);
        return a;
    }

    public static void main(String[] args) {
        BigInteger a = new BigInteger("-10");
        BigInteger b = new BigInteger("5");
        System.out.println(a.gcd(b));
        System.out.println(gcd(-10, 5));
    }

}

HMAC summary

This article is the summary of a HMAC video: https://www.youtube.com/watch?v=KglnT_KwO0M

Below is the tradition private / public key applicaiton. Mostly, they are for encrypting data and signing signature:

The problem is that encrypting/decryptingcomputation takes a lot of CPU. It is slow.
Instead, we can choose a symmetric key, like below:

x is the message,  it could be large. So we define x=x1, x2, …, xn
If we define hash function like h(k, x),  which is called secret prefix:
1. m=MACk(x)=h(k, x1, x2, …, xn)
Hacker may append xn+1, and loop again hash algorithm to generate m’, which m’=MACk(x)=h(k, x1, x2, …, xn, xn+1). In this way, m’ is valid with k.

If we define hash function like h(x,h), which is called secret suffix:
2. m=MACk(x)=h(x, k)
Hacker may find a x’, which  h(x’,k) has same collission value m, which m=MACk(x’)=h(x’,k)

So, we present HMAC. The formula of HMAC is like below.
 

In only computes the whole message for only 1 time.
An HMAC is smaller in size and takes much less CPU to compute and verify than any know public key operation for comparable security levels.

Rolling hash. O(1) time to compute queue’s hash

The key idea of rolling hash is to generate hash of n+1 elements in O(1) time, by knowing hash of n elements.
Here is a great introduction for rolling hash from TopCoder:
https://www.topcoder.com/community/data-science/data-science-tutorials/introduction-to-string-searching-algorithms/

And here is an application for checking palindrome in a stream rolling hash from g4g:
http://www.geeksforgeeks.org/online-algorithm-for-checking-palindrome-in-a-stream/

They are great, and it supports the below changes:
“bcde” -> “abcde”, by doing str = (str + h*str[i/2])%q;
“bcde” -> “bcdef”, by doing str = (str*d + str[i+1])%q;
“bcde” -> “cdef”, by doing str = (d*(str + q – str[(i+1)/2]*h)%q

During the calculation, we need to store the base^n, which we called MAGIC.

Unfortunately, above articles don’t mention this type of change:
“bcde” -> “cde”.

Let’s assume hash for “bcde” is:

And our aim is to get below hash.

To get this done is very easy: we just do hash(“bcde”) – MAGIC * hash(‘b’).
Since the number of elements reduces from 4 to 3, how do we calculate the next MAGIC, which is 
The general problem is that how do we calculate  from , with modular q. Having instantly learned some modular arithmetic knowledge, I got it that we can calculate it by multiplicative inverse.
For example,
We should computes m for if we know:

How do we calculate:

To be honest, I don’t really know how to compute it. But good news is that we can use BigInteger to get it:
BigInteger.valueOf(11).modInverse(BigInteger.valueOf(97)). This is calculation takes O(log97) time, it is a constant.

Isn’t it cool!! By help of BigInteger, we got m is 53.
Now, let (31 * 53) mod 97, we got 91.

Let’s use calculate to verify  It is 91 too! We are right!

Below is a queue application for rolling hash. Every time when we enQueue or deQueue, we can use O(1) time to finish the hash for the whole queue. Furthermore, we can quickly check if 2 Queues are the same.

package util;

import lombok.Getter;
import java.math.BigInteger;
import java.util.LinkedList;

/**
 * Created by PLi on 6/4/2015.
 */
public class RollingHashQueue<E> {

    /** base is the number of characters in input alphabet **/
    private final int base = 31;

    /** q is a prime number used for evaluating Rabin Karp's Rolling hash **/
    private final int q = 103;

    /** magic base^(n-1). n is the number of element the list has. **/
    private int magic = 1;

    LinkedList<E> list;

    @Getter
    private int rollingHash;

    private int magicModInverse;

    public RollingHashQueue(){
        list = new LinkedList<E>();
        magicModInverse = BigInteger.valueOf(base).modInverse(BigInteger.valueOf(q)).intValue();
    }

    /** enQueue, add element to queue tail **/
    public boolean addLast(E e){
        list.addLast(e);
        rollingHash = ((rollingHash * base) % q + e.hashCode()) % q;
        magic = (magic * base) % q;
        return true;
    }

    /** add element to queue head **/
    public boolean addFirst(E e){
        list.addFirst(e);
        rollingHash = ((magic * e.hashCode()) % q + rollingHash ) % q;
        magic = (magic * base) % q;
        return true;
    }

    /** deQueue, remove element from queue head **/
    public E removeFirst(){
        if(list==null || list.size()==0){
            return null;
        }
        magic = (magic * magicModInverse) % q;
        rollingHash = (rollingHash - (magic * list.get(0).hashCode()) % q + q) % q;
        E e = list.remove(0);
        return e;
    }

    /** remove element from queue tail **/
    public E removeLast(){
        if(list==null || list.size()==0){
            return null;
        }
        magic = (magic * magicModInverse) % q;
        rollingHash = (((rollingHash - list.getLast().hashCode() + q) % q) * magicModInverse) % q;
        E e = list.removeLast();
        return e;
    }

    public int hashCode() {
        return this.hashCode();
    }

    public boolean equals(RollingHashQueue<E> queue2){
        if(queue2==null || this.rollingHash!=queue2.rollingHash || !this.list.equals(queue2.list)){
            return false;
        }
        return true;
    }

    public E getFirst(){
        return list.getFirst();
    }

    public E getLast(){
        return list.getLast();
    }

    public E get(int i){
        return list.get(i);
    }

    public String toString(){
        return list.toString();
    }

}

In the end, thank Victor Costan from MIT to inspire how to calculate the MAGIC!

Add maven to project

After create a dynanic website, we hope to automatically add jar files by maven. How to turn on the maven feature?

And go to the project directory, and run the following command:
mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

It’s done!

Refer: http://maven.apache.org/guides/getting-started/maven-in-five-minutes.html

Merge sort linkedlist

Given a linked list. Use mergesort to sort this list. Time complexity is O(nlogn).

package com.pli.project.algorithm.sort;

import java.util.LinkedList;
import java.util.List;

/**
 * Created by lipeng on 2015/5/5.
 */
public class LinkedListMergeSort {

    public static void main(String[] args) {
        Node n1 = new Node(1);
        Node n2 = new Node(5);
        Node n3 = new Node(9);
        Node n4 = new Node(8);
        Node n5 = new Node(7);
        Node n6 = new Node(6);
        Node n7 = new Node(2);
        Node n8 = new Node(3);
        Node n9 = new Node(5);
        Node n10 = new Node(1);
        n1.next = n2; n2.next = n3;
        n3.next = n4; n4.next = n5; n5.next = n6;
        n6.next = n7; n7.next = n8; n8.next = n9; n9.next = n10;
        n1 = linkedListMergeSort(n1);
        System.out.println(n1);
    }

    public static Node linkedListMergeSort(Node node){
        if(node == null || node.next == null){
            return node;
        }
        /** Divide node into 2  parts**/
        Node half1 = node, half2 = node, fast = node, half2Parent = node;
        while(fast != null && fast.next != null){
            half2Parent = half2;
            half2 = half2.next;
            fast = fast.next.next;
        }
        half2Parent.next = null;
        half1 = linkedListMergeSort(half1);
        half2 = linkedListMergeSort(half2);
        /** Merge half1 and half2 to node **/
        node = new Node(0); //need to define a fake head for Java
        Node tail = node;
        while( half1 != null && half2 != null){
            if(half1.value < half2.value){
                tail.next = half1;
                half1 = half1.next;
            }
            else{
                tail.next = half2;
                half2 = half2.next;
            }
            tail = tail.next;
        }
        if(half1 == null){
            tail.next = half2;
        }
        else{
            tail.next = half1;
        }
        return node.next;
    }

    static class Node {
        public int value;
        public Node next;
        Node(int value){
            this.value = value;
        }

        public String toString(){
            if(this.next == null){
                return String.valueOf(value);
            }
            return String.valueOf(this.value) + " " + this.next;
        }
    }
}