Friday, November 15, 2013

Knuth Morris Pratt (KMP) Algorithm

Objective:
Given a (smaller) pattern string P, find all its occurrences in a (larger) text string T.

Learning Resources:
http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching 

Good Intuition and worked examples:
http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/

Short Explanation:
1) Use the pattern string in order to build the Failure Function array. 
2) Use the failure function to skip certain characters in the text and print each occurrence of the pattern in the text.

Complexity: O(length_of_pattern + length_of_text)

Implementation:
// P - pattern, F - failure function, m - P.size()
void FailureFunction(string &P, vector<int> &F, int m) {
    int i = 1, j = 0;
    F[0] = 0;
    while(i < m) {
        if(P[i] == P[j])
            {F[i] = j+1; i++; j++;}
        else if(j > 0)
            j = F[j-1];
        else
            {F[i] = 0; i++;}
    }
}

// P - pattern, T - text
void KMP(string &P, string &T) {
    int m=P.size(), n=T.size(), i=0, j=0;
    vector<int> F(m+1);
    FailureFunction(P,F,m);
    while(i < n) {
        while(j && T[i] != P[j]) j = F[j-1]; 
        if(T[i] == P[j]) j++;
        if(j == m) printf("%d\n", i-j+1);
        i++;
    }
}

Tuesday, November 12, 2013

CPSIG Week 4: String Algorithms and Range Querying

Learning Material

Basic Track:

String Algorithms:

Regular Expressions:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=regularExpressions


Knuth Morris Pratt (KMP) Algorithm:
Concept:
http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching

Implementation: 
http://machlearner.blogspot.in/2013/11/knuth-morris-pratt-kmp-algorithm.html


Range Querying:

Sparse Table (ST) Algorithm:
http://www.topcoder.com/tc?d1=tutorials&d2=lowestCommonAncestor&module=Static

Segment Tree:
http://www.topcoder.com/tc?d1=tutorials&d2=lowestCommonAncestor&module=Static
http://letuskode.blogspot.in/2013/01/segtrees.html

Binary Indexed Tree:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees

Advanced Track:

String Algorithms:

Suffix Array: 

O(n*(logn)^2) - Easier to implement in a contest scenario.
Explanation of the Algorithm:
http://discuss.codechef.com/questions/21385/a-tutorial-on-suffix-arrays

Good Examples:
http://www.stanford.edu/class/cs97si/suffix-array.pdf

O(n*logn) - Manber and Myers Algorithm
http://apps.topcoder.com/forums/?module=RevisionHistory&messageID=1171511

O(n) - DC3 Algorithm
http://algo2.iti.kit.edu/documents/jacm05-revised.pdf


A resource where you can find a compilation of resources and problems:
http://qr.ae/GTA78

Practice Problems

Basic Track:

KMP: NHAY, PERIOD, EPALIN, FINDSR, VPALIN
Segment Tree: BRCKTS, LITE, GSS1, GSS3, HORRIBLE, ORDERSET
BIT: INVCNT, MATSUM, YODANESS

Advanced Track:
SARRAY

Thursday, October 24, 2013

ICPC Amritapuri Online Contest 2013 - Editorials

There were 4 problems in total. 3 problems were adhoc in nature. The fourth problem was a tiling problem which my team could not solve during the contest. The problems took some time to load initially (about 5 minutes) but after that the contest went on without any major problems.

The following solutions use my Competitive Programming template which you can find here: http://machlearner.blogspot.in/2013/01/in-competition-programming-where-speed.html

Problem 1: Blurry Vision

Problem Statement:
https://icpc.hackerrank.com/contests/amrita13/challenges/blurry-vision

Solution:
Our objective in this problem is to find whether two sub matrices of a given matrix (which might potentially overlap) are the same. If multiple such sub matrices occur, find the one with maximum number of elements.

We could just iterate through all possible sub matrices, keep adding them to a map (either by converting the sub matrix into a string or a vector<string>). If the current sub matrix already exists in the map, we take answer to be maximum of current answer and number of elements in the current sub matrix.

Code:
map<string,int> m1;
string a[11];

int main()
{
    int t,n,m,ans;
    string s;

    t = ni();

    while(t--) {
        n = ni(); m = ni(); ans = 0;

        REP(i,0,n)
            cin >> a[i];
        REP(i,0,n)
            REP(j,0,m)
                REP(k,i,n)
                    REP(l,j,m) {
                        s = "";
                        REP(p,i,k+1) {
                            REP(q,j,l+1)
                                s.pb(a[p][q]);
                            s.pb('$');
                        }
                        if(m1.find(s) != m1.end())
                            ans = max(ans,m1[s]);
                        else
                            m1[s] = (k-i+1)*(l-j+1);
                    }
        printf("%d\n", ans);
        m1.clear();
    }
        
    return 0;
}


Complexity: O((n*m)^3) per test case.

Problem 2: Flee to Shelter

Problem Statement:
https://icpc.hackerrank.com/contests/amrita13/challenges/flee-to-shelter

Solution:
This was the easiest problem of the contest. Objective of the problem was to find the total time taken to transfer all the instruments.

ans = ceil(n/m)*time_taken_for_one_trip*2

Code:

int main()
{
    int t,n,m,T,ans;

    t = ni();

    while(t--) {
        n = ni(); m = ni(); T = ni();
        if(n%m == 0)
            ans = (n/m)*T*2;
        else
            ans = (n/m+1)*T*2;
        printf("%d\n", ans);
    }
        
    return 0;
}


Complexity: O(1) per test case.

Problem 3: Creating a wormhole

Problem Statement: https://icpc.hackerrank.com/contests/amrita13/challenges/creating-a-wormhole

Solution:
Given an list of edge lengths, our objective is to find the polygon with maximum number of edges which can be formed from the given edges.

We use the generalization of triangle inequality to any polygon  - http://en.wikipedia.org/wiki/Triangle_inequality#Generalization_of_the_inequality_to_any_polygon

First, we sort the array. Then we check for each edge i (0 based index), whether its length is smaller than sum of lengths of all previous edges. If this is true, ans = i+1, else we continue the process. ans is initialized to -1 to account for the case where no such polygon is possible.

Code:
int a[10005];

int main()
{
    int t,n,ans;
    LL sum;

    t = ni();

    while(t--) {
        n = ni(); ans = -1;
        REP(i,0,n)
            a[i] = ni();
        sort(a,a+n);
        sum = a[0] + a[1];
        REP(i,2,n) {
            if(sum > a[i])
                ans = i+1;
            sum += a[i];
        }
        printf("%d\n", ans);
    }
        
    return 0;
}

Complexity: O(n*logn) per test case.

Problem 4: Galaxy Search

Problem Statement:
https://icpc.hackerrank.com/contests/amrita13/challenges/galaxy-search

Solution: (to be updated soon...)

Note: These are just my personal solutions and not an official editorial.

Feel free to leave suggestions or alternate solutions as comments.

Thursday, October 17, 2013

CPSIG Week 3: Basic Data Structures and Graph Theory

Knowledge and usage of Basic Data Structures are the bare minimum skills every Sport Programmer needs to be equipped with. Graph theory is another  important domain which provides an alternate view of problems and facilitates one to reduce these problems into graphs and solve them using classical algorithms. Here some basic topics, learning materials and practice problems are listed to get you started into these topics.

Learning Material
 
Basic Track:

1) C++ STL: Vector, Stack, Queue, List(Linked List), Set, Map, Priority Queue
For reference: http://www.cplusplus.com/reference/stl/ (very useful during coding if you forget something)
Pre-requisite for Graph Algorithms: Representing graphs as adjacency lists, adjacency matrices etc. 

2) Depth First Search
3) Breadth First Search

4) Floyd-Warshall

5) Single Source Shortest Paths - Dijkstra's in O(V^2)

Advanced Track:
1) Single Source Shortest Paths - Dijkstra's in O(V+E*logV) - Refer to priority queue or set approach in Advanced STL uses mentioned above or read this tutorial:
2) Single Source Shortest Paths(with negative weights) - Bellman-Ford:
3) Topological Sorting:

4) Strongly Connected Components - Kosaraju's or Tarjan's algorithm.
A) Kosaraju's:
B) Tarjan's:

5) Bi-Connected Components, finding Articulation Points and Bridges.

6) Minimum Spanning Tree - Prims or Kruskals algorithms.
A) Prims:
Implementation: 
B) Kruskals: 

Practice Problems

Basic Track:

Advanced Track:

Wednesday, October 16, 2013

All Pairs Shortest Paths - Floyd Warshall Algorithm

Learning Resource: http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm

Short Description:
1) This algorithm finds shortest distance been every pair of nodes in a graph.
2) The implementation shared here uses adjacency matrix representation.
3) Complexity of the algorithm is O(n^3)

Working:
1) Initialize all distances between nodes (i,j) to INF (a very large value which is guaranteed to be more than the maximum distance between two nodes - here it is chosen as 10^9) where i != j and 0 where i==j.
2) Input the distances for every node (i,j) and populate the adjacency matrix.
3) Calculate all-pairs shortest paths by iterating for all i,j,k and using the result min-distance(i,j) = min(min-distance(i,j) , min-distance(i,k)+min-distance(k,j))

Implementation:

// Nodes are numbered from 1 to n
for(int i=1; i<=n; i++)
    for(int j=1;j<=n;j++)
        a[i][j]=(i==j)?(0):(1000000000);

// Input a[][] here.

for(int k=1; k<=n; k++)
    for(int i=1; i<=n; i++)
        for(int j=1;j<=n;j++)
            a[i][j]=min(a[i][j],a[i][k]+a[k][j]);

Tuesday, October 15, 2013

Matrix Exponentiation

Learning resource:
Fast exponentiation of numbers: http://machlearner.blogspot.in/2013/09/modular-exponentiation.html

Same concept can be applied to raise matrices to a power.

Implementation:
// If the original matrix is stored in a[][], matexp(n) calculates a^n.

#define DIM 10
#define MOD 1000000007
LL a[DIM][DIM],cpy[DIM][DIM];

void matmultiply(LL y[DIM][DIM])
{
    for(int i=0;i<DIM;i++)
        for(int j=0;j<DIM;j++)
        {
            cpy[i][j]=0;
            for(int r=0;r<DIM;r++)
                cpy[i][j]=(cpy[i][j] + a[i][r]*y[r][j]) % MOD;
        }
    memcpy(a,cpy,sizeof cpy);
}

void matexp(int n)
{
    if(n==1)
        return;
    else if(n%2==0)
    {
        matexp(n/2);
        matmultiply(a);
    }
    else
    {
        LL temp[DIM][DIM];
        memcpy(temp,a,sizeof a);
        matexp(n-1);
        matmultiply(temp);
    }
}

Thursday, October 10, 2013

Chinese Remainder Theorem

Reading:
http://en.wikipedia.org/wiki/Chinese_remainder_theorem

Here i make use of the following computational formula to calculate the solution of a system of simultaneous congruences using  Chinese Remainder Theorem:

x := \sum_{i} a_i \frac{N}{n_i} \left[\left(\frac{N}{n_i}\right)^{-1}\right]_{n_i}

Modular inverse is computed using Extended Euclid's Algorithm.

Implementation:

int egcd(int a, int b, LL &x, LL &y) {
    if(b == 0) {
        x = 1; y = 0; return a;
    }
    int gcd = egcd(b,a%b,y,x);
    y -= a/b*x;
    return gcd;
}

int crt(VI &a, VI &m) {
    int N=1,n=a.size();
    LL t,x,y,ret=0;
    REP(i,0,n) N *= m[i];
    REP(i,0,n) {
        t = N/m[i];
        egcd(t,m[i],x,y);
        ret = (ret + ((a[i]*t)%N)*x)%N;
    }
    if(ret < 0) ret += N;
    return ret;
} 

Tuesday, October 8, 2013

CPSIG Week 2: Dynamic Programming.

Dynamic Programming (DP) is more of a technique than a topic. Your skill in solving DP problems will increase with more and more practice of different types of DP problems. Here i try to mention a list of topics, learning resources and practice problems which should give you a good foundation.  

Learning Material
Basic Track:

Longest Common Subsequence
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
http://www.geeksforgeeks.org/dynamic-programming-set-4-longest-common-subsequence/

Longest Increasing Subsequence
In O(n^2) - http://www.geeksforgeeks.org/dynamic-programming-set-3-longest-increasing-subsequence/ (See the second implementation)
In O(n*logn) - http://comeoncodeon.wordpress.com/2009/08/12/longest-increasing-subsequence-lis/

Edit Distance
http://en.wikipedia.org/wiki/Levenshtein_distance

Read the Iterative with full matrix solution first.

Matrix Chain Multiplication
http://www.geeksforgeeks.org/dynamic-programming-set-8-matrix-chain-multiplication/

0/1 Knapsack
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Dynamic/knapsackdyn.htm

Some more Classical DP Problems: http://people.csail.mit.edu/bdean/6.046/dp/

Solving DP using Matrix Exponentiation

If DP problem can be reduced to a recurrence relation, then use matrix exponentiation to solve the problem.

Solving DP using Bitmasking

Usually these are problems where all possible subsets need to be generated and we need to store information about each subset to avoid re-computation.

Advanced Track:

Reading:
Commonly used DP State Domains - http://apps.topcoder.com/forums/?module=Thread&threadID=697369&start=0


Topics and practice problems:

State Space Reduction
http://www.topcoder.com/stat?c=problem_statement&pm=8605&rd=12012&rm=269199&cr=7581406

Counting

Strategies and Expected values

Solving in reverse - Easier characterization from the end.
http://www.spoj.pl/problems/MUSKET/

DP on Trees

DP with Data Structures

Practice Problems

Basic Track:
AIBOHP, AMZSEQ, KNAPSACK, SAMER08D, ELIS, LIS2, M3TILE, GNY07H, HIST2, MIXTURES, SCUBADIV, DCEPCA06, DCEPC501, EDIST, ACODE

Advanced Track: 
MUSKET, COURIER (more to be updated here...)

Some more problems:
http://codeforces.com/blog/entry/325

http://apps.topcoder.com/forums/;jsessionid=B83B4A412C257FA7F334AD9649D3E766?module=Thread&threadID=674592&start=0&mc=9#1775726

Saturday, September 14, 2013

CPSIG Week 1: Basic Mathematics.

Learning Material:

Basic Track:

Recursion:
Concept and Implementation:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=recursionPt1
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=recursionPt2
Uses: Too many to list, Highly useful concept. Basic idea of recursion is a Pre-Requisite before coming to the lecture.

Time Complexity:
Concept:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=complexity1
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=complexity2

Video Tutorials:
http://www.youtube.com/watch?v=OpebHLAf99Y&list=PL2_aWCzGMAwI9HK8YPVBjElbLbI3ufctn
http://www.youtube.com/watch?v=PFd5s0bHgAQ&list=PL2_aWCzGMAwI9HK8YPVBjElbLbI3ufctn

GCD using Euclid's Algorithm:
Sieve of Erastothenes:
Concept and Implementation: http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=math_for_topcoders
Uses: GCD - calculating GCD, LCM. Prime Sieve - Fast calculation of prime numbers in a range.

GCD and Prime Sieve is the minimum you need to learn from this link. Feel free to read the rest too if you have time.

Modular Exponentiation:
Concept: http://en.wikipedia.org/wiki/Modular_exponentiation
Implementation: http://machlearner.blogspot.in/2013/09/modular-exponentiation.html
Uses: Calculating (a^n)%MOD in O(logn) time.

Matrix Exponentiation:
Concept: http://fusharblog.com/solving-linear-recurrence-for-programming-contest/
Implementation: Refer to above link. It is essentially the same as modular exponentiation for numbers, with matrix multiplication instead of normal multiplication. It is highly recommended you implement this on your own and not re-use code.
Uses: Solving Linear Recurrences like Fibonacci numbers in O((d^3)*logn) time where d - dimension of the matrix.

Modular Inverse where MOD is a prime:
Concept: http://en.wikipedia.org/wiki/Modular_multiplicative_inverse
Implementation: http://machlearner.blogspot.in/2013/09/modular-multiplicative-inverse.html
Uses: To calculate nCr%MOD where MOD is prime. nCr = (fact[n]/(fact[r]*fact[n-r]))%MOD = fact[n]%MOD * mod_inv(fact[r]*fact[n-r], MOD)

Advanced Track(for those who have completed all above mentioned concepts):

Primality Testing - Miller Rabbin:
Concept and Implementation: http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=primalityTesting
Uses: To check whether a single number is prime or not.

Inclusion-Exclusion Principle:
Concept: http://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle
Implementation:
Uses: To calculate probabilities / solve some counting problems where the concept of inclusion-exclusion is involved.

Chinese Remainder Theorem:
Concept: http://en.wikipedia.org/wiki/Chinese_remainder_theorem
Implementation: http://machlearner.blogspot.in/2013/10/chinese-remainder-theorem.html
Uses:  Determine a number n that when divided by some given divisors leaves given remainders

Euler Totient Function:
Concept: http://en.wikipedia.org/wiki/Euler%27s_totient_function
Implementation:
Uses: Too many to list. Please read the wiki page.

Advanced Counting Techniques - Polya Counting, Burnside's Lemma:
Concept:
http://en.wikipedia.org/wiki/Burnside%27s_lemma
http://petr-mitrichev.blogspot.in/2008/11/burnsides-lemma.html
Implementation:
Uses:

Problemset:

Basic Track:
SEQ, SPP, FIBOSUM, FIBTWIST, PRIME1, PRIMEZUK, DCEPCA06, LASTDIG2, GCD2, KOPC12B, RANGZER2, ZSUM

Advanced Track:
PON, PAGAIN, HNUMBERS, POWPOW, NDIVPHI, SQFREE, ETF, PROOT, NGM2

Modular Multiplicative Inverse

Objective: To calculate modular multiplicative inverse of a number.(Refer to the link if you dont know what this term means)

Solution:
Two possible solutions are very nicely explained in the Wikipedia link. So, i will just share C++ implementations using both the methods

Using Extended Euclid:

void extended_gcd(int a, int b, int &gcd, int &x, int &y)
{
    x=0; y=1; gcd=b;
    int u=1,v=0,m,n,q,r;
    while(a!=0)
    {
        q=gcd/a; r=gcd%a;
        m=x-u*q; n=y-v*q;
        gcd=a; a=r; x=u; y=v; u=m; v=n;
    }
}

LL mod_inv(int a)
{
    int gcd,x,y;
    extended_gcd(a,MOD,gcd,x,y);
    if(gcd==1)
        return (LL)((x+MOD)%MOD);
}
Using Euler's Theorem:

Use modular exponentiation from here

Modular inverse is same as mod_exp(a,MOD-2,MOD).