Class NGram

java.lang.Object
  extended by Configured
      extended by NGram

public class NGram
extends Configured

NGram scans a directory of text files and generates a list of N-grams. More information: N-gram on Wikipedia. The class runs two map/reduce stages: the first finds all N-grams that start with the desired word or phrase, and the second filters out (prunes) those N-grams that do not occur frequently enough.
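
The two stages can be illustrated with a small in-memory sketch. This is not the class's actual implementation, which runs as map/reduce jobs. The names NGramSketch, findNGrams, prune and minCount are hypothetical, and the tokenization (lowercasing, splitting on whitespace, not letting N-grams span lines) is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-process illustration of the find and prune stages.
public class NGramSketch {

    // Stage 1 (find): count every n-word sequence that starts with the prefix.
    // Assumption: words are lowercased and separated by whitespace.
    static Map<String, Integer> findNGrams(List<String> lines, String prefix, int n) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim().toLowerCase();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] words = trimmed.split("\\s+");
            for (int i = 0; i + n <= words.length; i++) {
                String gram = String.join(" ", Arrays.copyOfRange(words, i, i + n));
                // Match on a word boundary so "the" does not match "then ...".
                if (gram.equals(prefix) || gram.startsWith(prefix + " ")) {
                    counts.merge(gram, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Stage 2 (prune): keep only the N-grams seen at least minCount times.
    static Map<String, Integer> prune(Map<String, Integer> counts, int minCount) {
        Map<String, Integer> kept = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "the quick brown fox",
                "the quick red fox",
                "The quick brown dog");
        Map<String, Integer> found = findNGrams(lines, "the quick", 3);
        System.out.println(prune(found, 2)); // prints {the quick brown=2}
    }
}
```

In this sketch the first stage yields "the quick brown" twice and "the quick red" once, and pruning with a threshold of 2 keeps only the former. In a map/reduce setting the counting would typically happen in the reducer of the first job, with the threshold applied in the second job.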

Author:
Philip M. White

Nested Class Summary
static class NGram.FindJob_MapClass
          This is the map class of the first stage.
static class NGram.FindJob_ReduceClass
           
static class NGram.PruneJob_MapClass
           
static class NGram.PruneJob_ReduceClass
           
 
Constructor Summary
NGram()
           
 
Method Summary
static void main(java.lang.String[] args)
           
 int run(java.lang.String[] args)
          The main driver for the N-gram map/reduce program.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NGram

public NGram()
Method Detail

run

public int run(java.lang.String[] args)
        throws java.lang.Exception
The main driver for the N-gram map/reduce program. Invoke this method to submit the map/reduce job.

Throws:
java.io.IOException - When there are communication problems with the job tracker.
java.lang.Exception

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception