tp.preprocess
Class PreprocessDocument

java.lang.Object
  extended by tp.preprocess.PreprocessDocument

public class PreprocessDocument
extends java.lang.Object

A class for pre-processing documents. Pre-processing includes stemming and stop-word removal.
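The effect of the two boolean flags can be illustrated with a small, self-contained sketch. It does not use `tp.preprocess.PreprocessDocument` or `Document`. The class `PreprocessSketch`, the in-memory stop list, the lower-casing step, and the toy suffix stripper `toyStem` are all assumptions made for illustration, and the stemmer used by the real class is not specified here. With the real API the corresponding call would presumably look like `new PreprocessDocument(stopListFile).preprocessDocument(doc, true, true)`, where `stopListFile` is a hypothetical `java.io.File`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative stand-in only; NOT the implementation of tp.preprocess.PreprocessDocument.
public class PreprocessSketch {

    // Hypothetical in-memory stop list; the documented class takes its stop list as a java.io.File.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "of", "is", "and"));

    // Toy suffix stripper used as a placeholder for a real stemming algorithm.
    static String toyStem(String token) {
        if (token.length() > 4 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Mirrors the flag semantics of preprocessDocument(doc, use_stop_list, use_stemming)
    // on a plain token list. Lower-casing is an assumption of this sketch.
    static List<String> preprocess(List<String> tokens, boolean useStopList, boolean useStemming) {
        List<String> result = new ArrayList<>();
        for (String raw : tokens) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (useStopList && STOP_WORDS.contains(token)) {
                continue; // drop stop words only when the flag is set
            }
            result.add(useStemming ? toyStem(token) : token);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "indexing", "of", "documents");
        System.out.println(preprocess(tokens, true, true));   // [index, document]
        System.out.println(preprocess(tokens, true, false));  // [indexing, documents]
        System.out.println(preprocess(tokens, false, false)); // [the, indexing, of, documents]
    }
}
```

Each flag switches one step on or off independently, so passing `false` for both leaves the tokens unchanged apart from the lower-casing assumed by this sketch.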


Constructor Summary
PreprocessDocument()
           
PreprocessDocument(java.io.File stop_list)
           
 
Method Summary
 Document preprocessDocument(Document doc, boolean use_stop_list, boolean use_stemming)
          Pre-processes a single document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PreprocessDocument

public PreprocessDocument()

PreprocessDocument

public PreprocessDocument(java.io.File stop_list)

Method Detail

preprocessDocument

public Document preprocessDocument(Document doc,
                                   boolean use_stop_list,
                                   boolean use_stemming)
Pre-processes a single document, optionally applying the stop list and stemming.

Parameters:
doc - the document to be pre-processed
use_stop_list - a boolean value indicating whether the stop list will be used
use_stemming - a boolean value indicating whether stemming will be used
Returns: