tp.builder
Class BuilderWord
java.lang.Object
tp.builder.BuilderWord
- All Implemented Interfaces:
- BuilderInterface
public class BuilderWord
- extends
- implements BuilderInterface
Class that builds a word-based representation of text data.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
BuilderWord
public BuilderWord(DocsRepresentDB.RepresentationModel model,
DocsRepresentDB.Preprocessing preprocessing)
- Constructor of the class
- Parameters:
model
- model used for representation (binary, TF or TF/IDF)
preprocessing
- preprocessing options selected by the user
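The three representation models named above (binary, TF, TF/IDF) can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the class `WordWeightingSketch`, its `Model` enum, the tokenizer, and the natural-logarithm IDF formula are hypothetical and are not taken from `tp.builder` or `DocsRepresentDB`; the actual weighting and preprocessing used by `BuilderWord` may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the tp.builder package.
class WordWeightingSketch {

    // Stand-in for the documented model choices (binary, TF, TF/IDF).
    enum Model { BINARY, TF, TF_IDF }

    // Assumed preprocessing: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Returns one word -> weight map per input document.
    static List<Map<String, Double>> represent(List<String> docs, Model model) {
        List<Map<String, Double>> counts = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();

        // First pass: raw term counts per document and document frequency per word.
        for (String doc : docs) {
            Map<String, Double> tf = new HashMap<>();
            for (String w : tokenize(doc)) {
                tf.merge(w, 1.0, Double::sum);
            }
            for (String w : tf.keySet()) {
                docFreq.merge(w, 1, Integer::sum);
            }
            counts.add(tf);
        }

        // Second pass: turn counts into weights according to the chosen model.
        List<Map<String, Double>> result = new ArrayList<>();
        for (Map<String, Double> tf : counts) {
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Double> e : tf.entrySet()) {
                double weight;
                switch (model) {
                    case BINARY:
                        weight = 1.0; // word is present
                        break;
                    case TF:
                        weight = e.getValue(); // raw count in this document
                        break;
                    default:
                        // Assumed TF/IDF variant: count * ln(N / document frequency).
                        weight = e.getValue()
                                * Math.log((double) docs.size() / docFreq.get(e.getKey()));
                }
                weights.put(e.getKey(), weight);
            }
            result.add(weights);
        }
        return result;
    }
}
```

Under this assumed formula, a word that occurs in every document receives a TF/IDF weight of zero, while the binary model records only presence and the TF model records the raw count.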
buildRepresentation
public DocsRepresentDB buildRepresentation(DocumentsDatabase database)
- Gets text documents from a database and creates their word representation.
- Specified by:
buildRepresentation
in interface BuilderInterface
- Parameters:
database
- input database of text documents
- Returns:
- database of all document representations
buildRepresentation
public DocsRepresentDB buildRepresentation(DocumentsDatabase database,
int depth)
- Creates a representation only if the depth is equal to "1", because this class builds a word
representation (a higher depth is possible only for the N-gram representation).
- Specified by:
buildRepresentation
in interface BuilderInterface
- Parameters:
database
- database with all documents
depth
- depth (length of a feature) - "1" for word representation
- Returns:
- created representation
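The depth check described for this overload can be sketched as follows. This is a hedged illustration, not the class's actual code: `DepthGuardSketch`, `WORD_DEPTH` and `buildWords` are hypothetical names, the input is a plain string rather than a `DocumentsDatabase`, and returning an empty `Optional` for an unsupported depth is an assumption, since the documentation does not state what the real method returns in that case.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not part of the tp.builder package.
class DepthGuardSketch {

    // A word representation has features of length 1.
    static final int WORD_DEPTH = 1;

    // Builds a word list only when depth == 1; any other depth yields an empty Optional
    // (assumed behavior, chosen here just to make the guard visible).
    static Optional<List<String>> buildWords(String text, int depth) {
        if (depth != WORD_DEPTH) {
            return Optional.empty();
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Optional.of(Collections.emptyList());
        }
        // Whitespace tokenization stands in for the undocumented preprocessing step.
        return Optional.of(Arrays.asList(trimmed.split("\\s+")));
    }
}
```

A caller that needs a higher depth would, per the description above, use the N-gram representation instead of this class.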
doInBackground
protected java.lang.Void doInBackground()
throws java.lang.Exception
- Throws:
java.lang.Exception
setMinNGramOccur
public void setMinNGramOccur(int min_ngram_occur)
- Specified by:
setMinNGramOccur
in interface BuilderInterface