The preprocessing tool for text collections has also been implemented in Java.
This program preprocesses documents for text mining. It offers several document
representations (words or N-grams as terms) and several weighting methods
(binary, TF, or TF-IDF). It also provides two standard text preprocessing
procedures: stopword removal and stemming.
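
To make the representations and weighting methods concrete, here is a minimal Java sketch of the general techniques. It is not the tool's actual code or API: the class name `PreprocessSketch`, its method names, and the tiny stopword list are all hypothetical. It also assumes word-level N-grams and the common TF-IDF variant tf × ln(N / df); the tool itself may use character N-grams or a different formula. Stemming is left out because it needs a dedicated stemmer (for example, a Porter stemmer).

```java
import java.util.*;

public class PreprocessSketch {
    // Hypothetical stopword list; a real tool would load a much longer one.
    static final Set<String> STOPWORDS = Set.of("the", "a", "of", "and", "is");

    // Lowercase the text, split on non-letter characters, drop stopwords.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !STOPWORDS.contains(t)) out.add(t);
        }
        return out;
    }

    // Word N-grams (an assumption): join each run of n consecutive tokens.
    static List<String> ngrams(List<String> tokens, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return out;
    }

    // TF weighting: raw count of each term in one document.
    static Map<String, Double> tf(List<String> terms) {
        Map<String, Double> w = new TreeMap<>();
        for (String t : terms) w.merge(t, 1.0, Double::sum);
        return w;
    }

    // Binary weighting: 1.0 for every term that occurs at least once.
    static Map<String, Double> binary(List<String> terms) {
        Map<String, Double> w = new TreeMap<>();
        for (String t : terms) w.put(t, 1.0);
        return w;
    }

    // TF-IDF weighting: tf(t, d) * ln(N / df(t)) over a small corpus.
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: number of documents containing each term.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> d : docs) {
            for (String t : new HashSet<>(d)) df.merge(t, 1, Integer::sum);
        }
        List<Map<String, Double>> out = new ArrayList<>();
        for (List<String> d : docs) {
            Map<String, Double> w = tf(d);
            w.replaceAll((t, f) -> f * Math.log((double) docs.size() / df.get(t)));
            out.add(w);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            tokenize("The mining of text is fun"),
            tokenize("Text mining and data mining"));
        System.out.println(ngrams(docs.get(0), 2));
        System.out.println(tfIdf(docs));
    }
}
```

In this sketch, terms that occur in every document receive a TF-IDF weight of zero, which is the usual motivation for preferring TF-IDF over plain TF when frequent terms carry little discriminating information.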
You can download our text preprocessing tool here.
This product is free software: you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software Foundation, either
version 3 of the License, or (at your option) any later version. See
http://www.fsf.org/licensing/licenses/gpl.html