The preprocessing tool for text collections has also been implemented in Java. It prepares documents for text mining. It offers several document representations (words or N-grams as terms) and several term weighting schemes (binary, TF, or TF-IDF). It also provides two standard text preprocessing steps: stopword removal and stemming.
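To make the weighting schemes concrete, here is a minimal sketch of word-based tokenization with stopword removal followed by TF and TF-IDF weighting. It is not the tool's actual code: the class name `TfIdfSketch`, its method names, the tiny stopword list, and the IDF variant ln(N / df) are all assumptions chosen for illustration. Stemming and N-gram terms are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of the distributed tool.
public class TfIdfSketch {

    // Lowercase the text, split on runs of non-letters, and drop stopwords.
    static List<String> tokenize(String text, Set<String> stopwords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !stopwords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // TF weighting: raw count of each term in one document.
    // (Binary weighting would store 1 for every term that occurs.)
    static Map<String, Integer> termFrequencies(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : terms) {
            tf.merge(term, 1, Integer::sum);
        }
        return tf;
    }

    // TF-IDF weighting, assuming idf = ln(N / df), where N is the number of
    // documents and df is the number of documents containing the term.
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> weighted = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : termFrequencies(doc).entrySet()) {
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                weights.put(e.getKey(), e.getValue() * idf);
            }
            weighted.add(weights);
        }
        return weighted;
    }

    public static void main(String[] args) {
        // Assumed stopword list, far smaller than a real one.
        Set<String> stopwords = new HashSet<>(Arrays.asList("the", "a", "of", "is"));
        List<List<String>> docs = new ArrayList<>();
        docs.add(tokenize("The mining of text is text mining", stopwords));
        docs.add(tokenize("A collection of text documents", stopwords));
        System.out.println(tfIdf(docs));
    }
}
```

With this IDF variant, a term that occurs in every document (here "text") gets weight 0, while terms confined to a single document are weighted by their count times ln(2). Other IDF formulas, such as smoothed variants, would give different values.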
You can download our text preprocessing tool here.

This product is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. See http://www.fsf.org/licensing/licenses/gpl.html

