
Text Representation and Its Influence on Text Categorization
Created at the Faculty of Information Technology, BUT, Brno
Authors: Ondřej Šabatka, Vladimír Bartík
Contact: xsabat05@stud.fit.vutbr.cz, bartik@fit.vutbr.cz
This program pre-processes text documents for text mining. It offers several document representations and several pre-processing methods. N-grams can also be used as features for document representation.
This help panel provides more information about setting the pre-processing options and about the various functions of the program.
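To illustrate what using n-grams as features means, the following is a minimal sketch in Python. It is not the program's own implementation: the function names `word_ngrams` and `ngram_features`, the whitespace tokenization, and the lowercasing step are all assumptions made for this example.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return all word n-grams of the text as tuples (assumed whitespace tokenization)."""
    tokens = text.lower().split()
    # Slide a window of n tokens over the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, n):
    """Map each n-gram to the number of times it occurs in the document."""
    return Counter(word_ngrams(text, n))


# Hypothetical sample document, represented by its bigram counts.
features = ngram_features("text mining needs text mining tools", 2)
for ngram, count in features.items():
    print(" ".join(ngram), count)
```

Each distinct n-gram becomes one feature of the document, and its count is the feature value. A document shorter than n words yields no features.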
Components of the program
This program consists of the following four components:
- Reading text documents
- Setting the pre-processing options for the text documents
- Displaying information about the pre-processed dataset
- Storing the representation in a file for further use
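The four components listed above can be pictured as stages of a small pipeline. The sketch below is only an illustration under stated assumptions: the function names, the `.txt` input folder, the term-count representation, and the JSON output format are hypothetical and do not describe how the program itself is built.

```python
import json
from collections import Counter
from pathlib import Path


def read_documents(folder):
    """Component 1: read every .txt file in a folder (assumed input layout)."""
    return {p.name: p.read_text(encoding="utf-8")
            for p in sorted(Path(folder).glob("*.txt"))}


def preprocess(text, lowercase=True):
    """Component 2: apply the chosen options (only lowercasing is sketched here)."""
    tokens = text.split()
    return [t.lower() for t in tokens] if lowercase else tokens


def represent(docs, lowercase=True):
    """Build a simple term-count representation for each document."""
    return {name: dict(Counter(preprocess(text, lowercase)))
            for name, text in docs.items()}


def dataset_info(representation):
    """Component 3: summarize the pre-processed dataset."""
    vocabulary = set()
    for counts in representation.values():
        vocabulary.update(counts)
    return {"documents": len(representation), "distinct_terms": len(vocabulary)}


def store(representation, path):
    """Component 4: save the representation (JSON is an assumed format)."""
    Path(path).write_text(json.dumps(representation, ensure_ascii=False, indent=2),
                          encoding="utf-8")
```

A typical run would call `read_documents`, then `represent`, inspect the result with `dataset_info`, and finally pass it to `store`.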