public class ArticleFeatureExtractor extends DefaultFeatureExtractor
Modifier and Type | Field and Description |
---|---|
static double | CENTERING_THRESHOLD - Maximal difference between left and right margin to consider the area to be centered (percentage of the parent area width) |
static double[] | DEFAULT_WEIGHTS |
static double | MIN_MARKEDNESS_DIFFERENCE - Minimal difference in the markedness that should be interpreted as a difference between the meaning of the areas. |
Constructor and Description |
---|
ArticleFeatureExtractor() |
Modifier and Type | Method and Description |
---|---|
static double | colorLuminosity(java.awt.Color c) |
weka.core.Instances | createEmptyDataset() - Creates an empty data set containing the header appropriate to this extractor. |
protected java.util.Set<Tag> | getAllTags(Area area) - Obtains all the tags assigned to this area and its child areas (not all descendant areas). |
weka.core.Instance | getAreaFeatures(Area node, weka.core.Instances dataset) - Creates a classification data instance from the given area. |
double | getAverageBoxColorLuminosity(Area area) |
FeatureVector | getFeatureVector(Area node) |
double | getIndentation(Area node) - Computes the indentation metric. |
int | getLineCount(Area a) |
double | getMarkedness(Area node) - Computes the markedness of the area. |
double | getRelY(Area a) |
Area | getTreeRoot() - Obtains the current tree root. |
double[] | getWeights() |
boolean | isCentered(Area area) - Checks whether the area is horizontally centered within its parent area. |
void | setTree(Area rootNode) - Initializes the extractor to use the tree with the given root node. |
void | setWeights(double[] weights) |
Methods inherited from class DefaultFeatureExtractor: loadArffDatasetResource
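The description of getAllTags(Area area) distinguishes child areas from all descendant areas. The following is a minimal sketch of that distinction only, not the library's implementation: `Node`, `TagSketch`, and `tagsOfAreaAndChildren` are hypothetical stand-ins, and tags are modeled as plain strings instead of the library's `Tag` type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for an area tree node; not the library's Area type. */
final class Node {
    final Set<String> tags = new LinkedHashSet<>();
    final List<Node> children = new ArrayList<>();

    Node(String... tags) {
        this.tags.addAll(Arrays.asList(tags));
    }

    /** Appends a child and returns this node so calls can be chained. */
    Node add(Node child) {
        children.add(child);
        return this;
    }
}

final class TagSketch {
    /**
     * Collects the tags of the area itself and of its direct children.
     * Grandchildren and deeper descendants are intentionally not visited.
     */
    static Set<String> tagsOfAreaAndChildren(Node area) {
        Set<String> result = new LinkedHashSet<>(area.tags);
        for (Node child : area.children) {
            result.addAll(child.tags);
        }
        return result;
    }
}
```

With a root tagged "article", a child tagged "title", and a grandchild tagged "date", the sketch returns the first two tags and omits the third.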
public static final double MIN_MARKEDNESS_DIFFERENCE
public static final double[] DEFAULT_WEIGHTS
public static final double CENTERING_THRESHOLD
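CENTERING_THRESHOLD is described as the maximal difference between the left and right margin, relative to the parent area width, for an area to count as centered. One way such a check could look is sketched below. `Box`, `CenteringSketch`, and the threshold value 0.05 are assumptions made for illustration; the actual constant value and the geometry API of `Area` are not shown on this page.

```java
/** Hypothetical horizontal extent of an area; not the library's Area type. */
final class Box {
    final int x1; // left edge
    final int x2; // right edge

    Box(int x1, int x2) {
        this.x1 = x1;
        this.x2 = x2;
    }

    int width() {
        return x2 - x1;
    }
}

final class CenteringSketch {
    /** Assumed threshold, expressed as a fraction of the parent width. */
    static final double THRESHOLD = 0.05;

    /** Returns true when the left and right margins differ by at most THRESHOLD of the parent width. */
    static boolean isCentered(Box area, Box parent) {
        if (parent.width() <= 0) {
            return false; // no meaningful margins in an empty parent
        }
        int left = area.x1 - parent.x1;   // left margin
        int right = parent.x2 - area.x2;  // right margin
        double diff = Math.abs(left - right) / (double) parent.width();
        return diff <= THRESHOLD;
    }
}
```

Normalizing by the parent width makes the test independent of the absolute page size, which matches the "percentage of the parent area width" wording above.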
public void setTree(Area rootNode)
Specified by: FeatureExtractor
Parameters:
rootNode - the new area tree root node
public Area getTreeRoot()
Specified by: FeatureExtractor
Returns: null when the tree was not set before.
public weka.core.Instances createEmptyDataset()
Specified by: FeatureExtractor
public weka.core.Instance getAreaFeatures(Area node, weka.core.Instances dataset)
Specified by: FeatureExtractor
Parameters:
node - the area whose features should be computed
dataset - the data set the created instance should belong to
public void setWeights(double[] weights)
public double[] getWeights()
public double getMarkedness(Area node)
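MIN_MARKEDNESS_DIFFERENCE is documented as the minimal markedness difference that should be read as a difference in meaning between areas. A caller comparing two getMarkedness results might apply the threshold roughly as follows. This is a sketch under assumptions: `MarkednessSketch`, `compare`, and the value 0.5 are hypothetical, and the real constant value is not given on this page.

```java
final class MarkednessSketch {
    /** Assumed threshold; stands in for MIN_MARKEDNESS_DIFFERENCE. */
    static final double MIN_DIFF = 0.5;

    /**
     * Compares two markedness values.
     * Returns 0 when the difference is below the threshold (treated as equal),
     * 1 when the first value is significantly larger, and -1 when it is significantly smaller.
     */
    static int compare(double m1, double m2) {
        double diff = m1 - m2;
        if (Math.abs(diff) < MIN_DIFF) {
            return 0;
        }
        return diff > 0 ? 1 : -1;
    }
}
```

Treating small differences as ties avoids ranking two areas differently because of noise in the visual features.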
public FeatureVector getFeatureVector(Area node)
public boolean isCentered(Area area)
Returns: true if the area is centered
public double getIndentation(Area node)
public double getAverageBoxColorLuminosity(Area area)
public static double colorLuminosity(java.awt.Color c)
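This page does not state which formula colorLuminosity uses. As an illustration of the general idea only, the sketch below computes a luminosity in the range 0 to 1 from a `java.awt.Color` using the Rec. 709 channel weights without gamma correction. The class name `LuminositySketch`, the weights, and the linear treatment of the channels are all assumptions, not the library's documented behavior.

```java
import java.awt.Color;

final class LuminositySketch {
    /**
     * Weighted sum of the normalized RGB channels (assumed Rec. 709 weights).
     * Returns 0.0 for black and approximately 1.0 for white.
     */
    static double luminosity(Color c) {
        double r = c.getRed() / 255.0;
        double g = c.getGreen() / 255.0;
        double b = c.getBlue() / 255.0;
        return 0.2126 * r + 0.7152 * g + 0.0722 * b;
    }
}
```

The weights reflect that green contributes most and blue least to perceived brightness, so pure green scores higher than pure blue.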
public double getRelY(Area a)
public int getLineCount(Area a)