mulan.classifier.meta
Class HMC

java.lang.Object
  extended by mulan.classifier.MultiLabelLearnerBase
      extended by mulan.classifier.meta.MultiLabelMetaLearner
          extended by mulan.classifier.meta.HMC
All Implemented Interfaces:
Serializable, MultiLabelLearner, TechnicalInformationHandler

public class HMC
extends MultiLabelMetaLearner

Class that implements a Hierarchical Multilabel Classifier (HMC). HMC takes any kind of multi-label classifier as a parameter and builds a hierarchy of classifiers. Each node of the hierarchy is a classifier that is trained separately. The root classifier is trained on all the data; moving down the hierarchy tree, the data is adjusted for each node. First, instances that do not belong to the node are removed, and then attributes that are unnecessary for the node are removed as well. For more information, see

Grigorios Tsoumakas, Ioannis Katakis, Ioannis Vlahavas: Effective and Efficient Multilabel Classification in Domains with Large Number of Labels. In: Proc. ECML/PKDD 2008 Workshop on Mining Multidimensional Data (MMD'08), 2008.

BibTeX:

 @inproceedings{Tsoumakas2008,
    author = {Grigorios Tsoumakas and Ioannis Katakis and Ioannis Vlahavas},
    booktitle = {Proc. ECML/PKDD 2008 Workshop on Mining Multidimensional Data (MMD'08)},
    title = {Effective and Efficient Multilabel Classification in Domains with Large Number of Labels},
    year = {2008},
    location = {Antwerp, Belgium}
 }
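
The top-down data adjustment described in the class comment can be illustrated with a minimal plain-Java sketch. It does not use Mulan: the class HmcTrainingSketch, the map-based hierarchy and the label-set instances are hypothetical stand-ins, and the two counters only loosely mirror getNoNodes() and getTotalUsedTrainInsts(). The sketch also assumes that leaf labels get no classifier of their own, which the documentation above does not state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative stand-in for the top-down data adjustment; not Mulan code.
final class HmcTrainingSketch {
    long nodes = 0;          // number of nodes that would get a classifier
    long usedInstances = 0;  // instances summed over all node training sets

    // children: hypothetical hierarchy, parent label -> child labels.
    // data: each instance is represented only by the set of labels it carries.
    void visit(String node, Map<String, List<String>> children, List<Set<String>> data) {
        List<String> kids = children.getOrDefault(node, List.of());
        if (kids.isEmpty()) {
            return; // assumption: a leaf has no child labels, so nothing is trained
        }
        nodes++;
        usedInstances += data.size();
        // A real learner would be trained here on `data`, restricted to the labels in `kids`.
        for (String kid : kids) {
            List<Set<String>> subset = new ArrayList<>();
            for (Set<String> labels : data) {
                if (labels.contains(kid)) {
                    subset.add(labels); // keep only instances that belong to the child node
                }
            }
            visit(kid, children, subset);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("root", List.of("A", "B"));
        hierarchy.put("A", List.of("A1", "A2"));
        List<Set<String>> data = List.of(
                Set.of("A", "A1"), Set.of("A", "A2"), Set.of("B"), Set.of("A", "A1", "B"));
        HmcTrainingSketch sketch = new HmcTrainingSketch();
        sketch.visit("root", hierarchy, data);
        System.out.println(sketch.nodes + " nodes, " + sketch.usedInstances + " instances used");
    }
}
```

In this toy hierarchy the root sees all four instances, while the node for label A sees only the three instances that carry A.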
 

Version:
2012.07.16
Author:
George Saridis, Grigorios Tsoumakas
See Also:
Serialized Form

Field Summary
 
Fields inherited from class mulan.classifier.meta.MultiLabelMetaLearner
baseLearner
 
Fields inherited from class mulan.classifier.MultiLabelLearnerBase
featureIndices, labelIndices, numLabels
 
Constructor Summary
HMC()
          Default constructor
HMC(MultiLabelLearner baseLearner)
          Constructs a new instance
 
Method Summary
protected  void buildInternal(MultiLabelInstances dataSet)
          Learner specific implementation of building the model from MultiLabelInstances training data set.
protected  void deleteInstances(Instances trainSet, int attrIndex)
          Deletes the unnecessary instances, i.e., those that have value 0 for the given attribute.
protected  MultiLabelInstances deleteLabels(MultiLabelInstances mlData, String currentLabel, boolean keepSubTree)
          Deletes the unnecessary attributes.
 long getNoClassifierEvals()
          Returns the number of classifier evaluations
 long getNoNodes()
          Returns the number of nodes
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 long getTotalUsedTrainInsts()
          Returns the total number of training instances used
 String globalInfo()
          Returns a string describing the multi-label learner.
protected  MultiLabelOutput makePredictionInternal(Instance instance)
          Learner specific implementation for predicting on specified data based on trained model.
 
Methods inherited from class mulan.classifier.meta.MultiLabelMetaLearner
getBaseLearner
 
Methods inherited from class mulan.classifier.MultiLabelLearnerBase
build, debug, getDebug, isModelInitialized, isUpdatable, makeCopy, makePrediction, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HMC

public HMC()
Default constructor


HMC

public HMC(MultiLabelLearner baseLearner)
Constructs a new instance

Parameters:
baseLearner - the multi-label learner at each node

Method Detail

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Description copied from class: MultiLabelLearnerBase
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Specified by:
getTechnicalInformation in class MultiLabelLearnerBase
Returns:
the technical information about this class

buildInternal

protected void buildInternal(MultiLabelInstances dataSet)
                      throws Exception
Description copied from class: MultiLabelLearnerBase
Learner-specific implementation of building the model from a MultiLabelInstances training data set. This method is called from the MultiLabelLearnerBase.build(MultiLabelInstances) method, where behavior common across all learners is applied.

Specified by:
buildInternal in class MultiLabelLearnerBase
Parameters:
dataSet - the training data set
Throws:
Exception - if learner model was not created successfully

makePredictionInternal

protected MultiLabelOutput makePredictionInternal(Instance instance)
                                           throws Exception
Description copied from class: MultiLabelLearnerBase
Learner-specific implementation for predicting on the specified data based on the trained model. This method is called from MultiLabelLearnerBase.makePrediction(weka.core.Instance), which checks that the model is initialized and applies common handling/behavior.

Specified by:
makePredictionInternal in class MultiLabelLearnerBase
Parameters:
instance - the data instance to predict on
Returns:
the output of the learner for the given instance
Throws:
Exception - if an error occurs while making the prediction.
InvalidDataException - if the specified instance data is invalid and cannot be processed by the learner
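
How a hierarchy of node classifiers might be consulted at prediction time can be sketched as follows. This is an assumption, not a description of the Mulan implementation: the page above does not spell out the prediction procedure, and HmcPredictionSketch, the map-based hierarchy and the BiPredicate standing in for the node classifiers are all hypothetical. The sketch descends only into children that were predicted positive and counts one evaluation per node consulted, loosely mirroring getNoClassifierEvals().

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

// Illustrative top-down prediction over a label hierarchy; not Mulan code.
final class HmcPredictionSketch {
    long classifierEvals = 0; // number of node classifiers consulted so far

    // predictsLabel: hypothetical stand-in for the node classifiers,
    // answering whether a given child label is predicted for the instance.
    Set<String> predict(String node, Map<String, List<String>> children,
                        BiPredicate<String, double[]> predictsLabel, double[] instance) {
        Set<String> predicted = new LinkedHashSet<>();
        List<String> kids = children.getOrDefault(node, List.of());
        if (kids.isEmpty()) {
            return predicted; // a leaf has no classifier to consult
        }
        classifierEvals++;
        for (String kid : kids) {
            if (predictsLabel.test(kid, instance)) {
                predicted.add(kid);
                // Descend only into subtrees whose root label was predicted.
                predicted.addAll(predict(kid, children, predictsLabel, instance));
            }
        }
        return predicted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("root", List.of("A", "B"));
        hierarchy.put("A", List.of("A1", "A2"));
        hierarchy.put("B", List.of("B1"));
        Set<String> positives = Set.of("A", "A1", "B1");
        HmcPredictionSketch sketch = new HmcPredictionSketch();
        Set<String> labels = sketch.predict(
                "root", hierarchy, (label, x) -> positives.contains(label), new double[0]);
        System.out.println(labels + " after " + sketch.classifierEvals + " evaluations");
    }
}
```

In the toy run, B is rejected at the root, so the classifier under B is never consulted even though the stand-in would have accepted B1.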

deleteLabels

protected MultiLabelInstances deleteLabels(MultiLabelInstances mlData,
                                           String currentLabel,
                                           boolean keepSubTree)
                                    throws InvalidDataFormatException
Deletes the unnecessary attributes. Keeps as attributes only the names of the children of the node that is going to be trained, and deletes the rest.

Parameters:
mlData - the instances from which the attributes will be removed
currentLabel - the name of the node whose children will be kept as attributes
keepSubTree - whether to keep the subtree
Returns:
the MultiLabelInstances with the unnecessary attributes removed
Throws:
InvalidDataFormatException
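
The label selection performed by this method can be illustrated with a minimal plain-Java sketch. DeleteLabelsSketch and its map-based hierarchy are hypothetical and do not use Mulan types. The reading of keepSubTree shown here (true keeps all descendants of the node, false keeps only its direct children) is an assumption, since the parameter description above does not define it further.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative selection of the labels kept for one node; not Mulan code.
final class DeleteLabelsSketch {
    // Returns the labels that stay as attributes when training the node `currentLabel`.
    static Set<String> labelsToKeep(Map<String, List<String>> children,
                                    String currentLabel, boolean keepSubTree) {
        Set<String> keep = new LinkedHashSet<>();
        collect(children, currentLabel, keepSubTree, keep);
        return keep;
    }

    private static void collect(Map<String, List<String>> children, String node,
                                boolean keepSubTree, Set<String> keep) {
        for (String child : children.getOrDefault(node, List.of())) {
            keep.add(child);
            if (keepSubTree) {
                // Assumed meaning of keepSubTree: also keep every deeper descendant.
                collect(children, child, true, keep);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("root", List.of("A", "B"));
        hierarchy.put("A", List.of("A1", "A2"));
        System.out.println(labelsToKeep(hierarchy, "root", false));
        System.out.println(labelsToKeep(hierarchy, "root", true));
    }
}
```

Every label outside the returned set would be deleted from the node's training data.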

deleteInstances

protected void deleteInstances(Instances trainSet,
                               int attrIndex)
Deletes the unnecessary instances, i.e., those that have value 0 for the given attribute.

Parameters:
trainSet - the training set on which the deletion will be applied
attrIndex - the index of the attribute on which the deletion is based
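
The filtering rule can be illustrated with a minimal plain-Java sketch. It is not the Mulan implementation: DeleteInstancesSketch is a hypothetical class, and a list of double[] rows stands in for the Weka Instances object. Like the documented method, it returns nothing and modifies the training set in place.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative in-place row filter; not Mulan code.
final class DeleteInstancesSketch {
    // Removes every row whose value at attrIndex is 0.
    static void deleteInstances(List<double[]> trainSet, int attrIndex) {
        trainSet.removeIf(row -> row[attrIndex] == 0.0);
    }

    public static void main(String[] args) {
        // Column 0 is a feature; column 1 is the 0/1 indicator for the node's label.
        List<double[]> trainSet = new ArrayList<>(List.of(
                new double[] {0.3, 1.0},
                new double[] {0.7, 0.0},
                new double[] {0.1, 1.0}));
        deleteInstances(trainSet, 1);
        System.out.println(trainSet.size() + " instances remain");
    }
}
```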

getNoNodes

public long getNoNodes()
Returns the number of nodes

Returns:
number of nodes

getNoClassifierEvals

public long getNoClassifierEvals()
Returns the number of classifier evaluations

Returns:
number of classifier evaluations

getTotalUsedTrainInsts

public long getTotalUsedTrainInsts()
Returns the total number of training instances used

Returns:
total instances used

globalInfo

public String globalInfo()
Description copied from class: MultiLabelLearnerBase
Returns a string describing the multi-label learner.

Specified by:
globalInfo in class MultiLabelLearnerBase
Returns:
a description suitable for displaying in a future GUI