mulan.classifier.meta
Class SubsetLearner

java.lang.Object
  extended by mulan.classifier.MultiLabelLearnerBase
      extended by mulan.classifier.meta.MultiLabelMetaLearner
          extended by mulan.classifier.meta.SubsetLearner
All Implemented Interfaces:
Serializable, MultiLabelLearner, TechnicalInformationHandler

public class SubsetLearner
extends MultiLabelMetaLearner

A class for learning a classifier according to disjoint label subsets: a multi-label learner (Label Powerset by default) is applied to subsets containing multiple labels, and a single-label learner is applied to subsets containing a single label. The final prediction is obtained by combining the labels predicted by all the learned models. Note: the class is not thread-safe.

There is a mechanism for caching and reusing learned classification models. Caching is controlled by the useCache parameter (see setUseCache(boolean)).

For more information, see

Lena Tenenboim, Lior Rokach, Bracha Shapira: Multi-label Classification by Analyzing Labels Dependencies. In: Proc. ECML/PKDD 2009 Workshop on Learning from Multi-Label Data (MLD'09), Bled, Slovenia, 117--132, 2009.

Lena Tenenboim-Chekina, Lior Rokach, Bracha Shapira: Identification of Label Dependencies for Multi-label Classification. In: Proc. ICML 2010 Workshop on Learning from Multi-Label Data (MLD'10), Haifa, Israel, 53--60, 2010.

BibTeX:

 @inproceedings{LenaTenenboim2009,
    address = {Bled, Slovenia},
    author = {Lena Tenenboim, Lior Rokach, and Bracha Shapira},
    pages = {117--132},
    title = {Multi-label Classification by Analyzing Labels Dependencies},
    volume = {Proc. ECML/PKDD 2009 Workshop on Learning from Multi-Label Data (MLD'09)},
    year = {2009}
 }
 
 @inproceedings{LenaTenenboim-Chekina2010,
    address = {Haifa, Israel},
    author = {Lena Tenenboim-Chekina, Lior Rokach, and Bracha Shapira},
    pages = {53--60},
    title = {Identification of Label Dependencies for Multi-label Classification},
    volume = {Proc. ICML 2010 Workshop on Learning from Multi-Label Data (MLD'10)},
    year = {2010}
 }
 

Version:
30.11.2010
Author:
Lena Chekina (lenat@bgu.ac.il), Vasiloudis Theodoros
See Also:
Serialized Form

Field Summary
protected  Classifier baseSingleLabelClassifier
          Base single-label classifier that will be used for training and predictions
 
Fields inherited from class mulan.classifier.meta.MultiLabelMetaLearner
baseLearner
 
Fields inherited from class mulan.classifier.MultiLabelLearnerBase
featureIndices, labelIndices, numLabels
 
Constructor Summary
SubsetLearner()
          Default constructor
SubsetLearner(int[][] labelsSubsets, Classifier singleLabelClassifier)
          Initialize the SubsetLearner with labels subsets partitioning and single label learner.
SubsetLearner(int[][] labelsSubsets, MultiLabelLearner multiLabelLearner, Classifier singleLabelClassifier)
          Initialize the SubsetLearner with labels set partitioning, multilabel and single label learners.
SubsetLearner(LabelClustering clusteringMethod, MultiLabelLearner multiLabelLearner, Classifier singleLabelClassifier)
          Initialize the SubsetLearner with a label clustering method, multilabel and single label learners.
 
Method Summary
protected  void buildInternal(MultiLabelInstances trainingSet)
          We get the initial dataset through trainingSet.
 String getModel()
          Returns a string representation of the model
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 String globalInfo()
          Returns a string describing the multi-label learner.
 MultiLabelOutput makePredictionInternal(Instance instance)
          Makes a prediction using a different method depending on whether the label subset contains a single label or multiple labels.
 void resetRandomSeed(Object model)
          Invokes the setSeed(1) or setRandomSeed(1) method of the supplied object's class, if such a method exists.
 void resetSubsets(int[][] labelsSubsets)
          Reset the label set partitioning.
 void setSeed()
          Set random seed of all internal Learners to 1.
 void setUseCache(boolean useCache)
          Sets whether cache mechanism will be used
 
Methods inherited from class mulan.classifier.meta.MultiLabelMetaLearner
getBaseLearner
 
Methods inherited from class mulan.classifier.MultiLabelLearnerBase
build, debug, getDebug, isModelInitialized, isUpdatable, makeCopy, makePrediction, setDebug
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

baseSingleLabelClassifier

protected Classifier baseSingleLabelClassifier
Base single-label classifier that will be used for training and predictions

Constructor Detail

SubsetLearner

public SubsetLearner()
Default constructor


SubsetLearner

public SubsetLearner(int[][] labelsSubsets,
                     Classifier singleLabelClassifier)
Initialize the SubsetLearner with a partitioning of the labels into subsets and a single-label learner. A LabelPowerset learner initialized with the specified single-label learner will be used as the multi-label learner.

Parameters:
labelsSubsets - subsets of dependent labels
singleLabelClassifier - method used for single label classification

SubsetLearner

public SubsetLearner(int[][] labelsSubsets,
                     MultiLabelLearner multiLabelLearner,
                     Classifier singleLabelClassifier)
Initialize the SubsetLearner with labels set partitioning, multilabel and single label learners.

Parameters:
labelsSubsets - subsets of dependent labels
multiLabelLearner - method used for multilabel classification
singleLabelClassifier - method used for single label classification
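
The labelsSubsets argument is described as a partitioning into disjoint subsets. A minimal sketch of a sanity check that could be run before constructing the learner is shown below. It is not part of Mulan: the class and method names are hypothetical, and it assumes that labels are identified by 0-based positions in [0, numLabels) and that every label must appear in exactly one subset.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the Mulan API.
public class LabelPartitionCheck {

    /**
     * Returns true if every label index in [0, numLabels) appears in
     * exactly one of the given subsets (assumption: 0-based indices).
     */
    public static boolean isDisjointPartition(int[][] labelsSubsets, int numLabels) {
        Set<Integer> seen = new HashSet<>();
        for (int[] subset : labelsSubsets) {
            for (int label : subset) {
                // Reject out-of-range indices and labels seen in an earlier subset.
                if (label < 0 || label >= numLabels || !seen.add(label)) {
                    return false;
                }
            }
        }
        // Every label must be covered by some subset.
        return seen.size() == numLabels;
    }

    public static void main(String[] args) {
        int[][] subsets = { {0, 2}, {1}, {3, 4} };
        System.out.println(isDisjointPartition(subsets, 5));
    }
}
```

Here {0, 2} and {3, 4} would go to the multi-label learner and {1} to the single-label classifier.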

SubsetLearner

public SubsetLearner(LabelClustering clusteringMethod,
                     MultiLabelLearner multiLabelLearner,
                     Classifier singleLabelClassifier)
Initialize the SubsetLearner with a label clustering method, multilabel and single label learners.

Parameters:
clusteringMethod - the label clustering method used to partition the labels into subsets
multiLabelLearner - method used for multilabel classification
singleLabelClassifier - method used for single label classification
Method Detail

resetSubsets

public void resetSubsets(int[][] labelsSubsets)
Reset the label set partitioning.

Parameters:
labelsSubsets - new label set partitioning

buildInternal

protected void buildInternal(MultiLabelInstances trainingSet)
                      throws Exception
We get the initial dataset through trainingSet. Then, for each subset of labels specified by labelsSubsets, we remove the unneeded labels and train a classifier, using the MultiLabelLearner for multi-label splits and the BinaryRelevance approach for single-label splits. Each classification model built on given training data for a given label subset is stored in a HashMap, together with its related Remove object, and can be reused the next time it is needed.

Specified by:
buildInternal in class MultiLabelLearnerBase
Parameters:
trainingSet - The initial MultiLabelInstances dataset
Throws:
Exception
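
The caching behaviour described above can be illustrated with a small stand-alone sketch. This is not the Mulan implementation: the class SubsetModelCache, its methods, and the choice of cache key are assumptions made for illustration. For brevity the sketch keys models on the label subset only, whereas the description above also ties a cached model to its training data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of per-subset model caching; M stands for any model type.
public class SubsetModelCache<M> {
    private final Map<String, M> cache = new HashMap<>();
    private boolean useCache = true;
    private int trainings = 0; // counts how often the trainer was actually invoked

    public void setUseCache(boolean useCache) { this.useCache = useCache; }

    public int getTrainings() { return trainings; }

    /** Builds an order-independent key, so {2, 0} and {0, 2} share one entry. */
    public static String keyFor(int[] subset) {
        int[] sorted = subset.clone();
        Arrays.sort(sorted);
        return Arrays.toString(sorted);
    }

    /** Returns the cached model for the subset, training it first if necessary. */
    public M getOrTrain(int[] subset, Function<int[], M> trainer) {
        if (!useCache) {
            trainings++;
            return trainer.apply(subset);
        }
        String key = keyFor(subset);
        M model = cache.get(key);
        if (model == null) {
            trainings++;
            model = trainer.apply(subset);
            cache.put(key, model);
        }
        return model;
    }
}
```

With caching enabled, requesting a model for the same subset twice invokes the trainer once; with caching disabled, the trainer runs on every request.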

resetRandomSeed

public void resetRandomSeed(Object model)
Invokes the setSeed(1) or setRandomSeed(1) method of the supplied object's class, if such a method exists.

Parameters:
model - which random seed should be reset.
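
One way to invoke a method only "if it exists" is Java reflection. The sketch below shows the general technique and is not the Mulan source: the SeedReset and Seeded classes are hypothetical, and it assumes the seed setter is public and takes a single int argument.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of reflective seed resetting.
public class SeedReset {

    /**
     * Tries setSeed(int) and then setRandomSeed(int) on the model,
     * passing 1. Returns true if one of them was found and invoked.
     */
    public static boolean resetRandomSeed(Object model) {
        for (String name : new String[] {"setSeed", "setRandomSeed"}) {
            try {
                Method m = model.getClass().getMethod(name, int.class);
                m.invoke(model, 1);
                return true;
            } catch (NoSuchMethodException e) {
                // This setter is absent; fall through to the next candidate name.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not invoke " + name, e);
            }
        }
        return false;
    }

    /** Toy model with a seed setter, used only to demonstrate the call. */
    public static class Seeded {
        public int seed = 42;
        public void setSeed(int seed) { this.seed = seed; }
    }

    public static void main(String[] args) {
        Seeded model = new Seeded();
        resetRandomSeed(model);
        System.out.println(model.seed);
    }
}
```

Objects that expose neither method are left untouched, which matches the "if such a method exists" wording above.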

setSeed

public void setSeed()
Set random seed of all internal Learners to 1.


makePredictionInternal

public MultiLabelOutput makePredictionInternal(Instance instance)
                                        throws Exception
Makes a prediction using a different method depending on whether the label subset contains a single label or multiple labels.

Specified by:
makePredictionInternal in class MultiLabelLearnerBase
Parameters:
instance - the instance for classification prediction
Returns:
the MultiLabelOutput classification prediction for the instance
Throws:
Exception
InvalidDataException - if specified instance data is invalid and can not be processed by the learner
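
Because the label subsets are disjoint, the per-subset predictions can be merged by writing each predicted value back to its original label position. The following is a simplified sketch of that merging step only, using plain boolean arrays in place of MultiLabelOutput. The class and method names are hypothetical, and label indices are assumed to be 0-based.

```java
import java.util.Arrays;

// Hypothetical illustration of merging per-subset predictions.
public class SubsetPredictionMerge {

    /**
     * subsetPredictions[i][j] is the predicted relevance of label
     * labelsSubsets[i][j]; the result is indexed by original label position.
     */
    public static boolean[] combine(int[][] labelsSubsets,
                                    boolean[][] subsetPredictions,
                                    int numLabels) {
        boolean[] result = new boolean[numLabels];
        for (int i = 0; i < labelsSubsets.length; i++) {
            for (int j = 0; j < labelsSubsets[i].length; j++) {
                result[labelsSubsets[i][j]] = subsetPredictions[i][j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] subsets = { {0, 2}, {1} };
        boolean[][] predictions = { {true, false}, {true} };
        System.out.println(Arrays.toString(combine(subsets, predictions, 3)));
    }
}
```

In this example the multi-label model for {0, 2} and the single-label model for {1} each contribute their own positions of the final bipartition.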

setUseCache

public void setUseCache(boolean useCache)
Sets whether cache mechanism will be used

Parameters:
useCache - whether cache mechanism will be used

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Description copied from class: MultiLabelLearnerBase
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Specified by:
getTechnicalInformation in class MultiLabelLearnerBase
Returns:
the technical information about this class

getModel

public String getModel()
Returns a string representation of the model

Returns:
a string representation of the model

globalInfo

public String globalInfo()
Description copied from class: MultiLabelLearnerBase
Returns a string describing the multi-label learner.

Specified by:
globalInfo in class MultiLabelLearnerBase
Returns:
a description suitable for displaying in a future GUI