diff --git a/TODO.txt b/TODO.txt
index a19f6d6..8a674a7 100644
--- a/TODO.txt
+++ b/TODO.txt
@@ -24,6 +24,10 @@ Do we want to cover cross-lingual quantification natively in QuaPy, or does it m
Current issues:
==========================================
+Revise the class structure of quantification methods and the methods they inherit... There is some confusion regarding
+ methods isbinary, isprobabilistic, and the like. The attribute "learner_" in aggregative quantifiers is also
+ confusing, since there is a getter and a setter.
+Remove the "deep" in get_params. There is no real compatibility with scikit-learn as of now.
SVMperf-based learners do not remove temp files in __del__?
In binary quantification (hp, kindle, imdb) we used F1 in the minority class (which in kindle and hp happens to be the
negative class). This is not covered in this new implementation, in which the binary case is not treated as such, but as
diff --git a/docs/build/html/genindex.html b/docs/build/html/genindex.html
index c5b4815..6ba0ab0 100644
--- a/docs/build/html/genindex.html
+++ b/docs/build/html/genindex.html
@@ -80,8 +80,6 @@
Adjusted Classify & Count,
+the “adjusted” variant of CC, that corrects the predictions of CC
+according to the misclassification rates.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
Trains a ACC quantifier
-:param data: the training set
-:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
-:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
-
-
validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
-indicating the validation set itself, or an int indicating the number k of folds to be used in kFCV
-to estimate the parameters
-
+
Trains a ACC quantifier.
-
Returns
-
self
+
Parameters
+
+
data – the training set
+
fit_learner – set to False to bypass the training (the learner is assumed to be already fit)
+
val_split – either a float in (0,1) indicating the proportion of training instances to use for
+validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
+indicating the validation set itself, or an int indicating the number k of folds to be used in k-fold
+cross validation to estimate the parameters
Solves the linear system \(Ax = B\) with \(A\) = PteCondEstim and \(B\) = prevs_estim
+
+
Parameters
+
+
PteCondEstim – a np.ndarray of shape (n_classes,n_classes,) with entry (i,j) being the estimate
+of \(P(y_i|y_j)\), that is, the probability that an instance that belongs to \(y_j\) ends up being
+classified as belonging to \(y_i\)
+
prevs_estim – a np.ndarray of shape (n_classes,) with the class prevalence estimates
+
+
+
Returns
+
an adjusted np.ndarray of shape (n_classes,) with the corrected class prevalence estimates
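A minimal sketch of the adjustment described above, written in pure Python rather than with np.ndarray so that it stays self-contained. The function names `solve_linear_system` and `adjust_prevalence` are hypothetical, and the final clipping and renormalization step is an assumption of this sketch, not something the documentation above states.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: the adjustment is undefined")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def adjust_prevalence(p_cond, prevs_estim):
    """p_cond[i][j] estimates P(y_i|y_j); prevs_estim holds the raw estimates."""
    x = solve_linear_system(p_cond, prevs_estim)
    # Assumption of this sketch: clip to [0, 1] and renormalize to sum to 1.
    clipped = [min(1.0, max(0.0, v)) for v in x]
    total = sum(clipped)
    return [v / total for v in clipped]
```

For instance, with `p_cond = [[0.8, 0.1], [0.2, 0.9]]` and raw estimates `[0.31, 0.69]`, the corrected prevalence values come out close to `[0.3, 0.7]`.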
Abstract class for quantification methods that base their estimations on the aggregation of classification
-results. Aggregative Quantifiers thus implement a _classify_ method and maintain a _learner_ attribute.
+results. Aggregative Quantifiers thus implement a classify() method and maintain a learner attribute.
+Subclasses of this abstract class must implement the method aggregate() which computes the aggregation
+of label predictions. The method quantify() comes with a default implementation based on
+
Class labels, in the same order in which class prevalence values are to be computed.
+This default implementation actually returns the class labels of the learner.
The most basic Quantification method. One that simply classifies all instances and countes how many have been
-attributed each of the classes in order to compute class prevalence estimates.
+
The most basic Quantification method. One that simply classifies all instances and counts how many have been
+attributed to each of the classes in order to compute class prevalence estimates.
+
+
Parameters
+
learner – a sklearn’s Estimator that generates a classifier
Trains the Classify & Count method unless _fit_learner_ is False, in which case it is assumed to be already fit.
-:param data: training data
-:param fit_learner: if False, the classifier is assumed to be fit
-:return: self
+
Trains the Classify & Count method unless fit_learner is False, in which case, the classifier is assumed to
+be already fit and there is nothing else to do.
Class of Explicit Loss Minimization (ELM) quantifiers.
+Quantifiers based on ELM represent a family of methods based on structured output learning;
+these quantifiers rely on classifiers that have been optimized using a quantification-oriented loss
+measure. This implementation relies on
+Joachims’ SVM perf structured output
+learning algorithm, which has to be installed and patched for the purpose (see this
+script).
+
+
Parameters
+
+
svmperf_base – path to the folder containing the binary files of SVM perf
The method is described in:
-Saerens, M., Latinne, P., and Decaestecker, C. (2002).
-Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure.
-Neural Computation, 14(1): 21–41.
+
Expectation Maximization for Quantification (EMQ),
+aka Saerens-Latinne-Decaestecker (SLD) algorithm.
+EMQ consists of using the well-known Expectation Maximization algorithm to iteratively update the posterior
+probabilities generated by a probabilistic classifier and the class prevalence estimates obtained via
+maximum-likelihood estimation, in a mutually recursive way, until convergence.
+
+
Parameters
+
learner – a sklearn’s Estimator that generates a classifier
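The mutually recursive update described above can be sketched as follows. This is an illustrative pure-Python version, not the library's code: the name `emq_sketch`, the default values of `max_iter` and `epsilon`, and the convergence test (mean absolute change of the prevalence vector) are all assumptions, and every training prevalence is assumed to be strictly positive.

```python
def emq_sketch(train_prev, posteriors, max_iter=1000, epsilon=1e-4):
    """Return (prevalence estimates, rescaled posteriors) after EM updates.

    train_prev: class prevalence values seen in training (all > 0).
    posteriors: one row per test instance, as given by the classifier.
    """
    n_classes = len(train_prev)
    prev = list(train_prev)
    adjusted = [list(row) for row in posteriors]
    for _ in range(max_iter):
        # E-step: rescale each posterior by the ratio between the current
        # prevalence estimate and the training prevalence, then renormalize.
        adjusted = []
        for row in posteriors:
            scaled = [row[c] * prev[c] / train_prev[c] for c in range(n_classes)]
            norm = sum(scaled)
            adjusted.append([s / norm for s in scaled])
        # M-step: the new prevalence is the mean of the rescaled posteriors.
        new_prev = [sum(r[c] for r in adjusted) / len(adjusted)
                    for c in range(n_classes)]
        shift = sum(abs(a - b) for a, b in zip(new_prev, prev)) / n_classes
        prev = new_prev
        if shift < epsilon:
            break
    return prev, adjusted
```

The first iteration reproduces the plain average of the posteriors; later iterations move the estimate away from the training prevalence whenever the test posteriors consistently favour one class.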
Implementation of the method based on the Hellinger Distance y (HDy) proposed by
-González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
-estimation based on the Hellinger distance. Information Sciences, 218:146–164.
+
Hellinger Distance y (HDy).
+HDy is a probabilistic method for training binary quantifiers, that models quantification as the problem of
+minimizing the divergence (in terms of the Hellinger Distance) between two cumulative distributions of posterior
+probabilities returned by the classifier. One of the distributions is generated from the unlabelled examples and
+the other is generated from a validation set. This latter distribution is defined as a mixture of the
+class-conditional distributions of the posterior probabilities returned for the positive and negative validation
+examples, respectively. The parameters of the mixture thus represent the estimates of the class prevalence values.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a binary classifier
+
val_split – a float in range (0,1) indicating the proportion of data to be used as a stratified held-out
+validation distribution, or a quapy.data.base.LabelledCollection (the split itself).
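As a rough illustration of the mixture search described above, the sketch below compares plain histograms of posterior probabilities and scans a grid of mixture weights. The helper names, the number of bins, the grid resolution, and the use of plain histograms are assumptions of this sketch and are not taken from the library. The constant factor in the Hellinger Distance is omitted because it does not change which weight is selected.

```python
import math


def histogram(values, n_bins):
    """Relative frequencies of values in [0, 1] over n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return [c / len(values) for c in counts]


def hellinger(p, q):
    """Unnormalized Hellinger Distance between two discrete distributions."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q)))


def hdy_sketch(pos_val, neg_val, test, n_bins=10, steps=100):
    """Estimate the positive prevalence as the best mixture weight."""
    pos_hist = histogram(pos_val, n_bins)
    neg_hist = histogram(neg_val, n_bins)
    test_hist = histogram(test, n_bins)

    def distance(alpha):
        mixture = [alpha * p + (1 - alpha) * n
                   for p, n in zip(pos_hist, neg_hist)]
        return hellinger(mixture, test_hist)

    # Grid search over candidate prevalence values 0, 1/steps, ..., 1.
    return min((i / steps for i in range(steps + 1)), key=distance)
```

Here `pos_val` and `neg_val` stand for the posterior probabilities of the positive class returned for positive and negative validation examples, and `test` for those returned on the unlabelled sample.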
Trains a HDy quantifier
-:param data: the training set
-:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
-:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
-
-
validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
-indicating the validation set itself
-
+
Trains a HDy quantifier.
-
Returns
-
self
+
Parameters
+
+
data – the training set
+
fit_learner – set to False to bypass the training (the learner is assumed to be already fit)
+
val_split – either a float in (0,1) indicating the proportion of training instances to use for
+validation (e.g., 0.3 for using 30% of the training set as validation data), or a
+quapy.data.base.LabelledCollection indicating the validation set itself
+
+
+
Returns
+
self
@@ -337,28 +624,73 @@ indicating the validation set itself
Threshold Optimization variant for ACC as proposed by
+Forman 2006 and
+Forman 2008 that looks
+for the threshold that maximizes tpr-fpr.
+The goal is to bring improved stability to the denominator of the adjustment.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
Median Sweep. Threshold Optimization variant for ACC as proposed by
+Forman 2006 and
+Forman 2008 that generates
+class prevalence estimates for all decision thresholds and returns the median of them all.
+The goal is to bring improved stability to the denominator of the adjustment.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
Median Sweep 2. Threshold Optimization variant for ACC as proposed by
+Forman 2006 and
+Forman 2008 that generates
+class prevalence estimates for all decision thresholds and returns the median of those for cases in
+which tpr-fpr>0.25.
+The goal is to bring improved stability to the denominator of the adjustment.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
+
+
+
@@ -377,60 +709,158 @@ indicating the validation set itself
Allows any binary quantifier to perform quantification on single-label datasets. The method maintains one binary
-quantifier for each class, and then l1-normalizes the outputs so that the class prevelences sum up to 1.
-This variant was used, along with the ExplicitLossMinimization quantifier in
-Gao, W., Sebastiani, F.: From classification to quantification in tweet sentiment analysis.
-Social Network Analysis and Mining 6(19), 1–22 (2016)
+
Allows any binary quantifier to perform quantification on single-label datasets.
+The method maintains one binary quantifier for each class, and then l1-normalizes the outputs so that the
+class prevalences sum up to 1.
+This variant was used, along with the EMQ quantifier, in
+Gao and Sebastiani, 2016.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a binary classifier
Class labels, in the same order in which class prevalence values are to be computed.
+This default implementation actually returns the class labels of the learner.
Returns a matrix of shape (n,m,) with n the number of instances and m the number of classes. The entry
+(i,j) is a binary value indicating whether instance i belongs to class j. The binary classifications are
+independent of each other, meaning that an instance can end up being attributed to 0, 1, or more classes.
Returns a matrix of shape (n,m,2) with n the number of instances and m the number of classes. The entry
+(i,j,1) (resp. (i,j,0)) is a value in [0,1] indicating the posterior probability that instance i belongs
+(resp. does not belong) to class j.
+The posterior probabilities are independent of each other, meaning that, in general, they do not sum
+up to one.
Probabilistic Adjusted Classify & Count,
+the probabilistic variant of ACC that relies on the posterior probabilities returned by a probabilistic classifier.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
Trains a PACC quantifier
-:param data: the training set
-:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
-:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
-
-
validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
-indicating the validation set itself, or an int indicating the number k of folds to be used in kFCV
-to estimate the parameters
-
+
Trains a PACC quantifier.
-
Returns
-
self
+
Parameters
+
+
data – the training set
+
fit_learner – set to False to bypass the training (the learner is assumed to be already fit)
+
val_split – either a float in (0,1) indicating the proportion of training instances to use for
+validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
+indicating the validation set itself, or an int indicating the number k of folds to be used in kFCV
+to estimate the parameters
Esuli, A. and Sebastiani, F. (2015).
-Optimizing text quantifiers for multivariate loss functions.
-ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.
+
SVM(KLD), which attempts to minimize the Kullback-Leibler Divergence as proposed by
+Esuli et al. 2015.
+Equivalent to:
+
>>> ELM(svmperf_base,loss='kld',**kwargs)
+
+
+
+
Parameters
+
+
svmperf_base – path to the folder containing the binary files of SVM perf
Esuli, A. and Sebastiani, F. (2015).
-Optimizing text quantifiers for multivariate loss functions.
-ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.
+
SVM(NKLD), which attempts to minimize a version of the Kullback-Leibler Divergence normalized
+via the logistic function, as proposed by
+Esuli et al. 2015.
+Equivalent to:
+
>>> ELM(svmperf_base,loss='nkld',**kwargs)
+
+
+
+
Parameters
+
+
svmperf_base – path to the folder containing the binary files of SVM perf
Barranquero, J., Díez, J., and del Coz, J. J. (2015).
-Quantification-oriented learning based on reliable classifiers.
-Pattern Recognition, 48(2):591–604.
+
SVM(Q), which attempts to minimize the Q loss combining a classification-oriented loss and a
+quantification-oriented loss, as proposed by
+Barranquero et al. 2015.
+Equivalent to:
+
>>> ELM(svmperf_base,loss='q',**kwargs)
+
+
+
+
Parameters
+
+
svmperf_base – path to the folder containing the binary files of SVM perf
Threshold Optimization variant for ACC as proposed by
+Forman 2006 and
+Forman 2008 that looks
+for the threshold that makes tpr closest to 0.5.
+The goal is to bring improved stability to the denominator of the adjustment.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
Abstract class of Threshold Optimization variants for ACC as proposed by
+Forman 2006 and
+Forman 2008.
+The goal is to bring improved stability to the denominator of the adjustment.
+The different variants are based on different heuristics for choosing a decision threshold
+that would allow for more true positives and many more false positives, on the grounds that this
+would deliver larger denominators.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
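The threshold-selection heuristics named in these entries (maximize tpr-fpr, tpr closest to 0.5, tpr=1-fpr) can be sketched on a validation set of binary scores as below. This is an illustrative pure-Python sketch under stated assumptions: the function names, the `CRITERIA` table, the choice of candidate thresholds (the distinct validation scores), and the tie-breaking rule (lowest threshold wins) are hypothetical and do not describe the library's implementation.

```python
def rates_at(threshold, scores, labels):
    """True and false positive rates when predicting 1 for score >= threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    tpr = sum(s >= threshold for s in pos) / len(pos)
    fpr = sum(s >= threshold for s in neg) / len(neg)
    return tpr, fpr


# Each criterion maps (tpr, fpr) to a value to be minimized.
CRITERIA = {
    "max": lambda tpr, fpr: -(tpr - fpr),          # largest tpr-fpr
    "t50": lambda tpr, fpr: abs(tpr - 0.5),        # tpr closest to 0.5
    "x": lambda tpr, fpr: abs(tpr - (1.0 - fpr)),  # tpr = 1 - fpr
}


def pick_threshold(scores, labels, criterion):
    """Return the candidate threshold that minimizes the given criterion."""
    candidates = sorted(set(scores))
    score_of = CRITERIA[criterion]
    return min(candidates, key=lambda t: score_of(*rates_at(t, scores, labels)))


def adjusted_count(test_scores, threshold, tpr, fpr):
    """Binary adjusted count: (cc - fpr) / (tpr - fpr), clipped to [0, 1]."""
    cc = sum(s >= threshold for s in test_scores) / len(test_scores)
    if tpr == fpr:
        return cc  # the adjustment is undefined; fall back to the raw count
    return min(1.0, max(0.0, (cc - fpr) / (tpr - fpr)))
```

The denominator `tpr - fpr` in `adjusted_count` is the quantity these heuristics try to keep away from zero.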
Training procedure common to all Aggregative Quantifiers.
-:param learner: the learner to be fit
-:param data: the data on which to fit the learner. If requested, the data will be split before fitting the learner.
-:param fit_learner: whether or not to fit the learner (if False, then bypasses any action)
-:param ensure_probabilistic: if True, guarantees that the resulting classifier implements predict_proba (if the
-learner is not probabilistic, then a CalibratedCV instance of it is trained)
-:param val_split: if specified as a float, indicates the proportion of training instances that will define the
-validation split (e.g., 0.3 for using 30% of the training set as validation data); if specified as a
-LabelledCollection, represents the validation split itself
-:return: the learner trained on the training set, and the unused data (a _LabelledCollection_ if train_val_split>0
-or None otherwise) to be used as a validation set for any subsequent parameter fitting
+
Threshold Optimization variant for ACC as proposed by
+Forman 2006 and
+Forman 2008 that looks
+for the threshold that yields tpr=1-fpr.
+The goal is to bring improved stability to the denominator of the adjustment.
+
+
Parameters
+
+
learner – a sklearn’s Estimator that generates a classifier
+
val_split – indicates the proportion of data to be used as a stratified held-out validation set in which the
+misclassification rates are to be estimated.
+This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+validation data, or as an integer, indicating that the misclassification rates should be estimated via
+k-fold cross validation (this integer stands for the number of folds k), or as a
+quapy.data.base.LabelledCollection (the split itself).
+
+
+
@@ -607,45 +1212,116 @@ or None otherwise) to be used as a validation set for any subsequent parameter f
Abstract class of binary quantifiers, i.e., quantifiers estimating class prevalence values for only two classes
+(typically, to be interpreted as one class and its complement).
Methods from the articles:
-Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
-Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
-Information Fusion, 34, 87-100.
+
Implementation of the Ensemble methods for quantification described by
+Pérez-Gállego et al., 2017
and
-Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019).
-Dynamic ensemble selection for quantification tasks.
-Information Fusion, 45, 1-15.
Selects the red_size best performant quantifiers in a static way (i.e., dropping all non-selected instances).
-For each model in the ensemble, the performance is measured in terms of _error_name_ on the quantification of
-the samples used for training the rest of the models in the ensemble.
Average (policy=’ave’): computes class prevalence estimates as the average of the estimates
+returned by the base quantifiers.
+
Training Prevalence (policy=’ptr’): applies a dynamic selection to the ensemble’s members by retaining only
+those members such that the class prevalence values in the samples they use as training set are closest to
+preliminary class prevalence estimates computed as the average of the estimates of all the members. The final
+estimate is recomputed by considering only the selected members.
+
Distribution Similarity (policy=’ds’): performs a dynamic selection of base members by retaining
+the members trained on samples whose distribution of posterior probabilities is closest, in terms of the
+Hellinger Distance, to the distribution of posterior probabilities in the test sample
+
Accuracy (policy=’<valid error name>’): performs a static selection of the ensemble members by
+retaining those that minimize a quantification error measure, which is passed as an argument.
quantifier – base quantification member of the ensemble
+
size – number of members
+
red_size – number of members to retain after selection (depending on the policy)
+
min_pos – minimum number of positive instances to consider a sample as valid
+
policy – the selection policy; available policies include: ave (default), ptr, ds, and accuracy
+(which is instantiated via a valid error name, e.g., mae)
+
max_sample_size – maximum number of instances to consider in the samples (set to None
+to indicate no limit, default)
+
val_split – a float in range (0,1) indicating the proportion of data to be used as a stratified held-out
+validation split, or a quapy.data.base.LabelledCollection (the split itself).
+
n_jobs – number of parallel workers (default 1)
+
verbose – set to True (default is False) to get some information in standard output
In the original article, this procedure is not described in a sufficient level of detail. The paper only says
-that the distribution of posterior probabilities from training and test examples is compared by means of the
-Hellinger Distance. However, how these posterior probabilities are generated is not specified. In the article,
-a Logistic Regressor (LR) is used as the classifier device and that could be used for this purpose. However, in
-general, a Quantifier is not necessarily an instance of Aggreggative Probabilistic Quantifiers, and so, that the
-quantifier builds on top of a probabilistic classifier cannot be given for granted. Additionally, it would not
-be correct to generate the posterior probabilities for training documents that have concurred in training the
-classifier that generates them.
-This function thus generates the posterior probabilities for all training documents in a cross-validation way,
-using a LR with hyperparameters that have previously been optimized via grid search in 5FCV.
-:return P,f, where P is a ndarray containing the posterior probabilities of the training data, generated via
-cross-validation and using an optimized LR, and the function to be used in order to generate posterior
-probabilities for test instances.
+
Class labels, in the same order in which class prevalence values are to be computed.
This function should not be used within quapy.model_selection.GridSearchQ (it is here for compatibility
+with the abstract class).
+Instead, use Ensemble(GridSearchQ(q),…), with q a Quantifier (recommended), or
+Ensemble(Q(GridSearchCV(l))) with Q a quantifier class that has a learner l optimized for
Selects the predictions made by models that have been trained on samples with a prevalence that is most similar
-to a first approximation of the test prevalence as made by all models in the ensemble.
+
Indicates that the quantifier is not probabilistic.
This function should not be used within quapy.model_selection.GridSearchQ (it is here for compatibility
+with the abstract class).
+Instead, use Ensemble(GridSearchQ(q),…), with q a Quantifier (recommended), or
+Ensemble(Q(GridSearchCV(l))) with Q a quantifier class that has a learner l optimized for
Ensemble factory. Provides a unified interface for instantiating ensembles that can be optimized (via model
+selection for quantification) for a given evaluation metric using quapy.model_selection.GridSearchQ.
+If the evaluation metric is classification-oriented
+(instead of quantification-oriented), then the optimization will be carried out via sklearn’s
+GridSearchCV.
+
+
Parameters
+
+
learner – sklearn’s Estimator that generates a classifier
+
base_quantifier_class – a class of quantifiers
+
param_grid – a dictionary with the grid of parameters to optimize for
+
optim – a valid quantification or classification error, or a string name of it
data – the training data on which to train QuaNet. If fit_learner=True, the data will be split in
+
+
data – the training data on which to train QuaNet. If fit_learner=True, the data will be split in
+40/40/20 for training the classifier, training QuaNet, and validating QuaNet, respectively. If
+fit_learner=False, the data will be split in 66/34 for training QuaNet and validating it, respectively.
+
fit_learner – if true, trains the classifier on a split containing 40% of the data
+
+
+
Returns
+
self
-
40/40/20 for training the classifier, training QuaNet, and validating QuaNet, respectively. If
-fit_learner=False, the data will be split in 66/34 for training QuaNet and validating it, respectively.
-:param fit_learner: if true, trains the classifier on a split containing 40% of the data
-:return: self
@@ -894,17 +1790,41 @@ fit_learner=False, the data will be split in 66/34 for training QuaNet and valid
The Maximum Likelihood Prevalence Estimation (MLPE) method is a lazy method that assumes there is no prior
+probability shift between training and test instances (put another way, that the i.i.d. assumption holds).
+The estimation of class prevalence values for any test sample is always (i.e., irrespective of the test sample
+itself) the class prevalence seen during training. This method is considered to be a lower-bound quantifier that
+any quantification method should beat.
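The behaviour described above fits in a few lines. The class below is a hypothetical stand-in written for illustration (the name `MLPESketch` and the attributes `classes_` and `prev_` are assumptions of this sketch); it works on a plain list of labels rather than on a LabelledCollection.

```python
from collections import Counter


class MLPESketch:
    """Always predicts the class prevalence observed during training."""

    def fit(self, labels):
        counts = Counter(labels)
        self.classes_ = sorted(counts)
        self.prev_ = [counts[c] / len(labels) for c in self.classes_]
        return self

    def quantify(self, instances):
        # The test sample is ignored on purpose: no prior probability
        # shift is assumed, so the training prevalence is returned as-is.
        return list(self.prev_)
```

Any quantifier that does worse than this on samples with shifted prevalence is effectively ignoring the test data.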
Does nothing, since this learner has no parameters.
+
+
Parameters
+
parameters – dictionary of param-value pairs (ignored)
+
+
+
diff --git a/docs/build/html/searchindex.js b/docs/build/html/searchindex.js
index 268c3ce..7492118 100644
--- a/docs/build/html/searchindex.js
+++ b/docs/build/html/searchindex.js
@@ -1 +1 @@
-Search.setIndex({docnames:["Datasets","Evaluation","Installation","Methods","Model-Selection","Plotting","index","modules","quapy","quapy.classification","quapy.data","quapy.method"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["Datasets.md","Evaluation.md","Installation.rst","Methods.md","Model-Selection.md","Plotting.md","index.rst","modules.rst","quapy.rst","quapy.classification.rst","quapy.data.rst","quapy.method.rst"],objects:{"":{quapy:[8,0,0,"-"]},"quapy.classification":{methods:[9,0,0,"-"],neural:[9,0,0,"-"],svmperf:[9,0,0,"-"]},"quapy.classification.methods":{LowRankLogisticRegression:[9,1,1,""]},"quapy.classification.methods.LowRankLogisticRegression":{fit:[9,2,1,""],get_params:[9,2,1,""],predict:[9,2,1,""],predict_proba:[9,2,1,""],set_params:[9,2,1,""],transform:[9,2,1,""]},"quapy.classification.neural":{CNNnet:[9,1,1,""],LSTMnet:[9,1,1,""],NeuralClassifierTrainer:[9,1,1,""],TextClassifierNet:[9,1,1,""],TorchDataset:[9,1,1,""]},"quapy.classification.neural.CNNnet":{document_embedding:[9,2,1,""],get_params:[9,2,1,""],vocabulary_size:[9,3,1,""]},"quapy.classification.neural.LSTMnet":{document_embedding:[9,2,1,""],get_params:[9,2,1,""],vocabulary_size:[9,3,1,""]},"quapy.classification.neural.NeuralClassifierTrainer":{device:[9,3,1,""],fit:[9,2,1,""],get_params:[9,2,1,""],predict:[9,2,1,""],predict_proba:[9,2,1,""],reset_net_params:[9,2,1,""],set_params:[9,2,1,""],transform:[9,2,1,""]},"quapy.classification.neural.TextClassifierNet":{dimensions:[9,2,1,""],document_embedding:[9,2,1,""],forward:[9,2,1,""],get_params:[9,2,1,""],predict_proba:[9,2,1,""],vocabulary_size:[9,3,1,""],xavier_uniform:[9,2,1,""]},"quapy.classification.neural.TorchDataset":{asDataloader:[9,2,1,""]},"quapy.classification.svmperf":{SVMperf:[9,1,1,""]},
"quapy.classification.svmperf.SVMperf":{decision_function:[9,2,1,""],fit:[9,2,1,""],predict:[9,2,1,""],set_params:[9,2,1,""],valid_losses:[9,4,1,""]},"quapy.data":{base:[10,0,0,"-"],datasets:[10,0,0,"-"],preprocessing:[10,0,0,"-"],reader:[10,0,0,"-"]},"quapy.data.base":{Dataset:[10,1,1,""],LabelledCollection:[10,1,1,""],isbinary:[10,5,1,""]},"quapy.data.base.Dataset":{SplitStratified:[10,2,1,""],binary:[10,3,1,""],classes_:[10,3,1,""],kFCV:[10,2,1,""],load:[10,2,1,""],n_classes:[10,3,1,""],stats:[10,2,1,""],vocabulary_size:[10,3,1,""]},"quapy.data.base.LabelledCollection":{Xy:[10,3,1,""],artificial_sampling_generator:[10,2,1,""],artificial_sampling_index_generator:[10,2,1,""],binary:[10,3,1,""],counts:[10,2,1,""],kFCV:[10,2,1,""],load:[10,2,1,""],n_classes:[10,3,1,""],natural_sampling_generator:[10,2,1,""],natural_sampling_index_generator:[10,2,1,""],prevalence:[10,2,1,""],sampling:[10,2,1,""],sampling_from_index:[10,2,1,""],sampling_index:[10,2,1,""],split_stratified:[10,2,1,""],stats:[10,2,1,""],uniform_sampling:[10,2,1,""],uniform_sampling_index:[10,2,1,""]},"quapy.data.datasets":{fetch_UCIDataset:[10,5,1,""],fetch_UCILabelledCollection:[10,5,1,""],fetch_reviews:[10,5,1,""],fetch_twitter:[10,5,1,""],warn:[10,5,1,""]},"quapy.data.preprocessing":{IndexTransformer:[10,1,1,""],index:[10,5,1,""],reduce_columns:[10,5,1,""],standardize:[10,5,1,""],text2tfidf:[10,5,1,""]},"quapy.data.preprocessing.IndexTransformer":{add_word:[10,2,1,""],fit:[10,2,1,""],fit_transform:[10,2,1,""],transform:[10,2,1,""],vocabulary_size:[10,2,1,""]},"quapy.data.reader":{binarize:[10,5,1,""],from_csv:[10,5,1,""],from_sparse:[10,5,1,""],from_text:[10,5,1,""],reindex_labels:[10,5,1,""]},"quapy.error":{absolute_error:[8,5,1,""],acc_error:[8,5,1,""],acce:[8,5,1,""],ae:[8,5,1,""],f1_error:[8,5,1,""],f1e:[8,5,1,""],from_name:[8,5,1,""],kld:[8,5,1,""],mae:[8,5,1,""],mean_absolute_error:[8,5,1,""],mean_relative_absolute_error:[8,5,1,""],mkld:[8,5,1,""],mnkld:[8,5,1,""],mrae:[8,5,1,""],mse:[8,5,1,""],n
kld:[8,5,1,""],rae:[8,5,1,""],relative_absolute_error:[8,5,1,""],se:[8,5,1,""],smooth:[8,5,1,""]},"quapy.evaluation":{artificial_prevalence_prediction:[8,5,1,""],artificial_prevalence_protocol:[8,5,1,""],artificial_prevalence_report:[8,5,1,""],evaluate:[8,5,1,""],gen_prevalence_prediction:[8,5,1,""],gen_prevalence_report:[8,5,1,""],natural_prevalence_prediction:[8,5,1,""],natural_prevalence_protocol:[8,5,1,""],natural_prevalence_report:[8,5,1,""]},"quapy.functional":{HellingerDistance:[8,5,1,""],adjusted_quantification:[8,5,1,""],artificial_prevalence_sampling:[8,5,1,""],get_nprevpoints_approximation:[8,5,1,""],normalize_prevalence:[8,5,1,""],num_prevalence_combinations:[8,5,1,""],prevalence_from_labels:[8,5,1,""],prevalence_from_probabilities:[8,5,1,""],prevalence_linspace:[8,5,1,""],strprev:[8,5,1,""],uniform_prevalence_sampling:[8,5,1,""],uniform_simplex_sampling:[8,5,1,""]},"quapy.method":{aggregative:[11,0,0,"-"],base:[11,0,0,"-"],meta:[11,0,0,"-"],neural:[11,0,0,"-"],non_aggregative:[11,0,0,"-"]},"quapy.method.aggregative":{ACC:[11,1,1,""],AdjustedClassifyAndCount:[11,4,1,""],AggregativeProbabilisticQuantifier:[11,1,1,""],AggregativeQuantifier:[11,1,1,""],CC:[11,1,1,""],ClassifyAndCount:[11,4,1,""],ELM:[11,1,1,""],EMQ:[11,1,1,""],ExpectationMaximizationQuantifier:[11,4,1,""],ExplicitLossMinimisation:[11,4,1,""],HDy:[11,1,1,""],HellingerDistanceY:[11,4,1,""],MAX:[11,1,1,""],MS2:[11,1,1,""],MS:[11,1,1,""],MedianSweep2:[11,4,1,""],MedianSweep:[11,4,1,""],OneVsAll:[11,1,1,""],PACC:[11,1,1,""],PCC:[11,1,1,""],ProbabilisticAdjustedClassifyAndCount:[11,4,1,""],ProbabilisticClassifyAndCount:[11,4,1,""],SVMAE:[11,1,1,""],SVMKLD:[11,1,1,""],SVMNKLD:[11,1,1,""],SVMQ:[11,1,1,""],SVMRAE:[11,1,1,""],T50:[11,1,1,""],ThresholdOptimization:[11,1,1,""],X:[11,1,1,""],training_helper:[11,5,1,""]},"quapy.method.aggregative.ACC":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""],solve_adjustment:[11,2,1,""]},"quapy.method.aggregative.AggregativeProbabilisticQuantifier":{po
sterior_probabilities:[11,2,1,""],predict_proba:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.AggregativeQuantifier":{aggregate:[11,2,1,""],aggregative:[11,3,1,""],classes_:[11,3,1,""],classify:[11,2,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],learner:[11,3,1,""],n_classes:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.CC":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.ELM":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.EMQ":{EM:[11,2,1,""],EPSILON:[11,4,1,""],MAX_ITER:[11,4,1,""],aggregate:[11,2,1,""],fit:[11,2,1,""],predict_proba:[11,2,1,""]},"quapy.method.aggregative.HDy":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.MS":{optimize_threshold:[11,2,1,""]},"quapy.method.aggregative.MS2":{optimize_threshold:[11,2,1,""]},"quapy.method.aggregative.OneVsAll":{aggregate:[11,2,1,""],binary:[11,3,1,""],classes_:[11,3,1,""],classify:[11,2,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],posterior_probabilities:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.PACC":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.PCC":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.ThresholdOptimization":{aggregate:[11,2,1,""],compute_fpr:[11,2,1,""],compute_table:[11,2,1,""],compute_tpr:[11,2,1,""],fit:[11,2,1,""],optimize_threshold:[11,2,1,""]},"quapy.method.base":{BaseQuantifier:[11,1,1,""],BinaryQuantifier:[11,1,1,""],isaggregative:[11,5,1,""],isbinary:[11,5,1,""],isprobabilistic:[11,5,1,""]},"quapy.method.base.BaseQuantifier":{aggregative:[11,3,1,""],binary:[11,3,1,""],classes_:[11,3,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.base.BinaryQuantifier":{binary:[11,3,1,""]},"quapy.method.meta":{EACC:[11,5,1,""],ECC:[11,5
,1,""],EEMQ:[11,5,1,""],EHDy:[11,5,1,""],EPACC:[11,5,1,""],Ensemble:[11,1,1,""],ensembleFactory:[11,5,1,""],get_probability_distribution:[11,5,1,""]},"quapy.method.meta.Ensemble":{VALID_POLICIES:[11,4,1,""],accuracy_policy:[11,2,1,""],aggregative:[11,3,1,""],binary:[11,3,1,""],classes_:[11,3,1,""],ds_policy:[11,2,1,""],ds_policy_get_posteriors:[11,2,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],probabilistic:[11,3,1,""],ptr_policy:[11,2,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""],sout:[11,2,1,""]},"quapy.method.neural":{QuaNetModule:[11,1,1,""],QuaNetTrainer:[11,1,1,""],mae_loss:[11,5,1,""]},"quapy.method.neural.QuaNetModule":{device:[11,3,1,""],forward:[11,2,1,""],init_hidden:[11,2,1,""]},"quapy.method.neural.QuaNetTrainer":{classes_:[11,3,1,""],clean_checkpoint:[11,2,1,""],clean_checkpoint_dir:[11,2,1,""],epoch:[11,2,1,""],fit:[11,2,1,""],get_aggregative_estims:[11,2,1,""],get_params:[11,2,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.non_aggregative":{MaximumLikelihoodPrevalenceEstimation:[11,1,1,""]},"quapy.method.non_aggregative.MaximumLikelihoodPrevalenceEstimation":{classes_:[11,3,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.model_selection":{GridSearchQ:[8,1,1,""]},"quapy.model_selection.GridSearchQ":{best_model:[8,2,1,""],classes_:[8,3,1,""],fit:[8,2,1,""],get_params:[8,2,1,""],quantify:[8,2,1,""],set_params:[8,2,1,""]},"quapy.plot":{binary_bias_bins:[8,5,1,""],binary_bias_global:[8,5,1,""],binary_diagonal:[8,5,1,""],brokenbar_supremacy_by_drift:[8,5,1,""],error_by_drift:[8,5,1,""]},"quapy.util":{EarlyStop:[8,1,1,""],create_if_not_exist:[8,5,1,""],create_parent_dir:[8,5,1,""],download_file:[8,5,1,""],download_file_if_not_exists:[8,5,1,""],get_quapy_home:[8,5,1,""],map_parallel:[8,5,1,""],parallel:[8,5,1,""],pickled_resource:[8,5,1,""],save_text_file:[8,5,1,""],temp_seed:[8,5,1,""]},quapy:{classification:[9,0,0,"-"],data:[10,0,0,"-"],error:[8,0,0,"-"],evaluation:[8,0,0,"-"],functiona
l:[8,0,0,"-"],isbinary:[8,5,1,""],method:[11,0,0,"-"],model_selection:[8,0,0,"-"],plot:[8,0,0,"-"],util:[8,0,0,"-"]}},objnames:{"0":["py","module","Python module"],"1":["py","class","Python class"],"2":["py","method","Python method"],"3":["py","property","Python property"],"4":["py","attribute","Python attribute"],"5":["py","function","Python function"]},objtypes:{"0":"py:module","1":"py:class","2":"py:method","3":"py:property","4":"py:attribute","5":"py:function"},terms:{"0":[0,1,3,4,5,8,9,10,11],"00":[0,1,4,8],"000":1,"0001":[4,11],"000e":1,"001":[4,9,11],"005":8,"009":1,"01":[8,9,11],"017":1,"018":0,"02":1,"021":0,"02552":4,"03":1,"034":1,"035":1,"037":1,"04":1,"041":1,"042":1,"046":1,"048":1,"05":[5,8,10],"055":1,"063":[0,10],"065":0,"070":1,"073":1,"075":1,"078":0,"081":[0,10],"082":[0,1],"083":0,"086":0,"091":1,"099":0,"1":[0,1,3,4,5,8,9,10,11],"10":[0,1,4,5,8,9,11],"100":[0,1,3,4,5,9,10,11],"1000":[0,4,11],"10000":4,"100000":4,"101":[4,8,10],"1010":4,"1024":11,"104":0,"108":1,"109":0,"11":[0,1,6,8,10],"11338":0,"114":1,"1145":[],"12":9,"120":0,"1215742":0,"1271":0,"13":[0,9],"139":0,"14":[3,11],"142":1,"146":[3,11],"1473":0,"148":0,"1484":0,"15":[3,8,10,11],"150":0,"153":0,"157":0,"158":0,"159":0,"1593":0,"1594":0,"1599":0,"161":0,"163":[0,1],"164":[0,3,11],"167":0,"17":0,"1771":1,"1775":[0,3],"1778":[0,3],"178":0,"1823":0,"1839":0,"18399":0,"1853":0,"19":[3,10,11],"193":0,"199151":0,"19982":4,"1e":9,"1st":0,"2":[0,1,3,5,8,10,11],"20":[5,8,11],"200":[1,9],"2000":0,"2002":[3,11],"2011":4,"2013":[3,11],"2015":[0,2,3,9,11],"2016":[3,10,11],"2017":[0,3,10,11],"2018":[0,3,10],"2019":[3,10,11],"2020":4,"20342":4,"206":0,"207":0,"208":0,"21":[1,3,5,8,10,11],"210":[],"211":0,"2126":0,"2155":0,"21591":[0,10],"218":[3,11],"2184":0,"219e":1,"22":[0,3,9,10,11],"222":0,"222046":0,"226":0,"229":1,"229399":0,"23":9,"235":1,"238":0,"2390":0,"24":[0,9],"243":0,"248563":0,"24866":4,"24987":4,"25":[0,5,8,9,11],"25000":0,"256":[0,9],"26":9,"261":0,"265":0,"266":0,"267":0,"27":[1
,3,9,11],"270":0,"2700406":[],"271":0,"272":0,"274":0,"275":1,"27th":[0,3,10],"28":3,"280":0,"281":0,"282":0,"283":[0,1],"288":0,"289":0,"2971":0,"2nd":0,"2t":[1,8],"2tp":8,"2x5fcv":0,"3":[0,1,3,5,6,8,9,10,11],"30":[0,1,3,11],"300":[0,1,9],"305":0,"306":0,"312":0,"32":[0,6],"3227":8,"3269206":[],"3269287":[],"33":[0,5,8],"331":0,"333":0,"335":0,"337":0,"34":[0,3,10,11],"341":0,"346":1,"347":0,"350":0,"351":0,"357":1,"359":0,"361":0,"366":1,"372":0,"373":0,"376132":0,"3765":0,"3813":0,"3821":[0,10],"383e":1,"387e":1,"392":0,"394":0,"399":0,"3f":[1,6],"3rd":0,"4":[0,1,3,4,5,8,11],"40":[0,3,4,11],"404333":0,"407":0,"41":[3,11],"412":0,"412e":1,"413":0,"414":0,"417":0,"41734":4,"42":[1,8],"421":0,"4259":0,"426e":1,"427":0,"430":0,"434":0,"435":1,"43676":4,"437":0,"44":0,"4403":10,"446":0,"45":[3,5,10,11],"452":0,"459":1,"4601":0,"461":0,"463":0,"465":0,"466":0,"470":0,"48":[3,11],"481":0,"48135":4,"486":0,"4898":0,"492":0,"496":0,"4960":1,"497":0,"5":[0,1,3,4,5,8,9,10,11],"50":[0,5,8,11],"500":[0,1,4,5,11],"5000":[1,5],"5005":4,"507":0,"508":0,"512":[9,11],"514":0,"515e":1,"530":0,"534":0,"535":0,"535e":1,"5379":4,"539":0,"541":1,"546":0,"5473":0,"54it":4,"55":5,"55it":4,"565":1,"569":0,"57":0,"573":0,"578":1,"583":0,"591":[3,11],"5f":4,"5fcv":11,"5fcvx2":10,"6":[0,1,3,5,8,10,11],"60":0,"600":1,"601":0,"604":[3,11],"606":0,"625":0,"627":0,"633e":1,"634":1,"64":[9,11],"640":0,"641":0,"650":0,"653":0,"654":1,"66":[1,11],"665":0,"667":0,"669":0,"67":[5,8],"683":0,"688":0,"691":0,"694582":0,"7":[1,5,8,9],"70":0,"700":0,"701e":1,"711":0,"717":1,"725":1,"730":0,"735":0,"740e":1,"748":0,"75":[0,5,8],"762":0,"774":0,"778":0,"787":0,"794":0,"798":0,"8":[0,1,5,10,11],"8000":0,"830":0,"837":1,"858":1,"861":0,"87":[0,3,10,11],"8788":0,"889504":0,"8d2fhsgcvn0aaaaa":[],"9":[0,1,3,5,8,11],"90":[5,8],"901":0,"909":1,"914":1,"917":0,"919":[0,10],"922":0,"923":0,"935":0,"936":0,"937":[0,10],"945":1,"95":[8,10],"9533":0,"958":0,"97":0,"979":0,"982":0,"99":8,"abstract":[3,9,10,11],"boolean
":[8,10],"case":[0,1,3,4,5,8,10,11],"class":[0,1,3,4,5,6,8,9,10,11],"d\u00edez":[3,11],"default":[1,3,8,9,10],"do":[0,1,3,4,8,9,10],"final":[1,3,5],"float":[0,3,8,9,10,11],"function":[0,1,3,4,5,6,7,9,10,11],"g\u00e1llego":[0,3,10,11],"gonz\u00e1lez":[3,11],"import":[0,1,3,4,5,6,10],"int":[0,5,8,10,11],"long":[4,9],"new":[0,3,10,11],"p\u00e9rez":[0,3,10,11],"return":[0,1,3,4,5,8,9,10,11],"rodr\u0131":[3,11],"short":9,"static":[3,11],"true":[0,1,3,4,5,6,8,9,10,11],"try":4,"while":[3,5,8,9,10,11],A:[0,3,8,9,10,11],As:[3,4],By:[1,3,8],For:[0,1,5,6,8,10,11],If:[3,5,8,10,11],In:[0,1,2,3,4,5,6,9,11],It:[3,4,5,8],One:[0,1,3,11],That:[1,4],The:[0,1,2,4,5,6,8,9,10,11],Then:3,These:0,To:[5,10],_:[5,8,10],__:[],__class__:5,__name__:5,_adjust:[],_ae_:[],_classify_:11,_error_name_:11,_fit_learner_:11,_kld_:[],_labelledcollection_:11,_learner_:11,_mean:[],_min_df_:[],_my:[],_nkld_:[],_posterior_probabilities_:11,_q_:[],_rae_:[],_svmperf_:[],ab:[],aboud:3,about:[0,5,8,10],abov:[0,3,5,8],absolut:[1,3,5,6,8],absolute_error:8,abstractmethod:3,acc:[1,3,5,6,8,11],acc_error:8,accept:3,access:[0,3,10],accommod:0,accord:[1,3,4,8,9,10],accordingli:5,accuraci:[1,5,8],accuracy_polici:11,achiev:[1,3,4,5],acm:[0,3,10,11],across:[0,1,4,5,6,8],action:[0,11],actual:10,acut:0,ad:6,adapt:8,add:[3,4,8,10],add_word:10,addit:3,addition:[0,11],adjust:[3,6,8,11],adjusted_quantif:8,adjustedclassifyandcount:11,adopt:[3,4,10],advanc:[0,6],advantag:3,ae:[1,2,5,8],ae_:1,affect:8,after:8,afterward:11,again:5,against:5,aggreg:[1,4,5,6,7,8],aggregativeprobabilisticquantifi:[3,11],aggregativequantifi:[3,11],aggregg:11,aim:[4,5],aka:10,al:[0,2,9,10],alaiz:[3,11],alegr:[3,11],alejandro:4,algorithm:8,alia:[3,8,11],all:[0,1,2,3,5,8,10,11],allia:3,alloc:[8,9],allow:[0,1,2,3,5,8,9,10,11],almost:3,along:[0,3,8,11],alreadi:[3,11],also:[0,1,2,3,5,6,8,9],altern:4,although:[3,4,5,11],alwai:[3,4,5],among:3,amount:8,an:[0,1,2,3,4,5,6,8,9,10,11],analys:[5,6],analysi:[0,3,6,10,11],analyz:5,ani:[0,1,3,4,5,6,8,9,10,11],anoth:[0,1
,3,5],anotherdir:8,anyon:0,api:6,app:[8,10],appeal:1,appear:5,append:5,appli:[2,3,4,5,8,9,10],appropri:4,approxim:[1,5,8,9,10,11],ar:[0,1,3,4,5,8,9,10,11],archive_filenam:8,archive_path:[],arg:[8,10,11],argmax:8,args_i:8,argu:4,argument:[0,1,3,5,8,10],arifici:[],aris:1,around:[1,10],arrai:[1,3,5,8,9,10],articl:[3,4,11],artifici:[0,1,3,4,5,6,8,10],artificial_prevalence_predict:8,artificial_prevalence_protocol:8,artificial_prevalence_report:8,artificial_prevalence_sampl:8,artificial_sampling_ev:[1,4],artificial_sampling_gener:[0,10],artificial_sampling_index_gener:10,artificial_sampling_predict:[1,5],artificial_sampling_report:1,arxiv:4,asarrai:1,asdataload:9,asonam:0,assert:10,assess:4,assign:[3,8,10],associ:[8,10],assum:[1,6,11],assumpt:[1,5,6],astyp:[],attempt:3,attribut:11,august:0,autom:[0,3,6],automat:[0,1],av:[3,11],avail:[0,1,2,3,5,6,9],averag:[1,3,8,10],avoid:[1,8],axi:[5,8],b:[0,10],balanc:[0,4],band:[5,8],bar:8,barranquero:[2,3,9,11],base:[0,3,6,7,8,9],base_classifi:5,base_estim:3,base_quantifier_class:11,baseestim:[9,11],baselin:6,basequantifi:[3,8,11],basic:[5,11],batch:9,batch_siz:9,batch_size_test:9,been:[0,3,4,5,8,10,11],befor:[3,8,9,10,11],beforehand:8,behav:[3,5],being:[4,8],belief:1,belong:3,below:[0,2,3,5,8,10],best:[4,8,9,11],best_epoch:8,best_model:8,best_model_:4,best_params_:4,best_scor:8,better:4,between:[4,5,6,8,9],beyond:5,bia:[6,8],bias:5,bidirect:11,bin:[5,8,11],bin_bia:5,bin_diag:5,binar:[8,10],binari:[3,5,6,8,9,10,11],binary_bias_bin:[5,8],binary_bias_glob:[5,8],binary_diagon:[5,8],binary_quantifi:11,binaryquantifi:11,binom:8,block:[0,8],bool:[8,11],both:5,bound:8,box:[5,8],breast:0,brief:1,broken:[5,8],brokenbar_supremacy_by_drift:8,budg:1,budget:[1,4],build:11,bypass:11,c:[3,4,8,9,10,11],calcul:8,calibr:3,calibratedclassifi:3,calibratedclassifiercv:3,calibratedcv:11,call:[0,1,5,8,10,11],callabl:[0,8,10],can:[0,1,2,3,4,5,8,10],cancer:0,cannot:11,cardiotocographi:0,care:11,carri:[3,10],casa_token:[],castano:[3,10,11],castro:[3,11],catego
r:[3,10],categori:[1,8],cc:[3,5,11],ceil:8,center:5,chang:[0,1,3,10,11],character:[3,6],characteriz:[0,3,10,11],charg:[0,8,10],chart:8,check:[3,4],checkpoint:[9,11],checkpointdir:11,checkpointnam:11,checkpointpath:9,choic:4,chosen:[4,8],cl:0,cla:[],class2int:10,class_weight:4,classes_:[8,10,11],classif:[0,1,3,7,8,10,11],classif_posterior:[3,11],classif_predict:[3,11],classif_predictions_bin:11,classifi:[1,4,5,6,8,9,11],classifier_net:9,classifiermixin:9,classifyandcount:[3,11],classmethod:[0,10,11],classnam:10,classs:8,clean_checkpoint:11,clean_checkpoint_dir:11,clear:5,clearer:1,clearli:5,clip:8,close:[1,10],closer:1,cm:8,cmc:0,cnn:3,cnnnet:[3,9],code:[0,3,4,5,9],codifi:10,coincid:[0,6],col:[0,10],collect:[0,8,9,10],collet:10,color:[5,8],colormap:8,column:[0,8,10],com:8,combin:[0,1,4,8,10],combinatio:8,combinations_budget:8,come:[0,8,10],commandlin:[],common:11,commonli:6,compar:[5,8,11],comparison:5,compil:[2,3],complet:[3,5],compon:[8,9],compress:0,comput:[1,3,5,8,11],computation:4,compute_fpr:11,compute_t:11,compute_tpr:11,concept:6,concur:11,condit:8,conduct:[0,8],confer:[0,3,10],confid:8,configur:[4,8],conform:10,consecut:[8,9],consid:[3,5,8,9,10],consist:[0,4,5,8,9,10],constrain:[1,5,8,10],constructor:3,consult:[0,1],contain:[1,2,3,5,8,9,10,11],contanin:8,content:7,context:8,contrast:1,control:[1,4,10],conv_block:[],conv_lay:[],conveni:8,convert:[1,3,8,9,10],convolut:9,copi:[8,10],cornel:[],correct:11,correctli:8,correspond:[5,8,10],cost:1,costli:4,could:[0,1,3,4,5,6,11],count:[4,5,6,8,10,11],count_:[],counter:10,countvector:10,covari:10,cover:[1,4,9],coz:[0,3,10,11],cpu:[1,9],creat:[0,6,8],create_if_not_exist:8,create_parent_dir:8,crisp:[3,8],criteria:4,cross:[3,10,11],cs:8,csr:10,csr_matrix:10,csv:10,ctg:0,cuda:[3,9,11],cumbersom:1,cumberson:8,curios:5,current:[3,8,9,10],custom:[3,6,8,10],customarili:[3,4],cv:[3,4],cyan:5,d_:8,dat:[0,9],data:[1,3,4,5,6,7,8,9,11],data_hom:10,datafram:[1,8],dataload:9,dataset:[1,3,4,5,6,7,8,9,11],dataset_nam:10,deal:0,decaest
eck:[3,11],decai:9,decid:10,decim:1,decis:[3,8,9],decision_funct:9,decomposit:9,dedic:[1,10],deep:[3,8,11],def:[0,1,3,5,8],defin:[0,3,8,9,10,11],degre:4,del:[0,3,10,11],delai:8,deliv:3,dens:0,densiti:8,depend:[0,1,4,5,8],describ:[3,8,11],descript:0,design:4,desir:[0,1,10],despit:1,destin:8,detail:[0,1,3,6,9,10,11],determin:[1,4,5],detriment:5,devel:10,develop:[4,6],deviat:[0,1,5,8,10],devic:[0,3,5,9,11],df:1,df_replac:[],diabet:0,diagon:[6,8],dict:[8,10,11],dictionari:[8,9,10],differ:[0,1,3,4,5,6,8,10],difficult:5,digit:0,dimens:[8,9,10],dimension:[8,9,10],dir:8,directli:[0,1,3],directori:[2,8,9,10],discard:8,discoveri:[3,11],discret:8,discuss:5,disjoint:9,disk:8,displai:[1,5,8],displaystyl:8,distanc:[8,11],distant:[1,8],distribut:[0,3,5,8,10,11],diverg:[1,3,8],divid:8,dl:[],doabl:0,doc_embed:11,doc_embedding_s:11,doc_posterior:11,document:[0,1,3,5,9,10,11],document_embed:9,doe:[0,2,3,8],doi:[],done:3,dot:[5,8],dowload:8,down:[5,8,10],download:[0,2,3,8],download_fil:8,download_file_if_not_exist:8,draw:[8,10],drawn:[0,1,4,8,10],drift:6,drop:[9,11],drop_p:9,dropout:9,ds:[3,11],ds_polici:11,ds_policy_get_posterior:11,dtype:[1,10],dump:10,dure:[1,5],dynam:[3,9,10,11],e:[0,1,3,4,5,6,8,9,10,11],eacc:11,each:[0,1,3,4,5,8,9,10,11],earli:[8,9],early_stop:11,earlystop:8,easili:[0,2,5,9],ecc:11,edu:[],eemq:11,effect:3,effici:3,ehdi:11,either:[1,3,8,10,11],element:[3,10],elm:[3,11],em:11,emb:9,embed:[3,9],embed_s:9,embedding_s:9,empti:10,emq:[5,11],enabl:9,encod:10,end:[4,8],endeavour:6,enough:5,ensembl:[0,6,10,11],ensemblefactori:11,ensure_probabilist:11,entir:[0,3,4,5,8],environ:[1,3,4,5,8],ep:[1,8],epacc:11,epoch:[8,9,11],epsilon:[1,8,11],equal:[1,8],equidist:[0,8],equip:[3,5],err:[],err_drift:5,err_nam:8,error:[3,4,6,7,9],error_:[],error_by_drift:[5,8],error_funct:1,error_metr:[1,4,8],error_nam:[5,8,11],especi:8,establish:8,estim:[1,3,5,6,8,9,10,11],estim_prev:[1,5,8],estim_preval:[3,6],esuli:[0,2,3,9,10,11],et:[0,2,9,10],etc:6,eval_budget:[4,8],evalu:[0,3,4,5,6,7,9,10],eve
n:8,eventu:[9,10],everi:[3,11],everyth:3,evinc:5,ex:[],exact:[0,10],exactli:0,exampl:[0,1,3,4,5,8,9,10,11],exce:8,excel:0,except:[3,8],exemplifi:0,exhaust:8,exhibit:[4,5],exist:8,exist_ok:8,expand_frame_repr:1,expect:6,expectationmaximizationquantifi:[3,11],experi:[1,2,3,4,5,8],explain:[1,5],explicitlossminim:11,explicitlossminimis:11,explor:[4,8,10],express:10,ext:2,extend:[2,3,11],extens:[0,2,5],extern:3,extract:[1,8,10],f1:[1,8,9],f1_error:8,f1e:[1,8],f:[0,1,3,4,5,6,10,11],f_1:8,fabrizio:4,facilit:6,fact:[3,5],factor:8,fals:[1,3,5,8,9,10,11],famili:3,familiar:3,far:[8,9,10],fare:8,fast:8,faster:[0,10],feat1:10,feat2:10,featn:10,featur:[0,10],feature_extract:10,fetch:[0,6],fetch_review:[0,1,3,4,5,10],fetch_twitt:[0,3,6,10],fetch_ucidataset:[0,3,10],fetch_ucilabelledcollect:[0,10],ff_layer:11,fhe:0,file:[0,5,8,9,10],filenam:8,fin:0,find:[0,4],finish:4,first:[0,1,2,3,5,8,10,11],fit:[1,3,4,5,6,8,9,10,11],fit_learn:[3,11],fit_transform:10,fix:[1,4],flag:8,float64:1,fn:8,fold:[3,10,11],folder:0,follow:[0,1,3,4,5,6,8],fomart:10,for_model_select:[0,10],form:[0,8,10],format:[0,5,10],former:[2,11],forward:[9,11],found:[0,3,4,8,9,10],four:3,fp:[8,11],fpr:8,frac:8,framework:6,frequenc:[0,10],from:[0,1,3,4,5,6,8,10,11],from_csv:10,from_nam:[1,8],from_spars:10,from_text:10,full:[1,8],fulli:0,func:8,further:[0,1,3,9,10],fusion:[0,3,10,11],futur:3,g:[0,1,3,4,6,8,10,11],gain:8,gao:[0,3,10,11],gap:10,gasp:[0,10],gen:8,gen_data:5,gen_fn:8,gen_prevalence_predict:8,gen_prevalence_report:8,gener:[0,1,3,4,5,8,9,10,11],generation_func:8,german:0,get:[0,1,5,8,9,10],get_aggregative_estim:11,get_nprevpoints_approxim:[1,8],get_param:[3,8,9,11],get_probability_distribut:11,get_quapy_hom:8,ggener:8,github:[],given:[1,3,4,8,9,10,11],global:8,goe:4,good:[4,5],got:4,govern:1,gpu:9,grant:11,greater:10,grid:[4,8,10,11],gridsearchcv:4,gridsearchq:[4,8],group:3,guarante:[10,11],guez:[3,11],gzip:0,ha:[3,4,5,8,9,10],haberman:[0,3],had:10,handl:0,happen:[4,5],hard:3,harder:5,harmon:8,harri:0,hat:8,have
:[0,1,2,3,4,5,8,10,11],hcr:[0,3,10],hd:8,hdy:[6,11],held:[3,4,8,9],helling:11,hellingerdist:8,hellingerdistancei:[3,11],hellingh:8,help:5,henc:[8,10],here:1,hidden:[5,9],hidden_s:9,hide:5,high:[5,8],higher:[1,5],highlight:8,hightlight:8,hlt:[],hold:[6,8],home:[8,10],hook:11,how:[0,1,3,4,5,8,10,11],howev:[0,4,5,11],hp:[0,3,4,10],html:10,http:[8,10],hyper:[4,8,9],hyperparam:4,hyperparamet:[3,8,11],i:[0,1,3,4,5,8,9,10,11],id:[0,3,10],identifi:8,idf:0,ieee:0,ignor:[8,10,11],ii:8,iid:[1,5,6],illustr:[3,4,5],imdb:[0,5,10],implement:[0,1,3,4,5,6,8,9,10,11],implicit:8,impos:[4,8],improv:[3,8,9],includ:[0,1,3,5,6,10],inconveni:8,inde:[3,4],independ:8,index:[0,3,6,8,9,10],indextransform:10,indic:[0,1,3,4,5,8,10,11],individu:[1,3],infer:[0,10],inform:[0,1,3,4,8,10,11],infrequ:10,inherit:3,init:3,init_hidden:11,initi:[0,9],inplac:[1,3,10],input:[3,5,8,9],insight:5,inspir:3,instal:[0,3,6,9],instanc:[0,3,4,5,6,8,9,10,11],instanti:[0,1,3,4,9],instead:[1,3,4,11],integ:[3,8,9,10],integr:6,interest:[1,5,6,8,10],interestingli:5,interfac:[0,1],intern:[0,3,10],interpret:[5,6],interv:[1,5,8,10],introduc:1,invok:[0,1,3,8,10],involv:[2,5,8],io:[],ionospher:0,iri:0,irrespect:5,isaggreg:11,isbinari:[8,10,11],isomer:8,isometr:[5,8],isprobabilist:11,isti:[],item:8,iter:[0,8,11],its:[3,4,8,9],itself:[3,8,11],j:[0,3,10,11],joachim:[3,9],job:[2,8],joblib:2,join:8,just:[1,3],k:[3,6,8,10,11],keep:8,kei:[8,10],kept:10,kernel:9,kernel_height:9,keyword:10,kfcv:[0,10,11],kindl:[0,1,3,5,10],kl:8,kld:[1,2,8,9],know:3,knowledg:[0,3,10,11],known:[0,3,4],kraemer:8,kullback:[1,3,8],kwarg:[9,10,11],l1:[8,11],label:[0,3,4,5,6,8,9,10,11],labelledcollect:[0,3,4,8,10,11],larg:4,larger:10,largest:8,last:[1,3,5,8,9,10],lastli:3,latex:5,latinn:[3,11],latter:11,layer:[3,9],lead:[1,10],learn:[1,2,3,4,6,8,9,10,11],learner:[3,4,9,11],least:[0,10],leav:10,left:10,legend:8,leibler:[1,3,8],len:8,length:[9,10],less:[8,10],let:[1,3],level:11,leverag:3,leyend:8,like:[0,1,3,5,8,9,10],limit:[5,8,10],line:[1,3,8],linear:5,linear
_model:[1,3,4,6,9],linearsvc:[3,5,10],linspac:5,list:[0,5,8,9,10],listedcolormap:8,literatur:[0,1,4,6],load:[0,3,8,10],loader:[0,10],loader_func:[0,10],loader_kwarg:10,local:8,log:[8,10],logist:[1,3,9,11],logisticregress:[1,3,4,6,9],logscal:8,logspac:4,longer:8,longest:9,look:[0,1,3,5],loss:[6,9,11],low:[5,8,9],lower:[5,8],lower_is_bett:8,lowest:5,lowranklogisticregress:9,lr:[1,3,9,11],lstm:[3,9],lstm_class_nlay:9,lstm_hidden_s:11,lstm_nlayer:11,lstmnet:9,m:[3,8,11],machin:[1,4,6],macro:8,made:[0,2,8,10,11],mae:[1,4,6,8,9,11],mae_loss:11,mai:8,main:5,maintain:[3,11],make:[0,1,3],makedir:8,mammograph:0,manag:[0,3,10],mani:[1,3,4,5,6,8,10,11],manner:0,manual:0,map:[1,9],map_parallel:8,margin:9,mass:8,math:[],mathcal:8,matplotlib:[2,8],matric:[0,5,10],matrix:[5,8],max:11,max_it:11,max_sample_s:11,maxim:6,maximum:[1,8,9],maximumlikelihoodprevalenceestim:11,md:[],mean:[0,1,3,4,5,6,8,9,10,11],mean_absolute_error:8,mean_relative_absolute_error:8,measur:[2,3,4,5,6,8,11],mediansweep2:11,mediansweep:11,member:3,memori:9,mention:3,merg:5,met:10,meta:[6,7,8],meth:[],method:[0,1,4,5,6,7,8],method_data:5,method_nam:[5,8],method_ord:8,metric:[1,3,4,6,8],might:[1,8,10],min_df:[1,3,4,5,10],min_po:11,mine:[0,3,11],minim:8,minimum:10,minimun:10,mining6:10,minu:8,miss:8,mixtur:3,mkld:[1,8,11],ml:10,mnkld:[1,8,11],mock:[8,9],modal:4,model:[0,1,5,6,8,9,11],model_select:[4,7],modifi:[3,8],modul:[0,1,3,5,6,7],moment:[0,3],monitor:8,more:[3,5,8],moreo:[0,3,4,10],most:[0,3,5,6,8,10,11],movi:0,mrae:[1,6,8,9,11],ms2:11,ms:11,mse:[1,3,6,8,11],msg:11,multiclass:8,multipli:8,multiprocess:8,multivari:[3,9,11],must:[3,10],my:[],my_arrai:8,my_collect:10,my_custom_load:0,my_data:0,mycustomloss:3,n:[0,1,8,9],n_bin:[5,8],n_class:[1,3,8,9,10,11],n_compon:9,n_dimens:9,n_epoch:11,n_featur:9,n_instanc:[8,9],n_job:[1,3,4,8,10,11],n_preval:[0,8,10],n_prevpoint:[1,4,5,8],n_repeat:[1,8],n_repetit:[1,4,5,8],n_sampl:[8,9],name:[5,8,9,10],nativ:6,natur:[1,8,10],natural_prevalence_predict:8,natural_prevalence_prot
ocol:8,natural_prevalence_report:8,natural_sampling_gener:10,natural_sampling_index_gener:10,nbin:[5,8],ndarrai:[1,3,8,10,11],necessarili:11,need:[0,3,8,10,11],neg:[0,5,8],nest:[],net:9,network:[0,8,9,10,11],neural:[0,7,8,10],neuralclassifiertrain:[3,9],neutral:0,next:[4,8,9,10],nfold:[0,10],nkld:[1,2,6,8,9],nn:[9,11],nogap:10,non:[3,11],non_aggreg:[7,8],none:[1,4,8,9,10,11],nonetheless:4,nor:3,normal:[0,1,3,8,10,11],normalize_preval:8,note:[1,3,4,5,8,10],now:5,nowadai:3,np:[1,3,4,5,8,10],npp:[8,10],nprevpoint:[],nrepeat:[0,10],num_prevalence_combin:[1,8],number:[0,1,3,5,8,9,10,11],numer:[0,1,3,6,10],numpi:[2,4,8,9,11],o_l6x_pcf09mdetq4tu7jk98mxfbgsxp9zso14jkuiyudgfg0:[],object:[0,8,9,10,11],observ:1,obtain:[1,4,8],obtaind:8,obvious:8,occur:[5,10],occurr:10,octob:[0,3],off:9,offer:[3,6],older:2,omd:[0,10],ommit:[1,8],onc:[1,3,5,8],one:[0,1,3,4,5,8,10,11],ones:[1,3,5,8,10],onevsal:[3,11],onli:[0,3,5,8,9,10,11],open:[0,6,10],oper:3,opt:4,optim:[2,3,4,8,9,11],optimize_threshold:11,option:[0,1,3,5,8,10,11],order:[0,2,3,5,8,10,11],order_bi:11,org:10,orient:[3,6,8,11],origin:[0,3,10,11],os:[0,8],other:[1,3,5,6,8,10],otherwis:[0,3,8,10,11],our:[],out:[3,4,5,8,9,10],outcom:5,outer:8,outlier:8,output:[0,1,3,4,8,9,10,11],over:[3,4,8],overal:1,overestim:5,overrid:3,overridden:[3,11],own:4,p:[0,3,8,10,11],p_hat:8,p_i:8,pacc:[1,3,5,8,11],packag:[0,2,3,6,7],pad:[9,10],pad_length:9,padding_length:9,page:[0,2,6],pageblock:0,pair:[0,8],panda:[1,2,8],paper:[0,3,11],parallel:[1,3,8,10],param:[4,9,11],param_grid:[4,8,11],param_mod_sel:11,param_model_sel:11,paramet:[1,3,4,8,9,10,11],parent:8,part:[3,10],particular:[0,1,3],particularli:1,pass:[0,1,5,8,9,11],past:1,patch:[2,3,9],path:[0,3,5,8,9,10],patienc:[8,9,11],pattern:[3,11],pca:[],pcalr:[],pcc:[3,4,5,11],pd:1,pdf:5,peopl:[],percentil:8,perf:[6,9],perform:[1,3,4,5,6,8,9,11],perman:8,phonem:0,pick:4,pickl:[3,8,10],pickle_path:8,pickled_resourc:8,pii:[],pip:2,pipelin:[],pkl:8,plai:0,plan:3,pleas:3,plot:[6,7],png:5,point:[0,1,3,8,10],po
lici:[3,11],popular:6,portion:4,pos_class:[8,10],posit:[0,3,5,8,10],possibl:[1,3,8],post:8,posterior:[3,8,9,11],posterior_prob:[3,11],postpon:3,potter:0,pp:[0,3],pprox:[],practic:[0,4],pre:[0,3],prec:[0,8],preced:10,precis:[0,1,8],preclassifi:3,predefin:10,predict:[3,4,5,8,9,11],predict_proba:[3,9,11],predictor:1,prefer:8,prepare_svmperf:[2,3],preprint:4,preprocess:[0,1,3,7,8],present:[0,3,10],preserv:[1,5,8,10],pretti:5,prev:[0,1,8,10],prevail:3,preval:[0,1,3,4,5,6,8,10,11],prevalence_estim:8,prevalence_from_label:8,prevalence_from_prob:8,prevalence_linspac:8,prevel:11,previou:3,previous:11,prevs_estim:11,prevs_hat:[1,8],princip:9,print:[0,1,3,4,6,9,10],prior:[1,3,4,5,6,8],priori:[3,11],probabilist:[3,11],probabilisticadjustedclassifyandcount:11,probabilisticclassifyandcount:11,probabl:[1,3,4,5,6,8,9,11],problem:[0,3,5,8,10,11],procedur:[3,6,11],proceed:[0,3,10],process:[3,4,8],processor:3,procol:1,produc:[0,1,5,8],product:3,progress:[8,10],properli:0,properti:[3,8,9,10,11],proport:[3,4,8,9,10,11],propos:[2,3,11],protocl:8,protocol:[0,3,4,5,6,8,10],provid:[0,3,5,6],ptecondestim:11,ptr:[3,11],ptr_polici:11,purpos:[0,11],python:[0,6],pytorch:2,q:[0,2,3,8,9],q_i:8,qacc:9,qdrop_p:11,qf1:9,qgm:9,qp:[0,1,3,4,5,6,8,10],quanet:[2,6,9,11],quanetmodul:11,quanettrain:11,quantif:[0,1,6,8,9,10,11],quantifi:[3,4,5,6,8,11],quantification_error:8,quantiti:8,quapi:[0,1,2,3,4,5],quapy_data:0,quay_data:10,question:8,quevedo:[0,3,10,11],quick:[],quit:8,r:[0,3,8,10,11],rac:[],rae:[1,2,8],rais:[3,8],rand:8,random:[1,3,4,5,8,10],random_se:[1,8],random_st:10,randomli:0,rang:[0,5,8],rank:[3,9],rare:10,rate:[3,8,9],rather:[1,4],raw:10,rb:0,re:[3,4,10],read:10,reader:[7,8],readm:[],real:[8,9,10],reason:[3,5,6],recal:8,receiv:[0,3,5],recip:11,recognit:[3,11],recommend:[1,5],recurr:[0,3,10],red:0,red_siz:[3,11],reduc:[0,10],reduce_column:[0,10],refer:[9,10],refit:[4,8],regard:4,regardless:10,regim:8,region:8,regist:11,regress:9,regressor:[1,3,11],reindex_label:10,reiniti:9,rel:[1,3,8,10],relat
ive_absolute_error:8,reli:[1,3],reliabl:[3,11],rememb:5,remov:10,repeat:[8,10],repetit:8,repl:[],replac:[0,3,10],replic:[1,4,8],report:[1,8],repositori:[0,10],repr_siz:9,repres:[1,3,5,8,10,11],represent:[0,3,8,9],reproduc:10,request:[0,8,10,11],requir:[0,1,3,6,9],reset_net_param:9,resourc:8,respect:[0,1,5,8,11],respond:3,rest:[8,10,11],result:[1,2,3,4,5,6,8,11],retain:[0,3,9],retrain:4,return_constrained_dim:8,reus:[0,3,8],review:[5,6,10],reviews_sentiment_dataset:[0,10],rewrit:5,right:[4,8,10],role:0,root:6,roughli:0,round:10,routin:[8,10],row:[8,10],run:[0,1,2,3,4,5,8,10,11],s003132031400291x:[],s:[0,1,3,4,5,8,9,10],saeren:[3,11],sai:11,said:3,same:[0,3,5,8,10],sampl:[0,1,3,4,5,6,8,9,10,11],sample_s:[0,1,3,4,5,8,10,11],sampling_from_index:[0,10],sampling_index:[0,10],sander:[0,10],save:[5,8],save_or_show:[],save_text_fil:8,savepath:[5,8],scale:8,scall:10,scenario:[1,3,4,5,6],scienc:[3,11],sciencedirect:[],scikit:[2,3,4,10],scipi:[2,10],score:[0,1,4,8,9,10],script:[1,2,3,6],se:[1,8],search:[3,4,6,8,11],sebastiani:[0,3,4,10,11],second:[0,1,3,5,8,10],secondari:8,section:4,see:[0,1,2,3,4,5,6,8,9,10],seed:[1,4,8],seem:3,seemingli:5,seen:[5,8],select:[0,3,6,8,10,11],selector:3,self:[3,8,9,10,11],semeion:0,semev:0,semeval13:[0,10],semeval14:[0,10],semeval15:[0,10],semeval16:[0,6,10],sentenc:10,sentiment:[3,6,10,11],separ:[8,10],sequenc:8,seri:0,serv:3,set:[0,1,3,4,5,6,8,9,10,11],set_opt:1,set_param:[3,8,9,11],set_siz:[],sever:0,sh:[2,3],shape:[5,8,9,10],share:[0,10],shift:[1,4,6,8],shorter:9,shoud:3,should:[0,1,3,4,5,6,9,10,11],show:[0,1,3,4,5,8,9,10],show_dens:8,show_std:[5,8],showcas:5,shown:[1,5,8],shuffl:[9,10],side:8,sign:8,signific:1,significantli:8,silent:[8,11],simeq:[],similar:[8,11],simpl:[0,3,5,11],simplest:3,simplex:[0,8],simpli:[1,2,3,4,5,6,8,11],sinc:[0,1,3,5,8,10,11],singl:[1,3,6,11],size:[0,1,3,8,9,10,11],sklearn:[1,3,4,5,6,9,10,11],sld:3,slice:8,smooth:[1,8],smooth_limits_epsilon:8,so:[0,1,3,5,8,9,10,11],social:[0,3,10,11],soft:3,softwar:0,solid:5,solut:
8,solv:4,solve_adjust:11,some:[0,1,3,5,8,10],some_arrai:8,sometim:1,sonar:0,sourc:[2,3,6,9],sout:11,space:[0,4,8,9],spambas:0,spars:[0,10],special:[0,5,10],specif:[3,4],specifi:[0,1,3,5,8,9,10,11],spectf:0,spectrum:[0,1,4,5,8],speed:3,split:[0,3,4,5,8,9,10,11],split_stratifi:10,splitstratifi:10,spmatrix:10,sqrt:8,squar:[1,3,8],sst:[0,10],stabil:1,stabl:10,stackexchang:8,stand:8,standard:[0,1,5,8,10],star:8,start:4,stat:10,state:8,statist:[0,1,8,11],stats_siz:11,std:9,stdout:8,step:[5,8],stop:[8,9],store:[0,9,10],str:[0,8,10],strategi:[3,4],stratif:10,stratifi:[0,3,10],stride:9,string:[1,8,10],strongli:[4,5],strprev:[0,1,8],structur:3,studi:[0,3,10,11],style:10,subclass:11,subdir:8,subinterv:5,sublinear_tf:10,submit:0,submodul:7,subobject:[],suboptim:4,subpackag:7,subsequ:[10,11],subtract:[0,8,10],subtyp:10,suffic:5,suffici:11,sum:[8,11],sum_:8,summar:0,supervis:[4,6],support:[3,6,9,10],surfac:10,surpass:1,svm:[3,5,6,9,10],svm_light:[],svm_perf:[],svm_perf_classifi:9,svm_perf_learn:9,svm_perf_quantif:[2,3],svmae:[3,11],svmkld:[3,11],svmnkld:[3,11],svmperf:[2,3,7,8],svmperf_bas:[9,11],svmperf_hom:3,svmq:[3,11],svmrae:[3,11],syntax:5,system:4,t50:11,t:[0,1,3,8],tab10:8,tail:8,tail_density_threshold:8,take:[0,3,5,8,10,11],taken:[3,8,9,10],target:[3,5,6,8,9,11],task:[3,4,10,11],te:[8,10],temp_se:8,tempor:8,tend:5,tendenc:5,tensor:9,term:[0,1,3,4,5,6,8,9,10,11],test:[0,1,3,4,5,6,8,9,10,11],test_bas:[],test_dataset:[],test_method:[],test_path:[0,10],test_sampl:8,test_split:10,text2tfidf:[0,1,3,10],text:[0,3,8,9,10,11],textclassifiernet:9,textual:[0,6,10],tf:[0,10],tfidf:[0,4,5,10],tfidfvector:10,than:[1,4,5,8,9,10],thei:[0,3],them:[0,3,11],theoret:4,thereaft:1,therefor:[8,10],thi:[0,1,2,3,4,5,6,8,9,10,11],thing:3,third:[1,5],thorsten:9,those:[1,3,4,5,8,9],though:[3,8],three:[0,5],threshold:8,thresholdoptim:11,through:[3,8],thu:[3,4,5,8,11],tictacto:0,time:[0,1,3,8,10],timeout:8,timeouterror:8,timer:8,titl:8,tj:[],tn:[8,11],token:[0,9,10],tool:[1,6],top:[3,8,11],torch:[3,9,
11],torchdataset:9,total:8,toward:[5,10],tp:[8,11],tpr:8,tqdm:2,tr:10,tr_iter_per_poch:11,tr_prev:[5,8,11],track:8,trade:9,tradition:1,train:[0,1,3,4,5,6,8,9,10,11],train_path:[0,10],train_prev:[5,8],train_prop:10,train_siz:10,train_val_split:11,trainer:9,training_help:11,training_preval:5,training_s:5,transact:[3,11],transform:[0,9,10],transfus:0,trivial:3,true_prev:[1,5,8],true_preval:6,truncatedsvd:9,ttest_alpha:8,tupl:[8,10],turn:4,tweet:[0,3,10,11],twitter:[6,10],twitter_sentiment_datasets_test:[0,10],twitter_sentiment_datasets_train:[0,10],two:[0,1,3,4,5,8,10],txt:8,type:[0,3,8,10],typic:[1,4,5,8,9,10],u1:10,uci:[6,10],uci_dataset:10,unabl:0,unadjust:5,unalt:9,unbias:5,uncompress:0,under:1,underestim:5,underlin:8,understand:8,unfortun:5,unifi:0,uniform:[8,10],uniform_prevalence_sampl:8,uniform_sampl:10,uniform_sampling_index:10,uniform_simplex_sampl:8,uniformli:[8,10],union:[8,11],uniqu:10,unit:[0,8],unix:0,unk:10,unknown:10,unless:11,unlik:[1,4],unus:[8,9,11],up:[3,4,8,9,11],updat:[],url:8,us:[0,1,3,4,5,6,8,9,10,11],user:[0,1,5],utf:10,util:[7,9],v:[3,11],va_iter_per_poch:11,val:[0,10],val_split:[3,4,8,9,11],valid:[0,1,3,4,5,8,9,10,11],valid_loss:[3,9],valid_polici:11,valu:[0,1,3,8,9,10,11],variabl:[1,3,5,8,10],varianc:[0,5],variant:[5,6,11],varieti:4,variou:[1,5],vector:[0,8,9,10],verbos:[0,1,4,8,9,10,11],veri:[3,5],versatil:6,version:[2,9],vertic:8,vertical_xtick:8,via:[0,2,3,11],view:5,visual:[5,6],vline:8,vocab_s:9,vocabulari:[9,10],vocabulary_s:[3,9,10],vs:[3,8],w:[0,3,10,11],wa:[0,3,5,8,10,11],wai:[1,11],wait:9,want:[3,4],warn:10,wb:[0,10],wdbc:0,we:[0,1,3,4,5,6],weight:[9,10],weight_decai:9,well:[0,3,4,5],were:0,what:3,whcih:10,when:[0,1,3,4,5,8,9,10],whenev:[5,8],where:[3,5,8,9,10,11],wherebi:4,whether:[8,9,10,11],which:[0,1,3,4,5,8,9,10,11],white:0,whole:[0,1,3,4,8],whose:10,why:3,wide:5,wiki:[0,3],wine:0,within:[8,11],without:[1,3,8,10],word:[1,3,6,9,10],work:[1,3,4,5,10],worker:[1,8,10],wors:[4,5,8],would:[0,1,3,5,6,8,10,11],wrapper:[8,9,10],writte
n:6,www:[],x2:10,x:[5,8,9,10,11],x_error:8,xavier:9,xavier_uniform:9,xlrd:[0,2],xy:10,y:[5,8,9,10,11],y_:11,y_error:8,y_pred:8,y_true:8,ye:[],yeast:[0,10],yield:[5,8,10],yin:[],you:[2,3],your:3,z:[0,10],zero:[0,8],zfthyovrzwxmgfzylqw_y8cagg:[],zip:[0,5]},titles:["Datasets","Evaluation","Installation","Quantification Methods","Model Selection","Plotting","Welcome to QuaPy\u2019s documentation!","quapy","quapy package","quapy.classification package","quapy.data package","quapy.method package"],titleterms:{"function":8,A:6,The:3,ad:0,aggreg:[3,11],base:[10,11],bia:5,classif:[4,9],classifi:3,content:[6,8,9,10,11],count:3,custom:0,data:[0,10],dataset:[0,10],diagon:5,distanc:3,document:6,drift:5,emq:3,ensembl:3,error:[1,5,8],evalu:[1,8],ex:[],exampl:6,expect:3,explicit:3,featur:6,get:[],hdy:3,helling:3,indic:6,instal:2,introduct:6,issu:0,learn:0,loss:[2,3,4],machin:0,maxim:3,measur:1,meta:[3,11],method:[3,9,11],minim:3,model:[3,4],model_select:8,modul:[8,9,10,11],network:3,neural:[3,9,11],non_aggreg:11,orient:[2,4],packag:[8,9,10,11],perf:2,plot:[5,8],preprocess:10,process:0,protocol:1,quanet:3,quantif:[2,3,4,5],quapi:[6,7,8,9,10,11],quick:6,reader:10,readm:[],requir:2,review:0,s:6,select:4,sentiment:0,start:[],submodul:[8,9,10,11],subpackag:8,svm:2,svmperf:9,tabl:6,target:4,test:[],test_bas:[],test_dataset:[],test_method:[],titl:[],twitter:0,uci:0,util:8,variant:3,welcom:6,y:3}})
\ No newline at end of file
+Search.setIndex({docnames:["Datasets","Evaluation","Installation","Methods","Model-Selection","Plotting","index","modules","quapy","quapy.classification","quapy.data","quapy.method"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["Datasets.md","Evaluation.md","Installation.rst","Methods.md","Model-Selection.md","Plotting.md","index.rst","modules.rst","quapy.rst","quapy.classification.rst","quapy.data.rst","quapy.method.rst"],objects:{"":{quapy:[8,0,0,"-"]},"quapy.classification":{methods:[9,0,0,"-"],neural:[9,0,0,"-"],svmperf:[9,0,0,"-"]},"quapy.classification.methods":{LowRankLogisticRegression:[9,1,1,""]},"quapy.classification.methods.LowRankLogisticRegression":{fit:[9,2,1,""],get_params:[9,2,1,""],predict:[9,2,1,""],predict_proba:[9,2,1,""],set_params:[9,2,1,""],transform:[9,2,1,""]},"quapy.classification.neural":{CNNnet:[9,1,1,""],LSTMnet:[9,1,1,""],NeuralClassifierTrainer:[9,1,1,""],TextClassifierNet:[9,1,1,""],TorchDataset:[9,1,1,""]},"quapy.classification.neural.CNNnet":{document_embedding:[9,2,1,""],get_params:[9,2,1,""],vocabulary_size:[9,3,1,""]},"quapy.classification.neural.LSTMnet":{document_embedding:[9,2,1,""],get_params:[9,2,1,""],vocabulary_size:[9,3,1,""]},"quapy.classification.neural.NeuralClassifierTrainer":{device:[9,3,1,""],fit:[9,2,1,""],get_params:[9,2,1,""],predict:[9,2,1,""],predict_proba:[9,2,1,""],reset_net_params:[9,2,1,""],set_params:[9,2,1,""],transform:[9,2,1,""]},"quapy.classification.neural.TextClassifierNet":{dimensions:[9,2,1,""],document_embedding:[9,2,1,""],forward:[9,2,1,""],get_params:[9,2,1,""],predict_proba:[9,2,1,""],vocabulary_size:[9,3,1,""],xavier_uniform:[9,2,1,""]},"quapy.classification.neural.TorchDataset":{asDataloader:[9,2,1,""]},"quapy.classification.svmperf":{SVMperf:[9,1,1,""]},
"quapy.classification.svmperf.SVMperf":{decision_function:[9,2,1,""],fit:[9,2,1,""],predict:[9,2,1,""],set_params:[9,2,1,""],valid_losses:[9,4,1,""]},"quapy.data":{base:[10,0,0,"-"],datasets:[10,0,0,"-"],preprocessing:[10,0,0,"-"],reader:[10,0,0,"-"]},"quapy.data.base":{Dataset:[10,1,1,""],LabelledCollection:[10,1,1,""],isbinary:[10,5,1,""]},"quapy.data.base.Dataset":{SplitStratified:[10,2,1,""],binary:[10,3,1,""],classes_:[10,3,1,""],kFCV:[10,2,1,""],load:[10,2,1,""],n_classes:[10,3,1,""],stats:[10,2,1,""],vocabulary_size:[10,3,1,""]},"quapy.data.base.LabelledCollection":{Xy:[10,3,1,""],artificial_sampling_generator:[10,2,1,""],artificial_sampling_index_generator:[10,2,1,""],binary:[10,3,1,""],counts:[10,2,1,""],kFCV:[10,2,1,""],load:[10,2,1,""],n_classes:[10,3,1,""],natural_sampling_generator:[10,2,1,""],natural_sampling_index_generator:[10,2,1,""],prevalence:[10,2,1,""],sampling:[10,2,1,""],sampling_from_index:[10,2,1,""],sampling_index:[10,2,1,""],split_stratified:[10,2,1,""],stats:[10,2,1,""],uniform_sampling:[10,2,1,""],uniform_sampling_index:[10,2,1,""]},"quapy.data.datasets":{fetch_UCIDataset:[10,5,1,""],fetch_UCILabelledCollection:[10,5,1,""],fetch_reviews:[10,5,1,""],fetch_twitter:[10,5,1,""],warn:[10,5,1,""]},"quapy.data.preprocessing":{IndexTransformer:[10,1,1,""],index:[10,5,1,""],reduce_columns:[10,5,1,""],standardize:[10,5,1,""],text2tfidf:[10,5,1,""]},"quapy.data.preprocessing.IndexTransformer":{add_word:[10,2,1,""],fit:[10,2,1,""],fit_transform:[10,2,1,""],transform:[10,2,1,""],vocabulary_size:[10,2,1,""]},"quapy.data.reader":{binarize:[10,5,1,""],from_csv:[10,5,1,""],from_sparse:[10,5,1,""],from_text:[10,5,1,""],reindex_labels:[10,5,1,""]},"quapy.error":{absolute_error:[8,5,1,""],acc_error:[8,5,1,""],acce:[8,5,1,""],ae:[8,5,1,""],f1_error:[8,5,1,""],f1e:[8,5,1,""],from_name:[8,5,1,""],kld:[8,5,1,""],mae:[8,5,1,""],mean_absolute_error:[8,5,1,""],mean_relative_absolute_error:[8,5,1,""],mkld:[8,5,1,""],mnkld:[8,5,1,""],mrae:[8,5,1,""],mse:[8,5,1,""],n
kld:[8,5,1,""],rae:[8,5,1,""],relative_absolute_error:[8,5,1,""],se:[8,5,1,""],smooth:[8,5,1,""]},"quapy.evaluation":{artificial_prevalence_prediction:[8,5,1,""],artificial_prevalence_protocol:[8,5,1,""],artificial_prevalence_report:[8,5,1,""],evaluate:[8,5,1,""],gen_prevalence_prediction:[8,5,1,""],gen_prevalence_report:[8,5,1,""],natural_prevalence_prediction:[8,5,1,""],natural_prevalence_protocol:[8,5,1,""],natural_prevalence_report:[8,5,1,""]},"quapy.functional":{HellingerDistance:[8,5,1,""],adjusted_quantification:[8,5,1,""],artificial_prevalence_sampling:[8,5,1,""],get_nprevpoints_approximation:[8,5,1,""],normalize_prevalence:[8,5,1,""],num_prevalence_combinations:[8,5,1,""],prevalence_from_labels:[8,5,1,""],prevalence_from_probabilities:[8,5,1,""],prevalence_linspace:[8,5,1,""],strprev:[8,5,1,""],uniform_prevalence_sampling:[8,5,1,""],uniform_simplex_sampling:[8,5,1,""]},"quapy.method":{aggregative:[11,0,0,"-"],base:[11,0,0,"-"],meta:[11,0,0,"-"],neural:[11,0,0,"-"],non_aggregative:[11,0,0,"-"]},"quapy.method.aggregative":{ACC:[11,1,1,""],AdjustedClassifyAndCount:[11,4,1,""],AggregativeProbabilisticQuantifier:[11,1,1,""],AggregativeQuantifier:[11,1,1,""],CC:[11,1,1,""],ClassifyAndCount:[11,4,1,""],ELM:[11,1,1,""],EMQ:[11,1,1,""],ExpectationMaximizationQuantifier:[11,4,1,""],ExplicitLossMinimisation:[11,4,1,""],HDy:[11,1,1,""],HellingerDistanceY:[11,4,1,""],MAX:[11,1,1,""],MS2:[11,1,1,""],MS:[11,1,1,""],MedianSweep2:[11,4,1,""],MedianSweep:[11,4,1,""],OneVsAll:[11,1,1,""],PACC:[11,1,1,""],PCC:[11,1,1,""],ProbabilisticAdjustedClassifyAndCount:[11,4,1,""],ProbabilisticClassifyAndCount:[11,4,1,""],SLD:[11,4,1,""],SVMAE:[11,1,1,""],SVMKLD:[11,1,1,""],SVMNKLD:[11,1,1,""],SVMQ:[11,1,1,""],SVMRAE:[11,1,1,""],T50:[11,1,1,""],ThresholdOptimization:[11,1,1,""],X:[11,1,1,""]},"quapy.method.aggregative.ACC":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""],solve_adjustment:[11,2,1,""]},"quapy.method.aggregative.AggregativeProbabilisticQuantifier":{posterior_prob
abilities:[11,2,1,""],predict_proba:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.AggregativeQuantifier":{aggregate:[11,2,1,""],aggregative:[11,3,1,""],classes_:[11,3,1,""],classify:[11,2,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],learner:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.CC":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.ELM":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.EMQ":{EM:[11,2,1,""],EPSILON:[11,4,1,""],MAX_ITER:[11,4,1,""],aggregate:[11,2,1,""],fit:[11,2,1,""],predict_proba:[11,2,1,""]},"quapy.method.aggregative.HDy":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.OneVsAll":{aggregate:[11,2,1,""],binary:[11,3,1,""],classes_:[11,3,1,""],classify:[11,2,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],posterior_probabilities:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.aggregative.PACC":{aggregate:[11,2,1,""],classify:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.PCC":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.aggregative.ThresholdOptimization":{aggregate:[11,2,1,""],fit:[11,2,1,""]},"quapy.method.base":{BaseQuantifier:[11,1,1,""],BinaryQuantifier:[11,1,1,""],isaggregative:[11,5,1,""],isbinary:[11,5,1,""],isprobabilistic:[11,5,1,""]},"quapy.method.base.BaseQuantifier":{aggregative:[11,3,1,""],binary:[11,3,1,""],classes_:[11,3,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],n_classes:[11,3,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.base.BinaryQuantifier":{binary:[11,3,1,""]},"quapy.method.meta":{EACC:[11,5,1,""],ECC:[11,5,1,""],EEMQ:[11,5,1,""],EHDy:[11,5,1,""],EPACC:[11,5,1,""],Ensemble:[11,1,1,""],ensembleFactory:[11,5,1,""],get_probability_distribution:[11,5,1,""]},"quapy.method.meta.Ensemble":{VALID_POLICIES:[11,4,1,""],aggregative:[11,3,1,""],binary:[11,3,
1,""],classes_:[11,3,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],probabilistic:[11,3,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.neural":{QuaNetModule:[11,1,1,""],QuaNetTrainer:[11,1,1,""],mae_loss:[11,5,1,""]},"quapy.method.neural.QuaNetModule":{device:[11,3,1,""],forward:[11,2,1,""],init_hidden:[11,2,1,""]},"quapy.method.neural.QuaNetTrainer":{classes_:[11,3,1,""],clean_checkpoint:[11,2,1,""],clean_checkpoint_dir:[11,2,1,""],epoch:[11,2,1,""],fit:[11,2,1,""],get_aggregative_estims:[11,2,1,""],get_params:[11,2,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.method.non_aggregative":{MaximumLikelihoodPrevalenceEstimation:[11,1,1,""]},"quapy.method.non_aggregative.MaximumLikelihoodPrevalenceEstimation":{classes_:[11,3,1,""],fit:[11,2,1,""],get_params:[11,2,1,""],quantify:[11,2,1,""],set_params:[11,2,1,""]},"quapy.model_selection":{GridSearchQ:[8,1,1,""]},"quapy.model_selection.GridSearchQ":{best_model:[8,2,1,""],classes_:[8,3,1,""],fit:[8,2,1,""],get_params:[8,2,1,""],quantify:[8,2,1,""],set_params:[8,2,1,""]},"quapy.plot":{binary_bias_bins:[8,5,1,""],binary_bias_global:[8,5,1,""],binary_diagonal:[8,5,1,""],brokenbar_supremacy_by_drift:[8,5,1,""],error_by_drift:[8,5,1,""]},"quapy.util":{EarlyStop:[8,1,1,""],create_if_not_exist:[8,5,1,""],create_parent_dir:[8,5,1,""],download_file:[8,5,1,""],download_file_if_not_exists:[8,5,1,""],get_quapy_home:[8,5,1,""],map_parallel:[8,5,1,""],parallel:[8,5,1,""],pickled_resource:[8,5,1,""],save_text_file:[8,5,1,""],temp_seed:[8,5,1,""]},quapy:{classification:[9,0,0,"-"],data:[10,0,0,"-"],error:[8,0,0,"-"],evaluation:[8,0,0,"-"],functional:[8,0,0,"-"],isbinary:[8,5,1,""],method:[11,0,0,"-"],model_selection:[8,0,0,"-"],plot:[8,0,0,"-"],util:[8,0,0,"-"]}},objnames:{"0":["py","module","Python module"],"1":["py","class","Python class"],"2":["py","method","Python method"],"3":["py","property","Python property"],"4":["py","attribute","Python attribute"],"5":["py","function","Python 
function"]},objtypes:{"0":"py:module","1":"py:class","2":"py:method","3":"py:property","4":"py:attribute","5":"py:function"},terms:{"0":[0,1,3,4,5,8,9,10,11],"00":[0,1,4,8],"000":1,"0001":[4,11],"000e":1,"001":[4,9,11],"005":8,"008":[],"009":1,"0097":[],"01":[8,9,11],"017":1,"018":0,"02":1,"021":0,"02552":4,"03":1,"034":1,"035":1,"037":1,"04":1,"041":1,"042":1,"046":1,"048":1,"05":[5,8,10],"055":1,"063":[0,10],"065":0,"070":1,"073":1,"075":1,"078":0,"081":[0,10],"082":[0,1],"083":0,"086":0,"091":1,"099":0,"1":[0,1,3,4,5,8,9,10,11],"10":[0,1,4,5,8,9,11],"100":[0,1,3,4,5,9,10,11],"1000":[0,4,11],"10000":4,"100000":4,"1007":[],"101":[4,8,10],"1010":4,"1024":11,"104":0,"108":1,"109":0,"11":[0,1,6,8,10],"11338":0,"114":1,"1145":[],"12":9,"120":0,"1215742":0,"1271":0,"13":[0,9],"139":0,"14":3,"142":1,"146":3,"1473":0,"148":0,"1484":0,"15":[3,8,10],"150":0,"153":0,"157":0,"158":0,"159":0,"1593":0,"1594":0,"1599":0,"161":0,"163":[0,1],"164":[0,3],"167":0,"17":0,"1771":1,"1775":[0,3],"1778":[0,3],"178":0,"1823":0,"1839":0,"18399":0,"1853":0,"19":[3,10],"193":0,"199151":0,"19982":4,"1e":9,"1st":0,"2":[0,1,3,5,8,10,11],"20":[5,8,11],"200":[1,9],"2000":0,"2002":3,"2006":11,"2008":11,"2011":4,"2013":3,"2015":[0,2,3,9,11],"2016":[3,10,11],"2017":[0,3,10,11],"2018":[0,3,10],"2019":[3,10,11],"2020":4,"2021":11,"20342":4,"206":0,"207":0,"208":0,"21":[1,3,5,8,10],"210":[],"211":0,"2126":0,"2155":0,"21591":[0,10],"218":3,"2184":0,"219e":1,"22":[0,3,9,10],"222":0,"222046":0,"226":0,"229":1,"229399":0,"23":9,"235":1,"238":0,"2390":0,"24":[0,9],"243":0,"248563":0,"24866":4,"24987":4,"25":[0,5,8,9,11],"25000":0,"256":[0,9],"26":9,"261":0,"265":0,"266":0,"267":0,"27":[1,3,9],"270":0,"2700406":[],"271":0,"272":0,"274":0,"275":1,"27th":[0,3,10],"28":3,"280":0,"281":0,"282":0,"283":[0,1],"288":0,"289":0,"2971":0,"2nd":0,"2t":[1,8],"2tp":8,"2x5fcv":0,"3":[0,1,3,5,6,8,9,10,11],"30":[0,1,3,11],"300":[0,1,9],"305":0,"306":0,"312":0,"32":[0,6],"3227":8,"3269206":[],"3269287":[],"33":[0,5,8],"331":
0,"333":0,"335":0,"337":0,"34":[0,3,10,11],"341":0,"346":1,"347":0,"350":0,"351":0,"357":1,"359":0,"361":0,"366":1,"372":0,"373":0,"376132":0,"3765":0,"3813":0,"3821":[0,10],"383e":1,"387e":1,"392":0,"394":0,"399":0,"3f":[1,6],"3rd":0,"4":[0,1,3,4,5,8,11],"40":[0,3,4,11],"404333":0,"407":0,"41":3,"412":0,"412e":1,"413":0,"414":0,"417":0,"41734":4,"42":[1,8],"421":0,"4259":0,"426e":1,"427":0,"430":0,"434":0,"435":1,"43676":4,"437":0,"44":0,"4403":10,"446":0,"45":[3,5,10],"452":0,"459":1,"4601":0,"461":0,"463":0,"465":0,"466":0,"470":0,"48":3,"481":0,"48135":4,"486":0,"4898":0,"492":0,"496":0,"4960":1,"497":0,"5":[0,1,3,4,5,8,9,10,11],"50":[0,5,8,11],"500":[0,1,4,5,11],"5000":[1,5],"5005":4,"507":0,"508":0,"512":[9,11],"514":0,"515e":1,"530":0,"534":0,"535":0,"535e":1,"5379":4,"539":0,"541":1,"546":0,"5473":0,"54it":4,"55":5,"55it":4,"565":1,"569":0,"57":0,"573":0,"578":1,"583":0,"591":3,"5f":4,"5fcv":[],"5fcvx2":10,"6":[0,1,3,5,8,10],"60":0,"600":1,"601":0,"604":3,"606":0,"625":0,"627":0,"633e":1,"634":1,"64":[9,11],"640":0,"641":0,"650":0,"653":0,"654":1,"66":[1,11],"665":0,"667":0,"669":0,"67":[5,8],"683":0,"688":0,"691":0,"694582":0,"7":[1,5,8,9],"70":0,"700":0,"701e":1,"711":0,"717":1,"725":1,"730":0,"735":0,"740e":1,"748":0,"75":[0,5,8],"762":0,"774":0,"778":0,"787":0,"794":0,"798":0,"8":[0,1,5,10,11],"8000":0,"830":0,"837":1,"858":1,"861":0,"87":[0,3,10],"8788":0,"889504":0,"8d2fhsgcvn0aaaaa":[],"9":[0,1,3,5,8],"90":[5,8],"901":0,"909":1,"914":1,"917":0,"919":[0,10],"922":0,"923":0,"935":0,"936":0,"937":[0,10],"945":1,"95":[8,10],"9533":0,"958":0,"97":0,"979":0,"982":0,"99":8,"abstract":[3,9,10,11],"boolean":[8,10,11],"case":[0,1,3,4,5,8,10,11],"class":[0,1,3,4,5,6,8,9,10,11],"d\u00edez":3,"default":[1,3,8,9,10,11],"do":[0,1,3,4,8,9,10,11],"final":[1,3,5,11],"float":[0,3,8,9,10,11],"function":[0,1,3,4,5,6,7,9,10,11],"g\u00e1llego":[0,3,10,11],"gonz\u00e1lez":3,"import":[0,1,3,4,5,6,10],"int":[0,5,8,10,11],"long":[4,9],"new":[0,3,10],"p\u00e9rez":[0,3,10,11],"re
turn":[0,1,3,4,5,8,9,10,11],"rodr\u0131":3,"short":9,"static":[3,11],"true":[0,1,3,4,5,6,8,9,10,11],"try":4,"while":[3,5,8,9,10,11],A:[0,3,8,9,10,11],As:[3,4],By:[1,3,8],For:[0,1,5,6,8,10],If:[3,5,8,10,11],In:[0,1,2,3,4,5,6,9],It:[3,4,5,8],One:[0,1,3,11],That:[1,4],The:[0,1,2,4,5,6,8,9,10,11],Then:3,These:0,To:[5,10],_:[5,8,10],__:[],__class__:5,__name__:5,_adjust:[],_ae_:[],_classify_:[],_error_name_:[],_fit_learner_:[],_kld_:[],_labelledcollection_:[],_learner_:[],_mean:[],_min_df_:[],_my:[],_nkld_:[],_posterior_probabilities_:11,_q_:[],_rae_:[],_svmperf_:[],ab:[],aboud:3,about:[0,5,8,10],abov:[0,3,5,8],absolut:[1,3,5,6,8,11],absolute_error:8,abstractmethod:3,acc:[1,3,5,6,8,11],acc_error:8,accept:3,access:[0,3,10,11],accommod:0,accord:[1,3,4,8,9,10,11],accordingli:5,accuraci:[1,5,8,11],accuracy_polici:[],achiev:[1,3,4,5],acm:[0,3,10],across:[0,1,4,5,6,8],action:0,actual:[10,11],acut:0,ad:6,adapt:8,add:[3,4,8,10],add_word:10,addit:3,addition:0,adjust:[3,6,8,11],adjusted_quantif:8,adjustedclassifyandcount:11,adopt:[3,4,10],advanc:[0,6],advantag:3,ae:[1,2,5,8,11],ae_:1,affect:8,after:[8,11],afterward:11,again:5,against:5,aggreg:[1,4,5,6,7,8],aggregativeprobabilisticquantifi:[3,11],aggregativequantifi:[3,11],aggregg:[],aim:[4,5],aka:[10,11],al:[0,2,9,10,11],alaiz:3,alegr:3,alejandro:4,algorithm:[8,11],alia:[3,8,11],all:[0,1,2,3,5,8,10,11],allia:3,alloc:[8,9],allow:[0,1,2,3,5,8,9,10,11],almost:3,along:[0,3,8,11],alreadi:[3,11],also:[0,1,2,3,5,6,8,9],altern:4,although:[3,4,5,11],alwai:[3,4,5,11],among:3,amount:8,an:[0,1,2,3,4,5,6,8,9,10,11],analys:[5,6],analysi:[0,3,6,10],analyz:5,ani:[0,1,3,4,5,6,8,9,10,11],anoth:[0,1,3,5],anotherdir:8,anyon:0,api:6,app:[8,10],appeal:1,appear:5,append:5,appli:[2,3,4,5,8,9,10,11],appropri:4,approxim:[1,5,8,9,10],ar:[0,1,3,4,5,8,9,10,11],archive_filenam:8,archive_path:[],arg:[8,10,11],argmax:8,args_i:8,argu:4,argument:[0,1,3,5,8,10,11],arifici:[],aris:1,around:[1,10],arrai:[1,3,5,8,9,10,11],articl:[3,4],artifici:[0,1,3,4,5,6,8,10],artifi
cial_prevalence_predict:8,artificial_prevalence_protocol:8,artificial_prevalence_report:8,artificial_prevalence_sampl:8,artificial_sampling_ev:[1,4],artificial_sampling_gener:[0,10],artificial_sampling_index_gener:10,artificial_sampling_predict:[1,5],artificial_sampling_report:1,arxiv:4,asarrai:1,asdataload:9,asonam:0,assert:10,assess:4,assign:[3,8,10],associ:[8,10],assum:[1,6,11],assumpion:11,assumpt:[1,5,6],astyp:[],attempt:[3,11],attribut:11,august:0,autom:[0,3,6],automat:[0,1],av:[3,11],avail:[0,1,2,3,5,6,9,11],averag:[1,3,8,10,11],avoid:[1,8],ax:11,axi:[5,8],b:[0,10,11],balanc:[0,4],band:[5,8],bar:8,barranquero:[2,3,9,11],base:[0,3,6,7,8,9],base_classifi:5,base_estim:3,base_quantifier_class:11,baseestim:[9,11],baselin:6,basequantifi:[3,8,11],basic:[5,11],batch:9,batch_siz:9,batch_size_test:9,beat:11,been:[0,3,4,5,8,10,11],befor:[3,8,9,10,11],beforehand:8,behav:[3,5],being:[4,8,11],belief:1,belong:[3,11],below:[0,2,3,5,8,10],best:[4,8,9],best_epoch:8,best_model:8,best_model_:4,best_params_:4,best_scor:8,better:4,between:[4,5,6,8,9,11],beyond:5,bia:[6,8],bias:5,bidirect:11,bin:[5,8,11],bin_bia:5,bin_diag:5,binar:[8,10],binari:[3,5,6,8,9,10,11],binary_bias_bin:[5,8],binary_bias_glob:[5,8],binary_diagon:[5,8],binary_quantifi:11,binaryquantifi:11,binom:8,block:[0,8],bool:8,both:5,bound:[8,11],box:[5,8],breast:0,brief:1,bring:11,broken:[5,8],brokenbar_supremacy_by_drift:8,budg:1,budget:[1,4],build:[],bypass:11,c:[3,4,8,9,10],calcul:8,calibr:3,calibratedclassifi:3,calibratedclassifiercv:3,calibratedcv:[],call:[0,1,5,8,10,11],callabl:[0,8,10],can:[0,1,2,3,4,5,8,10,11],cancer:0,cannot:[],cardiotocographi:0,care:11,carri:[3,10,11],casa_token:[],castano:[3,10],castro:3,categor:[3,10],categori:[1,8],cc:[3,5,11],ceil:8,center:5,chang:[0,1,3,10],character:[3,6],characteriz:[0,3,10],charg:[0,8,10],chart:8,check:[3,4],checkpoint:[9,11],checkpointdir:11,checkpointnam:11,checkpointpath:9,choic:4,choos:11,chosen:[4,8],cl:0,cla:[],class2int:10,class_weight:4,classes_:[8,10,11],cla
ssif:[0,1,3,7,8,10,11],classif_posterior:[3,11],classif_predict:[3,11],classif_predictions_bin:11,classifi:[1,4,5,6,8,9,11],classifier_net:9,classifiermixin:9,classifyandcount:[3,11],classmethod:[0,10,11],classnam:10,classs:8,clean_checkpoint:11,clean_checkpoint_dir:11,clear:5,clearer:1,clearli:5,clip:8,close:[1,10],closer:1,closest:11,cm:8,cmc:0,cnn:3,cnnnet:[3,9],code:[0,3,4,5,9],codifi:10,coincid:[0,6],col:[0,10],collect:[0,8,9,10],collet:10,color:[5,8],colormap:8,column:[0,8,10],com:8,combin:[0,1,4,8,10,11],combinatio:8,combinations_budget:8,come:[0,8,10,11],commandlin:[],common:[],commonli:6,compar:[5,8],comparison:5,compat:11,compil:[2,3],complement:11,complet:[3,5],compon:[8,9],compress:0,comput:[1,3,5,8,11],computation:4,compute_fpr:[],compute_t:[],compute_tpr:[],concept:6,concur:[],condit:[8,11],conduct:[0,8],confer:[0,3,10],confid:8,configur:[4,8],conform:10,consecut:[8,9,11],consid:[3,5,8,9,10,11],consist:[0,4,5,8,9,10,11],constrain:[1,5,8,10],constructor:3,consult:[0,1],contain:[1,2,3,5,8,9,10,11],contanin:8,content:7,context:8,contrast:1,control:[1,4,10],conv_block:[],conv_lay:[],conveni:8,converg:11,convert:[1,3,8,9,10],convolut:9,copi:[8,10],cornel:[],correct:11,correctli:8,correspond:[5,8,10],cosest:11,cost:1,costli:4,could:[0,1,3,4,5,6],count:[4,5,6,8,10,11],count_:[],counter:10,countvector:10,covari:10,cover:[1,4,9],coz:[0,3,10],cpu:[1,9],creat:[0,6,8],create_if_not_exist:8,create_parent_dir:8,crisp:[3,8],criteria:4,cross:[3,10,11],cs:8,csr:10,csr_matrix:10,csv:10,ctg:0,cuda:[3,9,11],cumbersom:1,cumberson:8,cumul:11,curios:5,current:[3,8,9,10,11],custom:[3,6,8,10],customarili:[3,4],cv:[3,4],cyan:5,d:11,d_:8,dat:[0,9],data:[1,3,4,5,6,7,8,9,11],data_hom:10,datafram:[1,8],dataload:9,dataset:[1,3,4,5,6,7,8,9,11],dataset_nam:10,deal:0,decaesteck:[3,11],decai:9,decid:10,decim:1,decis:[3,8,9,11],decision_funct:9,decomposit:9,dedic:[1,10],deep:[3,8,11],def:[0,1,3,5,8],defin:[0,3,8,9,10,11],degre:4,del:[0,3,10],delai:8,deliv:[3,11],denomin:11,dens:0,densiti
:8,depend:[0,1,4,5,8,11],describ:[3,8,11],descript:0,design:4,desir:[0,1,10],despit:1,destin:8,detail:[0,1,3,6,9,10],determin:[1,4,5],detriment:5,devel:10,develop:[4,6],deviat:[0,1,5,8,10],devic:[0,3,5,9,11],df:1,df_replac:[],diabet:0,diagon:[6,8],dict:[8,10,11],dictionari:[8,9,10,11],differ:[0,1,3,4,5,6,8,10,11],difficult:5,digit:0,dimens:[8,9,10],dimension:[8,9,10],dir:8,directli:[0,1,3],directori:[2,8,9,10],discard:8,discoveri:3,discret:8,discuss:5,disjoint:9,disk:8,displai:[1,5,8],displaystyl:8,distanc:[8,11],distant:[1,8],distribut:[0,3,5,8,10,11],diverg:[1,3,8,11],divid:8,dl:[],doabl:0,doc_embed:11,doc_embedding_s:11,doc_posterior:11,document:[0,1,3,5,9,10],document_embed:9,doe:[0,2,3,8,11],doi:[],done:3,dot:[5,8],dowload:8,down:[5,8,10],download:[0,2,3,8],download_fil:8,download_file_if_not_exist:8,draw:[8,10],drawn:[0,1,4,8,10],drift:6,drop:9,drop_p:9,dropout:9,ds:[3,11],ds_polici:[],ds_policy_get_posterior:[],dtype:[1,10],dump:10,dure:[1,5,11],dynam:[3,9,10,11],e:[0,1,3,4,5,6,8,9,10,11],eacc:11,each:[0,1,3,4,5,8,9,10,11],earli:[8,9],early_stop:11,earlystop:8,easili:[0,2,5,9],ecc:11,edu:[],eemq:11,effect:3,effici:3,ehdi:11,either:[1,3,8,10,11],element:[3,10],elm:[3,11],els:11,em:11,emb:9,embed:[3,9],embed_s:9,embedding_s:9,empti:10,emq:[5,11],enabl:9,encod:10,end:[4,8,11],endeavour:6,enough:5,ensembl:[0,6,10,11],ensemblefactori:11,ensure_probabilist:[],entir:[0,3,4,5,8],entri:11,environ:[1,3,4,5,8],ep:[1,8],epacc:11,epoch:[8,9,11],epsilon:[1,8,11],equal:[1,8],equidist:[0,8],equip:[3,5],equival:11,err:[],err_drift:5,err_nam:8,error:[3,4,6,7,9,11],error_:[],error_by_drift:[5,8],error_funct:1,error_metr:[1,4,8],error_nam:[5,8],especi:8,establish:8,estim:[1,3,5,6,8,9,10,11],estim_prev:[1,5,8],estim_preval:[3,6],estimant:11,esuli:[0,2,3,9,10,11],et:[0,2,9,10,11],etc:6,eval_budget:[4,8],evalu:[0,3,4,5,6,7,9,10,11],even:8,eventu:[9,10],everi:[3,11],everyth:3,evinc:5,ex:[],exact:[0,10],exactli:0,exampl:[0,1,3,4,5,8,9,10,11],exce:8,excel:0,except:[3,8,11],exemplifi:0
,exhaust:8,exhibit:[4,5],exist:8,exist_ok:8,expand_frame_repr:1,expect:[6,11],expectationmaximizationquantifi:[3,11],experi:[1,2,3,4,5,8],explain:[1,5],explicit:11,explicitlossminim:[],explicitlossminimis:11,explor:[4,8,10],express:10,ext:2,extend:[2,3,11],extens:[0,2,5],extern:3,extract:[1,8,10],f1:[1,8,9],f1_error:8,f1e:[1,8],f:[0,1,3,4,5,6,10],f_1:8,fabrizio:4,facilit:6,fact:[3,5],factor:8,factori:11,fals:[1,3,5,8,9,10,11],famili:[3,11],familiar:3,far:[8,9,10],fare:8,fast:8,faster:[0,10],feat1:10,feat2:10,featn:10,featur:[0,10],feature_extract:10,fetch:[0,6],fetch_review:[0,1,3,4,5,10],fetch_twitt:[0,3,6,10],fetch_ucidataset:[0,3,10],fetch_ucilabelledcollect:[0,10],ff_layer:11,fhe:0,file:[0,5,8,9,10,11],filenam:8,fin:0,find:[0,4],finish:4,first:[0,1,2,3,5,8,10,11],fit:[1,3,4,5,6,8,9,10,11],fit_learn:[3,11],fit_transform:10,fix:[1,4],flag:8,float64:1,fn:8,fold:[3,10,11],folder:[0,11],follow:[0,1,3,4,5,6,8],fomart:10,for_model_select:[0,10],form:[0,8,10],forman:11,format:[0,5,10],former:[2,11],forward:[9,11],found:[0,3,4,8,9,10],four:3,fp:8,fpr:[8,11],frac:8,framework:6,frequenc:[0,10,11],from:[0,1,3,4,5,6,8,10,11],from_csv:10,from_nam:[1,8],from_spars:10,from_text:10,full:[1,8],fulli:0,func:8,further:[0,1,3,9,10],fusion:[0,3,10],futur:3,g:[0,1,3,4,6,8,10,11],gain:8,gao:[0,3,10,11],gap:10,gasp:[0,10],gen:8,gen_data:5,gen_fn:8,gen_prevalence_predict:8,gen_prevalence_report:8,gener:[0,1,3,4,5,8,9,10,11],generation_func:8,german:0,get:[0,1,5,8,9,10,11],get_aggregative_estim:11,get_nprevpoints_approxim:[1,8],get_param:[3,8,9,11],get_probability_distribut:11,get_quapy_hom:8,ggener:8,github:[],give:11,given:[1,3,4,8,9,10,11],global:8,goal:11,goe:4,good:[4,5],got:4,govern:1,gpu:9,grant:[],greater:10,grid:[4,8,10,11],gridsearchcv:[4,11],gridsearchq:[4,8,11],ground:11,group:3,guarante:10,guez:3,gzip:0,ha:[3,4,5,8,9,10,11],haberman:[0,3],had:10,handl:0,happen:[4,5],hard:3,harder:5,harmon:8,harri:0,hat:8,have:[0,1,2,3,4,5,8,10,11],hcr:[0,3,10],hd:8,hdy:[6,11],held:[3,4,8,9,11
],helling:11,hellingerdist:8,hellingerdistancei:[3,11],hellingh:8,help:5,henc:[8,10],here:[1,11],heurist:11,hidden:[5,9],hidden_s:9,hide:5,high:[5,8],higher:[1,5],highlight:8,hightlight:8,histogram:11,hlt:[],hold:[6,8,11],home:[8,10],hook:11,how:[0,1,3,4,5,8,10,11],howev:[0,4,5],hp:[0,3,4,10],html:10,http:[8,10],hyper:[4,8,9],hyperparam:4,hyperparamet:[3,8],i:[0,1,3,4,5,8,9,10,11],id:[0,3,10],identifi:8,idf:0,ieee:0,ignor:[8,10,11],ii:8,iid:[1,5,6],illustr:[3,4,5],imdb:[0,5,10],implement:[0,1,3,4,5,6,8,9,10,11],implicit:8,impos:[4,8],improv:[3,8,9,11],includ:[0,1,3,5,6,10,11],inconveni:8,inde:[3,4],independ:[8,11],index:[0,3,6,8,9,10],indextransform:10,indic:[0,1,3,4,5,8,10,11],individu:[1,3],infer:[0,10],inform:[0,1,3,4,8,10,11],infrequ:10,inherit:3,init:3,init_hidden:11,initi:[0,9],inplac:[1,3,10],input:[3,5,8,9,11],insight:5,inspir:3,instal:[0,3,6,9,11],instanc:[0,3,4,5,6,8,9,10,11],instanti:[0,1,3,4,9,11],instead:[1,3,4,11],integ:[3,8,9,10,11],integr:6,interest:[1,5,6,8,10],interestingli:5,interfac:[0,1,11],intern:[0,3,10],interpret:[5,6,11],interv:[1,5,8,10],introduc:1,invok:[0,1,3,8,10],involv:[2,5,8],io:[],ionospher:0,iri:0,irrespect:[5,11],isaggreg:11,isbinari:[8,10,11],isomer:8,isometr:[5,8],isprobabilist:11,isti:[],item:8,iter:[0,8,11],its:[3,4,8,9,11],itself:[3,8,11],j:[0,3,10,11],joachim:[3,9,11],job:[2,8],joblib:2,join:8,just:[1,3],k:[3,6,8,10,11],keep:8,kei:[8,10],kept:10,kernel:9,kernel_height:9,keyword:[10,11],kfcv:[0,10,11],kindl:[0,1,3,5,10],kl:8,kld:[1,2,8,9,11],know:3,knowledg:[0,3,10],known:[0,3,4,11],kraemer:8,kullback:[1,3,8,11],kwarg:[9,10,11],l1:[8,11],l:11,label:[0,3,4,5,6,8,9,10,11],labelledcollect:[0,3,4,8,10,11],larg:4,larger:[10,11],largest:8,last:[1,3,5,8,9,10],lastli:3,latex:5,latinn:[3,11],latter:11,layer:[3,9],lazi:11,lead:[1,10],learn:[1,2,3,4,6,8,9,10,11],learner:[3,4,9,11],least:[0,10],leav:10,left:10,legend:8,leibler:[1,3,8,11],len:8,length:[9,10],less:[8,10],let:[1,3],level:[],leverag:3,leyend:8,like:[0,1,3,5,8,9,10,11],likelih
ood:11,limit:[5,8,10,11],line:[1,3,8],linear:[5,11],linear_model:[1,3,4,6,9],linearsvc:[3,5,10],link:[],linspac:5,list:[0,5,8,9,10],listedcolormap:8,literatur:[0,1,4,6],load:[0,3,8,10],loader:[0,10],loader_func:[0,10],loader_kwarg:10,local:8,log:[8,10],logist:[1,3,9,11],logisticregress:[1,3,4,6,9,11],logscal:8,logspac:4,longer:8,longest:9,look:[0,1,3,5,11],loop:11,loss:[6,9,11],low:[5,8,9],lower:[5,8,11],lower_is_bett:8,lowest:5,lowranklogisticregress:9,lr:[1,3,9,11],lstm:[3,9],lstm_class_nlay:9,lstm_hidden_s:11,lstm_nlayer:11,lstmnet:9,m:[3,8,11],machin:[1,4,6],macro:8,made:[0,2,8,10],mae:[1,4,6,8,9,11],mae_loss:11,mai:8,main:5,maintain:[3,11],make:[0,1,3,11],makedir:8,mammograph:0,manag:[0,3,10],mani:[1,3,4,5,6,8,10,11],manner:0,manual:0,map:[1,9],map_parallel:8,margin:9,mass:8,math:[],mathcal:8,matplotlib:[2,8],matric:[0,5,10],matrix:[5,8,11],max:11,max_it:11,max_sample_s:11,maxim:[6,11],maximum:[1,8,9,11],maximumlikelihoodprevalenceestim:11,md:[],mean:[0,1,3,4,5,6,8,9,10,11],mean_absolute_error:8,mean_relative_absolute_error:8,measur:[2,3,4,5,6,8,11],median:11,mediansweep2:11,mediansweep:11,member:[3,11],memori:9,mention:3,merg:5,met:10,meta:[6,7,8],meth:[],method:[0,1,4,5,6,7,8],method_data:5,method_nam:[5,8],method_ord:8,metric:[1,3,4,6,8,11],might:[1,8,10],min_df:[1,3,4,5,10],min_po:11,mine:[0,3],minim:[8,11],minimum:[10,11],minimun:10,mining6:10,minu:8,misclassif:11,miss:8,mixtur:[3,11],mkld:[1,8,11],ml:10,mlpe:11,mnkld:[1,8,11],mock:[8,9],modal:4,model:[0,1,5,6,8,9,11],model_select:[4,7,11],modifi:[3,8],modul:[0,1,3,5,6,7],moment:[0,3],monitor:8,more:[3,5,8,11],moreo:[0,3,4,10,11],most:[0,3,5,6,8,10,11],movi:0,mrae:[1,6,8,9,11],ms2:11,ms:11,mse:[1,3,6,8,11],msg:[],multiclass:8,multipli:8,multiprocess:8,multivari:[3,9],must:[3,10,11],mutual:11,my:[],my_arrai:8,my_collect:10,my_custom_load:0,my_data:0,mycustomloss:3,n:[0,1,8,9,11],n_bin:[5,8],n_class:[1,3,8,9,10,11],n_classes_:11,n_compon:9,n_dimens:9,n_epoch:11,n_featur:9,n_instanc:[8,9,11],n_job:[1,3,4,8,10
,11],n_preval:[0,8,10],n_prevpoint:[1,4,5,8],n_repeat:[1,8],n_repetit:[1,4,5,8],n_sampl:[8,9],name:[5,8,9,10,11],nativ:6,natur:[1,8,10,11],natural_prevalence_predict:8,natural_prevalence_protocol:8,natural_prevalence_report:8,natural_sampling_gener:10,natural_sampling_index_gener:10,nbin:[5,8],ndarrai:[1,3,8,10,11],necessarili:[],need:[0,3,8,10,11],neg:[0,5,8,11],nest:[],net:9,network:[0,8,9,10],neural:[0,7,8,10],neuralclassifiertrain:[3,9],neutral:0,next:[4,8,9,10],nfold:[0,10],nkld:[1,2,6,8,9,11],nn:[9,11],nogap:10,non:3,non_aggreg:[7,8],none:[1,4,8,9,10,11],nonetheless:4,nor:3,normal:[0,1,3,8,10,11],normalize_preval:8,note:[1,3,4,5,8,10],noth:11,now:5,nowadai:3,np:[1,3,4,5,8,10,11],npp:[8,10],nprevpoint:[],nrepeat:[0,10],num_prevalence_combin:[1,8],number:[0,1,3,5,8,9,10,11],numer:[0,1,3,6,10],numpi:[2,4,8,9,11],o_l6x_pcf09mdetq4tu7jk98mxfbgsxp9zso14jkuiyudgfg0:[],object:[0,8,9,10,11],observ:1,obtain:[1,4,8,11],obtaind:8,obvious:8,occur:[5,10],occurr:10,octob:[0,3],off:9,offer:[3,6],older:2,omd:[0,10],ommit:[1,8],onc:[1,3,5,8],one:[0,1,3,4,5,8,10,11],ones:[1,3,5,8,10],onevsal:[3,11],onli:[0,3,5,8,9,10,11],open:[0,6,10],oper:3,opt:4,optim:[2,3,4,8,9,11],optimize_threshold:[],option:[0,1,3,5,8,10,11],order:[0,2,3,5,8,10,11],order_bi:11,org:10,orient:[3,6,8,11],origin:[0,3,10],os:[0,8],other:[1,3,5,6,8,10,11],otherwis:[0,3,8,10,11],our:[],out:[3,4,5,8,9,10,11],outcom:5,outer:8,outlier:8,output:[0,1,3,4,8,9,10,11],outsid:11,over:[3,4,8],overal:1,overestim:5,overrid:3,overridden:[3,11],own:4,p:[0,3,8,10,11],p_hat:8,p_i:8,pacc:[1,3,5,8,11],packag:[0,2,3,6,7],pad:[9,10],pad_length:9,padding_length:9,page:[0,2,6],pageblock:0,pair:[0,8,11],panda:[1,2,8],paper:[0,3],parallel:[1,3,8,10,11],param:[4,9,11],param_grid:[4,8,11],param_mod_sel:11,param_model_sel:11,paramet:[1,3,4,8,9,10,11],parent:8,part:[3,10],particular:[0,1,3],particularli:1,pass:[0,1,5,8,9,11],past:1,patch:[2,3,9,11],path:[0,3,5,8,9,10,11],patienc:[8,9,11],pattern:3,pca:[],pcalr:[],pcc:[3,4,5,11],pd:1,pdf:5,p
eopl:[],percentil:8,perf:[6,9,11],perform:[1,3,4,5,6,8,9,11],perman:8,phonem:0,pick:4,pickl:[3,8,10],pickle_path:8,pickled_resourc:8,pii:[],pip:2,pipelin:[],pkl:8,plai:0,plan:3,pleas:3,plot:[6,7],png:5,point:[0,1,3,8,10],polici:[3,11],popular:6,portion:4,pos_class:[8,10],posit:[0,3,5,8,10,11],possibl:[1,3,8],post:8,posterior:[3,8,9,11],posterior_prob:[3,11],postpon:3,potter:0,pp:[0,3],pprox:[],practic:[0,4],pre:[0,3],prec:[0,8],preced:10,precis:[0,1,8],preclassifi:3,predefin:10,predict:[3,4,5,8,9,11],predict_proba:[3,9,11],predictor:1,prefer:8,preliminari:11,prepare_svmperf:[2,3],preprint:4,preprocess:[0,1,3,7,8],present:[0,3,10],preserv:[1,5,8,10],pretti:5,prev:[0,1,8,10],prevail:3,preval:[0,1,3,4,5,6,8,10,11],prevalence_estim:8,prevalence_from_label:8,prevalence_from_prob:8,prevalence_linspac:8,prevel:11,previou:3,previous:[],prevs_estim:11,prevs_hat:[1,8],princip:9,print:[0,1,3,4,6,9,10],prior:[1,3,4,5,6,8,11],priori:3,probabilist:[3,11],probabilisticadjustedclassifyandcount:11,probabilisticclassifyandcount:11,probabl:[1,3,4,5,6,8,9,11],problem:[0,3,5,8,10,11],procedur:[3,6],proceed:[0,3,10],process:[3,4,8],processor:3,procol:1,produc:[0,1,5,8],product:3,progress:[8,10],properli:0,properti:[3,8,9,10,11],proport:[3,4,8,9,10,11],propos:[2,3,11],protocl:8,protocol:[0,3,4,5,6,8,10],provid:[0,3,5,6,11],ptecondestim:11,ptr:[3,11],ptr_polici:[],purpos:[0,11],put:11,python:[0,6],pytorch:2,q:[0,2,3,8,9,11],q_i:8,qacc:9,qdrop_p:11,qf1:9,qgm:9,qp:[0,1,3,4,5,6,8,10],quanet:[2,6,9,11],quanetmodul:11,quanettrain:11,quantif:[0,1,6,8,9,10,11],quantifi:[3,4,5,6,8,11],quantification_error:8,quantiti:8,quapi:[0,1,2,3,4,5],quapy_data:0,quay_data:10,question:8,quevedo:[0,3,10],quick:[],quit:8,r:[0,3,8,10],rac:[],rae:[1,2,8,11],rais:[3,8,11],rand:8,random:[1,3,4,5,8,10],random_se:[1,8],random_st:10,randomli:0,rang:[0,5,8,11],rank:[3,9],rare:10,rate:[3,8,9,11],rather:[1,4],raw:10,rb:0,re:[3,4,10],reach:11,read:10,reader:[7,8],readm:[],real:[8,9,10,11],reason:[3,5,6],recal:8,receiv:[0,3
,5],recip:11,recognit:3,recommend:[1,5,11],recomput:11,recurr:[0,3,10],recurs:11,red:0,red_siz:[3,11],reduc:[0,10],reduce_column:[0,10],refer:[9,10],refit:[4,8],regard:4,regardless:10,regim:8,region:8,regist:11,regress:9,regressor:[1,3],reindex_label:10,reiniti:9,rel:[1,3,8,10,11],relative_absolute_error:8,reli:[1,3,11],reliabl:3,rememb:5,remov:10,repeat:[8,10],repetit:8,repl:[],replac:[0,3,10],replic:[1,4,8],report:[1,8],repositori:[0,10],repr_siz:9,repres:[1,3,5,8,10,11],represent:[0,3,8,9],reproduc:10,request:[0,8,10],requir:[0,1,3,6,9],reset_net_param:9,resourc:8,resp:11,respect:[0,1,5,8,11],respond:3,rest:[8,10,11],result:[1,2,3,4,5,6,8,11],retain:[0,3,9,11],retrain:4,return_constrained_dim:8,reus:[0,3,8],review:[5,6,10],reviews_sentiment_dataset:[0,10],rewrit:5,right:[4,8,10],role:0,root:6,roughli:0,round:10,routin:[8,10,11],row:[8,10],run:[0,1,2,3,4,5,8,10,11],s003132031400291x:[],s10618:[],s:[0,1,3,4,5,8,9,10,11],saeren:[3,11],sai:[],said:3,same:[0,3,5,8,10,11],sampl:[0,1,3,4,5,6,8,9,10,11],sample_s:[0,1,3,4,5,8,10,11],sampling_from_index:[0,10],sampling_index:[0,10],sander:[0,10],save:[5,8],save_or_show:[],save_text_fil:8,savepath:[5,8],scale:8,scall:10,scenario:[1,3,4,5,6],scienc:3,sciencedirect:[],scikit:[2,3,4,10],scipi:[2,10],score:[0,1,4,8,9,10],script:[1,2,3,6,11],se:[1,8],search:[3,4,6,8],sebastiani:[0,3,4,10,11],second:[0,1,3,5,8,10],secondari:8,section:4,see:[0,1,2,3,4,5,6,8,9,10,11],seed:[1,4,8],seem:3,seemingli:5,seen:[5,8,11],select:[0,3,6,8,10,11],selector:3,self:[3,8,9,10,11],semeion:0,semev:0,semeval13:[0,10],semeval14:[0,10],semeval15:[0,10],semeval16:[0,6,10],sentenc:10,sentiment:[3,6,10],separ:[8,10],sequenc:8,seri:0,serv:3,set:[0,1,3,4,5,6,8,9,10,11],set_opt:1,set_param:[3,8,9,11],set_siz:[],sever:0,sh:[2,3],shape:[5,8,9,10,11],share:[0,10],shift:[1,4,6,8,11],shorter:9,shoud:3,should:[0,1,3,4,5,6,9,10,11],show:[0,1,3,4,5,8,9,10],show_dens:8,show_std:[5,8],showcas:5,shown:[1,5,8],shuffl:[9,10],side:8,sign:8,signific:1,significantli:8,silen
t:[8,11],simeq:[],similar:[8,11],simpl:[0,3,5],simplest:3,simplex:[0,8],simpli:[1,2,3,4,5,6,8,11],sinc:[0,1,3,5,8,10,11],singl:[1,3,6,11],size:[0,1,3,8,9,10,11],sklearn:[1,3,4,5,6,9,10,11],sld:[3,11],slice:8,smooth:[1,8],smooth_limits_epsilon:8,so:[0,1,3,5,8,9,10,11],social:[0,3,10],soft:3,softwar:0,solid:5,solut:8,solv:[4,11],solve_adjust:11,some:[0,1,3,5,8,10,11],some_arrai:8,sometim:1,sonar:0,sourc:[2,3,6,9],sout:[],space:[0,4,8,9],spambas:0,spars:[0,10],special:[0,5,10],specif:[3,4],specifi:[0,1,3,5,8,9,10],spectf:0,spectrum:[0,1,4,5,8],speed:3,split:[0,3,4,5,8,9,10,11],split_stratifi:10,splitstratifi:10,spmatrix:10,springer:[],sqrt:8,squar:[1,3,8],sst:[0,10],stabil:[1,11],stabl:10,stackexchang:8,stand:[8,11],standard:[0,1,5,8,10,11],star:8,start:4,stat:10,state:8,statist:[0,1,8,11],stats_siz:11,std:9,stdout:8,step:[5,8],stop:[8,9,11],store:[0,9,10,11],str:[0,8,10],strategi:[3,4],stratif:10,stratifi:[0,3,10,11],stride:9,string:[1,8,10,11],strongli:[4,5],strprev:[0,1,8],structur:[3,11],studi:[0,3,10],style:10,subclass:11,subdir:8,subinterv:5,sublinear_tf:10,submit:0,submodul:7,subobject:[],suboptim:4,subpackag:7,subsequ:10,subtract:[0,8,10],subtyp:10,suffic:5,suffici:[],sum:[8,11],sum_:8,summar:0,supervis:[4,6],support:[3,6,9,10],surfac:10,surpass:1,svm:[3,5,6,9,10,11],svm_light:[],svm_perf:[],svm_perf_classifi:9,svm_perf_learn:9,svm_perf_quantif:[2,3],svmae:[3,11],svmkld:[3,11],svmnkld:[3,11],svmperf:[2,3,7,8,11],svmperf_bas:[9,11],svmperf_hom:3,svmq:[3,11],svmrae:[3,11],sweep:11,syntax:5,system:[4,11],t50:11,t:[0,1,3,8],tab10:8,tail:8,tail_density_threshold:8,take:[0,3,5,8,10,11],taken:[3,8,9,10],target:[3,5,6,8,9,11],task:[3,4,10],te:[8,10],temp_se:8,tempor:8,tend:5,tendenc:5,tensor:9,term:[0,1,3,4,5,6,8,9,10,11],test:[0,1,3,4,5,6,8,9,10,11],test_bas:[],test_dataset:[],test_method:[],test_path:[0,10],test_sampl:8,test_split:10,text2tfidf:[0,1,3,10],text:[0,3,8,9,10],textclassifiernet:9,textual:[0,6,10],tf:[0,10],tfidf:[0,4,5,10],tfidfvector:10,than:[1,4,5,8,9,
10],thei:[0,3,11],them:[0,3,11],theoret:4,thereaft:1,therefor:[8,10],thi:[0,1,2,3,4,5,6,8,9,10,11],thing:3,third:[1,5],thorsten:9,those:[1,3,4,5,8,9,11],though:[3,8],three:[0,5],threshold:[8,11],thresholdoptim:11,through:[3,8],thu:[3,4,5,8,11],tictacto:0,time:[0,1,3,8,10],timeout:8,timeouterror:8,timer:8,titl:8,tj:[],tn:8,token:[0,9,10],tool:[1,6],top:[3,8],torch:[3,9,11],torchdataset:9,total:8,toward:[5,10],tp:8,tpr:[8,11],tqdm:2,tr:10,tr_iter_per_poch:11,tr_prev:[5,8,11],track:8,trade:9,tradition:1,train:[0,1,3,4,5,6,8,9,10,11],train_path:[0,10],train_prev:[5,8],train_prop:10,train_siz:10,train_val_split:[],trainer:9,training_help:[],training_preval:5,training_s:5,transact:3,transform:[0,9,10],transfus:0,trivial:3,true_prev:[1,5,8],true_preval:6,truncatedsvd:9,ttest_alpha:8,tupl:[8,10,11],turn:4,tweet:[0,3,10],twitter:[6,10],twitter_sentiment_datasets_test:[0,10],twitter_sentiment_datasets_train:[0,10],two:[0,1,3,4,5,8,10,11],txt:8,type:[0,3,8,10,11],typic:[1,4,5,8,9,10,11],u1:10,uci:[6,10],uci_dataset:10,unabl:0,unadjust:5,unalt:9,unbias:5,uncompress:0,under:1,underestim:5,underlin:8,understand:8,unfortun:5,unifi:[0,11],uniform:[8,10],uniform_prevalence_sampl:8,uniform_sampl:10,uniform_sampling_index:10,uniform_simplex_sampl:8,uniformli:[8,10],union:[8,11],uniqu:10,unit:[0,8],unix:0,unk:10,unknown:10,unlabel:11,unless:11,unlik:[1,4],until:11,unus:[8,9],up:[3,4,8,9,11],updat:11,url:8,us:[0,1,3,4,5,6,8,9,10,11],user:[0,1,5],utf:10,util:[7,9],v:3,va_iter_per_poch:11,val:[0,10],val_split:[3,4,8,9,11],valid:[0,1,3,4,5,8,9,10,11],valid_loss:[3,9,11],valid_polici:11,valu:[0,1,3,8,9,10,11],variabl:[1,3,5,8,10],varianc:[0,5],variant:[5,6,11],varieti:4,variou:[1,5],vector:[0,8,9,10],verbos:[0,1,4,8,9,10,11],veri:[3,5],versatil:6,version:[2,9,11],vertic:8,vertical_xtick:8,via:[0,2,3,11],view:5,visual:[5,6],vline:8,vocab_s:9,vocabulari:[9,10],vocabulary_s:[3,9,10],vs:[3,8],w:[0,3,10],wa:[0,3,5,8,10,11],wai:[1,11],wait:9,want:[3,4],warn:10,wb:[0,10],wdbc:0,we:[0,1,3,4,5,6],we
ight:[9,10],weight_decai:9,well:[0,3,4,5,11],were:0,what:3,whcih:10,when:[0,1,3,4,5,8,9,10],whenev:[5,8],where:[3,5,8,9,10],wherebi:4,whether:[8,9,10,11],which:[0,1,3,4,5,8,9,10,11],white:0,whole:[0,1,3,4,8],whose:[10,11],why:3,wide:5,wiki:[0,3],wine:0,within:[8,11],without:[1,3,8,10],word:[1,3,6,9,10],work:[1,3,4,5,10],worker:[1,8,10,11],wors:[4,5,8],would:[0,1,3,5,6,8,10,11],wrapper:[8,9,10],written:6,www:[],x2:10,x:[5,8,9,10,11],x_error:8,xavier:9,xavier_uniform:9,xlrd:[0,2],xy:10,y:[5,8,9,10,11],y_:[],y_error:8,y_i:11,y_j:11,y_pred:8,y_true:8,ye:[],yeast:[0,10],yield:[5,8,10,11],yin:[],you:[2,3],your:3,z:[0,10],zero:[0,8],zfthyovrzwxmgfzylqw_y8cagg:[],zip:[0,5]},titles:["Datasets","Evaluation","Installation","Quantification Methods","Model Selection","Plotting","Welcome to QuaPy\u2019s documentation!","quapy","quapy package","quapy.classification package","quapy.data package","quapy.method package"],titleterms:{"function":8,A:6,The:3,ad:0,aggreg:[3,11],base:[10,11],bia:5,classif:[4,9],classifi:3,content:[6,8,9,10,11],count:3,custom:0,data:[0,10],dataset:[0,10],diagon:5,distanc:3,document:6,drift:5,emq:3,ensembl:3,error:[1,5,8],evalu:[1,8],ex:[],exampl:6,expect:3,explicit:3,featur:6,get:[],hdy:3,helling:3,indic:6,instal:2,introduct:6,issu:0,learn:0,loss:[2,3,4],machin:0,maxim:3,measur:1,meta:[3,11],method:[3,9,11],minim:3,model:[3,4],model_select:8,modul:[8,9,10,11],network:3,neural:[3,9,11],non_aggreg:11,orient:[2,4],packag:[8,9,10,11],perf:2,plot:[5,8],preprocess:10,process:0,protocol:1,quanet:3,quantif:[2,3,4,5],quapi:[6,7,8,9,10,11],quick:6,reader:10,readm:[],requir:2,review:0,s:6,select:4,sentiment:0,start:[],submodul:[8,9,10,11],subpackag:8,svm:2,svmperf:9,tabl:6,target:4,test:[],test_bas:[],test_dataset:[],test_method:[],titl:[],twitter:0,uci:0,util:8,variant:3,welcom:6,y:3}})
\ No newline at end of file
diff --git a/quapy/method/aggregative.py b/quapy/method/aggregative.py
index 3ccf607..c0280a2 100644
--- a/quapy/method/aggregative.py
+++ b/quapy/method/aggregative.py
@@ -23,46 +23,109 @@ from quapy.method.base import BaseQuantifier, BinaryQuantifier
class AggregativeQuantifier(BaseQuantifier):
"""
Abstract class for quantification methods that base their estimations on the aggregation of classification
- results. Aggregative Quantifiers thus implement a _classify_ method and maintain a _learner_ attribute.
+ results. Aggregative Quantifiers thus implement a :meth:`classify` method and maintain a :attr:`learner` attribute.
+ Subclasses of this abstract class must implement the method :meth:`aggregate` which computes the aggregation
+ of label predictions. The method :meth:`quantify` comes with a default implementation based on
+ :meth:`classify` and :meth:`aggregate`.
"""
@abstractmethod
- def fit(self, data: LabelledCollection, fit_learner=True): ...
+ def fit(self, data: LabelledCollection, fit_learner=True):
+ """
+ Trains the aggregative quantifier
+
+ :param data: a :class:`quapy.data.base.LabelledCollection` consisting of the training data
+ :param fit_learner: whether or not to train the learner (default is True). Set to False if the
+ learner has been trained outside the quantifier.
+ :return: self
+ """
+ ...
@property
def learner(self):
+ """
+ Gives access to the classifier
+
+ :return: the classifier (typically an sklearn's Estimator)
+ """
return self.learner_
@learner.setter
- def learner(self, value):
- self.learner_ = value
+ def learner(self, classifier):
+ """
+ Setter for the classifier
+
+ :param classifier: the classifier
+ """
+ self.learner_ = classifier
def classify(self, instances):
+ """
+ Provides the label predictions for the given instances.
+
+ :param instances: array-like
+ :return: np.ndarray of shape `(n_instances,)` with label predictions
+ """
return self.learner.predict(instances)
def quantify(self, instances):
+ """
+ Generate class prevalence estimates for the sample's instances by aggregating the label predictions generated
+ by the classifier.
+
+ :param instances: array-like
+ :return: `np.ndarray` of shape `(self.n_classes_,)` with class prevalence estimates.
+ """
classif_predictions = self.classify(instances)
return self.aggregate(classif_predictions)
@abstractmethod
- def aggregate(self, classif_predictions: np.ndarray): ...
+ def aggregate(self, classif_predictions: np.ndarray):
+ """
+ Implements the aggregation of label predictions.
+
+ :param classif_predictions: `np.ndarray` of label predictions
+ :return: `np.ndarray` of shape `(self.n_classes_,)` with class prevalence estimates.
+ """
+ ...
def get_params(self, deep=True):
+ """
+ Return the current parameters of the quantifier.
+
+ :param deep: for compatibility with sklearn
+ :return: a dictionary of param-value pairs
+ """
+
return self.learner.get_params()
def set_params(self, **parameters):
+ """
+ Set the parameters of the quantifier.
+
+ :param parameters: dictionary of param-value pairs
+ """
+
self.learner.set_params(**parameters)
- @property
- def n_classes(self):
- return len(self.classes_)
-
@property
def classes_(self):
+ """
+ Class labels, in the same order in which class prevalence values are to be computed.
+ This default implementation actually returns the class labels of the learner.
+
+ :return: array-like
+ """
return self.learner.classes_
@property
def aggregative(self):
+ """
+ Returns True, indicating the quantifier is of type aggregative.
+
+ :return: True
+ """
+
return True
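Review note: the docstrings above describe `quantify` as `classify` followed by `aggregate`. A minimal sketch of that pattern follows. `ToyQuantifier` and `ThresholdLearner` are hypothetical stand-ins written for illustration, not QuaPy classes, and the aggregation shown is a plain classify-and-count.

```python
import numpy as np


class ToyQuantifier:
    """Hypothetical stand-in, not QuaPy's AggregativeQuantifier."""

    def __init__(self, learner):
        self.learner = learner

    def classify(self, instances):
        # Label predictions, shape (n_instances,)
        return self.learner.predict(instances)

    def aggregate(self, classif_predictions):
        # Classify-and-count style aggregation: fraction of predictions per class
        classes = self.learner.classes_
        counts = np.array([np.sum(classif_predictions == c) for c in classes])
        return counts / counts.sum()

    def quantify(self, instances):
        # Default implementation: classify, then aggregate
        return self.aggregate(self.classify(instances))


class ThresholdLearner:
    """Hypothetical already-fit classifier exposing predict and classes_."""
    classes_ = np.array([0, 1])

    def predict(self, instances):
        return (np.asarray(instances)[:, 0] > 0.5).astype(int)


toy = ToyQuantifier(ThresholdLearner())
prevalences = toy.quantify([[0.1], [0.7], [0.9], [0.2]])
```

Subclasses only need to override `aggregate` (and `fit`); the order of the returned prevalence values follows the learner's `classes_`.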
@@ -96,23 +159,24 @@ class AggregativeProbabilisticQuantifier(AggregativeQuantifier):
# Helper
# ------------------------------------
-def training_helper(learner,
- data: LabelledCollection,
- fit_learner: bool = True,
- ensure_probabilistic=False,
- val_split: Union[LabelledCollection, float] = None):
+def _training_helper(learner,
+ data: LabelledCollection,
+ fit_learner: bool = True,
+ ensure_probabilistic=False,
+ val_split: Union[LabelledCollection, float] = None):
"""
Training procedure common to all Aggregative Quantifiers.
+
:param learner: the learner to be fit
:param data: the data on which to fit the learner. If requested, the data will be split before fitting the learner.
:param fit_learner: whether or not to fit the learner (if False, then bypasses any action)
:param ensure_probabilistic: if True, guarantees that the resulting classifier implements predict_proba (if the
- learner is not probabilistic, then a CalibratedCV instance of it is trained)
+ learner is not probabilistic, then a CalibratedClassifierCV instance of it is trained)
:param val_split: if specified as a float, indicates the proportion of training instances that will define the
- validation split (e.g., 0.3 for using 30% of the training set as validation data); if specified as a
- LabelledCollection, represents the validation split itself
+ validation split (e.g., 0.3 for using 30% of the training set as validation data); if specified as a
+ LabelledCollection, represents the validation split itself
:return: the learner trained on the training set, and the unused data (a _LabelledCollection_ if train_val_split>0
- or None otherwise) to be used as a validation set for any subsequent parameter fitting
+ or None otherwise) to be used as a validation set for any subsequent parameter fitting
"""
if fit_learner:
if ensure_probabilistic:
@@ -154,8 +218,10 @@ def training_helper(learner,
# ------------------------------------
class CC(AggregativeQuantifier):
"""
- The most basic Quantification method. One that simply classifies all instances and countes how many have been
- attributed each of the classes in order to compute class prevalence estimates.
+ The most basic quantification method: it simply classifies all instances and counts how many have been
+ attributed to each of the classes in order to compute class prevalence estimates.
+
+ :param learner: a sklearn's Estimator that generates a classifier
"""
def __init__(self, learner: BaseEstimator):
@@ -163,19 +229,40 @@ class CC(AggregativeQuantifier):
def fit(self, data: LabelledCollection, fit_learner=True):
"""
- Trains the Classify & Count method unless _fit_learner_ is False, in which case it is assumed to be already fit.
- :param data: training data
+ Trains the Classify & Count method unless `fit_learner` is False, in which case, the classifier is assumed to
+ be already fit and there is nothing else to do.
+
+ :param data: a :class:`quapy.data.base.LabelledCollection` consisting of the training data
:param fit_learner: if False, the classifier is assumed to be fit
:return: self
"""
- self.learner, _ = training_helper(self.learner, data, fit_learner)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner)
return self
- def aggregate(self, classif_predictions):
+ def aggregate(self, classif_predictions: np.ndarray):
+ """
+ Computes class prevalence estimates by counting the prevalence of each of the predicted labels.
+
+ :param classif_predictions: array-like with label predictions
+ :return: `np.ndarray` of shape `(self.n_classes_,)` with class prevalence estimates.
+ """
return F.prevalence_from_labels(classif_predictions, self.classes_)
class ACC(AggregativeQuantifier):
+ """
+ `Adjusted Classify & Count `_,
+ the "adjusted" variant of :class:`CC`, that corrects the predictions of CC
+ according to the `misclassification rates`.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
self.learner = learner
@@ -183,13 +270,14 @@ class ACC(AggregativeQuantifier):
def fit(self, data: LabelledCollection, fit_learner=True, val_split: Union[float, int, LabelledCollection] = None):
"""
- Trains a ACC quantifier
+ Trains an ACC quantifier.
+
:param data: the training set
:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
- validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
- indicating the validation set itself, or an int indicating the number k of folds to be used in kFCV
- to estimate the parameters
+ validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
+ indicating the validation set itself, or an int indicating the number `k` of folds to be used in `k`-fold
+ cross validation to estimate the parameters
:return: self
"""
if val_split is None:
@@ -205,7 +293,7 @@ class ACC(AggregativeQuantifier):
pbar.set_description(f'{self.__class__.__name__} fitting fold {k}')
training = data.sampling_from_index(training_idx)
validation = data.sampling_from_index(validation_idx)
- learner, val_data = training_helper(self.learner, training, fit_learner, val_split=validation)
+ learner, val_data = _training_helper(self.learner, training, fit_learner, val_split=validation)
y_.append(learner.predict(val_data.instances))
y.append(val_data.labels)
@@ -214,10 +302,10 @@ class ACC(AggregativeQuantifier):
class_count = data.counts()
# fit the learner on all data
- self.learner, _ = training_helper(self.learner, data, fit_learner, val_split=None)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner, val_split=None)
else:
- self.learner, val_data = training_helper(self.learner, data, fit_learner, val_split=val_split)
+ self.learner, val_data = _training_helper(self.learner, data, fit_learner, val_split=val_split)
y_ = self.learner.predict(val_data.instances)
y = val_data.labels
class_count = val_data.counts()
@@ -239,7 +327,15 @@ class ACC(AggregativeQuantifier):
@classmethod
def solve_adjustment(cls, PteCondEstim, prevs_estim):
- # solve for the linear system Ax = B with A=PteCondEstim and B = prevs_estim
+ """
+ Solves the linear system :math:`Ax = B` with :math:`A` = `PteCondEstim` and :math:`B` = `prevs_estim`
+
+ :param PteCondEstim: a `np.ndarray` of shape `(n_classes,n_classes,)` with entry `(i,j)` being the estimate
+ of :math:`P(y_i|y_j)`, that is, the probability that an instance that belongs to :math:`y_j` ends up being
+ classified as belonging to :math:`y_i`
+ :param prevs_estim: a `np.ndarray` of shape `(n_classes,)` with the class prevalence estimates
+ :return: an adjusted `np.ndarray` of shape `(n_classes,)` with the corrected class prevalence estimates
+ """
A = PteCondEstim
B = prevs_estim
try:
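Review note: the `solve_adjustment` docstring describes solving :math:`Ax = B`, where column `j` of `A` holds the estimated distribution of predicted labels for instances of true class `j`. The sketch below illustrates that step with NumPy only. The function name, the clipping and renormalisation, and the fallback to the unadjusted estimates are assumptions made for illustration, not a copy of the method body.

```python
import numpy as np


def solve_adjustment_sketch(cond_estim, prevs_estim):
    """Hypothetical helper: solve A x = B for the corrected prevalences x."""
    A = np.asarray(cond_estim, dtype=float)
    B = np.asarray(prevs_estim, dtype=float)
    try:
        adjusted = np.linalg.solve(A, B)
    except np.linalg.LinAlgError:
        # Singular system: assumed fallback is to return the estimates unchanged
        return B
    # Assumed post-processing: map the solution back onto the probability simplex
    adjusted = np.clip(adjusted, 0, 1)
    total = adjusted.sum()
    return adjusted / total if total > 0 else B


# Entry (i, j) is P(predicted i | true j); each column sums to 1
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
true_prev = np.array([0.3, 0.7])
observed = A @ true_prev  # what plain classify-and-count would report
corrected = solve_adjustment_sketch(A, observed)
```

With a perfectly estimated `A`, the corrected vector recovers the true prevalence; in practice `A` is itself estimated on validation data, so the correction is only approximate.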
@@ -252,11 +348,18 @@ class ACC(AggregativeQuantifier):
class PCC(AggregativeProbabilisticQuantifier):
+ """
+ `Probabilistic Classify & Count `_,
+ the probabilistic variant of CC that relies on the posterior probabilities returned by a probabilistic classifier.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ """
+
def __init__(self, learner: BaseEstimator):
self.learner = learner
def fit(self, data: LabelledCollection, fit_learner=True):
- self.learner, _ = training_helper(self.learner, data, fit_learner, ensure_probabilistic=True)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner, ensure_probabilistic=True)
return self
def aggregate(self, classif_posteriors):
@@ -264,6 +367,18 @@ class PCC(AggregativeProbabilisticQuantifier):
class PACC(AggregativeProbabilisticQuantifier):
+ """
+ `Probabilistic Adjusted Classify & Count `_,
+ the probabilistic variant of ACC that relies on the posterior probabilities returned by a probabilistic classifier.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
self.learner = learner
@@ -271,7 +386,8 @@ class PACC(AggregativeProbabilisticQuantifier):
def fit(self, data: LabelledCollection, fit_learner=True, val_split: Union[float, int, LabelledCollection] = None):
"""
- Trains a PACC quantifier
+ Trains a PACC quantifier.
+
:param data: the training set
:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
@@ -294,7 +410,7 @@ class PACC(AggregativeProbabilisticQuantifier):
pbar.set_description(f'{self.__class__.__name__} fitting fold {k}')
training = data.sampling_from_index(training_idx)
validation = data.sampling_from_index(validation_idx)
- learner, val_data = training_helper(
+ learner, val_data = _training_helper(
self.learner, training, fit_learner, ensure_probabilistic=True, val_split=validation)
y_.append(learner.predict_proba(val_data.instances))
y.append(val_data.labels)
@@ -303,12 +419,12 @@ class PACC(AggregativeProbabilisticQuantifier):
y_ = np.vstack(y_)
# fit the learner on all data
- self.learner, _ = training_helper(self.learner, data, fit_learner, ensure_probabilistic=True,
- val_split=None)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner, ensure_probabilistic=True,
+ val_split=None)
classes = data.classes_
else:
- self.learner, val_data = training_helper(
+ self.learner, val_data = _training_helper(
self.learner, data, fit_learner, ensure_probabilistic=True, val_split=val_split)
y_ = self.learner.predict_proba(val_data.instances)
y = val_data.labels
@@ -337,10 +453,13 @@ class PACC(AggregativeProbabilisticQuantifier):
class EMQ(AggregativeProbabilisticQuantifier):
"""
- The method is described in:
- Saerens, M., Latinne, P., and Decaestecker, C. (2002).
- Adjusting the outputs of a classifier to new a priori probabilities: A simple procedure.
- Neural Computation, 14(1): 21–41.
+ `Expectation Maximization for Quantification `_ (EMQ),
+ aka `Saerens-Latinne-Decaestecker` (SLD) algorithm.
+ EMQ consists of using the well-known `Expectation Maximization algorithm` to iteratively update the posterior
+ probabilities generated by a probabilistic classifier and the class prevalence estimates obtained via
+ maximum-likelihood estimation, in a mutually recursive way, until convergence.
+
+ :param learner: a sklearn's Estimator that generates a classifier
"""
MAX_ITER = 1000
@@ -350,7 +469,7 @@ class EMQ(AggregativeProbabilisticQuantifier):
self.learner = learner
def fit(self, data: LabelledCollection, fit_learner=True):
- self.learner, _ = training_helper(self.learner, data, fit_learner, ensure_probabilistic=True)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner, ensure_probabilistic=True)
self.train_prevalence = F.prevalence_from_labels(data.labels, self.classes_)
return self
@@ -365,6 +484,17 @@ class EMQ(AggregativeProbabilisticQuantifier):
@classmethod
def EM(cls, tr_prev, posterior_probabilities, epsilon=EPSILON):
+ """
+ Computes the `Expectation Maximization` routine.
+
+ :param tr_prev: array-like, the training prevalence
+ :param posterior_probabilities: `np.ndarray` of shape `(n_instances, n_classes,)` with the
+ posterior probabilities
+ :param epsilon: float, the threshold that the difference between two consecutive iterations
+ must fall below before stopping the loop
+ :return: a tuple with the estimated prevalence values (shape `(n_classes,)`) and
+ the corrected posterior probabilities (shape `(n_instances, n_classes,)`)
+ """
Px = posterior_probabilities
Ptr = np.copy(tr_prev)
qs = np.copy(Ptr) # qs (the running estimate) is initialized as the training prevalence
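Review note: the `EM` docstring describes a mutually recursive update of posteriors and prevalence estimates. A minimal sketch of that loop, in the spirit of Saerens et al. (2002), is given below. The function name, the mean-absolute-difference stopping rule, and the default values of `epsilon` and `max_iter` are assumptions, not necessarily what the method body does.

```python
import numpy as np


def em_sketch(tr_prev, posteriors, epsilon=1e-4, max_iter=1000):
    """Hypothetical EM loop; returns (prevalence estimates, corrected posteriors)."""
    Ptr = np.asarray(tr_prev, dtype=float)
    Px = np.asarray(posteriors, dtype=float)
    qs = Ptr.copy()  # running estimate, initialised as the training prevalence
    ps = Px.copy()
    for _ in range(max_iter):
        # E-step: rescale the posteriors by the ratio of current to training priors
        ps_unnormalized = (qs / Ptr) * Px
        ps = ps_unnormalized / ps_unnormalized.sum(axis=1, keepdims=True)
        # M-step: the new prevalence estimate is the mean corrected posterior
        qs_new = ps.mean(axis=0)
        converged = np.abs(qs_new - qs).mean() < epsilon
        qs = qs_new
        if converged:
            break
    return qs, ps


posteriors = np.array([[0.9, 0.1],
                       [0.8, 0.2],
                       [0.7, 0.3],
                       [0.4, 0.6]])
qs, ps = em_sketch([0.5, 0.5], posteriors)
```

The sketch assumes every class has non-zero training prevalence; a zero entry in `tr_prev` would make the prior ratio undefined.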
@@ -393,9 +523,17 @@ class EMQ(AggregativeProbabilisticQuantifier):
class HDy(AggregativeProbabilisticQuantifier, BinaryQuantifier):
"""
- Implementation of the method based on the Hellinger Distance y (HDy) proposed by
- González-Castro, V., Alaiz-Rodrı́guez, R., and Alegre, E. (2013). Class distribution
- estimation based on the Hellinger distance. Information Sciences, 218:146–164.
+ `Hellinger Distance y `_ (HDy).
+ HDy is a probabilistic method for training binary quantifiers, that models quantification as the problem of
+ minimizing the divergence (in terms of the Hellinger Distance) between two cumulative distributions of posterior
+ probabilities returned by the classifier. One of the distributions is generated from the unlabelled examples and
+ the other is generated from a validation set. This latter distribution is defined as a mixture of the
+ class-conditional distributions of the posterior probabilities returned for the positive and negative validation
+ examples, respectively. The parameters of the mixture thus represent the estimates of the class prevalence values.
+
+ :param learner: a sklearn's Estimator that generates a binary classifier
+ :param val_split: a float in range (0,1) indicating the proportion of data to be used as a stratified held-out
+ validation distribution, or a :class:`quapy.data.base.LabelledCollection` (the split itself).
"""
def __init__(self, learner: BaseEstimator, val_split=0.4):
@@ -404,19 +542,20 @@ class HDy(AggregativeProbabilisticQuantifier, BinaryQuantifier):
def fit(self, data: LabelledCollection, fit_learner=True, val_split: Union[float, LabelledCollection] = None):
"""
- Trains a HDy quantifier
+ Trains a HDy quantifier.
+
:param data: the training set
:param fit_learner: set to False to bypass the training (the learner is assumed to be already fit)
:param val_split: either a float in (0,1) indicating the proportion of training instances to use for
- validation (e.g., 0.3 for using 30% of the training set as validation data), or a LabelledCollection
- indicating the validation set itself
+ validation (e.g., 0.3 for using 30% of the training set as validation data), or a
+ :class:`quapy.data.base.LabelledCollection` indicating the validation set itself
:return: self
"""
if val_split is None:
val_split = self.val_split
self._check_binary(data, self.__class__.__name__)
- self.learner, validation = training_helper(
+ self.learner, validation = _training_helper(
self.learner, data, fit_learner, ensure_probabilistic=True, val_split=val_split)
Px = self.posterior_probabilities(validation.instances)[:, 1] # takes only the P(y=+1|x)
self.Pxy1 = Px[validation.labels == self.learner.classes_[1]]
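Review note: the HDy docstring describes matching the distribution of test posteriors against a mixture of the two class-conditional validation distributions. The sketch below illustrates the idea with a single histogram resolution and a grid search over the mixture weight. The function names, the fixed bin count, and the candidate grid are simplifying assumptions; the actual implementation may differ (for example, by combining several bin counts).

```python
import numpy as np


def hellinger(p, q):
    # Hellinger distance between two discrete distributions (up to a constant factor)
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def hdy_sketch(pos_scores, neg_scores, test_scores, n_bins=10, n_candidates=101):
    """Hypothetical helper: estimate the positive prevalence of test_scores."""
    edges = np.linspace(0, 1, n_bins + 1)

    def histogram(scores):
        counts, _ = np.histogram(scores, bins=edges)
        return counts / counts.sum()

    h_pos, h_neg, h_test = map(histogram, (pos_scores, neg_scores, test_scores))
    candidates = np.linspace(0, 1, n_candidates)
    # Distance between each candidate mixture and the test distribution
    distances = [hellinger(a * h_pos + (1 - a) * h_neg, h_test) for a in candidates]
    return candidates[int(np.argmin(distances))]


pos_scores = np.linspace(0.6, 1.0, 100)  # P(y=+1|x) for positive validation examples
neg_scores = np.linspace(0.0, 0.4, 100)  # P(y=+1|x) for negative validation examples
# Synthetic test sample built as an exact 30/70 mixture of the two groups
test_scores = np.concatenate([np.tile(pos_scores, 3), np.tile(neg_scores, 7)])
estimate = hdy_sketch(pos_scores, neg_scores, test_scores)
```

The mixture weight that minimises the distance is taken as the estimated prevalence of the positive class.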
@@ -459,6 +598,19 @@ class HDy(AggregativeProbabilisticQuantifier, BinaryQuantifier):
class ELM(AggregativeQuantifier, BinaryQuantifier):
+ """
+ Class of Explicit Loss Minimization (ELM) quantifiers.
+ Quantifiers based on ELM represent a family of methods based on structured output learning;
+ these quantifiers rely on classifiers that have been optimized using a quantification-oriented loss
+ measure. This implementation relies on
+ `Joachims’ SVM perf `_ structured output
+ learning algorithm, which has to be installed and patched for the purpose (see this
+ `script `_).
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param loss: the loss to optimize (see :attr:`quapy.classification.svmperf.SVMperf.valid_losses`)
+ :param kwargs: rest of SVM perf's parameters
+ """
def __init__(self, svmperf_base=None, loss='01', **kwargs):
self.svmperf_base = svmperf_base if svmperf_base is not None else qp.environ['SVMPERF_HOME']
@@ -481,9 +633,15 @@ class ELM(AggregativeQuantifier, BinaryQuantifier):
class SVMQ(ELM):
"""
- Barranquero, J., Díez, J., and del Coz, J. J. (2015).
- Quantification-oriented learning based on reliable classifiers.
- Pattern Recognition, 48(2):591–604.
+ SVM(Q), which attempts to minimize the `Q` loss combining a classification-oriented loss and a
+ quantification-oriented loss, as proposed by
+ `Barranquero et al. 2015 `_.
+ Equivalent to:
+
+ >>> ELM(svmperf_base, loss='q', **kwargs)
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param kwargs: rest of SVM perf's parameters
"""
def __init__(self, svmperf_base=None, **kwargs):
@@ -492,9 +650,14 @@ class SVMQ(ELM):
class SVMKLD(ELM):
"""
- Esuli, A. and Sebastiani, F. (2015).
- Optimizing text quantifiers for multivariate loss functions.
- ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.
+ SVM(KLD), which attempts to minimize the Kullback-Leibler Divergence as proposed by
+ `Esuli et al. 2015 `_.
+ Equivalent to:
+
+ >>> ELM(svmperf_base, loss='kld', **kwargs)
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param kwargs: rest of SVM perf's parameters
"""
def __init__(self, svmperf_base=None, **kwargs):
@@ -503,9 +666,15 @@ class SVMKLD(ELM):
class SVMNKLD(ELM):
"""
- Esuli, A. and Sebastiani, F. (2015).
- Optimizing text quantifiers for multivariate loss functions.
- ACM Transactions on Knowledge Discovery and Data, 9(4):Article 27.
+ SVM(NKLD), which attempts to minimize a version of the Kullback-Leibler Divergence normalized
+ via the logistic function, as proposed by
+ `Esuli et al. 2015 `_.
+ Equivalent to:
+
+ >>> ELM(svmperf_base, loss='nkld', **kwargs)
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param kwargs: rest of SVM perf's parameters
"""
def __init__(self, svmperf_base=None, **kwargs):
@@ -513,25 +682,60 @@ class SVMNKLD(ELM):
class SVMAE(ELM):
+ """
+ SVM(AE), which attempts to minimize Absolute Error as first used by
+ `Moreo and Sebastiani, 2021 `_.
+ Equivalent to:
+
+ >>> ELM(svmperf_base, loss='mae', **kwargs)
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param kwargs: rest of SVM perf's parameters
+ """
+
def __init__(self, svmperf_base=None, **kwargs):
super(SVMAE, self).__init__(svmperf_base, loss='mae', **kwargs)
class SVMRAE(ELM):
+ """
+ SVM(RAE), which attempts to minimize Relative Absolute Error as first used by
+ `Moreo and Sebastiani, 2021 `_.
+ Equivalent to:
+
+ >>> ELM(svmperf_base, loss='mrae', **kwargs)
+
+ :param svmperf_base: path to the folder containing the binary files of `SVM perf`
+ :param kwargs: rest of SVM perf's parameters
+ """
+
def __init__(self, svmperf_base=None, **kwargs):
super(SVMRAE, self).__init__(svmperf_base, loss='mrae', **kwargs)
class ThresholdOptimization(AggregativeQuantifier, BinaryQuantifier):
+ """
+ Abstract class of Threshold Optimization variants for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_.
+ The goal is to bring improved stability to the denominator of the adjustment.
+ The different variants are based on different heuristics for choosing a decision threshold
+ that would allow for more true positives and many more false positives, on the grounds this
+ would deliver larger denominators.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
self.learner = learner
self.val_split = val_split
- @abstractmethod
- def optimize_threshold(self, y, probabilities):
- ...
-
def fit(self, data: LabelledCollection, fit_learner=True, val_split: Union[float, int, LabelledCollection] = None):
self._check_binary(data, "Threshold Optimization")
@@ -548,7 +752,7 @@ class ThresholdOptimization(AggregativeQuantifier, BinaryQuantifier):
pbar.set_description(f'{self.__class__.__name__} fitting fold {k}')
training = data.sampling_from_index(training_idx)
validation = data.sampling_from_index(validation_idx)
- learner, val_data = training_helper(self.learner, training, fit_learner, val_split=validation)
+ learner, val_data = _training_helper(self.learner, training, fit_learner, val_split=validation)
probabilities.append(learner.predict_proba(val_data.instances))
y.append(val_data.labels)
@@ -556,16 +760,16 @@ class ThresholdOptimization(AggregativeQuantifier, BinaryQuantifier):
probabilities = np.concatenate(probabilities)
# fit the learner on all data
- self.learner, _ = training_helper(self.learner, data, fit_learner, val_split=None)
+ self.learner, _ = _training_helper(self.learner, data, fit_learner, val_split=None)
else:
- self.learner, val_data = training_helper(self.learner, data, fit_learner, val_split=val_split)
+ self.learner, val_data = _training_helper(self.learner, data, fit_learner, val_split=val_split)
probabilities = self.learner.predict_proba(val_data.instances)
y = val_data.labels
self.cc = CC(self.learner)
- self.tpr, self.fpr = self.optimize_threshold(y, probabilities)
+ self.tpr, self.fpr = self._optimize_threshold(y, probabilities)
return self
@@ -573,20 +777,32 @@ class ThresholdOptimization(AggregativeQuantifier, BinaryQuantifier):
def _condition(self, tpr, fpr) -> float:
"""
Implements the criterion according to which the threshold should be selected.
- This function should return a (float) score to be minimized.
+ This function should return the (float) score to be minimized.
+
+ :param tpr: float, true positive rate
+ :param fpr: float, false positive rate
+ :return: float, a score for the given `tpr` and `fpr`
"""
...
- def optimize_threshold(self, y, probabilities):
+ def _optimize_threshold(self, y, probabilities):
+ """
+ Searches for the best `tpr` and `fpr` according to the score obtained at different
+ decision thresholds. The scoring function is implemented in the method `_condition`.
+
+ :param y: true labels for the validation set (or for the training set via `k`-fold cross validation)
+ :param probabilities: array-like with the posterior probabilities
+ :return: best `tpr` and `fpr` according to `_condition`
+ """
best_candidate_threshold_score = None
best_tpr = 0
best_fpr = 0
candidate_thresholds = np.unique(probabilities[:, 1])
for candidate_threshold in candidate_thresholds:
y_ = [self.classes_[1] if p > candidate_threshold else self.classes_[0] for p in probabilities[:, 1]]
- TP, FP, FN, TN = self.compute_table(y, y_)
- tpr = self.compute_tpr(TP, FP)
- fpr = self.compute_fpr(FP, TN)
+ TP, FP, FN, TN = self._compute_table(y, y_)
+ tpr = self._compute_tpr(TP, FN)
+ fpr = self._compute_fpr(FP, TN)
condition_score = self._condition(tpr, fpr)
if best_candidate_threshold_score is None or condition_score < best_candidate_threshold_score:
best_candidate_threshold_score = condition_score
@@ -603,25 +819,40 @@ class ThresholdOptimization(AggregativeQuantifier, BinaryQuantifier):
adjusted_prevs_estim = np.array((1 - adjusted_prevs_estim, adjusted_prevs_estim))
return adjusted_prevs_estim
- def compute_table(self, y, y_):
+ def _compute_table(self, y, y_):
TP = np.logical_and(y == y_, y == self.classes_[1]).sum()
FP = np.logical_and(y != y_, y == self.classes_[0]).sum()
FN = np.logical_and(y != y_, y == self.classes_[1]).sum()
TN = np.logical_and(y == y_, y == self.classes_[0]).sum()
return TP, FP, FN, TN
- def compute_tpr(self, TP, FP):
+ def _compute_tpr(self, TP, FN):
if TP + FN == 0:
return 0
return TP / (TP + FN)
- def compute_fpr(self, FP, TN):
+ def _compute_fpr(self, FP, TN):
if FP + TN == 0:
return 0
return FP / (FP + TN)
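The threshold sweep described above can be illustrated with a minimal, self-contained sketch in plain Python. It is not the library's implementation: the function names, the `CONDITIONS` table, and the toy data are hypothetical, and the three scoring functions are assumptions derived only from the docstrings of the variants (`tpr` closest to 0.5, maximal `tpr-fpr`, and `tpr=1-fpr`). The true positive rate is taken here as TP / (TP + FN).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """tpr = TP / (TP + FN) and fpr = FP / (FP + TN); 0.0 when undefined."""
    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    return tpr, fpr


# Hypothetical scoring functions (lower is better), one per variant.
CONDITIONS = {
    't50': lambda tpr, fpr: abs(tpr - 0.5),        # tpr closest to 0.5
    'max': lambda tpr, fpr: -(tpr - fpr),          # maximize tpr - fpr
    'x': lambda tpr, fpr: abs(1.0 - (tpr + fpr)),  # tpr closest to 1 - fpr
}


def optimize_threshold(y_true, pos_scores, condition):
    """Try every observed score as a threshold; keep the first best one."""
    best = None
    for thr in sorted(set(pos_scores)):
        y_pred = [1 if s > thr else 0 for s in pos_scores]
        tpr, fpr = rates(*confusion_counts(y_true, y_pred))
        score = condition(tpr, fpr)
        if best is None or score < best[0]:
            best = (score, thr, tpr, fpr)
    return best[1:]  # (threshold, tpr, fpr)


# Toy validation data: true labels and posterior scores for the positive class.
Y_TRUE = [0, 0, 0, 1, 1, 1, 1, 0]
SCORES = [0.1, 0.2, 0.6, 0.4, 0.7, 0.8, 0.9, 0.3]

for name, condition in CONDITIONS.items():
    print(name, optimize_threshold(Y_TRUE, SCORES, condition))
```

Ties are resolved in favour of the lowest threshold because the comparison is strict; this is a choice made for the sketch, not a documented property of the methods.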
class T50(ThresholdOptimization):
+ """
+ Threshold Optimization variant for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_ that looks
+ for the threshold that makes `tpr` closest to 0.5.
+ The goal is to bring improved stability to the denominator of the adjustment.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
super().__init__(learner, val_split)
@@ -631,6 +862,21 @@ class T50(ThresholdOptimization):
class MAX(ThresholdOptimization):
+ """
+ Threshold Optimization variant for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_ that looks
+ for the threshold that maximizes `tpr-fpr`.
+ The goal is to bring improved stability to the denominator of the adjustment.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
super().__init__(learner, val_split)
@@ -641,6 +887,21 @@ class MAX(ThresholdOptimization):
class X(ThresholdOptimization):
+ """
+ Threshold Optimization variant for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_ that looks
+ for the threshold that yields `tpr=1-fpr`.
+ The goal is to bring improved stability to the denominator of the adjustment.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
super().__init__(learner, val_split)
@@ -650,41 +911,70 @@ class X(ThresholdOptimization):
class MS(ThresholdOptimization):
+ """
+ Median Sweep. Threshold Optimization variant for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_ that generates
+ class prevalence estimates for all decision thresholds and returns the median of them all.
+ The goal is to bring improved stability to the denominator of the adjustment.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
super().__init__(learner, val_split)
def _condition(self, tpr, fpr) -> float:
pass
- def optimize_threshold(self, y, probabilities):
+ def _optimize_threshold(self, y, probabilities):
tprs = []
fprs = []
candidate_thresholds = np.unique(probabilities[:, 1])
for candidate_threshold in candidate_thresholds:
y_ = [self.classes_[1] if p > candidate_threshold else self.classes_[0] for p in probabilities[:, 1]]
- TP, FP, FN, TN = self.compute_table(y, y_)
- tpr = self.compute_tpr(TP, FP)
- fpr = self.compute_fpr(FP, TN)
+ TP, FP, FN, TN = self._compute_table(y, y_)
+ tpr = self._compute_tpr(TP, FN)
+ fpr = self._compute_fpr(FP, TN)
tprs.append(tpr)
fprs.append(fpr)
return np.median(tprs), np.median(fprs)
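As a point of comparison, the Median Sweep docstring speaks of generating one class prevalence estimate per decision threshold and taking the median of those estimates. The sketch below follows that description rather than the code shown here (which takes the medians of the rates). It assumes the usual adjusted-count correction, (cc - fpr) / (tpr - fpr), clipped to [0, 1]; all function names and the toy data are hypothetical.

```python
from statistics import median


def adjusted_prevalence(cc_estimate, tpr, fpr):
    """Adjusted count: (cc - fpr) / (tpr - fpr), clipped to [0, 1].

    Assumption of this sketch: when tpr == fpr the correction is undefined
    and the unadjusted estimate is returned.
    """
    denominator = tpr - fpr
    if denominator == 0:
        return cc_estimate
    return min(1.0, max(0.0, (cc_estimate - fpr) / denominator))


def median_sweep(val_labels, val_scores, test_scores):
    """Median of the adjusted estimates obtained at every candidate threshold."""
    pos = [s for s, y in zip(val_scores, val_labels) if y == 1]
    neg = [s for s, y in zip(val_scores, val_labels) if y == 0]
    if not pos or not neg:
        raise ValueError('validation data must contain both classes')
    estimates = []
    for thr in sorted(set(val_scores)):
        tpr = sum(s > thr for s in pos) / len(pos)
        fpr = sum(s > thr for s in neg) / len(neg)
        if tpr == fpr:
            continue  # no usable correction at this threshold
        cc = sum(s > thr for s in test_scores) / len(test_scores)
        estimates.append(adjusted_prevalence(cc, tpr, fpr))
    if not estimates:
        raise ValueError('no threshold yields tpr != fpr')
    return median(estimates)


VAL_LABELS = [0, 0, 0, 1, 1, 1, 1, 0]
VAL_SCORES = [0.1, 0.2, 0.6, 0.4, 0.7, 0.8, 0.9, 0.3]

# Using the validation scores as the "test" sample, whose true positive
# prevalence is 4/8, every threshold recovers the same adjusted estimate.
print(median_sweep(VAL_LABELS, VAL_SCORES, VAL_SCORES))
```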
class MS2(MS):
+ """
+ Median Sweep 2. Threshold Optimization variant for :class:`ACC` as proposed by
+ `Forman 2006 `_ and
+ `Forman 2008 `_ that generates
+ class prevalence estimates for all decision thresholds and returns the median of the estimates for the cases in
+ which `tpr-fpr>0.25`.
+ The goal is to bring improved stability to the denominator of the adjustment.
+
+ :param learner: a sklearn's Estimator that generates a classifier
+ :param val_split: indicates the proportion of data to be used as a stratified held-out validation set in which the
+ misclassification rates are to be estimated.
+ This parameter can be indicated as a real value (between 0 and 1, default 0.4), representing a proportion of
+ validation data, or as an integer, indicating that the misclassification rates should be estimated via
+ `k`-fold cross validation (this integer stands for the number of folds `k`), or as a
+ :class:`quapy.data.base.LabelledCollection` (the split itself).
+ """
def __init__(self, learner: BaseEstimator, val_split=0.4):
super().__init__(learner, val_split)
- def optimize_threshold(self, y, probabilities):
+ def _optimize_threshold(self, y, probabilities):
tprs = [0, 1]
fprs = [0, 1]
candidate_thresholds = np.unique(probabilities[:, 1])
for candidate_threshold in candidate_thresholds:
y_ = [self.classes_[1] if p > candidate_threshold else self.classes_[0] for p in probabilities[:, 1]]
- TP, FP, FN, TN = self.compute_table(y, y_)
- tpr = self.compute_tpr(TP, FP)
- fpr = self.compute_fpr(FP, TN)
+ TP, FP, FN, TN = self._compute_table(y, y_)
+ tpr = self._compute_tpr(TP, FN)
+ fpr = self._compute_fpr(FP, TN)
if (tpr - fpr) > 0.25:
tprs.append(tpr)
fprs.append(fpr)
@@ -696,6 +986,7 @@ AdjustedClassifyAndCount = ACC
ProbabilisticClassifyAndCount = PCC
ProbabilisticAdjustedClassifyAndCount = PACC
ExpectationMaximizationQuantifier = EMQ
+SLD = EMQ
HellingerDistanceY = HDy
ExplicitLossMinimisation = ELM
MedianSweep = MS
@@ -704,11 +995,14 @@ MedianSweep2 = MS2
class OneVsAll(AggregativeQuantifier):
"""
- Allows any binary quantifier to perform quantification on single-label datasets. The method maintains one binary
- quantifier for each class, and then l1-normalizes the outputs so that the class prevelences sum up to 1.
- This variant was used, along with the ExplicitLossMinimization quantifier in
- Gao, W., Sebastiani, F.: From classification to quantification in tweet sentiment analysis.
- Social Network Analysis and Mining 6(19), 1–22 (2016)
+ Allows any binary quantifier to perform quantification on single-label datasets.
+ The method maintains one binary quantifier for each class, and then l1-normalizes the outputs so that the
+ class prevalences sum up to 1.
+ This variant was used, along with the :class:`EMQ` quantifier, in
+ `Gao and Sebastiani, 2016 `_.
+
+ :param binary_quantifier: the binary quantifier to be applied to each class in a one-vs-all fashion
+ :param n_jobs: number of parallel workers
"""
def __init__(self, binary_quantifier, n_jobs=-1):
@@ -727,18 +1021,30 @@ class OneVsAll(AggregativeQuantifier):
return self
def classify(self, instances):
- # returns a matrix of shape (n,m) with n the number of instances and m the number of classes. The entry
- # (i,j) is a binary value indicating whether instance i belongs to class j. The binary classifications are
- # independent of each other, meaning that an instance can end up be attributed to 0, 1, or more classes.
+ """
+ Returns a matrix of shape `(n,m)` with `n` the number of instances and `m` the number of classes. The entry
+ `(i,j)` is a binary value indicating whether instance `i` belongs to class `j`. The binary classifications are
+ independent of each other, meaning that an instance can end up being attributed to 0, 1, or more classes.
+
+ :param instances: array-like
+ :return: `np.ndarray`
+ """
+
classif_predictions_bin = self.__parallel(self._delayed_binary_classification, instances)
return classif_predictions_bin.T
def posterior_probabilities(self, instances):
- # returns a matrix of shape (n,m,2) with n the number of instances and m the number of classes. The entry
- # (i,j,1) (resp. (i,j,0)) is a value in [0,1] indicating the posterior probability that instance i belongs
- # (resp. does not belong) to class j.
- # The posterior probabilities are independent of each other, meaning that, in general, they do not sum
- # up to one.
+ """
+ Returns a matrix of shape `(n,m,2)` with `n` the number of instances and `m` the number of classes. The entry
+ `(i,j,1)` (resp. `(i,j,0)`) is a value in [0,1] indicating the posterior probability that instance `i` belongs
+ (resp. does not belong) to class `j`.
+ The posterior probabilities are independent of each other, meaning that, in general, they do not sum
+ up to one.
+
+ :param instances: array-like
+ :return: `np.ndarray`
+ """
+
if not self.binary_quantifier.probabilistic:
raise NotImplementedError(f'{self.__class__.__name__} does not implement posterior_probabilities because '
f'the base quantifier {self.binary_quantifier.__class__.__name__} is not '
@@ -800,8 +1106,19 @@ class OneVsAll(AggregativeQuantifier):
@property
def binary(self):
+ """
+ Indicates that the quantifier is not binary.
+
+ :return: False
+ """
return False
@property
def probabilistic(self):
+ """
+ Indicates whether the quantifier is probabilistic or not (depending on the nature of the base quantifier).
+
+ :return: boolean
+ """
+
return self.binary_quantifier.probabilistic
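The one-vs-all aggregation described in the `OneVsAll` docstring (one binary quantifier per class, followed by l1-normalization) can be sketched in a few lines. This is a toy illustration only: `one_vs_all_prevalences` is a hypothetical helper, its input stands for the positive-class prevalence estimated independently by each binary quantifier, and the uniform fallback for an all-zero input is an assumption of the sketch.

```python
def one_vs_all_prevalences(binary_estimates):
    """l1-normalize independent per-class estimates so that they sum to 1.

    :param binary_estimates: dict mapping each class to the prevalence of the
        positive class estimated by that class's binary quantifier
    :return: dict mapping each class to its normalized prevalence
    """
    total = sum(binary_estimates.values())
    if total == 0:
        # Assumption of this sketch: fall back to a uniform distribution.
        uniform = 1.0 / len(binary_estimates)
        return {c: uniform for c in binary_estimates}
    return {c: value / total for c, value in binary_estimates.items()}


# The three binary estimates sum to 1.2, so they need to be rescaled.
raw = {'negative': 0.6, 'neutral': 0.3, 'positive': 0.3}
print(one_vs_all_prevalences(raw))
```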
diff --git a/quapy/method/base.py b/quapy/method/base.py
index 64fdff4..4a4962a 100644
--- a/quapy/method/base.py
+++ b/quapy/method/base.py
@@ -6,39 +6,107 @@ from quapy.data import LabelledCollection
# Base Quantifier abstract class
# ------------------------------------
class BaseQuantifier(metaclass=ABCMeta):
+ """
+ Abstract Quantifier. A quantifier is defined as an object of a class that implements the method :meth:`fit` on
+ :class:`quapy.data.base.LabelledCollection`, the method :meth:`quantify`, and the methods :meth:`set_params` and
+ :meth:`get_params` for model selection (see :meth:`quapy.model_selection.GridSearchQ`).
+ """
@abstractmethod
- def fit(self, data: LabelledCollection): ...
+ def fit(self, data: LabelledCollection):
+ """
+ Trains a quantifier.
+
+ :param data: a :class:`quapy.data.base.LabelledCollection` consisting of the training data
+ :return: self
+ """
+ ...
@abstractmethod
- def quantify(self, instances): ...
+ def quantify(self, instances):
+ """
+ Generate class prevalence estimates for the sample's instances
+
+ :param instances: array-like
+ :return: `np.ndarray` of shape `(self.n_classes,)` with class prevalence estimates.
+ """
+ ...
@abstractmethod
- def set_params(self, **parameters): ...
+ def set_params(self, **parameters):
+ """
+ Set the parameters of the quantifier.
+
+ :param parameters: dictionary of param-value pairs
+ """
+ ...
@abstractmethod
- def get_params(self, deep=True): ...
+ def get_params(self, deep=True):
+ """
+ Return the current parameters of the quantifier.
+
+ :param deep: for compatibility with sklearn
+ :return: a dictionary of param-value pairs
+ """
+ ...
@property
@abstractmethod
- def classes_(self): ...
+ def classes_(self):
+ """
+ Class labels, in the same order in which class prevalence values are to be computed.
+
+ :return: array-like
+ """
+ ...
+
+ @property
+ def n_classes(self):
+ """
+ Returns the number of classes
+
+ :return: integer
+ """
+ return len(self.classes_)
# these methods allow meta-learners to reimplement the decision based on their constituents, and not
# based on class structure
@property
def binary(self):
+ """
+ Indicates whether the quantifier is binary or not.
+
+ :return: False (to be overridden)
+ """
return False
@property
def aggregative(self):
+ """
+ Indicates whether the quantifier is of type aggregative or not
+
+ :return: False (to be overridden)
+ """
+
return False
@property
def probabilistic(self):
+ """
+ Indicates whether the quantifier is of type probabilistic or not
+
+ :return: False (to be overridden)
+ """
+
return False
class BinaryQuantifier(BaseQuantifier):
+ """
+ Abstract class of binary quantifiers, i.e., quantifiers estimating class prevalence values for only two classes
+ (typically, to be interpreted as one class and its complement).
+ """
def _check_binary(self, data: LabelledCollection, quantifier_name):
assert data.binary, f'{quantifier_name} works only on problems of binary classification. ' \
@@ -46,18 +114,43 @@ class BinaryQuantifier(BaseQuantifier):
@property
def binary(self):
+ """
+ Informs that the quantifier is binary
+
+ :return: True
+ """
return True
def isbinary(model:BaseQuantifier):
+ """
+ Alias for property `binary`
+
+ :param model: the model
+ :return: True if the model is binary, False otherwise
+ """
return model.binary
def isaggregative(model:BaseQuantifier):
+ """
+ Alias for property `aggregative`
+
+ :param model: the model
+ :return: True if the model is aggregative, False otherwise
+ """
+
return model.aggregative
def isprobabilistic(model:BaseQuantifier):
+ """
+ Alias for property `probabilistic`
+
+ :param model: the model
+ :return: True if the model is probabilistic, False otherwise
+ """
+
return model.probabilistic
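To make the contract documented above concrete, here is a minimal sketch of a class that follows the same interface (`fit`, `quantify`, `set_params`, `get_params`, `classes_`, plus the `binary` property). It deliberately does not import QuaPy: `QuantifierSketch` and `ThresholdCC` are hypothetical stand-ins, and `fit` takes plain lists of instances and labels instead of a `LabelledCollection`, which is a simplification made for this example.

```python
from abc import ABC, abstractmethod


class QuantifierSketch(ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    @abstractmethod
    def fit(self, instances, labels): ...

    @abstractmethod
    def quantify(self, instances): ...

    @abstractmethod
    def set_params(self, **parameters): ...

    @abstractmethod
    def get_params(self, deep=True): ...

    @property
    @abstractmethod
    def classes_(self): ...

    @property
    def n_classes(self):
        return len(self.classes_)

    @property
    def binary(self):
        return False  # to be overridden by binary quantifiers


class ThresholdCC(QuantifierSketch):
    """Toy classify-and-count: an instance is positive if it exceeds a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self._classes = None

    def fit(self, instances, labels):
        self._classes = sorted(set(labels))  # nothing else to learn in this toy
        return self

    def quantify(self, instances):
        positive = sum(x > self.threshold for x in instances) / len(instances)
        return [1 - positive, positive]  # same order as classes_

    def set_params(self, **parameters):
        for name, value in parameters.items():
            setattr(self, name, value)

    def get_params(self, deep=True):
        return {'threshold': self.threshold}

    @property
    def classes_(self):
        return self._classes

    @property
    def binary(self):
        return True


def isbinary(model):
    """Function-style alias for the `binary` property."""
    return model.binary


quantifier = ThresholdCC().fit([0.1, 0.9], [0, 1])
print(quantifier.quantify([0.2, 0.7, 0.8, 0.9]), isbinary(quantifier))
```

Exposing `binary` as a property rather than relying on `isinstance` checks is what lets a wrapper answer on behalf of the quantifiers it contains.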
diff --git a/quapy/method/meta.py b/quapy/method/meta.py
index fc3efe3..7f7cba8 100644
--- a/quapy/method/meta.py
+++ b/quapy/method/meta.py
@@ -1,6 +1,5 @@
from copy import deepcopy
from typing import Union
-
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer, accuracy_score
@@ -30,14 +29,40 @@ class Ensemble(BaseQuantifier):
VALID_POLICIES = {'ave', 'ptr', 'ds'} | qp.error.QUANTIFICATION_ERROR_NAMES
"""
- Methods from the articles:
- Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
- Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
- Information Fusion, 34, 87-100.
+ Implementation of the Ensemble methods for quantification described by
+ `Pérez-Gállego et al., 2017 `_
and
- Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019).
- Dynamic ensemble selection for quantification tasks.
- Information Fusion, 45, 1-15.
+ `Pérez-Gállego et al., 2019 `_.
+ The policies implemented include:
+
+ - Average (`policy='ave'`): computes class prevalence estimates as the average of the estimates
+ returned by the base quantifiers.
+ - Training Prevalence (`policy='ptr'`): applies a dynamic selection to the ensemble’s members by retaining only
+ those members such that the class prevalence values in the samples they use as training set are closest to
+ preliminary class prevalence estimates computed as the average of the estimates of all the members. The final
+ estimate is recomputed by considering only the selected members.
+ - Distribution Similarity (`policy='ds'`): performs a dynamic selection of base members by retaining
+ the members trained on samples whose distribution of posterior probabilities is closest, in terms of the
+ Hellinger Distance, to the distribution of posterior probabilities in the test sample
+ - Accuracy (`policy=''`): performs a static selection of the ensemble members by
+ retaining those that minimize a quantification error measure, which is passed as an argument.
+
+ Example:
+
+ >>> model = Ensemble(quantifier=ACC(LogisticRegression()), size=30, policy='ave', n_jobs=-1)
+
+ :param quantifier: base quantification member of the ensemble
+ :param size: number of members
+ :param red_size: number of members to retain after selection (depending on the policy)
+ :param min_pos: minimum number of positive instances to consider a sample as valid
+ :param policy: the selection policy; available policies include: `ave` (default), `ptr`, `ds`, and accuracy
+ (which is instantiated via a valid error name, e.g., `mae`)
+ :param max_sample_size: maximum number of instances to consider in the samples (set to None
+ to indicate no limit, default)
+ :param val_split: a float in range (0,1) indicating the proportion of data to be used as a stratified held-out
+ validation split, or a :class:`quapy.data.base.LabelledCollection` (the split itself).
+ :param n_jobs: number of parallel workers (default 1)
+ :param verbose: set to True (default is False) to get some information in standard output
"""
def __init__(self,
@@ -47,7 +72,7 @@ class Ensemble(BaseQuantifier):
min_pos=5,
policy='ave',
max_sample_size=None,
- val_split=None,
+ val_split:Union[qp.data.LabelledCollection, float]=None,
n_jobs=1,
verbose=False):
assert policy in Ensemble.VALID_POLICIES, \
@@ -65,12 +90,12 @@ class Ensemble(BaseQuantifier):
self.verbose = verbose
self.max_sample_size = max_sample_size
- def sout(self, msg):
+ def _sout(self, msg):
if self.verbose:
print('[Ensemble]' + msg)
def fit(self, data: qp.data.LabelledCollection, val_split: Union[qp.data.LabelledCollection, float] = None):
- self.sout('Fit')
+ self._sout('Fit')
if self.policy == 'ds' and not data.binary:
raise ValueError(f'ds policy is only defined for binary quantification, but this dataset is not binary')
if val_split is None:
@@ -84,7 +109,7 @@ class Ensemble(BaseQuantifier):
posteriors = None
if self.policy == 'ds':
# precompute the training posterior probabilities
- posteriors, self.post_proba_fn = self.ds_policy_get_posteriors(data)
+ posteriors, self.post_proba_fn = self._ds_policy_get_posteriors(data)
is_static_policy = (self.policy in qp.error.QUANTIFICATION_ERROR_NAMES)
@@ -99,9 +124,9 @@ class Ensemble(BaseQuantifier):
# static selection policy (the name of a quantification-oriented error function to minimize)
if self.policy in qp.error.QUANTIFICATION_ERROR_NAMES:
- self.accuracy_policy(error_name=self.policy)
+ self._accuracy_policy(error_name=self.policy)
- self.sout('Fit [Done]')
+ self._sout('Fit [Done]')
return self
def quantify(self, instances):
@@ -110,23 +135,42 @@ class Ensemble(BaseQuantifier):
)
if self.policy == 'ptr':
- predictions = self.ptr_policy(predictions)
+ predictions = self._ptr_policy(predictions)
elif self.policy == 'ds':
- predictions = self.ds_policy(predictions, instances)
+ predictions = self._ds_policy(predictions, instances)
predictions = np.mean(predictions, axis=0)
return F.normalize_prevalence(predictions)
def set_params(self, **parameters):
+ """
+ This function should not be used within :class:`quapy.model_selection.GridSearchQ` (it is here only for
+ compatibility with the abstract class).
+ Instead, use `Ensemble(GridSearchQ(q),...)`, with `q` a Quantifier (recommended), or
+ `Ensemble(Q(GridSearchCV(l)))` with `Q` a quantifier class that has a learner `l` optimized for
+ classification (not recommended).
+
+ :param parameters: dictionary
+ :return: raises an Exception
+ """
raise NotImplementedError(f'{self.__class__.__name__} should not be used within GridSearchQ; '
f'instead, use Ensemble(GridSearchQ(q),...), with q a Quantifier (recommended), '
f'or Ensemble(Q(GridSearchCV(l))) with Q a quantifier class that has a learner '
f'l optimized for classification (not recommended).')
def get_params(self, deep=True):
+ """
+ This function should not be used within :class:`quapy.model_selection.GridSearchQ` (it is here only for
+ compatibility with the abstract class).
+ Instead, use `Ensemble(GridSearchQ(q),...)`, with `q` a Quantifier (recommended), or
+ `Ensemble(Q(GridSearchCV(l)))` with `Q` a quantifier class that has a learner `l` optimized for
+ classification (not recommended).
+
+ :return: raises an Exception
+ """
raise NotImplementedError()
- def accuracy_policy(self, error_name):
+ def _accuracy_policy(self, error_name):
"""
Selects the red_size best-performing quantifiers in a static way (i.e., dropping all non-selected instances).
For each model in the ensemble, the performance is measured in terms of _error_name_ on the quantification of
@@ -141,7 +185,7 @@ class Ensemble(BaseQuantifier):
self.ensemble = _select_k(self.ensemble, order, k=self.red_size)
- def ptr_policy(self, predictions):
+ def _ptr_policy(self, predictions):
"""
Selects the predictions made by models that have been trained on samples with a prevalence that is most similar
to a first approximation of the test prevalence as made by all models in the ensemble.
@@ -152,7 +196,7 @@ class Ensemble(BaseQuantifier):
order = np.argsort(ptr_differences)
return _select_k(predictions, order, k=self.red_size)
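The dynamic selection performed by the `ptr` policy can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the method's code: `ptr_select` is a hypothetical helper, prevalences are plain lists, and the distance between a member's training prevalence and the preliminary estimate is taken to be the squared Euclidean difference (the diff does not show how `ptr_differences` is computed).

```python
def ptr_select(predictions, training_prevalences, red_size):
    """Keep the red_size members trained at prevalences closest to the average estimate.

    :param predictions: one prevalence vector per ensemble member (test estimates)
    :param training_prevalences: prevalence of each member's training sample
    :param red_size: number of members to retain
    :return: (indices of the retained members, average of their predictions)
    """
    n_members = len(predictions)
    n_classes = len(predictions[0])
    # Preliminary estimate: the average of the estimates of all members.
    preliminary = [sum(p[c] for p in predictions) / n_members for c in range(n_classes)]
    # Assumed distance: squared difference between training prevalence and estimate.
    differences = [
        sum((tr[c] - preliminary[c]) ** 2 for c in range(n_classes))
        for tr in training_prevalences
    ]
    order = sorted(range(n_members), key=lambda i: differences[i])
    selected = order[:red_size]
    final = [sum(predictions[i][c] for i in selected) / len(selected) for c in range(n_classes)]
    return selected, final


predictions = [[0.8, 0.2], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]]
training_prevalences = [[0.9, 0.1], [0.55, 0.45], [0.5, 0.5], [0.1, 0.9]]
print(ptr_select(predictions, training_prevalences, red_size=2))
```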
- def ds_policy_get_posteriors(self, data: LabelledCollection):
+ def _ds_policy_get_posteriors(self, data: LabelledCollection):
"""
In the original article, this procedure is not described in a sufficient level of detail. The paper only says
that the distribution of posterior probabilities from training and test examples is compared by means of the
@@ -182,7 +226,7 @@ class Ensemble(BaseQuantifier):
return posteriors, posteriors_generator
- def ds_policy(self, predictions, test):
+ def _ds_policy(self, predictions, test):
test_posteriors = self.post_proba_fn(test)
test_distribution = get_probability_distribution(test_posteriors)
tr_distributions = [m[2] for m in self.ensemble]
@@ -196,18 +240,40 @@ class Ensemble(BaseQuantifier):
@property
def binary(self):
+ """
+ Returns a boolean indicating whether the base quantifiers are binary or not
+
+ :return: boolean
+ """
return self.base_quantifier.binary
@property
def aggregative(self):
+ """
+ Indicates that the quantifier is not aggregative.
+
+ :return: False
+ """
return False
@property
def probabilistic(self):
+ """
+ Indicates that the quantifier is not probabilistic.
+
+ :return: False
+ """
return False
def get_probability_distribution(posterior_probabilities, bins=8):
+ """
+ Gets a histogram out of the posterior probabilities (only for the binary case).
+
+ :param posterior_probabilities: array-like of shape `(n_instances, 2)`
+ :param bins: integer
+ :return: `np.ndarray` with the relative frequencies for each bin (for the positive class only)
+ """
assert posterior_probabilities.shape[1] == 2, 'the posterior probabilities do not seem to be for a binary problem'
posterior_probabilities = posterior_probabilities[:, 1] # take the positive posteriors only
distribution, _ = np.histogram(posterior_probabilities, bins=bins, range=(0, 1), density=True)
@@ -306,6 +372,23 @@ def _check_error(error):
def ensembleFactory(learner, base_quantifier_class, param_grid=None, optim=None, param_model_sel: dict = None,
**kwargs):
+ """
+ Ensemble factory. Provides a unified interface for instantiating ensembles that can be optimized (via model
+ selection for quantification) for a given evaluation metric using :class:`quapy.model_selection.GridSearchQ`.
+ If the evaluation metric is classification-oriented
+ (instead of quantification-oriented), then the optimization will be carried out via sklearn's
+ `GridSearchCV `_.
+
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param base_quantifier_class: a class of quantifiers
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_model_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
if optim is not None:
if param_grid is None:
raise ValueError(f'param_grid is None but optim was requested.')
@@ -316,20 +399,83 @@ def ensembleFactory(learner, base_quantifier_class, param_grid=None, optim=None,
def ECC(learner, param_grid=None, optim=None, param_mod_sel=None, **kwargs):
+ """
+ Implements an ensemble of :class:`quapy.method.aggregative.CC` quantifiers, as used by
+ `Pérez-Gállego et al., 2019 `_.
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_mod_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
+
return ensembleFactory(learner, CC, param_grid, optim, param_mod_sel, **kwargs)
def EACC(learner, param_grid=None, optim=None, param_mod_sel=None, **kwargs):
+ """
+ Implements an ensemble of :class:`quapy.method.aggregative.ACC` quantifiers, as used by
+ `Pérez-Gállego et al., 2019 `_.
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_mod_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
+
return ensembleFactory(learner, ACC, param_grid, optim, param_mod_sel, **kwargs)
def EPACC(learner, param_grid=None, optim=None, param_mod_sel=None, **kwargs):
+ """
+ Implements an ensemble of :class:`quapy.method.aggregative.PACC` quantifiers.
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_mod_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
+
return ensembleFactory(learner, PACC, param_grid, optim, param_mod_sel, **kwargs)
def EHDy(learner, param_grid=None, optim=None, param_mod_sel=None, **kwargs):
+ """
+ Implements an ensemble of :class:`quapy.method.aggregative.HDy` quantifiers, as used by
+ `Pérez-Gállego et al., 2019 `_.
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_mod_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
+
return ensembleFactory(learner, HDy, param_grid, optim, param_mod_sel, **kwargs)
def EEMQ(learner, param_grid=None, optim=None, param_mod_sel=None, **kwargs):
+ """
+ Implements an ensemble of :class:`quapy.method.aggregative.EMQ` quantifiers.
+
+ :param learner: sklearn's Estimator that generates a classifier
+ :param param_grid: a dictionary with the grid of parameters to optimize for
+ :param optim: a valid quantification or classification error, or a string name of it
+ :param param_mod_sel: a dictionary containing any keyword arguments to pass to
+ :class:`quapy.model_selection.GridSearchQ`
+ :param kwargs: kwargs for the class :class:`Ensemble`
+ :return: an instance of :class:`Ensemble`
+ """
+
return ensembleFactory(learner, EMQ, param_grid, optim, param_mod_sel, **kwargs)
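The `ds` policy documented in the `Ensemble` docstring compares histograms of posterior probabilities by means of the Hellinger Distance. A self-contained sketch of that comparison is given below. It makes several assumptions that should be read as such: the helper names are hypothetical, the histogram holds relative frequencies (whereas the code in this file calls `np.histogram` with `density=True`), and the distance uses the unnormalized form sqrt(sum((sqrt(p_i) - sqrt(q_i))^2)), which is only one of the conventions in use.

```python
import math


def histogram(pos_posteriors, bins=8):
    """Relative frequency of positive-class posteriors in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for p in pos_posteriors:
        index = min(int(p * bins), bins - 1)  # p == 1.0 falls in the last bin
        counts[index] += 1
    total = len(pos_posteriors)
    return [c / total for c in counts]


def hellinger(p, q):
    """Unnormalized Hellinger distance between two discrete distributions."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))


def ds_select(test_hist, train_hists, red_size):
    """Indices of the red_size members whose training histogram is closest to the test one."""
    distances = [hellinger(test_hist, h) for h in train_hists]
    return sorted(range(len(distances)), key=lambda i: distances[i])[:red_size]


test_hist = histogram([0.1, 0.2, 0.8, 0.9], bins=2)          # [0.5, 0.5]
train_hists = [[1.0, 0.0], [0.5, 0.5], [0.6, 0.4]]
print(ds_select(test_hist, train_hists, red_size=2))
```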
diff --git a/quapy/method/neural.py b/quapy/method/neural.py
index bb59f97..558e447 100644
--- a/quapy/method/neural.py
+++ b/quapy/method/neural.py
@@ -62,9 +62,11 @@ class QuaNetTrainer(BaseQuantifier):
def fit(self, data: LabelledCollection, fit_learner=True):
"""
+ Trains QuaNet.
+
:param data: the training data on which to train QuaNet. If fit_learner=True, the data will be split in
- 40/40/20 for training the classifier, training QuaNet, and validating QuaNet, respectively. If
- fit_learner=False, the data will be split in 66/34 for training QuaNet and validating it, respectively.
+ 40/40/20 for training the classifier, training QuaNet, and validating QuaNet, respectively. If
+ fit_learner=False, the data will be split in 66/34 for training QuaNet and validating it, respectively.
:param fit_learner: if true, trains the classifier on a split containing 40% of the data
:return: self
"""
diff --git a/quapy/method/non_aggregative.py b/quapy/method/non_aggregative.py
index bc0a99a..f70a0c6 100644
--- a/quapy/method/non_aggregative.py
+++ b/quapy/method/non_aggregative.py
@@ -3,24 +3,60 @@ from .base import BaseQuantifier
class MaximumLikelihoodPrevalenceEstimation(BaseQuantifier):
+ """
+ The `Maximum Likelihood Prevalence Estimation` (MLPE) method is a lazy method that assumes there is no prior
+ probability shift between training and test instances (put another way, that the i.i.d. assumption holds).
+ The class prevalence values estimated for any test sample are always (i.e., irrespective of the test sample
+ itself) the class prevalence values seen during training. This method is considered a lower-bound baseline that
+ any quantification method should beat.
+ """
- def __init__(self, **kwargs):
+ def __init__(self):
self._classes_ = None
- def fit(self, data: LabelledCollection, *args):
+ def fit(self, data: LabelledCollection):
+ """
+ Computes the training prevalence and stores it.
+
+ :param data: the training sample
+ :return: self
+ """
self._classes_ = data.classes_
self.estimated_prevalence = data.prevalence()
return self
- def quantify(self, documents, *args):
+ def quantify(self, instances):
+ """
+ Ignores the input instances and returns the training prevalence as the class prevalence estimates.
+
+ :param instances: array-like (ignored)
+ :return: the class prevalence seen during training
+ """
return self.estimated_prevalence
@property
def classes_(self):
+ """
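+ The class labels seen during training (as taken from `data.classes_` in `fit`)
+
+ :return: the class labels, or `None` if the quantifier has not been fit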
+ """
+
return self._classes_
- def get_params(self):
- pass
+ def get_params(self, deep=True):
+ """
+ Returns `None`, since this quantifier has no parameters.
+
+ :param deep: for compatibility with sklearn
+ :return: `None`
+ """
+ return None
def set_params(self, **parameters):
+ """
+ Does nothing, since this quantifier has no parameters.
+
+ :param parameters: dictionary of param-value pairs (ignored)
+ """
pass