diff --git a/docs/build/html/genindex.html b/docs/build/html/genindex.html index 54c937e..5c2055f 100644 --- a/docs/build/html/genindex.html +++ b/docs/build/html/genindex.html @@ -224,8 +224,6 @@
Get hyper-parameters for this estimator.
deep – compatibility with sklearn
a dictionary with parameter names mapped to their values
Bases: object
Trains a neural network for text classification.
Returns an index to be used to extract a random sample of desired size and desired prevalence values. If the prevalence values are not specified, then returns the index of a uniform sampling.
For each class, the sampling is drawn with replacement if the requested prevalence is larger than the actual prevalence of the class, or without replacement otherwise.
Returns a uniform sample (an instance of LabelledCollection) of desired size. The sampling is drawn with replacement if the requested size is greater than the number of instances, or without replacement otherwise.
Returns an index to be used to extract a uniform sample of desired size. The sampling is drawn with replacement if the requested size is greater than the number of instances, or without replacement otherwise.
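The per-class with/without-replacement rule described above can be sketched in plain Python (the function name sampling_index and its signature are illustrative, not QuaPy's actual API):

```python
import random

def sampling_index(labels, size, prevalence, classes):
    """Illustrative sketch: build a sample index whose class proportions
    approximate `prevalence`. For each class, draw with replacement when
    more items are requested than are available, and without replacement
    otherwise."""
    index = []
    for c, p in zip(classes, prevalence):
        pool = [i for i, y in enumerate(labels) if y == c]
        needed = round(size * p)
        if needed > len(pool):
            # requested prevalence exceeds the actual one: WITH replacement
            index += random.choices(pool, k=needed)
        else:
            # enough instances available: WITHOUT replacement
            index += random.sample(pool, needed)
    random.shuffle(index)
    return index
```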
Implementation of error measures used for quantification
Computes the error in terms of 1-accuracy. The accuracy is computed as \(\frac{tp+tn}{tp+fp+fn+tn}\), with tp, fp, fn, and tn standing for true positives, false positives, false negatives, and true negatives, respectively
Computes the error in terms of 1-accuracy. The accuracy is computed as \(\frac{tp+tn}{tp+fp+fn+tn}\), with tp, fp, fn, and tn standing for true positives, false positives, false negatives, and true negatives, respectively
F1 error: simply computes the error in terms of macro \(F_1\), i.e., \(1-F_1^M\), where \(F_1\) is the harmonic mean of precision and recall, defined as \(\frac{2tp}{2tp+fp+fn}\), with tp, fp, and fn standing for true positives, false positives, and false negatives, respectively. Macro averaging means the \(F_1\) is computed for each category independently, and then averaged.
F1 error: simply computes the error in terms of macro \(F_1\), i.e., \(1-F_1^M\), where \(F_1\) is the harmonic mean of precision and recall, defined as \(\frac{2tp}{2tp+fp+fn}\), with tp, fp, and fn standing for true positives, false positives, and false negatives, respectively. Macro averaging means the \(F_1\) is computed for each category independently, and then averaged.
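A minimal sketch of the macro-\(F_1\) error just described (illustrative, not QuaPy's implementation):

```python
def f1_error(y_true, y_pred, classes):
    # Per class: F1 = 2tp / (2tp + fp + fn); macro-average over classes;
    # the error is 1 - macro-F1. Illustrative sketch only.
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return 1 - sum(f1s) / len(f1s)
```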
Gets an error function from its name. E.g., from_name(“mae”) will return function quapy.error.mae()
Gets an error function from its name. E.g., from_name(“mae”) will return function quapy.error.mae()
err_name – string, the error name
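A name-to-function lookup of this kind might be sketched as follows (the registry and the mae helper below are illustrative stand-ins, not QuaPy's internals):

```python
def mae(prevs, prevs_hat):
    # mean absolute error between two prevalence vectors (illustrative)
    return sum(abs(a - b) for a, b in zip(prevs, prevs_hat)) / len(prevs)

# hypothetical registry mapping error names to functions
ERROR_FUNCTIONS = {'mae': mae}

def from_name(err_name):
    # resolve an error function from its string name
    if err_name not in ERROR_FUNCTIONS:
        raise ValueError(f'unknown error name {err_name}')
    return ERROR_FUNCTIONS[err_name]
```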
Kullback-Leibler divergence between two prevalence distributions \(p\) and \(\hat{p}\) is computed as \(KLD(p,\hat{p})=D_{KL}(p||\hat{p})=\sum_{y\in \mathcal{Y}} p(y)\log\frac{p(y)}{\hat{p}(y)}\), where \(\mathcal{Y}\) are the classes of interest.
The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values
eps – smoothing factor. KLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
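Under the definitions above, KLD with eps-smoothing can be sketched as follows (an illustrative sketch, not QuaPy's exact code):

```python
from math import log

def smooth(prevs, eps):
    # additive smoothing: (eps + p(y)) / (eps * |Y| + sum_y p(y))
    n, s = len(prevs), sum(prevs)
    return [(eps + p) / (eps * n + s) for p in prevs]

def kld(prevs, prevs_hat, eps):
    # KLD(p, p_hat) = sum_y p(y) * log(p(y) / p_hat(y)),
    # computed on the eps-smoothed distributions
    p = smooth(prevs, eps)
    q = smooth(prevs_hat, eps)
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))
```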
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
Computes the mean relative absolute error (see quapy.error.rae()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
eps – smoothing factor. mrae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
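The averaging over sample pairs can be sketched as follows (the sample_size parameter below stands in for the SAMPLE_SIZE environment variable; the helper names are illustrative, not QuaPy's code):

```python
def _smooth(prevs, eps):
    # additive smoothing of a prevalence vector
    n, s = len(prevs), sum(prevs)
    return [(eps + p) / (eps * n + s) for p in prevs]

def rae(prevs, prevs_hat, eps):
    # relative absolute error on the eps-smoothed distributions
    p = _smooth(prevs, eps)
    q = _smooth(prevs_hat, eps)
    return sum(abs(qi - pi) / pi for pi, qi in zip(p, q)) / len(p)

def mrae(prevs, prevs_hat, eps=None, sample_size=100):
    # eps defaults to 1/(2T), with T the sample size, as documented above;
    # the result is the mean over the (n_samples, n_classes) row pairs
    if eps is None:
        eps = 1 / (2 * sample_size)
    return sum(rae(p, q, eps) for p, q in zip(prevs, prevs_hat)) / len(prevs)
```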
Computes the mean Kullback-Leibler divergence (see quapy.error.kld()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
eps – smoothing factor. KLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
Computes the mean Normalized Kullback-Leibler divergence (see quapy.error.nkld()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
eps – smoothing factor. NKLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
Computes the mean relative absolute error (see quapy.error.rae()) across the sample pairs. The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
eps – smoothing factor. mrae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
prevs – array-like of shape (n_samples, n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_samples, n_classes,) with the predicted prevalence values
Normalized Kullback-Leibler divergence between two prevalence distributions \(p\) and \(\hat{p}\) is computed as \(NKLD(p,\hat{p}) = 2\frac{e^{KLD(p,\hat{p})}}{e^{KLD(p,\hat{p})}+1}-1\), where \(\mathcal{Y}\) are the classes of interest.
The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values
eps – smoothing factor. NKLD is not defined in cases in which the distributions contain zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
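The squashing of KLD into a bounded range, per the formula above, can be sketched as follows (smoothing is omitted for brevity; an illustrative sketch, not QuaPy's code):

```python
from math import exp, log

def kld(p, q):
    # Kullback-Leibler divergence sum_y p(y) * log(p(y) / q(y))
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))

def nkld(p, q):
    # NKLD = 2 * e^KLD / (e^KLD + 1) - 1, which maps KLD in [0, inf)
    # onto [0, 1)
    ekld = exp(kld(p, q))
    return 2 * ekld / (ekld + 1) - 1
```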
Relative absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(RAE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}\frac{|\hat{p}(y)-p(y)|}{p(y)}\), where \(\mathcal{Y}\) are the classes of interest.
The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values
eps – smoothing factor. rae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
Relative absolute error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(RAE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}\frac{|\hat{p}(y)-p(y)|}{p(y)}\), where \(\mathcal{Y}\) are the classes of interest.
The distributions are smoothed using the eps factor (see quapy.error.smooth()).
prevs – array-like of shape (n_classes,) with the true prevalence values
prevs_hat – array-like of shape (n_classes,) with the predicted prevalence values
eps – smoothing factor. rae is not defined in cases in which the true distribution contains zeros; eps is typically set to be \(\frac{1}{2T}\), with \(T\) the sample size. If eps=None, the sample size will be taken from the environment variable SAMPLE_SIZE (which has thus to be set beforehand).
Squared error between two prevalence vectors \(p\) and \(\hat{p}\) is computed as \(SE(p,\hat{p})=\frac{1}{|\mathcal{Y}|}\sum_{y\in \mathcal{Y}}(\hat{p}(y)-p(y))^2\), where \(\mathcal{Y}\) are the classes of interest.
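The squared-error formula can be written directly (an illustrative sketch):

```python
def se(prevs, prevs_hat):
    # SE = mean of the squared per-class prevalence differences
    return sum((ph - p) ** 2 for p, ph in zip(prevs, prevs_hat)) / len(prevs)
```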
Smooths a prevalence distribution with \(\epsilon\) (eps) as: \(\underline{p}(y)=\frac{\epsilon+p(y)}{\epsilon|\mathcal{Y}|+\displaystyle\sum_{y\in \mathcal{Y}}p(y)}\)
Bases: AbstractStochasticSeededProtocol, OnLabelledCollectionProtocol
Implementation of the artificial prevalence protocol (APP). The APP consists of exploring a grid of prevalence values containing n_prevalences points (default is 21)
smooth_limits_epsilon – the quantity to add and subtract to the limits 0 and 1
random_state – allows replicating samples across runs (default 0, meaning that the sequence of samples will be the same every time the protocol is called)
sanity_check – int, raises an exception warning the user that the number of examples to be generated exceeds this number; set to None to skip this check
return_type – set to “sample_prev” (default) to get the pairs of (sample, prevalence) at each iteration, or to “labelled_collection” to get instead instances of LabelledCollection
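The prevalence grid that APP explores can be sketched as follows (an illustrative sketch; QuaPy's actual grid generation may differ). For \(n\) classes, only \(n-1\) prevalences are free, since the last one is implied by the constraint that prevalences sum to 1:

```python
from itertools import product

def prevalence_grid(n_prevalences=21, n_classes=3):
    # Enumerate all combinations of n_classes - 1 values from an evenly
    # spaced grid of n_prevalences points in [0, 1], keeping those whose
    # remainder (the implied last-class prevalence) is non-negative.
    step_values = [i / (n_prevalences - 1) for i in range(n_prevalences)]
    grid = []
    for combo in product(step_values, repeat=n_classes - 1):
        rest = 1.0 - sum(combo)
        if rest >= -1e-9:
            grid.append(tuple(combo) + (max(rest, 0.0),))
    return grid
```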
QuaPy module for quantification