Move docs/source/wiki/ to docs/source/manuals/
|
@ -68,15 +68,15 @@ Manuals
|
||||||
The following manuals illustrate several aspects of QuaPy through examples:
|
The following manuals illustrate several aspects of QuaPy through examples:
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 1
|
:maxdepth: 2
|
||||||
|
|
||||||
wiki/Datasets
|
manuals/datasets
|
||||||
wiki/Evaluation
|
manuals/evaluation
|
||||||
wiki/ExplicitLossMinimization
|
manuals/explicit-loss-minimization
|
||||||
wiki/Methods
|
manuals/methods
|
||||||
wiki/Model-Selection
|
manuals/model-selection
|
||||||
wiki/Plotting
|
manuals/plotting
|
||||||
wiki/Protocols
|
manuals/protocols
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:hidden:
|
:hidden:
|
||||||
|
|
|
@ -67,9 +67,8 @@ for method in methods:
|
||||||
```
|
```
|
||||||
|
|
||||||
However, generating samples for evaluation purposes is tackled in QuaPy
|
However, generating samples for evaluation purposes is tackled in QuaPy
|
||||||
by means of the evaluation protocols (see the dedicated entries in the Wiki
|
by means of the evaluation protocols (see the dedicated entries in the manuals
|
||||||
for [evaluation](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) and
|
for [evaluation](./evaluation) and [protocols](./protocols)).
|
||||||
[protocols](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)).
|
|
||||||
|
|
||||||
|
|
||||||
## Reviews Datasets
|
## Reviews Datasets
|
|
@ -29,7 +29,7 @@ instance in a sample-- while in quantification the output for a sample
|
||||||
is one single array of class prevalences).
|
is one single array of class prevalences).
|
||||||
Quantifiers also extend from scikit-learn's `BaseEstimator`, in order
|
Quantifiers also extend from scikit-learn's `BaseEstimator`, in order
|
||||||
to simplify the use of `set_params` and `get_params` used in
|
to simplify the use of `set_params` and `get_params` used in
|
||||||
[model selector](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection).
|
[model selection](./model-selection).
|
||||||
|
|
||||||
## Aggregative Methods
|
## Aggregative Methods
|
||||||
|
|
||||||
|
@ -96,7 +96,7 @@ classifier, and then _clones_ these classifiers and explores the combinations
|
||||||
of hyperparameters that are specific to the quantifier (this can result in huge
|
of hyperparameters that are specific to the quantifier (this can result in huge
|
||||||
time savings).
|
time savings).
|
||||||
Concerning the inference phase, this two-step process allows the evaluation of many
|
Concerning the inference phase, this two-step process allows the evaluation of many
|
||||||
standard protocols (e.g., the [artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)) to be
|
standard protocols (e.g., the [artificial sampling protocol](./evaluation)) to be
|
||||||
carried out very efficiently. The reason is that the entire set can be pre-classified
|
carried out very efficiently. The reason is that the entire set can be pre-classified
|
||||||
once, and the quantification estimations for different samples can directly
|
once, and the quantification estimations for different samples can directly
|
||||||
reuse these predictions, without having to classify each element every time.
|
reuse these predictions, without having to classify each element every time.
|
||||||
|
@ -484,8 +484,7 @@ the performance estimated for each member of the ensemble in terms of that evalu
|
||||||
When using any of the above options, it is important to set the `red_size` parameter, which
|
When using any of the above options, it is important to set the `red_size` parameter, which
|
||||||
informs of the number of members to retain.
|
informs of the number of members to retain.
|
||||||
|
|
||||||
Please, check the [model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection)
|
Please, check the [model selection manual](./model-selection) if you want to optimize the hyperparameters of ensemble for classification or quantification.
|
||||||
wiki if you want to optimize the hyperparameters of ensemble for classification or quantification.
|
|
||||||
|
|
||||||
### The QuaNet neural network
|
### The QuaNet neural network
|
||||||
|
|
|
@ -33,11 +33,11 @@ of scenarios exhibiting different degrees of prior
|
||||||
probability shift.
|
probability shift.
|
||||||
|
|
||||||
The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of
|
The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of
|
||||||
hyper-parameter combinations that [evaluates](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)
|
hyper-parameter combinations that [evaluates](./evaluation)
|
||||||
each combination of hyper-parameters by means of a given quantification-oriented
|
each combination of hyper-parameters by means of a given quantification-oriented
|
||||||
error metric (e.g., any of the error functions implemented
|
error metric (e.g., any of the error functions implemented
|
||||||
in _qp.error_) and according to a
|
in _qp.error_) and according to a
|
||||||
[sampling generation protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols).
|
[sampling generation protocol](./protocols).
|
||||||
|
|
||||||
The following is an example (also included in the examples folder) of model selection for quantification:
|
The following is an example (also included in the examples folder) of model selection for quantification:
|
||||||
|
|
|
@ -43,7 +43,7 @@ quantification methods across different scenarios showcasing
|
||||||
the accuracy of the quantifier in predicting class prevalences
|
the accuracy of the quantifier in predicting class prevalences
|
||||||
for a wide range of prior distributions. This can easily be
|
for a wide range of prior distributions. This can easily be
|
||||||
achieved by means of the
|
achieved by means of the
|
||||||
[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)
|
[artificial sampling protocol](./protocols)
|
||||||
that is implemented in QuaPy.
|
that is implemented in QuaPy.
|
||||||
|
|
||||||
The following code shows how to perform one simple experiment
|
The following code shows how to perform one simple experiment
|
||||||
|
@ -113,7 +113,7 @@ are '.png' or '.pdf'). If this path is not provided, then the plot
|
||||||
will be shown but not saved.
|
will be shown but not saved.
|
||||||
The resulting plot should look like:
|
The resulting plot should look like:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
Note that in this case, we are also indicating the training
|
Note that in this case, we are also indicating the training
|
||||||
prevalence, which is plotted on the diagonal as a cyan dot.
|
prevalence, which is plotted on the diagonal as a cyan dot.
|
||||||
|
@ -138,7 +138,7 @@ qp.plot.binary_bias_global(method_names, true_prevs, estim_prevs, savepath='./pl
|
||||||
|
|
||||||
and should look like:
|
and should look like:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
The box plots show some interesting facts:
|
The box plots show some interesting facts:
|
||||||
* all methods are biased towards the training prevalence but especially
|
* all methods are biased towards the training prevalence but especially
|
||||||
|
@ -181,7 +181,7 @@ def gen_data():
|
||||||
|
|
||||||
and the plot should now look like:
|
and the plot should now look like:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
which clearly shows a negative bias for CC variants trained on
|
which clearly shows a negative bias for CC variants trained on
|
||||||
data containing more negatives (i.e., < 50%) and positive biases
|
data containing more negatives (i.e., < 50%) and positive biases
|
||||||
|
@ -195,7 +195,7 @@ To this aim, an argument _nbins_ is passed which indicates
|
||||||
how many equal-width subintervals to take. For example
|
how many equal-width subintervals to take. For example
|
||||||
the following plot is produced for _nbins=3_:
|
the following plot is produced for _nbins=3_:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
Interestingly enough, the seemingly unbiased estimator (CC at 50%) happens to display
|
Interestingly enough, the seemingly unbiased estimator (CC at 50%) happens to display
|
||||||
a positive bias (or a tendency to overestimate) in cases of low prevalence
|
a positive bias (or a tendency to overestimate) in cases of low prevalence
|
||||||
|
@ -205,7 +205,7 @@ and a negative bias (or a tendency to underestimate) in cases of high prevalence
|
||||||
|
|
||||||
Out of curiosity, the diagonal plot for this experiment looks like:
|
Out of curiosity, the diagonal plot for this experiment looks like:
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
showing pretty clearly the dependency of CC on the prior probabilities
|
showing pretty clearly the dependency of CC on the prior probabilities
|
||||||
of the labeled set it was trained on.
|
of the labeled set it was trained on.
|
||||||
|
@ -234,7 +234,7 @@ qp.plot.error_by_drift(method_names, true_prevs, estim_prevs, tr_prevs,
|
||||||
error_name='ae', n_bins=10, savepath='./plots/err_drift.png')
|
error_name='ae', n_bins=10, savepath='./plots/err_drift.png')
|
||||||
```
|
```
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
Note that all methods work reasonably well in cases of low prevalence
|
Note that all methods work reasonably well in cases of low prevalence
|
||||||
drift (i.e., any CC-variant is a good quantifier whenever the IID
|
drift (i.e., any CC-variant is a good quantifier whenever the IID
|