Move docs/source/wiki/ to docs/source/manuals/

@@ -68,15 +68,15 @@ Manuals
 The following manuals illustrate several aspects of QuaPy through examples:
 
 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
 
-   wiki/Datasets
-   wiki/Evaluation
-   wiki/ExplicitLossMinimization
-   wiki/Methods
-   wiki/Model-Selection
-   wiki/Plotting
-   wiki/Protocols
+   manuals/datasets
+   manuals/evaluation
+   manuals/explicit-loss-minimization
+   manuals/methods
+   manuals/model-selection
+   manuals/plotting
+   manuals/protocols
 
 .. toctree::
    :hidden:
@@ -67,9 +67,8 @@ for method in methods:
 ```
 
 However, generating samples for evaluation purposes is tackled in QuaPy
-by means of the evaluation protocols (see the dedicated entries in the Wiki
-for [evaluation](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) and
-[protocols](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)).
+by means of the evaluation protocols (see the dedicated entries in the manuals
+for [evaluation](./evaluation) and [protocols](./protocols)).
 
 
 ## Reviews Datasets
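
On the evaluation protocols mentioned in the hunk above: a minimal sketch of protocol-based evaluation, assuming quapy v0.1.7+ (`quapy.protocol.APP`, `qp.evaluation.evaluate`); the dataset, learner, and parameter values are illustrative only.

```python
import quapy as qp
from quapy.method.aggregative import ACC
from quapy.protocol import APP
from sklearn.linear_model import LogisticRegression

qp.environ['SAMPLE_SIZE'] = 100  # size of each sample the protocol generates

# illustrative dataset choice; any LabelledCollection-based dataset works alike
dataset = qp.datasets.fetch_reviews('hp', tfidf=True, min_df=5)

model = ACC(LogisticRegression())
model.fit(dataset.training)

# APP (artificial prevalence protocol) draws test samples at varied prevalences
protocol = APP(dataset.test, repeats=10, random_state=0)
mae = qp.evaluation.evaluate(model, protocol=protocol, error_metric='mae')
print(f'MAE = {mae:.4f}')
```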
@@ -29,7 +29,7 @@ instance in a sample-- while in quantification the output for a sample
 is one single array of class prevalences).
 Quantifiers also extend from scikit-learn's `BaseEstimator`, in order
 to simplify the use of `set_params` and `get_params` used in
-[model selector](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection).
+[model selection](./model-selection).
 
 ## Aggregative Methods
 
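On the `set_params`/`get_params` support noted above: since quantifiers inherit `BaseEstimator`, hyperparameters follow scikit-learn's nested naming. A tiny sketch; the `classifier__` prefix assumes the wrapped learner is exposed under the `classifier` attribute, which may differ across QuaPy versions (older releases used `learner`).

```python
from sklearn.linear_model import LogisticRegression
from quapy.method.aggregative import PACC

model = PACC(LogisticRegression())
print(model.get_params())           # nested parameter dict, sklearn-style
model.set_params(classifier__C=10)  # rewrites C inside the wrapped classifier
```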
@@ -96,7 +96,7 @@ classifier, and then _clones_ these classifiers and explores the combinations
 of hyperparameters that are specific to the quantifier (this can result in huge
 time savings).
 Concerning the inference phase, this two-step process allows the evaluation of many
-standard protocols (e.g., the [artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)) to be
+standard protocols (e.g., the [artificial sampling protocol](./evaluation)) to be
 carried out very efficiently. The reason is that the entire set can be pre-classified
 once, and the quantification estimations for different samples can directly
 reuse these predictions, without needing to classify each element every time.
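
The pre-classification idea in this hunk can be sketched as follows, assuming the `classify()`/`aggregate()` split of QuaPy's aggregative quantifiers; `model` and `dataset` are as in the earlier evaluation sketch, and the sample index arrays are hypothetical stand-ins for what a protocol would generate.

```python
import numpy as np

# classify the entire test set once...
predictions = model.classify(dataset.test.instances)

# ...then aggregate cheaply, sample after sample, reusing those predictions
samples = [np.random.choice(len(predictions), size=100) for _ in range(50)]
for sample_idx in samples:
    prev_estimate = model.aggregate(predictions[sample_idx])
```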
@@ -484,8 +484,7 @@ the performance estimated for each member of the ensemble in terms of that evalu
 When using any of the above options, it is important to set the `red_size` parameter, which
 informs of the number of members to retain.
 
-Please, check the [model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection)
-wiki if you want to optimize the hyperparameters of ensemble for classification or quantification.
+Please check the [model selection manual](./model-selection) if you want to optimize the hyperparameters of the ensemble for classification or quantification.
 
 ### The QuaNet neural network
 
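On the `red_size` parameter mentioned in the hunk above: a hedged construction sketch. The exact `Ensemble` signature varies across QuaPy versions, and an error-based policy such as 'mae' may additionally require a validation split.

```python
from sklearn.linear_model import LogisticRegression
from quapy.method.aggregative import PACC
from quapy.method.meta import Ensemble

ensemble = Ensemble(
    quantifier=PACC(LogisticRegression()),
    size=30,       # number of ensemble members to train
    red_size=15,   # number of members to retain after selection
    policy='mae',  # selection criterion (here, error-based)
)
```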
@@ -33,11 +33,11 @@ of scenarios exhibiting different degrees of prior
 probability shift.
 
 The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of
-hyper-parameter combinations that [evaluates](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)
+hyper-parameter combinations that [evaluates](./evaluation)
 each combination of hyper-parameters by means of a given quantification-oriented
 error metric (e.g., any of the error functions implemented
 in _qp.error_) and according to a
-[sampling generation protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols).
+[sampling generation protocol](./protocols).
 
 The following is an example (also included in the examples folder) of model selection for quantification:
 
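The example itself lies outside this hunk; for orientation, a minimal sketch of what a `GridSearchQ` run looks like, assuming the v0.1.7+ interface and with the grid, dataset, and split purely illustrative (this is not the repository's actual example).

```python
import quapy as qp
from quapy.method.aggregative import PACC
from quapy.protocol import APP
from sklearn.linear_model import LogisticRegression

qp.environ['SAMPLE_SIZE'] = 100

data = qp.datasets.fetch_reviews('hp', tfidf=True)
training, validation = data.training.split_stratified(train_prop=0.7)

model = qp.model_selection.GridSearchQ(
    model=PACC(LogisticRegression()),
    param_grid={'classifier__C': [0.1, 1.0, 10.0]},  # illustrative grid
    protocol=APP(validation),  # samples spanning many prevalence values
    error='mae',               # quantification-oriented selection criterion
    refit=True,                # retrain the best configuration on all data
    verbose=True,
).fit(training)
print(model.best_params_, model.best_score_)
```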

(Binary image renames: six plot images moved with content unchanged; sizes 62 KiB, 108 KiB, 71 KiB, 185 KiB, 337 KiB, 243 KiB.)

@@ -43,7 +43,7 @@ quantification methods across different scenarios showcasing
 the accuracy of the quantifier in predicting class prevalences
 for a wide range of prior distributions. This can easily be
 achieved by means of the
-[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)
+[artificial sampling protocol](./protocols)
 that is implemented in QuaPy.
 
 The following code shows how to perform one simple experiment
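
That code also falls beyond this hunk; a hedged sketch of one such experiment, pairing `qp.evaluation.prediction` with the diagonal plot (method and dataset choices illustrative; `binary_diagonal`'s `train_prev`/`savepath` arguments assumed per the surrounding text).

```python
import quapy as qp
from quapy.method.aggregative import CC, ACC
from quapy.protocol import APP
from sklearn.linear_model import LogisticRegression

qp.environ['SAMPLE_SIZE'] = 100
dataset = qp.datasets.fetch_reviews('kindle', tfidf=True)

method_names, true_prevs, estim_prevs = [], [], []
for name, method in [('CC', CC(LogisticRegression())),
                     ('ACC', ACC(LogisticRegression()))]:
    method.fit(dataset.training)
    true, estim = qp.evaluation.prediction(method, APP(dataset.test, repeats=25))
    method_names.append(name)
    true_prevs.append(true)
    estim_prevs.append(estim)

qp.plot.binary_diagonal(method_names, true_prevs, estim_prevs,
                        train_prev=dataset.training.prevalence(),
                        savepath='./plots/bin_diag.png')
```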
@@ -113,7 +113,7 @@ are '.png' or '.pdf'). If this path is not provided, then the plot
 will be shown but not saved.
 The resulting plot should look like:
 
-
+
 
 Note that in this case, we are also indicating the training
 prevalence, which is plotted on the diagonal as a cyan dot.
@@ -138,7 +138,7 @@ qp.plot.binary_bias_global(method_names, true_prevs, estim_prevs, savepath='./pl
 
 and should look like:
 
-
+
 
 The box plots show some interesting facts:
 * all methods are biased towards the training prevalence but especially
@@ -181,7 +181,7 @@ def gen_data():
 
 and the plot should now look like:
 
-
+
 
 which clearly shows a negative bias for CC variants trained on
 data containing more negatives (i.e., < 50%) and positive biases
@@ -195,7 +195,7 @@ To this aim, an argument _nbins_ is passed which indicates
 how many isometric subintervals to take. For example
 the following plot is produced for _nbins=3_:
 
-
+
 
 Interestingly enough, the seemingly unbiased estimator (CC at 50%) happens to display
 a positive bias (or a tendency to overestimate) in cases of low prevalence
|
@ -205,7 +205,7 @@ and a negative bias (or a tendency to underestimate) in cases of high prevalence
|
|||
|
||||
Out of curiosity, the diagonal plot for this experiment looks like:
|
||||
|
||||

|
||||

|
||||
|
||||
showing pretty clearly the dependency of CC on the prior probabilities
|
||||
of the labeled set it was trained on.
|
||||
|
@ -234,7 +234,7 @@ qp.plot.error_by_drift(method_names, true_prevs, estim_prevs, tr_prevs,
|
|||
error_name='ae', n_bins=10, savepath='./plots/err_drift.png')
|
||||
```
|
||||
|
||||

|
||||

|
||||
|
||||
Note that all methods work reasonably well in cases of low prevalence
|
||||
drift (i.e., any CC-variant is a good quantifier whenever the IID
|