From 8e9e7fa199b7120879828f0274f49a5dd20b2434 Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 16:16:45 +0200 Subject: [PATCH 1/8] Move docs/source/wiki/ to docs/source/manuals/ --- docs/source/index.rst | 16 ++++++++-------- .../{wiki/Datasets.md => manuals/datasets.md} | 5 ++--- .../Evaluation.md => manuals/evaluation.md} | 0 .../explicit-loss-minimization.md} | 0 .../{wiki/Methods.md => manuals/methods.md} | 7 +++---- .../model-selection.md} | 4 ++-- .../plots}/bin_bias.png | Bin .../plots}/bin_bias_bin_cc.png | Bin .../plots}/bin_bias_cc.png | Bin .../plots}/bin_diag.png | Bin .../plots}/bin_diag_cc.png | Bin .../plots}/err_drift.png | Bin .../{wiki/Plotting.md => manuals/plotting.md} | 14 +++++++------- .../{wiki/Protocols.md => manuals/protocols.md} | 0 14 files changed, 22 insertions(+), 24 deletions(-) rename docs/source/{wiki/Datasets.md => manuals/datasets.md} (99%) rename docs/source/{wiki/Evaluation.md => manuals/evaluation.md} (100%) rename docs/source/{wiki/ExplicitLossMinimization.md => manuals/explicit-loss-minimization.md} (100%) rename docs/source/{wiki/Methods.md => manuals/methods.md} (98%) rename docs/source/{wiki/Model-Selection.md => manuals/model-selection.md} (97%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/bin_bias.png (100%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/bin_bias_bin_cc.png (100%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/bin_bias_cc.png (100%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/bin_diag.png (100%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/bin_diag_cc.png (100%) rename docs/source/{wiki/wiki_examples/selected_plots => manuals/plots}/err_drift.png (100%) rename docs/source/{wiki/Plotting.md => manuals/plotting.md} (95%) rename docs/source/{wiki/Protocols.md => manuals/protocols.md} (100%) diff --git a/docs/source/index.rst b/docs/source/index.rst index 
d2918cf..7c7916c 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -68,15 +68,15 @@ Manuals The following manuals illustrate several aspects of QuaPy through examples: .. toctree:: - :maxdepth: 1 + :maxdepth: 2 - wiki/Datasets - wiki/Evaluation - wiki/ExplicitLossMinimization - wiki/Methods - wiki/Model-Selection - wiki/Plotting - wiki/Protocols + manuals/datasets + manuals/evaluation + manuals/explicit-loss-minimization + manuals/methods + manuals/model-selection + manuals/plotting + manuals/protocols .. toctree:: :hidden: diff --git a/docs/source/wiki/Datasets.md b/docs/source/manuals/datasets.md similarity index 99% rename from docs/source/wiki/Datasets.md rename to docs/source/manuals/datasets.md index 904fe53..cc972cd 100644 --- a/docs/source/wiki/Datasets.md +++ b/docs/source/manuals/datasets.md @@ -67,9 +67,8 @@ for method in methods: ``` However, generating samples for evaluation purposes is tackled in QuaPy -by means of the evaluation protocols (see the dedicated entries in the Wiki -for [evaluation](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) and -[protocols](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols)). +by means of the evaluation protocols (see the dedicated entries in the manuals +for [evaluation](./evaluation) and [protocols](./protocols)). 
## Reviews Datasets diff --git a/docs/source/wiki/Evaluation.md b/docs/source/manuals/evaluation.md similarity index 100% rename from docs/source/wiki/Evaluation.md rename to docs/source/manuals/evaluation.md diff --git a/docs/source/wiki/ExplicitLossMinimization.md b/docs/source/manuals/explicit-loss-minimization.md similarity index 100% rename from docs/source/wiki/ExplicitLossMinimization.md rename to docs/source/manuals/explicit-loss-minimization.md diff --git a/docs/source/wiki/Methods.md b/docs/source/manuals/methods.md similarity index 98% rename from docs/source/wiki/Methods.md rename to docs/source/manuals/methods.md index 760df16..03c5c2a 100644 --- a/docs/source/wiki/Methods.md +++ b/docs/source/manuals/methods.md @@ -29,7 +29,7 @@ instance in a sample-- while in quantification the output for a sample is one single array of class prevalences). Quantifiers also extend from scikit-learn's `BaseEstimator`, in order to simplify the use of `set_params` and `get_params` used in -[model selector](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection). +[model selection](./model-selection). ## Aggregative Methods @@ -96,7 +96,7 @@ classifier, and then _clones_ these classifiers and explores the combinations of hyperparameters that are specific to the quantifier (this can result in huge time savings). Concerning the inference phase, this two-step process allow the evaluation of many -standard protocols (e.g., the [artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation)) to be +standard protocols (e.g., the [artificial sampling protocol](./evaluation)) to be carried out very efficiently. The reason is that the entire set can be pre-classified once, and the quantification estimations for different samples can directly reuse these predictions, without requiring to classify each element every time. 
@@ -484,8 +484,7 @@ the performance estimated for each member of the ensemble in terms of that evalu When using any of the above options, it is important to set the `red_size` parameter, which informs of the number of members to retain. -Please, check the [model selection](https://github.com/HLT-ISTI/QuaPy/wiki/Model-Selection) -wiki if you want to optimize the hyperparameters of ensemble for classification or quantification. +Please check the [model selection manual](./model-selection) if you want to optimize the hyperparameters of the ensemble for classification or quantification. ### The QuaNet neural network diff --git a/docs/source/wiki/Model-Selection.md b/docs/source/manuals/model-selection.md similarity index 97% rename from docs/source/wiki/Model-Selection.md rename to docs/source/manuals/model-selection.md index 9dd5bab..097f902 100644 --- a/docs/source/wiki/Model-Selection.md +++ b/docs/source/manuals/model-selection.md @@ -33,11 +33,11 @@ of scenarios exhibiting different degrees of prior probability shift. The class _qp.model_selection.GridSearchQ_ implements a grid-search exploration over the space of -hyper-parameter combinations that [evaluates](https://github.com/HLT-ISTI/QuaPy/wiki/Evaluation) +hyper-parameter combinations that [evaluates](./evaluation) each combination of hyper-parameters by means of a given quantification-oriented error metric (e.g., any of the error functions implemented in _qp.error_) and according to a -[sampling generation protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols). +[sampling generation protocol](./protocols).
The following is an example (also included in the examples folder) of model selection for quantification: diff --git a/docs/source/wiki/wiki_examples/selected_plots/bin_bias.png b/docs/source/manuals/plots/bin_bias.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/bin_bias.png rename to docs/source/manuals/plots/bin_bias.png diff --git a/docs/source/wiki/wiki_examples/selected_plots/bin_bias_bin_cc.png b/docs/source/manuals/plots/bin_bias_bin_cc.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/bin_bias_bin_cc.png rename to docs/source/manuals/plots/bin_bias_bin_cc.png diff --git a/docs/source/wiki/wiki_examples/selected_plots/bin_bias_cc.png b/docs/source/manuals/plots/bin_bias_cc.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/bin_bias_cc.png rename to docs/source/manuals/plots/bin_bias_cc.png diff --git a/docs/source/wiki/wiki_examples/selected_plots/bin_diag.png b/docs/source/manuals/plots/bin_diag.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/bin_diag.png rename to docs/source/manuals/plots/bin_diag.png diff --git a/docs/source/wiki/wiki_examples/selected_plots/bin_diag_cc.png b/docs/source/manuals/plots/bin_diag_cc.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/bin_diag_cc.png rename to docs/source/manuals/plots/bin_diag_cc.png diff --git a/docs/source/wiki/wiki_examples/selected_plots/err_drift.png b/docs/source/manuals/plots/err_drift.png similarity index 100% rename from docs/source/wiki/wiki_examples/selected_plots/err_drift.png rename to docs/source/manuals/plots/err_drift.png diff --git a/docs/source/wiki/Plotting.md b/docs/source/manuals/plotting.md similarity index 95% rename from docs/source/wiki/Plotting.md rename to docs/source/manuals/plotting.md index 99f3f7e..ec080da 100644 --- a/docs/source/wiki/Plotting.md +++ b/docs/source/manuals/plotting.md @@ -43,7 +43,7 @@ 
quantification methods across different scenarios showcasing the accuracy of the quantifier in predicting class prevalences for a wide range of prior distributions. This can easily be achieved by means of the -[artificial sampling protocol](https://github.com/HLT-ISTI/QuaPy/wiki/Protocols) +[artificial sampling protocol](./protocols) that is implemented in QuaPy. The following code shows how to perform one simple experiment @@ -113,7 +113,7 @@ are '.png' or '.pdf'). If this path is not provided, then the plot will be shown but not saved. The resulting plot should look like: -![diagonal plot on Kindle](./wiki_examples/selected_plots/bin_diag.png) +![diagonal plot on Kindle](./plots/bin_diag.png) Note that in this case, we are also indicating the training prevalence, which is plotted in the diagonal a as cyan dot. @@ -138,7 +138,7 @@ qp.plot.binary_bias_global(method_names, true_prevs, estim_prevs, savepath='./pl and should look like: -![bias plot on Kindle](./wiki_examples/selected_plots/bin_bias.png) +![bias plot on Kindle](./plots/bin_bias.png) The box plots show some interesting facts: * all methods are biased towards the training prevalence but specially @@ -181,7 +181,7 @@ def gen_data(): and the plot should now look like: -![bias plot on IMDb](./wiki_examples/selected_plots/bin_bias_cc.png) +![bias plot on IMDb](./plots/bin_bias_cc.png) which clearly shows a negative bias for CC variants trained on data containing more negatives (i.e., < 50%) and positive biases @@ -195,7 +195,7 @@ To this aim, an argument _nbins_ is passed which indicates how many isometric subintervals to take. 
For example the following plot is produced for _nbins=3_: -![bias plot on IMDb](./wiki_examples/selected_plots/bin_bias_bin_cc.png) +![bias plot on IMDb](./plots/bin_bias_bin_cc.png) Interestingly enough, the seemingly unbiased estimator (CC at 50%) happens to display a positive bias (or a tendency to overestimate) in cases of low prevalence @@ -205,7 +205,7 @@ and a negative bias (or a tendency to underestimate) in cases of high prevalence Out of curiosity, the diagonal plot for this experiment looks like: -![diag plot on IMDb](./wiki_examples/selected_plots/bin_diag_cc.png) +![diag plot on IMDb](./plots/bin_diag_cc.png) showing pretty clearly the dependency of CC on the prior probabilities of the labeled set it was trained on. @@ -234,7 +234,7 @@ qp.plot.error_by_drift(method_names, true_prevs, estim_prevs, tr_prevs, error_name='ae', n_bins=10, savepath='./plots/err_drift.png') ``` -![diag plot on IMDb](./wiki_examples/selected_plots/err_drift.png) +![diag plot on IMDb](./plots/err_drift.png) Note that all methods work reasonably well in cases of low prevalence drift (i.e., any CC-variant is a good quantifier whenever the IID diff --git a/docs/source/wiki/Protocols.md b/docs/source/manuals/protocols.md similarity index 100% rename from docs/source/wiki/Protocols.md rename to docs/source/manuals/protocols.md From d2209afab5da3738f157b84ce3832fa1e41707e9 Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 16:37:28 +0200 Subject: [PATCH 2/8] Manuals and API sections --- docs/source/index.rst | 12 +++--------- docs/source/manuals.rst | 14 ++++++++++++++ docs/source/modules.rst | 7 ------- 3 files changed, 17 insertions(+), 16 deletions(-) create mode 100644 docs/source/manuals.rst delete mode 100644 docs/source/modules.rst diff --git a/docs/source/index.rst b/docs/source/index.rst index 7c7916c..34c7944 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -68,20 +68,14 @@ Manuals The following manuals illustrate several aspects of QuaPy 
through examples: .. toctree:: - :maxdepth: 2 + :maxdepth: 3 - manuals/datasets - manuals/evaluation - manuals/explicit-loss-minimization - manuals/methods - manuals/model-selection - manuals/plotting - manuals/protocols + manuals .. toctree:: :hidden: - List of Modules + API Features -------- diff --git a/docs/source/manuals.rst b/docs/source/manuals.rst new file mode 100644 index 0000000..a426786 --- /dev/null +++ b/docs/source/manuals.rst @@ -0,0 +1,14 @@ +Manuals +======= + +.. toctree:: + :maxdepth: 2 + :numbered: + + manuals/datasets + manuals/evaluation + manuals/explicit-loss-minimization + manuals/methods + manuals/model-selection + manuals/plotting + manuals/protocols diff --git a/docs/source/modules.rst b/docs/source/modules.rst deleted file mode 100644 index 5d84a54..0000000 --- a/docs/source/modules.rst +++ /dev/null @@ -1,7 +0,0 @@ -quapy -===== - -.. toctree:: - :maxdepth: 4 - - quapy From c668d0b3d80c896f97aeb4ea2a03ff201e29cfdb Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 17:06:35 +0200 Subject: [PATCH 3/8] Translate index.rst to index.md --- docs/source/index.md | 100 ++++++++++++++++++++++++++++++++++++++ docs/source/index.rst | 109 ------------------------------------------ 2 files changed, 100 insertions(+), 109 deletions(-) create mode 100644 docs/source/index.md delete mode 100644 docs/source/index.rst diff --git a/docs/source/index.md b/docs/source/index.md new file mode 100644 index 0000000..accb758 --- /dev/null +++ b/docs/source/index.md @@ -0,0 +1,100 @@ +```{toctree} +:hidden: + +self +``` + +# Quickstart + +QuaPy is an open source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. 
+ +QuaPy is based on the concept of "data sample", and provides implementations of the most important aspects of the quantification workflow, such as (baseline and advanced) quantification methods, quantification-oriented model selection mechanisms, evaluation measures, and evaluation protocols used for evaluating quantification methods. QuaPy also makes available commonly used datasets, and offers visualization tools for facilitating the analysis and interpretation of the experimental results. + +QuaPy is hosted on GitHub at [https://github.com/HLT-ISTI/QuaPy](https://github.com/HLT-ISTI/QuaPy). + +## Installation + +```sh +pip install quapy +``` + +## Usage + +The following script fetches a dataset of tweets, trains, applies, and evaluates a quantifier based on the *Adjusted Classify & Count* quantification method, using, as the evaluation measure, the *Mean Absolute Error* (MAE) between the predicted and the true class prevalence values of the test set: + +```python +import quapy as qp +from sklearn.linear_model import LogisticRegression + +dataset = qp.datasets.fetch_twitter('semeval16') + +# create an "Adjusted Classify & Count" quantifier +model = qp.method.aggregative.ACC(LogisticRegression()) +model.fit(dataset.training) + +estim_prevalence = model.quantify(dataset.test.instances) +true_prevalence = dataset.test.prevalence() + +error = qp.error.mae(true_prevalence, estim_prevalence) + +print(f'Mean Absolute Error (MAE)={error:.3f}') +``` + +Quantification is useful in scenarios characterized by prior probability shift. In other words, we would have little interest in estimating the class prevalence values of the test set if we could assume the IID assumption to hold, as this prevalence would be roughly equivalent to the class prevalence of the training set. For this reason, any quantification model should be tested across many samples, even ones characterized by class prevalence values different or very different from those found in the training set.
QuaPy implements sampling procedures and evaluation protocols that automate this workflow. See the [](./manuals) for detailed examples. + +## Manuals + +The following manuals illustrate several aspects of QuaPy through examples: + +```{toctree} +:maxdepth: 3 + +manuals +``` + +```{toctree} +:hidden: + +API +``` + +## Features + +- Implementation of many popular quantification methods (Classify-&-Count and its variants, Expectation Maximization, quantification methods based on structured output learning, HDy, QuaNet, quantification ensembles, among others). +- Versatile functionality for performing evaluation based on sampling generation protocols (e.g., APP, NPP, etc.). +- Implementation of most commonly used evaluation metrics (e.g., AE, RAE, NAE, NRAE, SE, KLD, NKLD, etc.). +- Datasets frequently used in quantification (textual and numeric), including: + - 32 UCI Machine Learning binary datasets. + - 5 UCI Machine Learning multiclass datasets (new in v0.1.8!). + - 11 Twitter quantification-by-sentiment datasets. + - 3 product reviews quantification-by-sentiment datasets. + - 4 tasks from LeQua competition (new in v0.1.7!) + - IFCB dataset of plankton water samples (new in v0.1.8!). +- Native support for binary and single-label multiclass quantification scenarios. +- Model selection functionality that minimizes quantification-oriented loss functions. +- Visualization tools for analysing the experimental results. + +## Citing QuaPy + +If you find QuaPy useful (and we hope you will), please consider citing the original paper in your research. 
+ +```bibtex +@inproceedings{moreo2021quapy, + title={QuaPy: a python-based framework for quantification}, + author={Moreo, Alejandro and Esuli, Andrea and Sebastiani, Fabrizio}, + booktitle={Proceedings of the 30th ACM International Conference on Information \& Knowledge Management}, + pages={4534--4543}, + year={2021} +} +``` + +## Contributing + +If you want to contribute improvements to QuaPy, please open a pull request against the "devel" branch. + +## Acknowledgments + +```{image} SoBigData.png +:width: 250px +:alt: SoBigData++ +``` diff --git a/docs/source/index.rst b/docs/source/index.rst deleted file mode 100644 index 34c7944..0000000 --- a/docs/source/index.rst +++ /dev/null @@ -1,109 +0,0 @@ -.. QuaPy: A Python-based open-source framework for quantification documentation master file, created by - sphinx-quickstart on Wed Feb 7 16:26:46 2024. - You can adapt this file completely to your liking, but it should at least - contain the root `toctree` directive. - -.. toctree:: - :hidden: - - self - -Quickstart -========================================================================================== - -QuaPy is an open source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. - -QuaPy is based on the concept of "data sample", and provides implementations of the most important aspects of the quantification workflow, such as (baseline and advanced) quantification methods, quantification-oriented model selection mechanisms, evaluation measures, and evaluations protocols used for evaluating quantification methods. QuaPy also makes available commonly used datasets, and offers visualization tools for facilitating the analysis and interpretation of the experimental results. - -QuaPy is hosted on GitHub at ``_ - -Installation ------------- - -..
code-block:: none - - pip install quapy - -Citing QuaPy ------------- - -If you find QuaPy useful (and we hope you will), please consider citing the original paper in your research. - -.. code-block:: none - - @inproceedings{moreo2021quapy, - title={QuaPy: a python-based framework for quantification}, - author={Moreo, Alejandro and Esuli, Andrea and Sebastiani, Fabrizio}, - booktitle={Proceedings of the 30th ACM International Conference on Information \& Knowledge Management}, - pages={4534--4543}, - year={2021} - } - -Usage ------ - -The following script fetches a dataset of tweets, trains, applies, and evaluates a quantifier based on the *Adjusted Classify & Count* quantification method, using, as the evaluation measure, the *Mean Absolute Error* (MAE) between the predicted and the true class prevalence values of the test set:: - - import quapy as qp - from sklearn.linear_model import LogisticRegression - - dataset = qp.datasets.fetch_twitter('semeval16') - - # create an "Adjusted Classify & Count" quantifier - model = qp.method.aggregative.ACC(LogisticRegression()) - model.fit(dataset.training) - - estim_prevalence = model.quantify(dataset.test.instances) - true_prevalence = dataset.test.prevalence() - - error = qp.error.mae(true_prevalence, estim_prevalence) - - print(f'Mean Absolute Error (MAE)={error:.3f}') - -Quantification is useful in scenarios characterized by prior probability shift. In other words, we would be little interested in estimating the class prevalence values of the test set if we could assume the IID assumption to hold, as this prevalence would be roughly equivalent to the class prevalence of the training set. For this reason, any quantification model should be tested across many samples, even ones characterized by class prevalence values different or very different from those found in the training set. QuaPy implements sampling procedures and evaluation protocols that automate this workflow. See the `Manuals`_ for detailed examples. 
- -Manuals -------- - -The following manuals illustrate several aspects of QuaPy through examples: - -.. toctree:: - :maxdepth: 3 - - manuals - -.. toctree:: - :hidden: - - API - -Features --------- - -* Implementation of many popular quantification methods (Classify-&-Count and its variants, Expectation Maximization, quantification methods based on structured output learning, HDy, QuaNet, quantification ensembles, among others). -* Versatile functionality for performing evaluation based on sampling generation protocols (e.g., APP, NPP, etc.). -* Implementation of most commonly used evaluation metrics (e.g., AE, RAE, NAE, NRAE, SE, KLD, NKLD, etc.). -* Datasets frequently used in quantification (textual and numeric), including: - - * 32 UCI Machine Learning binary datasets. - * 5 UCI Machine Learning multiclass datasets (new in v0.1.8!). - * 11 Twitter quantification-by-sentiment datasets. - * 3 product reviews quantification-by-sentiment datasets. - * 4 tasks from LeQua competition (new in v0.1.7!) - * IFCB dataset of plankton water samples (new in v0.1.8!). - -* Native support for binary and single-label multiclass quantification scenarios. -* Model selection functionality that minimizes quantification-oriented loss functions. -* Visualization tools for analysing the experimental results. - -Contributing ------------- - -In case you want to contribute improvements to quapy, please generate pull request to the "devel" branch. - -Acknowledgments ---------------- - -.. 
image:: SoBigData.png - :width: 250px - :alt: SoBigData++ From 415c92f803d2acb1b36520b0d271b11e663ec936 Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 17:07:01 +0200 Subject: [PATCH 4/8] Fix cross-references within the documentation --- docs/source/manuals/evaluation.md | 4 ++-- docs/source/manuals/explicit-loss-minimization.md | 8 ++++---- docs/source/manuals/methods.md | 4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/source/manuals/evaluation.md b/docs/source/manuals/evaluation.md index d9c1b79..e5404a3 100644 --- a/docs/source/manuals/evaluation.md +++ b/docs/source/manuals/evaluation.md @@ -72,8 +72,8 @@ one specific _sample generation procotol_ to genereate many samples, typically characterized by widely varying amounts of _shift_ with respect to the original distribution, that are then used to evaluate the performance of a (trained) quantifier. -These protocols are explained in more detail in a dedicated [entry -in the wiki](Protocols.md). For the moment being, let us assume we already have +These protocols are explained in more detail in a dedicated [manual](./protocols.md). +For the moment being, let us assume we already have chosen and instantiated one specific such protocol, that we here simply call _prot_. Let also assume our model is called _quantifier_ and that our evaluatio measure of choice is diff --git a/docs/source/manuals/explicit-loss-minimization.md b/docs/source/manuals/explicit-loss-minimization.md index 23a07ea..f80c434 100644 --- a/docs/source/manuals/explicit-loss-minimization.md +++ b/docs/source/manuals/explicit-loss-minimization.md @@ -5,14 +5,14 @@ SVM(Q), SVM(KLD), SVM(NKLD), SVM(AE), or SVM(RAE). These methods require to first download the [svmperf](http://www.cs.cornell.edu/people/tj/svm_light/svm_perf.html) package, apply the patch -[svm-perf-quantification-ext.patch](./svm-perf-quantification-ext.patch), and compile the sources. 
-The script [prepare_svmperf.sh](prepare_svmperf.sh) does all the job. Simply run: +[svm-perf-quantification-ext.patch](https://github.com/HLT-ISTI/QuaPy/blob/master/svm-perf-quantification-ext.patch), and compile the sources. +The script [prepare_svmperf.sh](https://github.com/HLT-ISTI/QuaPy/blob/master/prepare_svmperf.sh) does all the job. Simply run: ``` ./prepare_svmperf.sh ``` -The resulting directory [svm_perf_quantification](./svm_perf_quantification) contains the +The resulting directory `svm_perf_quantification/` contains the patched version of _svmperf_ with quantification-oriented losses. The [svm-perf-quantification-ext.patch](https://github.com/HLT-ISTI/QuaPy/blob/master/prepare_svmperf.sh) is an extension of the patch made available by @@ -22,5 +22,5 @@ the _Q_ measure as proposed by [Barranquero et al. 2015](https://www.sciencedire and for the _KLD_ and _NKLD_ measures as proposed by [Esuli et al. 2015](https://dl.acm.org/doi/abs/10.1145/2700406?casa_token=8D2fHsGCVn0AAAAA:ZfThYOvrzWxMGfZYlQW_y8Cagg-o_l6X_PcF09mdETQ4Tu7jK98mxFbGSXp9ZSO14JkUIYuDGFG0). This patch extends the above one by also allowing SVMperf to optimize for _AE_ and _RAE_. -See [Methods.md](Methods.md) for more details and code examples. +See the [](./methods) manual for more details and code examples. diff --git a/docs/source/manuals/methods.md b/docs/source/manuals/methods.md index 03c5c2a..9536820 100644 --- a/docs/source/manuals/methods.md +++ b/docs/source/manuals/methods.md @@ -414,8 +414,8 @@ model.fit(dataset.training) estim_prevalence = model.quantify(dataset.test.instances) ``` -Check the examples _[explicit_loss_minimization.py](..%2Fexamples%2Fexplicit_loss_minimization.py)_ -and [one_vs_all.py](..%2Fexamples%2Fone_vs_all.py) for more details. 
+Check the examples on [explicit_loss_minimization](https://github.com/HLT-ISTI/QuaPy/blob/devel/examples/5.explicit_loss_minimization.py) +and on [one versus all quantification](https://github.com/HLT-ISTI/QuaPy/blob/devel/examples/10.one_vs_all.py) for more details. ### Kernel Density Estimation methods (KDEy) From b8b3cf540e52eb1e83bcfa10f242d38aa757833b Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 17:48:23 +0200 Subject: [PATCH 5/8] Correct all remaining warnings during the build of the docs --- .github/workflows/ci.yml | 2 +- docs/source/conf.py | 7 ++++++- docs/source/quapy.method.rst | 2 +- setup.py | 1 + 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index b1e275c..83662d9 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -45,7 +45,7 @@ jobs: pre-build-command: | apt-get --allow-releaseinfo-change update -y && apt-get install -y git && git --version python -m pip install --upgrade pip setuptools wheel - python -m pip install -e .[composable,docs] + python -m pip install -e .[composable,neural,docs] docs-folder: "docs/" - name: Publish documentation run: | diff --git a/docs/source/conf.py b/docs/source/conf.py index 9d86c63..702463c 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -36,6 +36,7 @@ extensions = [ 'sphinx.ext.autosummary', 'sphinx.ext.viewcode', 'sphinx.ext.napoleon', + 'sphinx.ext.intersphinx', 'myst_parser', ] @@ -55,6 +56,10 @@ exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] html_theme = 'sphinx_rtd_theme' # html_theme = 'furo' # need to be installed: pip install furo (not working...) 
-html_static_path = ['_static'] +# html_static_path = ['_static'] +# intersphinx configuration +intersphinx_mapping = { + "sklearn": ("https://scikit-learn.org/stable/", None), +} diff --git a/docs/source/quapy.method.rst b/docs/source/quapy.method.rst index 31a357a..ac0dfc8 100644 --- a/docs/source/quapy.method.rst +++ b/docs/source/quapy.method.rst @@ -53,7 +53,7 @@ quapy.method.non\_aggregative module :show-inheritance: quapy.method.composable module ------------------------- +------------------------------ .. automodule:: quapy.method.composable :members: diff --git a/setup.py b/setup.py index 23aa3ca..aa699e4 100644 --- a/setup.py +++ b/setup.py @@ -126,6 +126,7 @@ setup( extras_require={ # Optional 'bayes': ['jax', 'jaxlib', 'numpyro'], 'composable': ['qunfold @ git+https://github.com/mirkobunse/qunfold@v0.1.3'], + 'neural': ['torch'], 'tests': ['certifi'], 'docs' : ['sphinx-rtd-theme', 'myst-parser'], }, From c99c9903a33df71544bdc94f848894cea78a1006 Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 24 Jun 2024 14:19:13 +0200 Subject: [PATCH 6/8] TO REVERT: build gh-pages even on pushes to devel --- .github/workflows/ci.yml | 1 - 1 file changed, 1 deletion(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 83662d9..fb0647b 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -36,7 +36,6 @@ jobs: docs: name: Documentation runs-on: ubuntu-latest - if: github.ref == 'refs/heads/master' steps: - uses: actions/checkout@v1 - name: Build documentation From 7f05f8dd41dbab01f09c480a9128b7adc1b2e2ca Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 18:10:29 +0200 Subject: [PATCH 7/8] Fix the autodoc of the composable module --- .github/workflows/ci.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index fb0647b..fb5e8c7 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -43,7 +43,7 @@ jobs: with: pre-build-command: | 
apt-get --allow-releaseinfo-change update -y && apt-get install -y git && git --version - python -m pip install --upgrade pip setuptools wheel + python -m pip install --upgrade pip setuptools wheel "jax[cpu]" python -m pip install -e .[composable,neural,docs] docs-folder: "docs/" - name: Publish documentation From 1730d5a1a966398485e142b59efe733a3432051c Mon Sep 17 00:00:00 2001 From: Mirko Bunse Date: Mon, 1 Jul 2024 18:17:58 +0200 Subject: [PATCH 8/8] Revert "TO REVERT: build gh-pages even on pushes to devel" This reverts commit c99c9903a33df71544bdc94f848894cea78a1006. --- .github/workflows/ci.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index fb5e8c7..030b152 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -36,6 +36,7 @@ jobs: docs: name: Documentation runs-on: ubuntu-latest + if: github.ref == 'refs/heads/master' steps: - uses: actions/checkout@v1 - name: Build documentation