QuaPy/BayesianKDEy/TODO.txt

- Things to try:
- Why not optimize the calibration of the classifier directly, instead of optimizing the classifier as a component of the quantifier?
- does initializing the chain help? [seems irrelevant in MAPLS...]
- is the Aitchison kernel better? (see the sketch after this list)
- other classifiers?
- optimize classifier?
- use all datasets?
- improve KDE on wine-quality?
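  A minimal sketch of what an Aitchison-style kernel could look like (an illustrative assumption, not existing QuaPy code): a Gaussian kernel on the Aitchison distance, i.e., the Euclidean distance between clr-transformed prevalence vectors; zero components need smoothing first.

      import numpy as np

      def clr(P, eps=1e-6):
          # centered log-ratio transform; eps avoids log(0) on sparse posteriors
          P = np.clip(P, eps, None)
          P = P / P.sum(axis=-1, keepdims=True)
          logP = np.log(P)
          return logP - logP.mean(axis=-1, keepdims=True)

      def aitchison_kernel(X, Y, bandwidth=0.1):
          # Gaussian kernel on the Aitchison distance (Euclidean distance in clr space)
          d = np.linalg.norm(clr(X)[:, None, :] - clr(Y)[None, :, :], axis=-1)
          return np.exp(-0.5 * (d / bandwidth) ** 2)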
- Add other methods that natively provide uncertainty quantification?
Ratio estimator
Card & Smith
- MPIW (Mean Prediction Interval Width): the average of the interval widths (without aggregating coverage in any way)
- Implement the Interval Score (a.k.a. Winkler Score); see the sketch below
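  A minimal sketch of both metrics, assuming intervals come as per-class (low, high) arrays and following the standard definition of the interval score (function names are just placeholders):

      import numpy as np

      def mpiw(low, high):
          # Mean Prediction Interval Width: average interval width, ignoring coverage
          return np.mean(high - low)

      def interval_score(low, high, true, alpha=0.05):
          # Winkler/interval score for central (1-alpha) intervals:
          # width, plus a (2/alpha)-scaled penalty whenever the true value falls outside
          width = high - low
          below = (2 / alpha) * np.maximum(low - true, 0)
          above = (2 / alpha) * np.maximum(true - high, 0)
          return np.mean(width + below + above)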
- analyze results as a function of the amount of shift
- add Bayesian EM (a vanilla EM sketch follows the RLLS links below):
- https://github.com/ChangkunYe/MAPLS/blob/main/label_shift/mapls.py
- take this opportunity to add RLLS:
https://github.com/Angie-Liu/labelshift
https://github.com/ChangkunYe/MAPLS/blob/main/label_shift/rlls.py
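  For reference, a minimal sketch of the vanilla EM label-shift update (Saerens-Latinne-Decaestecker) that MAPLS extends; the MAP/Bayesian variant adds a prior term on top of this, so this is only a starting point, not MAPLS itself:

      import numpy as np

      def em_label_shift(posteriors, train_prev, max_iter=100, tol=1e-6):
          # posteriors: (n_samples, n_classes) classifier posteriors on the test sample
          # train_prev: (n_classes,) class prevalences observed in training
          prev = np.array(train_prev, dtype=float)
          for _ in range(max_iter):
              # E-step: reweight posteriors by the ratio current/training prevalence
              W = posteriors * (prev / train_prev)
              W /= W.sum(axis=1, keepdims=True)
              # M-step: the new prevalence estimate is the mean reweighted posterior
              new_prev = W.mean(axis=0)
              if np.abs(new_prev - prev).sum() < tol:
                  return new_prev
              prev = new_prev
          return prev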
- add CIFAR10 and MNIST? Maybe also consider the types of shift previously tested in the RLLS paper (tweak-one-out, etc.)?
- https://github.com/Angie-Liu/labelshift/tree/master
- https://github.com/Angie-Liu/labelshift/blob/master/cifar10_for_labelshift.py
- Note: MNIST is downloadable from https://archive.ics.uci.edu/dataset/683/mnist+database+of+handwritten+digits
- There seem to be some pretrained models in:
https://github.com/geifmany/cifar-vgg
https://github.com/EN10/KerasMNIST
https://github.com/tohinz/SVHN-Classifier
- consider prior knowledge in experiments (a sketch of the three scenarios follows this list):
- One scenario in which our prior is uninformative (i.e., uniform)
- One scenario in which our prior is wrong (e.g., alpha-prior = (3,2,1), protocol-prior=(1,1,5))
- One scenario in which our prior is very good (e.g., alpha-prior = (3,2,1), protocol-prior=(3,2,1))
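  A minimal sketch of how the three scenarios could be wired up (the Dirichlet parameters are the examples above; the protocol alpha in the uninformative case is an arbitrary illustrative choice):

      import numpy as np

      rng = np.random.default_rng(0)

      scenarios = {
          # name: (alpha of the Bayesian prior, alpha of the protocol generating true prevalences)
          'uninformative': ((1, 1, 1), (1, 1, 5)),   # protocol alpha chosen arbitrarily here
          'wrong prior':   ((3, 2, 1), (1, 1, 5)),
          'good prior':    ((3, 2, 1), (3, 2, 1)),
      }

      for name, (prior_alpha, protocol_alpha) in scenarios.items():
          true_prevs = rng.dirichlet(protocol_alpha, size=100)  # protocol draws
          print(name, '| prior alpha:', prior_alpha,
                '| mean protocol prevalence:', true_prevs.mean(axis=0).round(2))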
- Do all my baseline methods come with the option to specify a prior?
- consider different bandwidths within the Bayesian approach?
- how to improve the coverage (or how to widen the intervals)?
- Added temperature calibration; it improves things.
- Isn't temperature calibration actually equivalent to using a larger bandwidth in the kernels? (see the sketch below)
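  To compare both effects, a minimal sketch of temperature scaling of the posteriors (purely illustrative; whether this matches widening the KDE bandwidth is exactly the open question above):

      import numpy as np

      def temperature_scale(logits, T=2.0):
          # T > 1 flattens the posteriors (more uncertainty), T < 1 sharpens them;
          # compare against keeping T = 1 and enlarging the KDE bandwidth instead
          z = logits / T
          z = z - z.max(axis=1, keepdims=True)   # numerical stability
          expz = np.exp(z)
          return expz / expz.sum(axis=1, keepdims=True)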
- consider W as a measure of quantification error (e.g., the current w-CI is the Winkler...)
- also optimize C and class_weight? [I don't think so, but it could be done easily now]
- remove wikis from repo