<spanid="quapy-data-base-module"></span><h2>quapy.data.base module<aclass="headerlink"href="#module-quapy.data.base"title="Permalink to this headline">¶</a></h2>
<p>Abstraction of training and test <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> objects.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>training</strong>– a <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> instance</p></li>
<li><p><strong>test</strong>– a <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> instance</p></li>
<li><p><strong>vocabulary</strong>– if provided, a dictionary of the terms used in this textual dataset</p></li>
<li><p><strong>name</strong>– a string representing the name of the dataset</p></li>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">SplitStratified</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">collection</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><spanclass="pre">quapy.data.base.LabelledCollection</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">train_size</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.6</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.SplitStratified"title="Permalink to this definition">¶</a></dt>
<dd><p>Generates a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a> from a stratified split of a <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> instance.
See <aclass="reference internal"href="#quapy.data.base.LabelledCollection.split_stratified"title="quapy.data.base.LabelledCollection.split_stratified"><codeclass="xref py py-meth docutils literal notranslate"><spanclass="pre">LabelledCollection.split_stratified()</span></code></a></p>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">binary</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.binary"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">classes_</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.classes_"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">kFCV</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">data</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><spanclass="pre">quapy.data.base.LabelledCollection</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nfolds</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nrepeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.kFCV"title="Permalink to this definition">¶</a></dt>
<dd><p>Generator of stratified folds to be used in k-fold cross validation. This function is only a wrapper around
<aclass="reference internal"href="#quapy.data.base.LabelledCollection.kFCV"title="quapy.data.base.LabelledCollection.kFCV"><codeclass="xref py py-meth docutils literal notranslate"><spanclass="pre">LabelledCollection.kFCV()</span></code></a> that returns <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a> instances made of training and test folds.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>nfolds</strong>– integer (default 5), the number of folds to generate</p></li>
<li><p><strong>nrepeats</strong>– integer (default 1), the number of rounds of k-fold cross validation to run</p></li>
<li><p><strong>random_state</strong>– integer (default 0), guarantees that the folds generated are reproducible</p></li>
</ul>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>yields <cite>nfolds * nrepeats</cite> folds for k-fold cross validation as instances of <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a></p>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">load</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">train_path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">test_path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">loader_func</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">callable</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">classes</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">loader_kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.load"title="Permalink to this definition">¶</a></dt>
<dd><p>Loads a training and a test labelled set of data and converts them into a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a> instance.
The function in charge of reading the instances must be specified. This function can be a custom one, or any of
the reading functions defined in the <aclass="reference internal"href="#module-quapy.data.reader"title="quapy.data.reader"><codeclass="xref py py-mod docutils literal notranslate"><spanclass="pre">quapy.data.reader</span></code></a> module.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>train_path</strong>– string, the path to the file containing the training instances</p></li>
<li><p><strong>test_path</strong>– string, the path to the file containing the test instances</p></li>
<li><p><strong>loader_func</strong>– a custom function that implements the data loader and returns a tuple with instances and
labels</p></li>
<li><p><strong>classes</strong>– array-like, the classes according to which the instances are labelled</p></li>
<li><p><strong>loader_kwargs</strong>– any argument that the <cite>loader_func</cite> function needs in order to read the instances.
See <aclass="reference internal"href="#quapy.data.base.LabelledCollection.load"title="quapy.data.base.LabelledCollection.load"><codeclass="xref py py-meth docutils literal notranslate"><spanclass="pre">LabelledCollection.load()</span></code></a> for further details.</p></li>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">n_classes</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.n_classes"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">stats</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">show</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.stats"title="Permalink to this definition">¶</a></dt>
<dd><p>Returns (and optionally prints) a dictionary with some statistics of this dataset.</p>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">vocabulary_size</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.vocabulary_size"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">class</span></em><spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.base.</span></span><spanclass="sig-name descname"><spanclass="pre">LabelledCollection</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">instances</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">labels</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">classes_</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">Xy</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.Xy"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">artificial_sampling_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_prevalences</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">101</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.artificial_sampling_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">artificial_sampling_index_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_prevalences</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">101</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.artificial_sampling_index_generator"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">binary</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.binary"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">counts</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.counts"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">kFCV</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">nfolds</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nrepeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.kFCV"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">load</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">str</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">loader_func</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">callable</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">classes</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">loader_kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.load"title="Permalink to this definition">¶</a></dt>
<dd><p>Loads a labelled set of data and converts it into a <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> instance. The function in charge
of reading the instances must be specified. This function can be a custom one, or any of the reading functions
defined in the <aclass="reference internal"href="#module-quapy.data.reader"title="quapy.data.reader"><codeclass="xref py py-mod docutils literal notranslate"><spanclass="pre">quapy.data.reader</span></code></a> module.</p>
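<p>As an illustration of what a custom loader might look like, the sketch below reads a tab-separated file and returns a tuple with instances and labels. The file layout (one <cite>label&lt;TAB&gt;text</cite> pair per line), the function names <cite>load_tsv</cite> and <cite>call_loader</cite>, and the <cite>encoding</cite> keyword are assumptions made for the example; they are not part of the library.</p>

```python
import os
import tempfile


def load_tsv(path, encoding="utf-8"):
    """Hypothetical loader: reads 'label<TAB>text' lines from a file.

    Returns a tuple (instances, labels), which is the shape a loader
    function is expected to return.
    """
    instances, labels = [], []
    with open(path, encoding=encoding) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            label, text = line.split("\t", 1)
            labels.append(int(label))
            instances.append(text)
    return instances, labels


def call_loader(path, loader_func, **loader_kwargs):
    """Forwards the extra keyword arguments to the loader (illustration only)."""
    return loader_func(path, **loader_kwargs)


if __name__ == "__main__":
    # write a tiny file, load it back, and clean up
    with tempfile.TemporaryDirectory() as tmp_dir:
        sample_path = os.path.join(tmp_dir, "sample.tsv")
        with open(sample_path, "w", encoding="utf-8") as out:
            out.write("1\tgreat book\n0\tpoor plot\n")
        instances, labels = call_loader(sample_path, load_tsv, encoding="utf-8")
        print(instances, labels)
```

<p>Any keyword arguments given after the loader are simply passed through to it, which keeps the loading entry point independent of the file format.</p>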
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>path</strong>– string, the path to the file containing the labelled instances</p></li>
<li><p><strong>loader_func</strong>– a custom function that implements the data loader and returns a tuple with instances and
labels</p></li>
<li><p><strong>classes</strong>– array-like, the classes according to which the instances are labelled</p></li>
<li><p><strong>loader_kwargs</strong>– any argument that the <cite>loader_func</cite> function needs in order to read the instances, i.e.,
these arguments are used to call <cite>loader_func(path, **loader_kwargs)</cite></p></li>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">n_classes</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.n_classes"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">natural_sampling_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">100</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.natural_sampling_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">natural_sampling_index_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">100</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.natural_sampling_index_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">prevalence</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.prevalence"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">sampling</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">prevs</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">shuffle</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling"title="Permalink to this definition">¶</a></dt>
<dd><p>Return a random sample (an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a>) of desired size and desired prevalence
values. For each class, the sampling is drawn with replacement if the number of instances requested for that class
exceeds the number of instances of that class available in this collection, or without replacement otherwise.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>size</strong>– integer, the requested size</p></li>
<li><p><strong>prevs</strong>– the prevalence for each class; the prevalence value for the last class can be left empty since
it is constrained. E.g., for binary collections, only the prevalence <cite>p</cite> for the first class (as listed in
<cite>self.classes_</cite>) can be specified, while the other class takes prevalence value <cite>1-p</cite></p></li>
<li><p><strong>shuffle</strong>– if set to True (default), shuffles the index before returning it</p></li>
</ul>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> with length == <cite>size</cite> and prevalence close to <cite>prevs</cite> (or
prevalence == <cite>prevs</cite> if the exact prevalence values can be met as proportions of instances)</p>
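<p>The sampling rule can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the library's code: the helper name <cite>sampling_index_sketch</cite>, the rounding of per-class counts, and the optional <cite>rng</cite> argument are hypothetical.</p>

```python
import random


def sampling_index_sketch(labels, size, prevs, shuffle=True, rng=None):
    """Return positions of a sample of `size` items at the requested prevalences.

    Hypothetical helper for illustration; `prevs` may omit the last class,
    whose prevalence is then constrained to 1 - sum(prevs).
    """
    if rng is None:
        rng = random.Random()
    classes = sorted(set(labels))
    prevs = list(prevs)
    if len(prevs) == len(classes) - 1:
        prevs.append(1.0 - sum(prevs))  # the last prevalence is constrained
    if len(prevs) != len(classes):
        raise ValueError("one prevalence per class is required (the last may be omitted)")

    # assumption: counts are rounded, and the last class absorbs the remainder
    counts = [int(round(size * p)) for p in prevs[:-1]]
    counts.append(size - sum(counts))
    if counts[-1] < 0:
        raise ValueError("rounded counts exceed the requested size")

    index = []
    for label, requested in zip(classes, counts):
        candidates = [i for i, y in enumerate(labels) if y == label]
        if requested > len(candidates):
            # more items requested than available: draw with replacement
            index.extend(rng.choices(candidates, k=requested))
        else:
            # enough items available: draw without replacement
            index.extend(rng.sample(candidates, requested))
    if shuffle:
        rng.shuffle(index)
    return index
```

<p>For a binary collection, passing a single prevalence such as <cite>[0.3]</cite> fixes the first class at 30% and leaves the remaining 70% to the other class.</p>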
<spanclass="sig-name descname"><spanclass="pre">sampling_from_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">index</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling_from_index"title="Permalink to this definition">¶</a></dt>
<dd><p>Returns an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> whose elements are sampled from this collection using the
<spanclass="sig-name descname"><spanclass="pre">sampling_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">prevs</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">shuffle</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling_index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">split_stratified</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">train_prop</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.6</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.split_stratified"title="Permalink to this definition">¶</a></dt>
<dd><p>Returns two instances of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> split with stratification from this collection, at desired
proportion.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>train_prop</strong>– the proportion of elements to include in the left-most returned collection (typically used
as the training collection). The remaining elements are included in the right-most returned collection
(typically used as a test collection).</p></li>
<li><p><strong>random_state</strong>– if specified, guarantees reproducibility of the split.</p></li>
</ul>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>two instances of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a>, the first one with <cite>train_prop</cite> elements, and the
second one with <cite>1-train_prop</cite> elements</p>
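<p>A stratified split of this kind can be illustrated as follows. The sketch works on label positions only and is not the library's implementation; the helper name <cite>split_stratified_index</cite> and the rounding of the per-class cut point are assumptions.</p>

```python
import random
from collections import defaultdict


def split_stratified_index(labels, train_prop=0.6, random_state=None):
    """Return (train_index, test_index) preserving class proportions.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(random_state)  # a fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for position, label in enumerate(labels):
        by_class[label].append(position)

    train, test = [], []
    for label in sorted(by_class):
        positions = by_class[label][:]
        rng.shuffle(positions)
        # assumption: the per-class cut point is rounded to the nearest integer
        cut = int(round(train_prop * len(positions)))
        train.extend(positions[:cut])
        test.extend(positions[cut:])
    return train, test
```

<p>Splitting each class separately is what keeps the class prevalences of both parts close to those of the original collection.</p>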
<spanclass="sig-name descname"><spanclass="pre">stats</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">show</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.stats"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">uniform_sampling</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.uniform_sampling"title="Permalink to this definition">¶</a></dt>
<dd><p>Returns a uniform sample (an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a>) of desired size. The sampling is drawn
with replacement if the requested size is greater than the number of instances, or without replacement
otherwise.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>size</strong>– integer, the requested size</p>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> with length == <cite>size</cite></p>
<spanclass="sig-name descname"><spanclass="pre">uniform_sampling_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.uniform_sampling_index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.base.</span></span><spanclass="sig-name descname"><spanclass="pre">isbinary</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">data</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.isbinary"title="Permalink to this definition">¶</a></dt>
<dd><p>Returns True if <cite>data</cite> is either a binary <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a> or a binary <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a></p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>data</strong>– a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">Dataset</span></code></a> or a <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">LabelledCollection</span></code></a> object</p>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>True if labelled according to two classes</p>
<spanid="quapy-data-datasets-module"></span><h2>quapy.data.datasets module<aclass="headerlink"href="#module-quapy.data.datasets"title="Permalink to this headline">¶</a></h2>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">fetch_UCIDataset</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset_name</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">data_home</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">test_split</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.3</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><spanclass="sig-return"><spanclass="sig-return-icon">→</span><spanclass="sig-return-typehint"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></span><aclass="headerlink"href="#quapy.data.datasets.fetch_UCIDataset"title="Permalink to this definition">¶</a></dt>
<dd><p>Loads a UCI dataset as an instance of <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a>, as used in
<aclass="reference external"href="https://www.sciencedirect.com/science/article/pii/S1566253516300628">Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
Information Fusion, 34, 87-100.</a>
and
<aclass="reference external"href="https://www.sciencedirect.com/science/article/pii/S1566253517303652">Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019).
Dynamic ensemble selection for quantification tasks.
Information Fusion, 45, 1-15.</a>.
The datasets do not come with a predefined train-test split (see <aclass="reference internal"href="#quapy.data.datasets.fetch_UCILabelledCollection"title="quapy.data.datasets.fetch_UCILabelledCollection"><codeclass="xref py py-meth docutils literal notranslate"><spanclass="pre">fetch_UCILabelledCollection()</span></code></a> for further
information on how to use these collections), and so a train-test split is generated at desired proportion.
The list of valid dataset names can be accessed in <cite>quapy.data.datasets.UCI_DATASETS</cite></p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset_name</strong>– a dataset name</p></li>
<li><p><strong>data_home</strong>– specify the quapy home directory where collections will be dumped (leave empty to use the default
~/quapy_data/ directory)</p></li>
<li><p><strong>test_split</strong>– proportion of instances to be included in the test set; the rest forms the training set</p></li>
<li><p><strong>verbose</strong>– set to True (default is False) to get information (from the UCI ML repository) about the datasets</p></li>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">fetch_UCILabelledCollection</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset_name</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">data_home</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><spanclass="sig-return"><spanclass="sig-return-icon">→</span><spanclass="sig-return-typehint"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></span><aclass="headerlink"href="#quapy.data.datasets.fetch_UCILabelledCollection"title="Permalink to this definition">¶</a></dt>
<dd><p>Loads a UCI collection as an instance of <aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.LabelledCollection</span></code></a>, as used in
<aclass="reference external"href="https://www.sciencedirect.com/science/article/pii/S1566253516300628">Pérez-Gállego, P., Quevedo, J. R., & del Coz, J. J. (2017).
Using ensembles for problems with characterizable changes in data distribution: A case study on quantification.
Information Fusion, 34, 87-100.</a>
and
<aclass="reference external"href="https://www.sciencedirect.com/science/article/pii/S1566253517303652">Pérez-Gállego, P., Castano, A., Quevedo, J. R., & del Coz, J. J. (2019).
Dynamic ensemble selection for quantification tasks.
Information Fusion, 45, 1-15.</a>.
The datasets do not come with a predefined train-test split, and so Pérez-Gállego et al. adopted a 5FCVx2 evaluation
protocol, meaning that each collection was used to generate two rounds (hence the x2) of 5 fold cross validation.
This can be reproduced by using <aclass="reference internal"href="#quapy.data.base.Dataset.kFCV"title="quapy.data.base.Dataset.kFCV"><codeclass="xref py py-meth docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset.kFCV()</span></code></a>, e.g.:</p>
<dd><p>Loads a Reviews dataset as a Dataset instance, as used in
<aclass="reference external"href="https://dl.acm.org/doi/abs/10.1145/3269206.3269287">Esuli, A., Moreo, A., and Sebastiani, F. “A recurrent neural network for sentiment quantification.”
Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 2018.</a>.
The list of valid dataset names is available in <cite>quapy.data.datasets.REVIEWS_SENTIMENT_DATASETS</cite>.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset_name</strong>– the name of the dataset: valid ones are ‘hp’, ‘kindle’, ‘imdb’</p></li>
<li><p><strong>tfidf</strong>– set to True to transform the raw documents into tfidf weighted matrices</p></li>
<li><p><strong>min_df</strong>– minimum number of documents that should contain a term in order for the term to be
kept (ignored if tfidf==False)</p></li>
<li><p><strong>data_home</strong>– specify the quapy home directory where collections will be dumped (leave empty to use the default
~/quapy_data/ directory)</p></li>
<li><p><strong>pickle</strong>– set to True to pickle the Dataset object the first time it is generated, in order to allow for
<dd><p>Loads a Twitter dataset as a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> instance, as used in:
<aclass="reference external"href="https://link.springer.com/content/pdf/10.1007/s13278-016-0327-z.pdf">Gao, W., Sebastiani, F.: From classification to quantification in tweet sentiment analysis.
Social Network Analysis and Mining 6(19), 1–22 (2016)</a>.
Note that the datasets ‘semeval13’, ‘semeval14’, ‘semeval15’ share the same training set.
The list of valid dataset names corresponding to training sets can be accessed in
<cite>quapy.data.datasets.TWITTER_SENTIMENT_DATASETS_TRAIN</cite>, while the test sets can be accessed in
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">warn</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">args</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.datasets.warn"title="Permalink to this definition">¶</a></dt>
<spanid="quapy-data-preprocessing-module"></span><h2>quapy.data.preprocessing module<aclass="headerlink"href="#module-quapy.data.preprocessing"title="Permalink to this headline">¶</a></h2>
<emclass="property"><spanclass="pre">class</span></em><spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">IndexTransformer</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer"title="Permalink to this definition">¶</a></dt>
<ddclass="field-odd"><p><strong>kwargs</strong>–<p>keyword arguments accepted by <aclass="reference external"href="https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html">CountVectorizer</a></p>
<spanclass="sig-name descname"><spanclass="pre">add_word</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">word</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">id</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nogaps</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.add_word"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">fit</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.fit"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">fit_transform</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_jobs</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">-</span><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.fit_transform"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">transform</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_jobs</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">-</span><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.transform"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">vocabulary_size</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.vocabulary_size"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">min_df</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.index"title="Permalink to this definition">¶</a></dt>
<dd><p>Indexes the tokens of a textual <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> of string documents.
To index a document means to replace each different token by a unique numerical index.
Rare words (i.e., words occurring fewer than <cite>min_df</cite> times) are replaced by a special token <cite>UNK</cite>.</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset</strong>– a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> object where the instances of training and test documents
are lists of str</p></li>
<li><p><strong>min_df</strong>– minimum number of occurrences below which the term is replaced by a <cite>UNK</cite> index</p></li>
<li><p><strong>inplace</strong>– whether or not to apply the transformation inplace (True), or to a new copy (False, default)</p></li>
<li><p><strong>kwargs</strong>– the rest of parameters of the transformation (as for sklearn’s</p></li>
Returns a new <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> (if inplace=False) or a reference to the current</p>
<blockquote>
<div><p><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> (inplace=True) consisting of lists of integer values representing indices.</p>
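<p>The indexing step described above can be sketched in plain Python. This is a minimal illustration and not QuaPy's implementation: the helper names are hypothetical, tokens are obtained by whitespace splitting, and token frequency is assumed to mean the number of training documents containing the token.</p>

```python
from collections import Counter

UNK = "UNK"  # hypothetical name for the special token


def build_vocabulary(train_docs, min_df=5):
    """Map each token found in at least min_df training documents to an integer id."""
    doc_freq = Counter(tok for doc in train_docs for tok in set(doc.split()))
    vocab = {UNK: 0}
    for tok in sorted(doc_freq):
        if doc_freq[tok] >= min_df and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def index_docs(docs, vocab):
    """Replace every token by its id; unknown or rare tokens get the UNK id."""
    return [[vocab.get(tok, vocab[UNK]) for tok in doc.split()] for doc in docs]


train = ["good movie", "good plot", "bad movie"]
test = ["good ending"]
vocab = build_vocabulary(train, min_df=2)
print(index_docs(train, vocab))  # [[1, 2], [1, 0], [0, 2]]
print(index_docs(test, vocab))   # [[1, 0]]
```

<p>The vocabulary is built from the training documents only, so any token first seen at test time is mapped to the <cite>UNK</cite> id.</p>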
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">reduce_columns</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">min_df</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.reduce_columns"title="Permalink to this definition">¶</a></dt>
<dd><p>Reduces the dimensionality of the instances, represented as a <cite>csr_matrix</cite> (or any subtype of
<cite>scipy.sparse.spmatrix</cite>), of training and test documents by removing the columns of words which are not present
in at least <cite>min_df</cite> instances in the training set</p>
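<p>As a rough illustration of this column filter, the sketch below represents each sparse row as a dict from column index to value. The function name <cite>reduce_columns_sketch</cite> and the dict-based representation are assumptions made for the example; QuaPy itself operates on <cite>scipy.sparse</cite> matrices.</p>

```python
from collections import Counter


def reduce_columns_sketch(train_rows, test_rows, min_df=5):
    """Drop columns that are nonzero in fewer than min_df training rows.

    Rows are sparse dicts {column_index: value}; kept columns are renumbered.
    """
    col_freq = Counter(col for row in train_rows for col, val in row.items() if val != 0)
    kept = sorted(col for col, n in col_freq.items() if n >= min_df)
    new_index = {col: j for j, col in enumerate(kept)}

    def project(rows):
        return [{new_index[c]: v for c, v in row.items() if c in new_index} for row in rows]

    return project(train_rows), project(test_rows)


train = [{0: 1.0, 2: 3.0}, {0: 2.0, 1: 1.0}, {0: 1.0, 2: 1.0}]
test = [{1: 4.0, 2: 5.0}]
new_train, new_test = reduce_columns_sketch(train, test, min_df=2)
print(new_train)  # [{0: 1.0, 1: 3.0}, {0: 2.0}, {0: 1.0, 1: 1.0}]
print(new_test)   # [{1: 5.0}]
```

<p>Column frequencies are counted on the training rows only, and the same selection is then applied to the test rows so that both parts keep the same dimensionality.</p>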
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset</strong>– a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> in which instances are represented in sparse format (any
subtype of scipy.sparse.spmatrix)</p></li>
<li><p><strong>min_df</strong>– integer, minimum number of instances below which the columns are removed</p></li>
<li><p><strong>inplace</strong>– whether or not to apply the transformation inplace (True), or to a new copy (False, default)</p></li>
</ul>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>a new <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> (if inplace=False) or a reference to the current
<aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> (inplace=True) where the dimensions corresponding to infrequent terms
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">standardize</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.standardize"title="Permalink to this definition">¶</a></dt>
<dd><p>Standardizes the real-valued columns of a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a>.
Standardization, aka z-scoring, of a variable <cite>X</cite> comes down to subtracting the average and normalizing by the
standard deviation.</p>
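<p>A minimal sketch of z-scoring with the standard library follows. The helper names are hypothetical, and the sketch assumes that the mean and the (population) standard deviation are estimated on the training instances and then reused for the test instances; it is not QuaPy's implementation.</p>

```python
from statistics import fmean, pstdev


def fit_standardizer(train_rows):
    """Compute the per-column mean and (population) standard deviation."""
    columns = list(zip(*train_rows))
    means = [fmean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is 0
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_standardizer(rows, means, stds):
    """Subtract the mean and divide by the standard deviation, column by column."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


train = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
test = [[3.0, 12.0]]
means, stds = fit_standardizer(train)
z_train = apply_standardizer(train, means, stds)
z_test = apply_standardizer(test, means, stds)
```

<p>After the transformation, every non-constant training column has zero mean and unit standard deviation; the test columns are only approximately so, since they are scaled with the training statistics.</p>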
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset</strong>– a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> object</p></li>
<li><p><strong>inplace</strong>– set to True if the transformation is to be applied inplace, or to False (default) if a new
<aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> is to be returned</p></li>
<dd><p>Transforms a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> of textual instances into a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> of
tfidf weighted sparse vectors</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><ulclass="simple">
<li><p><strong>dataset</strong>– a <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> where the instances of training and test collections are
lists of str</p></li>
<li><p><strong>min_df</strong>– minimum number of occurrences for a word to be considered as part of the vocabulary (default 3)</p></li>
<li><p><strong>sublinear_tf</strong>– whether or not to apply log scaling to the tf counts (default True)</p></li>
<li><p><strong>inplace</strong>– whether or not to apply the transformation inplace (True), or to a new copy (False, default)</p></li>
<li><p><strong>kwargs</strong>– the rest of parameters of the transformation (as for sklearn’s
<ddclass="field-even"><p>a new <aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><codeclass="xref py py-class docutils literal notranslate"><spanclass="pre">quapy.data.base.Dataset</span></code></a> in <cite>csr_matrix</cite> format (if inplace=False) or a reference to the
current Dataset (if inplace=True) where the instances are stored in a <cite>csr_matrix</cite> of real-valued tfidf scores</p>
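<p>To make the weighting concrete, here is a small standard-library sketch of tfidf with optional sublinear tf. It is an illustration under assumptions, not QuaPy's code: the function name is hypothetical, tokens are split on whitespace, the idf uses the smoothed form ln((1+n)/(1+df))+1, and vectors are L2-normalized and stored as dicts instead of a <cite>csr_matrix</cite>.</p>

```python
import math
from collections import Counter


def tfidf_sketch(train_docs, test_docs, min_df=3, sublinear_tf=True):
    """Turn whitespace-tokenized documents into L2-normalized tfidf dicts {term: weight}."""
    n_docs = len(train_docs)
    doc_freq = Counter(tok for doc in train_docs for tok in set(doc.split()))
    # Smoothed idf, learned on the training documents only
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items() if df >= min_df}

    def vectorize(doc):
        counts = Counter(tok for tok in doc.split() if tok in idf)
        weights = {}
        for tok, tf in counts.items():
            # Sublinear scaling replaces the raw count tf by 1 + ln(tf)
            tf_weight = 1.0 + math.log(tf) if sublinear_tf else float(tf)
            weights[tok] = tf_weight * idf[tok]
        norm = math.sqrt(sum(w * w for w in weights.values()))
        return {t: w / norm for t, w in weights.items()} if norm > 0 else weights

    return [vectorize(d) for d in train_docs], [vectorize(d) for d in test_docs]


train = ["good good movie", "good plot", "bad movie"]
test = ["good unknown"]
train_vecs, test_vecs = tfidf_sketch(train, test, min_df=2)
```

<p>With <cite>min_df=2</cite> only the terms "good" and "movie" survive, so the test document reduces to a single-term vector. With sublinear tf, the repeated "good" in the first training document weighs 1 + ln(2) times the idf instead of twice the idf.</p>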
<spanid="quapy-data-reader-module"></span><h2>quapy.data.reader module<aclass="headerlink"href="#module-quapy.data.reader"title="Permalink to this headline">¶</a></h2>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">binarize</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">y</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">pos_class</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.binarize"title="Permalink to this definition">¶</a></dt>
<li><p><strong>y</strong>– array-like of labels</p></li>
<li><p><strong>pos_class</strong>– integer, the positive class</p></li>
</ul>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>a binary np.ndarray, in which the value 1 corresponds to positions in which <cite>y</cite> had the <cite>pos_class</cite> label, and
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_csv</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">encoding</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">'utf-8'</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_csv"title="Permalink to this definition">¶</a></dt>
<dd><p>Reads a csv file in which columns are separated by ‘,’.
File format: &lt;label&gt;,&lt;feat1&gt;,&lt;feat2&gt;,…,&lt;featn&gt;</p>
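<p>A file in this layout can be parsed with the standard <cite>csv</cite> module, as in the sketch below. The function name <cite>parse_labelled_csv</cite> is hypothetical, and the sketch assumes that labels are kept as strings and that all features are numeric.</p>

```python
import csv
import io


def parse_labelled_csv(stream):
    """Read rows of the form label,feat1,...,featn into (features, labels)."""
    features, labels = [], []
    for row in csv.reader(stream):
        if not row:  # skip blank lines
            continue
        labels.append(row[0])
        features.append([float(x) for x in row[1:]])
    return features, labels


sample = io.StringIO("pos,0.5,1.0\nneg,0.0,2.5\n")
X, y = parse_labelled_csv(sample)
print(X)  # [[0.5, 1.0], [0.0, 2.5]]
print(y)  # ['pos', 'neg']
```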
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_sparse</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_sparse"title="Permalink to this definition">¶</a></dt>
<dd><p>Reads a labelled collection of real-valued instances expressed in sparse format.
File format: &lt;-1 or 0 or 1&gt;[s col(int):val(float)]</p>
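<p>One line of this sparse layout can be parsed as sketched below. The helper <cite>parse_sparse_line</cite> is hypothetical and assumes that the label and the col:val pairs are separated by whitespace.</p>

```python
def parse_sparse_line(line):
    """Parse 'label col:val col:val ...' into (label, {col: val})."""
    parts = line.split()
    label = int(parts[0])
    row = {}
    for pair in parts[1:]:
        col, val = pair.split(":")
        row[int(col)] = float(val)
    return label, row


print(parse_sparse_line("1 3:0.5 10:2.0"))  # (1, {3: 0.5, 10: 2.0})
print(parse_sparse_line("-1 7:1.25"))       # (-1, {7: 1.25})
```

<p>Only the columns listed on a line are stored; every other column of that instance is implicitly zero.</p>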
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>path</strong>– path to the labelled collection</p>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_text</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">encoding</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">'utf-8'</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">class2int</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_text"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">reindex_labels</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">y</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.reindex_labels"title="Permalink to this definition">¶</a></dt>
<dd><p>Re-indexes a list of labels as a list of indexes, and returns the classnames corresponding to the indexes.
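<p>The re-indexing can be pictured with the following sketch. The function name is hypothetical, and the assumption that class names are returned in sorted order is made for the example only.</p>

```python
def reindex_labels_sketch(y):
    """Return (indexes, classnames) such that classnames[indexes[i]] == y[i]."""
    classnames = sorted(set(y))
    position = {name: i for i, name in enumerate(classnames)}
    return [position[label] for label in y], classnames


indexes, classnames = reindex_labels_sketch(["B", "B", "A", "C"])
print(indexes)     # [1, 1, 0, 2]
print(classnames)  # ['A', 'B', 'C']
```

<p>The original labels can always be recovered by looking up each index in the returned list of class names.</p>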