<spanid="quapy-data-base-module"></span><h2>quapy.data.base module<aclass="headerlink"href="#module-quapy.data.base"title="Permalink to this headline">¶</a></h2>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">SplitStratified</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">collection</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><spanclass="pre">quapy.data.base.LabelledCollection</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">train_size</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.6</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.SplitStratified"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">binary</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.binary"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">classes_</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.classes_"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">kFCV</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">data</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.LabelledCollection"title="quapy.data.base.LabelledCollection"><spanclass="pre">quapy.data.base.LabelledCollection</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nfolds</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nrepeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.kFCV"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">load</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">train_path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">test_path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">loader_func</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">callable</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.load"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">n_classes</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.n_classes"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">stats</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.Dataset.stats"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">vocabulary_size</span></span><aclass="headerlink"href="#quapy.data.base.Dataset.vocabulary_size"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">class</span></em><spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.base.</span></span><spanclass="sig-name descname"><spanclass="pre">LabelledCollection</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">instances</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">labels</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">classes_</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">Xy</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.Xy"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">artificial_sampling_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_prevalences</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">101</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.artificial_sampling_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">artificial_sampling_index_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_prevalences</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">101</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.artificial_sampling_index_generator"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">binary</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.binary"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">counts</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.counts"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">kFCV</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">nfolds</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nrepeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.kFCV"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">classmethod</span></em><spanclass="sig-name descname"><spanclass="pre">load</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">str</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">loader_func</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><spanclass="pre">callable</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">classes</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.load"title="Permalink to this definition">¶</a></dt>
<emclass="property"><spanclass="pre">property</span></em><spanclass="sig-name descname"><spanclass="pre">n_classes</span></span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.n_classes"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">natural_sampling_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">100</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.natural_sampling_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">natural_sampling_index_generator</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">sample_size</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">repeats</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">100</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.natural_sampling_index_generator"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">prevalence</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.prevalence"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">sampling</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">prevs</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">shuffle</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">sampling_from_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">index</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling_from_index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">sampling_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">prevs</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">shuffle</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.sampling_index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">split_stratified</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">train_prop</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.6</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">random_state</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.split_stratified"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">stats</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">show</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.stats"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">uniform_sampling</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.uniform_sampling"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">uniform_sampling_index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">size</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.LabelledCollection.uniform_sampling_index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.base.</span></span><spanclass="sig-name descname"><spanclass="pre">isbinary</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">data</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.base.isbinary"title="Permalink to this definition">¶</a></dt>
<spanid="quapy-data-datasets-module"></span><h2>quapy.data.datasets module<aclass="headerlink"href="#module-quapy.data.datasets"title="Permalink to this headline">¶</a></h2>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">df_replace</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="pre">df</span></em>, <emclass="sig-param"><spanclass="pre">col</span></em>, <emclass="sig-param"><spanclass="pre">repl={'no':</span><spanclass="pre">0</span></em>, <emclass="sig-param"><spanclass="pre">'yes':</span><spanclass="pre">1}</span></em>, <emclass="sig-param"><spanclass="pre">astype=<class</span><spanclass="pre">'float'></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.datasets.df_replace"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">fetch_UCIDataset</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset_name</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">data_home</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">test_split</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">0.3</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><spanclass="sig-return"><spanclass="sig-return-icon">→</span><spanclass="sig-return-typehint"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></span><aclass="headerlink"href="#quapy.data.datasets.fetch_UCIDataset"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">fetch_UCILabelledCollection</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset_name</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">data_home</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><spanclass="sig-return"><spanclass="sig-return-icon">→</span><spanclass="sig-return-typehint"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></span><aclass="headerlink"href="#quapy.data.datasets.fetch_UCILabelledCollection"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.datasets.</span></span><spanclass="sig-name descname"><spanclass="pre">warn</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="o"><spanclass="pre">*</span></span><spanclass="n"><spanclass="pre">args</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.datasets.warn"title="Permalink to this definition">¶</a></dt>
<spanid="quapy-data-preprocessing-module"></span><h2>quapy.data.preprocessing module<aclass="headerlink"href="#module-quapy.data.preprocessing"title="Permalink to this headline">¶</a></h2>
<emclass="property"><spanclass="pre">class</span></em><spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">IndexTransformer</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">add_word</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">word</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">id</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">None</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">nogaps</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.add_word"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">fit</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.fit"title="Permalink to this definition">¶</a></dt>
<dd><dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>X</strong>– a list of strings</p>
<spanclass="sig-name descname"><spanclass="pre">fit_transform</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_jobs</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">-</span><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.fit_transform"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">documents</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.index"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">transform</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">X</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">n_jobs</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">-</span><spanclass="pre">1</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.transform"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-name descname"><spanclass="pre">vocabulary_size</span></span><spanclass="sig-paren">(</span><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.IndexTransformer.vocabulary_size"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">index</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">min_df</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em>, <emclass="sig-param"><spanclass="o"><spanclass="pre">**</span></span><spanclass="n"><spanclass="pre">kwargs</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.index"title="Permalink to this definition">¶</a></dt>
<dd><p>Indexes a dataset of strings. To index a document means to replace each distinct token with a unique numerical index.
Rare words (i.e., words occurring fewer than _min_df_ times) are replaced by a special token UNK.
:param dataset: a Dataset where the instances are lists of str
:param min_df: minimum number of instances below which the term is replaced by the UNK index
:param inplace: whether to apply the transformation in place or to a new copy
:param kwargs: the remaining parameters of the transformation (as for sklearn.feature_extraction.text.CountVectorizer)
:return: a new Dataset (if inplace=False) or a reference to the current Dataset (if inplace=True),
consisting of lists of integer values representing indices.</p>
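<p>The indexing step described above can be illustrated with a minimal, library-free sketch. It is not the QuaPy implementation: the helper names <code>build_vocabulary</code> and <code>index_documents</code>, the constant <code>UNK</code>, and the choice of reserving id 0 for the unknown token are all assumptions made for illustration, and the sketch counts a token once per document (document frequency).</p>

```python
from collections import Counter

UNK = "UNK"  # hypothetical name for the special unknown-token entry


def build_vocabulary(documents, min_df=5):
    """Map each token found in at least `min_df` documents to an integer id."""
    # Document frequency: count each token once per document.
    doc_freq = Counter(token for doc in documents for token in set(doc))
    vocabulary = {UNK: 0}  # assumption: id 0 is reserved for rare/unseen tokens
    for token in sorted(doc_freq):
        if doc_freq[token] >= min_df:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def index_documents(documents, vocabulary):
    """Replace every token by its id; rare or unseen tokens get the UNK id."""
    unk_id = vocabulary[UNK]
    return [[vocabulary.get(token, unk_id) for token in doc] for doc in documents]


docs = [["good", "movie"], ["bad", "movie"], ["good", "plot", "movie"]]
vocab = build_vocabulary(docs, min_df=2)
indexed = index_documents(docs, vocab)
print(vocab)
print(indexed)
```

<p>With <code>min_df=2</code>, the tokens "bad" and "plot" appear in a single document each, so both are mapped to the UNK id.</p>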
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">reduce_columns</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">min_df</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">5</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">False</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.reduce_columns"title="Permalink to this definition">¶</a></dt>
<dd><p>Reduces the dimensionality of the instances, represented as a csr_matrix, by removing the columns of words which are not present in at least
_min_df_ instances.
:param dataset: a Dataset in sparse format (any subtype of scipy.sparse.spmatrix)
:param min_df: minimum number of instances below which the columns are removed
:param inplace: whether to apply the transformation in place or to a new copy
:return: a new Dataset (if inplace=False) or a reference to the current Dataset (if inplace=True),
where the dimensions corresponding to infrequent terms have been removed</p>
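<p>The column filtering described above can be sketched in plain Python. This is only an illustration under stated assumptions: rows are represented as <code>{column_index: value}</code> dictionaries standing in for the rows of a sparse matrix, and the function name <code>reduce_columns_sketch</code> is hypothetical, not part of QuaPy.</p>

```python
from collections import Counter


def reduce_columns_sketch(rows, min_df=5):
    """Drop columns that are non-zero in fewer than `min_df` rows.

    `rows` is a list of {column_index: value} dicts, a plain-Python stand-in
    for the rows of a sparse matrix. Returns the reduced rows (columns are
    renumbered consecutively) and the sorted list of original columns kept.
    """
    # Number of rows in which each column holds a non-zero value.
    column_df = Counter(col for row in rows for col, val in row.items() if val != 0)
    kept = sorted(col for col, df in column_df.items() if df >= min_df)
    new_index = {old: new for new, old in enumerate(kept)}
    reduced = [
        {new_index[col]: val for col, val in row.items() if col in new_index and val != 0}
        for row in rows
    ]
    return reduced, kept


rows = [{0: 1.0, 2: 3.0}, {0: 2.0, 1: 1.0}, {0: 1.0, 2: 1.0}]
reduced, kept = reduce_columns_sketch(rows, min_df=2)
print(kept)
print(reduced)
```

<p>Here column 1 is non-zero in only one row, so it is dropped and the remaining columns 0 and 2 are renumbered as 0 and 1.</p>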
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.preprocessing.</span></span><spanclass="sig-name descname"><spanclass="pre">standardize</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">dataset</span></span><spanclass="p"><spanclass="pre">:</span></span><spanclass="n"><aclass="reference internal"href="#quapy.data.base.Dataset"title="quapy.data.base.Dataset"><spanclass="pre">quapy.data.base.Dataset</span></a></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">inplace</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.preprocessing.standardize"title="Permalink to this definition">¶</a></dt>
<spanid="quapy-data-reader-module"></span><h2>quapy.data.reader module<aclass="headerlink"href="#module-quapy.data.reader"title="Permalink to this headline">¶</a></h2>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">binarize</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">y</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">pos_class</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.binarize"title="Permalink to this definition">¶</a></dt>
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_csv</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">encoding</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">'utf-8'</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_csv"title="Permalink to this definition">¶</a></dt>
<dd><p>Reads a CSV file in which columns are separated by ‘,’.
File format: &lt;label&gt;,&lt;feat1&gt;,&lt;feat2&gt;,…,&lt;featn&gt;</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>path</strong>– path to the csv file</p>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>an ndarray for the labels and an ndarray (float) for the covariates</p>
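<p>A file in this layout can be parsed with the standard library alone, as the following sketch shows. It is not the library's reader: the function name <code>read_labelled_csv</code> is hypothetical, and it returns plain lists rather than ndarrays to stay dependency-free.</p>

```python
import csv
import io


def read_labelled_csv(stream):
    """Parse rows of the form label,feat1,...,featn from a text stream."""
    labels, features = [], []
    for row in csv.reader(stream):
        if not row:  # skip blank lines
            continue
        labels.append(row[0])  # first column is the label
        features.append([float(value) for value in row[1:]])  # the rest are floats
    return labels, features


# An in-memory file is used here instead of a path on disk.
sample = io.StringIO("pos,0.5,1.0\nneg,0.0,2.5\n")
labels, features = read_labelled_csv(sample)
print(labels)
print(features)
```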
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_sparse</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_sparse"title="Permalink to this definition">¶</a></dt>
<dd><p>Reads a labelled collection of real-valued instances expressed in sparse format.
File format: &lt;-1 or 0 or 1&gt;[s col(int):val(float)]</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>path</strong>– path to the labelled collection</p>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>a csr_matrix containing the instances (rows), and an ndarray containing the labels</p>
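<p>The line format above can be parsed roughly as follows. This sketch assumes that the label and the <code>col:val</code> pairs are separated by whitespace, keeps column indices and labels exactly as written in the file, and stores each row as a dictionary instead of building a csr_matrix. The function names are hypothetical.</p>

```python
def parse_sparse_line(line):
    """Parse one line 'label col:val col:val ...' into (label, {col: val})."""
    parts = line.split()  # assumption: fields are whitespace-separated
    label = int(parts[0])
    row = {}
    for col_val in parts[1:]:
        col, val = col_val.split(":")
        row[int(col)] = float(val)
    return label, row


def read_sparse(lines):
    """Parse an iterable of lines, skipping empty ones."""
    labels, rows = [], []
    for line in lines:
        if line.strip():
            label, row = parse_sparse_line(line)
            labels.append(label)
            rows.append(row)
    return rows, labels


rows, labels = read_sparse(["1 3:0.5 7:2.0", "-1 1:1.0", ""])
print(rows)
print(labels)
```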
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">from_text</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">path</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">encoding</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">'utf-8'</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">verbose</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">1</span></span></em>, <emclass="sig-param"><spanclass="n"><spanclass="pre">class2int</span></span><spanclass="o"><spanclass="pre">=</span></span><spanclass="default_value"><spanclass="pre">True</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.from_text"title="Permalink to this definition">¶</a></dt>
<dd><p>Reads a labelled collection of documents.
File format: &lt;0 or 1&gt;&lt;document&gt;</p>
<dlclass="field-list simple">
<dtclass="field-odd">Parameters</dt>
<ddclass="field-odd"><p><strong>path</strong>– path to the labelled collection</p>
</dd>
<dtclass="field-even">Returns</dt>
<ddclass="field-even"><p>a list of sentences, and a list of labels</p>
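<p>A reader for this kind of file might look like the sketch below. The separator between the label and the document is not stated above, so the sketch assumes a tab character and exposes it as a parameter. The function name <code>read_labelled_text</code> is hypothetical.</p>

```python
def read_labelled_text(lines, separator="\t"):
    """Split each line into an integer label and the document text.

    Assumption: the label and the document are separated by `separator`
    (a tab by default); lines without it raise ValueError.
    """
    documents, labels = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:  # skip empty lines
            continue
        label, document = line.split(separator, 1)
        labels.append(int(label))
        documents.append(document)
    return documents, labels


docs, labels = read_labelled_text(["1\tgreat film\n", "0\tdull and slow\n", "\n"])
print(docs)
print(labels)
```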
<spanclass="sig-prename descclassname"><spanclass="pre">quapy.data.reader.</span></span><spanclass="sig-name descname"><spanclass="pre">reindex_labels</span></span><spanclass="sig-paren">(</span><emclass="sig-param"><spanclass="n"><spanclass="pre">y</span></span></em><spanclass="sig-paren">)</span><aclass="headerlink"href="#quapy.data.reader.reindex_labels"title="Permalink to this definition">¶</a></dt>
<dd><p>Re-indexes a list of labels as a list of indices, and returns the class names corresponding to those indices.