#include <dataset.h>

Public Member Functions
| Return type | Member | Description |
| --- | --- | --- |
| virtual bool | IsHidden (size_t iNode) const =0 | Returns true if the requested experimental node is hidden (does not correspond to a data file). |
| virtual size_t | GetDiscrete (size_t iY, size_t iX, size_t iNode) const =0 | Return the discretized value at the requested position. |
| virtual float | GetContinuous (size_t iY, size_t iX, size_t iNode) const =0 | Return the continuous value at the requested position. |
| virtual const std::string & | GetGene (size_t iGene) const =0 | Returns the gene name at the requested index. |
| virtual size_t | GetGenes () const =0 | Returns the number of genes in the dataset. |
| virtual bool | IsExample (size_t iY, size_t iX) const =0 | Returns true if some data file can be accessed at the requested position. |
| virtual const std::vector < std::string > & | GetGeneNames () const =0 | Return a vector of all gene names in the dataset. |
| virtual size_t | GetExperiments () const =0 | Return the number of experimental nodes in the dataset. |
| virtual size_t | GetGene (const std::string &strGene) const =0 | Return the index of the given gene name, or -1 if it is not included in the dataset. Because the return type is size_t, -1 arrives as (size_t)-1, the largest size_t value. |
| virtual size_t | GetBins (size_t iNode) const =0 | Return the number of discrete values in the requested experimental node; -1, i.e. (size_t)-1, if the node is hidden or continuous. |
| virtual void | Remove (size_t iY, size_t iX)=0 | Remove all data for the given dataset position. |
| virtual void | FilterGenes (const CGenes &Genes, CDat::EFilter eFilter)=0 | Remove values from the dataset based on the given gene set and filter type. |
| virtual void | Save (std::ostream &ostm, bool fBinary) const =0 | Save a dataset to the given stream in binary or tabular (human readable) form. |
An IDataset is intended to manage a collection of individual datasets, usually CDats. This is often used to integrate many datasets in a model such as a Bayes net or SVM, and as such, IDatasets can be used to learn or evaluate these models. Although most datasets will be backed by discretized CDats with no hidden data (e.g. CDatasetCompact), the IDataset interface also allows for continuous values (see GetContinuous) and for hidden experimental nodes that do not correspond to a data file (see IsHidden).
The IDataset interface merges the gene lists from all contained data files into a single gene list, which it exposes through GetGenes/GetGene/GetGeneNames/etc. Gene indices are similarly normalized; requesting gene pair i,j will "mean" the same thing in each encapsulated dataset. Missing values will be filled in as necessary for data files not containing information for the requested pair. QUANT files associated with non-continuous data files will be loaded automatically.
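As an illustration of the normalized gene indices, the following is a minimal sketch of walking every gene pair and every non-hidden experimental node. IDatasetLike, ToyDataset and CountValue are hypothetical names introduced here so that the example compiles without Sleipnir; only the member signatures are taken from this page. Visiting each pair once with j > i assumes that pairs are symmetric, which this page does not state.

```cpp
#include <cassert>
#include <cstddef>
using std::size_t;

// Hypothetical stand-in mirroring a few IDataset signatures from this page.
// It is not the Sleipnir class; it only lets the sketch compile on its own.
struct IDatasetLike {
    virtual ~IDatasetLike() {}
    virtual size_t GetGenes() const = 0;
    virtual size_t GetExperiments() const = 0;
    virtual bool IsHidden(size_t iNode) const = 0;
    virtual bool IsExample(size_t iY, size_t iX) const = 0;
    virtual size_t GetDiscrete(size_t iY, size_t iX, size_t iNode) const = 0;
};

// Toy data (an assumption for illustration): node 0 is hidden, every pair of
// distinct genes has data, and the discretized value is (iY + iX + iNode) % 2.
struct ToyDataset : IDatasetLike {
    size_t m_iGenes, m_iNodes;
    ToyDataset(size_t iGenes, size_t iNodes) : m_iGenes(iGenes), m_iNodes(iNodes) {}
    size_t GetGenes() const { return m_iGenes; }
    size_t GetExperiments() const { return m_iNodes; }
    bool IsHidden(size_t iNode) const { return iNode == 0; }
    bool IsExample(size_t iY, size_t iX) const { return iY != iX; }
    size_t GetDiscrete(size_t iY, size_t iX, size_t iNode) const {
        assert(iY < m_iGenes && iX < m_iGenes && iNode < m_iNodes);
        return (iY + iX + iNode) % 2;
    }
};

// Counts how many (gene pair, node) observations carry the value iTarget,
// skipping positions without data and nodes that are hidden.
size_t CountValue(const IDatasetLike& Data, size_t iTarget) {
    size_t iCount = 0;
    for (size_t i = 0; i < Data.GetGenes(); ++i)
        for (size_t j = i + 1; j < Data.GetGenes(); ++j) {
            if (!Data.IsExample(i, j))
                continue;
            for (size_t n = 0; n < Data.GetExperiments(); ++n)
                if (!Data.IsHidden(n) && Data.GetDiscrete(i, j, n) == iTarget)
                    ++iCount;
        }
    return iCount;
}
```

Because the function only depends on the abstract interface, the same loop would in principle work with any of the implementing classes listed on this page, although that has not been verified against the library here.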
Definition at line 64 of file dataset.h.
virtual bool Sleipnir::IDataset::IsHidden (size_t iNode) const [pure virtual]
Returns true if the requested experimental node is hidden (does not correspond to a data file).
| iNode | Experimental node to investigate. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie().
virtual size_t Sleipnir::IDataset::GetDiscrete (size_t iY, size_t iX, size_t iNode) const [pure virtual]
Return the discretized value at the requested position.
| iY | Data row. |
| iX | Data column. |
| iNode | Experimental node from which to retrieve the requested pair's value. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie(), and Sleipnir::CBayesNetFN::Learn().
virtual float Sleipnir::IDataset::GetContinuous (size_t iY, size_t iX, size_t iNode) const [pure virtual]
Return the continuous value at the requested position.
| iY | Data row. |
| iX | Data column. |
| iNode | Experimental node from which to retrieve the requested pair's value. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
virtual const std::string& Sleipnir::IDataset::GetGene (size_t iGene) const [pure virtual]
Returns the gene name at the requested index.
| iGene | Index of gene name to return. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
virtual size_t Sleipnir::IDataset::GetGenes () const [pure virtual]
Returns the number of genes in the dataset.
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie(), and Sleipnir::CBayesNetFN::Learn().
virtual bool Sleipnir::IDataset::IsExample (size_t iY, size_t iX) const [pure virtual]
Returns true if some data file can be accessed at the requested position.
| iY | Data row. |
| iX | Data column. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDatasetCompactMap, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie(), and Sleipnir::CBayesNetFN::Learn().
virtual const std::vector<std::string>& Sleipnir::IDataset::GetGeneNames () const [pure virtual]
Return a vector of all gene names in the dataset.
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
virtual size_t Sleipnir::IDataset::GetExperiments () const [pure virtual]
Return the number of experimental nodes in the dataset.
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie(), and Sleipnir::CBayesNetSmile::Open().
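virtual size_t Sleipnir::IDataset::GetGene (const std::string &strGene) const [pure virtual]
Return the index of the given gene name, or -1 if it is not included in the dataset. Because the return type is size_t, which is unsigned, -1 arrives as (size_t)-1, the largest size_t value; compare against that rather than testing for a negative number.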
| strGene | Gene name to retrieve. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
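The not-found case is easy to get wrong because the index is unsigned. The following is a minimal sketch of a presence check; IGeneIndexLike, ToyGenes and HasGene are hypothetical names introduced only for this example, and the gene names are placeholders.

```cpp
#include <cstddef>
#include <string>
#include <vector>
using std::size_t;

// Hypothetical stand-in carrying only the by-name GetGene signature from this
// page; it is not the Sleipnir class.
struct IGeneIndexLike {
    virtual ~IGeneIndexLike() {}
    virtual size_t GetGene(const std::string& strGene) const = 0;
};

// Toy implementation: linear search over a vector of names, returning
// (size_t)-1 when the name is absent, as described for IDataset::GetGene.
struct ToyGenes : IGeneIndexLike {
    std::vector<std::string> m_vecstrGenes;
    size_t GetGene(const std::string& strGene) const {
        for (size_t i = 0; i < m_vecstrGenes.size(); ++i)
            if (m_vecstrGenes[i] == strGene)
                return i;
        return static_cast<size_t>(-1);
    }
};

// A test such as "index < 0" can never be true for an unsigned type,
// so compare against the converted sentinel explicitly.
bool HasGene(const IGeneIndexLike& Data, const std::string& strGene) {
    return Data.GetGene(strGene) != static_cast<size_t>(-1);
}
```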
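virtual size_t Sleipnir::IDataset::GetBins (size_t iNode) const [pure virtual]
Return the number of discrete values in the requested experimental node; -1 if the node is hidden or continuous. Because the return type is size_t, -1 arrives as (size_t)-1, so the result should be checked before it is used as a size or loop bound.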
| iNode | Experimental node for which bin number should be returned. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
Referenced by Sleipnir::CTrie< tType >::CTrie(), and Sleipnir::CBayesNetSmile::Open().
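Since GetBins reports both hidden and continuous nodes with the same sentinel, a caller may want to combine it with IsHidden to tell the three cases apart. The following is a minimal sketch under that reading of the descriptions above; INodeLike, ToyNodes, ENodeKind and Classify are hypothetical names that do not exist in Sleipnir.

```cpp
#include <cstddef>
using std::size_t;

// Hypothetical stand-in with the two node queries documented on this page.
struct INodeLike {
    virtual ~INodeLike() {}
    virtual bool IsHidden(size_t iNode) const = 0;
    virtual size_t GetBins(size_t iNode) const = 0;
};

// Toy layout (an assumption): node 0 is hidden, node 1 is discrete with
// three bins, and any other node is continuous.
struct ToyNodes : INodeLike {
    bool IsHidden(size_t iNode) const { return iNode == 0; }
    size_t GetBins(size_t iNode) const {
        return (iNode == 1) ? 3 : static_cast<size_t>(-1);
    }
};

enum ENodeKind { ENodeHidden, ENodeDiscrete, ENodeContinuous };

// Checks IsHidden first, because the GetBins sentinel alone cannot
// distinguish a hidden node from a continuous one.
ENodeKind Classify(const INodeLike& Data, size_t iNode) {
    if (Data.IsHidden(iNode))
        return ENodeHidden;
    if (Data.GetBins(iNode) == static_cast<size_t>(-1))
        return ENodeContinuous;
    return ENodeDiscrete;
}
```

A caller could then route discrete nodes to GetDiscrete and continuous nodes to GetContinuous, and avoid sizing a buffer from the sentinel value.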
virtual void Sleipnir::IDataset::Remove (size_t iY, size_t iX) [pure virtual]
Remove all data for the given dataset position.
| iY | Data row. |
| iX | Data column. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDatasetCompactMap, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
virtual void Sleipnir::IDataset::FilterGenes (const CGenes &Genes, CDat::EFilter eFilter) [pure virtual]
Remove values from the dataset based on the given gene set and filter type.
| Genes | Gene set used to filter the dataset. |
| eFilter | Way in which to use the given genes to remove values. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, Sleipnir::CDataFilter, and Sleipnir::CDataSubset.
virtual void Sleipnir::IDataset::Save (std::ostream &ostm, bool fBinary) const [pure virtual]
Save a dataset to the given stream in binary or tabular (human readable) form.
| ostm | Stream into which dataset is saved. |
| fBinary | If true, save the dataset as a binary file; if false, save it as a text-based tab-delimited file. |
Implemented in Sleipnir::CDataset, Sleipnir::CDatasetCompact, Sleipnir::CDataMask, and Sleipnir::CDataFilter.