Sleipnir::CGenes Class Reference

Represents a simple set of unique genes.

#include <genome.h>

Public Member Functions

 CGenes (CGenome &Genome)
 Construct a new gene set containing genes drawn from the given underlying genome.
bool Open (std::istream &istm, bool fCreate=true)
 Construct a new gene set by loading genes from the given text stream, one per line.
bool Open (const std::vector< std::string > &vecstrGenes, bool fCreate=true)
 Construct a new gene set containing the given gene IDs.
void Filter (const CGenes &GenesExclude)
 Remove the given genes from the gene set.
size_t CountAnnotations (const IOntology *pOntology, size_t iTerm, bool fRecursive=true, const CGenes *pBackground=NULL) const
 Return the number of genes in the set annotated at or, optionally, below the given ontology term.
std::vector< std::string > GetGeneNames () const
 Return the primary identifiers of all genes in the set.
bool Open (const char *szFile, bool fCreate=true)
 Construct a new gene set by loading genes from the given text file, one per line.
size_t GetGenes () const
 Return the number of genes in the set.
bool IsGene (const std::string &strGene) const
 Return true if the given name is a primary identifier of a gene in the set.
CGenome & GetGenome () const
 Return the gene set's underlying genome.
const CGene & GetGene (size_t iGene) const
 Return the gene at the requested index.
size_t GetGene (const std::string &strGene) const
 Return the index of the gene with the given primary identifier, or -1 if none exists.

Detailed Description

Represents a simple set of unique genes.

Remarks:
Genes are represented by index only and not explicitly checked for uniqueness, so most of the naming issues of CGenome are avoided. Gene comparisons generally assume a constant gene pool drawn from the base CGenome and are thus performed by pointer comparisons for efficiency; in other words, two distinct CGene objects with the same primary ID are not treated as the same gene.
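The pointer-comparison caveat can be illustrated with a minimal, self-contained sketch. The Gene struct and the SameGene helper below are hypothetical stand-ins, not Sleipnir types; they only show why two separate objects carrying the same primary ID fail an identity-based comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a gene record; this is NOT Sleipnir's CGene.
struct Gene {
    std::string name;
};

// Identity comparison: true only when both references denote the same object.
inline bool SameGene(const Gene& a, const Gene& b) {
    return &a == &b;
}

// Two objects with the same primary ID are equal by name but not by identity.
inline void Example() {
    Gene first{"YFG1"};
    Gene second{"YFG1"};  // same ID, separate object
    assert(first.name == second.name);
    assert(SameGene(first, first));
    assert(!SameGene(first, second));
}
```

Under this assumption, gene sets that should interoperate need to draw their genes from one shared pool, which is the role the underlying CGenome plays here.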

Definition at line 355 of file genome.h.


Constructor & Destructor Documentation

Sleipnir::CGenes::CGenes ( CGenome &  Genome  ) 

Construct a new gene set containing genes drawn from the given underlying genome.

Parameters:
Genome Genome containing all genes which might become members of this gene set.

Definition at line 451 of file genome.cpp.


Member Function Documentation

bool Sleipnir::CGenes::Open ( std::istream &  istm,
bool  fCreate = true 
)

Construct a new gene set by loading genes from the given text stream, one per line.

Parameters:
istm Stream containing gene IDs to load, one per line.
fCreate If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored.
Returns:
True if gene set was constructed successfully.
Loads text of the form:
 GENE1
 GENE2
 GENE3
containing one primary gene identifier per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
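The lookup-or-create behavior described above can be sketched without the library. ToyGenome and LoadGeneSet are hypothetical names introduced for illustration, not Sleipnir's implementation; skipping blank lines and stripping a trailing carriage return are assumptions of this sketch, and it skips unknown IDs silently where the documented method emits a warning.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the underlying genome: primary ID -> index.
struct ToyGenome {
    std::map<std::string, std::size_t> index;
    std::vector<std::string> names;

    // Adds the ID if it is absent and returns its index either way.
    std::size_t Add(const std::string& id) {
        auto it = index.find(id);
        if (it != index.end()) return it->second;
        names.push_back(id);
        const std::size_t pos = names.size() - 1;
        index[id] = pos;
        return pos;
    }
};

// Reads one ID per line. Known IDs are reused; unknown IDs are created
// when fCreate is true and skipped when it is false.
inline std::vector<std::size_t> LoadGeneSet(std::istream& in, ToyGenome& genome,
                                            bool fCreate = true) {
    std::vector<std::size_t> members;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // assumption: tolerate CRLF
        if (line.empty()) continue;                                 // assumption: skip blank lines
        auto it = genome.index.find(line);
        if (it != genome.index.end())
            members.push_back(it->second);
        else if (fCreate)
            members.push_back(genome.Add(line));
        // else: unknown ID, skipped
    }
    return members;
}

inline void Example() {
    ToyGenome genome;
    genome.Add("GENE1");
    std::istringstream strict("GENE1\nGENE2\nGENE3\n");
    assert(LoadGeneSet(strict, genome, false).size() == 1);   // only GENE1 is known
    std::istringstream lenient("GENE1\nGENE2\nGENE3\n");
    assert(LoadGeneSet(lenient, genome, true).size() == 3);   // GENE2 and GENE3 are created
}
```

Note that with fCreate set to true the genome itself grows as a side effect of loading, which matters when several gene sets share one genome.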

See also:
CGenome::AddGene

Definition at line 481 of file genome.cpp.

References Sleipnir::CGene::GetName().

Referenced by Sleipnir::CPCL::Distance(), Sleipnir::CDatasetCompact::FilterGenes(), Sleipnir::CDat::FilterGenes(), and Open().

bool Sleipnir::CGenes::Open ( const std::vector< std::string > &  vecstrGenes,
bool  fCreate = true 
)

Construct a new gene set containing the given gene IDs.

Parameters:
vecstrGenes Primary identifiers of genes in the new gene set.
fCreate If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored.
Returns:
True if gene set was constructed successfully.
If the given gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.

See also:
CGenome::AddGene

Definition at line 570 of file genome.cpp.

void Sleipnir::CGenes::Filter ( const CGenes &  GenesExclude  ) 

Remove the given genes from the gene set.

Parameters:
GenesExclude Genes to be removed from the current gene set.
Remarks:
Comparisons are performed using pointers to CGene objects, so both gene sets should use the same underlying CGenome for proper behavior.
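A rough sketch of pointer-based filtering, assuming a gene set is stored as a list of pointers into a shared pool. Gene and FilterByPointer are hypothetical names, not Sleipnir's code; the sketch shows that a gene from a different pool is never matched, even when its name is identical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a gene record; this is NOT Sleipnir's CGene.
struct Gene {
    std::string name;
};

// Removes from `set` every gene whose address also appears in `exclude`.
inline void FilterByPointer(std::vector<const Gene*>& set,
                            const std::vector<const Gene*>& exclude) {
    const std::unordered_set<const Gene*> drop(exclude.begin(), exclude.end());
    set.erase(std::remove_if(set.begin(), set.end(),
                             [&drop](const Gene* g) { return drop.count(g) != 0; }),
              set.end());
}

inline void Example() {
    Gene a{"A"}, b{"B"};
    Gene foreignB{"B"};  // same name as b, but from another pool
    std::vector<const Gene*> set{&a, &b};

    std::vector<const Gene*> wrongPool{&foreignB};
    FilterByPointer(set, wrongPool);
    assert(set.size() == 2);  // nothing removed: addresses differ

    std::vector<const Gene*> samePool{&b};
    FilterByPointer(set, samePool);
    assert(set.size() == 1 && set[0] == &a);
}
```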

Definition at line 597 of file genome.cpp.

References GetGene(), and GetGenes().

size_t Sleipnir::CGenes::CountAnnotations ( const IOntology *  pOntology,
size_t  iTerm,
bool  fRecursive = true,
const CGenes *  pBackground = NULL 
) const

Return the number of genes in the set annotated at or, optionally, below the given ontology term.

Parameters:
pOntology Ontology in which annotations are counted.
iTerm Ontology term at or below which annotations are counted.
fRecursive If true, count annotations at or below the given term; otherwise, count only direct annotations to the term.
pBackground If non-null, count only annotations for genes also contained in the given background set.
Returns:
Number of genes in the gene set annotated at or below the given ontology term.
See also:
IOntology::IsAnnotated
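How the fRecursive and pBackground parameters interact can be sketched with a toy ontology. ToyOntology and CountAnnotated are hypothetical stand-ins, not the IOntology interface, and the sketch assumes the term graph is acyclic. Each gene is counted at most once, even if it is annotated to several descendant terms.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an ontology; this is NOT Sleipnir's IOntology.
struct ToyOntology {
    std::map<std::size_t, std::vector<std::size_t>> children;  // term -> child terms
    std::map<std::size_t, std::set<std::string>> direct;       // term -> directly annotated IDs

    // True if `gene` is annotated to `term`, or (when recursive) to any descendant.
    bool IsAnnotated(std::size_t term, const std::string& gene, bool recursive) const {
        auto d = direct.find(term);
        if (d != direct.end() && d->second.count(gene) != 0) return true;
        if (!recursive) return false;
        auto c = children.find(term);
        if (c == children.end()) return false;
        for (std::size_t child : c->second)
            if (IsAnnotated(child, gene, true)) return true;
        return false;
    }
};

// Counts genes annotated at (or, if recursive, below) `term`,
// optionally restricted to genes present in `background`.
inline std::size_t CountAnnotated(const std::vector<std::string>& genes,
                                  const ToyOntology& onto, std::size_t term,
                                  bool recursive = true,
                                  const std::set<std::string>* background = nullptr) {
    std::size_t count = 0;
    for (const std::string& g : genes) {
        if (background != nullptr && background->count(g) == 0) continue;
        if (onto.IsAnnotated(term, g, recursive)) ++count;
    }
    return count;
}

inline void Example() {
    ToyOntology onto;
    onto.children[0].push_back(1);  // term 1 is a child of term 0
    onto.direct[0].insert("A");
    onto.direct[1].insert("B");
    std::vector<std::string> genes{"A", "B", "C"};
    assert(CountAnnotated(genes, onto, 0, true) == 2);   // A directly, B via the child
    assert(CountAnnotated(genes, onto, 0, false) == 1);  // A only
}
```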

Definition at line 539 of file genome.cpp.

References Sleipnir::IOntology::IsAnnotated(), and IsGene().

vector< string > Sleipnir::CGenes::GetGeneNames (  )  const

Return the primary identifiers of all genes in the set.

Returns:
Vector of primary identifiers of all genes in the set.

Definition at line 617 of file genome.cpp.

Referenced by Sleipnir::CPCL::Distance().

bool Sleipnir::CGenes::Open ( const char *  szFile,
bool  fCreate = true 
) [inline]

Construct a new gene set by loading genes from the given text file, one per line.

Parameters:
szFile File containing gene IDs to load, one per line.
fCreate If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored.
Returns:
True if gene set was constructed successfully.
Loads a text file of the form:
 GENE1
 GENE2
 GENE3
containing one primary gene identifier per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.

See also:
CGenome::AddGene

Definition at line 392 of file genome.h.

References Open().

size_t Sleipnir::CGenes::GetGenes (  )  const [inline]

Return the number of genes in the set.

Returns:
Number of genes in the set.

Definition at line 405 of file genome.h.

Referenced by Sleipnir::CPCL::Distance(), Filter(), Sleipnir::CDat::FilterGenes(), Sleipnir::CSVM::Learn(), Sleipnir::CDatasetCompact::Open(), and Sleipnir::CDat::Open().

bool Sleipnir::CGenes::IsGene ( const std::string &  strGene  )  const [inline]

Return true if the given name is a primary identifier of a gene in the set.

Parameters:
strGene Primary gene identifier for which the set is searched.
Returns:
True if the set contains a gene with the given primary identifier.
See also:
GetGene

Definition at line 422 of file genome.h.

Referenced by CountAnnotations(), Sleipnir::CDatasetCompact::Open(), and Sleipnir::CDat::Open().

CGenome& Sleipnir::CGenes::GetGenome (  )  const [inline]

Return the gene set's underlying genome.

Returns:
Gene set's underlying genome.

Definition at line 433 of file genome.h.

Referenced by Sleipnir::CSVM::Learn().

const CGene& Sleipnir::CGenes::GetGene ( size_t  iGene  )  const [inline]

Return the gene at the requested index.

Parameters:
iGene Gene index to retrieve.
Returns:
Gene at the requested index.
Remarks:
For efficiency, no bounds checking is performed. The given index must be less than the value returned by GetGenes().
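Because the accessor is unchecked, callers who cannot guarantee the index may want a guard of their own. The sketch below uses a hypothetical ToyGenes class that only mirrors the GetGenes and GetGene names; CheckedGene is an illustrative helper, not part of Sleipnir.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in mirroring the two accessor names; NOT Sleipnir's CGenes.
class ToyGenes {
public:
    explicit ToyGenes(std::vector<std::string> names) : m_names(std::move(names)) {}
    std::size_t GetGenes() const { return m_names.size(); }
    // Unchecked, like the documented accessor: the caller must pass i < GetGenes().
    const std::string& GetGene(std::size_t i) const { return m_names[i]; }

private:
    std::vector<std::string> m_names;
};

// Caller-side guard: validate the index before using the unchecked accessor.
inline const std::string& CheckedGene(const ToyGenes& genes, std::size_t i) {
    if (i >= genes.GetGenes()) throw std::out_of_range("gene index out of range");
    return genes.GetGene(i);
}

inline void Example() {
    std::vector<std::string> names{"A", "B"};
    ToyGenes genes(names);
    // The usual safe pattern: iterate strictly below GetGenes().
    for (std::size_t i = 0; i < genes.GetGenes(); ++i)
        assert(!genes.GetGene(i).empty());
    assert(CheckedGene(genes, 1) == "B");
}
```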

Definition at line 450 of file genome.h.

Referenced by Sleipnir::CPCL::Distance(), Filter(), Sleipnir::CDat::FilterGenes(), Sleipnir::CDatasetCompact::Open(), and Sleipnir::CDat::Open().

size_t Sleipnir::CGenes::GetGene ( const std::string &  strGene  )  const [inline]

Return the index of the gene with the given primary identifier, or -1 if none exists.

Parameters:
strGene Primary gene identifier for which the set is searched.
Returns:
Index of the gene with the given primary identifier; -1 (converted to size_t) if none exists.
See also:
IsGene
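Since the return type is size_t, the -1 sentinel arrives as the largest representable size_t value, so a test such as "index < 0" can never succeed. A minimal sketch of checking the result, where kNotFound and FindGene are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// -1 converted to size_t is the largest representable size_t value.
const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Hypothetical lookup following the same "index or -1" convention.
inline std::size_t FindGene(const std::vector<std::string>& names, const std::string& id) {
    for (std::size_t i = 0; i < names.size(); ++i)
        if (names[i] == id) return i;
    return kNotFound;
}

inline void Example() {
    std::vector<std::string> names{"A", "B"};
    assert(FindGene(names, "B") == 1);
    assert(FindGene(names, "Z") == kNotFound);  // compare against the sentinel, not "< 0"
    assert(kNotFound == std::numeric_limits<std::size_t>::max());
}
```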

Definition at line 467 of file genome.h.


The documentation for this class was generated from the following files:
genome.h
genome.cpp

Generated on Fri Jun 19 12:48:37 2009 for Sleipnir by doxygen 1.5.5