cca_zoo.datasets.load_breast_data

class cca_zoo.datasets.load_breast_data

Loads the breast data from a remote .rda file.

This function fetches the ‘breastdata.rda’ dataset from a fixed URL (see Notes), parses the R data file, and returns the contents as a Bunch object whose attributes are listed under Returns.

The function checks for RData support and uses the rdata library for parsing. If the file does not exist locally, it is downloaded and stored in a temporary directory.
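The download-if-missing behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `ensure_local_copy` and its `fetch` parameter are hypothetical names introduced here, while the URL and the ‘tmpdir’ directory name are taken from the Notes on this page.

```python
import os
from urllib.request import urlretrieve

# URL as given in the Notes of this page.
DATA_URL = "https://tibshirani.su.domains/PMA/breastdata.rda"


def ensure_local_copy(url=DATA_URL, directory="tmpdir", fetch=urlretrieve):
    """Return a local path for the file at `url`, downloading only if absent.

    Hypothetical helper for illustration; it is not part of cca_zoo.
    `fetch` is any callable taking (url, destination_path).
    """
    # Create the cache directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    local_path = os.path.join(directory, os.path.basename(url))
    # Download only when no local copy is present.
    if not os.path.exists(local_path):
        fetch(url, local_path)
    return local_path
```

Calling `ensure_local_copy()` twice should hit the network at most once; the returned path would then be handed to the R data parser. Making `fetch` injectable is a design choice of this sketch so that the caching logic can be exercised without network access.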

Returns:

An object with the following attributes:
  • views (list): Contains two arrays representing ‘dna’ and ‘rna’.

  • view_names (list): Names of the views, i.e., ["dna", "rna"].

  • chrom (array): Information related to chromosomal locations.

  • nuc (array): Nucleotide sequences.

  • gene (array): Gene sequences.

  • genenames (array): Names of the genes.

  • genechr (array): Chromosomal locations for each gene.

  • genedesc (array): Descriptions of the genes.

  • genepos (array): Positional information for each gene.

  • DESCR (str): Description of the dataset (currently empty).

  • filename (str): Name of the R data file, i.e., ‘breastdata.rda’.

  • data_module: Reference to the data module (assumed to be a global constant).

Return type:

Bunch
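To illustrate how a Bunch-style return value is typically consumed, here is a small sketch. The `Bunch` class below is a minimal stand-in written for this example (a dict whose keys are also readable as attributes); it is an assumption that the real return type behaves this way. The values are toy placeholders, not the actual breast data.

```python
class Bunch(dict):
    """Minimal stand-in: a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Toy placeholder values; the real views are arrays loaded from the .rda file.
toy = Bunch(
    views=[[[0.1, 0.2], [0.3, 0.4]], [[1.0], [2.0]]],
    view_names=["dna", "rna"],
    DESCR="",
    filename="breastdata.rda",
)

# Pair each view with its name so it can be looked up as named_views["dna"].
named_views = dict(zip(toy.view_names, toy.views))
```

Since `views` and `view_names` are parallel lists, zipping them is a convenient way to address a view by name rather than by position.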

Notes

  • Ensure the rdata library is installed and functional.

  • The data is fetched from ‘https://tibshirani.su.domains/PMA/breastdata.rda’.

  • The temporary directory ‘tmpdir’ is created in the current working directory if it doesn’t exist.
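Before calling the loader, one way to confirm that the rdata library is importable is sketched below. `has_module` and `require_rdata` are hypothetical helper names, and raising `ImportError` is an assumption of this sketch; this page does not state which exception the loader itself raises.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


def require_rdata():
    """Raise ImportError with an install hint when rdata is unavailable.

    Hypothetical helper; the exception type is an assumption.
    """
    if not has_module("rdata"):
        raise ImportError(
            "The rdata library is required to parse .rda files; "
            "install it, e.g. with `pip install rdata`."
        )
```

Using `find_spec` checks availability without importing the package, so the check itself stays cheap.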

Example

from cca_zoo.datasets import load_breast_data

data = load_breast_data()
print(data.views)
print(data.view_names)
# … and so on for other attributes …