Dmelanogaster {BSgenome.Dmelanogaster.UCSC.dm3}    R Documentation
Drosophila melanogaster full genome as provided by UCSC (dm3, Apr. 2006) and stored in Biostrings objects.
This BSgenome data package was made from the following source data files:
sequences: chromFa.tar.gz, upstream1000.fa.gz, upstream2000.fa.gz, upstream5000.fa.gz from http://hgdownload.cse.ucsc.edu/goldenPath/dm3/bigZips/
masks: all the chr*_gap.txt.gz files from ftp://hgdownload.cse.ucsc.edu/goldenPath/dm3/database/ + chromOut.tar.gz and chromTrf.tar.gz from http://hgdownload.cse.ucsc.edu/goldenPath/dm3/bigZips/
See ?BSgenomeForge and the BSgenomeForge vignette (vignette("BSgenomeForge")) in the BSgenome software package for how to make a BSgenome data package.
H. Pages
BSgenome-class, DNAString-class, available.genomes, BSgenomeForge
Dmelanogaster
seqlengths(Dmelanogaster)
Dmelanogaster$chr2L  # same as Dmelanogaster[["chr2L"]]

if ("AGAPS" %in% masknames(Dmelanogaster)) {

  ## Check that the assembly gaps contain only Ns:
  checkOnlyNsInGaps <- function(seq)
  {
    ## Replace all masks by the inverted AGAPS mask
    masks(seq) <- gaps(masks(seq)["AGAPS"])
    af <- alphabetFrequency(seq)
    found_letters <- names(af)[af != 0]
    if (any(found_letters != "N"))
        stop("assembly gaps contain more than just Ns")
  }

  ## A message will be printed each time a sequence is removed
  ## from the cache:
  options(verbose=TRUE)

  for (seqname in seqnames(Dmelanogaster)) {
    cat("Checking sequence", seqname, "... ")
    seq <- Dmelanogaster[[seqname]]
    checkOnlyNsInGaps(seq)
    cat("OK\n")
  }
}

## See the GenomeSearching vignette in the BSgenome software
## package for some examples of genome-wide motif searching using
## Biostrings and the BSgenome data packages:
if (interactive())
    vignette("GenomeSearching", package="BSgenome")
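The assembly-gap check in the example above inverts the AGAPS mask so that only the gap regions stay visible, then verifies that "N" is the only letter left. The same idea can be sketched on a toy sequence in base R, without loading the genome. This is an illustration only: toy_seq, toy_gaps and gapsContainOnlyNs are hypothetical names that are not part of this package, and the gap ranges are assumed to be 1-based and inclusive.

```r
## Toy stand-ins (hypothetical): a short sequence and its assembly-gap ranges,
## given as 1-based inclusive start/end positions.
toy_seq  <- "ACGTNNNNNACGTTNNNAC"
toy_gaps <- data.frame(start = c(5, 15), end = c(9, 17))

## Return TRUE if every gap range covers only "N" characters.
gapsContainOnlyNs <- function(seq, gaps) {
    ## Extract the substring covered by each gap range
    gap_strings <- substring(seq, gaps$start, gaps$end)
    ## Collect the distinct letters found inside the gaps
    letters_in_gaps <- unique(unlist(strsplit(gap_strings, "")))
    all(letters_in_gaps == "N")
}

gapsContainOnlyNs(toy_seq, toy_gaps)
```

On the real genome, the masked-sequence approach shown in the example above is preferable, since it reuses the gap ranges already stored with each chromosome.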