Note: if you use MAGeCKFlute in published research, please cite:

How to get help for MAGeCKFlute

Any and all MAGeCKFlute questions should be posted to the Bioconductor support site, which serves as a searchable knowledge base of questions and answers:

https://support.bioconductor.org

Posting a question and tagging with “MAGeCKFlute” will automatically send an alert to the package authors to respond on the support site. See the first question in the list of Frequently Asked Questions (FAQ) for information about how to construct an informative post.

You should not email your question to the package authors, as we will just reply that the question should be posted to the Bioconductor support site.

Input data

As input, the MAGeCKFlute package expects gene summary file as obtained by running commands mageck test or mageck mle in MAGeCK (Wei Li and Liu. 2014) and MAGeCK-VISPR (Wei Li and Liu. 2015), which are developed by our lab previously, to analyze CRISPR/Cas9 screen data in different scenarios(Tim Wang 2014, Hiroko Koike-Yusa (2014), Ophir Shalem1 (2014), Luke A.Gilbert (2014), Silvana Konermann (2015)). Both algorithms use negative binomial models to model the variances of sgRNAs, and use Robust Rank Aggregation (for MAGeCK) or maximum likelihood framework (for MAGeCK-VISPR) for a robust identification of selected genes.

MAGeCK-MLE can be used to analyze CRISPR screen data from multi-conditioned experiments. MAGeCK-MLE also normalizes the data across multiple samples, making them comparable to each other. The most important ouput of MAGeCK MLE is “gene_summary” file, which includes the beta scores of multiple conditions and the associated statistics. The ‘beta score’ for each gene describes how the gene is selected: a positive beta score indicates a positive selection, and a negative beta score indicates a negative selection.

MAGeCK RRA allows for the comparison between two experimental conditions. It can identify genes and sgRNAs are significantly selected between the two conditions. The most important output of MAGeCK RRA is the file “gene_summary.txt”. MAGeCK RRA will output both the negative score and positive score for each gene. A smaller score indicates higher gene importance. MAGeCK RRA will also output the statistical value for the scores of each gene. Genes that are significantly positively and negatively selected can be identified based on the p-value or FDR.

Quick start

Here we show the most basic steps for integrative analysis pipeline using gene summary file. Before using MAGeCKFlute, analysing CRISPR/Cas9 screen data using MAGeCK RRA (in MAGeCK (Wei Li and Liu. 2014)) or MAGeCK MLE (in MAGeCK-VISPR (Wei Li and Liu. 2015)) is neccessary, which result in the generation of the gene summary file.

To run MAGeCKFlute pipeline, we need gene summary file generated by running MAGeCK RRA or MAGeCK MLE. MAGeCKFlute package provides two example data, one is MLE_Data and the other is RRA_Data. We will work with them in this document.

Downstream analysis pipeline for MAGeCK MLE

All pipeline results are written into local directory “./BRAF_Flute_Results/”, and all figures are integrated into file “BRAF_Flute.mle_summary.pdf”.

Downstream analysis pipeline for MAGeCK RRA

All pipeline results are written into local directory “./BRAF_Flute_Results/” too, and all figures are integrated into file “BRAF_Flute.rra_summary.pdf”.

Section I: Run pipeline from gene summary file generated by MAGeCK MLE

Gene summary file

As the input of MAGeCKFlute package, the gene summary file in MAGeCK-MLE results includes beta scores of all genes in multiple condition samples. Use function ‘data’ to load the dataset, and have a look at the file with a text editor to see how it is formatted.

##      Gene D7_R1.beta D7_R2.beta PLX7_R1.beta PLX7_R2.beta
## 1   SMPD2  -0.458770  -0.514690     -0.67770    -0.510360
## 2   MIXL1   0.094571   0.122780     -0.14383    -0.119670
## 3   ENPEP  -0.082621  -0.082305     -0.11271    -0.061796
## 4    VAV2  -0.024270  -0.034077      0.14537    -0.015095
## 5 C5orf24   0.452310   0.226940      0.15809     0.191940
## 6   TLCD1  -0.883320  -0.779930     -0.80811    -0.854900

Read beta scores

Then, extract beta scores of control and treatment samples from the gene summary table(can be a file path of ‘gene_summary’ or data frame), and have a look at the beta score matrix.

##           Gene EntrezID     D7_R1     D7_R2  PLX7_R1   PLX7_R2
## 6610     SMPD2     6610 -0.458770 -0.514690 -0.67770 -0.510360
## 83881    MIXL1    83881  0.094571  0.122780 -0.14383 -0.119670
## 2028     ENPEP     2028 -0.082621 -0.082305 -0.11271 -0.061796
## 7410      VAV2     7410 -0.024270 -0.034077  0.14537 -0.015095
## 134553 C5orf24   134553  0.452310  0.226940  0.15809  0.191940
## 116238   TLCD1   116238 -0.883320 -0.779930 -0.80811 -0.854900

Batch effect removal

Is there batch effect? This is a common asked question before perform later analysis. In our package, we provide HeatmapView to ensure whether batch effect exists in data, and use BatchRemove to remove easily if same batch samples cluster together.

## Standardizing Data across genes

Normalization of beta scores

It is difficult to control all samples with a consistent cell cycle in an CRISPR screen experiment with multi conditions. Besides, beta score among different conditions with an inconsistent cell cycle are incomparable. So it is necessary to do the normalization when comparing the beta scores in different conditions. Essential genesare thosegenes that are indispensable for its survival. The effect generated by knocking out these genes in different cell types is consistent. Based on this, we developed cell cycle normalization method to shorten the gap of cell cycle in different conditions. In addition, a previous normalization method called loess normalization is available in this package.(Laurent Gautier 2004)

##           Gene EntrezID       D7_R1       D7_R2    PLX7_R1     PLX7_R2
## 6610     SMPD2     6610 -0.50382174 -0.58514097 -0.8888801 -0.66908767
## 83881    MIXL1    83881  0.10385798  0.13958618 -0.1886493 -0.15688871
## 2028     ENPEP     2028 -0.09073448 -0.09357094 -0.1478319 -0.08101525
## 7410      VAV2     7410 -0.02665334 -0.03874147  0.1906692 -0.01978971
## 134553 C5orf24   134553  0.49672736  0.25800364  0.2073529  0.25163549
## 116238   TLCD1   116238 -0.97006304 -0.88668713 -1.0599276 -1.12078346
##           Gene EntrezID       D7_R1       D7_R2    PLX7_R1     PLX7_R2
## 6610     SMPD2     6610 -0.44038017 -0.50825938 -0.6938220 -0.51905843
## 83881    MIXL1    83881  0.09383891  0.13556615 -0.1515824 -0.12397162
## 2028     ENPEP     2028 -0.08218047 -0.07375746 -0.1186463 -0.06484778
## 7410      VAV2     7410 -0.02259079 -0.02142822  0.1343865 -0.01843950
## 134553 C5orf24   134553  0.45749684  0.25157985  0.1396790  0.18052426
## 116238   TLCD1   116238 -0.84112255 -0.77043580 -0.8356798 -0.87902190

Estimate cell cycle time by linear fitting

After normalization, the cell cycle time in different condition should be almost consistent. Here we use linear fitting to estimate the cell cycle time, and use function CellCycleView to view the cell cycle time of all samples.

Positive selection and negative selection

The function ScatterView can group all genes into three groups, positive selection genes (GroupA), negative selection genes (GroupB), and others, and visualize these three grouped genes in scatter plot. We can also use function RankView to rank the beta score deviation between control and treatment and mark top selected genes in figure.

Functional analysis of selected genes

For gene set enrichment analysis, we provide three methods in this package, including “ORT”(Over-Representing Test (Guangchuang Yu and He. 2012)), “GSEA”(Gene Set Enrichment Analysis (Aravind Subramanian and Mesirov. 2005)), and “HGT”(HyperGemetric test), which can be performed on annotations of Gene ontology(GO) terms (Consortium. 2014), Kyoto encyclopedia of genes and genomes (KEGG) pathways (Minoru Kanehisa 2014), MsigDB gene sets, or custom gene sets. The enrichment analysis can be done easily using function enrichment_analysis, which return a list containing gridPlot (ggplot object) and enrichRes (enrichResult instance). Alternatively, you can do enrichment analysis using function enrich.ORT for “ORT”, enrich.GSE for GSEA, and enrich.HGT for “HGT”, which return an enrichResult instance. Function EnrichedView and EnrichedGSEView (for enrich.GSE) can be used to generate gridPlot from enrichReseasily, as shown below.

##                                                                      ID
## 405                                           GO_HISTONE_H3_ACETYLATION
## 110                                               CRGAARNNNNCGA_UNKNOWN
## 168                                              GO_PROTEIN_NEDDYLATION
## 197 GO_NEGATIVE_REGULATION_OF_SMALL_GTPASE_MEDIATED_SIGNAL_TRANSDUCTION
## 48                                              PID_ERB_GENOMIC_PATHWAY
## 229                      GO_NEGATIVE_REGULATION_OF_CELL_MATRIX_ADHESION
##                                                                                                         Description
## 405                                  http://www.broadinstitute.org/gsea/msigdb/cards/GO_CARDIAC_CHAMBER_DEVELOPMENT
## 110                                        http://www.broadinstitute.org/gsea/msigdb/cards/BIOCARTA_THELPER_PATHWAY
## 168 http://www.broadinstitute.org/gsea/msigdb/cards/GO_DNA_TEMPLATED_TRANSCRIPTIONAL_PREINITIATION_COMPLEX_ASSEMBLY
## 197                             http://www.broadinstitute.org/gsea/msigdb/cards/GO_MEMBRANE_LIPID_CATABOLIC_PROCESS
## 48                        http://www.broadinstitute.org/gsea/msigdb/cards/GSE29949_CD8_POS_DC_SPLEEN_VS_DC_BRAIN_UP
## 229                                          http://www.broadinstitute.org/gsea/msigdb/cards/GO_PROTEASOME_ASSEMBLY
##           NES       pvalue     p.adjust GeneRatio  BgRatio
## 405 21.195116 7.744210e-09 2.001878e-06      6/46 46/17304
## 110 14.474263 2.172780e-07 3.744424e-05      5/45 45/17304
## 168 12.853735 4.468965e-09 2.001878e-06      4/12 12/17304
## 197  9.026295 6.664960e-05 4.307230e-03      3/38 38/17304
## 48   9.003448 1.364866e-06 1.411272e-04      3/15 15/17304
## 229  8.204543 5.523581e-04 9.336792e-03      2/28 28/17304
##                                   geneID                          geneName
## 405 112869/27097/10629/10474/117143/1810 SGF29/TAF5L/TAF6L/TADA3/TADA1/DR1
## 110             7178/1387/9040/9968/9049       TPT1/CREBBP/UBE2M/MED12/AIP
## 168                  9039/8883/9616/9040              UBA3/NAE1/RNF7/UBE2M
## 197                      10253/8452/4763                    SPRY2/CUL3/NF1
## 48                        9039/6598/9040                UBA3/SMARCB1/UBE2M
## 229                            4771/4763                           NF2/NF1
##     Count
## 405     6
## 110     5
## 168     4
## 197     3
## 48      3
## 229     2

##                ID                               Description setSize
## hsa00190 hsa00190                 Oxidative phosphorylation      45
## hsa05010 hsa05010                         Alzheimer disease      45
## hsa05012 hsa05012                         Parkinson disease      42
## hsa04932 hsa04932 Non-alcoholic fatty liver disease (NAFLD)      39
## hsa05169 hsa05169              Epstein-Barr virus infection      29
## hsa03010 hsa03010                                  Ribosome      39
##          enrichmentScore      NES      pvalue   p.adjust    qvalues rank
## hsa00190       0.4989677 2.124260 0.001022495 0.06038136 0.05731490  654
## hsa05010       0.5049680 2.149805 0.001022495 0.06038136 0.05731490  654
## hsa05012       0.5155812 2.169594 0.001026694 0.06038136 0.05731490  654
## hsa04932       0.4908035 2.045632 0.001028807 0.06038136 0.05731490  654
## hsa05169       0.5255877 2.051231 0.001059322 0.06038136 0.05731490  670
## hsa03010       0.4438049 1.849745 0.002057613 0.08526552 0.08093532  981
##                            leading_edge
## hsa00190 tags=56%, list=23%, signal=43%
## hsa05010 tags=56%, list=23%, signal=43%
## hsa05012 tags=57%, list=23%, signal=45%
## hsa04932 tags=54%, list=23%, signal=42%
## hsa05169 tags=59%, list=24%, signal=45%
## hsa03010 tags=64%, list=35%, signal=42%
##                                                                                                                                       core_enrichment
## hsa00190                10063/1327/9377/7381/6392/4715/506/4708/4702/1355/4701/10632/1329/498/7386/29796/513/7385/6389/1337/1340/4716/4719/4709/55967
## hsa05010                  3028/1327/8883/9377/7381/6392/4715/506/4708/4702/4701/1329/498/7386/29796/513/7385/6389/1337/1340/2906/4716/4719/4709/55967
## hsa05012                        1327/9377/7381/6392/4715/506/4708/4702/7332/4701/1329/498/7386/29796/513/7385/6389/1337/1340/4716/4719/4709/135/55967
## hsa04932                                   1050/1327/9377/7381/6392/4715/4708/4702/4701/1329/7186/7386/29796/7385/6389/1337/1340/4716/4719/4709/55967
## hsa05169                                                      1460/1387/7979/5608/5434/3312/3305/171568/5704/51728/7186/3106/6693/10621/890/9612/3133
## hsa03010 63875/6183/64979/6182/6230/6208/51021/6191/28998/51073/63931/51116/54948/124995/55168/29088/55052/6181/29093/6150/64968/6146/51318/6169/6193
##                                                                                                                                                                               geneName
## hsa00190   COX17/COX4I1/COX5A/UQCRB/SDHD/NDUFB9/ATP5F1B/NDUFB2/NDUFA8/COX15/NDUFA7/ATP5MG/COX5B/ATP5F1A/UQCRFS1/UQCR10/ATP5F1D/UQCRC2/SDHA/COX6A1/COX6B1/NDUFB10/NDUFS1/NDUFB3/NDUFA12
## hsa05010 HSD17B10/COX4I1/NAE1/COX5A/UQCRB/SDHD/NDUFB9/ATP5F1B/NDUFB2/NDUFA8/NDUFA7/COX5B/ATP5F1A/UQCRFS1/UQCR10/ATP5F1D/UQCRC2/SDHA/COX6A1/COX6B1/GRIN2D/NDUFB10/NDUFS1/NDUFB3/NDUFA12
## hsa05012       COX4I1/COX5A/UQCRB/SDHD/NDUFB9/ATP5F1B/NDUFB2/NDUFA8/UBE2L3/NDUFA7/COX5B/ATP5F1A/UQCRFS1/UQCR10/ATP5F1D/UQCRC2/SDHA/COX6A1/COX6B1/NDUFB10/NDUFS1/NDUFB3/ADORA2A/NDUFA12
## hsa04932                                  CEBPA/COX4I1/COX5A/UQCRB/SDHD/NDUFB9/NDUFB2/NDUFA8/NDUFA7/COX5B/TRAF2/UQCRFS1/UQCR10/UQCRC2/SDHA/COX6A1/COX6B1/NDUFB10/NDUFS1/NDUFB3/NDUFA12
## hsa05169                                                                    CSNK2B/CREBBP/SEM1/MAP2K6/POLR2E/HSPA8/HSPA1L/POLR3H/PSMC4/POLR3K/TRAF2/HLA-B/SPN/POLR3F/CCNA2/NCOR2/HLA-E
## hsa03010          MRPL17/MRPS12/MRPL36/MRPL12/RPS25/RPS14/MRPS16/RPS4X/MRPL13/MRPL4/MRPS14/MRPS2/MRPL16/MRPL10/MRPS18A/MRPL15/MRPL20/RPLP2/MRPL22/MRPL23/MRPS6/RPL22/MRPL35/RPL38/RPS5

For enriched pathways, we can use function KeggPathwayView to visualize the beta score level in control and treatment on pathway map.(Weijun Luo 2013)

## [1] TRUE TRUE TRUE

Identify treatment-associated genes using 9-square model

Considering the difference of beta scores in control and treatment sample, we developed a 9-square model, which group all genes into several subgroups. Among these subgroups, four subgroup genes are treatment-associated, which correspond to specific functions. Group1 and Group3 genes are not selected in control sample, while they are significantly selected in treatment sample, so they may related to drug resistance. Group2 and Group4 genes are selected in control, but they are not selected in treatment, so maybe these genes are associated with drug targets.

In this package, use function SquareView can select these four treatment-associated subgroup genes and view them in 9-Square scatter plot.

Functional analysis for treatment-associated genes

Same as section above. We can do enrichment analysis for treatment-associated genes.

##                ID                                         Description
## hsa00030 hsa00030                           Pentose phosphate pathway
## hsa00130 hsa00130 Ubiquinone and other terpenoid-quinone biosynthesis
## hsa00480 hsa00480                              Glutathione metabolism
## hsa00970 hsa00970                         Aminoacyl-tRNA biosynthesis
## hsa00270 hsa00270                  Cysteine and methionine metabolism
## hsa00280 hsa00280          Valine, leucine and isoleucine degradation
##                 NES     pvalue  p.adjust GeneRatio BgRatio
## hsa00030 0.24490684 0.01225853 0.5148581     5/339 28/6677
## hsa00130 0.14887484 0.01191997 0.5148581     3/339 10/6677
## hsa00480 0.03605360 0.47008429 0.9268424     3/339 50/6677
## hsa00970 0.02548853 0.17253001 0.9268424     4/339 43/6677
## hsa00270 0.02544361 0.89436894 0.9268424     1/339 43/6677
## hsa00280 0.02162495 0.91436495 0.9268424     1/339 47/6677
##                            geneID                geneName Count
## hsa00030 226/7086/5226/5213/51071 ALDOA/TKT/PGD/PFKM/DERA     5
## hsa00130        84274/27235/51805          COQ5/COQ2/COQ3     3
## hsa00480           6723/5226/9027            SRM/PGD/NAT8     3
## hsa00970  10352/57176/92935/57505 WARS2/VARS2/MARS2/AARS2     4
## hsa00270                     6723                     SRM     1
## hsa00280                     3028                HSD17B10     1

Also, pathway visualization can be done using function KeggPathwayView, the same as section above.

Section II: Run pipeline from gene summary file generated by MAGeCK RRA

Gene summary file

For experiments with two experimental conditions, we recommend to use MAGeCK-RRA to identify essential genes from CRISPR/Cas9 knockout screens and tests the statistical significance of each observed change between two conditions. Gene summary file in MAGeCK-RRA results summarize the statistical significance of positive selection and negative selection. Use function ‘data’ to load the dataset, and have a look at the file with a text editor to see how it is formatted.

##      id  neg.fdr  pos.fdr
## 1  CMIP 0.001650 0.977480
## 2  MCL1 0.001650 1.000000
## 3 ITGB2 0.001650 0.999967
## 4 SCN2A 0.003713 0.886655
## 5  GRB2 0.048680 0.984303
## 6  EGFR 0.048680 0.993803

Read FDR of negative selection and positive selection

Then, extract “neg.fdr” and “pos.fdr” from the gene summary table.

##   Official  neg.fdr  pos.fdr ENTREZID
## 1     CMIP 0.001650 0.977480    80790
## 2     MCL1 0.001650 1.000000     4170
## 3    ITGB2 0.001650 0.999967     3689
## 4    SCN2A 0.003713 0.886655     6326
## 5     GRB2 0.048680 0.984303     2885
## 6     EGFR 0.048680 0.993803     1956

Session info

## R version 3.5.1 Patched (2018-07-12 r74967)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
## 
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.8-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.8-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] MAGeCKFlute_1.2.0    gridExtra_2.3        pathview_1.22.0     
##  [4] org.Hs.eg.db_3.7.0   AnnotationDbi_1.44.0 IRanges_2.16.0      
##  [7] S4Vectors_0.20.0     Biobase_2.42.0       BiocGenerics_0.28.0 
## [10] ggplot2_3.1.0       
## 
## loaded via a namespace (and not attached):
##   [1] fgsea_1.8.0            colorspace_1.3-2       ggridges_0.5.1        
##   [4] rprojroot_1.3-2        qvalue_2.14.0          XVector_0.22.0        
##   [7] farver_1.0             urltools_1.7.1         ggrepel_0.8.0         
##  [10] bit64_0.9-7            xml2_1.2.0             codetools_0.2-15      
##  [13] splines_3.5.1          GOSemSim_2.8.0         knitr_1.20            
##  [16] jsonlite_1.5           annotate_1.60.0        GO.db_3.7.0           
##  [19] png_0.1-7              pheatmap_1.0.10        graph_1.60.0          
##  [22] ggforce_0.1.3          shiny_1.1.0            compiler_3.5.1        
##  [25] httr_1.3.1             rvcheck_0.1.1          backports_1.1.2       
##  [28] assertthat_0.2.0       Matrix_1.2-14          lazyeval_0.2.1        
##  [31] limma_3.38.0           later_0.7.5            tweenr_1.0.0          
##  [34] htmltools_0.3.6        prettyunits_1.0.2      tools_3.5.1           
##  [37] bindrcpp_0.2.2         igraph_1.2.2           gtable_0.2.0          
##  [40] glue_1.3.0             reshape2_1.4.3         DO.db_2.9             
##  [43] dplyr_0.7.7            fastmatch_1.1-0        Rcpp_0.12.19          
##  [46] enrichplot_1.2.0       Biostrings_2.50.0      nlme_3.1-137          
##  [49] ggraph_1.0.2           stringr_1.3.1          mime_0.6              
##  [52] miniUI_0.1.1.1         clusterProfiler_3.10.0 XML_3.98-1.16         
##  [55] DOSE_3.8.0             europepmc_0.3          zlibbioc_1.28.0       
##  [58] MASS_7.3-51            scales_1.0.0           hms_0.4.2             
##  [61] promises_1.0.1         KEGGgraph_1.42.0       RColorBrewer_1.1-2    
##  [64] yaml_2.2.0             memoise_1.1.0          UpSetR_1.3.3          
##  [67] biomaRt_2.38.0         triebeard_0.3.0        ggExtra_0.8           
##  [70] stringi_1.2.4          RSQLite_2.1.1          genefilter_1.64.0     
##  [73] BiocParallel_1.16.0    matrixStats_0.54.0     rlang_0.3.0.1         
##  [76] pkgconfig_2.0.2        bitops_1.0-6           evaluate_0.12         
##  [79] lattice_0.20-35        purrr_0.2.5            bindr_0.1.1           
##  [82] labeling_0.3           cowplot_0.9.3          bit_1.1-14            
##  [85] tidyselect_0.2.5       ggsci_2.9              plyr_1.8.4            
##  [88] magrittr_1.5           R6_2.3.0               DBI_1.0.0             
##  [91] mgcv_1.8-25            pillar_1.3.0           withr_2.1.2           
##  [94] units_0.6-1            survival_2.43-1        KEGGREST_1.22.0       
##  [97] RCurl_1.95-4.11        tibble_1.4.2           crayon_1.3.4          
## [100] rmarkdown_1.10         viridis_0.5.1          progress_1.2.0        
## [103] grid_3.5.1             sva_3.30.0             data.table_1.11.8     
## [106] blob_1.1.1             Rgraphviz_2.26.0       digest_0.6.18         
## [109] xtable_1.8-3           tidyr_0.8.2            httpuv_1.4.5          
## [112] gridGraphics_0.3-0     munsell_0.5.0          viridisLite_0.3.0     
## [115] ggplotify_0.0.3

References

Aravind Subramanian, Vamsi K. Moothaa, Pablo Tamayo, and Jill P. Mesirov. 2005. “Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles.” http://www.pnas.org/content/102/43/15545.full.

Consortium., The Gene Ontology. 2014. “Gene Ontology Consortium: going forward.” https://academic.oup.com/nar/article/43/D1/D1049/2439067.

Guangchuang Yu, Yanyan Han, Li-Gen Wang, and Qing-Yu He. 2012. “clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters.” http://online.liebertpub.com/doi/abs/10.1089/omi.2011.0118.

Hiroko Koike-Yusa, E-Pien Tan, Yilong Li. 2014. “Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library.” http://science.sciencemag.org/content/343/6166/80.long.

Laurent Gautier, Benjamin M. Bolstad, Leslie Cope. 2004. “affy—analysis of Affymetrix GeneChip data at the probe level.” https://academic.oup.com/bioinformatics/article/20/3/307/185980.

Luke A.Gilbert, BrittAdamson, Max A.Horlbeck. 2014. “Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation.” https://linkinghub.elsevier.com/retrieve/pii/S0092-8674(14)01178-7.

Minoru Kanehisa, Yoko Sato, Susumu Goto. 2014. “Data, information, knowledge and principle: back to metabolism in KEGG.” https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkt1076.

Ophir Shalem1, *, 2. 2014. “Genome-scale CRISPR-Cas9 knockout screening in human cells.” http://science.sciencemag.org/content/343/6166/84.long.

Silvana Konermann, Alexandro E. Trevino, Mark D. Brigham. 2015. “Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex.” https://www.nature.com/nature/journal/vnfv/ncurrent/full/nature14136.html.

Tim Wang, David M. Sabatini, Jenny J. Wei1. 2014. “Genetic Screens in Human Cells Using the CRISPR-Cas9 System.” http://science.sciencemag.org/content/343/6166/80.long.

Wei Li, Han Xu, Johannes Köster, and X. Shirley Liu. 2015. “Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0843-6.

Wei Li, Tengfei Xiao, Han Xu, and X Shirley Liu. 2014. “MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens.” https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0554-4.

Weijun Luo, Cory Brouwer. 2013. “Pathview: an R/Bioconductor package for pathway-based data integration and visualization.” https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btt285.