pritchardlabatpsu/pairwisecomparisons

Pairwise Comparisons of Conditional Selection

The published website showing all the code and results can be accessed here

The data directory contains data downloaded prior to analyses.

  • The directories tcga_luad_expression, tcga_skcm_expression, and tcga_brca_expression contain mutation, total expression, and exon-level expression data for the TCGA lung adenocarcinoma (LUAD), melanoma (SKCM), and breast cancer (BRCA) cohorts. Fig 3A, B and Fig S3 use these data.
  • alkati_growthcurvedata_popdoublings.csv and alkati_growthcurvedata.csv contain the growth tracking data for Fig 4A.
  • alkati_baf3_ic50s_heatmap.csv contains the crizotinib and brigatinib IC50 data from Fig 4C and Fig S4D.
  • alkati_growthcurvedata_f1174mutants_raw.xlsx and alkati_growthcurvedata_popdoublings_f1174mutants.csv contain the growth tracking data from Fig S4C.
  • alkati_melanoma_vemurafenib_figure_data.csv contains the dose-response data from Fig 5A.

The Output directory contains data generated by the code.

  • all_data_skcm.csv and skcm_alk_exon_expression.csv are generated by tcga_skcm_data_parser.Rmd. They contain the BRAF and NRAS mutational status of 351 melanoma patients, as well as the ALK expression metrics used to decide which patients had ALKATI-like expression. These data are used in Fig 2.
  • luad_alk_exon_expression.csv and all_data_luad.csv are generated by TCGA_luad_data_parser.Rmd. They summarize ALK exon expression imbalance in lung cancer patients and are used in Fig S3A.
  • luad_egfr_exon_expression.csv is generated by TCGA_luad_data_parser.Rmd and contains exon expression imbalance values for lung cancer patients, used in Fig S3B.

The Code directory contains .R files defining the functions used by the Rmd files in the analysis directory.

  • contab_downsampler.R takes a contingency table and a gene-of-interest (GOI) frequency, and corrects the frequency of the positive control 1 gene in the contingency table so that it equals the GOI frequency. Refer to the STAR Methods section titled frequency correction in gene pairs and Algorithm 1 of the pseudo-code for details.
  • contab_simulator.R takes a contingency table of frequencies and simulates N cohorts of count data centered around the probabilities in the contingency table, where N is the number of contingency tables to generate. Refer to the STAR Methods section titled Pairwise Comparisons of gene pairs and Algorithm 2 of the pseudo-code for details.
  • mut_excl_genes_generator.R generates a single contingency table given a cohort size, the incidence of the gene of interest, and the odds ratios of the two gene pairs. Refer to the STAR Methods section titled generating simulated cohorts and Algorithm 3 of the pseudo-code for details.
  • alldata_compiler.R is a data-parsing function that generates count data of mutations in genes, given the gene name and the name of the mutation of interest. This function can be used directly with the .mut files.
  • contab_maker.R makes a 2x2 contingency table from count data.
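
As a rough illustration of the two simplest steps described above (building a 2x2 contingency table from count data, and simulating N cohorts centered on a table of frequencies), here is a minimal sketch. It is written in Python rather than R and is not the repository's implementation. The function names make_contab and simulate_contabs, the table layout, and the toy data are all hypothetical, and drawing each cohort as a multinomial sample is an assumption; the pseudo-code Algorithm 2 in the STAR Methods is the authoritative description.

```python
import random
from collections import Counter

# The four cells of a 2x2 table for two binary mutation calls (gene A, gene B).
CELLS = [(True, True), (True, False), (False, True), (False, False)]


def make_contab(calls):
    """Count (gene_a_mutated, gene_b_mutated) pairs into a 2x2 table.

    Returns [[both, a_only], [b_only, neither]] (hypothetical layout).
    """
    counts = Counter((bool(a), bool(b)) for a, b in calls)
    return [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]


def simulate_contabs(freqs, cohort_size, n_tables, seed=None):
    """Draw n_tables count tables from a 2x2 table of cell frequencies.

    Assumption: each cohort is one multinomial draw of cohort_size patients.
    """
    weights = [freqs[0][0], freqs[0][1], freqs[1][0], freqs[1][1]]
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("cell frequencies must sum to 1")
    rng = random.Random(seed)
    tables = []
    for _ in range(n_tables):
        # Assign every simulated patient to one cell with the given probabilities.
        draws = rng.choices(CELLS, weights=weights, k=cohort_size)
        tables.append(make_contab(draws))
    return tables


# Toy per-patient calls (made-up values, not taken from the repository data).
calls = [(1, 0), (0, 1), (0, 0), (1, 0), (0, 0), (0, 1), (0, 0), (1, 1)]
observed = make_contab(calls)
total = sum(sum(row) for row in observed)
freqs = [[cell / total for cell in row] for row in observed]
simulated = simulate_contabs(freqs, cohort_size=500, n_tables=3, seed=1)
```

Each simulated table sums to the cohort size, so the tables can be passed to whatever association test is applied to the observed table.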

The Analysis directory contains the R Markdown files that run the analyses using functions in the code directory.

  • pairwisecomparisons_simulateddata.Rmd contains all analyses in Fig 1.
  • tcga_skcm_data_parser.Rmd and alkati_subsampling_simulations_2.Rmd contain all analyses in Fig 2.
  • tcga_luad_data_parser_egfr.Rmd, TCGA_luad_data_parser.Rmd, ALK_ExonImbalance_SKCM_Analysis.Rmd, alk_luad_mutation_bias.Rmd, and ALKATI_Filter_Cutoff_Analysis.Rmd contain analyses used in Fig 2, Fig 3, and Fig S3.
  • baf3_alkati_transformations.Rmd contains analyses used in Fig 4A.
  • Alkati_ccle_depmap_sensitivity.Rmd contains analyses used in Fig 5D, E and Fig S5A, B.

The Docs directory contains the HTML output rendered from the Rmd files in the analysis directory.

A workflowr project.
