Description of the dataset
The dataset was published¹ by Kang et al. in PLoS Computational Biology.
In brief, total mRNA was prepared from Namalwa (Burkitt’s lymphoma), Hs343T (fibroblast line derived from a mammary gland adenocarcinoma), hTERT-HME1 (normal mammary epithelial cells immortalized with hTERT), and MCF7 (estrogen receptor-positive breast cancer cell line). The RNA samples were profiled by RNA-sequencing in duplicate.
Omics_type
= transcriptome
Cancer_type
= brca
Cohort_size
= 30
Patient_metadata
= No
Sample_type
= In silico mixture of cell lines
Preparation of the data
Expression data from arrays were collected, normalized together using fRMA, and log2-transformed.
Normalisation
= edgeR
Transformation
= log2(x + 1) (pseudo-log2)
Aggregation
= median
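The transformation and aggregation settings listed above can be illustrated with a minimal base R sketch. It assumes that the pseudo-log2 means log2(x + 1) and that aggregation means taking the median over rows sharing a gene symbol; the toy matrix, the gene names, and the variable names are hypothetical, and the edgeR normalisation step is not reproduced here.

```r
# Toy expression matrix (hypothetical values); the first two rows share a symbol.
counts <- matrix(
  c(10, 20,
    30, 40,
     0,  5),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("sample_1", "sample_2"))
)
gene_symbols <- c("GENE_A", "GENE_A", "GENE_B")

# Pseudo-log2 transformation: log2(x + 1) maps zero counts to zero.
log_expr <- log2(counts + 1)

# Median aggregation: for each sample column, collapse rows that share a symbol.
aggregated <- apply(log_expr, 2, function(column) {
  tapply(column, gene_symbols, median)
})

aggregated
```

The result has one row per unique gene symbol and one column per sample, so duplicated symbols no longer appear as separate rows.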
Composition of the test dataset
Transcriptome dataset
## [1] 5
## [1] 56646 30
colnames(test_data[[1]]) = paste0("sample_",1:dim(test_data[[1]])[2])
knitr::kable(head(test_data[[1]][,1:5], 10))
Gene | sample_1 | sample_2 | sample_3 | sample_4 | sample_5 |
---|---|---|---|---|---|
BHLHE40 /// DELEC1 | 0.1389209 | 0.2015841 | 0.2440109 | 0.6057359 | 0.8770143 |
MTARC1 /// MARCHF1 | 2.0820315 | 2.1339103 | 2.3622637 | 2.2660872 | 2.4993167 |
SEPTIN1 | 3.1283017 | 2.8055923 | 3.0293702 | 3.1058714 | 2.9434853 |
MARCHF10 | 0.8054583 | 0.9071065 | 0.9917149 | 0.9196117 | 1.1236826 |
SEPTIN10 | 5.6377001 | 5.8659071 | 5.7005703 | 5.6473292 | 5.4588523 |
MARCHF11 | 0.0508587 | 0.0259265 | 0.0528873 | 0.2903529 | 0.0513815 |
SEPTIN11 | 7.4596111 | 7.6482334 | 7.4603445 | 7.4340067 | 7.5398126 |
SEPTIN12 | 0.0402248 | 0.0461664 | 0.0397009 | 0.0406292 | 0.0315253 |
SEPTIN14 | 0.0073609 | 0.2188129 | 0.3195756 | 0.0155361 | 0.0110615 |
MTARC2 /// MARCHF2 | 3.1910142 | 3.3989906 | 3.2921603 | 3.2310591 | 3.3790429 |
Composition of the solution dataset (ground truth)
Source
= in silico simulations
Number of expected cell types
= 4
Five independent proportion matrices and the corresponding complex expression matrices were generated to score algorithm performance.
## [1] 5
## [1] 4 30
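To make the shapes above concrete, here is a minimal sketch of how an in silico mixture can be built from a proportion matrix. It assumes a linear mixing model (mixture = reference profiles × proportions) and uniformly sampled proportions rescaled to sum to 1 per sample; neither assumption is confirmed by this report, and `reference`, `simulate_proportions`, and the toy gene count are hypothetical.

```r
set.seed(1)  # make the toy example reproducible

n_genes <- 6        # toy size; the real matrices have 56646 rows
n_cell_types <- 4
n_samples <- 30

# Hypothetical reference profiles: one column per cell type.
reference <- matrix(
  rexp(n_genes * n_cell_types, rate = 0.1),
  nrow = n_genes,
  dimnames = list(paste0("gene_", 1:n_genes),
                  paste0("cell_type_", 1:n_cell_types))
)

# Draw random proportions and rescale each sample column to sum to 1.
simulate_proportions <- function(n_cell_types, n_samples) {
  raw <- matrix(runif(n_cell_types * n_samples), nrow = n_cell_types)
  sweep(raw, 2, colSums(raw), "/")
}

# Five independent proportion matrices and their mixed expression matrices.
proportion_list <- replicate(5, simulate_proportions(n_cell_types, n_samples),
                             simplify = FALSE)
mixture_list <- lapply(proportion_list, function(p) reference %*% p)

length(proportion_list)    # 5
dim(proportion_list[[1]])  # 4 30
dim(mixture_list[[1]])     # 6 30
```

Each proportion matrix has one row per cell type and one column per sample, which is the layout a deconvolution method would be expected to return for comparison with the ground truth.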