Description of the dataset
Six cell types, including five immune cell types (“T cells”, “NK cells”, “B cells”, “monocytic lineage”, “neutrophils”), were used to simulate mRNA mixtures. This benchmark dataset was used to evaluate the MCP-counter method and was published in Genome Biology.
Omics_type
= transcriptome
Cancer_type
= colorectal
Cohort_size
= 12
Patient_metadata
= No
Sample_type
= In silico mixture of cell lines
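An in silico mixture is commonly built as a weighted sum of pure cell-type expression profiles. The sketch below illustrates that general idea only; it is not the procedure used to generate this dataset. The objects `ref`, `prop`, and `mix`, the toy sizes, and the choice to mix on the linear scale before a `log2(x + 1)` transform are all assumptions made for illustration.

```r
set.seed(1)

n_genes   <- 100  # toy size (assumption); the real matrix has 23520 rows
n_types   <- 6    # matches the number of cell types in this dataset
n_samples <- 12   # matches the cohort size

# Hypothetical pure profiles on the linear scale (genes x cell types)
ref <- matrix(rexp(n_genes * n_types, rate = 1 / 100),
              nrow = n_genes,
              dimnames = list(paste0("gene_", seq_len(n_genes)),
                              paste0("type_", seq_len(n_types))))

# Hypothetical mixing proportions (cell types x samples)
prop <- matrix(runif(n_types * n_samples),
               nrow = n_types,
               dimnames = list(colnames(ref),
                               paste0("sample_", seq_len(n_samples))))

# Rescale so that each sample's proportions sum to 1
prop <- sweep(prop, 2, colSums(prop), "/")

# Linear mixing of the profiles, followed by a log2 transform
mix <- log2(ref %*% prop + 1)
dim(mix)
```

In this sketch `prop` plays the role of the ground truth (cell types by samples) and `mix` the role of the expression matrix (genes by samples).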
Preparation of the data
Expression data from microarrays were collected, normalized together using fRMA, and log2-transformed.
Normalisation
= fRMA
Transformation
= Log2
Composition of the public data
## [1] 23520 12
colnames(test_data[[1]]) <- paste0("sample_", seq_len(ncol(test_data[[1]])))
knitr::kable(head(test_data[[1]][, 1:5], 10))
Gene | sample_1 | sample_2 | sample_3 | sample_4 | sample_5 |
---|---|---|---|---|---|
A1BG | 5.084187 | 5.352647 | 5.346210 | 5.430234 | 4.956454 |
A1BG-AS1 | 6.863607 | 6.819322 | 7.042982 | 7.240511 | 6.917173 |
A1CF | 4.933038 | 5.190533 | 4.959159 | 4.860161 | 4.751924 |
A2M | 6.778452 | 6.774715 | 7.446837 | 7.433180 | 6.912307 |
A2M-AS1 | 4.731402 | 5.027431 | 7.492709 | 6.547286 | 6.105185 |
A2ML1 | 3.727863 | 3.928135 | 3.958703 | 3.853147 | 3.899408 |
A2MP1 | 5.148930 | 5.022095 | 7.266519 | 7.074190 | 6.019521 |
A4GALT | 7.297125 | 7.414685 | 6.966293 | 7.098775 | 6.758724 |
A4GNT | 3.797872 | 3.924123 | 3.640855 | 3.914707 | 3.635332 |
AA06 | 5.368739 | 5.491767 | 5.217618 | 5.220630 | 4.942217 |
Composition of the ground truth
Source
= in silico simulations
Number of expected cell types
= 6
## [1] 6 12
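Before running a deconvolution benchmark, it can help to confirm that the ground truth lines up with the expression matrix. The following is a minimal sketch using toy stand-ins with the dimensions reported above; the object names `expr` and `ground_truth`, the `type_` row labels, and the expectation that each column of proportions sums to 1 are assumptions, not properties confirmed by this page.

```r
# Toy stand-ins with the reported dimensions (23520 x 12 and 6 x 12)
expr <- matrix(0, nrow = 23520, ncol = 12,
               dimnames = list(NULL, paste0("sample_", 1:12)))
ground_truth <- matrix(1 / 6, nrow = 6, ncol = 12,
                       dimnames = list(paste0("type_", 1:6), colnames(expr)))

# There should be one column of proportions per expression sample,
# in the same order
stopifnot(ncol(ground_truth) == ncol(expr))
stopifnot(identical(colnames(ground_truth), colnames(expr)))

# Proportions should be non-negative
stopifnot(all(ground_truth >= 0))

# Assumption: proportions sum to 1 within each sample
col_totals <- colSums(ground_truth)
sums_to_one <- all(abs(col_totals - 1) < 1e-8)
sums_to_one
```

With the real data, the toy matrices would be replaced by the loaded expression and ground-truth objects.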