R/calculation.R
reporter_score.Rd
One step to get the reporter score of your KO abundance table.
reporter_score(
kodf,
group,
metadata = NULL,
method = "wilcox.test",
pattern = NULL,
p.adjust.method1 = "none",
mode = c("directed", "mixed")[1],
verbose = TRUE,
feature = "ko",
type = c("pathway", "module")[1],
p.adjust.method2 = "BH",
modulelist = NULL,
threads = 1,
perm = 4999,
min_exist_KO = 3,
max_exist_KO = 600
)
KO_abundance table, rowname are feature ids (e.g. K00001 if feature="ko"; PEX11A if feature="gene"; C00024 if feature="compound"), colnames are samples.
The comparison groups (at least two categories) in your data, one column name of metadata when metadata exist or a vector whose length equal to columns number of kodf. And you can use factor levels to change order.
sample information data.frame contains group
the type of test. Default is `wilcox.test`. Allowed values include:
t.test
(parametric) and wilcox.test
(non-parametric). Perform comparison between two groups of samples. If the grouping variable contains more than two levels, then a pairwise comparison is performed.
anova
(parametric) and kruskal.test
(non-parametric). Perform one-way ANOVA test comparing multiple groups.
'pearson', 'kendall', or 'spearman' (correlation), see cor
.
a named vector matching the group, e.g. c('G1'=1,'G2'=3,'G3'=2), use the correlation analysis with specific pattern to calculate p-value.
p.adjust.method for `ko.test`, see p.adjust
'mixed' or 'directed' (default, only for two groups differential analysis or multi-groups correlation analysis.), see details in pvalue2zs
.
logical
one of 'ko', 'gene', 'compound'
'pathway' or 'module' for default KOlist for microbiome, 'CC', 'MF', 'BP', 'ALL' for default GOlist for homo sapiens. And org in listed in 'https://www.genome.jp/kegg/catalog/org_list.html' such as 'hsa' (if your kodf is come from a specific organism, you should specify type here).
p.adjust.method for the correction of ReporterScore, see p.adjust
NULL or customized modulelist dataframe, must contain 'id','K_num','KOs','Description' columns. Take the `KOlist` as example, use custom_modulelist
.
default 1
permutation number, default: 4999.
min exist KO number in a pathway (default, 3, when a pathway contains KOs less than 3, there will be no RS)
max exist KO number in a pathway (default, 600, when a pathway contains KOs more than 600, there will be no RS)
reporter_score object:
your input KO_abundance table
ko statistics result contains p.value and z_score
the reporter score in each pathway
default KOlist or customized modulelist dataframe
The comparison groups in your data
sample information dataframe contains group
for the `reporter_s` in result, whose columns represent:
pathway id
pathway description
total number of KOs/genes in the pathway
number of KOs/genes in your inputdata that exist in the pathway
number of kos/genes in your inputdata that are significant in the pathway
\(Z_{pathway}=\frac{1}{\sqrt{k}}\sum Z_{koi}\)
Background mean, \(\mu _k\)
Background standard deviation, \(\sigma _k\)
ReporterScore of the pathway, \(ReporterScore=(Z_{pathway}-\mu _k)/\sigma _k\)
p.value of the ReporterScore
adjusted p.value by p.adjust.method2
Other GRSA:
combine_rs_res()
,
get_reporter_score()
,
ko.test()
,
pvalue2zs()
message("The following example require some time to run:")
#> The following example require some time to run:
# \donttest{
data("KO_abundance_test")
reporter_score_res <- reporter_score(KO_abundance, "Group", metadata,
mode = "directed", perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> ===========================Using method: wilcox.test============================
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: WT, OE
#> Total KO number: 4535
#> Compare method: wilcox.test
#> Time use: 2.890
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 1.568
#> ====================================All done====================================
head(reporter_score_res$reporter_s)
#> ID Description K_num Exist_K_num
#> map03010 map03010 Ribosome 143 56
#> map05230 map05230 Central carbon metabolism in cancer 49 10
#> map00785 map00785 Lipoic acid metabolism 31 16
#> map00860 map00860 Porphyrin metabolism 139 44
#> map00780 map00780 Biotin metabolism 24 15
#> map04922 map04922 Glucagon signaling pathway 57 9
#> Significant_K_num Significant_up_num Significant_down_num Z_score
#> map03010 27 25 2 10.905774
#> map05230 9 0 9 -9.718236
#> map00785 12 0 12 -10.657642
#> map00860 25 3 22 -12.386131
#> map00780 10 1 9 -9.104172
#> map04922 7 0 7 -7.436852
#> BG_Mean BG_Sd ReporterScore p.value p.adjust
#> map03010 -5.428490 1.661462 9.831258 0.002 0.04266667
#> map05230 -2.254710 1.663421 -4.486854 0.002 0.04266667
#> map00785 -2.839535 1.748793 -4.470572 0.002 0.04266667
#> map00860 -4.661002 1.748553 -4.418013 0.002 0.04266667
#> map00780 -2.931240 1.652159 -3.736283 0.002 0.04266667
#> map04922 -2.140251 1.640234 -3.229173 0.002 0.04266667
reporter_score_res2 <- reporter_score(KO_abundance, "Group2", metadata,
mode = "mixed",
method = "kruskal.test", p.adjust.method1 = "none", perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> ===========================Using method: kruskal.test===========================
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: G1, G2, G3
#> Total KO number: 4535
#> Compare method: kruskal.test
#> Time use: 2.978
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 1.873
#> ====================================All done====================================
reporter_score_res3 <- reporter_score(KO_abundance, "Group2", metadata,
mode = "directed",
method = "pearson", pattern = c("G1" = 1, "G2" = 3, "G3" = 2), perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> =============================Using method: pearson==============================
#> Using pattern
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: G1, G2, G3
#> Total KO number: 4535
#> Compare method: pearson
#> Time use: 0.702
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 2.115
#> ====================================All done====================================
# }