R/calculation.R
reporter_score.RdOne step to get the reporter score of your KO abundance table.
reporter_score(
kodf,
group,
metadata = NULL,
method = "wilcox.test",
pattern = NULL,
p.adjust.method1 = "none",
mode = c("directed", "mixed")[1],
verbose = TRUE,
feature = "ko",
type = c("pathway", "module")[1],
p.adjust.method2 = "BH",
modulelist = NULL,
threads = 1,
perm = 4999,
min_exist_KO = 3,
max_exist_KO = 600
)KO_abundance table, rowname are feature ids (e.g. K00001 if feature="ko"; PEX11A if feature="gene"; C00024 if feature="compound"), colnames are samples.
The comparison groups (at least two categories) in your data, one column name of metadata when metadata exist or a vector whose length equal to columns number of kodf. And you can use factor levels to change order.
sample information data.frame contains group
the type of test. Default is `wilcox.test`. Allowed values include:
t.test (parametric) and wilcox.test (non-parametric). Perform comparison between two groups of samples. If the grouping variable contains more than two levels, then a pairwise comparison is performed.
anova (parametric) and kruskal.test (non-parametric). Perform one-way ANOVA test comparing multiple groups.
'pearson', 'kendall', or 'spearman' (correlation), see cor.
a named vector matching the group, e.g. c('G1'=1,'G2'=3,'G3'=2), use the correlation analysis with specific pattern to calculate p-value.
p.adjust.method for `ko.test`, see p.adjust
'mixed' or 'directed' (default, only for two groups differential analysis or multi-groups correlation analysis.), see details in pvalue2zs.
logical
one of 'ko', 'gene', 'compound'
'pathway' or 'module' for default KOlist for microbiome, 'CC', 'MF', 'BP', 'ALL' for default GOlist for homo sapiens. And org in listed in 'https://www.genome.jp/kegg/catalog/org_list.html' such as 'hsa' (if your kodf is come from a specific organism, you should specify type here).
p.adjust.method for the correction of ReporterScore, see p.adjust
NULL or customized modulelist dataframe, must contain 'id','K_num','KOs','Description' columns. Take the `KOlist` as example, use custom_modulelist.
default 1
permutation number, default: 4999.
min exist KO number in a pathway (default, 3, when a pathway contains KOs less than 3, there will be no RS)
max exist KO number in a pathway (default, 600, when a pathway contains KOs more than 600, there will be no RS)
reporter_score object:
your input KO_abundance table
ko statistics result contains p.value and z_score
the reporter score in each pathway
default KOlist or customized modulelist dataframe
The comparison groups in your data
sample information dataframe contains group
for the `reporter_s` in result, whose columns represent:
pathway id
pathway description
total number of KOs/genes in the pathway
number of KOs/genes in your inputdata that exist in the pathway
number of kos/genes in your inputdata that are significant in the pathway
\(Z_{pathway}=\frac{1}{\sqrt{k}}\sum Z_{koi}\)
Background mean, \(\mu _k\)
Background standard deviation, \(\sigma _k\)
ReporterScore of the pathway, \(ReporterScore=(Z_{pathway}-\mu _k)/\sigma _k\)
p.value of the ReporterScore
adjusted p.value by p.adjust.method2
Other GRSA:
combine_rs_res(),
get_reporter_score(),
ko.test(),
pvalue2zs()
message("The following example require some time to run:")
#> The following example require some time to run:
# \donttest{
data("KO_abundance_test")
reporter_score_res <- reporter_score(KO_abundance, "Group", metadata,
mode = "directed", perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> ===========================Using method: wilcox.test============================
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: WT, OE
#> Total KO number: 4535
#> Compare method: wilcox.test
#> Time use: 3.087
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 1.972
#> ====================================All done====================================
head(reporter_score_res$reporter_s)
#> ID Description K_num Exist_K_num
#> map03010 map03010 Ribosome 143 56
#> map05230 map05230 Central carbon metabolism in cancer 49 10
#> map00785 map00785 Lipoic acid metabolism 31 16
#> map00860 map00860 Porphyrin metabolism 139 44
#> map00780 map00780 Biotin metabolism 24 15
#> map04922 map04922 Glucagon signaling pathway 57 9
#> Significant_K_num Significant_up_num Significant_down_num Z_score
#> map03010 27 25 2 10.905774
#> map05230 9 0 9 -9.718236
#> map00785 12 0 12 -10.657642
#> map00860 25 3 22 -12.386131
#> map00780 10 1 9 -9.104172
#> map04922 7 0 7 -7.436852
#> BG_Mean BG_Sd ReporterScore p.value p.adjust
#> map03010 -5.428490 1.661462 9.831258 0.002 0.04266667
#> map05230 -2.254710 1.663421 -4.486854 0.002 0.04266667
#> map00785 -2.839535 1.748793 -4.470572 0.002 0.04266667
#> map00860 -4.661002 1.748553 -4.418013 0.002 0.04266667
#> map00780 -2.931240 1.652159 -3.736283 0.002 0.04266667
#> map04922 -2.140251 1.640234 -3.229173 0.002 0.04266667
reporter_score_res2 <- reporter_score(KO_abundance, "Group2", metadata,
mode = "mixed",
method = "kruskal.test", p.adjust.method1 = "none", perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> ===========================Using method: kruskal.test===========================
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: G1, G2, G3
#> Total KO number: 4535
#> Compare method: kruskal.test
#> Time use: 3.189
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 1.586
#> ====================================All done====================================
reporter_score_res3 <- reporter_score(KO_abundance, "Group2", metadata,
mode = "directed",
method = "pearson", pattern = c("G1" = 1, "G2" = 3, "G3" = 2), perm = 499
)
#> ================================Use feature: ko=================================
#> ===============================Checking rownames================================
#> Some of your ko_stat are not KO id, check the format! (e.g. K00001)
#> 52.7% of your kos in the modulelist!
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ===================================1.KO test====================================
#> =================================Checking group=================================
#> 30 samples are matched for next step.
#> ===========================Removing all-zero rows: 0============================
#> ==============================Calculating each KO===============================
#> =============================Using method: pearson==============================
#> Using pattern
#> 1000 features done.
#> 2000 features done.
#> 3000 features done.
#> 4000 features done.
#>
#> Compared groups: G1, G2, G3
#> Total KO number: 4535
#> Compare method: pearson
#> Time use: 0.991
#> =========================2.Transfer p.value to Z-score==========================
#> ==========================3.Calculating reporter score==========================
#> ==================================load KOlist===================================
#> ================KOlist download time: 2023-07-28 06:07:08.466916================
#> If you want to update KOlist, use `update_KO_file()`
#> ============================Calculating each pathway============================
#> 100 pathways done.
#> 400 pathways done.
#> ID number: 481
#> Time use: 1.472
#> ====================================All done====================================
# }