Reformat Taxonomic Lineage using taxonkit
taxonkit_reformat(
  file_path,
  delimiter = NULL,
  add_prefix = FALSE,
  prefix_kingdom = "K__",
  prefix_phylum = "p__",
  prefix_class = "c__",
  prefix_order = "o__",
  prefix_family = "f__",
  prefix_genus = "g__",
  prefix_species = "s__",
  prefix_subspecies = "t__",
  prefix_strain = "T__",
  fill_miss_rank = FALSE,
  format_string = "",
  miss_rank_repl_prefix = "unclassified ",
  miss_rank_repl = "",
  miss_taxid_repl = "",
  output_ambiguous_result = FALSE,
  lineage_field = 2,
  taxid_field = NULL,
  pseudo_strain = FALSE,
  trim = FALSE,
  text = FALSE,
  data_dir = NULL
)
file_path: Path to the input file containing taxonomic lineages, or the input text itself when text = TRUE.
delimiter: The field delimiter in the input lineage (default ";").
add_prefix: Logical, indicating whether to add prefixes for all ranks (default: FALSE).
prefix_kingdom: The prefix for kingdom, used along with --add-prefix (default: "K__").
prefix_phylum: The prefix for phylum, used along with --add-prefix (default: "p__").
prefix_class: The prefix for class, used along with --add-prefix (default: "c__").
prefix_order: The prefix for order, used along with --add-prefix (default: "o__").
prefix_family: The prefix for family, used along with --add-prefix (default: "f__").
prefix_genus: The prefix for genus, used along with --add-prefix (default: "g__").
prefix_species: The prefix for species, used along with --add-prefix (default: "s__").
prefix_subspecies: The prefix for subspecies, used along with --add-prefix (default: "t__").
prefix_strain: The prefix for strain, used along with --add-prefix (default: "T__").
fill_miss_rank: Logical, indicating whether to fill missing ranks with lineage information from the next higher rank (default: FALSE).
format_string: The output format string with placeholders for each rank.
miss_rank_repl_prefix: The prefix for the estimated taxon level of a missing rank (default: "unclassified ").
miss_rank_repl: The replacement string for a missing rank.
miss_taxid_repl: The replacement string for a missing taxid.
output_ambiguous_result: Logical, indicating whether to output one of the ambiguous results (default: FALSE).
lineage_field: The field index of the lineage. Input data should be tab-separated (default: 2).
taxid_field: The field index of the taxid. Input data should be tab-separated. It overrides -i/--lineage-field.
pseudo_strain: Logical, indicating whether to use the node with the lowest rank as the strain name (default: FALSE).
trim: Logical, indicating whether to leave ranks lower than the current rank unfilled (default: FALSE).
text: Logical, indicating whether file_path is the input text itself rather than a path to a file (default: FALSE).
data_dir: Directory containing nodes.dmp and names.dmp (default "/Users/asa/.taxonkit").
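To make the interplay of the format string, the rank prefixes, and the missing-rank replacement more concrete, the following is a minimal pure-R sketch that imitates placeholder expansion on a hand-written lineage. It does not call taxonkit. The helper expand_format() and the sample vector are hypothetical, and the placeholder letters ({g}, {s}, {t}) are assumed to follow the taxonkit reformat convention. How taxonkit itself treats prefixes on missing ranks may differ from this sketch.

```r
# Hypothetical helper: expand "{x}" placeholders in a format string.
# It only imitates the idea behind taxonkit reformat, not its implementation.
expand_format <- function(ranks, format_string, prefixes = NULL, miss_rank_repl = "") {
  out <- format_string
  # Collect every placeholder of the form {letter}
  keys <- regmatches(format_string, gregexpr("\\{[A-Za-z]\\}", format_string))[[1]]
  for (key in keys) {
    letter <- substr(key, 2, 2)
    # Use the rank name if present, otherwise the replacement for missing ranks
    value <- if (letter %in% names(ranks)) ranks[[letter]] else miss_rank_repl
    if (!is.null(prefixes) && letter %in% names(prefixes)) {
      value <- paste0(prefixes[[letter]], value)
    }
    out <- sub(key, value, out, fixed = TRUE)
  }
  out
}

# Hand-written sample lineage (human), keyed by placeholder letter
human <- c(k = "Eukaryota", p = "Chordata", c = "Mammalia", o = "Primates",
           f = "Hominidae", g = "Homo", s = "Homo sapiens")

res_plain <- expand_format(human, "{g};{s}")
res_prefixed <- expand_format(
  human, "{g};{s};{t}",
  prefixes = c(g = "g__", s = "s__", t = "t__"),
  miss_rank_repl = "unclassified"
)
print(res_plain)
print(res_prefixed)
```

In this sketch the subspecies slot {t} has no entry in the sample lineage, so it receives the replacement string, which is the role miss_rank_repl plays in the arguments above.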
A character vector containing the reformatted taxonomic lineages.
Other Rtaxonkit: check_taxonkit(), download_taxonkit_dataset(), install_taxonkit(), name_or_id2df(), taxonkit_filter(), taxonkit_lca(), taxonkit_lineage(), taxonkit_list(), taxonkit_name2taxid()
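if (FALSE) { # \dontrun{
library(pctax)
library(magrittr) # provides %>%, in case it is not already attached

# Reformat from taxids: column 1 of the input file holds the taxid
taxids2 <- system.file("extdata/taxids2.txt", package = "pctax")
reformatted_lineages <- taxonkit_reformat(taxids2,
  add_prefix = TRUE, taxid_field = 1, fill_miss_rank = TRUE
)
reformatted_lineages

# Split each record on tabs, then split the lineage column on ";"
# (strsplit2() is assumed to come from pcutils)
taxonomy <- strsplit2(reformatted_lineages, "\t")
taxonomy <- strsplit2(taxonomy$V2, ";")

# Reformat the text output of taxonkit_lineage() directly
taxonkit_lineage("9606\n63221", show_name = TRUE, show_rank = TRUE, text = TRUE) %>%
  taxonkit_reformat(text = TRUE)
} # }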
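The example above splits the returned character vector with strsplit2(). As a base-R alternative that needs no extra packages, a sketch along the following lines may work, assuming each element has the shape taxid, tab, lineage. The sample vector is hand-written for illustration and is not actual taxonkit output. The seven column names are an assumption that matches a seven-rank lineage.

```r
# Hand-written sample in the assumed shape "taxid<TAB>lineage"; not real taxonkit output
reformatted <- c(
  "9606\tEukaryota;Chordata;Mammalia;Primates;Hominidae;Homo;Homo sapiens",
  "63221\tEukaryota;Chordata;Mammalia;Primates;Hominidae;Homo;Homo sapiens"
)

# Split each record on the tab into taxid and lineage
fields <- strsplit(reformatted, "\t", fixed = TRUE)
taxid <- vapply(fields, `[`, character(1), 1)
lineage <- vapply(fields, `[`, character(1), 2)

# Split the lineage on ";" and bind the pieces into a matrix.
# This requires every lineage to contain the same number of ranks.
ranks <- do.call(rbind, strsplit(lineage, ";", fixed = TRUE))
colnames(ranks) <- c("kingdom", "phylum", "class", "order",
                     "family", "genus", "species")

taxonomy <- data.frame(taxid = taxid, ranks, stringsAsFactors = FALSE)
print(taxonomy)
```

If lineages can differ in length, filling missing ranks first (for example with fill_miss_rank = TRUE) keeps the rows aligned before binding.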