--- title: "Taxon Letter Codes in Soil Taxonomy" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Taxon Letter Codes in Soil Taxonomy} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` There are four different "levels" of the hierarchy in Soil Taxonomy that are represented by letter codes: - Soil Order - Suborder - Great Group - Subgroup In the _SoilTaxonomy_ package the `level` argument is used by some functions to specify output for a target level within the hierarchy. Other functions determine `level` by comparison against known taxa or codes. This vignette covers the basics of how taxon letter code conversion to and from taxonomic names is implemented. ```{r setup} library(SoilTaxonomy) ``` ## `taxon_to_taxon_code()` `taxon_to_taxon_code()` converts a taxon name (Soil Order, Suborder, Great Group, or Subgroup) to a letter code that corresponds to the logical position of that taxon in the Keys to Soil Taxonomy. _Gelisols_ are the first Soil Order to key out and are given letter code "A" ```{r} taxon_to_taxon_code("gelisols") ``` The number of letters in a taxon code corresponds to the `level` of that taxon. _Histels_ are the first Suborder to key out in the _Gelisols_ key (**A**), so they are given two letter code "AA" ```{r} taxon_to_taxon_code("histels") ``` For each "step" in each key, the letter codes are "incremented" by one. _Glacistels_ are the second Great Group in the _Histels_ key (**AA**), so they have the three letter code "AAB". ```{r} taxon_to_taxon_code("glacistels") ``` _Typic_ subgroups, by convention, are the last subgroup to key out in a Great Group. ```{r} taxon_to_taxon_code("typic glacistels") ``` Since _Typic Glacistels_ have code `"AABC"` we can infer that there are three taxa in the _Glacistels_ key with codes `"AABA"`, `"AABB"` and `"AABC"` This follows for Great Groups with many more subgroups. In case a Great Group has more than 26 subgroups within it, a fifth lowercase letter code is used to "extend" the ability to increment the code beyond 26. An example of where this is needed is in the _Haploxerolls_ key where the _Typic_ subgroup has code `"IFFZh"`. ```{r} taxon_to_taxon_code("typic haploxerolls") ``` From this code we infer that the _Haploxerolls_ key has $26+8=34$ subgroups corresponding to the range from `IFFA` to `IFFZ` _plus_ `IFFZa` to `IFFZh`. ## `taxon_code_to_taxon()` We can use a vector of letter codes to do the inverse operation with `taxon_code_to_taxon()`. Above we determined the _Glacistels_ Key contains three taxa with codes `"AABA"`, `"AABB"` and `"AABC"`. Let's convert those codes to taxon names. ```{r} taxon_code_to_taxon(c("AABA", "AABB", "AABC")) ``` ## `taxon_to_level()` We can infer from the length of the four-letter codes that all of the above are subgroup-level taxa. `taxon_to_level()` confirms this. ```{r} taxon_to_level(c("Hemic Glacistels","Sapric Glacistels","Typic Glacistels")) ``` `taxon_to_level()` can also identify a fifth (lower-level) _family_ tier (`level="family"`). Soil family differentiae are not handled in the Order to Subgroup keys. Family names are defined by concatenating comma-separated class names on to the subgroup. Classes used in family names are determined by specific keys and apply variably depending on the subgroup-level taxonomy. For instance, the soil family `"Fine, mixed, semiactive, mesic Ultic Haploxeralfs"` includes a particle-size class (`"fine"`), a mineralogy class (`"mixed"`), a cation exchange capacity (CEC) activity class (`"semiactive"`) and a temperature class (`"mesic"`) ```{r} taxon_to_level("Fine, mixed, semiactive, mesic Ultic Haploxeralfs") ``` ## `getTaxonAtLevel()` A wrapper method around taxon letter code functionality is `getTaxonAtLevel()`. Say that you have family-level taxon above and you want to determine the taxonomy at a higher (less detailed) level. You can determine what to remove (family and subgroup-level modifiers) to get the Great Group using `getTaxonAtLevel(level="greatgroup")` ```{r} getTaxonAtLevel("Fine, mixed, semiactive, mesic Ultic Haploxeralfs", level = "greatgroup") ``` If you request a more-detailed taxonomic level than what you start with, you will get an `NA` result. For example, we request the subgroup from suborder (`"Folists"`) level taxon name which is undefined. ```{r} getTaxonAtLevel("Folists", level = "subgroup") ``` ## `getParentTaxa()` Another wrapper method around taxon letter code functionality is `getParentTaxa()`. This function will enumerate the tiers above a particular taxon. ```{r} getParentTaxa("Fine, mixed, semiactive, mesic Ultic Haploxeralfs") ``` You can alternately specify `code` argument instead of `taxon`. ```{r} getParentTaxa(code = "BAB") ``` And converting the internally used taxon codes to taxon names can be disabled with `convert = FALSE`. This may be useful for certain applications. ```{r} getParentTaxa(code = c("BAA","BAB"), convert = FALSE) ``` ## `decompose_taxon_code()` For more general cases `decompose_taxon_code()` might be useful. This is a function used by many of the above methods that returns a nested list result containing the letter code hierarchy. ```{r} decompose_taxon_code(c("BAA","BAB")) ``` ## `preceding_taxon_codes()` and `relative_taxon_code_position()` Other functions useful for comparing relative positions within Keys, or the number of "steps" that it takes to reach a particular taxon, are `preceding_taxon_codes()` and `relative_taxon_code_position()`. `preceding_taxon_codes()` returns a list of vectors containing all preceding codes. For example, the `AA` suborder key precedes `AB`. And within the `AB` key `ABA` and `ABB` precede `ABC`. ```{r} preceding_taxon_codes("ABC") ``` `relative_taxon_code_position()` counts how many taxa key out before a taxon plus $1$ (to get _the_ taxon position). ```{r} relative_taxon_code_position(c("A","AA","AAA","AAAA", "AB","AAB","ABA","ABC", "B","BA","BAA","BAB", "BBA","BBB","BBC")) ```