{rosettaPTF} is designed for efficient batch processing of soil hydraulic parameters. The package supports multiple input formats and offers options for scaling analysis to large datasets.
The core implementation uses the rosetta-soil Python module, which provides vectorized computation over multiple samples. This vignette demonstrates key parameters for controlling performance and handling large-scale analyses.
The run_rosetta() function accepts point data as a data.frame, where each row represents a single observation. Let’s use the sample soil property dataset included with the package.
data("MUKEY_PROP")
# View the structure of the sample data
str(MUKEY_PROP)
# Run rosetta on the sample property data
system.time({
res_points <- run_rosetta(MUKEY_PROP[, c("sandtotal_r", "silttotal_r", "claytotal_r", "dbthirdbar_r")])
})
head(res_points)
Spatial soil property predictions are commonly stored as raster grids. The run_rosetta() function can process SpatRaster objects directly, computing predictions for every cell in the grid.
We’ll use the sample spatial dataset (MUKEY_WCS) to create a continuous raster surface by interpolating soil properties.
# Convert MUKEY_WCS matrix to SpatRaster
data("MUKEY_WCS", package = "rosettaPTF")
data("MUKEY_PROP", package = "rosettaPTF")
r_template <- terra::rast(MUKEY_WCS, crs = "EPSG:5070")
terra::ext(r_template) <- c(-1365495, -1358925, 2869245, 2873655)
names(r_template) <- "mukey"
levels(r_template) <- MUKEY_PROP[, c("mukey",
"sandtotal_r", "silttotal_r", "claytotal_r",
"dbthirdbar_r")]
r_input <- terra::catalyze(r_template)
plot(r_input)
Pass the SpatRaster object to run_rosetta(). The output is a multi-layer raster containing mean and standard deviation for each predicted parameter.
# Process the raster stack
system.time({
r_output <- run_rosetta(r_input)
})
# Inspect the layers
names(r_output)
# Plot predicted Ksat (log10 cm/day)
plot(r_output[["log10_Ksat_mean"]], main = "Predicted Ksat")
For large rasters, the cores argument enables block-wise processing across multiple CPU cores. This parameter controls how the raster is divided and processed in parallel.
# Divide the raster into blocks and process each block on separate cores
r_output_parallel <- run_rosetta(r_input, cores = 2)
Note that with small rasters, as in this example, parallel processing may be significantly slower than sequential processing.
cores: Number of CPU cores to use for parallel processing. Set to 1 (default) for sequential processing, or 2+ for parallel block-wise processing. Useful for rasters that are memory-intensive or computationally demanding.
Input format: SpatRaster objects are preferred for spatial workflows. The function handles memory-efficient extraction and output reconstruction automatically.
For extremely large regions or high-resolution grids, consider processing tiles or regions sequentially:
# Example: process raster in regional tiles
tiles <- terra::getTileExtents(r_input, 125)
results <- terra::merge(terra::sprc(apply(tiles, 1, function(x) {
terra::window(r_input) <- x
run_rosetta(r_input)
})))
By default, parameters like alpha, npar, and Ksat are returned on a logarithmic (log10) scale. Alternative scales are available via the estimate_type argument:
# Linear scale estimates
res_linear <- run_rosetta(MUKEY_PROP[, c("sandtotal_r", "silttotal_r", "claytotal_r", "dbthirdbar_r")],
estimate_type = "arith")
# Geometric mean (recommended for log-transformed parameters)
res_geo <- run_rosetta(MUKEY_PROP[, c("sandtotal_r", "silttotal_r", "claytotal_r", "dbthirdbar_r")],
estimate_type = "geo")
By default, predictions include the full 1,000-member bootstrap ensemble for uncertainty quantification. The output includes both mean and standard deviation for each parameter.
# The output includes uncertainty estimates
head(res_points[, c("log10_Ksat_mean", "log10_Ksat_sd")])
Key considerations for high-throughput analysis:
SpatRaster objects as input for spatial workflows; the function handles data conversion automatically.cores > 1 to enable parallel block-wise processing for large rasters.estimate_type based on your application requirements.