This “Getting Started with bbr” vignette takes the user through some basic scenarios for modeling with NONMEM using bbr, introducing you to its standard workflow and functionality. bbr is an R interface for running bbi. Together they provide a solution for managing projects involving modeling and simulation with a number of software solutions used in pharmaceutical sciences. Currently, only NONMEM is supported, so this vignette will only address that.
The first time you use bbr on a system or disk, you must install bbi. The bbi executable is a single binary file; you just need to tell bbr where to find it. This can be done by setting the bbr.bbi_exe_path option like so:
options("bbr.bbi_exe_path" = "/data/apps/bbi")
Please note that this must be set for every new R session, so it’s recommended to include the above snippet in your .Rprofile, replacing "/data/apps/bbi" with the absolute path from your system/project if you have installed it somewhere different. Also note that this can be pointed to anywhere on your system; installing to /data/apps/bbi is merely a convention used at Metrum Research Group, where bbr was developed.
use_bbi()
bbr provides an installer function named use_bbi() that makes it easy to install bbi from within R.
bbr::use_bbi()
With no arguments passed, use_bbi() will default to installing bbi at the path you have set in options("bbr.bbi_exe_path"). This is why we recommend setting this first (and putting it in your .Rprofile so that it is sourced for every new R session). If this option is not set, use_bbi() will fall back to the bbi on your $PATH (accessed via Sys.which("bbi")) and then an OS-dependent default, in that order. See ?use_bbi() if you would like more details.
The use_bbi() function also optionally takes the absolute path where you want to install bbi, though we recommend that most users follow the pattern above: first specifying a path in options("bbr.bbi_exe_path"), and then letting use_bbi() install to that path by default.
If you’re not sure if you have bbi installed, you can use bbi_version() to check. This will return the version number of your installation, if one is found, or an empty string if you do not have bbi installed. You can also use bbi_current_release() to see the most current version available, and run use_bbi() as specified above if you want to update.
You can also check the bbi documentation for manual installation instructions from the command line.
To actually submit models with bbi, you will also need a bbi.yaml configuration file. Think of this as containing “global default settings” for your bbi runs. Those settings will not be discussed here, but know that they can be modified globally by editing that file, or model-by-model as described in the “Passing arguments to bbi” section below.
The bbi_init() function will create a bbi.yaml, with the default settings, in the specified directory. Optionally, you can pass the path to a bbi.yaml to the .config_path argument of any bbr function that needs one. However, by default these functions will look for one in the same folder as the model being submitted or manipulated. Therefore, if you will have a number of models in the same folder, it is easiest to put a bbi.yaml in that folder.
MODEL_DIR <- "../model/nonmem/basic" # this should be relative to your working directory
bbi_init(.dir = MODEL_DIR, # the directory to create the bbi.yaml in
.nonmem_dir = "/opt/NONMEM", # location of NONMEM installation
.nonmem_version = "nm74gf") # default NONMEM version to use
Here we define a variable MODEL_DIR with the path to the directory containing our models. Notice that you must also pass the path to a working installation of NONMEM, and a default NONMEM version to use.
Note this only needs to be done once for each folder you are modeling in. Once the bbi.yaml exists, you will not need to run bbi_init() again unless you want to create another one; for example, if you move to modeling in a different directory, or if you want a different set of default settings for some specific subset of models.
To begin modeling, first create a model object using new_model(). This is an S3 object which you will pass to all of the other bbr functions that submit, summarize, or manipulate models.
The first argument (.path) must be the path to your model control stream without file extension. For instance, the call below assumes you have a control stream named 1.ctl or 1.mod in the directory you just assigned to MODEL_DIR above.
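The new_model() call itself is not shown above; mirroring the read_model() call used later in this vignette, it would presumably look like:

```r
# create the model object from MODEL_DIR/1.ctl (or 1.mod); also writes 1.yaml
mod1 <- new_model(file.path(MODEL_DIR, 1))
```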
This new_model() call will also create a 1.yaml file in that same directory, which stores model metadata like description and tags (discussed below). If you ever need to recreate this model object in memory, just run mod1 <- read_model(file.path(MODEL_DIR, 1)) to rebuild it from the YAML file on disk.
The model object you have just created can be passed to various functions which you will see in a minute. Now that we’ve created a model, the first thing we will do is submit the model to be run.
The initial parameter estimates can be extracted and formatted using initial_estimates(). Additional columns will indicate any bounds on $THETA records, as well as the record number (such as when multiple $OMEGA records are used). You can optionally flag which estimates were FIXED.
initial_est_df <- initial_estimates(mod1, flag_fixed = TRUE)
initial_est_df
#> # A tibble: 8 × 7
#> parameter_names init lower_bound upper_bound fixed record_type record_number
#> <chr> <dbl> <dbl> <dbl> <lgl> <chr> <int>
#> 1 THETA1 2 0 NA FALSE theta 1
#> 2 THETA2 3 0 NA FALSE theta 1
#> 3 THETA3 10 0 NA FALSE theta 1
#> 4 THETA4 0.02 NA NA FALSE theta 1
#> 5 THETA5 1 NA NA FALSE theta 1
#> 6 OMEGA(1,1) 0.05 NA NA FALSE omega 1
#> 7 OMEGA(2,2) 0.2 NA NA FALSE omega 1
#> 8 SIGMA(1,1) 1 NA NA TRUE sigma 1
If stored as an R object, you can extract the full $OMEGA or $SIGMA matrices. This will diagonally concatenate all matrices of the same record type, where NA values indicate the value was not specified in the control stream file.
attr(initial_est_df, "omega_mat") # attr(initial_est_df, "sigma_mat") for SIGMA
#> [,1] [,2]
#> [1,] 0.05 NA
#> [2,] NA 0.2
#> attr(,"nmrec_record_size")
#> [1] 2
#> attr(,"nmrec_flags")
#> attr(,"nmrec_flags")$fixed
#> [,1] [,2]
#> [1,] FALSE NA
#> [2,] NA FALSE
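To make “diagonally concatenate” concrete, here is a small base-R sketch; bdiag_na() is a hypothetical helper written for illustration, not part of bbr, but it reproduces the shape of the omega_mat attribute shown above:

```r
# hypothetical helper (not part of bbr) illustrating diagonal concatenation
# of record matrices, with NA filling the off-diagonal blocks
bdiag_na <- function(a, b) {
  n <- nrow(a) + nrow(b)
  out <- matrix(NA_real_, n, n)
  out[seq_len(nrow(a)), seq_len(ncol(a))] <- a
  out[nrow(a) + seq_len(nrow(b)), ncol(a) + seq_len(ncol(b))] <- b
  out
}

# two 1x1 $OMEGA records, as in the example above
bdiag_na(matrix(0.05), matrix(0.2))
```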
mod1 %>% submit_model()
This will return a process object. We won’t discuss this object in this vignette, but it contains some information about the submission call. Please note that checking on a model run in progress is not fully implemented. For now, users should check on their runs manually (by looking at the output directory) and only proceed to the next steps once it has successfully completed.
There are a number of arguments that bbi can take to modify how models are run. You can print a list of available arguments using the print_bbi_args() helper function.
print_bbi_args()
#> additional_post_work_envs (character): Any additional values (as ENV KEY=VALUE)
#> to provide for the post execution environment. Command-line option:
#> --additional_post_work_envs
#> background (logical): RAW NMFE OPTION - Tells nonmem not to scan StdIn for
#> control characters. Command-line option: --background
#> clean_lvl (numeric): clean level used for file output from a given (set of)
#> runs (default 1). Command-line option: --clean_lvl
#> config (character): Path (relative or absolute) to another bbi.yaml to load.
#> Command-line option: --config
#> copy_lvl (numeric): copy level used for file output from a given (set of) runs.
#> Command-line option: --copy_lvl
#> debug (logical): debug mode. Command-line option: --debug
#> delay (numeric): Selects a random number of seconds between 1 and this value to
#> stagger / jitter job execution. Assists in dealing with large volumes of work
#> dealing with the same data set. May avoid NM-TRAN issues about not being able
#> read / close files. Command-line option: --delay
#> ext_file (character): name of custom ext-file. Command-line option: --ext-file
#> git (logical): whether git is used. Command-line option: --git
#> json (logical): json tree of output, if possible. Command-line option: --json
#> licfile (character): RAW NMFE OPTION - Specify a license file to use with NMFE
#> (Nonmem). Command-line option: --licfile
#> log_file (character): If populated, specifies the file into which to store the
#> output / logging details from bbi. Command-line option: --log_file
#> maxlim (numeric): RAW NMFE OPTION - Set the maximum values for the buffers used
#> by Nonmem (if 0, don't pass -maxlim to nmfe) (default 2). Command-line
#> option: --maxlim. Compatibility note: Default changed from unset to 2 with
#> bbi v3.2.0.
#> mpi_exec_path (character): The fully qualified path to mpiexec. Used for nonmem
#> parallel operations (default '/usr/local/mpich3/bin/mpiexec'). Command-line
#> option: --mpi_exec_path
#> nm_version (character): Version of nonmem from the configuration list to use.
#> Command-line option: --nm_version
#> nm_qual (logical): Whether or not to execute with nmqual (autolog.pl).
#> Command-line option: --nmqual
#> nobuild (logical): RAW NMFE OPTION - Skips recompiling and rebuilding on nonmem
#> executable. Command-line option: --nobuild
#> no_ext_file (logical): do not use ext file. Command-line option: --no-ext-file
#> no_grd_file (logical): do not use grd file. Command-line option: --no-grd-file
#> no_shk_file (logical): do not use shk file. Command-line option: --no-shk-file
#> overwrite (logical): Whether or not to remove existing output directories if
#> they are present. Command-line option: --overwrite
#> parafile (character): Location of a user-provided parafile to use for parallel
#> execution. Command-line option: --parafile
#> parallel (logical): Whether or not to run nonmem in parallel mode. Command-line
#> option: --parallel
#> parallel_timeout (numeric): The amount of time to wait for parallel operations
#> in nonmem before timing out (default 2147483647). Command-line option:
#> --parallel_timeout
#> post_work_executable (character): A script or binary to run when job execution
#> completes or fails. Command-line option: --post_work_executable
#> prcompile (logical): RAW NMFE OPTION - Forces PREDPP compilation. Command-line
#> option: --prcompile
#> prdefault (logical): RAW NMFE OPTION - Do not recompile any routines other than
#> FSUBS. Command-line option: --prdefault. Compatibility note: This option
#> isn't available in bbi until v3.2.0.
#> prsame (logical): RAW NMFE OPTION - Indicates to nonmem that the PREDPP
#> compilation step should be skipped. Command-line option: --prsame
#> preview (logical): preview action, but don't actually run command. Command-line
#> option: --preview
#> save_config (logical): Whether or not to save the existing configuration to a
#> file with the model (default true). Command-line option: --save_config
#> threads (numeric): number of threads to execute with locally or nodes to
#> execute on in parallel (default 4). Command-line option: --threads
#> tprdefault (logical): RAW NMFE OPTION - Test if is okay to do -prdefault.
#> Command-line option: --tprdefault. Compatibility note: This option isn't
#> available in bbi until v3.2.0.
#> verbose (logical): verbose output. Command-line option: --verbose
As discussed in “Setup” above, these can be set globally in the bbi.yaml file, and you can see their default values in that file. However, specific arguments can also be set or changed for each model. This can be done in two ways: by attaching them to the model object with add_bbi_args() or replace_bbi_args(), or by passing them to the .bbi_args argument of one of the submitting or summarizing functions (such as submit_model() or model_summary()).
Note that any bbi_args attached to a model object will override the relevant settings in bbi.yaml, and that .bbi_args passed into a submit_ or _summary call will override the relevant settings in both bbi.yaml and the bbi_args attached to the model object.
See the docs for any of the functions just mentioned for more details on usage and syntax.
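Conceptually, this precedence behaves like layered named lists. The following base-R sketch illustrates the override order only; it is not bbr's implementation:

```r
# illustration of the override order only; not bbr internals
global <- list(threads = 4, overwrite = FALSE)  # defaults from bbi.yaml
model  <- list(threads = 8)                     # attached via add_bbi_args()
call   <- list(overwrite = TRUE)                # .bbi_args in a submit_ call

# later layers win: model overrides global, the call overrides both
effective <- modifyList(modifyList(global, model), call)
str(effective)  # threads = 8, overwrite = TRUE
```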
It is common to run a model, make some tweaks to it, and then run it again. However, to avoid accidentally deleting model outputs, bbr will error by default if it sees existing output when trying to submit a model. To automatically overwrite any previous model output, just pass .overwrite = TRUE to submit_model().
mod1 %>% submit_model(.overwrite = TRUE)
You can also inherit this setting, either by attaching overwrite = TRUE to a model with add_bbi_args() or by setting it globally with overwrite: true in the bbi.yaml file for your project. However, either of these settings will be overridden by directly passing the argument as shown above.
Once the model run has completed, users can get a summary object containing much of the commonly used diagnostic information in a named list.
sum1 <- mod1 %>% model_summary()
print(names(sum1))
#> [1] "absolute_model_path" "run_details" "run_heuristics"
#> [4] "parameters_data" "parameter_names" "ofv"
#> [7] "condition_number" "shrinkage_details" "success"
These elements can be accessed manually or extracted with built-in helper functions like so:
param_df1 <- sum1 %>% param_estimates()
param_df1
#> # A tibble: 9 × 8
#> parameter_names estimate stderr random_effect_sd random_effect_sdse fixed
#> <chr> <dbl> <dbl> <dbl> <dbl> <lgl>
#> 1 THETA1 2.32 8.72e- 2 NA NA FALSE
#> 2 THETA2 54.6 3.38e+ 0 NA NA FALSE
#> 3 THETA3 463. 3.03e+ 1 NA NA FALSE
#> 4 THETA4 -0.0820 5.61e- 2 NA NA FALSE
#> 5 THETA5 4.18 1.38e+ 0 NA NA FALSE
#> 6 OMEGA(1,1) 0.0985 2.03e- 2 0.314 3.24e- 2 FALSE
#> 7 OMEGA(2,1) 0 1 e+10 0 1 e+10 TRUE
#> 8 OMEGA(2,2) 0.157 2.72e- 2 0.396 3.44e- 2 FALSE
#> 9 SIGMA(1,1) 1 1 e+10 1 1 e+10 TRUE
#> # ℹ 2 more variables: diag <lgl>, shrinkage <dbl>
Specific parameter estimates can also be extracted and formatted using the get_omega(), get_theta(), and get_sigma() helpers:
sum1 %>% get_theta()
#> THETA1 THETA2 THETA3 THETA4 THETA5
#> 2.3171600 54.6151000 462.5140000 -0.0820082 4.1795900
To see how to load summaries of multiple models to an easy-to-read tibble, see the Creating a Model Summary Log vignette.
You can also use the nm_join() function to load the input data and any output tables from your model into a single tibble.
join_df <- nm_join(mod1)
join_df %>%
select(ID, TIME, DV, CL, V, NPDE, CWRES)
#> # A tibble: 779 × 7
#> ID TIME DV CL V NPDE CWRES
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0 1.22 43.4 463. 0.185 0.291
#> 2 1 0.25 12.6 43.4 463. 2.47 3.01
#> 3 1 0.5 11.2 43.4 463. 2.71 2.68
#> 4 1 0.75 17.7 43.4 463. 2.94 4.24
#> 5 1 1 21.9 43.4 463. 2.94 5.25
#> 6 1 1.25 16.2 43.4 463. 2.94 3.88
#> 7 1 1.5 18.7 43.4 463. 2.94 4.47
#> 8 1 1.75 23.8 43.4 463. 2.94 5.69
#> 9 1 2 16.5 43.4 463. 2.94 3.94
#> 10 1 2.25 17.3 43.4 463. 2.94 4.15
#> # ℹ 769 more rows
Much of the benefit of bbr is leveraged in the model iteration workflow, and in the run log summaries that can be created afterwards. For example, imagine you look at these model results and want to begin iterating on them with a new model.
If you are now in a new R session and no longer have your mod1 object in memory, you can easily rebuild it from the YAML file on disk with read_model():
mod1 <- read_model(file.path(MODEL_DIR, 1))
Now you can create a new model object, based on the original, copying and renaming the control stream in the process. By default, the copy_model_from() function will increment the name of the new model to the next available integer in your modeling directory, but you can also pass a custom name to the .new_model argument. See ?copy_model_from for more details and other optional arguments.
The copy_model_from() call below will create both 2.ctl and 2.yaml files in the same directory as the parent model, and return the model object corresponding to them. (copy_model_from() also stores the model’s “ancestry”, which can be useful later in the project, as shown in the Using the based_on field vignette.)
mod2 <- copy_model_from(mod1) %>%
update_model_id()
#> replacing 1 with 2 in /data/home/barrettk/.cache/R/renv/library/bbr-e68b8766/R-4.1/x86_64-pc-linux-gnu/bbr/model/nonmem/basic/2.ctl
The new control stream file 2.ctl will be identical to its parent model, though you can use update_model_id() (as above) to update to the new model ID in places like $TABLE statements.
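The idea behind this can be sketched as a targeted find-and-replace of the standalone parent ID in the control stream text. This is a rough base-R illustration, not bbr's implementation:

```r
# rough sketch (not bbr's implementation): swap the standalone parent ID "1"
# for the new ID "2" inside a $TABLE statement
ctl_line <- "$TABLE NUM TIME DV FILE=1.tab NOPRINT"
gsub("\\b1\\b", "2", ctl_line)
```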
Your new control stream can now be edited with any desired changes and then submitted and summarized exactly as above. At any time, you can also see the difference between a model and its parent using model_diff().
# manually edit control stream, then...
model_diff(mod2)
mod2 %>% submit_model()
mod2 %>% model_summary()
After looking at these results, the user can add tags, which can later be used to organize your modeling runs.
Note that using free text for your tags is discouraged, for reasons mentioned in the ?modify_tags help page. For simplicity’s sake, we ignore that advice here, but please read it before using tags extensively in the wild.
In addition to tags, the user can add notes that can be referenced later.
The model_diff() and tags_diff() functions make it easy to compare models to their parent model. See the “Using based_on” vignette for examples.
Now the iteration process continues with a third model. Note that you can tell copy_model_from() to inherit the tags from the parent model and automatically add them to the new model.
mod3 <- copy_model_from(mod2, .inherit_tags = TRUE)
Submit and summarize as before.
# manually edit control stream, then...
mod3 %>% submit_model()
mod3 %>% model_summary()
Add tags and notes for filtering in run log, described next.
As suggested above, the typical iteration process involves using copy_model_from() to create a new model object, followed by manually editing the control stream file. However, in some instances it can be valuable to set the initial parameter estimates of a model using the final estimates of a previously executed model. Often this is used to carry forward the final estimates of a “parent” model to be used as the initial estimates of the “child” model; for example, a model created with copy_model_from().
This is supported via inherit_param_estimates(), which automates overwriting the control stream file of your new model with the final parameter estimates from the previously executed parent model. If no parent model is specified, it will default to using the first entry in get_based_on(.mod). You also have the option to keep or discard existing bounds when setting the initial estimates for THETA records specifically.
mod4 <- copy_model_from(mod1, "4") %>%
inherit_param_estimates(bounds = "keep", digits = 3) %>%
add_notes("Initial estimates inherited from mod1 results")
You can also tweak or ‘jitter’ the initial parameter estimates using tweak_initial_estimates(). This can be used either in conjunction with inherit_param_estimates() or as a standalone call:
mod5 <- copy_model_from(mod1, "5") %>%
inherit_param_estimates(digits = 3) %>%
tweak_initial_estimates(digits = 3, .p = 0.2)
mod6 <- copy_model_from(mod1, "6") %>%
tweak_initial_estimates(digits = 3, .p = 0.2)
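To give a feel for what a percentage “jitter” with .p = 0.2 means, here is a minimal base-R sketch; tweak_initial_estimates() has its own implementation, and jitter_est() below is only a hypothetical illustration:

```r
# hypothetical sketch of a percentage jitter -- not bbr's actual code:
# perturb each estimate by a random amount up to +/- p (20% here)
set.seed(1234)
jitter_est <- function(est, p = 0.2) est * (1 + runif(length(est), -p, p))

theta <- c(2, 3, 10)
jitter_est(theta)  # each value moves by at most +/- 20% of itself
```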
At any point, the user can easily construct a “run log” tibble to summarize all models run up to this point.
Before we move on, note that you can get even more information about your models from the config_log() and summary_log() functions, as well as add_config() and add_summary(), which automatically join the columns output from those functions against the tibble output from run_log(). See the “Further Reading” section below for links to vignettes demonstrating those functions.
log_df <- run_log(MODEL_DIR)
log_df
#> # A tibble: 6 × 10
#> absolute_model_path run yaml_md5 model_type description bbi_args
#> <chr> <chr> <chr> <chr> <chr> <list>
#> 1 /data/home/barrettk/.cache… 1 3ff1756… nonmem original a… <named list>
#> 2 /data/home/barrettk/.cache… 2 41f8d47… nonmem NA <named list>
#> 3 /data/home/barrettk/.cache… 3 03a1f9d… nonmem NA <named list>
#> 4 /data/home/barrettk/.cache… 4 09bed41… nonmem NA <named list>
#> 5 /data/home/barrettk/.cache… 5 36d4841… nonmem NA <named list>
#> 6 /data/home/barrettk/.cache… 6 36d4841… nonmem NA <named list>
#> # ℹ 4 more variables: based_on <list>, tags <list>, notes <list>, star <lgl>
The run_log() returns a tibble which can be manipulated like any other tibble. However, several of the columns (tags, notes, and based_on, for example) are list columns, which complicates how you can interact with them. We provide some helper functions to more seamlessly interact with these log tibbles, as well as some sample tidyverse code below.
The bbr::collapse_to_string() function can collapse any list column into a string representation of its contents. It is specifically designed for collapsing columns like tags, notes, and based_on into a more human-readable format.
log_df %>%
collapse_to_string(tags, notes) %>%
select(run, tags, notes)
#> # A tibble: 6 × 3
#> run tags notes
#> <chr> <chr> <chr>
#> 1 1 acop tag, other tag, orig acop model NA
#> 2 2 2 compartment 2 compartment model more appropria…
#> 3 3 2 compartment, combined RUV, iiv CL Added combined error structure bec…
#> 4 4 NA Initial estimates inherited from m…
#> 5 5 NA NA
#> 6 6 NA NA
All *_log() functions have an .include argument that will filter the resulting log. Only models that contain a tag or name (run) that matches an element of the vector passed to .include will be included.
run_log(MODEL_DIR, .include = "2 compartment") %>%
collapse_to_string(tags, notes) %>%
select(run, tags, notes)
#> # A tibble: 2 × 3
#> run tags notes
#> <chr> <chr> <chr>
#> 1 2 2 compartment 2 compartment model more appropriat…
#> 2 3 2 compartment, combined RUV, iiv CL Added combined error structure beca…
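As a standalone illustration of why list columns need special handling, here is a toy example in plain base R (log_toy is made-up data, not real bbr output) that mimics collapsing tags to strings and filtering on a tag:

```r
# toy log with a list column of tags (made-up data, not bbr output)
log_toy <- data.frame(run = c("1", "2", "3"), stringsAsFactors = FALSE)
log_toy$tags <- list(c("acop tag", "orig acop model"),
                     "2 compartment",
                     c("2 compartment", "combined RUV"))

# collapse each tag vector to a single string, like collapse_to_string()
vapply(log_toy$tags, paste, character(1), collapse = ", ")

# keep only runs carrying a given tag, like the .include filter
log_toy$run[vapply(log_toy$tags, function(x) "2 compartment" %in% x, logical(1))]
```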
Hopefully this has given you a good start on understanding the capabilities and basic workflow of bbr. Please see the other vignettes for demonstrations of more advanced or specific functionality.
Using the based_on field: using the based_on field to track a model’s ancestry through the model development process, as well as using model_diff() and tags_diff() to compare models. Also demonstrates leveraging config_log() to check whether older models are still up-to-date.

Creating a Model Summary Log: using summary_log() to extract model diagnostics like the objective function value, condition number, and parameter counts.