This page provides an overview of the basic structure of Stan models in
bbr
. The main entry point for interacting with Stan models is the
bbi_stan_model
object. With it, you can create a new model on disk from
"scaffold" files, copy a new model from an existing one, jump to model files
of interest, and submit models. The Details section contains information
about the model structure and the necessary files that will exist on disk for
any bbi_stan_model
.
<run>
- The "run" is, in some sense, the "name" of a given model.
Practically, it will correspond to the model directory name, the base name of
the bbr-created YAML (<run>.yaml
), as well as the base name for some files
in that directory. Calling bbr::get_model_id()
on a model object will
return <run>
as a string. The bbi_log_df
tibbles also all contain a run
column which is populated by calling basename(.mod$absolute_model_path)
for
each model. Note: this is not actually stored in the model object because
it can be unequivocally extracted as just described.
absolute_model_path
- Like the bbi_nonmem_model
object, the
bbi_stan_model
object will carry around only an absolute path to the model
directory. This will point to the model directory (named <run>
) containing
all of the files described below, as well as a <run>.yaml
file that bbr
uses to persist model metadata. A model is loaded or created by passing a
relative path to this directory to either bbr::read_model()
or
bbr::new_model()
, both of which return the bbi_stan_model
object. When
this object is created, it checks the model directory for the relevant files
and populates absolute_model_path
.
All of the files described below will exist inside the model directory named
<run>
. If you call new_model(..., .model_type = "stan")
without any of
these files, template "scaffold" files for all of them will be created in the
newly created model directory.
<run>.stan
- The Stan file.
<run>-stanargs.R
- Contains a named list with all of the arguments that
will be passed through to the $sample()
method of cmdstanr::CmdStanModel. See set_stanargs()
for details on
modifying.
<run>-standata.R
- Contains all necessary R code to read in
any source data and transform them to a Stan-ready data object (list).
Contains only one function, called make_standata(.dir)
, that takes a
single argument and returns the data list to pass to the
CmdStanModel$sample().
The .dir
argument will be the directory containing the script. This is
used to find data files for loading, for example
read_csv(file.path(.dir, "..", "..", "data", "derived", "my_data.csv"))
Can be called (by build_data()
) to generate the data for model
submission or to compare the resulting data to previously saved data on
disk.
Note that make_standata
will be evaluated in the parent environment of
your global session, giving it access to all other environments on your
search path. This means that you don't need to prefix function calls
with the package name (e.g., here::here()
), but doing so is recommended
so that make_standata
doesn't depend on your search path state. As an
exception, you may be comfortable leaving base packages unqualified
(e.g., rnorm()
rather than stats::rnorm()
) because users are unlikely
to remove package:stats
from their search path or to attach a package
that overrides rnorm()
.
<run>-init.R
- This file contains all necessary R code to create the
initial values passed to the cmdstanr's
$sample() method. This file is a lot like
<run>-standata.R
(discussed above) and a scaffold can be created with
add_staninit_file()
.
Contains only one function, called make_init(.data)
, that takes a
single argument and returns something that can be passed to the init
argument of $sample()
. There are several options; see the
$sample() documentation for details.
The object returned from make_standata()
will be passed to the .data
argument of make_init()
.
Will be called internally by bbr
and the result passed as the init
argument to $sample()
.
See the make_standata
entry above for details on the evaluation
environment.
Note that $sample()
supports passing
"A function that returns a single list...". If you intend to use this
option, your make_init()
function must return the function described,
not the "single list...".
Note that this file will not be included when you're defining a model for standalone generated quantities. See "Standalone Generated Quantities" section below for more information.
There will be several other things created in the model directory, as the model is run or as it prepares to run.
<run>
- This is the binary file created when the <run>.stan
file is
compiled by cmdstan
. We .gitignore
this automatically.
<run>-output
- This directory is created by bbr
. It is where the
posteriors will be saved (currently as CSV’s) and also where the
bbi_config.json
is saved when the model run finishes successfully. Note
that we don’t call this <run>
(as is done in NONMEM) for two primary
reasons:
It is more informative to call it <run>-output
to distinguish it from
all the other files and directories that start with <run>
.
There is also the binary called <run>
(previously mentioned) that could
cause confusion. In fact, there was a bug in cmdstanr
in February 2021
involving exactly this scenario.
<run>-output/bbi_config.json
- This file is created by bbr
when a model
run finishes successfully. It stores some configuration information about the
run, as well as the md5 hashes of the necessary files. These hashes are later
used (by bbr::check_up_to_date()
to check whether the files have changed
since the model was run, primarily for reproducibility purposes.
check_stan_model()
(mentioned above) - Checks for the necessary files
before running or copying the model. By default, it also checks the syntax
of the <run>.stan
file.
bbr::build_path_from_model()
- Builds the absolute path to a file in
the model folder from a model object and a suffix.
add_stanmod_file()
, add_standata_file()
, add_staninit_file()
,
add_stan_fitted_params_file()
- Helpers for adding one of the necessary
files to the model folder.
open_stanmod_file()
, open_standata_file()
, open_staninit_file()
,
open_stan_fitted_params_file()
- Helpers for opening files within the
model directory.
bbr::model_diff()
- Compare necessary files between two models.
Defaults to comparing <run>.stan
files.
Also has many of the same helpers as bbi_nonmem_model
objects:
bbr::tags_diff()
, bbr::add_tags()
, bbr::add_notes()
,
bbr::get_model_path()
, bbr::get_output_dir()
, bbr::get_model_id()
Stan supports generating quantities of interest from existing posterior samples (see Stan user's guide). cmdstanr exposes this through the $generate_quantities() of cmdstanr::CmdStanModel.
Note: The information below applies to standalone generated quantities. If the model defines generated quantities that are produced at the same time as the MCMC samples, the model will have the structure defined above.
In bbr
, models for standalone generated quantities are defined via the
bbi_stan_gq_model
object, a subclass of bbi_stan_model
. On the file
system, these models look very similar to regular Stan models, with the
following differences:
the "model_type" value in the model YAML is "stan_gq" instead of "stan"
there is no <run>-init.R
file;
$generate_quantities() does
not have an init
argument.
there is a <run>-fitted-params.R
file. This file must define a function,
make_fitted_params
, that takes a single argument, the model object. The
function can return any value accepted for the fitted_params
argument of
$generate_quantities().
See the make_standata
entry above for details on the evaluation
environment.
"stan_gq" models can be created fresh with new_model(..., .model_type = "stan_gq")
. However, for the more common case where the "stan_gq" model is
derived from an existing "stan" model, you can use the
copy_model_as_stan_gq()
helper, which takes care of copying over the
relevant files, adding a "gq_parent" field to the model's YAML file that
points back to the parent model, and setting up a default
<run>-fitted-params.R
that returns the paths to the parent model's
posteriors.
The "gq_parent" field of "stan_gq" models links to the "stan" model whose
samples are used as input. The default <run>-fitted-params.R
uses this
value to retrieve the previous fit, and
check_up_to_date considers it when deciding
whether a model is up to date.
In the most common case, the "gq_parent" value will be automatically set up
by copy_model_as_stan_gq()
. However, you may want to manually set this to
multiple values (e.g., with add_stan_gq_parent()
) for cases where the
fitted parameters are coming from multiple models. The field may also be
absent, which is appropriate for cases where the fitted parameters are not
coming from a previous model.
To run a "stan_gq" model, pass the bbi_stan_gq_model
object to
submit_model(), which will use the model files to
construct a call to
CmdStanModel$generate_quantities().