Summarize input data to prepare for passing to plot_forest(). Takes a
data.frame or tibble, calculates the relevant confidence intervals, and
returns a tibble that can be passed directly to plot_forest(). See the Details
section for data specification and format.
A dataframe or tibble to summarize. See Details section for required format.
name of the column in data to perform calculations on (i.e. the median/mean and the lower and upper CI bounds are computed from this column)
name of the column in data that defines groups within the data. Often, this will contain the names of the covariates you are grouping by.
(optional) name of the column in data that contains subgroups to group by. For example, if your group column contains covariates like WEIGHT and AGE, this column could contain categories like underweight, average, overweight, young, mid, elderly, etc.
(optional) name of the column in data that contains metagroups. Similar to facet wrap, if passed, this will cause plot_forest() to produce independent plots per metagroup.
(optional) name of the column in data that contains an index of replicates, for example with multiple simulations or bootstrapping. If specified, plot_forest() will draw additional CIs of the individual statistics, as small lines above each primary line.
numeric vector of length two, both between 0 and 1, corresponding to your lower and upper tail probabilities. Defaults to c(0.05, 0.95).
the statistic to output as the summarized value (e.g. median or mean)
same as probs, but used only when replicate is passed, for the minor intervals (i.e. the small lines) above the major interval (i.e. the big lines).
same as statistic, but used only when replicate is passed, for the minor intervals (i.e. the small lines) above the major interval (i.e. the big lines).
Input Data
The tibble passed to data must be in "long" format and have two to five
columns: value, group, and optionally any of group_level, metagroup,
and/or replicate. Each of these is described in detail in the input arguments
section.
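As an illustration of the long format described above, the following base R sketch builds a small input with one row per simulated value. The object name sim_data, the column contents, and the number of rows are made up for this example; only the column roles (value, group, group_level, replicate) come from the description above.

```r
# Hypothetical long-format input: one row per simulated value.
set.seed(1)
sim_data <- data.frame(
  value       = rnorm(600, mean = 1, sd = 0.2),   # quantity to summarize
  group       = rep(c("WEIGHT", "AGE"), each = 300),  # covariate names
  group_level = c(                                 # categories within each covariate
    rep(c("underweight", "average", "overweight"), each = 100),
    rep(c("young", "mid", "elderly"), each = 100)
  ),
  replicate   = rep(1:10, times = 60),             # replicate index (optional column)
  stringsAsFactors = FALSE
)

# Inspect the structure: 600 rows, 4 columns
str(sim_data)
```

Each column name would then be supplied to the corresponding argument. A metagroup column is omitted here because it is optional.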
Output Data
The tibble output from this function has one of two formats, depending on whether
replicate was passed (details below).
Either way, the output tibble has a column named group, containing the
values in the column you passed to the group argument, and optionally analogous columns
for group_level and metagroup if those were passed.
Without replicate
If replicate is not passed, the output data has three
additional columns, mid, lo, and hi, containing the summarized values corresponding
to what was passed to statistic (mid) and probs (lo/hi).
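The mid, lo, and hi columns can be pictured with a minimal base R sketch. This is not the package's implementation: summarize_sketch() and the toy data are hypothetical, and the sketch assumes that lo and hi are plain quantiles of value at the two probs, computed within each group and group_level.

```r
# Hypothetical helper, NOT the package's code: one summary row per
# group / group_level combination.
summarize_sketch <- function(data, probs = c(0.05, 0.95), statistic = median) {
  keys <- unique(data[, c("group", "group_level")])
  rows <- lapply(seq_len(nrow(keys)), function(i) {
    # values belonging to this group / group_level combination
    v <- data$value[data$group == keys$group[i] &
                      data$group_level == keys$group_level[i]]
    data.frame(
      group       = keys$group[i],
      group_level = keys$group_level[i],
      mid         = statistic(v),                  # central statistic
      lo          = unname(quantile(v, probs[1])), # lower tail
      hi          = unname(quantile(v, probs[2])), # upper tail
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Toy data, made up for illustration
toy <- data.frame(
  value       = c(1:10, 11:20),
  group       = "WEIGHT",
  group_level = rep(c("underweight", "overweight"), each = 10),
  stringsAsFactors = FALSE
)

out <- summarize_sketch(toy)
print(out)
```

The result has one row per subgroup, with lo <= mid <= hi in each row.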
With replicate
If replicate is passed, the output data has
nine additional columns: mid_mid, mid_lo, mid_hi, plus three more each
for lo_* and hi_*, containing the summarized values. In this case,
mid_mid, lo_mid, and hi_mid correspond to the values of the major
interval (i.e. the big lines and data point), and the *_mid, *_lo, and
*_hi columns for each prefix correspond to the values for each minor interval (i.e. the small
lines).
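One plausible reading of the nine columns is a two-stage summary: first compute mid, lo, and hi within each replicate, then summarize each of those three across replicates. The base R sketch below follows that reading for a single group. It is an assumption about the approach, not the package's code; the toy data, the rep_probs values of c(0.025, 0.975), and the use of the median at both stages are chosen only for illustration.

```r
# Toy data, made up for illustration: 50 replicates of 10 values each
set.seed(42)
toy <- data.frame(
  value     = rnorm(500),
  group     = "WEIGHT",
  replicate = rep(1:50, each = 10),
  stringsAsFactors = FALSE
)

probs     <- c(0.05, 0.95)    # tail probabilities within a replicate
rep_probs <- c(0.025, 0.975)  # assumed tail probabilities across replicates

# Stage 1: mid / lo / hi within each replicate (a 50 x 3 matrix)
per_rep <- do.call(rbind, lapply(split(toy$value, toy$replicate), function(v) {
  c(mid = median(v),
    lo  = unname(quantile(v, probs[1])),
    hi  = unname(quantile(v, probs[2])))
}))

# Stage 2: summarize each stage-1 statistic across replicates
summary_row <- unlist(lapply(colnames(per_rep), function(stat) {
  s <- per_rep[, stat]
  res <- c(median(s), quantile(s, rep_probs[1]), quantile(s, rep_probs[2]))
  names(res) <- paste(stat, c("mid", "lo", "hi"), sep = "_")
  res
}))

print(round(summary_row, 3))
```

Under this reading, mid_mid, lo_mid, and hi_mid locate the point and the ends of the major interval, while each *_lo and *_hi pair bounds the small line drawn above the matching value.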