This document shows how data set column definitions can be entered into a lookup file which can be accessed by multiple data specification files within a project. This document also discusses an internal lookup data base that is always available for individual data sets to look up standardized column information for commonly used data items in our workflow.
The lookup file that is to be accessed by other data specification files is just another data specification file. For example, create a file called lookup.yml and enter this information:
# in file lookup.yml
AMT:
  short: dose amount
  unit: nmol
  type: numeric
AMTMG:
  short: dose amount
  unit: mg
  type: numeric
WT:
  short: patient weight
  unit: lbs
This information must be in valid yspec data specification format and (generally) valid YAML.
The working specification that uses the lookup is just a standard data specification file:
SETUP__:
  description: PKPD analysis data set
  lookup_file: lookup.yml
C:
  short: comment character
AMT: !look
Notice two things about this file. First, we included a lookup_file field in the SETUP__ section and pointed it at our lookup.yml file. By default, yspec expects that the lookup file is in the same directory as the spec file. Second, in the AMT column, we used the !look handler to indicate that we wanted that data to be looked up.
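The setup above can be tried end to end with the sketch below. It assumes the yspec package is installed and that ys_load() is used to read the spec (that call is not shown in this document); the temporary directory and the file name spec.yml are illustrative only.

```r
library(yspec)

# Work in a throwaway directory so the spec and lookup file sit side by side
dir <- tempfile("yspec-lookup-")
dir.create(dir)

# The lookup file: an ordinary data specification file
writeLines(c(
  "AMT:",
  "  short: dose amount",
  "  unit: nmol",
  "  type: numeric",
  "AMTMG:",
  "  short: dose amount",
  "  unit: mg",
  "  type: numeric"
), file.path(dir, "lookup.yml"))

# The working spec: names the lookup file and asks for AMT via !look
writeLines(c(
  "SETUP__:",
  "  description: PKPD analysis data set",
  "  lookup_file: lookup.yml",
  "C:",
  "  short: comment character",
  "AMT: !look"
), file.path(dir, "spec.yml"))

spec <- ys_load(file.path(dir, "spec.yml"))

# The unit for AMT should be the one defined in lookup.yml
amt_unit <- spec[["AMT"]][["unit"]]
print(amt_unit)
```

Writing both files into the same directory mirrors the default behavior described above, where the lookup file is expected next to the spec file.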
Alternatively, we could leave the column definition empty, and yspec will assume that we want to look up that data:
SETUP__:
  description: PKPD analysis data set
  lookup_file: lookup.yml
C:
  short: comment character
AMT:
Finally, we can import a column from the lookup file under a new name in the working spec:
SETUP__:
  description: PKPD analysis data set
  lookup_file: lookup.yml
C:
  short: comment character
AMT:
  lookup: AMTMG
In this snippet, we are asking for the AMTMG column from the lookup file and bringing it in as AMT in the working spec.
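As a rough check of the renamed import, the sketch below loads a spec in which AMT is filled from the AMTMG lookup entry. It assumes the yspec package is installed and that ys_load() reads the spec; the temporary directory and file names are illustrative.

```r
library(yspec)

dir <- tempfile("yspec-rename-")
dir.create(dir)

# Lookup file with two dose amount definitions that differ in unit
writeLines(c(
  "AMT:",
  "  short: dose amount",
  "  unit: nmol",
  "  type: numeric",
  "AMTMG:",
  "  short: dose amount",
  "  unit: mg",
  "  type: numeric"
), file.path(dir, "lookup.yml"))

# Working spec: the column is called AMT but points at AMTMG in the lookup
writeLines(c(
  "SETUP__:",
  "  description: PKPD analysis data set",
  "  lookup_file: lookup.yml",
  "C:",
  "  short: comment character",
  "AMT:",
  "  lookup: AMTMG"
), file.path(dir, "spec.yml"))

spec <- ys_load(file.path(dir, "spec.yml"))

# AMT should carry the unit of the AMTMG entry rather than the AMT entry
renamed_unit <- spec[["AMT"]][["unit"]]
print(renamed_unit)
```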
There is an internal data base of common data set columns that yspec can attach to a specification. So, with no lookup file defined and use_internal_db: true set in the SETUP__ section, we could write the following in our specification file:
SETUP__:
  description: PKPD analysis data set
  use_internal_db: true
C:
AMT:
MDV:
EVID:
WT:
EGFR:
ALB: !look
ZIP_CODE:
  values: 55378
We can read this specification in and find the columns already defined:
## name     info unit          short             source
## C        cd-  .             comment character ysdb_internal
## AMT      ---  .             dose amount       ysdb_internal
## MDV      -d-  .             MDV               ysdb_internal
## EVID     -d-  .             event ID          ysdb_internal
## WT       ---  kg            weight            ysdb_internal
## EGFR     ---  ml/min/1.73m2 eGFR              ysdb_internal
## ALB      ---  g/dL          albumin           ysdb_internal
## ZIP_CODE ---  .             ZIP_CODE          .
It can get confusing to keep track of where each column is coming from. You can audit the spec object to find out where each lookup happened:
ys_lookup_source(spec)
## # A tibble: 8 × 2
## col lookup_source
## <chr> <chr>
## 1 C ysdb_internal.yml
## 2 AMT ysdb_internal.yml
## 3 MDV ysdb_internal.yml
## 4 EVID ysdb_internal.yml
## 5 WT ysdb_internal.yml
## 6 EGFR ysdb_internal.yml
## 7 ALB ysdb_internal.yml
## 8 ZIP_CODE spec.yml
Here, we can see that most of the columns came from the internal data base and that one column (ZIP_CODE) came from our own specification.
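When a spec has many columns, it may help to reduce the audit table to the columns that were not filled from the internal data base. The sketch below assumes the yspec package is installed, that ys_load() reads the spec, and that ys_lookup_source() returns the col and lookup_source columns shown above; the shortened spec and the temporary file location are illustrative.

```r
library(yspec)

dir <- tempfile("yspec-audit-")
dir.create(dir)

# A shortened version of the spec above, using the internal data base
writeLines(c(
  "SETUP__:",
  "  description: PKPD analysis data set",
  "  use_internal_db: true",
  "C:",
  "AMT:",
  "WT:",
  "ZIP_CODE:",
  "  values: 55378"
), file.path(dir, "spec.yml"))

spec <- ys_load(file.path(dir, "spec.yml"))

# One row per column, with the file that supplied its definition
src <- ys_lookup_source(spec)

# Keep only the columns that did not come from the internal data base
local_cols <- src$col[!(src$lookup_source %in% "ysdb_internal.yml")]
print(local_cols)
```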
You can also re-create the lookup object (just a named list) for a specification object:
ys_get_lookup(spec) %>% glimpse()
## List of 23
## $ C :List of 6
## ..$ short : chr "comment character"
## ..$ values : chr [1:2] "." "C"
## ..$ decode : chr [1:2] "analysis row" "commented row"
## ..$ type : chr "character"
## ..$ col : chr "C"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ ID :List of 4
## ..$ short : chr "subject identifier"
## ..$ type : chr "numeric"
## ..$ col : chr "ID"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ USUBJID:List of 4
## ..$ short : chr "unique subject identifier"
## ..$ type : chr "character"
## ..$ col : chr "USUBJID"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ SUBJ :List of 4
## ..$ short : chr "subject identifier"
## ..$ type : chr "character"
## ..$ col : chr "SUBJ"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ STUDYID:List of 4
## ..$ short : chr "study identifier"
## ..$ type : chr "character"
## ..$ col : chr "STUDYID"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ CMT :List of 4
## ..$ short : chr "compartment number"
## ..$ type : chr "numeric"
## ..$ col : chr "CMT"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ EVID :List of 4
## ..$ short : chr "event ID"
## ..$ values :List of 2
## .. ..$ observation: int 0
## .. ..$ dose : int 1
## ..$ col : chr "EVID"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ AMT :List of 4
## ..$ short : chr "dose amount"
## ..$ type : chr "numeric"
## ..$ col : chr "AMT"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ RATE :List of 4
## ..$ short : chr "infusion rate"
## ..$ type : chr "numeric"
## ..$ col : chr "RATE"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ II :List of 4
## ..$ short : chr "inter-dose interval"
## ..$ type : chr "numeric"
## ..$ col : chr "II"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ SS :List of 5
## ..$ short : chr "steady state indicator"
## ..$ values : int [1:2] 0 1
## ..$ decode : chr [1:2] "non-steady state indicator" "steady state indicator"
## ..$ col : chr "SS"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ MDV :List of 6
## ..$ values :List of 2
## .. ..$ non-missing: int 0
## .. ..$ missing : int 1
## ..$ type : chr "numeric"
## ..$ long : chr "missing DV indicator"
## ..$ comment : chr "per NONMEM specifications"
## ..$ col : chr "MDV"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ DV :List of 4
## ..$ short : chr "dependent variable"
## ..$ type : chr "numeric"
## ..$ col : chr "DV"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ WT :List of 5
## ..$ short : chr "weight"
## ..$ unit : chr "kg"
## ..$ type : chr "numeric"
## ..$ col : chr "WT"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ EGFR :List of 5
## ..$ short : chr "eGFR"
## ..$ long : chr "estimated glomerular filtration rate"
## ..$ unit : chr "ml/min/1.73m2"
## ..$ col : chr "EGFR"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ BMI :List of 5
## ..$ long : chr "body mass index"
## ..$ unit : chr "m2/kg"
## ..$ type : chr "numeric"
## ..$ col : chr "BMI"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ HT :List of 5
## ..$ about : chr [1:2] "height" "cm"
## ..$ long : chr "Height"
## ..$ type : chr "numeric"
## ..$ col : chr "HT"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ ALB :List of 6
## ..$ long : chr "serum albumin"
## ..$ unit : chr "g/dL"
## ..$ short : chr "albumin"
## ..$ type : chr "numeric"
## ..$ col : chr "ALB"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ AGE :List of 4
## ..$ about : chr [1:2] "age" "years"
## ..$ type : chr "numeric"
## ..$ col : chr "AGE"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ SEX :List of 3
## ..$ values :List of 2
## .. ..$ male : int 0
## .. ..$ female: int 1
## ..$ col : chr "SEX"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ NUM :List of 4
## ..$ short : chr "record number"
## ..$ type : chr "numeric"
## ..$ col : chr "NUM"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ BQL :List of 5
## ..$ short : chr "data point below the LOQ"
## ..$ type : chr "numeric"
## ..$ values :List of 2
## .. ..$ 0: chr "not below quantitation limit"
## .. ..$ 1: chr "below quantitation limit"
## ..$ col : chr "BQL"
## ..$ lookup_source: chr "ysdb_internal.yml"
## $ LOQ :List of 4
## ..$ short : chr "assay limit of quantification"
## ..$ type : chr "numeric"
## ..$ col : chr "LOQ"
## ..$ lookup_source: chr "ysdb_internal.yml"
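Because the lookup object is a named list, it can be queried with ordinary list tools. The sketch below collects the unit of every lookup entry that defines one. It assumes the yspec package is installed, that ys_load() reads the spec, and that each entry has the fields shown in the output above; the minimal spec and temporary file location are illustrative.

```r
library(yspec)

dir <- tempfile("yspec-lookup-list-")
dir.create(dir)

# A minimal spec that attaches the internal data base
writeLines(c(
  "SETUP__:",
  "  description: PKPD analysis data set",
  "  use_internal_db: true",
  "C:",
  "WT:"
), file.path(dir, "spec.yml"))

spec <- ys_load(file.path(dir, "spec.yml"))

# Named list with one element per column available for lookup
lookup <- ys_get_lookup(spec)

# Keep the entries that define a unit, then pull the unit strings out
with_unit <- Filter(function(x) !is.null(x[["unit"]]), lookup)
units <- vapply(with_unit, function(x) as.character(x[["unit"]]), character(1))
print(units)
```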