NEWS.md
explain
function was added to the package. (#220)write_derived
is now protected against commas causing issues with history.csv. (#218)write_derived()
no longer returns unnecessary information to the R console. (#205)
read_src_dir()
now saves the md5 sum of each source file. (#204)
write_derived()
was reporting differences in columns where there were none.(#208)write_derived()
now checks that the given subject column exists in the data. (#194)
write_derived()
now performs a spec check before writing out data. (#196)
query_src_dir
to query_src_list
, and it now takes a list as an argument. (#182)assign_id()
is now protected against grouped data.frames. (#188)
Fixed write_derived()
warning message when creating history.csv
. (#186)
renamed function view_src()
to v()
, and function received a styling overhaul. (#141)
execute_data_diffs
was made to run faster. (#142)
nm_validate()
and nm_write()
was removed from the package. (#156)
write_derived()
no longer writes out the subject level csv in the meta data folder. (#158)
added assigning-id
vignette. (#160)
src_viz()
app bug fixes and improvements with handling subject columns. (#141)
assign_id()
now assigns ID’s based on maximum ID in previous data set, even if no subjects overlap. (#158)
execute_data_diffs()
Since the NUM column is excluded from subject level diffs, it needed to be removed from the list of columns used for the subject level diffs. (#138)add check-src-duplicates.R
function to the package to perform source data duplicate checks. (#113)
add check-src-missing-datetime.R
function to the package to check for missing values. (#113)
add check-src.R
to run all source data checks. (#113)
add src-viz.R
function to the package to view all source domains in a shiny app. (#120)
write-derived.R
now always explicitly exports the HEAD version from svn for data diffs. (#126)
write-derived.R
now arranges subject-level data by numerical ID. (#126)
write-derived.R
now always writes out subject level diffs even if they are empty. (#135)
execute-data-diffs.R
now returns a list of diffs and does not write out. (#135)
assign-id.R
prints a simplified message. (#136)
write-derived.R
The printed message for columns “added” or “removed” is now correct. (#126)assign_id()
now does not create duplicate records. (#110)nm_write()
N IDs diff now shows number of IDs that changed. (#108)assign_id()
now only outputs console message when new ID’s present. (#104)
read_src_dir()
no longer outputs subject column domain. (#106)
query_src_dir()
returns a data.frame now instead of a view of the output. (#106)
write_derived()
now shows number of IDs change now displayed in diff. (#102)write_derived()
now allows users to turn off executing diffs. (#100)write_derived()
fixed error created by different classes in ID diff. (#100)Removed nm_summary()
. (#81)
nm_write()
now pulls the base data to compare to from svn. (#83)
nm_write()
now runs additional diff checks at the ID level. (#83)
Add get_data_version()
to assist with data versioning. (#87)
Add write_derived()
to write out csv files along with meta information about the data. (#87)
view_mrgda_flags()
was removed from the package. (#87)
write_derived()
print messages were cleaned up. (#92, #97)
assign_id()
function added to package. (#93)
read_src_dir()
fixed error message when source domains fail to read in. (#81)nm_write()
no longer renders a define document for the data specification. (#75)Added view_mrgda_flags()
to allow the user to view a summary of how mrgda flags in the specification file are assigned to the data set. (#67)
read_src_dir()
now outputs the name and size of the current file its reading in. (#67)
query_src_dir()
now allows the user to specify file types. (#67)
nm_summary()
now will only run if all required flags are available. (#67)
nm_validate()
has stricter requirements for the data it needs defined to run each check. (#67)
read_src_dir()
Allows non-detect method for file type discovery. (#67)
query_src_dir()
is a new function that searches through the data in a source directory for a string or pattern. (#64)
distinct_subject_columns()
is a new function that takes a data frame and subject column identifier and returns the columns in which the values are unique for every subject. (#64)
nm_write()
the experimental feature of including the data assembly source script in the meta data folder has been removed. (#64)
read_src_dir()
now returns an additional data object - a data frame containing the column name and labels from every domain. (#64)
nm_validate()
now checks if MDV is set to 1 when DV is either NA or 0. (#55)
nm_write()
now determines and saves out the names of other analysis that depend on the derived data. (#55)
nm_write()
now outputs the source script in the meta data folder. (#55)
read_src_dir()
added a .read_domains
argument to allow users to only read in specific domains. If not specified, default is to read all in. (#55)
read_src_dir()
added a .subject_col
argument to allow users to specify the name of the unique subject identifier column in the source data. (#55)
Added view_src_dir_summary()
to allow the user to view all the domains and file sizes from their source data directory. (#55)
nm_write()
now allows for special characters in the file name. (#46)read_src_dir()
now allows for source directories in which all files have no USUBJID. (#42)User can now omit tests from being run in nm_validate()
. Optional argument provided to do this. (#38)
nm_validate()
and nm_summary()
now utilize a dictionary of column names found here system.file("package-data", "recognized-flags.csv", package = "mrgda")
. The functions can extract information from the data set without the user explicitly needing to define these columns. (#38)
view_sdtm_domains()
allows the user to view a dictionary of common SDTM domains and a description of the data found within them. (#38)
read_src_dir()
reads in all .csv, .xpt or .sas7bdat files from a source data directory into a nested list containing the data from each domain.(#38)
read_src_dir()
creates an additional data set informing the user if a subject has data in each domain. The resulting data.frame has one row per subject with a TRUE/FALSE column indicating if they are in a specific domain. (#38)
nm_write()
writes out derived data to a .csv file and creates a folder matching the name of the derived data, containing meta data from the assembly. (#31, #36, #38)
nm_validate()
added 4 new pass/fail checks. These include checking for: non-finite times, MDV not set to 1 when DV is NA, all NUM values being unique and AMT being equal to RATE times DUR. (#38)
nm_summary()
now opens an html document in a new tab containing both tables and figures. The figures are now interactive, allowing the user to hover over data points and view subject level information. (#30)
nm_summary()
added a table showing the distribution of BLQ values in the data. Additionally, new spaghetti plots were added to view time-varying covariates. (#30)
nm_summary()
now allows the user to decide if the tables and boxplots are to be stratified by study or not. (#30)
mutate_egfr()
was removed from the package. (#38)
nm_validate()
could not run when a data set did not have the needed data columns to run a specific check. Protections were added that allow tests to be skipped if the required data columns are not present. (#38)nm_validate
did not always print the corresponding test name alongside the debug code. (#16)Output of nm_validate()
now returns code using the users given arguments when errors are found in the data. The code helps the user to debug the issue. (#12)
Two checks in nm_validate()
were combined. The missing time varying covariate and baseline covariate checks are now combined as a missing covariate check. (#12)
Baseline and time-varying covariate flag names were updated to make them more easily readable. The updated flag names are available in the README. (#9)
mutate_egfr()
was added as a new function to assist users calculating estimated glomerular filtration rate during data assemblies (#8)
readr
and withr
were moved to suggests from imports. assertr
was removed as a dependency. (#4, #12)
nm_summary()
did not create readable continuous and categorical covariate figures when a large number of covariates were present. (#2)mrgda
initial aim is to assist users in verifying the accuracy of data sets intended for NONMEM, following their derivation.
The main feature, nm_validate()
asks the user to provide the derived data set and spec file as inputs. The function identifies columns in the data set using the spec file and runs a series of validation tests. It outputs a pass/fail for each test.
A supporting feature is nm_summary()
. Its purpose is to provide the user with quick visual summaries of their data. It outputs these summaries in either tables or figures, depending on the users preference.