EDCimport 0.6.1 (dev)
New features
- New arguments in edc_swimmerplot(): origin_fun to summarise origin at patient level, and data_list to control the datasets.
Bug fixes & Improvements
- edc_viewer(): support for multiple instances on different ports (#114).
- edc_viewer(): new button to browse all the column labels (#113).
- Fixed modifiers edc_clean_names(), edc_unify_subjid(), and edc_split_mixed() that stripped attributes like project name (#111).
- edc_data_stop() now works without a SUBJID and defaults to no issue number (#109).
- Fixed bug in edc_left_join() with case-sensitivity on SUBJID (#108).
- Improved save_edc_data_warnings() with options to hide the resolved issues and to not include stops, and better default path (#107, #110, #112).
- Improved edc_swimmerplot() by removing origin by default (#106).
- Fixed bug in assert_no_duplicate() not stopping in table with both columns SUBJID and subjid (#105).
- Improved edc_warn_extraction_date() with a strict unit “days”.
- Improved save_plotly() with a glue syntax for param file.
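The assert_no_duplicate() fix listed above concerns tables that carry both a SUBJID and a subjid column. As a rough illustration of that situation, here is a base R sketch that matches column names case-insensitively and checks each match for duplicates. It is not EDCimport's implementation: find_subjid_cols() is a hypothetical helper and the data frame is made up.

```r
# Hypothetical helper, not part of EDCimport: return every column whose
# name matches the subject ID name regardless of case.
find_subjid_cols = function(df, id = "subjid") {
  names(df)[tolower(names(df)) == tolower(id)]
}

# Made-up table with both spellings of the subject ID column.
df = data.frame(SUBJID = c(1, 2, 3), subjid = c(1, 2, 2), x = c("a", "b", "c"))
id_cols = find_subjid_cols(df)

# Check each matching column for duplicated values.
has_dup = vapply(df[id_cols], function(col) any(duplicated(col)), logical(1))
print(has_dup)
```

Checking every matching column, rather than only the first one found, is one way to avoid silently missing duplicates in the second column.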
EDCimport 0.6.0
CRAN release: 2025-06-24
Documentation
- New vignettes: vignette("reading"), vignette("postprocessing"), vignette("checking"), vignette("visualizing"), and vignette("utils")
New features
- New function edc_patient_gridplot(), which creates a ggplot matrix giving the presence of all patients in all datasets (#77)
- New functions edc_left_join(), edc_right_join(), and edc_full_join(), which perform joins with defaults to subject ID as primary key (#82)
- New function edc_viewer(), which runs a shiny application for easily browsing your database (#83)
- New function set_project_name(), to set the project name when reading from a directory (#96)
- New function edc_find_value(), which searches the whole database for a value, as edc_find_column() searches for column names or labels.
- New function save_edc_data_warnings(), to save all the warnings triggered by edc_data_warn() into a .xlsx file for sharing.
Bug fixes & Improvements
- New argument unify(collapse_chr=TRUE), to collapse non-unique character values (#99)
- New argument lastnews_table(show_delta=TRUE), which computes the difference between the last prefer date and the actual last date (#81)
  - Other improvements: allow regex in except & prefer (with regex=TRUE), improved warning message, and allow saving warnings in a csv file (#78)
- New argument edc_data_warn(envir), the environment to evaluate message in.
- New argument edc_swimmerplot(include), to subset the swimmer plot on significant variables only.
- New argument subdirectories to all reading functions (read_trialmaster(), read_all_xpt(), read_all_sas(), and read_all_csv()), to control whether to read sub-directories. Note that until now, those subdirectories were read and could overwrite root files.
- Fixed labels being sometimes duplicated.
Internal improvements
- read_trialmaster() won’t read from cache if installed EDCimport version is different from cache’s
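The version check described above can be pictured with a small base R sketch. Everything here is an assumption made for illustration: the cache layout (a list with an edcimport_version field) is hypothetical, and the installed version is hard-coded as a stand-in for what utils::packageVersion("EDCimport") would return.

```r
# Hypothetical cache object: a list that records the package version it
# was written with. EDCimport's real cache format is not shown here.
cache = list(edcimport_version = "0.5.2", datasets = list())

# Stand-in for utils::packageVersion("EDCimport").
installed = package_version("0.6.0")

# Only reuse the cache when both versions match exactly.
use_cache = package_version(cache$edcimport_version) == installed
if (!use_cache) {
  message("Cache was written by a different version: reading the archive again.")
}
```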
Deprecations
- load_list(), renamed to load_database()
- find_keyword(), renamed to edc_find_column()
Breaking changes
I don’t think enough people are using these functions to make a deprecation process necessary.
- split_mixed_datasets() becomes edc_split_mixed()
- Removed export of internal functions: build_lookup(), extend_lookup(), get_key_cols(), get_subjid_cols(), get_crfname_cols(), get_meta_cols(), load_as_list(), save_list()
EDCimport 0.5.2
CRAN release: 2024-11-14
- Fixed a bug in lastnews_table() when SUBJID is not numeric
- Fixed a bug in read_all_sas() causing metadata (e.g. date_extraction) to be converted to dataframes
EDCimport 0.5.0
CRAN release: 2024-10-24
New features
Read functions
- New function read_all_sas() to read a database of .sas7bdat files.
- New function read_all_csv() to read a database of .csv files.
Sanity checks alerts
- New functions edc_data_warn() and edc_data_stop(), to alert if data has inconsistencies (#29, #39, #43).

    ae %>% filter(grade<1 | grade>5) %>% edc_data_stop("AE of invalid grade")
    ae %>% filter(is.na(grade)) %>% edc_data_warn("Grade is missing", issue_n=13)
    #> Warning: Issue #13: Grade is missing (8 patients: #21, #28, #39, #95, #97, ...)

- New function edc_data_warnings(), to get a dataframe of all warnings thrown by edc_data_warn().
- New function edc_warn_extraction_date(), to alert if data is too old.
Miscellaneous utils
- New function select_distinct() to select all columns that have only one level for a given grouping scope (#57).
- New function edc_population_plot() to visualize which patient is in which analysis population (#56).
- New function edc_db_to_excel() to export the whole database to an Excel file, easier to browse than RStudio’s table viewer (#55). Use edc_browse_excel() to browse the file without knowing its name.
- New function edc_inform_code() to show how much code your project contains (#49).
- New function search_for_newer_data() to search a path (e.g. Downloads) for a newer data archive (#46).
- New function edc_crf_plot() to show the current database completion status (#48).
- New function save_sessioninfo(), to save sessionInfo() into a text file (#42).
- New function fct_yesno(), to easily format Yes/No columns (#19, #23, #40).
- New function lastnews_table() to find the last date any information was entered for each patient (#37). Useful for survival analyses.
- New function edc_unify_subjid(), to have the same structure for subject IDs in all the datasets of the database (#30).
- New function save_plotly(), to save a plotly to an HTML file (#15).
- New experimental functions table_format(), get_common_cols() and get_meta_cols() that might become useful to find keys to pivot or summarise data.
Bug fixes & Improvements
- get_datasets() will now work even if a dataset is named after a base function (#67).
- read_trialmaster() will output a readable error when no password is entered although one is needed.
- read_trialmaster(split_mixed="TRUE") will work as intended.
- assert_no_duplicate() now has a by argument to check for duplicates within groups, for example by visit (#17).
- find_keyword() is more robust and informs on the proportion of missing values when possible.
- edc_lookup() will now retrieve the lookup table. Use build_lookup() to build one from a table list.
- extend_lookup() will not fail anymore when the database has a faulty table.
Deprecations
- get_key_cols() is replaced by get_subjid_cols() and get_crfname_cols().
- check_subjid() is replaced by edc_warn_patient_diffs(). It can either take a vector or a dataframe as input, and the message is more informative.
EDCimport 0.4.0
CRAN release: 2023-12-11
New features
- New function check_subjid() to check if a vector is not missing some patients (#8).

    options(edc_subjid_ref=enrolres$subjid)
    check_subjid(treatment$subjid)
    check_subjid(ae$subjid)

- New function assert_no_duplicate() to abort if a table has duplicates in a subject ID column (#9).

    tibble(subjid=c(1:10, 1)) %>% assert_no_duplicate() %>% nrow()
    #Error in `assert_no_duplicate()`:
    #! Duplicate on column "subjid" for value 1.

- New function manual_correction() to safely hard-code a correction while waiting for the TrialMaster database to be updated.
- New function edc_options() to manage EDCimport global parameterization.
- New argument edc_swimmerplot(id_lim) to subset the swimmer plot to some patients only.
- New option read_trialmaster(use_cache="write") to read from the zip again but still update the cache.
- You can now use the syntax read_trialmaster(split_mixed=c("col1", "col2")) to split only the datasets you need to (#10).
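To make the idea behind check_subjid() concrete, the check "is this vector missing some patients?" can be sketched in base R as a set difference against a reference vector. This is only an illustration under assumed data: missing_patients() is a hypothetical helper, not an EDCimport function, and the IDs are made up.

```r
# Hypothetical helper, not part of EDCimport: list the reference IDs that
# do not appear in x.
missing_patients = function(x, ref) {
  setdiff(ref, x)
}

ref = 1:10                # made-up reference: all enrolled patients
treated = c(1:4, 6:10)    # made-up vector to check: patient 5 is absent

missing = missing_patients(treated, ref)
print(missing)
```

An empty result means every reference patient is present in the checked vector.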
Bug fixes & Improvements
- Reading with read_trialmaster() from cache will output an error if parameters (split_mixed, clean_names_fun) are different (#4).
- split_mixed_datasets() is now fully case-insensitive.
- Non-UTF8 characters in labels are now identified and corrected during reading (#5).
Minor breaking changes
- read_trialmaster(use_cache="write") is now the default. Reading from cache is not stable yet, so you should opt-in rather than opt-out.
- read_trialmaster(extend_lookup=TRUE) is now the default.
- Options edc_id, edc_crfname, and edc_verbose have been respectively renamed to edc_cols_id, edc_cols_crfname, and edc_read_verbose for more clarity.
EDCimport 0.3.0 2023/05/19
CRAN release: 2023-05-19
New features
- New function edc_swimmerplot() to show a swimmer plot of all dates in the database and easily find outliers.
- New features in read_trialmaster():
  - clean_names_fun=some_fun will clean all names of all tables. For instance, clean_names_fun=janitor::clean_names will turn default SAS uppercase column names into valid R snake-case column names.
  - split_mixed=TRUE will split tables that contain both long and short data regarding patient ID into one long table and one short table. See ?split_mixed_datasets() for details.
  - extend_lookup=TRUE will improve the lookup table with additional information. See ?extend_lookup() for details.
  - key_columns=get_key_cols() is where you can change the default column names for patient ID and CRF name (used in other new features).
- Standalone functions extend_lookup() and split_mixed_datasets().
- New helper unify(), which turns a vector of duplicate values into a vector of length 1.
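The split_mixed behaviour described above (one "short" table with one row per patient, one "long" table with repeated rows) can be illustrated with a base R sketch. This is a simplified reading of the idea, not the code behind split_mixed_datasets(): split_mixed_sketch() is a hypothetical helper, and the rule used here (a column is "short" when it holds a single distinct value per patient) is an assumption.

```r
# Hypothetical helper, not EDCimport's implementation.
split_mixed_sketch = function(df, id = "subjid") {
  other = setdiff(names(df), id)
  # Assumed rule: a column is "short" if it has exactly one distinct value
  # for every subject.
  is_short = vapply(other, function(col) {
    all(tapply(df[[col]], df[[id]], function(x) length(unique(x))) == 1)
  }, logical(1))
  list(
    short = unique(df[c(id, other[is_short])]),  # one row per subject
    long = df[c(id, other[!is_short])]           # repeated measures
  )
}

# Made-up mixed table: sex is constant per patient, visit and weight vary.
visits = data.frame(
  subjid = c(1, 1, 2, 2),
  sex = c("F", "F", "M", "M"),
  visit = c(1, 2, 1, 2),
  weight = c(60, 61, 80, 79)
)
res = split_mixed_sketch(visits)
print(res$short)
print(res$long)
```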
Bug fixes
- Reading errors are now handled by read_trialmaster() instead of failing. If one XPT file is corrupted, the resulting object will contain the error message instead of the dataset.
- find_keyword() is now robust to non-UTF8 characters in labels.
- Option edc_lookup is now set even when reading from cache.
- SAS formats containing a = now work as intended.
EDCimport 0.2.1 2022/11/01
CRAN release: 2022-12-02
- Import your data from TrialMaster using tm = read_trialmaster("path/to/archive.zip").
- Search for a keyword in any column name or label using find_keyword("date", data=tm$.lookup). You can also generate a lookup table for an arbitrary list of dataframes using build_lookup(my_data).
- Load the datasets to the global environment using load_list(tm) to avoid typing tm$ everywhere.
- Browse available global options using ?EDCimport_options.
