Read all .xpt files in a directory (unzipped TrialMaster archive).
If 7zip is installed, you should probably use read_trialmaster() instead.
Formats (factor levels) can be applied from a procformat.sas SAS file or from a format dictionary. See the "Format file" section below. Column labels are read directly from the .xpt files.

Usage

read_all_xpt(
  path,
  ...,
  format_file = "procformat.sas",
  datetime_extraction = "guess",
  subdirectories = FALSE,
  verbose = getOption("edc_read_verbose", 1),
  clean_names_fun = NULL,
  directory = "deprecated",
  key_columns = "deprecated"
)

Arguments

path

[character(1)]
the path to the directory containing all .xpt files.

...

unused

format_file

[character(1)]
the path to the file that should be used to apply formats. See section "Format file" below. Use NULL to not apply formats.

datetime_extraction

[POSIXt(1)]
the datetime of the data extraction. Defaults to the most common date of last modification in path.

subdirectories

[logical(1)]
whether to read subdirectories

verbose

[numeric(1)]
one of c(0, 1, 2). The higher, the more information will be printed.

clean_names_fun

[Deprecated] use edc_clean_names() instead.

directory

deprecated in favour of path

key_columns

deprecated

Value

a list containing one dataframe for each .xpt file in the folder, the extraction date (datetime_extraction), and a summary of all imported tables (.lookup).

Format file

format_file should contain the information about SAS formats. It can be either:

  • a procformat.sas file, containing the whole PROC FORMAT

  • or a data file (.csv or .sas7bdat) containing 3 columns:

    • FMTNAME the SAS format name (repeated)

    • START the variable level

    • LABEL the label associated to the level

    You can get this data file from SAS using PROC FORMAT with option CNTLOUT. If the columns are named differently, use options(edc_var_format_name="xxx", edc_var_level="xxx", edc_var_label="xxx") to specify the column names.
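
As a minimal sketch of the data file variant described above, the following builds a small format dictionary with the three expected columns and writes it as a .csv file using base R only. The format names, levels, and labels (YESNO, SEX, and so on) and the file name formats.csv are hypothetical and chosen for illustration; they do not come from any real study.

```r
# Hypothetical format dictionary: one row per (format name, level) pair.
# FMTNAME is repeated for every level belonging to the same SAS format.
formats <- data.frame(
  FMTNAME = c("YESNO", "YESNO", "SEX", "SEX"),
  START   = c("0", "1", "1", "2"),
  LABEL   = c("No", "Yes", "Male", "Female")
)

# Write the dictionary as a .csv file (file name is an assumption).
format_path <- file.path(tempdir(), "formats.csv")
write.csv(formats, format_path, row.names = FALSE)

# The resulting file could then be passed as `format_file` (not run here):
# db <- read_all_xpt(path, format_file = format_path)
```

If the dictionary uses other column names, the options() call mentioned above would be the place to declare them before reading.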

See also

Other EDCimport reading functions: read_all_csv(), read_all_sas(), read_trialmaster()

Examples

# Create a directory with multiple .xpt files.
path = paste0(tempdir(), "/read_all_xpt")
dir.create(paste0(path, "/subdir"), recursive=TRUE)
haven::write_xpt(attenu, paste0(path, "/attenu.xpt"))
haven::write_xpt(mtcars, paste0(path, "/mtcars.xpt"))
haven::write_xpt(mtcars, paste0(path, "/subdir/mtcars.xpt"))
haven::write_xpt(esoph, paste0(path, "/esoph.xpt"))

db = read_all_xpt(path, format_file=NULL, subdirectories=TRUE) %>% 
  set_project_name("My great project")
#> Warning: Option "edc_lookup" has been overwritten.
db
#> ── EDCimport database ──────────────────────────────────────────────────────────
#> Contains 4 tables: `attenu`, `esoph`, `mtcars`, and `subdir_mtcars`
#>  Use `EDCimport::load_database(db)` to load the tables in the global
#>   environment.
#>  Use `EDCimport::edc_lookup()` to see the summary table.
edc_lookup()
#> ── Lookup table - My great project (extraction of 2025-05-16) - EDCimport v0.5.2
#>   dataset        nrow  ncol  n_id rows_per_id crfname
#>   <chr>         <dbl> <dbl> <int>       <dbl> <chr>  
#> 1 attenu          182     5     0          NA NA     
#> 2 esoph            88     5     0          NA NA     
#> 3 mtcars           32    11     0          NA NA     
#> 4 subdir_mtcars    32    11     0          NA NA