| Title: | Example Datasets for Clinical Submission Readiness |
|---|---|
| Description: | Provides realistic synthetic example datasets for the R4SUB (R for Regulatory Submission) ecosystem. Includes a pharma study evidence table, ADaM (Analysis Data Model) and SDTM (Study Data Tabulation Model) metadata following CDISC (Clinical Data Interchange Standards Consortium) conventions (<https://www.cdisc.org>), traceability mappings, a risk register based on ICH (International Council for Harmonisation) Q9 quality risk management principles (<https://www.ich.org/page/quality-guidelines>), and regulatory indicator definitions. Designed for demos, vignettes, and package testing. |
| Authors: | Pawan Rama Mali [aut, cre, cph] (ORCID: <https://orcid.org/0000-0001-7864-5819>) |
| Maintainer: | Pawan Rama Mali |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-05-15 09:39:05 UTC |
| Source: | https://github.com/r4sub/r4subdata |
ADaM (Analysis Data Model) variable-level metadata for ADSL (Subject-Level Analysis Dataset, 16 vars), ADAE (Adverse Events Analysis Dataset, 10 vars), and ADLB (Laboratory Results Analysis Dataset, 10 vars). Follows CDISC (Clinical Data Interchange Standards Consortium) ADaM conventions.
adam_metadata
A tibble with 36 rows and 6 columns:
Character. ADaM dataset name (ADSL, ADAE, ADLB).
Character. Variable name.
Character. Variable label.
Character. Variable type (Char or Num).
Integer. Variable length.
Character. SAS (Statistical Analysis System) format (or NA).
Synthetic metadata based on CDISC ADaM (Analysis Data Model) standards.
data(adam_metadata)
table(adam_metadata$dataset)
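The documented variable counts (16 for ADSL, 10 each for ADAE and ADLB) can be checked against the tibble itself. The following is a minimal sketch and not part of the package: `count_vars_by_dataset()` is a hypothetical helper, and the toy data frame only mimics the `dataset` column used in the example above.

```r
# Hypothetical helper (not exported by r4subdata): count variables per dataset
# and compare the counts against expectations supplied by the caller.
count_vars_by_dataset <- function(meta, expected) {
  observed <- table(meta$dataset)
  # Convert to a plain named vector so that missing datasets index to NA
  obs_counts <- setNames(as.integer(observed), names(observed))
  obs <- unname(obs_counts[names(expected)])
  data.frame(
    dataset  = names(expected),
    expected = as.integer(expected),
    observed = obs,
    matches  = !is.na(obs) & obs == as.integer(expected),
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for adam_metadata (made-up rows, only the `dataset` column)
toy_meta <- data.frame(
  dataset = c("ADSL", "ADSL", "ADAE"),
  stringsAsFactors = FALSE
)
toy_check <- count_vars_by_dataset(toy_meta, c(ADSL = 2, ADAE = 1, ADLB = 1))
print(toy_check)
```

With the package installed, `count_vars_by_dataset(adam_metadata, c(ADSL = 16, ADAE = 10, ADLB = 10))` should report every row as matching if the documented counts hold.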
Returns column names, types, and descriptions for a given r4subdata dataset.
dataset_dictionary(dataset)
dataset: Character. Name of the dataset.
A tibble with columns: column, type, description.
dataset_dictionary("evidence_pharma")
dataset_dictionary("adam_metadata")
A realistic evidence table for study CDISCPILOT01 (Clinical Data Interchange Standards Consortium Pilot Study 01) covering all four R4SUB (R for Regulatory Submission) pillars (quality, trace, risk, usability) with 250 rows and 18 indicators across multiple datasets and sources.
evidence_pharma
A tibble with 250 rows and 17 columns:
Character. Unique run identifier.
Character. Study identifier (CDISCPILOT01).
Character. Asset type: dataset, define, program, validation, spec, other.
Character. Asset identifier (e.g., ADSL, define.xml).
Character. Source of the evidence (e.g., pinnacle21).
Character. Version of the source tool.
Character. Indicator identifier (e.g., Q-MISS-VAR).
Character. Human-readable indicator name.
Character. Domain: quality, trace, risk, usability.
Character. Severity: info, low, medium, high, critical.
Character. Result: pass, fail, warn, na.
Numeric. Metric value (if applicable).
Character. Unit for metric_value.
Character. Descriptive message.
Character. Location reference (e.g., ADSL:AGE).
Character. JSON payload with additional details.
POSIXct. Timestamp when evidence was created.
Synthetic data based on the CDISC (Clinical Data Interchange Standards Consortium) Pilot Study 01 structure.
data(evidence_pharma)
head(evidence_pharma)
table(evidence_pharma$indicator_domain)
Returns a summary of all datasets included in the r4subdata package.
list_datasets()
A tibble with columns: name, description, n_rows, n_cols.
list_datasets()
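Because `list_datasets()` returns a `name` column and `dataset_dictionary()` takes a dataset name, the two can in principle be combined into a single data dictionary. The sketch below is an illustration only: `combine_dictionaries()` and `stub_dictionary()` are hypothetical names, and the stub returns made-up content in the documented `column`, `type`, `description` layout so the example runs without the package.

```r
# Hypothetical helper: call a dictionary function once per dataset name
# and stack the results, tagging each row with the dataset it came from.
combine_dictionaries <- function(dataset_names, dict_fun) {
  pieces <- lapply(dataset_names, function(nm) {
    d <- as.data.frame(dict_fun(nm), stringsAsFactors = FALSE)
    data.frame(dataset = rep(nm, nrow(d)), d, stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Stub standing in for dataset_dictionary(); the content is made up.
stub_dictionary <- function(dataset) {
  data.frame(
    column = c("a", "b"),
    type = c("character", "integer"),
    description = paste("Column of", dataset),
    stringsAsFactors = FALSE
  )
}

combined <- combine_dictionaries(c("x", "y"), stub_dictionary)
print(combined)
```

With r4subdata loaded, `combine_dictionaries(list_datasets()$name, dataset_dictionary)` would be the corresponding call, assuming every listed name is accepted by `dataset_dictionary()`.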
A synthetic R4SUB evidence table for study ONCO-2025-001 covering all four R4SUB (R for Regulatory Submission) pillars (quality, trace, risk, usability) with 29 rows across ADSL, ADRS, and ADTTE datasets. Demonstrates realistic evidence patterns for an oncology submission with mixed pass/warn/fail results.
oncology_evidence
A tibble with 29 rows and 17 columns:
Character. Unique run identifier.
Character. Study identifier (ONCO-2025-001).
Character. Asset type: dataset, define, program, validation, spec, other.
Character. Asset identifier (e.g., ADSL, ADRS).
Character. Source tool name.
Character. Version of the source tool.
Character. Indicator identifier (e.g., Q-MISS-VAR).
Character. Human-readable indicator name.
Character. Domain: quality, trace, risk, usability.
Character. Severity: info, low, medium, high, critical.
Character. Result: pass, fail, warn, na.
Numeric. Metric value (if applicable).
Character. Unit for metric_value.
Character. Descriptive message.
Character. Location reference (e.g., ADRS:AVAL).
Character. JSON payload with additional details.
POSIXct. Timestamp when evidence was created.
Synthetic evidence data for a Phase II oncology trial.
data(oncology_evidence)
table(oncology_evidence$indicator_domain)
table(oncology_evidence$result)
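The two tabulations above can be combined into a cross-tabulation that shows how pass, warn, and fail results are distributed within each pillar. This is a minimal sketch on a made-up data frame that only mimics the `indicator_domain` and `result` columns; `result_share_by_domain()` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: share of each result value within each indicator domain.
result_share_by_domain <- function(evidence) {
  counts <- table(evidence$indicator_domain, evidence$result)
  # margin = 1 divides each row by its total, so every row sums to 1
  prop.table(counts, margin = 1)
}

# Toy evidence rows (made up; not taken from oncology_evidence)
toy_evidence <- data.frame(
  indicator_domain = c("quality", "quality", "trace", "risk"),
  result = c("pass", "fail", "pass", "warn"),
  stringsAsFactors = FALSE
)
shares <- result_share_by_domain(toy_evidence)
print(round(shares, 2))
```

The same call on `oncology_evidence` or `evidence_pharma` should work as long as both columns are present under these names.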
ADaM (Analysis Data Model) variable-level metadata for a synthetic
oncology trial covering ADSL (Subject-Level Analysis Dataset, 14 vars),
ADRS (Response Analysis Dataset, 10 vars), and ADTTE (Time-to-Event
Analysis Dataset, 8 vars). Includes origin, derivation, and codelist
columns suitable for use with r4subusability assessments.
oncology_metadata
A tibble with 32 rows and 7 columns:
Character. ADaM dataset name (ADSL, ADRS, ADTTE).
Character. Variable name.
Character. Variable label.
Character. Variable origin (CRF, Derived, Assigned).
Character. Derivation text (NA if not derived).
Character. CDISC (Clinical Data Interchange Standards Consortium) codelist code (NA if not applicable).
Character. Variable type (Char or Num).
Synthetic metadata for a Phase II oncology trial following CDISC (Clinical Data Interchange Standards Consortium) ADaM conventions.
data(oncology_metadata)
table(oncology_metadata$dataset)
table(oncology_metadata$origin)
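One question the origin column can answer is how much of each dataset is derived rather than collected or assigned. The sketch below assumes the origin values documented above (CRF, Derived, Assigned) and uses made-up rows; `derived_share()` is a hypothetical helper.

```r
# Hypothetical helper: proportion of variables per dataset whose origin is "Derived".
derived_share <- function(meta) {
  # The mean of a logical vector is the proportion of TRUE values
  tapply(meta$origin == "Derived", meta$dataset, mean, na.rm = TRUE)
}

# Toy metadata (made up; only the `dataset` and `origin` columns)
toy_meta <- data.frame(
  dataset = c("ADSL", "ADSL", "ADRS", "ADRS"),
  origin = c("CRF", "Derived", "Derived", "Derived"),
  stringsAsFactors = FALSE
)
share <- derived_share(toy_meta)
print(share)
```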
Reference table of 30 indicator definitions across all four R4SUB (R for Regulatory Submission) domains (quality, trace, risk, usability). Each indicator has a unique ID, default severity, typical source, and descriptive tags.
regulatory_indicators
A tibble with 30 rows and 7 columns:
Character. Unique indicator identifier.
Character. Human-readable indicator name.
Character. Indicator domain: quality, trace, risk, usability.
Character. Detailed description.
Character. Default severity level.
Character. Typical source tool.
Character. Comma-separated tags.
Curated indicator definitions for the R4SUB (R for Regulatory Submission) ecosystem.
data(regulatory_indicators)
table(regulatory_indicators$domain)
A Failure Mode and Effects Analysis (FMEA)-based risk register with 18 risks covering data quality, traceability, documentation, programming, and compliance categories. Includes probability, impact, and detectability scores on a 1-5 scale. Structured according to ICH (International Council for Harmonisation) Q9 quality risk management principles.
risk_register_pharma
A tibble with 18 rows and 9 columns:
Character. Unique risk identifier (RISK-001 to RISK-018).
Character. Risk description.
Character. Risk category.
Integer. Probability of occurrence (1-5).
Integer. Impact severity (1-5).
Integer. Detectability rating (1-5).
Character. Risk owner name.
Character. Mitigation action (or NA).
Character. Status: open, mitigated, closed, accepted.
Synthetic risk register based on ICH (International Council for Harmonisation) Q9 quality risk management principles.
data(risk_register_pharma)
table(risk_register_pharma$category)
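In FMEA practice the three scores are commonly multiplied into a risk priority number (RPN = probability × impact × detectability), which on a 1-5 scale ranges from 1 to 125. Whether r4subdata or its companion packages compute an RPN is not stated here, so the following is only a sketch of the general technique. `add_rpn()` is a hypothetical helper, and the column names are passed as arguments because the actual names in `risk_register_pharma` are not shown in this page.

```r
# Hypothetical helper: classic FMEA risk priority number (RPN).
# Column names are arguments because the real names are an unknown here.
add_rpn <- function(register, prob_col, impact_col, detect_col) {
  p <- register[[prob_col]]
  i <- register[[impact_col]]
  d <- register[[detect_col]]
  # Scores are documented as integers on a 1-5 scale
  stopifnot(all(c(p, i, d) %in% 1:5))
  register$rpn <- p * i * d
  # Highest-priority risks first
  register[order(register$rpn, decreasing = TRUE), , drop = FALSE]
}

# Toy register (made-up identifiers and scores)
toy_register <- data.frame(
  id = c("R-A", "R-B", "R-C"),
  p = c(2L, 5L, 3L),
  i = c(4L, 4L, 1L),
  d = c(3L, 2L, 1L),
  stringsAsFactors = FALSE
)
ranked <- add_rpn(toy_register, "p", "i", "d")
print(ranked)
```

The direction of the detectability scale matters for this product: in conventional FMEA a higher score means the failure is harder to detect. Confirm how the register encodes it before ranking real risks this way.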
SDTM (Study Data Tabulation Model) variable-level metadata for DM (Demographics, 17 vars), AE (Adverse Events, 14 vars), and LB (Laboratory Results, 12 vars). Follows CDISC (Clinical Data Interchange Standards Consortium) SDTM conventions.
sdtm_metadata
A tibble with 43 rows and 6 columns:
Character. SDTM domain name (DM, AE, LB).
Character. Variable name.
Character. Variable label.
Character. Variable type (Char or Num).
Integer. Variable length.
Character. SAS (Statistical Analysis System) format (or NA).
Synthetic metadata based on CDISC SDTM (Study Data Tabulation Model) standards.
data(sdtm_metadata)
table(sdtm_metadata$dataset)
Maps ADaM (Analysis Data Model) variables to their SDTM (Study Data Tabulation Model) source variables with derivation text and confidence scores. Includes direct copies, derived variables, and unmapped entries. Follows CDISC (Clinical Data Interchange Standards Consortium) traceability conventions.
trace_mapping
A tibble with 25 rows and 6 columns:
Character. Source ADaM dataset.
Character. Source ADaM variable.
Character. Target SDTM domain (NA if derived).
Character. Target SDTM variable (NA if derived).
Character. Derivation description text.
Numeric. Mapping confidence score (0-1, NA if unmapped).
Synthetic traceability mapping based on CDISC conventions.
data(trace_mapping)
table(trace_mapping$adam_dataset)
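Since the confidence score is documented as NA for unmapped entries, a per-dataset summary of unmapped variables and mean confidence gives a quick view of traceability coverage. The sketch below is illustrative: `summarise_mapping()` is a hypothetical helper, the name of the confidence column is passed as an argument because it is not shown in this page, and the rows are made up.

```r
# Hypothetical helper: per-dataset count of unmapped variables and mean confidence.
summarise_mapping <- function(mapping, confidence_col) {
  conf <- mapping[[confidence_col]]
  groups <- split(conf, mapping$adam_dataset)
  data.frame(
    adam_dataset = names(groups),
    n_variables = unname(vapply(groups, length, integer(1))),
    # NA confidence marks an unmapped entry
    n_unmapped = unname(vapply(groups, function(x) sum(is.na(x)), integer(1))),
    mean_confidence = unname(vapply(groups, function(x) mean(x, na.rm = TRUE), numeric(1))),
    stringsAsFactors = FALSE
  )
}

# Toy mapping (made up; `conf` is a placeholder column name)
toy_map <- data.frame(
  adam_dataset = c("ADSL", "ADSL", "ADAE", "ADAE"),
  conf = c(1.0, NA, 0.8, 0.9),
  stringsAsFactors = FALSE
)
summary_tbl <- summarise_mapping(toy_map, "conf")
print(summary_tbl)
```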