rel-trial Clinical trial database

Database Description: The clinical trial database is curated from the AACT initiative, which consolidates all protocol and results data from studies registered on ClinicalTrials.gov. It offers extensive information about clinical trials, including study designs, participant demographics, intervention details, and outcomes. It is an important resource for health research, policy making, and therapeutic development.

Database Statistics:

Domain: Medical
Num of Tables: 15
Num of Rows: 5,852,157
Num of Columns: 140
Starting Time: 2000-01-01
Validation timestamp: 2020-01-01
Testing timestamp: 2021-01-01
Time window: 1 year
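
The validation and testing timestamps above suggest a temporal split. The following is a minimal sketch of one way to read them, assuming that rows before the validation timestamp are used for training, rows between the two timestamps for validation, and later rows for testing, and that labels are computed over a window of one year (approximated here as 365 days) after a seed time. The names `assign_split` and `label_window` are hypothetical helpers for illustration and are not part of RelBench.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Values taken from the statistics listed above.
VAL_TIMESTAMP = datetime(2020, 1, 1)
TEST_TIMESTAMP = datetime(2021, 1, 1)
# Assumption: "1 year" is approximated as 365 days.
TIME_WINDOW = timedelta(days=365)


def assign_split(ts: datetime) -> str:
    """Assign a timestamp to a split (assumed convention, see lead-in)."""
    if ts < VAL_TIMESTAMP:
        return "train"
    if ts < TEST_TIMESTAMP:
        return "val"
    return "test"


def label_window(seed: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) interval over which labels would be computed."""
    return seed, seed + TIME_WINDOW


if __name__ == "__main__":
    print(assign_split(datetime(2019, 12, 31)))  # falls before the validation timestamp
    print(label_window(TEST_TIMESTAMP))
```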

Database schema:

To load this relational database in RelBench, do:

from relbench.datasets import get_dataset
dataset = get_dataset("rel-trial")

References:

[1] Clinical Trials Transformation Initiative.

Dataset License: Not specified.


Node Classification Tasks

study-outcome

Task Description: Predict whether a trial will achieve its primary outcome (defined as a p-value < 0.05).

Evaluation metric: AUROC
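
As an illustration of how this task could be scored, the sketch below derives binary labels from p-values using the 0.05 threshold stated above and computes AUROC as the fraction of (positive, negative) pairs that the model ranks correctly, with ties counted as half. The p-values and scores are made up for the example, and the helper names `binary_labels` and `auroc` are hypothetical. This is a plain-Python sketch, not the evaluation code used by RelBench.

```python
def binary_labels(p_values, alpha=0.05):
    """Label a trial 1 if its primary-outcome p-value is below alpha, else 0."""
    return [1 if p < alpha else 0 for p in p_values]


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    p_values = [0.01, 0.20, 0.03, 0.50]
    scores = [0.9, 0.4, 0.35, 0.1]  # hypothetical model outputs
    labels = binary_labels(p_values)
    print(labels, auroc(labels, scores))
```

The pairwise loop is quadratic in the number of trials, which is fine for a small illustration. A rank-based formulation is the usual choice for larger evaluation sets.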

Node Regression Tasks

study-adverse

Task Description: Predict the number of patients affected by severe adverse events or death in the trial.

Evaluation metric: MAE

site-success

Task Description: Predict the success rate of a trial site over the next year.

Evaluation metric: MAE
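
Both regression tasks are scored with mean absolute error (MAE), the average of |target - prediction| over all evaluated entities. A minimal sketch follows. The numbers are illustrative and the function name `mean_absolute_error` is a local helper defined here, not a RelBench API.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Illustrative counts of affected patients per trial (not real data).
    y_true = [0, 3, 10, 2]
    y_pred = [1, 3, 6, 5]
    print(mean_absolute_error(y_true, y_pred))
```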

Link Prediction Tasks

condition-sponsor-run

Task Description: Predict whether the sponsor (pharma/hospital) will run clinical trials for the condition (disease) in the next year.

Evaluation metric: MAP

site-sponsor-run

Task Description: Predict whether this sponsor (pharma/hospital) will have a trial in the facility in the next year.

Evaluation metric: MAP
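
For the link prediction tasks, mean average precision (MAP) rewards rankings that place true links near the top. The sketch below computes average precision at a cutoff k for each source entity and averages the results. The cutoff, the normalization by min(number of relevant items, k), and the entity ids are assumptions made for illustration. The exact variant used by RelBench is not specified in this document.

```python
def average_precision_at_k(ranked, relevant, k):
    """Average precision over the top-k ranked candidates for one source entity."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return precision_sum / denom if denom else 0.0


def mean_average_precision(predictions, ground_truth, k=10):
    """Mean of per-entity average precision; missing predictions score 0."""
    aps = [
        average_precision_at_k(predictions.get(src, []), relevant, k)
        for src, relevant in ground_truth.items()
    ]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Hypothetical ids: each key is a source entity, each value the linked entities.
    ground_truth = {"condition_a": {"s1", "s3"}, "condition_b": {"s2"}}
    predictions = {"condition_a": ["s1", "s2", "s3"], "condition_b": ["s3", "s2"]}
    print(mean_average_precision(predictions, ground_truth, k=3))
```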