rel-stack Stack-Exchange Q&A Website Database
Database Description: Stack Exchange is a network of question-and-answer websites, each covering a specific topic, in which questions, answers, and users are subject to a reputation award process. The reputation system allows the sites to be self-moderating. In our benchmark, we use the stats-exchange site. The database is derived from the raw data dump of 2023-09-12.
Database Statistics:
| Num of Tables | 7 |
| Num of Rows | 38,109,828 |
| Num of Columns | 51 |
| Starting Time | 2010-03-27 |
| Validation timestamp | 2019-01-01 |
| Testing timestamp | 2021-01-01 |
| Time window | 3 months |
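The validation and testing timestamps in the table above define a temporal split. As a minimal sketch of that idea (not RelBench's own splitting code), the helper below maps an event date to a split. The function name `assign_split` and the half-open boundary convention are assumptions made for illustration.

```python
from datetime import date

# Timestamps copied from the statistics table above.
VAL_TIMESTAMP = date(2019, 1, 1)
TEST_TIMESTAMP = date(2021, 1, 1)


def assign_split(event_date: date) -> str:
    """Hypothetical helper: map an event date to a temporal split.

    Assumed convention: dates before the validation timestamp are "train",
    dates from the validation timestamp up to (not including) the testing
    timestamp are "val", and everything from the testing timestamp on is "test".
    """
    if event_date < VAL_TIMESTAMP:
        return "train"
    if event_date < TEST_TIMESTAMP:
        return "val"
    return "test"


example_dates = (date(2015, 6, 1), date(2019, 1, 1), date(2022, 3, 15))
splits = [assign_split(d) for d in example_dates]
```

Splitting by time rather than at random keeps information from the evaluation period out of training.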
Database schema:

To load this relational database in RelBench, do:
from relbench.datasets import get_dataset
dataset = get_dataset("rel-stack")
Dataset License: CC BY-SA 4.0.
Node Classification Tasks
user-engagement
Task Description: For each user, predict whether the user will make any votes, posts, or comments in the next 3 months.
Evaluation metric: AUROC
To load the dataset and the split, do:
from relbench.datasets import get_dataset
dataset = get_dataset(name="rel-stack")
task = dataset.get_task("user-engagement")
task.train_table, task.val_table, task.test_table  # training/validation/testing tables
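For readers unfamiliar with AUROC, the metric named for this task, the following is a minimal pure-Python sketch. It computes the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as half. The function name `auroc` and the toy labels and scores are invented for illustration; this is not the evaluation code used by the benchmark.

```python
def auroc(labels, scores):
    """Pairwise AUROC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    Quadratic in the number of examples, so suitable only for small inputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = user was active in the window, 0 = user was not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)  # 3 of the 4 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of active users above inactive ones.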
user-badge
Task Description: For each user, predict whether the user will receive a new badge in the next 3 months.
Evaluation metric: AUROC
To load the dataset and the split, do:
from relbench.datasets import get_dataset
dataset = get_dataset(name="rel-stack")
task = dataset.get_task("user-badge")
task.train_table, task.val_table, task.test_table  # training/validation/testing tables
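The binary label behind a task like user-badge can be pictured as "did any event fall inside the window that starts at the prediction time?". The sketch below illustrates that idea only. The helper `has_event_in_window`, the 90-day approximation of "3 months", the interval convention, and the toy badge timestamps are all assumptions, not the benchmark's label-generation code.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=90)  # assumption: "3 months" approximated as 90 days


def has_event_in_window(event_times, seed_time, window=WINDOW):
    """Return 1 if any event falls in (seed_time, seed_time + window], else 0."""
    return int(any(seed_time < t <= seed_time + window for t in event_times))


seed = datetime(2021, 1, 1)  # prediction time for this toy example
badge_times = {
    "user_a": [datetime(2020, 12, 20), datetime(2021, 2, 14)],  # one badge inside the window
    "user_b": [datetime(2020, 11, 3)],  # only an earlier badge
    "user_c": [datetime(2021, 6, 1)],  # badge after the window closes
}
labels = {user: has_event_in_window(times, seed) for user, times in badge_times.items()}
```

Only events strictly after the prediction time count, so the label never depends on information available at prediction time.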
Node Regression Tasks
post-votes
Task Description: For each user post, predict how many votes it will receive in the next 3 months.
Evaluation metric: MAE
To load the dataset and the split, do:
from relbench.datasets import get_dataset
dataset = get_dataset(name="rel-stack")
task = dataset.get_task("post-votes")
task.train_table, task.val_table, task.test_table  # training/validation/testing tables
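MAE, the metric named for post-votes, is the average absolute difference between predicted and true vote counts. A minimal sketch follows; the function name `mean_absolute_error` and the toy numbers are illustrative and are not taken from the benchmark.

```python
def mean_absolute_error(targets, predictions):
    """Average of |target - prediction| over all examples."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


# Toy data: true and predicted vote counts for four posts.
true_votes = [0, 2, 5, 1]
predicted_votes = [1.0, 2.0, 3.0, 1.0]
toy_mae = mean_absolute_error(true_votes, predicted_votes)  # (1 + 0 + 2 + 0) / 4
```

Because MAE is expressed in votes, a value of 0.75 means the predictions are off by three quarters of a vote on average.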
Link Prediction Tasks
user-post-comment
Task Description: For each user, predict a list of existing posts that the user will comment on in the next two years.
Evaluation metric: MAP
To load the dataset and the split, do:
from relbench.datasets import get_dataset
dataset = get_dataset(name="rel-stack")
task = dataset.get_task("user-post-comment")
task.train_table, task.val_table, task.test_table  # training/validation/testing tables
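MAP, the metric named for this task, averages a per-user ranking score over all users. The sketch below shows one common formulation, MAP@k, in which each user's average precision is normalized by min(number of relevant posts, k). The cutoff k, the normalization, the function names, and the toy data are assumptions; the benchmark's exact definition may differ.

```python
def average_precision_at_k(relevant, ranked, k):
    """Average precision over the top-k ranked items for a single user.

    Assumes the ranked list contains no duplicate items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return precision_sum / denom if denom else 0.0


def mean_average_precision(truth, predictions, k):
    """Mean of the per-user average precision; users without predictions score 0."""
    if not truth:
        raise ValueError("at least one user is required")
    total = sum(
        average_precision_at_k(truth[user], predictions.get(user, []), k)
        for user in truth
    )
    return total / len(truth)


# Toy data: posts each user actually commented on, and the ranked predictions.
truth = {"u1": {"p1", "p3"}, "u2": {"p2"}}
predictions = {"u1": ["p1", "p2", "p3"], "u2": ["p5", "p2", "p7"]}
toy_map = mean_average_precision(truth, predictions, k=3)
```

In the toy data, user u1 scores (1/1 + 2/3) / 2 and user u2 scores 1/2, so the mean is 2/3.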
user-post-related
Task Description: For each post, predict a list of existing posts that users will link it to in the next two years.
Evaluation metric: MAP
To load the dataset and the split, do:
from relbench.datasets import get_dataset
dataset = get_dataset(name="rel-stack")
task = dataset.get_task("user-post-related")
task.train_table, task.val_table, task.test_table  # training/validation/testing tables