rel-stack Stack-Exchange Q&A Website Database
Database Description: Stack Exchange is a network of question-and-answer websites, each covering a specific topic, where questions, answers, and users are subject to a reputation award process. The reputation system allows the sites to be self-moderating. In our benchmark, we use the stats-exchange site. The database is derived from the raw data dump of 2023-09-12.
Database Statistics:
| Statistic | Value |
| --- | --- |
| Num of Tables | 7 |
| Num of Rows | 4,247,264 |
| Num of Columns | 52 |
| Starting Time | 2010-03-27 |
| Validation timestamp | 2019-01-01 |
| Testing timestamp | 2021-01-01 |
| Time window | 3 months |
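The validation and testing timestamps above imply a temporal split. The following is a minimal sketch of that idea, assuming that timestamps strictly before the validation timestamp fall into training, timestamps from the validation timestamp up to (but excluding) the testing timestamp fall into validation, and everything later falls into testing. The boundary handling and the helper `assign_split` are assumptions for illustration, not part of the RelBench API.

```python
from datetime import date

# Values taken from the statistics table above.
VAL_TIMESTAMP = date(2019, 1, 1)
TEST_TIMESTAMP = date(2021, 1, 1)


def assign_split(ts: date) -> str:
    """Map a timestamp to a split name (hypothetical helper).

    Assumption: boundaries are inclusive on the left, so a row stamped
    exactly at the validation timestamp belongs to validation.
    """
    if ts < VAL_TIMESTAMP:
        return "train"
    if ts < TEST_TIMESTAMP:
        return "val"
    return "test"
```

Splitting by time rather than at random keeps information from the future out of the training data.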
Database schema:

To load this relational database in RelBench, do:
from relbench.datasets import get_dataset
dataset = get_dataset("rel-stack")
References:
Dataset License: CC BY-SA 4.0.
Entity Classification Tasks
user-engagement
Task Description: For each user, predict whether the user will make any votes, posts, or comments in the next 3 months.
Evaluation metric: AUROC
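To make the label and the metric concrete, here is a rough sketch in plain Python. It assumes that a user counts as engaged if any activity falls in the half-open interval after a seed time, and it approximates "3 months" as 90 days. Both choices, and the helper names `engagement_label` and `auroc`, are illustrative assumptions rather than the benchmark's own implementation.

```python
from datetime import date, timedelta


def engagement_label(activity_dates, seed_time, window=timedelta(days=90)):
    """Return True if any activity lies in (seed_time, seed_time + window].

    Assumption: 90 days stands in for the 3-month time window.
    """
    return any(seed_time < d <= seed_time + window for d in activity_dates)


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    examples, so it is only meant for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])` ranks three of the four positive-negative pairs correctly.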
user-badge
Task Description: For each user, predict whether the user will receive a new badge in the next 3 months.
Evaluation metric: AUROC
badges-class
Task Description: For each badge, predict the badge class.
Evaluation metric: MRR
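As an illustration of the MRR metric, the sketch below assumes that a model outputs, for each badge, a ranked list of candidate classes from most to least likely. The function name and the class ids in the usage note are hypothetical.

```python
def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Average of 1 / rank of the true label in each ranked list.

    A true label missing from its ranked list contributes 0.
    """
    if not true_labels or len(ranked_predictions) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking:
            total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_labels)
```

With rankings `[[1, 2, 3], [2, 1, 3], [3, 2, 1]]` and true class `1` for all three badges, the reciprocal ranks are 1, 1/2, and 1/3, giving an MRR of 11/18.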
postlinks-linktypeid
Task Description: For each post link, predict the link type.
Evaluation metric: AUROC
Entity Regression Tasks
post-votes
Task Description: For each user post, predict how many votes it will receive in the next 3 months.
Evaluation metric: MAE
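MAE is the mean of the absolute differences between predicted and actual vote counts. A minimal sketch, with a hypothetical function name and made-up numbers in the usage note:

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of |true - predicted| over all posts."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

For true vote counts `[0, 2, 5]` and predictions `[1, 2, 3]`, the absolute errors are 1, 0, and 2, so the MAE is 1.0.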
Link Prediction Tasks
user-post-comment
Task Description: Predict a list of existing posts that a user will comment on in the next two years.
Evaluation metric: MAP
post-post-related
Task Description: Predict a list of existing posts that users will link a given post to in the next two years.
Evaluation metric: MAP
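Both link prediction tasks are scored with MAP over ranked lists of posts. Definitions of MAP differ in the cutoff and the normalization, so the sketch below uses one common variant, MAP@k normalized by `min(k, number of relevant items)`. That choice is an assumption and may not match the benchmark's exact definition. The function names are hypothetical, and each predicted list is assumed to contain no duplicates.

```python
def average_precision_at_k(predicted, relevant, k):
    """AP@k for one query: precision at each hit, averaged over min(k, |relevant|)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(k, len(relevant))


def mean_average_precision_at_k(all_predicted, all_relevant, k):
    """Mean of AP@k over all queries (users or source posts)."""
    if not all_predicted or len(all_predicted) != len(all_relevant):
        raise ValueError("inputs must be non-empty and of equal length")
    aps = [average_precision_at_k(p, r, k) for p, r in zip(all_predicted, all_relevant)]
    return sum(aps) / len(aps)
```

For a predicted list `[10, 20, 30]` with relevant posts `{10, 30}` and `k = 3`, the hits occur at ranks 1 and 3, so AP@3 is (1/1 + 2/3) / 2 = 5/6.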