Project overview (plain English)

e5cr7 uses ASReview to help humans prioritise which papers to read first. The goal is high recall: miss as few relevant studies as possible, while still saving reviewer time.

How to read this page

  1. Read Current recommendation first (what to do now).
  2. Check Dataset snapshot for scale and prevalence.
  3. Use Methods & results and planner only if you need technical detail.

Dataset snapshot

Model snapshot

Method overview

Key results

Model leaderboard

Expanded NLP benchmark (baseline, improved, and new candidates)

Baseline vs improved

Nested CV stability

Why more review may still be needed

How many more records should we review?

Use the planner to inspect the expected trade-offs. The default recommendation is staged: screen +50 records, then reassess.
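The arithmetic behind a staged recommendation is simple expected-yield reasoning. A minimal sketch, with hypothetical numbers that are not taken from this project's data:

```python
def expected_new_relevant(k: int, residual_rate: float) -> float:
    """Expected number of relevant records found by screening k more records,
    assuming a constant relevance rate among the remaining unscreened pool.

    residual_rate is an assumption: the estimated fraction of the
    not-yet-screened records that are still relevant.
    """
    return k * residual_rate

# Hypothetical illustration: if ~2% of the remaining records are still
# relevant, screening +50 more yields about 1 extra relevant record.
print(expected_new_relevant(50, 0.02))
```

The real planner can refine this with a decaying relevance rate as screening progresses; the point of the staged approach is to recompute this estimate after each batch rather than committing to a large fixed quota up front.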

Scenario table

LAB access

Use the shared gateway by default. Project-specific links are kept only as fallback compatibility routes.

Glossary (quick definitions)

Baseline
The pre-improvement model configuration, rerun so that comparisons are reproducible. In this project it is the prior reference run used in the baseline_vs_improved tables.
Precision
Of the records flagged as relevant, how many are truly relevant.
Recall
Of all truly relevant records, how many were found.
Prevalence
The proportion of relevant studies in the full candidate dataset.
False negative (FN)
A relevant study the process failed to retrieve at the current stopping point.
False positive (FP)
A non-relevant study that was surfaced as if it might be relevant.
TF-IDF
Term Frequency–Inverse Document Frequency: a text-weighting method that emphasises terms informative within this corpus, used as the model's input features.
SVM (Support Vector Machine)
A supervised learning model used here to rank records by predicted relevance.
WSS@95
Work Saved over Sampling at 95% recall: the fraction of screening effort saved, relative to screening in random order, when stopping once 95% of the relevant records have been found. Higher is better.
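The glossary metrics above can all be computed from the four confusion counts at a given stopping point. A worked illustration (the counts here are hypothetical, not this project's results), using the common formulation WSS@R = (TN + FN)/N − (1 − R):

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int,
                      target_recall: float = 0.95) -> dict:
    """Compute glossary metrics from confusion counts at a stopping point.

    tp/fp/tn/fn: true/false positives and negatives at that point.
    """
    n = tp + fp + tn + fn
    return {
        # Of the records flagged as relevant, how many truly are.
        "precision": tp / (tp + fp),
        # Of all truly relevant records, how many were found.
        "recall": tp / (tp + fn),
        # Proportion of relevant records in the full candidate set.
        "prevalence": (tp + fn) / n,
        # Work saved over random-order screening at the target recall.
        "wss": (tn + fn) / n - (1 - target_recall),
    }

# Hypothetical stopping point: 95 of 100 relevant records found
# after flagging 295 of 1000 candidates.
m = screening_metrics(tp=95, fp=200, tn=700, fn=5)
print(m["recall"])      # 0.95
print(m["prevalence"])  # 0.1
```

With these numbers, 70.5% of the pool was never screened, so WSS@95 is 0.705 − 0.05 = 0.655, i.e. roughly 65% of the effort a random-order review would need to reach the same recall.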