ReXrank

Chest X-ray Interpretation Leaderboard

What is ReXrank?

ReXrank is a public leaderboard for chest X-ray image interpretation, including both radiology report generation (RRG) and visual question answering (VQA) tasks.


ReXrank Challenge V1.0 is a competition on chest radiograph report generation that uses ReXGradient, the largest private test dataset, consisting of 10,000 studies across 67 sites. The challenge attracted diverse participants from academic institutions, industry, and independent research teams, with 24 state-of-the-art models benchmarked previously.


ReXrank Challenge V2.0 is a competition on the VQA task that uses a VQA dataset constructed from ReXGradient, comprising 41,007 VQA pairs drawn from 10,000 radiological studies. We benchmarked 8 state-of-the-art models.


ReXGradient-160K is the largest publicly available multi-site chest X-ray dataset, containing 273,004 unique chest X-ray images from 160,000 radiological studies, collected from 109,487 unique patients across 3 U.S. health systems (79 medical sites). In ReXrank, we use an additional private test set, ReXGradient, comprising 10,000 studies, for benchmarking.


ReXVQA is the largest and most comprehensive benchmark for VQA in chest radiology, comprising 653,834 questions paired with 160,000 radiological studies. The dataset is constructed from ReXGradient-160K.
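To put the dataset sizes quoted above in proportion, the following minimal sketch derives a few per-study and per-patient averages from those counts. The dictionary and function names are illustrative only and are not part of any ReXrank tooling; the figures are simple averages over the stated totals, not official statistics.

```python
# Counts copied from the dataset descriptions; names are illustrative only.
REXGRADIENT_160K = {"images": 273_004, "studies": 160_000, "patients": 109_487}
REXVQA = {"questions": 653_834, "studies": 160_000}
CHALLENGE_V2 = {"vqa_pairs": 41_007, "studies": 10_000}


def ratio(numerator: int, denominator: int) -> float:
    """Return a plain average (numerator per unit of denominator)."""
    return numerator / denominator


images_per_study = ratio(REXGRADIENT_160K["images"], REXGRADIENT_160K["studies"])
studies_per_patient = ratio(REXGRADIENT_160K["studies"], REXGRADIENT_160K["patients"])
questions_per_study = ratio(REXVQA["questions"], REXVQA["studies"])
pairs_per_test_study = ratio(CHALLENGE_V2["vqa_pairs"], CHALLENGE_V2["studies"])

print(f"Images per study (ReXGradient-160K):     {images_per_study:.2f}")
print(f"Studies per patient (ReXGradient-160K):  {studies_per_patient:.2f}")
print(f"Questions per study (ReXVQA):            {questions_per_study:.2f}")
print(f"VQA pairs per study (Challenge V2.0):    {pairs_per_test_study:.2f}")
```

Both VQA figures come out at roughly four questions per study, so the challenge test set has about the same question density as the full ReXVQA benchmark.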

ReXrank Challenge V1.0 Leaderboard (RRG)

| Rank | ReXGradient | MIMIC-CXR | IU-Xray | CheXpert Plus |
|---|---|---|---|---|
| 1 | UniRG-CXR (Microsoft Research) | zzy-RGv2-2b (Individual) | UniRG-CXR (Microsoft Research) | UniRG-CXR (Microsoft Research) |
| 2 | zzy-RGv2-2b (Individual) | UniRG-CXR (Microsoft Research) | zzy-RGv2-2b (Individual) | zzy-RGv2-2b (Individual) |
| 3 | RadPhi4VisionCXR (Microsoft Research) | MedVersa (Harvard) | MoERad-IU (IIT Madras) | CheXOne-R1 (Stanford) |
| 4 | CheXOne-R1 (Stanford) | CheXOne-R1 (Stanford) | CheXOne-R1 (Stanford) | RadPhi3.5Vision (Microsoft) |
| 5 | zzy-RGv1-8b (Individual) | RadPhi4VisionCXR (Microsoft Research) | MedVersa (Harvard) | RadPhi4VisionCXR (Microsoft Research) |
| 6 | MoERad-IU (IIT Madras) | zzy-RGv1-8b (Individual) | CXRMate-RRG24 (CSIRO) | zzy-RGv1-8b (Individual) |
| 7 | MedVersa (Harvard) | Libra (University of Glasgow) | MedGemma 1.5 4B (Google) | CheXpertPlus-CheX-MIMIC (Stanford) |
| 8 | MedGemma (Google) | RadPhi3.5Vision (Microsoft) | VLCI-IU (SYSU) | CXRMate-RRG24 (CSIRO) |
| 9 | MAIRA-2 (Microsoft) | CXRMate-ED (CSIRO) | MedGemma (Google) | MAIRA-2 (Microsoft) |
| 10 | MedGemma 1.5 4B (Google) | CXRMate-RRG24 (CSIRO) | CX-Mind (SJTU) | CheXpertPlus-CheX (Stanford) |
| 11 | CX-Mind (SJTU) | CheXpertPlus-CheX-MIMIC (Stanford) | MAIRA-2 (Microsoft) | DD-LLaVA-X (SNUH) |
| 12 | VLCI-IU (SYSU) | DD-LLaVA-X (SNUH) | Cvt2distilgpt2-IU (CSIRO) | CXRMate-ED (CSIRO) |
| 13 | RadPhi3.5Vision (Microsoft) | RaDialog (TUM) | zzy-RGv1-8b (Individual) | MedVersa (Harvard) |
| 14 | RGRG (TUM) | CheXpertPlus-MIMIC (Stanford) | CXRMate-ED (CSIRO) | Libra (University of Glasgow) |
| 15 | DD-LLaVA-X (SNUH) | CX-Mind (SJTU) | DD-LLaVA-X (SNUH) | RaDialog (TUM) |
| 16 | Libra (University of Glasgow) | RGRG (TUM) | RadFM (SJTU) | MedGemma (Google) |
| 17 | RaDialog (TUM) | MedGemma (Google) | CheXpertPlus-CheX-MIMIC (Stanford) | RGRG (TUM) |
| 18 | CXRMate-ED (CSIRO) | CheXagent (Stanford) | Libra (University of Glasgow) | CheXpertPlus-MIMIC (Stanford) |
| 19 | Cvt2distilgpt2-MIMIC (CSIRO) | MoERad-MIMIC (IIT Madras) | RGRG (TUM) | CX-Mind (SJTU) |
| 20 | Cvt2distilgpt2-IU (CSIRO) | Cvt2distilgpt2-MIMIC (CSIRO) | RadPhi3.5Vision (Microsoft) | MoERad-MIMIC (IIT Madras) |
| 21 | CheXpertPlus-CheX-MIMIC (Stanford) | CheXpertPlus-CheX (Stanford) | Cvt2distilgpt2-MIMIC (CSIRO) | CheXagent (Stanford) |
| 22 | CXRMate-RRG24 (CSIRO) | MAIRA-2 (Microsoft) | RaDialog (TUM) | MedGemma 1.5 4B (Google) |
| 23 | CheXpertPlus-CheX (Stanford) | VLCI-MIMIC (SYSU) | RadPhi4VisionCXR (Microsoft Research) | Cvt2distilgpt2-MIMIC (CSIRO) |
| 24 | CheXpertPlus-MIMIC (Stanford) | MedGemma 1.5 4B (Google) | MoERad-MIMIC (IIT Madras) | MoERad-IU (IIT Madras) |
| 25 | RadFM (SJTU) | RadFM (SJTU) | CheXpertPlus-MIMIC (Stanford) | VLCI-MIMIC (SYSU) |
| 26 | BiomedGPT-IU (Lehigh University) | MoERad-IU (IIT Madras) | BiomedGPT-IU (Lehigh University) | Cvt2distilgpt2-IU (CSIRO) |
| 27 | MoERad-MIMIC (IIT Madras) | Cvt2distilgpt2-IU (CSIRO) | CheXpertPlus-CheX (Stanford) | RadFM (SJTU) |
| 28 | VLCI-MIMIC (SYSU) | VLCI-IU (SYSU) | VLCI-MIMIC (SYSU) | GPT4V (OpenAI) |
| 29 | CheXagent (Stanford) | GPT4V (OpenAI) | CheXagent (Stanford) | VLCI-IU (SYSU) |
| 30 | GPT4V (OpenAI) | BiomedGPT-IU (Lehigh University) | GPT4V (OpenAI) | BiomedGPT-IU (Lehigh University) |
| 31 | LLM-CXR (KAIST) | LLM-CXR (KAIST) | LLM-CXR (KAIST) | LLM-CXR (KAIST) |

ReXrank Challenge V2.0 Leaderboard (VQA)

| Rank | ReXVQA |
|---|---|
| 1 | RadPhi4VisionCXR (Microsoft Research) |
| 2 | CheXOne-R1 (Stanford) |
| 3 | MedGemma-4B-it (Google) |
| 4 | Eagle2-9B (NVIDIA) |
| 5 | Qwen2.5VL-7B-Instruct (Alibaba) |
| 6 | Qwen2VL-7B-Instruct (Alibaba) |
| 7 | Janus-Pro-7B (DeepSeek) |
| 8 | Phi35-Vision-Instruct (Microsoft) |
| 9 | LLaVA-1.5-7B (Meta) |

ReXrank Challenge V1.0 - RRG Performance on ReXGradient

ReXGradient is a large-scale private test dataset containing 10,000 studies collected from different medical centers in the US.

| Rank | Model | 1/RadCliQ-v1 | BLEU | BertScore | SembScore | RadGraph | RaTEScore | GREEN | CRIMSON |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CheXagent (Stanford, 2024) | 0.674 | 0.093 | 0.305 | 0.366 | 0.08 | 0.428 | 0.241 | 0.339 |
| 2 | CheXpertPlus-MIMIC (Stanford, 2024) | 0.777 | 0.154 | 0.341 | 0.442 | 0.13 | 0.501 | 0.52 | 0.342 |
| 3 | CheXpertPlus-CheX (Stanford, 2024) | 0.787 | 0.143 | 0.361 | 0.431 | 0.124 | 0.476 | 0.411 | 0.121 |
| 4 | CheXpertPlus-CheX-MIMIC (Stanford, 2024) | 0.83 | 0.169 | 0.372 | 0.442 | 0.154 | 0.517 | 0.489 | 0.358 |
| 5 | Cvt2distilgpt2-MIMIC (CSIRO, 2023) | 0.866 | 0.186 | 0.374 | 0.46 | 0.176 | 0.524 | 0.514 | 0.368 |
| 6 | Cvt2distilgpt2-IU (CSIRO, 2023) | 0.842 | 0.178 | 0.395 | 0.405 | 0.167 | 0.52 | 0.47 | 0.357 |
| 7 | MedVersa (Harvard, 2024) | 1.008 | 0.21 | 0.431 | 0.498 | 0.202 | 0.527 | 0.532 | 0.382 |
| 8 | RadFM (SJTU, 2023) | 0.775 | 0.157 | 0.365 | 0.392 | 0.135 | 0.504 | 0.406 | 0.154 |
| 9 | RaDialog (TUM, 2023) | 0.876 | 0.188 | 0.402 | 0.45 | 0.158 | 0.522 | 0.435 | 0.269 |
| 10 | RGRG (TUM, 2023) | 0.888 | 0.19 | 0.391 | 0.47 | 0.169 | 0.54 | 0.487 | 0.246 |
| 11 | VLCI-MIMIC (SYSU, 2023) | 0.721 | 0.157 | 0.31 | 0.402 | 0.122 | 0.488 | 0.477 | 0.317 |
| 12 | VLCI-IU (SYSU, 2023) | 0.897 | 0.214 | 0.365 | 0.467 | 0.215 | 0.573 | 0.536 | 0.35 |
| 13 | LLM-CXR (KAIST, 2024) | 0.507 | 0.043 | 0.182 | 0.142 | 0.029 | 0.317 | 0.044 | -0.424 |
| 14 | GPT4V (OpenAI, 2024) | 0.629 | 0.075 | 0.214 | 0.337 | 0.138 | 0.47 | 0.497 | 0.141 |
| 15 | BiomedGPT-IU (Lehigh University, 2024) | 0.771 | 0.099 | 0.317 | 0.437 | 0.157 | 0.472 | 0.388 | 0.334 |
| 16 | MAIRA-2 (Microsoft, 2024) | 0.963 | 0.205 | 0.436 | 0.462 | 0.187 | 0.559 | 0.531 | 0.391 |
| 17 | CXRMate-ED (CSIRO, 2024) | 0.872 | 0.202 | 0.398 | 0.415 | 0.187 | 0.564 | 0.518 | 0.293 |
| 18 | CXRMate-RRG24 (CSIRO, 2024) | 0.792 | 0.15 | 0.327 | 0.462 | 0.152 | 0.518 | 0.408 | 0.379 |
| 19 | Libra (University of Glasgow, 2024) | 0.881 | 0.165 | 0.385 | 0.474 | 0.168 | 0.544 | 0.555 | 0.344 |
| 20 | MoERad-IU (IIT Madras, 2025) | 1.018 | 0.227 | 0.434 | 0.446 | 0.247 | 0.575 | 0.494 | 0.363 |
| 21 | MoERad-MIMIC (IIT Madras, 2025) | 0.756 | 0.145 | 0.351 | 0.406 | 0.116 | 0.508 | 0.431 | 0.249 |
| 22 | RadPhi3.5Vision (Microsoft, 2025) | 0.891 | 0.209 | 0.383 | 0.488 | 0.169 | 0.544 | 0.453 | 0.314 |
| 23 | DD-LLaVA-X (SNUH, 2025) | 0.886 | 0.166 | 0.387 | 0.469 | 0.174 | 0.542 | 0.504 | 0.316 |
| 24 | MedGemma (Google, 2025) | 1.008 | 0.2 | 0.427 | 0.479 | 0.223 | 0.617 | 0.566 | 0.282 |
| 25 | UniRG-CXR (Microsoft Research, 2025) | 1.622 | 0.291 | 0.538 | 0.576 | 0.298 | 0.618 | 0.469 | 0.315 |
| 26 | RadPhi4VisionCXR (Microsoft Research, 2025) | 1.348 | 0.295 | 0.477 | 0.536 | 0.311 | 0.618 | 0.492 | 0.425 |
| 27 | CheXOne-R1 (Stanford, 2025) | 1.116 | 0.229 | 0.483 | 0.498 | 0.21 | 0.535 | 0.428 | 0.127 |
| 29 | CX-Mind (SJTU, 2025) | 0.917 | 0.181 | 0.424 | 0.454 | 0.167 | 0.574 | 0.499 | 0.339 |
| 30 | MedGemma 1.5 4B (Google, 2026) | 0.957 | 0.19 | 0.37 | 0.5 | 0.238 | 0.61 | 0.542 | 0.288 |
| 31 | zzy-RGv2-2b (Individual, 2025) | 1.496 | 0.284 | 0.53 | 0.556 | 0.284 | 0.627 | 0.515 | 0.417 |
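The rows above are not listed in score order. The ReXGradient column of the summary leaderboard does, however, match a descending sort on 1/RadCliQ-v1. The sketch below reproduces that ordering for a handful of entries, assuming higher 1/RadCliQ-v1 is better (the reciprocal of a lower-is-better score) and that this column is the primary ranking key. The function name is illustrative and is not taken from the ReXrank codebase.

```python
# A few (model, 1/RadCliQ-v1) pairs copied from the ReXGradient results.
scores = {
    "CheXagent": 0.674,
    "MedVersa": 1.008,
    "UniRG-CXR": 1.622,
    "RadPhi4VisionCXR": 1.348,
    "CheXOne-R1": 1.116,
    "zzy-RGv2-2b": 1.496,
}


def rank_by_score(model_scores: dict[str, float]) -> list[str]:
    """Return model names ordered from highest to lowest score."""
    return sorted(model_scores, key=model_scores.get, reverse=True)


ranking = rank_by_score(scores)
for position, model in enumerate(ranking, start=1):
    # Positions here are relative to this subset, not the full leaderboard.
    print(position, model, scores[model])
```

The relative order of these six models agrees with their relative order in the ReXGradient column of the summary leaderboard.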

ReXrank Challenge V1.0 - RRG Performance on MIMIC-CXR

MIMIC-CXR contains 377,110 images corresponding to 227,835 radiographic studies performed at the Beth Israel Deaconess Medical Center in Boston, MA. We follow the official split of MIMIC-CXR in the following experiments.

| Rank | Model | 1/RadCliQ-v1 | BLEU | BertScore | SembScore | RadGraph | RaTEScore | GREEN | CRIMSON |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CheXagent (Stanford, 2024) | 0.741 | 0.113 | 0.346 | 0.347 | 0.148 | 0.474 | 0.257 | 0.096 |
| 2 | CheXpertPlus-MIMIC (Stanford, 2024) | 0.788 | 0.145 | 0.361 | 0.375 | 0.17 | 0.485 | 0.311 | 0.107 |
| 3 | CheXpertPlus-CheX (Stanford, 2024) | 0.698 | 0.077 | 0.314 | 0.325 | 0.142 | 0.469 | 0.225 | 0.026 |
| 4 | CheXpertPlus-CheX-MIMIC (Stanford, 2024) | 0.805 | 0.142 | 0.367 | 0.379 | 0.181 | 0.49 | 0.305 | 0.132 |
| 5 | Cvt2distilgpt2-MIMIC (CSIRO, 2023) | 0.719 | 0.126 | 0.331 | 0.329 | 0.149 | 0.432 | 0.268 | 0.09 |
| 6 | Cvt2distilgpt2-IU (CSIRO, 2023) | 0.613 | 0.055 | 0.303 | 0.191 | 0.103 | 0.448 | 0.164 | 0.042 |
| 7 | MedVersa (Harvard, 2024) | 1.103 | 0.209 | 0.448 | 0.466 | 0.273 | 0.55 | 0.374 | 0.17 |
| 8 | RadFM (SJTU, 2023) | 0.65 | 0.087 | 0.313 | 0.259 | 0.109 | 0.45 | 0.185 | -0.074 |
| 9 | RaDialog (TUM, 2023) | 0.799 | 0.127 | 0.363 | 0.387 | 0.172 | 0.485 | 0.273 | 0.035 |
| 10 | RGRG (TUM, 2023) | 0.755 | 0.13 | 0.348 | 0.344 | 0.168 | 0.491 | 0.273 | -0.057 |
| 11 | VLCI-MIMIC (SYSU, 2023) | 0.68 | 0.136 | 0.304 | 0.305 | 0.14 | 0.45 | 0.256 | 0.016 |
| 12 | VLCI-IU (SYSU, 2023) | 0.599 | 0.075 | 0.263 | 0.212 | 0.109 | 0.449 | 0.21 | 0.038 |
| 13 | LLM-CXR (KAIST, 2024) | 0.516 | 0.037 | 0.181 | 0.156 | 0.046 | 0.341 | 0.043 | -0.4 |
| 14 | GPT4V (OpenAI, 2024) | 0.558 | 0.068 | 0.207 | 0.214 | 0.084 | 0.423 | 0.161 | -0.158 |
| 15 | BiomedGPT-IU (Lehigh University, 2024) | 0.544 | 0.02 | 0.192 | 0.224 | 0.059 | 0.36 | 0.123 | -0.018 |
| 16 | MAIRA-2 (Microsoft, 2024) | 0.694 | 0.088 | 0.308 | 0.339 | 0.131 | 0.517 | 0.224 | 0.075 |
| 17 | CXRMate-ED (CSIRO, 2024) | 0.872 | 0.208 | 0.383 | 0.396 | 0.223 | 0.531 | 0.327 | 0.102 |
| 18 | CXRMate-RRG24 (CSIRO, 2024) | 0.87 | 0.198 | 0.367 | 0.423 | 0.22 | 0.521 | 0.338 | 0.153 |
| 19 | Libra (University of Glasgow, 2024) | 0.898 | 0.232 | 0.402 | 0.403 | 0.218 | 0.523 | 0.356 | 0.152 |
| 20 | MoERad-IU (IIT Madras, 2025) | 0.643 | 0.064 | 0.321 | 0.213 | 0.122 | 0.455 | 0.174 | 0.043 |
| 21 | MoERad-MIMIC (IIT Madras, 2025) | 0.726 | 0.163 | 0.341 | 0.334 | 0.143 | 0.465 | 0.24 | -0.016 |
| 22 | RadPhi3.5Vision (Microsoft, 2025) | 0.888 | 0.223 | 0.386 | 0.431 | 0.207 | 0.534 | 0.294 | 0.083 |
| 23 | DD-LLaVA-X (SNUH, 2025) | 0.801 | 0.154 | 0.348 | 0.402 | 0.182 | 0.505 | 0.301 | 0.114 |
| 24 | MedGemma (Google, 2025) | 0.744 | 0.165 | 0.346 | 0.339 | 0.159 | 0.549 | 0.293 | 0.082 |
| 25 | UniRG-CXR (Microsoft Research, 2025) | 1.233 | 0.262 | 0.492 | 0.496 | 0.269 | 0.602 | 0.36 | 0.181 |
| 26 | CheXOne-R1 (Stanford, 2025) | 1.06 | 0.218 | 0.461 | 0.455 | 0.235 | 0.519 | 0.314 | 0.08 |
| 27 | RadPhi4VisionCXR (Microsoft Research, 2025) | 1.033 | 0.234 | 0.444 | 0.439 | 0.251 | 0.584 | 0.351 | 0.196 |
| 29 | CX-Mind (SJTU, 2025) | 0.782 | 0.15 | 0.385 | 0.319 | 0.174 | 0.543 | 0.267 | 0.062 |
| 30 | MedGemma 1.5 4B (Google, 2026) | 0.678 | 0.113 | 0.24 | 0.355 | 0.18 | 0.565 | 0.284 | 0.006 |
| 31 | zzy-RGv2-2b (Individual, 2025) | 1.267 | 0.237 | 0.488 | 0.495 | 0.292 | 0.603 | 0.366 | 0.263 |

ReXrank Challenge V1.0 - RRG Performance on IU Xray

IU Xray contains 7,470 pairs of radiology reports and chest X-rays from Indiana University. We follow the split given by R2Gen.

| Rank | Model | 1/RadCliQ-v1 | BLEU | BertScore | SembScore | RadGraph | RaTEScore | GREEN | CRIMSON |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CheXagent (Stanford, 2024) | 0.827 | 0.116 | 0.353 | 0.488 | 0.139 | 0.503 | 0.389 | 0.633 |
| 2 | CheXpertPlus-MIMIC (Stanford, 2024) | 0.988 | 0.178 | 0.386 | 0.593 | 0.169 | 0.585 | 0.661 | 0.619 |
| 3 | CheXpertPlus-CheX (Stanford, 2024) | 0.92 | 0.157 | 0.413 | 0.495 | 0.153 | 0.534 | 0.541 | 0.559 |
| 4 | CheXpertPlus-CheX-MIMIC (Stanford, 2024) | 1.179 | 0.198 | 0.453 | 0.593 | 0.211 | 0.618 | 0.648 | 0.632 |
| 5 | Cvt2distilgpt2-MIMIC (CSIRO, 2023) | 1.126 | 0.199 | 0.422 | 0.609 | 0.209 | 0.606 | 0.682 | 0.632 |
| 6 | Cvt2distilgpt2-IU (CSIRO, 2023) | 1.283 | 0.244 | 0.482 | 0.548 | 0.265 | 0.62 | 0.686 | 0.654 |
| 7 | MedVersa (Harvard, 2024) | 1.46 | 0.206 | 0.527 | 0.606 | 0.235 | 0.65 | 0.631 | 0.623 |
| 8 | RadFM (SJTU, 2023) | 1.187 | 0.2 | 0.459 | 0.566 | 0.23 | 0.627 | 0.615 | 0.562 |
| 9 | RaDialog (TUM, 2023) | 1.086 | 0.201 | 0.444 | 0.544 | 0.205 | 0.586 | 0.586 | 0.532 |
| 10 | RGRG (TUM, 2023) | 1.174 | 0.216 | 0.437 | 0.602 | 0.223 | 0.62 | 0.665 | 0.481 |
| 11 | VLCI-MIMIC (SYSU, 2023) | 0.913 | 0.139 | 0.364 | 0.483 | 0.22 | 0.578 | 0.474 | -0.036 |
| 12 | VLCI-IU (SYSU, 2023) | 1.381 | 0.268 | 0.455 | 0.619 | 0.288 | 0.679 | 0.698 | 0.654 |
| 13 | LLM-CXR (KAIST, 2024) | 0.486 | 0.033 | 0.186 | 0.057 | 0.023 | 0.28 | 0.025 | -0.554 |
| 14 | GPT4V (OpenAI, 2024) | 0.708 | 0.076 | 0.274 | 0.405 | 0.146 | 0.517 | 0.651 | 0.52 |
| 15 | BiomedGPT-IU (Lehigh University, 2024) | 0.956 | 0.142 | 0.375 | 0.522 | 0.213 | 0.543 | 0.523 | 0.654 |
| 16 | MAIRA-2 (Microsoft, 2024) | 1.298 | 0.219 | 0.477 | 0.604 | 0.233 | 0.627 | 0.194 | 0.552 |
| 17 | CXRMate-ED (CSIRO, 2024) | 1.22 | 0.225 | 0.464 | 0.557 | 0.249 | 0.655 | 0.685 | 0.534 |
| 18 | CXRMate-RRG24 (CSIRO, 2024) | 1.458 | 0.245 | 0.456 | 0.638 | 0.302 | 0.666 | 0.68 | 0.633 |
| 19 | Libra (University of Glasgow, 2024) | 1.176 | 0.183 | 0.441 | 0.614 | 0.21 | 0.624 | 0.698 | 0.642 |
| 20 | MoERad-IU (IIT Madras, 2025) | 1.922 | 0.277 | 0.525 | 0.641 | 0.341 | 0.684 | 0.665 | 0.656 |
| 21 | MoERad-MIMIC (IIT Madras, 2025) | 1.02 | 0.171 | 0.42 | 0.559 | 0.178 | 0.603 | 0.584 | 0.573 |
| 22 | RadPhi3.5Vision (Microsoft, 2025) | 1.166 | 0.248 | 0.433 | 0.607 | 0.22 | 0.634 | 0.597 | 0.598 |
| 23 | DD-LLaVA-X (SNUH, 2025) | 1.204 | 0.189 | 0.443 | 0.6 | 0.233 | 0.636 | 0.671 | 0.635 |
| 24 | MedGemma (Google, 2025) | 1.34 | 0.217 | 0.475 | 0.6 | 0.26 | 0.678 | 0.724 | 0.524 |
| 25 | UniRG-CXR (Microsoft Research, 2025) | 4.804 | 0.376 | 0.636 | 0.701 | 0.398 | 0.729 | 0.664 | 0.632 |
| 26 | CheXOne-R1 (Stanford, 2025) | 1.669 | 0.265 | 0.542 | 0.611 | 0.28 | 0.617 | 0.585 | 0.49 |
| 27 | CX-Mind (SJTU, 2025) | 1.328 | 0.205 | 0.489 | 0.603 | 0.232 | 0.662 | 0.644 | 0.648 |
| 29 | RadPhi4VisionCXR (Microsoft Research, 2025) | 1.068 | 0.214 | 0.43 | 0.521 | 0.23 | 0.589 | 0.548 | 0.642 |
| 30 | MedGemma 1.5 4B (Google, 2026) | 1.385 | 0.208 | 0.457 | 0.641 | 0.271 | 0.675 | 0.706 | 0.561 |
| 31 | zzy-RGv2-2b (Individual, 2025) | 2.43 | 0.305 | 0.584 | 0.656 | 0.34 | 0.684 | 0.632 | 0.615 |

ReXrank Challenge V1.0 - RRG Performance on CheXpert Plus

CheXpert Plus contains 223,228 unique pairs of radiology reports and chest X-rays from 187,711 studies and 64,725 patients. We follow the official split of CheXpert Plus in the following experiments and use the validation set for evaluation.

| Rank | Model | 1/RadCliQ-v1 | BLEU | BertScore | SembScore | RadGraph | RaTEScore | GREEN | CRIMSON |
|---|---|---|---|---|---|---|---|---|---|
| 1 | CheXagent (Stanford, 2024) | 0.638 | 0.123 | 0.278 | 0.269 | 0.125 | 0.434 | 0.183 | 0.102 |
| 2 | CheXpertPlus-MIMIC (Stanford, 2024) | 0.663 | 0.14 | 0.292 | 0.294 | 0.134 | 0.43 | 0.238 | 0.1 |
| 3 | CheXpertPlus-CheX (Stanford, 2024) | 0.786 | 0.15 | 0.342 | 0.377 | 0.191 | 0.487 | 0.237 | 0.117 |
| 4 | CheXpertPlus-CheX-MIMIC (Stanford, 2024) | 0.808 | 0.153 | 0.335 | 0.404 | 0.207 | 0.497 | 0.274 | 0.212 |
| 5 | Cvt2distilgpt2-MIMIC (CSIRO, 2023) | 0.626 | 0.124 | 0.267 | 0.266 | 0.119 | 0.42 | 0.215 | 0.124 |
| 6 | Cvt2distilgpt2-IU (CSIRO, 2023) | 0.577 | 0.084 | 0.267 | 0.155 | 0.098 | 0.382 | 0.147 | 0.032 |
| 7 | MedVersa (Harvard, 2024) | 0.719 | 0.129 | 0.323 | 0.344 | 0.147 | 0.47 | 0.243 | 0.086 |
| 8 | RadFM (SJTU, 2023) | 0.572 | 0.081 | 0.235 | 0.216 | 0.08 | 0.396 | 0.096 | -0.179 |
| 9 | RaDialog (TUM, 2023) | 0.709 | 0.131 | 0.312 | 0.353 | 0.138 | 0.445 | 0.211 | 0.111 |
| 10 | RGRG (TUM, 2023) | 0.674 | 0.154 | 0.315 | 0.274 | 0.14 | 0.453 | 0.216 | -0.011 |
| 11 | VLCI-MIMIC (SYSU, 2023) | 0.589 | 0.12 | 0.229 | 0.251 | 0.101 | 0.384 | 0.165 | 0.018 |
| 12 | VLCI-IU (SYSU, 2023) | 0.555 | 0.106 | 0.22 | 0.17 | 0.094 | 0.418 | 0.194 | 0.04 |
| 13 | LLM-CXR (KAIST, 2024) | 0.519 | 0.041 | 0.162 | 0.211 | 0.037 | 0.321 | 0.022 | -0.433 |
| 14 | GPT4V (OpenAI, 2024) | 0.568 | 0.081 | 0.215 | 0.234 | 0.082 | 0.415 | 0.152 | -0.052 |
| 15 | BiomedGPT-IU (Lehigh University, 2024) | 0.552 | 0.022 | 0.2 | 0.241 | 0.056 | 0.351 | 0.118 | -0.001 |
| 16 | MAIRA-2 (Microsoft, 2024) | 0.788 | 0.163 | 0.359 | 0.355 | 0.189 | 0.485 | 0.273 | 0.238 |
| 17 | CXRMate-ED (CSIRO, 2024) | 0.723 | 0.157 | 0.324 | 0.316 | 0.175 | 0.498 | 0.265 | 0.157 |
| 18 | CXRMate-RRG24 (CSIRO, 2024) | 0.801 | 0.157 | 0.315 | 0.411 | 0.218 | 0.521 | 0.276 | 0.151 |
| 19 | Libra (University of Glasgow, 2024) | 0.718 | 0.157 | 0.319 | 0.323 | 0.169 | 0.466 | 0.253 | 0.17 |
| 20 | MoERad-IU (IIT Madras, 2025) | 0.595 | 0.075 | 0.284 | 0.175 | 0.102 | 0.39 | 0.127 | 0.032 |
| 21 | MoERad-MIMIC (IIT Madras, 2025) | 0.641 | 0.122 | 0.267 | 0.3 | 0.12 | 0.434 | 0.166 | -0.033 |
| 22 | RadPhi3.5Vision (Microsoft, 2025) | 0.86 | 0.198 | 0.353 | 0.437 | 0.217 | 0.51 | 0.243 | 0.073 |
| 23 | DD-LLaVA-X (SNUH, 2025) | 0.753 | 0.085 | 0.318 | 0.385 | 0.172 | 0.476 | 0.206 | 0.246 |
| 24 | MedGemma (Google, 2025) | 0.706 | 0.147 | 0.328 | 0.325 | 0.137 | 0.511 | 0.246 | 0.111 |
| 25 | UniRG-CXR (Microsoft Research, 2025) | 1.156 | 0.217 | 0.445 | 0.527 | 0.262 | 0.581 | 0.306 | 0.192 |
| 26 | CheXOne-R1 (Stanford, 2025) | 1.048 | 0.18 | 0.43 | 0.487 | 0.243 | 0.522 | 0.25 | 0.094 |
| 27 | RadPhi4VisionCXR (Microsoft Research, 2025) | 0.85 | 0.163 | 0.348 | 0.448 | 0.205 | 0.541 | 0.282 | 0.213 |
| 29 | CX-Mind (SJTU, 2025) | 0.656 | 0.112 | 0.317 | 0.238 | 0.132 | 0.498 | 0.179 | 0.033 |
| 30 | MedGemma 1.5 4B (Google, 2026) | 0.632 | 0.1 | 0.201 | 0.362 | 0.142 | 0.524 | 0.273 | 0.063 |
| 31 | zzy-RGv2-2b (Individual, 2025) | 1.111 | 0.179 | 0.433 | 0.508 | 0.264 | 0.57 | 0.297 | 0.352 |

ReXrank Challenge V2.0 - VQA Performance on ReXVQA

Performance comparison of various vision-language models on medical VQA tasks.

| Rank | Model | Overall Accuracy | Differential Diagnosis | Geometric Information | Location Assessment | Negation Assessment | Presence Assessment |
|---|---|---|---|---|---|---|---|
| 1 | RadPhi4VisionCXR (Microsoft Research) | 0.9083 | 0.8975 | 0.7598 | 0.8193 | 0.9493 | 0.8887 |
| 2 | CheXOne-R1 (Stanford) | 0.8803 | 0.8447 | 0.7374 | 0.8038 | 0.9110 | 0.8799 |
| 3 | MedGemma-4B-it (Google) | 0.8344 | 0.8190 | 0.6872 | 0.7504 | 0.8979 | 0.7935 |
| 4 | Eagle2-9B (NVIDIA) | 0.6939 | 0.7212 | 0.5084 | 0.5761 | 0.8568 | 0.5403 |
| 5 | Qwen2.5VL-7B-Instruct (Alibaba) | 0.6867 | 0.7451 | 0.5642 | 0.5609 | 0.8802 | 0.4892 |
| 6 | Qwen2VL-7B-Instruct (Alibaba) | 0.6835 | 0.7661 | 0.6760 | 0.6160 | 0.7048 | 0.6363 |
| 7 | Janus-Pro-7B (DeepSeek) | 0.6270 | 0.6923 | 0.6592 | 0.6235 | 0.6764 | 0.5483 |
| 8 | Phi35-Vision-Instruct (Microsoft) | 0.6180 | 0.6990 | 0.4469 | 0.5252 | 0.7527 | 0.4643 |
| 9 | LLaVA-1.5-7B (Meta) | 0.2632 | 0.2628 | 0.3464 | 0.2571 | 0.2576 | 0.2690 |
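For readers who want to see how an overall accuracy and per-category accuracies can relate, here is a minimal sketch. It assumes accuracy is the fraction of questions answered correctly and that the overall figure is computed over all questions pooled together. The record layout, the function name, and the sample answers are hypothetical and are not taken from the ReXVQA evaluation code.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Compute pooled overall accuracy and per-category accuracy.

    `records` is an iterable of (category, predicted, gold) tuples.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, predicted, gold in records:
        totals[category] += 1
        if predicted == gold:
            correct[category] += 1
    per_category = {c: correct[c] / totals[c] for c in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return overall, per_category


# Made-up multiple-choice answers, for illustration only.
records = [
    ("Presence Assessment", "A", "A"),
    ("Presence Assessment", "B", "C"),
    ("Negation Assessment", "D", "D"),
    ("Location Assessment", "A", "A"),
]
overall, per_category = accuracy_by_category(records)
print(overall, per_category)
```

Because the pooled figure weights each category by its number of questions, it need not equal the unweighted mean of the category columns. In the first row of the table, for example, the five category scores average about 0.863 while the reported overall accuracy is 0.9083.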