# A practical method to quantify knowledge‐based DVH prediction accuracy and uncertainty with reference cohorts

### Abstract

The adoption of knowledge-based dose-volume histogram (DVH) prediction models for assessing organ-at-risk (OAR) sparing in radiotherapy necessitates quantification of prediction accuracy and uncertainty. Moreover, DVH prediction error bands should be readily interpretable as confidence intervals within which a known percentage of clinically acceptable DVHs is expected to fall. For cases where such DVH error bands are not available, we present an independent error quantification methodology using a local reference cohort of high-quality treatment plans, and apply it to two DVH prediction models, ORBIT-RT and RapidPlan, trained on the same set of 90 volumetric modulated arc therapy (VMAT) plans. Organ-at-risk DVH predictions from each model were then generated for a separate set of 45 prostate VMAT plans. These DVH predictions were compared to their corresponding clinical DVHs to define prediction errors, from which the prediction bias, prediction error variation, and root-mean-square error (RMSEpred) were calculated for the cohort. The empirical RMSEpred was then contrasted with the model-provided DVH error estimates. For all prostate OARs, above 50% Rx dose, ORBIT-RT prediction bias and prediction error were comparable to or less than those of RapidPlan. Above 80% Rx dose, prediction bias was less than 1% and prediction error was less than 3–4% for both models. As a result, above 50% Rx dose, ORBIT-RT RMSEpred was below that of RapidPlan, indicating slightly improved accuracy in this cohort. Because the bias is near zero, RMSEpred is readily interpretable as a canonical standard deviation, whose error band is expected to contain 68% of normally distributed clinical DVHs. By contrast, RapidPlan’s provided error band, although described in the literature as a standard deviation range, was slightly less predictive than RMSEpred (55–70% success), while the provided ORBIT-RT error band was confirmed to resemble an interquartile range (40–65% success), as described.
Clinicians can apply this methodology using their own institutions’ reference cohorts to (a) independently assess a knowledge-based model’s predictive accuracy of local treatment plans, and (b) interpret from any error band whether further OAR dose sparing is likely attainable.
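
The error quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the sign convention (clinical minus predicted), the use of the population standard deviation, and the toy numbers are all assumptions made for illustration. Each DVH is assumed to be a list of volume percentages sampled on a common dose grid.

```python
import math
from statistics import fmean, pstdev


def cohort_error_stats(predicted, clinical):
    """Per-dose-bin bias, error spread, and RMSE over a reference cohort.

    predicted, clinical: lists of DVHs (one per plan); each DVH is a list
    of volume percentages on the same dose grid.
    Assumed sign convention: error = clinical - predicted.
    """
    n_bins = len(predicted[0])
    stats = []
    for j in range(n_bins):
        errors = [c[j] - p[j] for p, c in zip(predicted, clinical)]
        bias = fmean(errors)                       # mean prediction error
        spread = pstdev(errors)                    # population std of the errors
        rmse = math.sqrt(fmean([e * e for e in errors]))
        stats.append({"bias": bias, "spread": spread, "rmse": rmse})
    return stats


def band_coverage(predicted, clinical, half_width):
    """Fraction of plans whose clinical DVH lies within
    predicted +/- half_width[j] at each dose bin j."""
    n_plans = len(predicted)
    coverage = []
    for j in range(len(half_width)):
        hits = sum(
            abs(c[j] - p[j]) <= half_width[j]
            for p, c in zip(predicted, clinical)
        )
        coverage.append(hits / n_plans)
    return coverage


# Toy cohort (hypothetical numbers): 4 plans, 2 dose bins.
predicted = [[50, 20], [40, 10], [60, 30], [55, 25]]
clinical = [[52, 21], [38, 9], [61, 30], [54, 26]]

stats = cohort_error_stats(predicted, clinical)
coverage = band_coverage(predicted, clinical, [s["rmse"] for s in stats])
for j, (s, cov) in enumerate(zip(stats, coverage)):
    print(f"bin {j}: bias={s['bias']:.2f} spread={s['spread']:.2f} "
          f"rmse={s['rmse']:.2f} coverage={cov:.0%}")
```

With the population standard deviation, the three quantities satisfy RMSE² = bias² + spread² at each dose bin, which is why a near-zero bias lets RMSEpred be read as a standard deviation. The same `band_coverage` function can be given a model-provided error band in place of the empirical RMSE to check what fraction of clinical DVHs that band actually contains.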

Publication: Journal of Applied Clinical Medical Physics