# Scholarly papers describing the methodology

Transformation models have been around for more than 50 years, starting with the seminal paper by George Box and Sir David Cox, published in 1964, that introduced the “Box-Cox” power transformations. Later developments focused on a semiparametric understanding of these models, most importantly the partial likelihood approach to parameter estimation in the Cox proportional hazards model.

During the last decade, fully parametric versions of transformation models have been studied. Model inference is much simpler once all components of the models have been parametrised appropriately. Research on transformation models implemented in the mlt add-on package started with a gradient-boosting algorithm for conditional transformation models (Hothorn, Kneib, and Bühlmann, 2014). This algorithm optimises the Brier score for model estimation. It turned out that maximum likelihood estimation is computationally and conceptually much simpler and also helps to estimate models for discrete or censored data (Hothorn, Möst, and Bühlmann, 2018). So-called most likely transformations are implemented in the mlt add-on package (Hothorn, 2020b). Special attention to count transformation models is given by Siegfried and Hothorn (2020).
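To make the remark about discrete and censored data concrete, the following is a minimal sketch of the log-likelihood contributions in a fully parametric transformation model. The notation is an assumption chosen for illustration, not taken from this page: $h(\cdot \mid \vartheta)$ is a monotone increasing transformation function with parameters $\vartheta$, and $F_Z$ is a fixed continuous reference distribution (for example standard normal or logistic) with density $f_Z$.

```latex
% Assumed notation: h = transformation function, \vartheta = its parameters,
% F_Z / f_Z = reference distribution function / density.
\begin{align*}
  \mathbb{P}(Y \le y) &= F_Z\bigl(h(y \mid \vartheta)\bigr) \\
  \ell_i(\vartheta) &=
  \begin{cases}
    \log f_Z\bigl(h(y_i \mid \vartheta)\bigr) + \log h'(y_i \mid \vartheta)
      & \text{exact observation } y_i, \\[4pt]
    \log\Bigl[F_Z\bigl(h(\bar{y}_i \mid \vartheta)\bigr)
      - F_Z\bigl(h(\underline{y}_i \mid \vartheta)\bigr)\Bigr]
      & \text{observation known only to lie in } (\underline{y}_i, \bar{y}_i]
  \end{cases}
\end{align*}
```

Under these assumptions, discrete, interval-censored, left-censored, and right-censored observations all use the second form; for left or right censoring the corresponding term $F_Z(h(\cdot))$ is replaced by 0 or 1, respectively.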

A generalisation of binary logistic regression models to continuous outcomes, featuring parameters interpretable as log-odds ratios, was described in Lohse, Rohrmann, Faeh, and Hothorn (2017). Simple transformation models as well as more complex transformation models (for example transformation trees and forests) for body mass index distributions are discussed in Hothorn (2018). Transformation forests, and their underlying transformation tree algorithm, were established in Hothorn and Zeileis (2021), providing a statistical learning approach for computing fully parametric predictive distributions. Two likelihood-based boosting methods for transformation models are introduced in Hothorn (2020a).

Multivariate transformation models, where the marginal distributions can be understood as univariate transformation models and their joint distribution is characterised by a (Gaussian) copula, are described in Klein, Hothorn, Barbanti, and Kneib (2022).

**References**

[1]
T. Hothorn, T. Kneib, and P. Bühlmann.
“Conditional Transformation Models”.
In: *Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 76.1 (2014), pp. 3–27.
DOI: 10.1111/rssb.12017.

[2]
T. Lohse, S. Rohrmann, D. Faeh, and T. Hothorn.
“Continuous Outcome Logistic Regression for Analyzing Body Mass Index Distributions”.
In: *F1000Research* 6 (2017), p. 1933.
DOI: 10.12688/f1000research.12934.1.

[3]
T. Hothorn.
“Top-Down Transformation Choice”.
In: *Statistical Modelling* 18.3–4 (2018), pp. 274–298.
DOI: 10.1177/1471082X17748081.

[4]
T. Hothorn, L. Möst, and P. Bühlmann.
“Most Likely Transformations”.
In: *Scandinavian Journal of Statistics* 45.1 (2018), pp. 110–134.
DOI: 10.1111/sjos.12291.

[5]
T. Hothorn.
“Most Likely Transformations: The mlt Package”.
In: *Journal of Statistical Software* 92.1 (2020), pp. 1–68.
DOI: 10.18637/jss.v092.i01.

[6]
T. Hothorn.
“Transformation Boosting Machines”.
In: *Statistics and Computing* 30 (2020), pp. 141–152.
DOI: 10.1007/s11222-019-09870-4.

[7]
S. Siegfried and T. Hothorn.
“Count Transformation Models”.
In: *Methods in Ecology and Evolution* 11.7 (2020), pp. 818–827.
DOI: 10.1111/2041-210X.13383.

[8]
T. Hothorn and A. Zeileis.
“Predictive Distribution Modelling Using Transformation Forests”.
In: *Journal of Computational and Graphical Statistics* 30.4 (2021), pp. 1181–1196.
DOI: 10.1080/10618600.2021.1872581.

[9]
N. Klein, T. Hothorn, L. Barbanti, and T. Kneib.
“Multivariate Conditional Transformation Models”.
In: *Scandinavian Journal of Statistics* 49 (2022), pp. 116–142.
DOI: 10.1111/sjos.12501.