Based on a rich data set of recoveries donated by a debt collection business, recovery rates for non-performing loans taken from a single European country are modelled using linear regression, linear regression with Lasso, beta regression and inflated beta regression. We also propose a two-stage model: a beta mixture model combined with a logistic regression model. The proposed model captures the multimodal distribution we find for these recovery rates. All models are built using loan characteristics, default data and collections data available prior to purchase by the debt collection business. The intended use of the models is to estimate future recovery rates for improved risk assessment, capital requirement calculations and bad debt management. The models are compared using a range of quantitative performance measures under K-fold cross-validation. Among all the models, we find that the proposed two-stage beta mixture model performs best.
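To make the two-stage idea concrete, the following is a minimal sketch in Python, not the implementation used in this work. The text above does not state how the two stages are combined, so the sketch assumes one plausible reading: a beta mixture describes the multimodal recovery rates, a logistic model (written here as a softmax so that it covers more than two components) gives the probability that a loan belongs to each mixture component, and the predicted recovery rate is the probability-weighted average of the component means. All function names, coefficients and parameter values are hypothetical and chosen for illustration only.

```python
import math

def beta_pdf(y, a, b):
    """Density of a Beta(a, b) distribution at y, for 0 < y < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(y) + (b - 1) * math.log(1 - y))

def mixture_pdf(y, weights, params):
    """Density of a beta mixture: weighted sum of beta component densities."""
    return sum(w * beta_pdf(y, a, b) for w, (a, b) in zip(weights, params))

def softmax(scores):
    """Convert linear scores to probabilities that sum to one."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_recovery(x, coefs, params):
    """Assumed combination rule: membership probabilities times component means.

    x      : list of loan features
    coefs  : one list per component, [intercept, weight_1, weight_2, ...]
    params : one (a, b) pair per beta component
    """
    scores = [c[0] + sum(w * xi for w, xi in zip(c[1:], x)) for c in coefs]
    probs = softmax(scores)
    means = [a / (a + b) for a, b in params]  # mean of Beta(a, b) is a / (a + b)
    return sum(p * m for p, m in zip(probs, means))

# Hypothetical two-component example: one mode near 0.2, one near 0.8.
params = [(2.0, 8.0), (8.0, 2.0)]
coefs = [[0.0, 0.0], [0.0, 0.0]]  # zero coefficients give equal membership
prediction = predict_recovery([1.0], coefs, params)
```

In practice the component parameters and the logistic coefficients would be estimated from data, for example by fitting the mixture first and then regressing component membership on the loan characteristics; that estimation step is omitted here.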