Double-blind evaluation and benchmarking of survival models in a multi-centre study