Record-keeping systems are becoming increasingly important for addressing safety problems in megaprojects. Because problems must be examined in fine detail to be handled properly, the recorded information is accumulating into large data sets. Owing to the large number of attributes and the varied types of information, these data are highly heterogeneous. The aim of this paper is to propose an innovative safety assessment methodology that predicts possible incident scenarios and determines preventative actions. The study introduces predictive models based on factual information and shows how to handle safety data with a significant level of heterogeneity. Latent Class Clustering Analysis (LCCA) was performed to reduce the heterogeneity and extract homogeneous subgroups from the data. For prediction, an Artificial Neural Network (ANN) and Case-Based Reasoning (CBR) were employed to estimate the outcome of incidents in terms of severity. CBR gave promising results and outperformed the ANN: it produced predictions with at most 18% error in 86% of the test cases. Moreover, the overall prediction accuracy for fatal incidents was 86.33%. As a final step, the study presents preventative actions to eliminate safety failures. Ultimately, the proposed methodology can help construction industry professionals examine future safety problems using the large-scale data collected. The study also provides the preventative measures to be implemented before and during the construction stage.
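
To illustrate the retrieve-and-reuse idea behind CBR, a minimal Python sketch follows. It is not the study's implementation: the attribute names, the severity scale in [0, 1], the simple-matching similarity measure, and the choice of k are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    attributes: dict  # hypothetical categorical incident attributes
    severity: float   # hypothetical severity score in [0, 1]


def similarity(a: dict, b: dict) -> float:
    """Fraction of shared attribute keys whose values match (simple matching)."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def predict_severity(query: dict, case_base: list, k: int = 3) -> float:
    """Retrieve the k most similar past cases and reuse their severities
    as a similarity-weighted average."""
    if not case_base:
        raise ValueError("case base is empty")
    ranked = sorted(
        case_base,
        key=lambda c: similarity(query, c.attributes),
        reverse=True,
    )[:k]
    weights = [similarity(query, c.attributes) for c in ranked]
    total = sum(weights)
    if total == 0:
        # No attribute overlap at all: fall back to the unweighted mean.
        return sum(c.severity for c in ranked) / len(ranked)
    return sum(w * c.severity for w, c in zip(weights, ranked)) / total


# Hypothetical case base and query, for illustration only.
case_base = [
    Case({"activity": "excavation", "hazard": "collapse", "shift": "night"}, 1.0),
    Case({"activity": "excavation", "hazard": "collapse", "shift": "day"}, 0.8),
    Case({"activity": "welding", "hazard": "burn", "shift": "day"}, 0.2),
]
query = {"activity": "excavation", "hazard": "collapse", "shift": "day"}
estimate = predict_severity(query, case_base, k=2)
```

In this sketch the two retrieved cases are weighted by how closely they match the query, so the estimate leans toward the severity of the most similar past incident. A practical system would likely weight attributes by importance and retrieve cases within a homogeneous subgroup, such as one produced by a clustering step.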