Quality assessment of web-based information on type 2 diabetes

Olcer D., Taşkaya Temizel T.

ONLINE INFORMATION REVIEW, vol.46, no.4, pp.715-732, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 46 Issue: 4
  • Publication Date: 2022
  • Doi Number: 10.1108/oir-02-2021-0089
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED)
  • Page Numbers: pp.715-732
  • Keywords: DISCERN, Website quality, Diabetes, Information quality, Information coverage, HEALTH INFORMATION, INTERNET INFORMATION
  • Middle East Technical University Affiliated: Yes


Purpose: This paper proposes a framework that automatically assesses the content coverage and information quality of health websites for end-users.

Design/methodology/approach: The study investigates the impact of textual and content-based features in predicting the quality of health-related texts. Content-based features were derived from an evidence-based practice guideline for diabetes. The textual features were inspired by professional health literacy guidelines and by features commonly used to assess information quality in other domains. For the study, 60 websites about type 2 diabetes were methodically selected for inclusion. Two general practitioners used DISCERN to assess each website in terms of its content coverage and quality.

Findings: The outputs of the proposed framework were compared with the experts' evaluation scores. For coverage assessment, the best accuracy was 88% with textual features and 92% with content-based features. When both types of features were used, the proposed framework achieved 90% accuracy. For information quality assessment, content-based features yielded a higher accuracy (92%) than textual features (88%).

Research limitations/implications: The experiments were conducted on websites about type 2 diabetes. Because the whole process is costly and requires extensive labelling by human experts, the study was carried out in a single domain. However, the methodology is generalizable to other health domains for which evidence-based practice guidelines are available.

Practical implications: Finding high-quality online health information is becoming increasingly difficult because of the high volume of information generated by non-experts. Search engines fail to rank objective health websites higher in their search results. The proposed framework can help developers of search engines and information platforms implement better retrieval techniques, which in turn facilitates end-users' access to high-quality health information.

Social implications: Erroneous, biased or partial health information is a serious problem for end-users who need access to objective information on their health problems. Such information may cause patients to stop treatments provided by professionals. It may also have adverse financial implications by causing unnecessary expenditure on ineffective treatments. The ability to access high-quality health information has a positive effect on the health of both individuals and society as a whole.

Originality/value: The paper demonstrates that automatic assessment of health websites is a domain-specific problem that cannot be addressed with the general information quality assessment methodologies in the literature. It is also the first study in the literature to examine the content coverage of health websites.
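
As a rough illustration of the two feature families named in the methodology, the sketch below computes guideline-derived content features and a few simple textual features for a page of text. It is not the authors' implementation: the topic list, the keywords, and the function names (`GUIDELINE_TOPICS`, `content_features`, `coverage_score`, `textual_features`) are all hypothetical, and the paper's actual features and classifier are not described in this record.

```python
import re

# Hypothetical topics and keywords, for illustration only. The paper derives
# its content-based features from an evidence-based diabetes practice
# guideline that is not reproduced in this record.
GUIDELINE_TOPICS = {
    "diagnosis": ["hba1c", "fasting glucose", "diagnosis"],
    "lifestyle": ["diet", "exercise", "physical activity"],
    "medication": ["metformin", "insulin"],
    "complications": ["neuropathy", "retinopathy", "kidney"],
}


def content_features(text):
    """Return one 0/1 feature per topic: 1 if any topic keyword occurs."""
    lowered = text.lower()
    return {
        topic: int(any(keyword in lowered for keyword in keywords))
        for topic, keywords in GUIDELINE_TOPICS.items()
    }


def coverage_score(text):
    """Fraction of the (assumed) guideline topics mentioned in the text."""
    features = content_features(text)
    return sum(features.values()) / len(features)


def textual_features(text):
    """Two simple surface features: word count and mean sentence length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    avg_len = len(words) / len(sentences) if sentences else 0.0
    return {"n_words": len(words), "avg_sentence_len": avg_len}
```

Feature vectors of this kind could then be passed to any standard classifier and scored against the experts' DISCERN-based labels. Which classifier and which exact features the study used are not stated in the abstract.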