Web Structure Derived Clustering for Optimised Web Accessibility Evaluation

Hambley A., Yesilada Y., Vigo M., Harper S.

2023 World Wide Web Conference, WWW 2023, Texas, United States Of America, 30 April - 04 May 2023, pp.1345-1354

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1145/3543507.3583508
  • City: Texas
  • Country: United States Of America
  • Page Numbers: pp.1345-1354
  • Keywords: Accessibility, Automated evaluation, Clustering, Sampling
  • Middle East Technical University Affiliated: Yes


Web accessibility evaluation is a costly and complex process due to limited time, resources and ambiguity. To optimise the process, we aim to reduce the number of pages auditors must review by selecting statistically representative pages, so that a site of thousands of pages becomes a manageable review of archetypal pages. This paper focuses on representativeness, one of six proposed metrics that form our methodology, which addresses limitations we have identified in the W3C Website Accessibility Conformance Evaluation Methodology (WCAG-EM). These limitations include the evaluative scope, the non-probabilistic sampling approach, and the potential for bias within the selected sample. Representativeness assesses the quality and coverage of sampling. To measure it, we systematically evaluate five web page representations on a website of 388 pages: tags, structure, the DOM tree, content, and a mixture of structure and content. Our findings highlight the importance of including structural components in representations. We validate our conclusions using the same methodology for three additional random sites of 500 pages. We find that features derived from web content, when used exclusively, are suboptimal and can lead to lower-quality and more disparate clustering for optimised accessibility evaluation.
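
As a rough illustration of the idea of grouping pages by structure and reviewing one archetypal page per group, the following minimal Python sketch may help. It is not the authors' method: the tag-frequency representation, the cosine distance, the greedy "leader" clustering, the threshold value, and all function names (tag_vector, cosine_distance, cluster_pages, medoid) are assumptions chosen only to keep the example short and dependency-free.

```python
from collections import Counter
from html.parser import HTMLParser
import math


class TagCounter(HTMLParser):
    """Counts how often each opening tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def tag_vector(html):
    """Hypothetical 'tags' representation: a bag of tag names."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def cosine_distance(a, b):
    """1 - cosine similarity between two tag-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def cluster_pages(pages, threshold=0.3):
    """Greedy leader clustering (an assumed stand-in, not the paper's algorithm).

    A page joins the first cluster whose first member is within
    `threshold`; otherwise it starts a new cluster.
    """
    vectors = {url: tag_vector(html) for url, html in pages.items()}
    clusters = []
    for url in pages:
        for cluster in clusters:
            if cosine_distance(vectors[url], vectors[cluster[0]]) <= threshold:
                cluster.append(url)
                break
        else:
            clusters.append([url])
    return clusters, vectors


def medoid(cluster, vectors):
    """The page with the smallest total distance to the rest of its cluster."""
    return min(
        cluster,
        key=lambda u: sum(cosine_distance(vectors[u], vectors[v]) for v in cluster),
    )
```

Under these assumptions, an auditor would call cluster_pages on a mapping of URL to HTML and review only the medoid of each resulting cluster. A representation built from the DOM tree, or from structure mixed with content, would replace tag_vector while the rest of the sketch stays the same.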