Please use this identifier to cite or link to this item: https://hdl.handle.net/10216/145828
Author(s): Hélder Antunes
Carla Teixeira Lopes
Title: Readability of web content: An analysis by topic
Issue Date: 2019
Abstract: Readability is determined by the characteristics of a text that influence how well it is understood. The web is composed of content on various topics, and the results retrieved in the top positions by the main search engines are expected to be those with the highest number of views. In this study, we analyzed the readability of web pages according to the topic to which they belong and their position in the search results. To do so, we collected the top-20 results retrieved by Google for 23,779 queries from 20 topics and applied several readability metrics. The analysis showed that content from organizations (such as colleges and other institutions) and health-related content have lower readability values, while the categories Games and Home are at the opposite end. For the categories identified as less readable, tools can be developed to help users understand their content. We also found that top-ranked pages have higher readability values. One can conclude that, directly or indirectly, readability either seems to be taken into account by the Google search engine or has an influence on page popularity.
URI: https://hdl.handle.net/10216/145828
Source: Iberian Conference on Information Systems and Technologies, CISTI
Document Type: Article in International Conference Proceedings
Rights: openAccess
Appears in Collections: FEUP - Article in International Conference Proceedings

Files in This Item:
File: 385452.pdf
Description: Article
Size: 2.52 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.