Please use this identifier to cite or link to this item: http://hdl.handle.net/10216/621
Author(s): Filipe Silva
Armando Oliveira
Lígia M. Ribeiro
Gabriel David
Title: Searching dynamic Web pages with semi-structured contents
Issue Date: 2003
Abstract: At present, information systems (IS) in higher education are usually supported by databases (DB) and accessed through a Web interface. So happens with SiFEUP, the IS of the Engineering Faculty of the University of Porto (FEUP). The typical SiFEUP user sees the system as a collection of Web pages and is not aware of the fact that most of them do not exist in the sense of being an actual HTML file stored in a server but corresponds to HTML code generated on the fly by a designated program that accesses the DB and brings the most up-to-date information to the user desktop. Typical search engines do not index dynamically generated Web pages or just do that for those that are specifically mentioned in a static page and do not follow on the links the dynamic page may contain. In this paper we describe the development of a search facility for SiFEUP, how the limitations put to indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves using a locally developed crawler, the Oracle Text full text indexer, plus meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation.
Description: At present, information systems (IS) in higher education are usually supported by databases (DB) and accessed through a Web interface. So happens with SiFEUP, the IS of the Engineering Faculty of the University of Porto (FEUP). The typical SiFEUP user sees the system as a collection of Web pages and is not aware of the fact that most of them do not exist in the sense of being an actual HTML file stored in a server but corresponds to HTML code generated on the fly by a designated program that accesses the DB and brings the most up-to-date information to the user desktop. Typical search engines do not index dynamically generated Web pages or just do that for those that are specifically mentioned in a static page and do not follow on the links the dynamic page may contain. In this paper we describe the development of a search facility for SiFEUP, how the limitations put to indexing dynamic Web pages were circumvented, and an evaluation of the results obtained. The solution involves using a locally developed crawler, the Oracle Text full text indexer, plus meta-information automatically drawn from the DB or manually added to improve the relevance factor calculation.
Subject: Engenharia electrotécnica, electrónica e informática
Call Number: 56197
URI: http://hdl.handle.net/10216/621
Source: EUNIS 03 European University Information Systems
Document Type: Artigo em Livro de Atas de Conferência Internacional
Rights: openAccess
License: https://creativecommons.org/licenses/by-nc/4.0/
Appears in Collections:FEUP - Artigo em Livro de Atas de Conferência Internacional

Files in This Item:
File Description SizeFormat 
56197.doc605.5 kBMicrosoft WordView/Open


This item is licensed under a Creative Commons License Creative Commons