3rd International Workshop on
Web Site Evolution

November 10, 2001; Florence, Italy

WSE 2001
Theme: Access for All

TITLE: Recovering Traceability Links in Multilingual Web Sites

AUTHOR(s) & AFFILIATION(s):

Paolo Tonella, Filippo Ricca, Emanuele Pianta and Christian Girardi, ITC-irst, Centro per la Ricerca Scientifica e Tecnologica, 38050 Povo (Trento), Italy. {tonella, ricca, pianta, cgirardi}@itc.it

KEYWORD(s): Multilingual Web sites, Traceability, Code analysis.

PRESENTER / CONTACT PERSON: Paolo Tonella

CONTACT EMAIL: tonella@itc.it

ABSTRACT:

This paper investigates the problem of verifying consistency between the portions of a Web site devoted to different languages. The purpose is to support the site maintainer, who is responsible for keeping the different language versions aligned. Anomalies that typically occur in such sites include pages missing in some languages, differences in page structure across languages, missing information, and untranslated parts.

The approach we propose recovers traceability links, so as to simplify bringing the site back to a consistent state, and is based on a mix of structural and textual information extracted from the pages. The syntax trees of the pages to be compared drive the page matching process. When structurally corresponding nodes are encountered during the tree visit, their text attributes are examined to determine whether they are translations of each other.
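
The matching idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Node and TreeBuilder classes, the positional pairing of children, the word-overlap test in is_translation, the 0.5 threshold, and the toy Italian-English lexicon are all assumptions made for illustration. The sketch also assumes well-nested HTML without void tags such as <br>.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Node:
    """One node of a simplified page syntax tree (hypothetical structure)."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    """Builds a Node tree; assumes well-nested HTML with no void tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("page")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        node = self.stack[-1]
        node.text = (node.text + " " + data).strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def is_translation(src, dst, lexicon, threshold=0.5):
    """Toy test: share of source words whose lexicon entry occurs in dst."""
    src_words = src.lower().split()
    dst_words = set(dst.lower().split())
    if not src_words or not dst_words:
        return not src_words and not dst_words
    hits = sum(1 for w in src_words if lexicon.get(w) in dst_words)
    return hits / len(src_words) >= threshold


def compare(a, b, lexicon, path="", report=None):
    """Visit both trees in parallel and collect anomalies as (path, message)."""
    report = [] if report is None else report
    here = f"{path}/{a.tag}"
    if a.tag != b.tag:
        report.append((here, "structure differs"))
        return report
    if not is_translation(a.text, b.text, lexicon):
        report.append((here, "text not aligned"))
    # Children are paired by position, a deliberately naive assumption.
    for child_a, child_b in zip(a.children, b.children):
        compare(child_a, child_b, lexicon, here, report)
    if len(a.children) != len(b.children):
        report.append((here, "different number of children"))
    return report


if __name__ == "__main__":
    # Hypothetical sample pages and lexicon.
    lexicon = {"benvenuti": "welcome", "orari": "hours", "apertura": "opening"}
    it = "<html><body><h1>Benvenuti</h1><p>Orari di apertura</p><p>Contatti</p></body></html>"
    en = "<html><body><h1>Welcome</h1><p>Opening hours</p></body></html>"
    print(compare(parse(it), parse(en), lexicon))
```

In this example the heading and the first paragraph are accepted as aligned, while the body element is flagged because the Italian page has a paragraph with no English counterpart. A real tool would need a more tolerant tree alignment and a proper bilingual resource in place of the toy lexicon.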

Last modified October 29, 2001 by Scott Tilley.