An automatic approach to construct domain-specific web portals

Altıngövde, İsmail Sengör
Ozcan, Rifat
Cetintas, Suleyman
Yilmaz, Hakan
Ulusoy, Özgür
We describe the architecture of a system that automatically constructs domain-specific Web portals. The system has three major components: (i) a focused crawler that collects domain-specific pages from the Web, (ii) an information extraction engine that extracts useful fields from these pages, and (iii) a query engine that supports both typical keyword-based queries over the pages and advanced queries over the extracted data fields. We present a prototype system for the domain of course homepages. A user study with the prototype shows that our approach produces high-quality results and achieves better precision than typical keyword-based search.
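
To make the three-component architecture concrete, the following is a minimal sketch of how a crawler, an extraction step, and a query layer might fit together. It is not the prototype's implementation: the in-memory WEB dictionary, the DOMAIN_TERMS list, the term-overlap relevance test, the regular expressions, and all function names are assumptions chosen for illustration. In particular, this crawler follows every link and only filters which pages it keeps, whereas a real focused crawler would typically also prioritize which links to follow.

```python
import re
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
WEB = {
    "u/home": ("University home page with news and events", ["u/cs101", "u/sports"]),
    "u/cs101": ("CS101 Introduction to Programming. Instructor: Ada Lovelace. "
                "Syllabus and homework.", ["u/cs202"]),
    "u/cs202": ("CS202 Data Structures. Instructor: Alan Turing. "
                "Syllabus, lecture notes, exams.", []),
    "u/sports": ("Campus sports schedule and tickets", []),
}

# Assumed vocabulary that signals a course homepage.
DOMAIN_TERMS = {"syllabus", "instructor", "homework", "lecture", "exams"}


def tokens(text):
    """Lowercased set of alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_relevant(text, threshold=2):
    """Assumed relevance test: at least `threshold` domain terms occur."""
    return len(tokens(text) & DOMAIN_TERMS) >= threshold


def focused_crawl(seed):
    """Component (i): breadth-first crawl that keeps only relevant pages."""
    seen, queue, kept = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        if is_relevant(text):
            kept[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


def extract_fields(text):
    """Component (ii): pull a course code and instructor name, if present."""
    code = re.search(r"\b([A-Z]{2,4}\d{3})\b", text)
    instructor = re.search(r"Instructor:\s*([A-Z][a-z]+(?: [A-Z][a-z]+)*)", text)
    return {
        "code": code.group(1) if code else None,
        "instructor": instructor.group(1) if instructor else None,
    }


def keyword_query(pages, word):
    """Component (iii), keyword mode: URLs of pages containing the word."""
    return sorted(u for u, t in pages.items() if word.lower() in tokens(t))


def field_query(records, field, value):
    """Component (iii), advanced mode: URLs whose extracted field equals value."""
    return sorted(u for u, r in records.items() if r.get(field) == value)


pages = focused_crawl("u/home")
records = {url: extract_fields(text) for url, text in pages.items()}
print(keyword_query(pages, "syllabus"))
print(field_query(records, "instructor", "Alan Turing"))
```

The contrast between the last two calls mirrors the distinction drawn in the abstract: the keyword query matches any page that mentions a term, while the field query is answered from the extracted records and so can target a specific attribute such as the instructor.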