Services Computing, 2004 IEEE International Conference on (SCC'04)
pp. 449-452
Segmenting the Web Document with Document Object Model
Jianli Luo, Yangzhou University, China
Jie Shen, Yangzhou University, China
Cuihua Xie, Yangzhou University, China
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SCC.2004.1358040
Abstract
We present a model for DOM-based web document
segmentation that uses the semi-structured information
in web pages. The model builds the DOM tree of a
web page by parsing the HTML tags that organize the
page's structure. We extend traditional plain-text
segmentation algorithms so that they suit web text
segmentation. The boundaries between nodes in the
DOM tree are then used to further increase the
precision of the segmentation results.
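
The idea of using tag boundaries to delimit text segments can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify which tags act as boundaries or how the plain-text segmentation step is adapted. The sketch below assumes a hypothetical set of block-level tags (BLOCK_TAGS) and a hypothetical helper (segment_html), and it only tracks boundaries in the tag stream rather than building a full DOM tree.

```python
from html.parser import HTMLParser

# Assumption: these tags mark segment boundaries. The paper's abstract
# does not list the tags it uses; this set is purely illustrative.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li", "td", "section", "article"}


class BlockSegmenter(HTMLParser):
    """Collects text and cuts a new segment at every block-level tag."""

    def __init__(self):
        super().__init__()
        self.segments = []   # finished text segments, in document order
        self._buffer = []    # text pieces seen since the last boundary

    def _flush(self):
        # Join buffered pieces, normalize whitespace, keep non-empty text.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Inline tags such as <b> do not cut a segment; their text is merged.
        self._buffer.append(data)


def segment_html(html):
    """Hypothetical helper: return the text segments of an HTML string."""
    parser = BlockSegmenter()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text after the last boundary
    return parser.segments


if __name__ == "__main__":
    page = "<div><h1>Title</h1><p>First <b>bold</b> para.</p><p>Second.</p></div>"
    for segment in segment_html(page):
        print(segment)
```

In a fuller pipeline along the lines the abstract describes, each of these coarse segments could then be passed to a plain-text segmentation algorithm, with the tag boundaries acting as cut points that such an algorithm would not be allowed to cross.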
Citation:
Jianli Luo, Jie Shen, Cuihua Xie,
"Segmenting the Web Document with Document Object Model,"
Services Computing, 2004 IEEE International Conference on (SCC'04),
pp. 449-452, 2004.