Web Site Structures and Parsing

My final-year project is a text filter, basically a news filter. It aims to maintain a user profile based on the user's interests across categories such as politics, sports, and business, hopefully using CNN, Reuters, and the BBC as sources. I was wondering if somebody could help me with parsing the contents of these web sites, as they are all organized in very different ways.

-- Rosheena Siddiqi (rosheenasiddiqi@yahoo.com), November 12, 2002
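
One possible way to cope with sites that are organized differently is to keep a small table of per-site rules and run a single generic extractor over all of them. The sketch below is only an illustration using Python's standard html.parser module. The site keys, tag names, and class names in SITE_RULES are hypothetical placeholders, not the real markup of CNN, Reuters, or the BBC; they would have to be filled in by inspecting each site's pages.

```python
from html.parser import HTMLParser

# Hypothetical per-site rules: which tag and class wrap a headline.
# These values are illustrative assumptions, not any real site's markup.
SITE_RULES = {
    "site_a": {"tag": "h2", "class": "headline"},
    "site_b": {"tag": "a", "class": "story-title"},
}


class HeadlineExtractor(HTMLParser):
    """Collects the text inside every element matching one tag/class rule."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0        # > 0 while inside a matching element
        self.buffer = []      # text fragments of the current headline
        self.headlines = []   # finished headlines, in document order

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Already inside a match: track nesting of the same tag name.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                # Join fragments and collapse runs of whitespace.
                text = " ".join("".join(self.buffer).split())
                if text:
                    self.headlines.append(text)

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_headlines(site, html):
    """Apply the rule registered for `site` to a page's HTML string."""
    rule = SITE_RULES[site]
    parser = HeadlineExtractor(rule["tag"], rule["class"])
    parser.feed(html)
    parser.close()
    return parser.headlines
```

With this layout, supporting another source means adding one entry to the rule table rather than writing a new parser, and the extracted headlines can then be passed to whatever category filter the project uses.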

