09-07-2007, 12:26 AM

I just came up with a new idea that might be related to my research right now, although I am not sure yet.

For example, suppose you have a training dataset of labeled documents whose classes are news, entertainment, and education.

Now suppose you want to classify a document that combines, for example, news and entertainment content. Given the current training dataset, I am wondering whether there is any method or approach to divide the document into two parts (that is, to determine the split point) and then classify the parts as news and entertainment accordingly?
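
To make the question concrete, here is a rough sketch of the brute-force version I have in mind. It is only an illustration under my own assumptions: a unigram model with add-one smoothing per class, exactly one split point, and two different classes on either side. The function names (train, log_likelihood, best_split) are made up for this example and do not come from any library.

```python
import math
from collections import Counter

def train(labeled_docs):
    """Build per-class word counts from (label, text) pairs."""
    counts, vocab = {}, set()
    for label, text in labeled_docs:
        words = text.lower().split()
        counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return counts, vocab

def log_likelihood(words, label, counts, vocab):
    """Log P(words | label) under a unigram model with add-one smoothing."""
    total = sum(counts[label].values())
    denom = total + len(vocab) + 1  # extra slot for words never seen in training
    return sum(math.log((counts[label][w] + 1) / denom) for w in words)

def best_split(text, counts, vocab):
    """Try every word boundary and every ordered pair of distinct classes.

    Returns (split_index, left_label, right_label) with the highest total
    log-likelihood, or None if the text has fewer than two words.
    """
    words = text.lower().split()
    best_score, best = None, None
    for i in range(1, len(words)):
        left, right = words[:i], words[i:]
        for a in counts:
            for b in counts:
                if a == b:
                    continue  # assumption: the two parts belong to different classes
                score = (log_likelihood(left, a, counts, vocab)
                         + log_likelihood(right, b, counts, vocab))
                if best_score is None or score > best_score:
                    best_score, best = score, (i, a, b)
    return best

# Toy data, invented purely for illustration.
training = [
    ("news", "government election minister vote parliament"),
    ("entertainment", "movie actor music concert film"),
    ("education", "school student teacher exam university"),
]
counts, vocab = train(training)
print(best_split("election vote minister movie actor film", counts, vocab))
```

This recomputes the likelihood for every candidate split, so it is quadratic in the document length, and it only handles a single split point. What I am asking is whether a more principled or established approach exists for this.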

Sindharta T.