In the original domain of TextTiling (magazine articles) [6], precision, meaning finding the exact location of a topic change rather than its general area, was not a priority. The algorithm moved each detected topic break to the nearest paragraph boundary in the text. No equivalent of this guideline exists in spoken dialogue: turn changes are essentially the only semantic divisions the data provides, and these correspond more closely to sentences in prose. They are too small to serve as a similar indicator of potential topic change. In the graph given by Hearst in [6], which superimposes the detected topic boundaries on the line graph of the cosine measure, the boundaries can be seen to move significantly away from the actual troughs in order to match the nearest paragraph boundary.
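The boundary adjustment described above can be sketched as follows. This is a minimal illustration, not Hearst's implementation: the function name `snap_to_nearest_boundary`, the use of token offsets as positions, and the sample numbers are all assumptions made for the example.

```python
from bisect import bisect_left


def snap_to_nearest_boundary(position, boundaries):
    """Return the boundary closest to `position`.

    `boundaries` must be a non-empty, sorted list of positions
    (assumed here to be token offsets of paragraph starts).
    Ties are resolved in favour of the earlier boundary.
    """
    if not boundaries:
        raise ValueError("at least one boundary is required")
    i = bisect_left(boundaries, position)
    # Only the neighbours on either side of `position` can be nearest.
    candidates = boundaries[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda b: abs(b - position))


# Hypothetical data: paragraph starts and similarity troughs, in tokens.
paragraph_starts = [0, 40, 90, 150]
troughs = [47, 88, 160]

snapped = [snap_to_nearest_boundary(t, paragraph_starts) for t in troughs]
displacement = [abs(s - t) for s, t in zip(snapped, troughs)]
print(snapped)       # [40, 90, 150]
print(displacement)  # [7, 2, 10]
```

The displacement list corresponds to the movement away from the troughs that is visible in Hearst's graph: each reported boundary is a paragraph start, not the trough itself.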
Figure 5.3: Honours advising session
Figure 5.3 shows this phenomenon in action: the first four detected topics are clearly successful (in terms of agreement with the human subject), but not perfectly aligned. In text with the equivalent of paragraph boundaries available, this problem would not exist, and the first four topic detections in this instance would be perfect.
Figure 5.4: Location of topic change 1 from figure 5.3, showing distance between detected and real topic boundaries ('break' tag is hand-marked, 'autobreak' is system-determined)
To demonstrate this issue more clearly, figure 5.4 shows the site of the first topic change from figure 5.3. While the real and detected breaks are 'close' and clearly related in the graph, they are nearly two full speaker turns apart in the document itself. This gap is too large to be closed merely by moving to the nearest speaker change; in this case, in fact, such a move would make the segmentation less precise.
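The effect of moving a detected break to the nearest speaker change can be checked with a small calculation. The sketch below uses invented token offsets, not values read from figure 5.4, and the helper names `nearest` and `turns_crossed` are hypothetical. It only illustrates how snapping can increase the distance to the hand-marked break.

```python
from bisect import bisect_right


def nearest(position, boundaries):
    """Return the boundary closest to `position` (earlier one on a tie)."""
    return min(boundaries, key=lambda b: abs(b - position))


def turns_crossed(a, b, turn_starts):
    """Count the speaker changes lying between positions a and b."""
    lo, hi = sorted((a, b))
    return bisect_right(turn_starts, hi) - bisect_right(turn_starts, lo)


# Hypothetical token offsets at which each speaker turn begins.
turn_starts = [0, 10, 25, 40, 55]
real_break = 11   # hand-marked 'break' (assumed position)
auto_break = 38   # system-determined 'autobreak' (assumed position)

error_before = abs(auto_break - real_break)
snapped = nearest(auto_break, turn_starts)
error_after = abs(snapped - real_break)

print(turns_crossed(real_break, auto_break, turn_starts))  # 1
print(error_before, snapped, error_after)                  # 27 40 29
```

In this made-up case the detected break sits late in a turn, so the nearest speaker change lies further from the hand-marked break than the unadjusted position did, and the error grows from 27 to 29 tokens.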
James Ballantine
2005-02-19