Maximal Consistent Interpretations of Errorful Data in Hierarchically Modeled Domains
Abstract
A method is presented for constructing maximal consistent interpretations of errorful data. The method appears applicable to many tasks (speech understanding, natural language understanding, vision, medical diagnosis) requiring partial matching of errorful data against complex, hierarchically defined patterns. The data is represented as symbolic structures (word sequences, line segment configurations, disease symptoms). Errors consist of missing data (unrecognized words, occluded lines, undetected symptoms) and extra (possibly inconsistent) data (incorrectly recognized words, visual noise, spurious symptoms). Data interpretations correspond to substructures of a hierarchy of concepts. Constraints on consistent structures are embedded in the predefined conceptual hierarchy. An implementation of the method has correctly interpreted errorful sets of sentence fragments recognized by the HEARSAY-II speech understanding system. The implementation has also correctly interpreted typed-in ungrammatical sentences. Detailed examples illustrate the operation of the method on real data.

1. INTRODUCTION

The application of AI methods to complex domains (e.g., speech, vision, medical diagnosis) has expanded the dimensions of data interpretation to incorporate some novel phenomena. Two of these phenomena are data error and hierarchically defined data patterns.

Many complex domains are characterized by errorful data. Errors such as insertion, deletion, substitution, and repetition of information increase as the uncertainty in source data transduction and interpretation increases. Data may be mutually inconsistent in that two or more pieces of information cannot be explained consistently. Tolerating error and inconsistencies in the data requires robust methods that can not only find the best interpretation but are also able to distinguish the inconsistent and errorful data from the consistent data.

Another aspect of data interpretation in complex domains is that interpretations represent complex, hierarchically defined concepts (ideas, rules, patterns) rather than simple, independent concepts (features). Often the concepts used in interpretations can be placed in a hierarchy where each concept is defined in terms of its subconcepts. This structure of concepts is called a conceptual hierarchy. A collection of data can then be interpreted by the highest concept in the hierarchy supported (validated) by the data. The interpretation of the data is defined by the concept's descendants (subconcepts, subsubconcepts, etc.) and the data which supports them. These descendants form a substructure of the conceptual hierarchy. The general data interpretation problem can now be restated as a search for the concept in the conceptual hierarchy that explains (is supported by) the most data. The data supporting the structure underlying this maximal concept can be described as the maximal consistent subset of data.

In this paper we define conceptual hierarchies and maximal consistent interpretations. We then describe a method for interpreting data in such an environment, i.e., finding maximal consistent interpretations in a conceptual hierarchy. Examples illustrating the method are shown. Finally, we show the actual application of the method to the problem of interpreting errorful sentence fragments recognized by the HEARSAY-II speech understanding system (Erman, 1975).

This work was supported in part by the Defense Advanced Research Projects Agency under contract no. F44620-73-C-0074 and monitored by the Air Force Office of Scientific Research. In addition, the first author was partially supported by a National Research Council of Canada Postgraduate Scholarship and the second author was partially supported by a National Science Foundation Graduate Fellowship.

2. A REAL EXAMPLE

The matching problem used as an example throughout this paper is taken from the HEARSAY-II speech understanding system. When HEARSAY-II is unable to completely recognize a spoken sentence (utterance), it generates a set of sentence fragments (Hayes-Roth et al., 1976c) which must be interpreted by the semantic interpretation module, named SEMANT. The generated fragments can be both errorful and mutually inconsistent (Example 2.1). A sentence fragment is a chunk of consistent data in that it consists of a grammatically plausible sequence of recognized words. HEARSAY-II mechanisms effective in identifying such chunks are not suited to combining them into an overall consistent interpretation of the utterance.

EXAMPLE 2.1
1: [ WHAT HAS HERBERT
2: PAPER ABOUT PATTERN MATCHING ]
3: IN LEARNING OR PATTERN MATCHING ]
4: [ WHO
Correct Sentence: [ WHO HAS WRITTEN ABOUT PATTERN MATCHING ]

Example 2.1 shows four sentence fragments generated when HEARSAY-II was unable to recognize the sentence [ WHO HAS WRITTEN ABOUT PATTERN MATCHING ]. The square brackets denote the start and finish of the spoken utterance. The numbers enclosed in angle brackets specify, in centiseconds, how long after the start of the utterance each fragment begins and ends. Fragment 4 correctly matches the initial portion of the spoken sentence. Fragments 1-3 contain substitution errors. Fragments 1 and 2 are mutually inconsistent in that they provide different interpretations of the overlapping time period. The fragment pairs 1 & 3, 1 & 4, and 2 & 3 are inconsistent for the same reason. Also, Fragment 1 specifies a WHAT question whereas Fragment 4 specifies a WHO question. Thus Fragments 1 and 4 are semantically inconsistent, regardless of their times.

Each fragment is semantically described by a hierarchically structured collection of concepts. Figure 2.1 shows a portion of the conceptual hierarchy used by the SEMANT module in HEARSAY-II. Figure 2.2 shows the hierarchical description of the correct sentence.

The problem of interpreting these fragments illustrates the phenomena of data error and hierarchically structured interpretations. The method used for solving this problem appears applicable to a significant class of problems exhibiting these two phenomena.

3. CONCEPTUAL HIERARCHIES

A conceptual hierarchy can be represented by a directed graph of concepts.
This graph is tree-structured in that it has a root at the top and leaf nodes at the bottom; however, cycles are permitted. The sons of a node define the subconcepts that compose the father. The root of the graph defines the highest level (most general) interpretation of all the concepts beneath it. A given interpretation task has ...
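
The directed-graph representation described in this section can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the concept names, the `HIERARCHY` dictionary, and the helpers `descendants` and `leaves` are hypothetical and are not taken from Figure 2.1. The sketch shows how the descendants of a concept (its subconcepts, subsubconcepts, etc.) can be collected even when the graph contains cycles.

```python
from typing import Dict, List, Set

# Hypothetical conceptual hierarchy: each concept maps to its sons (subconcepts).
# The names are illustrative assumptions, NOT the contents of the paper's Figure 2.1.
HIERARCHY: Dict[str, List[str]] = {
    "REQUEST": ["QUERY_AUTHOR", "QUERY_TOPIC"],           # root: most general concept
    "QUERY_AUTHOR": ["WHO", "TOPIC"],
    "QUERY_TOPIC": ["WHAT", "TOPIC"],
    "TOPIC": ["PATTERN_MATCHING", "LEARNING", "TOPIC"],   # self-reference: cycles are permitted
    "WHO": [],
    "WHAT": [],
    "PATTERN_MATCHING": [],
    "LEARNING": [],
}


def descendants(graph: Dict[str, List[str]], concept: str) -> Set[str]:
    """Return every concept reachable from `concept` by following son links.

    A visited set keeps the traversal finite when the graph contains cycles.
    """
    seen: Set[str] = set()
    stack = list(graph.get(concept, []))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already expanded; this is what makes cycles safe
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def leaves(graph: Dict[str, List[str]], concept: str) -> Set[str]:
    """Return the leaf concepts (no sons) at or below `concept`."""
    reachable = descendants(graph, concept) | {concept}
    return {node for node in reachable if not graph.get(node)}
```

For instance, `descendants(HIERARCHY, "QUERY_AUTHOR")` collects the substructure beneath that concept, and `leaves` picks out the bottom-level concepts that data could support directly. A dictionary of son lists was chosen here only because it keeps the father/son relation explicit; other graph encodings would serve equally well.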
Cite
Fox and Mostow. "Maximal Consistent Interpretations of Errorful Data in Hierarchically Modeled Domains." International Joint Conference on Artificial Intelligence, 1977. https://mlanthology.org/ijcai/1977/fox1977ijcai-maximal/
@inproceedings{fox1977ijcai-maximal,
title = {{Maximal Consistent Interpretations of Errorful Data in Hierarchically Modeled Domains}},
author = {Fox, Mark S. and Mostow, Jack},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1977},
pages = {165-171},
url = {https://mlanthology.org/ijcai/1977/fox1977ijcai-maximal/}
}