#|----------------------------------------------------------------------------- C A T E G O R Y U T I L I T Y D E T E R M I N A T I O N F U N C T I O N S The following four functions determine the category utility using the integration of Gluck & Corter's (1985) equation. Inputs: Each of the functions requires the current parent, the obj to integrate, and the attribute access functions. In addition, some of them require the best node (with the new obj integrated), and the second best node. In each case, the current parent has been modified so that its count, atts, attScores, and members have been updated to include the new obj. The children and data fields, however, have not been changed to reflect the new obj. Outputs: Each of them outputs the Category Utility Score (or -1) if the function is inappropriate (e.g. spliting a node with no children). Each also returns a copy of the new tree. NOTE: this tree only incorporates the new obj to a depth of one level. To integrate the obj at lower levels Classit must be called recursively. In addition, some functions will return the best-node and the second best node. Comments: in each case, the current parent will be neither nil nor a single instance. Both these possibilities should be filtered out by either the Retrieve or Classify function before any category utility determiner is called. MODIFIED 8/89 for optimization. Extensive-copy-node is removed. -JHG ----------------------------------------------------------------------------|# (provide "node-compute") (require "globals") (require "node-atts") (require "struct") #|----------------------------------------------------------------11/May/88---- COMPUTE-CU-SELF computes category utility for placing the obj in a node by itself Inputs: the obj, the modified copy of the parent node to place the obj in (see above Category Utility Determination Functions documentation for a description of the modification), and a list of attribute-access function pairs. Outputs: the category utility score, the parent node for a copy of the tree with the concept integrated, and the new node. SideEffects: none. Comments: The component tree is copied, the original should remain unchanged. OPTIMIZED - 8/89 by JHG (from Wayne) -------------------------------------------------------------------Pyoung----|# (defun compute-cu-self (parent att-list &aux new-parent) (setq new-parent (copy-node parent)) (setf (node-children new-parent) (cons (create-node att-list) (node-children new-parent))) (values (determine-cat-utility new-parent) new-parent (car (node-children new-parent))) ) #|----------------------------------------------------------------03/Jun/88---- COMPUTE-CU-BEST determines best child to place obj in Inputs: the obj, the modified copy of the parent node to place the obj in (see above Category Utility Determination Functions documentation for a description of the modification), and a list of attribute-access function pairs. Outputs: returns multiple values the category utility of the partition with the obj integrated into the best child, the tree with the obj integrated into the best child, the best child with the obj integrated (only to the first level), the second best child without obj integration (used in merge), the best-node without obj integrated (used merge & split) (???) Note that these late 2 are children of the original parent, not of the best-partition. SideEffects: none. Comments: The component tree is copied, the original should remain unchanged. OPTIMIZED - 8/89 by JHG (from Wayne) ------------------------------------------------------------------Pyoung----|# (defun new-partition (parent child index att-list &aux new-parent) (setq new-parent (copy-node parent)) (setf (node-children new-parent) (copy-list (node-children new-parent))) (setf (elt (node-children new-parent) index) (integrate-obj child att-list)) new-parent ) (defun compute-CU-best (parent att-list &key (called-from :store)) (let* ((children (node-children parent)) (partitions (loop for child in children for i = 0 then (1+ i) collect (new-partition parent child i att-list))) (CUs (mapcar #'determine-cat-utility partitions)) (ordered-CUs (sort (copy-list CUs) #'>)) (best-CU (first ordered-CUs)) (best-position (position best-CU CUs)) (best-partition (elt partitions best-position)) (best-child (elt (node-children best-partition) best-position)) (old-best-node (elt children best-position)) (old-second)) (let* ((second-CU (second ordered-CUs)) (1st-postn-with-that-value (position second-CU CUs))) (setf old-second (if (eql best-CU second-CU) (elt children ;not the first one (position second-CU CUs :start (1+ 1st-postn-with-that-value))) (elt children 1st-postn-with-that-value)))) (when (and *GLOBAL-DEBUG* (eq called-from :store)) (do ((score CUs (cdr score)) (index 1 (1+ index))) ((endp score)) (pr-out "======= Cat-utility for child ~2D is ~6,2F~%" index (car score))) (pr-out "===== best-position is ~D~%" (1+ best-position))) (values best-CU best-partition ;best tree at parent level best-child ;child obj was integrated to in best-partition old-second old-best-node ) )) #|----------------------------------------------------------------10/May/88---- COMPUTE-CU-MERGE compute the category utility of merging the best two concepts Inputs: the obj, the modified copy of the parent node to place the obj in (see above Category Utility Determination FUnctions documentation for a description of the modification), best node to place the obj concept in (with the obj already integrated), the second best node, and the old best node without the obj integrated. Outputs: the category utility score (or -1 if merging is not valid-- if the parent has only 2 or fewer children), the parent node for a copy of the tree with the concept integrated, and the node which the obj ended up in. SideEffects: none. Comments: The component tree is copied, the original should remain unchanged. OPTIMIZED - 8/89 by JHG (from Wayne) ------------------------------------------------------------------Pyoung----|# (defun compute-cu-merge (parent best sec old-best att-list &aux merge-tree merge-node) (if (> (length (node-children parent)) 2) (progn (setq merge-tree (copy-node parent)) (setq merge-node (make-node :count (+ (node-count best) (node-count sec)) :atts (combine-atts (node-atts best) (node-atts sec)) :children (list old-best sec) ;old-best since obj not ;integrated to grandchild level :members (append (node-members best) (node-members sec)) )) (setf (node-attScores merge-node) (prob-att=value merge-node)) (setf (node-children merge-tree) (cons merge-node (remove-if #'(lambda (child) (or (equalp child old-best) (equalp child sec))) (copy-list (node-children merge-tree))))) (values (determine-cat-utility merge-tree) merge-tree merge-node) ) (values -1 nil nil)) ) #|----------------------------------------------------------------10/May/88---- Function - COMPUTE-CU-SPLIT compute the category utility of splitting all the best node's children Inputs -> the obj, the modified copy of the parent node to place the obj in (see above Category Utility Determination Functions documentation for a description of the modification), the best node to place the obj concept in (with the obj already integrated), the old best node (the best node without the obj integrated), the attribute-name / access function list. Returns -> (multiple values) category utility score of split-tree (or -1 if merging is not valid-- if best node has no children), split-tree: the parent node for a copy of the tree with the concept integrated to the newly created split-node, one level below split-tree. best-grandchild: the node which the obj ended up in (named that way since that node starts this procedure as a grandchild of parent, though it's only a child of split-tree. best-grandchild is the best-node of best-node basically (hence the almost recursive call to compute-CU-best). SideEffects: none. Comments: The component tree is copied, the original should remain unchanged. I considered computing both CU-best of obj and best and CU-self of obj and best, however, John Gennari convinced me that this would seem strange. If we ended up placing the obj by itself, and then splitting, it would be like placing the obj by itself originally, then arbitrarily spliting another node. OPTIMIZED - 8/89 by JHG (from Wayne) ------------------------------------------------------------------Pyoung----|# (defun compute-cu-split (parent best old-best att-list &aux split-tree) (setq split-tree (copy-node parent)) (if (node-children best) (multiple-value-bind (bestchild-CU bestchild-tree best-grandchild second old-best-grandC) (compute-CU-best best att-list :called-from :split) (setf (node-children split-tree) (append (delete old-best (copy-list (node-children split-tree)) :test #'equalp) (node-children bestchild-tree))) ;remove old-best not best ;since obj not in split-tree's children yet. (values (determine-cat-utility split-tree) split-tree best-grandchild old-best-grandC)) ;ELSE clause -- (values -1 nil)) ) #|***************************************************************************** D E T E R M I N I N G C A T E G O R Y U T I L I T Y *****************************************************************************|# #|----------------------------------------------------------------14/May/88---- Function - ATTSCORE*PRIOR Used both in computing category utility and in sorting the nodes to compute which is the "best" in the partition. Is it right to use this product in "best" comparison? Ok not to divide by n, since this will be the same for each. But we're not subtracting off the info at parent; I guess that's same too. Inputs -> a parent node and one of its children Returns -> the product of the attscore of that child and its "prior" probability (probability of that child in that partition) -------------------------------------------------------------------KThompso--|# (defun attscore*prior (parent child) (* (/ (node-count child) (node-count parent)) ;prior probability of child (node-attscores child))) ;times sum over atts,values of child #|----------------------------------------------------------------14/May/88---- Function - COMPUTE-VALUES-FOR-CHILDREN Inputs -> the parent of a partition Returns -> the sum over all its children of the product of the prior probability of that node and its attscore. -------------------------------------------------------------------KThompso--|# (defun compute-values-for-children (parent) (loop for child in (node-children parent) sum (attscore*prior parent child))) #|----------------------------------------------------------------10/May/88---- DETERMINE-CAT-UTILITY this is the function that actually determines the category utility. Inputs: the parent node of a classit tree ==> with all atts, attScores, (?) properly determined. (COBWEB) with all variance, sum, acount, and sum2 properly determined. (CLASSIT) Outputs: returns the Category Utility (a real number) for that partition. -------------------------------------------------------------------KThompso--|# (defun determine-cat-utility (parent) (/ (- (compute-values-for-children parent) (node-attscores parent)) (length (node-children parent)))) #|***************************************************************************** N O D E U T I L I T Y F U N C T I O N S T H A T A R E S A M E F O R C L A S S I T A N D C O B W E B *****************************************************************************|# #|----------------------------------------------------------------10/Jun/88---- INTEGRATE-OBJ This function returns a new copy of the node with count, atts, sumAttScores, and members updated to reflect the given obj. The children field is not modified. Inputs: the node to integrate the obj into, (OBJ), and a list of attribute names. Outputs: a copy of the node with the obj partially integrated. ------------------------------------------------------------------Pyoung----|# (defun integrate-obj (node att-names &aux new-node) (setq new-node (make-node :count (1+ (node-count node)) :atts (integrate-into-atts (node-atts node) att-names) :members (cons (given-name 'OBJ) (node-members node)) ;The children field should be modified later, ;by the various compute-cu functions :children (node-children node)) ) (setf (node-attScores new-node) (prob-att=value new-node)) (assert (= (length (node-members new-node)) (node-count new-node))) new-node ) #|----------------------------------------------------------------10/May/88---- CREATE-NODE this function creates a new node with the given obj Inputs: The obj to place in the new node and the attribute name list Outputs: The new node. Side Effects: none. ------------------------------------------------------------------Pyoung----|# (defun create-node (att-list &aux new-node) (setq new-node (make-node :count 1 :atts (create-atts att-list) :children nil :members (list (given-name 'OBJ)) )) (setf (node-attScores new-node) (prob-att=value new-node)) new-node ) #|----------------------------------------------------------------10/May/88---- FORK-NODE given a node with no children, and without obj integrated into the root, this function creates a node combining the original node with the new node, and sets up the children. Inputs: The current node, (OBJ), and the attribute name list. Outputs: The new node. Side Effects: none. ------------------------------------------------------------------Pyoung----|# (defun fork-node (node att-names &aux new-node) (setq new-node (make-node :count (1+ (node-count node)) :atts (integrate-into-atts (node-atts node) att-names) :children (list node (create-node att-names)) :members (cons (given-name 'OBJ) (node-members node)) )) (setf (node-attScores new-node) (prob-att=value new-node)) (if (> (node-attscores new-node) *THRESHOLD*) (setf (node-children new-node) nil)) (assert (= (length (node-members new-node)) (node-count new-node))) new-node )