some cleanups; extended the document to cover RelaxNG and tree operations

* relaxng.c: some cleanups
* doc/xmlreader.html: extended the document to cover RelaxNG and
  tree operations
* python/tests/Makefile.am python/tests/reader[46].py: added some
  xmlReader example/regression tests
* result/relaxng/tutor*.err: updated the output of a number of tests
Daniel
diff --git a/ChangeLog b/ChangeLog
index ce1a6b6..64ed432 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+Thu Apr 17 14:51:57 CEST 2003 Daniel Veillard <daniel@veillard.com>
+
+	* relaxng.c: some cleanups
+	* doc/xmlreader.html: extended the document to cover RelaxNG and
+	  tree operations
+	* python/tests/Makefile.am python/tests/reader[46].py: added some
+	  xmlReader example/regression tests
+	* result/relaxng/tutor*.err: updated the output of a number of tests
+
 Thu Apr 17 11:35:37 CEST 2003 Daniel Veillard <daniel@veillard.com>
 
 	* relaxng.c: valgrind pointed out an uninitialized variable error.
diff --git a/doc/xmlreader.html b/doc/xmlreader.html
index 7b4ab99..fd95646 100644
--- a/doc/xmlreader.html
+++ b/doc/xmlreader.html
@@ -13,6 +13,8 @@
 A:link, A:visited, A:active { text-decoration: underline }-->
 
 
+
+
   </style>
   <title>Libxml2 XmlTextReader Interface tutorial</title>
 </head>
@@ -42,6 +44,9 @@
   attributes</a></li>
   <li><a href="#Validating">Validating a document</a></li>
   <li><a href="#Entities">Entities substitution</a></li>
+  <li><a href="#L1142">Relax-NG Validation</a></li>
+  <li><a href="#Mixing">Mixing the reader and tree or XPath
+  operations</a></li>
 </ul>
 
 <p></p>
@@ -147,8 +152,7 @@
         ret = reader.Read()
 
     if ret != 0:
-        print "%s : failed to parse" % (filename)
-</pre>
+        print "%s : failed to parse" % (filename)</pre>
 
 <p>The only things worth adding are that the <a
 href="http://dotgnu.org/pnetlib-doc/System/Xml/XmlTextReader.html">xmlTextReader
@@ -390,9 +394,79 @@
 
 <h2><a name="Entities">Entities substitution</a></h2>
 
-<p>@@TODO@@</p>
+<p>By default the xmlReader reports entities as such and does not replace them
+with their content. This default behaviour can however be overridden using:</p>
 
-<p> </p>
+<p><code>reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES,1)</code></p>
+
+<h2><a name="L1142">Relax-NG Validation</a></h2>
+
+<p style="font-size: 10pt">Introduced in version 2.5.7</p>
+
+<p>Libxml2 can now validate the document being read by the xmlReader against
+Relax-NG schemas. While the Relax-NG validator can't always work in a
+streaming mode, only the parts of a schema which cannot be reduced to regular
+expressions need to have their subtree expanded for validation. In practice
+this means that, as long as the schema for the top-level element content can
+be expressed as a regexp, only chunks of the document need to be expanded
+while validating.</p>
+
+<p>The steps to do so are:</p>
+<ul>
+  <li>create a reader working on a document as usual</li>
+  <li>before any call to Read(), associate it with a Relax-NG schema, either the
+    preparsed schema or the URL of the schema to use</li>
+  <li>errors will be reported the usual way, and the validity status can be
+    obtained using the IsValid() interface of the reader, as for DTDs.</li>
+</ul>
+
+<p>Example, assuming the reader has already been created and that the schema
+string contains the Relax-NG schema:</p>
+
+<p><code>rngp = libxml2.relaxNGNewMemParserCtxt(schema, len(schema))<br>
+rngs = rngp.relaxNGParse()<br>
+reader.RelaxNGSetSchema(rngs)<br>
+ret = reader.Read()<br>
+while ret == 1:<br>
+    ret = reader.Read()<br>
+if ret != 0:<br>
+    print "Error parsing the document"<br>
+if reader.IsValid() != 1:<br>
+    print "Document failed to validate"</code><br>
+See <code>reader6.py</code> in the sources or documentation for a complete
+example.</p>
+
+<h2><a name="Mixing">Mixing the reader and tree or XPath operations</a></h2>
+
+<p style="font-size: 10pt">Introduced in version 2.5.7</p>
+
+<p>While the reader is a streaming interface, its underlying implementation
+is based on the DOM builder of libxml2. As a result it is relatively simple
+to mix operations based on both models under some constraints. To do so the
+reader has an Expand() operation which grows the subtree under the
+current node. It returns a pointer to a standard node which can be manipulated
+in the usual ways. The node will have all its ancestors and its full subtree
+available. Usual operations like XPath queries can be used on that reduced
+view of the document. Here is an example extracted from reader5.py in the
+sources which extracts the bibliography entry for the "Dragon" compiler
+book from the XML 1.0 recommendation:</p>
+<pre>f = open('../../test/valid/REC-xml-19980210.xml')
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("REC")
+res=""
+while reader.Read():
+    while reader.Name() == 'bibl':
+        node = reader.Expand()            # expand the subtree
+        if node.xpathEval("@id = 'Aho'"): # use XPath on it
+            res = res + node.serialize()
+        if reader.Next() != 1:            # skip the subtree
+            break</pre>
+
+<p>Note however that the node instance returned by the Expand() call is only
+valid until the next Read() operation. The Expand() operation does not
+affect the Read() ones; however, once processed, the full subtree is usually
+not useful anymore, and the Next() operation allows skipping it completely and
+proceeding to the successor, or returns 0 if the document end is reached.</p>
 
 <p><a href="mailto:veillard@redhat.com">Daniel Veillard</a></p>
 
diff --git a/python/tests/Makefile.am b/python/tests/Makefile.am
index 761046a..0c16acf 100644
--- a/python/tests/Makefile.am
+++ b/python/tests/Makefile.am
@@ -23,6 +23,9 @@
     reader.py	\
     reader2.py	\
     reader3.py	\
+    reader4.py	\
+    reader5.py	\
+    reader6.py	\
     ctxterror.py\
     readererr.py\
     relaxng.py
diff --git a/python/tests/reader4.py b/python/tests/reader4.py
new file mode 100755
index 0000000..0269cb0
--- /dev/null
+++ b/python/tests/reader4.py
@@ -0,0 +1,45 @@
+#!/usr/bin/python -u
+#
+# this tests the basic APIs of the XmlTextReader interface
+#
+import libxml2
+import StringIO
+import sys
+
+# Memory debug specific
+libxml2.debugMemory(1)
+
+def tst_reader(s):
+    f = StringIO.StringIO(s)
+    input = libxml2.inputBuffer(f)
+    reader = input.newTextReader("tst")
+    res = ""
+    while reader.Read():
+	res=res + "%s (%s) [%s] %d\n" % (reader.NodeType(),reader.Name(),
+				      reader.Value(), reader.IsEmptyElement())
+	if reader.NodeType() == 1: # Element
+	    while reader.MoveToNextAttribute():
+		res = res + "-- %s (%s) [%s]\n" % (reader.NodeType(),
+						   reader.Name(),reader.Value())
+    return res
+    
+expect="""1 (test) [None] 0
+1 (b) [None] 1
+1 (c) [None] 1
+15 (test) [None] 0
+"""
+
+res = tst_reader("""<test><b/><c/></test>""")
+
+if res != expect:
+    print "Did not get the expected output:"
+    print res
+    sys.exit(1)
+
+# Memory debug specific
+libxml2.cleanupParser()
+if libxml2.debugMemory(1) == 0:
+    print "OK"
+else:
+    print "Memory leak %d bytes" % (libxml2.debugMemory(1))
+    libxml2.dumpMemory()
diff --git a/python/tests/reader6.py b/python/tests/reader6.py
new file mode 100755
index 0000000..fe22079
--- /dev/null
+++ b/python/tests/reader6.py
@@ -0,0 +1,118 @@
+#!/usr/bin/python -u
+#
+# this tests Relax-NG validation with the XmlTextReader interface
+#
+import sys
+import StringIO
+import libxml2
+
+schema="""<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
+         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
+  <oneOrMore>
+    <element name="label">
+      <text/>
+    </element>
+    <optional>
+      <element name="opt">
+        <empty/>
+      </element>
+    </optional>
+    <element name="item">
+      <data type="byte"/>
+    </element>
+  </oneOrMore>
+</element>
+"""
+# Memory debug specific
+libxml2.debugMemory(1)
+
+#
+# Parse the Relax NG Schemas
+# 
+rngp = libxml2.relaxNGNewMemParserCtxt(schema, len(schema))
+rngs = rngp.relaxNGParse()
+del rngp
+
+#
+# Parse and validate the correct document
+#
+docstr="""<foo>
+<label>some text</label>
+<item>100</item>
+</foo>"""
+
+f = StringIO.StringIO(docstr)
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("correct")
+reader.RelaxNGSetSchema(rngs)
+ret = reader.Read()
+while ret == 1:
+    ret = reader.Read()
+
+if ret != 0:
+    print "Error parsing the document"
+    sys.exit(1)
+
+if reader.IsValid() != 1:
+    print "Document failed to validate"
+    sys.exit(1)
+
+#
+# Parse and validate the incorrect document
+#
+docstr="""<foo>
+<label>some text</label>
+<item>1000</item>
+</foo>"""
+
+err=""
+expect="""RNG validity error: file error line 3 element text
+Type byte doesn't allow value '1000'
+RNG validity error: file error line 3 element text
+Error validating datatype byte
+RNG validity error: file error line 3 element text
+Element item failed to validate content
+"""
+
+def callback(ctx, str):
+    global err
+    err = err + "%s" % (str)
+libxml2.registerErrorHandler(callback, "")
+
+f = StringIO.StringIO(docstr)
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("error")
+reader.RelaxNGSetSchema(rngs)
+ret = reader.Read()
+while ret == 1:
+    ret = reader.Read()
+
+if ret != 0:
+    print "Error parsing the document"
+    sys.exit(1)
+
+if reader.IsValid() != 0:
+    print "Document failed to detect the validation error"
+    sys.exit(1)
+
+if err != expect:
+    print "Did not get the expected error message:"
+    print err
+    sys.exit(1)
+
+#
+# cleanup
+#
+del f
+del input
+del reader
+del rngs
+libxml2.relaxNGCleanupTypes()
+
+# Memory debug specific
+libxml2.cleanupParser()
+if libxml2.debugMemory(1) == 0:
+    print "OK"
+else:
+    print "Memory leak %d bytes" % (libxml2.debugMemory(1))
+    libxml2.dumpMemory()
diff --git a/relaxng.c b/relaxng.c
index c98e04e..d453b93 100644
--- a/relaxng.c
+++ b/relaxng.c
@@ -8,11 +8,9 @@
 
 /**
  * TODO:
- * - error reporting
- * - handle namespace declarations as attributes.
  * - add support for DTD compatibility spec
  *   http://www.oasis-open.org/committees/relax-ng/compatibility-20011203.html
- * - report better mem allocations at runtime and abort immediately.
+ * - better report memory allocation problems at runtime and abort immediately.
  */
 
 #define IN_LIBXML
@@ -836,7 +834,6 @@
  * @size:  the default size for the container
  *
  * Allocate a new RelaxNG validation state container
- * TODO: keep a pool in the ctxt
  *
  * Returns the newly allocated structure or NULL in case or error
  */
@@ -1989,7 +1986,7 @@
 	case XML_RELAXNG_ERR_EXTRADATA:
 	    return(xmlCharStrdup("Extra data in the document"));
 	default:
-	    TODO
+	    return(xmlCharStrdup("Unknown error !"));
     }
     if (msg[0] == 0) {
 	snprintf(msg, 1000, "Unknown error code %d", err);
@@ -2279,12 +2276,6 @@
     xmlSchemaTypePtr typ;
     int ret;
 
-    /*
-     * TODO: the type should be cached ab provided back, interface subject
-     * to changes.
-     * TODO: handle facets, may require an additional interface and keep
-     * the value returned from the validation.
-     */
     if ((type == NULL) || (value == NULL))
 	return(-1);
     typ = xmlSchemaGetPredefinedType(type, 
@@ -2956,9 +2947,9 @@
         case XML_RELAXNG_LIST:
         case XML_RELAXNG_PARAM:
         case XML_RELAXNG_VALUE:
-	    TODO /* This should not happen and generate an internal error */
-	    printf("trying to compile %s\n", xmlRelaxNGDefName(def));
-
+	    /* This should not happen and generate an internal error */
+	    fprintf(stderr, "RNG internal error trying to compile %s\n",
+	            xmlRelaxNGDefName(def));
 	    break;
     }
     return(ret);
@@ -3302,7 +3293,6 @@
 	    }
 	}
     }
-    /* TODO check ahead of time that the value is okay per the type */
     return(def);
 }
 
@@ -4878,10 +4868,9 @@
 		    ctxt->nbErrors++;
 		    break;
 		case XML_RELAXNG_NOOP:
-		    TODO
 		    if (ctxt->error != NULL)
 			ctxt->error(ctxt->userData,
-		"Internal error, noop found\n");
+		"RNG Internal error, noop found in attribute\n");
 		    ctxt->nbErrors++;
 		    break;
 	    }
@@ -5199,16 +5188,27 @@
 		    ret->attrs = cur;
 		    break;
 		case XML_RELAXNG_START:
+		    if (ctxt->error != NULL)
+			ctxt->error(ctxt->userData,
+		"RNG Internal error, start found in element\n");
+		    ctxt->nbErrors++;
+		    break;
 		case XML_RELAXNG_PARAM:
+		    if (ctxt->error != NULL)
+			ctxt->error(ctxt->userData,
+		"RNG Internal error, param found in element\n");
+		    ctxt->nbErrors++;
+		    break;
 		case XML_RELAXNG_EXCEPT:
-		    TODO
+		    if (ctxt->error != NULL)
+			ctxt->error(ctxt->userData,
+		"RNG Internal error, except found in element\n");
 		    ctxt->nbErrors++;
 		    break;
 		case XML_RELAXNG_NOOP:
-		    TODO
 		    if (ctxt->error != NULL)
 			ctxt->error(ctxt->userData,
-		"Internal error, noop found\n");
+		"RNG Internal error, noop found in element\n");
 		    ctxt->nbErrors++;
 		    break;
 	    }
@@ -5438,9 +5438,6 @@
 			name);
 	ctxt->nbErrors++;
     }
-    /*
-     * TODO: make a closure and verify there is no loop !
-     */
 }
 
 /**
diff --git a/result/relaxng/tutor10_7_3.err b/result/relaxng/tutor10_7_3.err
index ebbc9aa..bc3d6ac 100644
--- a/result/relaxng/tutor10_7_3.err
+++ b/result/relaxng/tutor10_7_3.err
@@ -1,2 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor10_7_3.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor10_8_3.err b/result/relaxng/tutor10_8_3.err
index 34eb5e9..06229bf 100644
--- a/result/relaxng/tutor10_8_3.err
+++ b/result/relaxng/tutor10_8_3.err
@@ -1,2 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor10_8_3.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor3_2_1.err b/result/relaxng/tutor3_2_1.err
index 83e9a57..73577fc 100644
--- a/result/relaxng/tutor3_2_1.err
+++ b/result/relaxng/tutor3_2_1.err
@@ -1,4 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor3_2_1.xml line 1 element email
-Expecting element name, got email
-RNG validity error: file ./test/relaxng/tutor3_2_1.xml line 1 element email
-Element card failed to validate content
+Did not expect element email there
diff --git a/result/relaxng/tutor3_5_2.err b/result/relaxng/tutor3_5_2.err
index ed09a33..80acb18 100644
--- a/result/relaxng/tutor3_5_2.err
+++ b/result/relaxng/tutor3_5_2.err
@@ -1,2 +1,4 @@
-RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element card
-Element addressBook has extra content: card
+RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element email
+Expecting element name, got email
+RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element email
+Element card failed to validate content
diff --git a/result/relaxng/tutor9_5_2.err b/result/relaxng/tutor9_5_2.err
index 650ca98..ede3b45 100644
--- a/result/relaxng/tutor9_5_2.err
+++ b/result/relaxng/tutor9_5_2.err
@@ -1,2 +1,4 @@
 RNG validity error: file ./test/relaxng/tutor9_5_2.xml line 2 element card
-Element addressBook has extra content: card
+Invalid sequence in interleave
+RNG validity error: file ./test/relaxng/tutor9_5_2.xml line 2 element card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor9_5_3.err b/result/relaxng/tutor9_5_3.err
index eee06c7..4566bcc 100644
--- a/result/relaxng/tutor9_5_3.err
+++ b/result/relaxng/tutor9_5_3.err
@@ -1,2 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor9_5_3.xml line 2 element card
-Element addressBook has extra content: card
+Invalid attribute error for element card
diff --git a/result/relaxng/tutor9_6_2.err b/result/relaxng/tutor9_6_2.err
index 259cb07..1a10f1b 100644
--- a/result/relaxng/tutor9_6_2.err
+++ b/result/relaxng/tutor9_6_2.err
@@ -1,2 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor9_6_2.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor9_6_3.err b/result/relaxng/tutor9_6_3.err
index 2157e52..e92c5f1 100644
--- a/result/relaxng/tutor9_6_3.err
+++ b/result/relaxng/tutor9_6_3.err
@@ -1,2 +1,2 @@
 RNG validity error: file ./test/relaxng/tutor9_6_3.xml line 2 element card
-Element addressBook has extra content: card
+Invalid attribute error for element card