Avoid reparsing in xmlParseStartTag2

The code in xmlParseStartTag2 must handle the case where the input
buffer is grown and reallocated, which can invalidate pointers to
attribute values. Previously, this was handled by detecting changes to
the input buffer "base" pointer and, in case of a change, jumping
back to the beginning of the function and reparsing the start tag.

The major problem with this approach is that whether an input buffer
is reallocated is nondeterministic, resulting in seemingly random test
failures. See the mailing list thread "runtest mystery bug: name2.xml
error case regression test" from 2012, for example.

If a reallocation was detected, the code also made no attempt to
continue parsing after an error, which makes a difference in the lax
"recover" mode.

Now we store the current input buffer "base" pointer for each (not
separately allocated) attribute in the namespace URI field, which isn't
used until later. After the whole start tag has been parsed, the
pointers to the attribute values are reconstructed using the offset
between the new and the old input buffer base. This relies on arithmetic
on dangling pointers, which is technically undefined behavior. But it
seems like
the easiest and most efficient fix and a similar approach is used in
xmlParserInputGrow.
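
The rebasing idea can be illustrated with a minimal, self-contained
sketch. It is not the code from parser.c: AttrRef, Input, input_grow,
attrs_rebase and rebase_demo are hypothetical names, and the remembered
base lives in its own struct field rather than in the namespace URI
slot. Unlike the real fix, the sketch keeps the old block alive until
the pointers are rebased and computes the offset on integers, so it
sidesteps the dangling-pointer arithmetic mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one parsed attribute value. */
typedef struct {
    const char *value; /* points into the input buffer */
    size_t len;        /* length of the value in bytes */
    const char *base;  /* buffer base when 'value' was recorded */
} AttrRef;

/* Hypothetical growable input buffer. */
typedef struct {
    char *base;
    size_t size;
} Input;

/* Grow by allocating a new block and copying (new_size >= in->size).
 * The old block is handed back so the caller can free it after
 * rebasing. Returns 1 on success, 0 on allocation failure. */
static int input_grow(Input *in, size_t new_size, char **old_out) {
    char *fresh = malloc(new_size);
    if (fresh == NULL)
        return 0;
    memcpy(fresh, in->base, in->size);
    *old_out = in->base;
    in->base = fresh;
    in->size = new_size;
    return 1;
}

/* Rebase every attribute recorded against an earlier buffer base. */
static void attrs_rebase(AttrRef *attrs, size_t n, const Input *in) {
    for (size_t i = 0; i < n; i++) {
        if (attrs[i].base == in->base)
            continue; /* recorded after the last reallocation */
        /* Offset of the value within the buffer it was recorded in. */
        size_t off = (size_t)((uintptr_t)attrs[i].value -
                              (uintptr_t)attrs[i].base);
        assert(off + attrs[i].len <= in->size);
        attrs[i].value = in->base + off;
        attrs[i].base = in->base;
    }
}

/* Records one attribute, grows the buffer, records a second one,
 * then rebases. Returns 1 if both values are intact afterwards. */
static int rebase_demo(void) {
    const char *tag = "<a x='1' y='22'>";
    Input in;
    AttrRef attrs[2];
    char *old = NULL;
    int ok;

    in.size = strlen(tag) + 1;
    in.base = malloc(in.size);
    if (in.base == NULL)
        return 0;
    memcpy(in.base, tag, in.size);

    /* Value of x starts at offset 6 and is 1 byte long. */
    attrs[0].value = in.base + 6;
    attrs[0].len = 1;
    attrs[0].base = in.base;

    if (!input_grow(&in, 64, &old)) {
        free(in.base);
        return 0;
    }

    /* Value of y starts at offset 12, recorded in the new block. */
    attrs[1].value = in.base + 12;
    attrs[1].len = 2;
    attrs[1].base = in.base;

    attrs_rebase(attrs, 2, &in);
    free(old);

    ok = attrs[0].value == in.base + 6 &&
         memcmp(attrs[0].value, "1", 1) == 0 &&
         memcmp(attrs[1].value, "22", 2) == 0;
    free(in.base);
    return ok;
}
```

Calling rebase_demo() from a test harness exercises the mixed case in
which one attribute was recorded before the reallocation and one
after it; only the first needs its pointer adjusted.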

This changes the error output of several tests, typically making it
more verbose because we try harder to continue parsing in case of
errors.

(Another possible solution is to check not only the "base" pointer
but the size of the input buffer as well. But this would result in
even more reparsing.)
5 files changed