Subject: Re: Bug#385720: m4: INTERNAL ERROR: recursive push_string (fwd)



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Paul Eggert on 9/4/2006 2:16 AM:
>> In the meantime, perhaps Autoconf should document that
>> all autom4te input files should always end in newline.
>
> Naaaah. Let's just fix the bug in M4. It's clearly a bug.
> The GNU tradition is to handle arbitrary input files,
> and not to insist on "text" files in the POSIX sense.
>

OK, I found the culprit; the regression crept in on Aug 1 (after 1.4.5).
My fix for Debian bug 175365 (remembering the current file and line, rather
than printing NONE:0:, when diagnosing incomplete input) did not expect
the last token in a file to be a macro expansion with no arguments. Besides
the workaround of adding the newline, you can also avoid the bug
by calling AC_OUTPUT with arguments (fortunately, AC_OUTPUT will ignore
those arguments, so you aren't changing the syntax of your configure.ac),
or by adding any other non-macro token after it (such as an empty string [],
extra whitespace, etc.). Here's the patch I'm applying, in case Debian wants
to release 1.4.6-2 instead of waiting for 1.4.7. And wow, is it hard to
add files to the testsuite that don't end in newline, so instead I used
changeword to allow newline in a macro name to trigger a similar failure
in the 'make distcheck' testsuite, to ensure we don't regress again.
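
For readers who want the peek/consume distinction spelled out, here is a
minimal standalone sketch of the idea behind the patch. It is not the m4
source: struct input_level, push_level, peek_char, and this next_char are
hypothetical simplifications, assuming only what the patch below shows
(peeking must leave the bottom input level in place, and consuming at end
of input is what finally retires it).

```c
#include <assert.h>
#include <stddef.h>

#define CHAR_EOF (-1)

/* One level of pending input; a hypothetical stand-in for m4's input block. */
struct input_level
{
  const char *text;            /* characters not yet consumed at this level */
  struct input_level *prev;    /* level underneath, or NULL at the bottom */
};

static struct input_level *isp;   /* top of the input stack */

/* Push a new level of input on top of the stack. */
static void
push_level (struct input_level *lvl, const char *text)
{
  assert (lvl != NULL && text != NULL);
  lvl->text = text;
  lvl->prev = isp;
  isp = lvl;
}

/* Look at the next character without consuming it.  Exhausted levels are
   popped only while a lower level still exists; the bottom level stays in
   place, so peeking at end of input leaves the stack usable.  */
static int
peek_char (void)
{
  while (isp != NULL)
    {
      if (*isp->text != '\0')
        return (unsigned char) *isp->text;
      if (isp->prev == NULL)
        return CHAR_EOF;        /* keep the last level until consumed */
      isp = isp->prev;
    }
  return CHAR_EOF;
}

/* Consume and return the next character.  At end of input, this is the
   call that finally drops the bottom level.  */
static int
next_char (void)
{
  int ch = peek_char ();
  if (ch == CHAR_EOF)
    isp = NULL;
  else
    isp->text++;
  return ch;
}
```

With an empty bottom level and an expansion "ab" pushed on top, two
next_char calls drain the expansion, a following peek_char reports
CHAR_EOF while isp still points at the bottom level, and only the next
next_char call clears the stack.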

2006-09-04 Eric Blake <[email protected]>

* src/input.c (peek_input): Fix regression in handling macro
without arguments as last token in file; debian bug 385720.
(next_token): Always consume an input character.
Reported by Andreas Schultz.
* configure.ac (AC_INIT): Bump version number.
* NEWS: Document this fix.
* doc/m4.texinfo (History): Mention next version.
(Changeword): Add example that exposes this bug.
* THANKS: Update.

- --
Life is short - so eat dessert first!

Eric Blake [email protected]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE/CkR84KuGfSFAYARAmr0AKCy6FIJDZzNWnN8nEvy7EdpJHpurQCgkHX4
VFXV6/1+nUcdhnNDrC7q93E=
=WNhB
-----END PGP SIGNATURE-----
Index: NEWS
===================================================================
RCS file: /sources/m4/m4/NEWS,v
retrieving revision 1.1.1.1.2.57
diff -u -p -r1.1.1.1.2.57 NEWS
--- NEWS 25 Aug 2006 14:08:33 -0000 1.1.1.1.2.57
+++ NEWS 4 Sep 2006 13:15:47 -0000
@@ -2,6 +2,11 @@ GNU M4 NEWS - User visible changes.
Copyright (C) 1992, 1993, 1994, 2004, 2005, 2006 Free Software
Foundation, Inc.

+Version 1.4.7 - ?? ??? 2006, by ?? (CVS version 1.4.6a)
+
+* Fix regression from 1.4.5 in handling a file that ends in a macro
+ expansion without arguments instead of a newline.
+
Version 1.4.6 - 25 August 2006, by Eric Blake (CVS version 1.4.5a)

* Fix buffer overruns in regexp and patsubst macros when handed a trailing
Index: configure.ac
===================================================================
RCS file: /sources/m4/m4/configure.ac,v
retrieving revision 1.36.2.25
diff -u -p -r1.36.2.25 configure.ac
--- configure.ac 25 Aug 2006 14:08:33 -0000 1.36.2.25
+++ configure.ac 4 Sep 2006 13:15:47 -0000
@@ -18,7 +18,7 @@
# 02110-1301 USA

AC_PREREQ([2.60])
-AC_INIT([GNU M4], [1.4.6], [[email protected]])
+AC_INIT([GNU M4], [1.4.6a], [[email protected]])
AM_INIT_AUTOMAKE([1.9.6 dist-bzip2 gnu])
PACKAGE=$PACKAGE_TARNAME; AC_SUBST([PACKAGE])
VERSION=$PACKAGE_VERSION; AC_SUBST([VERSION])
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.1.1.1.2.70
diff -u -p -r1.1.1.1.2.70 m4.texinfo
--- doc/m4.texinfo 24 Aug 2006 14:27:33 -0000 1.1.1.1.2.70
+++ doc/m4.texinfo 4 Sep 2006 13:15:48 -0000
@@ -360,7 +360,7 @@ addressed some long standing bugs in the
Then in 2005 Gary V. Vaughan collected together the many
patches to @acronym{GNU} @code{m4} 1.4 that were floating around the net and
released 1.4.3 and 1.4.4. And in 2006, Eric Blake joined the team and
-prepared patches for the release of 1.4.5 and 1.4.6.
+prepared patches for the release of 1.4.5, 1.4.6, and 1.4.7.

Meanwhile, development has continued on new features for @code{m4}, such
as dynamic module loading and additional builtins. When complete,
@@ -2724,6 +2724,41 @@ is a restriction on the regular expressi
@code{changeword}. This is that if your regular expression accepts
@samp{foo}, it must also accept @samp{f} and @samp{fo}.

+@example
+ifdef(`changeword', `', `errprint(` skipping: no changeword support
+')m4exit(`77')')dnl
+define(`foo
+', `bar
+')
+@result{}
+dnl This example wants to recognize changeword, dnl, and `foo\n'.
+dnl First, we check that our regexp will match.
+regexp(`changeword', `[cd][a-z]*\|foo[
+]')
+@result{}0
+regexp(`foo
+', `[cd][a-z]*\|foo[
+]')
+@result{}0
+regexp(`f', `[cd][a-z]*\|foo[
+]')
+@result{}-1
+foo
+@result{}foo
+changeword(`[cd][a-z]*\|foo[
+]')
+@result{}
+dnl Even though `foo\n' matches, we forgot to allow `f'.
+foo
+@result{}foo
+changeword(`[cd][a-z]*\|fo*[
+]?')
+@result{}
+dnl Now we can call `foo\n'.
+foo
+@result{}bar
+@end example
+
@code{changeword} has another function. If the regular expression
supplied contains any grouped subexpressions, then text outside
the first of these is discarded before symbol lookup. So:
Index: src/input.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/input.c,v
retrieving revision 1.1.1.1.2.21
diff -u -p -r1.1.1.1.2.21 input.c
--- src/input.c 23 Aug 2006 11:30:11 -0000 1.1.1.1.2.21
+++ src/input.c 4 Sep 2006 13:15:48 -0000
@@ -455,8 +455,12 @@ peek_input (void)
"INTERNAL ERROR: input stack botch in peek_input ()"));
abort ();
}
- /* End of input source --- pop one level. */
- pop_input ();
+ /* End of current input source --- pop one level if another
+ level still exists. */
+ if (isp->prev != NULL)
+ pop_input ();
+ else
+ return CHAR_EOF;
}
}

@@ -783,18 +787,20 @@ next_token (token_data *td)
obstack_1grow (&token_stack, '\0');
token_bottom = obstack_finish (&token_stack);

+ /* Can't consume character until after CHAR_MACRO is handled. */
ch = peek_input ();
if (ch == CHAR_EOF)
{
#ifdef DEBUG_INPUT
fprintf (stderr, "next_token -> EOF\n");
#endif
+ next_char ();
return TOKEN_EOF;
}
if (ch == CHAR_MACRO)
{
init_macro_token (td);
- (void) next_char ();
+ next_char ();
#ifdef DEBUG_INPUT
fprintf (stderr, "next_token -> MACDEF (%s)\n",
find_builtin_by_addr (TOKEN_DATA_FUNC (td))->name);
@@ -802,7 +808,7 @@ next_token (token_data *td)
return TOKEN_MACDEF;
}

- (void) next_char ();
+ next_char (); /* Consume character we already peeked at. */
if (MATCH (ch, bcomm.string, TRUE))
{
obstack_grow (&token_stack, bcomm.string, bcomm.length);
_______________________________________________
M4-patches mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/m4-patches