diff --git a/contrib/bison/COPYING b/contrib/bison/COPYING new file mode 100644 index 000000000000..a43ea2126fb6 --- /dev/null +++ b/contrib/bison/COPYING @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 675 Mass Ave, Cambridge, MA 02139, USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. 
And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. 
The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. 
(Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. 
You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. 
+ +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. 
If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + Appendix: How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) 19yy + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) 19yy name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. 
diff --git a/contrib/bison/ChangeLog b/contrib/bison/ChangeLog new file mode 100644 index 000000000000..10083c2d51af --- /dev/null +++ b/contrib/bison/ChangeLog @@ -0,0 +1,1290 @@ +Sat May 11 15:11:15 1996 Richard Stallman + + * Version 1.25 released. + + * Makefile.in (dist): Don't use $(srcdir). + + * bison.simple (__yy_memcpy): Really reorder the args, as was + supposedly done on Feb 14 1995. + (yyparse): Calls changed accordingly. + +Wed Jan 24 22:56:29 1996 Richard Stallman + + * output.c (output_rule_data): Test YYERROR_VERBOSE in the conditional + around the definition of ttyname. + +Thu Dec 28 23:27:32 1995 Richard Stallman + + * bison.simple: Fix line numbers in #line commands. + +Sun Dec 24 16:59:44 1995 Richard Stallman + + * bison.simple (YYPARSE_PARAM_DECL): In C++, make it always null. + (YYPARSE_PARAM_ARG): New macro. + (yyparse): Use YYPARSE_PARAM_ARG. + +Sun Oct 15 12:44:09 1995 Richard Stallman + + * version.c: Version now 1.25. + + * main.c (warn): Set `failure'. + +Tue Aug 1 12:30:38 EDT 1995 Wilfred J. Hansen + + * bison.cld, getargs.c, vmsgetargs.c: Added -n, -k, and -raw switches. + (noparserflag, toknumflag, rawtoknumflag): New variables. + + * conflicts.c (resolve_sr_conflict): Remove use of alloca. + + * files.c (openfiles, open_extra_files, done): Add faction flag + and actfile file. Handle noparserflag. Both for -n switch. + + * lex.c: Include getopt.h. Add some extern decls. + (safegetc): New function to deal with EOF gracefully. + (literalchar); new function to deal with reading \ escapes. + (lex): Use literalchar. + (lex): Implemented "..." tokens. + (literalchar, lex, parse_percent_token): Made tokenbuffer + always contain the token. This includes growing the token + buffer while reading an integer. + (parse_percent_token): Replaced if-else statement with percent_table. + (parse_percent_token): Added % declarations as another + way to specify the flags -n, -l, and -r. 
Also added hooks for + -d, -k, -y, -v, -t, -p, -b, -o, but implementation requires + major changes to files.c. + (lex) Retain in the incoming stream a character following + an incorrect '/'. + (skip_white_space, lex): Revised most error messages + and changed fatal to warn to avoid aborting. + (percent_table): Added %thong declarations. + + * lex.h: Added THONG and NOOP for alias processing. + Added SETOPT for the new code that allows setting options with %flags. + + * main.c (main): If reader sees an error, don't process the grammar. + (fatals): Updated to not use VARARGS1. + (printable_version, int_to_string, warn, warni, warns, warnss) + (warnsss): New error reporting functions. Avoid abort for error. + + * output.c (output_headers, output_trailers, output, output_gram) + (output_rule_data): Implement noparserflag variable. + Implement toknumflag variable. + (output): Call reader_output_yylsp to output LTYPESTR. + + * reader.c (reader_output_yylsp): New function. + (readgram): Use `#if 0' around code that accepted %command + inside grammar rules: The documentation doesn't allow it, + and it will fail since the %command processors scan for the next %. + (parse_token_decl): Extended the %token + declaration to allow a multi-character symbol as an alias. + (parse_thong_decl): New function. + (read_declarations): Added %thong declarations. + (read_declarations): Handle NOOP to deal with allowing + % declarations as another means to specify the flags. + (readgram): Allow %prec prior to semantics embedded in a rule. + (skip_to_char, read_declarations, copy_definition) + (parse_token_decl, parse_start_decl, parse_type_decl) + (parse_assoc_decl, parse_union_decl, parse_expect_decl) + (get_type_name, copy_guard, copy_action, readgram) + (get_type, packsymbols): Revised most error messages. + Changed `fatal' to `warnxxx' to avoid aborting for error. + Revised and use multiple warnxxx functions to avoid using VARARGS1. 
+ (read_declarations): Improve the error message for + an invalid character. Do not abort. + (read_declarations, copy_guard, copy_action): Use + printable_version to avoid unprintable characters in printed output. + (parse_expect_decl): Error if argument to %expect exceeds 10 digits. + (parse_token_decl, parse_assoc_decl, parse_type_decl, get_type): + Allow the type of a non-terminal can be given + more than once, as long as all specifications give the same type. + + * reduce.c (reduce_grammar): Revise an error message. + (print_notices): Remove final `.' from error message. + + * symtab.h (SALIAS): New #define for adding aliases to %token. + (struct bucket): Added `alias' field. + +Wed May 3 03:12:28 1995 Richard Stallman + + * bison.simple: Change distribution terms. + + * version.c: Version now 1.23. No, 1.24. + +Thu Feb 23 02:43:21 1995 Richard Stallman + + * files.c: Test __VMS_POSIX as well as VMS. + +Tue Feb 14 11:53:05 1995 Jim Meyering (meyering@comco.com) + + * bison.simple (__yy_memcpy): Renamed from __yy_bcopy to avoid + confusion. Reverse FROM and TO arguments to be consistent with + those of memcpy. + +Thu Nov 10 16:33:41 1994 David J. MacKenzie + + * Makefile.in (DISTFILES): Include install-sh, not install.sh. + Include NEWS. + + * configure.in: Update to Autoconf v2 macro names. + +Tue Oct 4 22:25:43 1994 David J. MacKenzie (djm@duality.gnu.ai.mit.edu) + + * Makefile.in (prefix, exec_prefix): Let configure set them. + +Wed Sep 28 09:55:28 1994 David J. MacKenzie (djm@duality.gnu.ai.mit.edu) + + * Makefile.in: Set datadir to $(prefix)/share. + +Tue Jul 12 16:42:43 1994 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * reader.c (reader): Rename undefined-token token to `$undefined.'. + +Thu May 5 14:41:02 1994 David J. MacKenzie (djm@nutrimat.gnu.ai.mit.edu) + + * Makefile.in (DISTFILES): Add install.sh. + (install): Remove chmod commands. + +Sat Mar 26 15:33:07 1994 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple: Fix #line commands. 
+ +Thu Mar 24 23:09:07 1994 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c (print_reductions): Increment both fp1 and fp2 + while printing reductions in multi-rule case. + +Sun Jan 2 15:51:52 1994 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (LDFLAGS): Make it empty by default. + (bison): Use CFLAGS. + +Sun Nov 21 05:24:30 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (YYLEX): Take notice of YYLEX_PARAM. + +Mon Oct 18 23:52:33 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (YYPARSE_PARAM_DECL): Always define this. + +Thu Oct 14 12:19:13 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (yyparse): Support YYPARSE_PARAM. + +Mon Sep 13 18:17:14 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * Makefile.in (check): New target. + +Fri Sep 10 08:10:18 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c (alloca): #undef before defining. + + * system.h (bcopy): Don't define if already defined. + +Mon Sep 6 15:32:32 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * Version 1.22 released. + + * mkinstalldirs: New file. + + * Makefile.in (dist): Use .gz for extension, not .z. + (DISTFILES): New variable. + (dist): Use it instead of explicit file list. + Try to link each file separately, then copy file if ln fails. + (installdirs): Use mkinstalldirs script. + +Thu Jul 29 20:35:02 1993 David J. MacKenzie (djm@wookumz.gnu.ai.mit.edu) + + * Makefile.in (config.status): Run config.status --recheck, not + configure, to get the right args passed. + +Sat Jul 24 04:00:52 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (yyparse): Init yychar1 to avoid warning. + +Sun Jul 4 16:05:58 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (yyparse): Don't set yyval when yylen is 0. + +Sat Jun 26 15:54:04 1993 David J. MacKenzie (djm@wookumz.gnu.ai.mit.edu) + + * getargs.c (getargs): Exit after printing the version number. 
+ Add --help and -h options. + (usage): New function. + +Fri Jun 25 15:11:25 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * getargs.c (longopts): Allow `output' as an alternative. + +Wed Jun 16 17:02:37 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (yyparse): Conditionalize the entire call to yyoverflow, + not just two arguments in it. + +Thu Jun 3 13:07:19 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple [__hpux] (alloca): Don't specify arg types. + +Fri May 7 05:53:17 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * Makefile.in (install): Depend on `uninstall' and `installdirs'. + (installdirs): New target. + +Wed Apr 28 15:15:15 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * reader.c: Remove declaration of atoi. + +Fri Apr 23 12:29:20 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * new.h [!__STDC__] (FREE): Check x != 0. + Make expr to call `free' evaluate to 0. + +Tue Apr 20 01:43:44 1993 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu) + + * files.c [MSDOS]: Use xmalloc, not malloc. + * allocate.c (xmalloc): Renamed from mallocate. Remove old wrapper. + * conflicts.c, symtab.c, files.c, LR0.c, new.h: Change callers. + * allocate.c (xrealloc): New function. + * new.h: Declare it. + * lex.c, reader.c: Use it. + +Sun Apr 18 00:45:56 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * Version 1.21 released. + + * reader.c : Don't declare `realloc'. + + * Makefile.in (bison.s1): use `rm -f' since it's quieter. + (dist): make gzipped tar file. + +Fri Apr 16 21:24:10 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * Makefile.in (Makefile, config.status, configure): New targets. + +Thu Apr 15 15:37:28 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * main.c: Don't declare `abort'. + + * files.c: Don't declare `exit'. + +Thu Apr 15 02:42:38 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * configure.in: Add AC_CONST. 
+ +Wed Apr 14 00:51:17 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (all): Depend on bison.s1. + +Tue Apr 13 14:52:32 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Version 1.20 released. + +Wed Mar 24 21:45:47 1993 Richard Stallman (rms@wookumz.gnu.ai.mit.edu) + + * output.c (output_headers): Rename yynerrs if -p. + +Thu Mar 18 00:02:17 1993 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * system.h: Don't try to include stdlib.h unless HAVE_STDLIB_H is + defined. + + * configure.in: Check for stdlib.h. + +Wed Mar 17 14:44:27 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple [__hpux, not __GNUC__]: Declare alloca. + (yyparse): When printing the expected token types for an error, + Avoid negative indexes in yycheck and yytname. + +Sat Mar 13 23:31:25 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (files.o, .c.o): Put CPPFLAGS and CFLAGS last. + +Mon Mar 1 17:49:08 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple: Test __sgi like __sparc. + +Wed Feb 17 00:04:13 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c (resolve_sr_conflict): Add extra parens in alloca call. + + * bison.simple [__GNUC__] (yyparse): Declare with prototype. + +Fri Jan 15 13:15:17 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c (print_reduction): Near end, increment fp2 when mask + recycles. + +Wed Jan 13 04:15:03 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (bison.s1): New target. Modifies bison.simple. + (install): Install bison.s1, without changing it. + (clean): Delete bison.s1. + +Mon Jan 4 20:35:58 1993 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * reader.c (reader): Put Bison version in comment in output file. + +Tue Dec 22 19:00:58 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * files.c (openfiles): Use .output, not .out, for outfile, + regardless of spec_name_prefix. 
+ +Tue Dec 15 19:22:11 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * output.c (output_gram): Include yyrhs in the same #if as yyprhs. + +Tue Dec 15 18:29:16 1992 Noah Friedman (friedman@nutrimat.gnu.ai.mit.edu) + + * output.c (output): output directives checking for __cplusplus as + well as __STDC__ to determine when to define "const" as an empty + token. (Patch from Wolfgang Glunz ) + +Tue Dec 8 21:51:23 1992 David J. MacKenzie (djm@kropotkin.gnu.ai.mit.edu) + + * system.h, conflicts.c: Replace USG with HAVE_STRING_H and + HAVE_MEMORY_H. + +Sat Nov 21 00:37:16 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu) + + * Makefile.in: Set and use $(MAKEINFO). + +Fri Nov 20 20:45:57 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * files.c (done) [MSDOS]: Delete the tmpdefsfile with the rest. + +Thu Oct 8 21:55:52 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (dist): Put configure.bat in the distribution. + +Thu Oct 1 09:16:24 1992 David J. MacKenzie (djm@goldman.gnu.ai.mit.edu) + + * Makefile.in (install): cd to $(srcdir) before installing info files. + +Wed Sep 30 17:18:39 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (files.o): Supply $(DEFS), and $(CPPFLAGS). + +Fri Sep 25 18:06:28 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Version 1.19 released. + + * reader.c (parse_union_decl): Fix ending of C++ comment; + don't lose the char after the newline. + + * configure.bat: New file. + +Thu Sep 24 16:23:15 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c: Check for using alloca.h as getopt.c does. + +Sun Sep 6 08:01:53 1992 Karl Berry (karl@hayley) + + * files.c (openfiles): open `fdefines' after we have assigned a name + to `tmpdefsfile', and only if `definesflag' is set. + (done): only create the real .tab.h file if `definesflag' is set. + * reader.c (packsymbols): don't close `fdefines' here. 
+ +Sat Sep 5 15:02:11 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * files.c (openfiles): Open fdefines as temp file, like ftable. + (done): Copy temp defines file to real one, like main output file. + +Fri Aug 21 12:47:48 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (dist): Don't release mergedir.awk + (install): Use sed, not awk. Don't depend on mergedir.awk. + * mergedir.awk: File effectively deleted. + +Wed Jul 29 00:53:25 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple: Test __sparc along with __sparc__. + +Sat Jul 11 14:08:33 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * lex.c (skip_white_space): Count \n just once at end of c++ comment. + +Fri Jun 26 00:00:30 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple: Comment fix; #line command updated. + +Wed Jun 24 15:12:42 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (install): Specify full new file name for the executable. + +Mon Jun 22 16:38:24 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (dist): Include bison.rnh in distribution. + +Sun Jun 21 22:42:13 1992 Eric Youngdale (youngdale@v6550c.nrl.navy.mil) + + Clean up rough edges in VMS port of bison, add support for remaining + command line options. + + * bison.cld: Add /version, /yacc, /file_prefix, and /name_prefix + switches. + + * build.com: General cleanup: add logic to automatically sense + which C compiler is present; add code to cwd to the directory + that contains bison sources; do not define XPFILE, XPFILE1 + (correct defaults are applied in file.c). + + * files.c: Append _tab, not .tab when using /file_prefix under VMS. + + * system.h: Include string.h instead of strings.h (a la USG). + + * vmsgetargs.c: Add support for all switches added to bison.cld. + +Sun Jun 21 15:53:26 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (install): Always specify new file name for install. 
+ Redirect awk output to temp file and install that. + +Wed May 27 22:27:50 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (yyparse): Make yybackup and yyerrlab1 always be used. + +Fri May 22 14:58:42 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (dist): Depend on bison.info + (bison.info): Delete spurious <. + +Sun May 17 21:48:55 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (.c.o): New rule. Use $(DEFS) directly. + (CFLAGS): Use just -g by default. + (CDEBUG): Variable deleted. + +Thu May 7 00:03:37 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * reader.c (copy_guard): Fix typo skipping comment. + +Mon May 4 01:23:21 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Version 1.18. + + * getargs.c (getargs): Change '0' to 0 in case for long options. + +Sun Apr 19 10:17:52 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * reader.c (packsymbols): Handle -p when declaring yylval. + +Sat Apr 18 18:18:48 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * output.c (output_gram): Output #endif properly at end of decl. + +Mon Mar 30 01:13:41 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Version 1.17. + + * Makefile.in (clean): Don't delete configuration files or TAGS. + (distclean): New target; do delete those. + +Sat Mar 28 17:18:50 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * output.c (output_gram): Conditionalize yyprhs on YYDEBUG. + + * LR0.c (augment_automaton): If copying sp->shifts to insert new + shift, handle case of inserting at end. + +Sat Mar 21 23:25:47 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * lex.c (skip_white_space): Handle C++ comments. + * reader.c (copy_definition, parse_union_decl, copy_guard): + (copy_action): Likewise. + +Sun Mar 8 01:22:21 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple (YYPOPSTACK): Fix typo. 
+ +Sat Feb 29 03:53:06 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (install): Install bison.info* files one by one. + +Fri Feb 28 19:55:30 1992 David J. MacKenzie (djm@wookumz.gnu.ai.mit.edu) + + * bison.1: Document long options as starting with `--', not `+'. + +Sat Feb 1 00:08:09 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * getargs.c (getargs): Accept value 0 from getopt_long. + +Thu Jan 30 23:39:15 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Makefile.in (mostlyclean): Renamed from `clean'. + (clean): Renamed from 'distclean'. Dep on mostlyclean, not realclean. + (realclean): Dep on clean. + +Mon Jan 27 21:59:19 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * bison.simple: Use malloc, not xmalloc, and handle failure explicitly. + +Sun Jan 26 22:40:04 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * conflicts.c (total_conflicts): Delete unused arg to fprintf. + +Tue Jan 21 23:17:44 1992 Richard Stallman (rms@mole.gnu.ai.mit.edu) + + * Version 1.16. + +Mon Jan 6 16:50:11 1992 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * Makefile (distclean): Depend on clean, not realclean. Don't rm TAGS. + (realclean): rm TAGS here. + + * symtab.c (free_symtab): Don't free the type names. + +Sun Dec 29 22:25:40 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * machine.h: MSDOS has 32-bit ints if __GO32__. + +Wed Dec 25 22:09:07 1991 David J. MacKenzie (djm at wookumz.gnu.ai.mit.edu) + + * bison.simple [_AIX]: Indent `#pragma alloca', so old C compilers + don't choke on it. + +Mon Dec 23 02:10:16 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * getopt.c, getopt1.c, getopt.h: Link them to standard source location. + * alloca.c: Likewise. + * Makefile.in (dist): Copy those files from current dir. + + * getargs.c: Update usage message. + + * LR0.c (augment_automaton): Put new shift in proper order. 
+ +Fri Dec 20 18:39:20 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * conflicts.c: Use memcpy if ANSI C library. + + * closure.c (set_fderives): Delete redundant assignment to vrow. + + * closure.c (print_firsts): Fix bounds and offset checking tags. + + * closure.c (tags): Declare just once at start of file. + + * LR0.c (allocate_itemsets): Eliminate unused var max. + (augment_automaton): Test sp is non-null. + + * lalr.c (initialize_LA): Make the vectors at least 1 element long. + + * reader.c (readgram): Remove separate YYSTYPE default for MSDOS. + +Wed Dec 18 02:40:32 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * print.c (print_grammar): Don't print disabled rules. + +Tue Dec 17 03:48:07 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * lex.c (lex): Parse hex escapes properly. + Handle \v when filling token_buffer. + + * lex.c: Include new.h. + (token_buffer): Change to a pointer. + (init_lex): Allocate initial buffer. + (grow_token_buffer): New function. + (lex, parse_percent_token): Use that. + + * reader.c (read_declarations): Call open_extra_files just once. + (parse_token_decl): Don't free previous typename value. + Don't increment nvars if symbol is already a nonterminal. + (parse_union_decl): Catch unmatched close-brace. + (parse_expect_decl): Null-terminate buffer. + (copy_guard): Set brace_flag for {, not for }. + + * reader.c: Fix %% in calls to fatal. + + * reader.c (token_buffer): Just one extern decl, at top level. + Declare as pointer. + + * symtab.c (free_symtab): Free type_name fields. Free symtab itself. + +Mon Nov 25 23:04:31 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * bison.simple: Handle alloca for AIX. + + * Makefile.in (mandir): Compute default using manext. + +Sat Nov 2 21:39:32 1991 David J. MacKenzie (djm at wookumz.gnu.ai.mit.edu) + + * Update all files to GPL version 2. 
+ +Fri Sep 6 01:51:36 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * bison.simple (__yy_bcopy): Use builtin if GCC version 2. + +Mon Aug 26 22:09:12 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * reader.c (parse_assoc_decl): Error if same symbol gets two precs. + +Mon Aug 26 16:42:09 1991 David J. MacKenzie (djm at pogo.gnu.ai.mit.edu) + + * Makefile.in, configure: Only put $< in Makefile if using VPATH, + because older makes don't understand it. + +Fri Aug 23 00:05:54 1991 David J. MacKenzie (djm at apple-gunkies) + + * conflicts.c [_AIX]: #pragma alloca. + * reduce.c: Don't define TRUE and FALSE if already defined. + +Mon Aug 12 22:49:58 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * Makefile.in: Add deps on system.h. + (install): Add some deps. + +Fri Aug 2 12:19:20 1991 David J. MacKenzie (djm at apple-gunkies) + + * Makefile.in (dist): Include texinfo.tex. + + * configure: Create config.status. Remove it and Makefile if + interrupted while creating them. + +Thu Aug 1 23:14:01 1991 David J. MacKenzie (djm at apple-gunkies) + + * configure: Check for +srcdir etc. arg and look for + Makefile.in in that directory. Set VPATH if srcdir is not `.'. + * Makefile.in (prefix): Renamed from DESTDIR. + +Wed Jul 31 21:29:47 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * print.c (print_grammar): Make output prettier. Break lines. + +Tue Jul 30 22:38:01 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * print.c (print_grammar): New function. + (verbose): Call it instead of printing token names here. + +Mon Jul 22 16:39:54 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * vmsgetargs.c (spec_name_prefix, spec_file_prefix): Define variables. + +Wed Jul 10 01:38:25 1991 David J. MacKenzie (djm at wookumz.gnu.ai.mit.edu) + + * configure, Makefile.in: $(INSTALLPROG) -> $(INSTALL), + $(INSTALLTEXT) -> $(INSTALLDATA). + +Tue Jul 9 00:53:58 1991 David J. 
MacKenzie (djm at wookumz.gnu.ai.mit.edu) + + * bison.simple: Don't include malloc.h if __TURBOC__. + +Sat Jul 6 15:18:12 1991 David J. MacKenzie (djm at geech.gnu.ai.mit.edu) + + * Replace Makefile with configure and Makefile.in. + Update README with current compilation instructions. + +Mon Jul 1 23:12:20 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * reader.c (reader): Make the output define YYBISON. + +Thu Jun 20 16:52:51 1991 David J. MacKenzie (djm at geech.gnu.ai.mit.edu) + + * Makefile (MANDIR, MANEXT): Install man page in + /usr/local/man/man1/bison.1 by default, instead of + /usr/man/manl/bison.l, for consistency with other GNU programs. + * Makefile: Rename BINDIR et al. to lowercase to conform to + GNU coding standards. + (install): Make man page non-executable. + +Fri May 31 23:22:13 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * Makefile (bison.info): New target. + (realclean): New target. + +Thu May 2 16:36:19 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * bison.simple: Use YYPRINT to print a token, if it's defined. + +Mon Apr 29 12:22:55 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * lalr.c (transpose): Rename R to R_arg. + (initialize_LA): Avoid shadowing variable j. + + * reader.c (packsymbols): Avoid shadowing variable i. + + * files.c: Declare exit and perror. + + * machine.h: Define MAXSHORT and MINSHORT for the eta-10. + +Tue Apr 2 20:49:12 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * allocate.c (mallocate): Always allocate at least one byte. + +Tue Mar 19 22:17:19 1991 Richard Stallman (rms at mole.gnu.ai.mit.edu) + + * Makefile (dist): Put alloca.c into distribution. + +Wed Mar 6 17:45:42 1991 Richard Stallman (rms at mole.ai.mit.edu) + + * print.c (print_actions): Nicer output for final states. + +Thu Feb 21 20:39:53 1991 Richard Stallman (rms at mole.ai.mit.edu) + + * output.c (output_rule_data): Break lines in yytline based on hpos. 
+ +Thu Feb 7 12:54:36 1991 Richard Stallman (rms at mole.ai.mit.edu) + + * bison.simple (yyparse): Move decl of yylsa before use. + +Tue Jan 15 23:41:33 1991 Richard Stallman (rms at mole.ai.mit.edu) + + * Version 1.14. + + * output.c (output_rule_data): Handle NULL in tags[i]. + +Fri Jan 11 17:27:24 1991 Richard Stallman (rms at mole.ai.mit.edu) + + * bison.simple: On MSDOS, include malloc.h. + +Sat Dec 29 19:59:55 1990 David J. MacKenzie (djm at wookumz.ai.mit.edu) + + * files.c: Use `mallocate' instead of `xmalloc' so no extra decl is + needed. + +Wed Dec 19 18:31:21 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * reader.c (readgram): Alternate YYSTYPE defn for MSDOS. + * files.c [MSDOS]: Declare xmalloc. + +Thu Dec 13 12:45:54 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * output.c (output_rule_data): Put all symbols in yytname. + + * bison.simple (yyparse): Delete extra fprintf arg + when printing a result of reduction. + +Mon Dec 10 13:55:15 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * reader.c (packsymbols): Don't declare yylval if pure_parser. + +Tue Oct 30 23:38:09 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * Version 1.12. + + * LR0.c (augment_automaton): Fix bugs adding sp2 to chain of shifts. + +Tue Oct 23 17:41:49 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * bison.simple: Don't define alloca if already defined. + +Sun Oct 21 22:10:53 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * getopt.c: On VMS, use string.h. + + * main.c (main): Return type int. + +Mon Sep 10 16:59:01 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * output.c (output_headers): Output macro defs for -p. + + * reader.c (readgram): Handle consecutive actions. + + * getargs.c (getargs): Rename -a to -p. + * files.c (openfiles): Change names used for -b. + +Mon Aug 27 00:30:15 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * reduce.c (reduce_grammar_tables): Don't map rlhs of disabled rule. 
+ +Sun Aug 26 13:43:32 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * closure.c (print_firsts, print_fderives): Use BITISSET to test bits. + +Thu Aug 23 22:13:40 1990 Richard Stallman (rms at mole.ai.mit.edu) + + * closure.c (print_firsts): vrowsize => varsetsize. + (print_fderives): rrowsize => rulesetsize. + +Fri Aug 10 15:32:11 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple (alloca): Don't define if already defined. + (__yy_bcopy): Alternate definition for C++. + +Wed Jul 11 00:46:03 1990 David J. MacKenzie (djm at albert.ai.mit.edu) + + * getargs.c (getargs): Mention +yacc in usage message. + +Tue Jul 10 17:29:08 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (parse_token_decl, copy_action): + Set value_components_used if appropriate. + (readgram): Inhibit output of YYSTYPE definition in that case. + +Sat Jun 30 13:47:57 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * output.c (output_parser): Define YYPURE if pure, and not otherwise. + Don't define YYIMPURE. + * bison.simple: Adjust conditionals accordingly. + * bison.simple (YYLEX): If locations not in use, don't pass &yylloc. + +Thu Jun 28 12:32:21 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * getargs.c (longopts): Add `yacc'. + +Thu Jun 28 00:40:21 1990 David J. MacKenzie (djm at apple-gunkies) + + * getargs.c (getargs): Add long options. + * Makefile: Link with getopt1.o and add getopt1.c and getopt.h to + dist. + + * Move version number and description back into version.c from + Makefile and getargs.c. + * Makefile (dist): Extract version number from version.c. + +Tue Jun 26 13:16:35 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * output.c (output): Always call output_gram. + * bison.simple (yyparse): Print rhs and lhs symbols of reduction rule. + +Thu Jun 21 00:15:40 1990 David J. MacKenzie (djm at albert.ai.mit.edu) + + * main.c: New global var `program_name' to hold argv[0] for error + messages. 
+ * allocate.c, files.c, getargs.c, reader.c: Use `program_name' + in messages instead of hardcoded "bison". + +Wed Jun 20 23:38:34 1990 David J. MacKenzie (djm at albert.ai.mit.edu) + + * Makefile: Specify Bison version here. Add rule to pass it to + version.c. Encode it in distribution directory and tar file names. + * version.c: Use version number from Makefile. + * getargs.c (getargs): Print additional text that used to be part of + version_string in version.c. Use -V instead of -version to print + Bison version info. Print a usage message and exit if given an + invalid option. + +Tue Jun 19 01:15:18 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple: Fix a #line. + + * Makefile (INSTALL): New parameter. + (install): Use that. + (CFLAGS): Move definition to top. + +Sun Jun 17 17:10:21 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (parse_type_decl): Ignore semicolon. + Remove excess % from error messages. + +Sat Jun 16 19:15:48 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.11. + + * Makefile (install): Ensure installed files readable. + +Tue Jun 12 12:50:56 EDT 1990 Jay Fenlason (hack@ai.mit.edu) + + * getargs.c: Declare spec_file_prefix + + * lex.c (lex): \a is '\007' instead of '007' + + * reader.c: include machine.h + + * files.h: Declare extern spec_name_prefix. + + Trivial patch from Thorsten Ohl (td12@ddagsi3.bitnet) + +Thu May 31 22:00:16 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.10. + + * bison.simple (YYBACKUP, YYRECOVERING): New macros. + (YYINITDEPTH): This is what used to be YYMAXDEPTH. + (YYMAXDEPTH): This is what used to be YYMAXLIMIT. + If the value is 0, use the default instead. + (yyparse): Return 2 on stack overflow. + +Wed May 30 21:09:07 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple (YYERROR): Jump to new label; don't print error message. + (yyparse): Define label yyerrlab1. 
+ +Wed May 16 13:23:58 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * files.c (openfiles): Support -b. + * getargs.c (getargs): Likewise. + + * reader.c (readgram): Error if too many symbols. + + * lex.c (lex): Handle \a. Make error msgs more reliable. + * reader.c (read_declarations): Make error msgs more reliable. + +Sun May 13 15:03:37 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.09. + + * reduce.c (reduce_grammar_tables): Fix backward test. + +Sat May 12 21:05:34 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Makefile (bison-dist.*): Rename targets and files to bison.*. + (bison.tar): Make tar file to unpack into subdirectory named `bison'. + +Mon Apr 30 03:46:58 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reduce.c (reduce_grammar_tables): Set rlhs to -1 for useless rules. + * nullable.c (set_nullable): Ignore those rules. + * derives.c (set_derives): Likewise. + +Mon Apr 23 15:16:09 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple (yyparse): Mention rule number as well as line number. + +Thu Mar 29 00:00:43 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple (__yy_bcopy): New function. + (yyparse): Use that, not bcopy. + +Wed Mar 28 15:23:51 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * print.c (print_actions): Don't alter i and j spuriously when errp==0. + +Mon Mar 12 16:22:18 1990 Jim Kingdon (kingdon at pogo.ai.mit.edu) + + * bison.simple [__GNUC__]: Use builtin_alloca. + +Wed Mar 7 21:11:36 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Makefile (install): Use mergedir.awk to process bison.simple + for installation. + + * bison.simple (yyparse): New feature to include possible valid + tokens in parse error message. + +Sat Mar 3 14:10:56 1990 Richard Stallman (rms at geech) + + * Version 1.08. 
+ +Mon Feb 26 16:32:21 1990 Jim Kingdon (kingdon at pogo.ai.mit.edu) + + * print.c (print_actions) + conflicts.c (print_reductions): Change "shift %d" to + "shift, and go to state %d" and "reduce %d" to "reduce using rule %d" + and "goto %d" to "go to state %d". + print.c (print_core): Change "(%d)" to "(rule %d)". + +Tue Feb 20 14:22:47 EST 1990 Jay Fenlason (hack @ wookumz.ai.mit.edu) + + * bison.simple: Comment out unused yyresume: label. + +Fri Feb 9 16:14:34 EST 1990 Jay Fenlason (hack @ wookumz.ai.mit.edu) + + * bison.simple : surround all declarations and (remaining) uses of + yyls* and yylloc with #ifdef YYLSP_NEEDED This will significantly + cut down on stack usage, and gets rid of unused-variable msgs from + GCC. + +Wed Jan 31 13:06:08 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * files.c (done) [VMS]: Don't delete files that weren't used. + [VMS]: Let user override XPFILE and XPFILE1. + +Wed Jan 3 15:52:28 1990 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.07. + +Sat Dec 16 15:50:21 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * gram.c (dummy): New function. + + * reader.c (readgram): Detect error if two consec actions. + +Wed Nov 15 02:06:08 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reduce.c (reduce_grammar_tables): Update rline like other tables. + + * Makefile (install): Install the man page. + +Sat Nov 11 03:21:58 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * output.c (output_rule_data): Write #if YYDEBUG around yyrline. + +Wed Oct 18 13:07:55 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.06. + + * vmsgetargs.c (getargs): Downcase specified output file name. + +Fri Oct 13 17:48:14 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (readgram): Warn if there is no default to use for $$ + and one is needed. + +Fri Sep 29 12:51:53 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.05. 
+ + * vmsgetargs.h (getargs): Process outfile option. + +Fri Sep 8 03:05:14 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Version 1.04. + + * reader.c (parse_union_decl): Count newlines even in comments. + +Wed Sep 6 22:03:19 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * files.c (openfiles): short_base_length was always == base_length. + +Thu Aug 24 16:55:06 1989 Richard Stallman (rms at apple-gunkies.ai.mit.edu) + + * Version 1.03. + + * files.c (openfiles): Write output into same dir as input, by default. + +Wed Aug 23 15:03:07 1989 Jay Fenlason (hack at gnu) + + * Makefile: Include system.h in bison-dist.tar + +Tue Aug 15 22:30:42 1989 Richard Stallman (rms at hobbes.ai.mit.edu) + + * version 1.03. + + * reader.c (reader): Output LTYPESTR to fdefines + only after reading the grammar. + +Sun Aug 6 16:55:23 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (read_declarations): Put space before comment + to avoid bug in Green Hills C compiler. + +Mon Jun 19 20:14:01 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * allocate.c (xmalloc): New function. + +Fri Jun 16 23:59:40 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * build.com: Compile and link reduce.c. + +Fri Jun 9 23:00:54 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reduce.c (reduce_grammar_tables): Adjust start_symbol when #s change. + +Sat May 27 17:57:29 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (copy_definition, copy_guard): Don't object to \-newline + inside strings. + +Mon May 22 12:30:59 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * files.c (openfiles): Alternate file names for MSDOS. + (open_extra_files): Likewise. + (done): On MSDOS, unlink temp files here, not in openfiles. + + * machine.h (BITS_PER_WORD): 16 on MSDOS. + (MAXTABLE): Now defined in this file. + + * system.h: New file includes system-dependent headers. + All relevant .c files include it. 
+ +Thu Apr 27 17:00:47 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * version.c: Version 1.01. + +Tue Apr 18 12:46:05 1989 Randall Smith (randy at apple-gunkies.ai.mit.edu) + + * conflicts.c (total_conflicts): Fixed typo in yacc style output; + mention conflicts if > 0. + +Sat Apr 15 17:36:18 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (packsymbols): Start new symbols after 256. + +Wed Apr 12 14:09:09 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (reader): Always assign code 256 to `error' token. + Always set `translations' to 1 so this code gets handled. + * bison.simple (YYERRCODE): Define it. + +Tue Apr 11 19:26:32 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * conflicts.c: If GNU C, use builtin alloca. + + * Makefile (install): Delete parser files before copying them. + +Thu Mar 30 13:51:17 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * getargs.c (getargs): Turn off checking of name Bison was invoked by. + + * Makefile (dist): Include ChangeLog in distrib. + +Thu Mar 23 15:19:41 1989 Jay Fenlason (hack at apple-gunkies.ai.mit.edu) + + * LR0.c closure.c conflicts.c derives.c files.c getargs.c lalr.c + lex.c main.c nullable.c output.c print.c reader.c reduce.c + symtab.c warshall.c: A first pass at getting gcc -Wall to shut up. + Mostly declared functions as void, etc. + + * reduce.c moved 'extern int fixed_outfiles;' into print_notices + where it belongs. + +Wed Mar 1 12:33:28 1989 Randall Smith (randy at apple-gunkies.ai.mit.edu) + + * types.h, symtab.h, state.h, new.h, machine.h, lex.h, gram.h, + files.h, closure.c, vmsgetargs.c, warshall.c, symtab.c, reduce.c, + reader.c, print.c, output.c, nullable.c, main.c, lex.c, lalr.c, + gram.c, getargs.c, files.c, derives.c, conflicts.c, allocate.c, + LR0.c, Makefile, bison.simple: Changed copyright notices to be in + accord with the new General Public License. + * COPYING: Made a link to the new copying file. 
+ +Wed Feb 22 06:18:20 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * new.h (FREE): Alternate definition for __STDC__ avoids error + if `free' returns void. + +Tue Feb 21 15:03:34 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (read_declarations): Double a `%' in a format string. + (copy_definition, parse_start_decl, parse_token_decl): Likewise. + (parse_type_decl, parse_union_decl, copy_guard, readgram, get_type). + (copy_action): change a `fatal' to `fatals'. + + * lalr.c (map_goto): Initial high-end of binary search was off by 1. + +Sat Feb 18 08:49:57 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple [sparc]: Include alloca.h. + +Wed Feb 15 06:24:36 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (packsymbols): Write decl of yylval into .tab.h file. + +Sat Jan 28 18:19:05 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple: Avoid comments on `#line' lines. + + * reader.c (LTYPESTR): Rearrange to avoid whitespace after \-newline. + +Mon Jan 9 18:43:08 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * conflicts.c (total_conflicts): if -y, use output syntax POSIX wants. + * reduce.c (print_notices): likewise. + + * lex.c (lex): Handle \v, and \x hex escapes. + + * reader.c (reader): Merge output_ltype into here. + Don't output YYLTYPE definition to .tab.h file + unless the @ construct is used. + + * bison.simple: Define YYERROR, YYABORT, YYACCEPT here. + * reader.c (output_ltype): Don't output them here. + + * bison.simple: YYDEBUG now should be 0 or 1. + * output.c (output): For YYDEBUG, output conditional to define it + only if not previously defined. + +Mon Jan 2 11:29:55 1989 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple (yyparse) [YYPURE]: Add local yynerrs. + (yydebug): Declare global, but don't initialize, regardless of YYPURE. + (yyparse): Don't declare yydebug here. 
+ +Thu Dec 22 22:01:22 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reduce.c (print_notices): Typo in message. + +Sun Dec 11 11:32:07 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * output.c (pack_table): Free only nonzero the elts of froms & tos. + +Thu Dec 8 16:26:46 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * gram.c (rprecsym): New vector indicates the %prec symbol for a rule. + * reader.c (packgram): Allocate it and fill it in. + * reduce.c (inaccessable_symbols): Use it to set V1. + * reduce.c (print_results): Don't complain about useless token + if it's in V1. + +Mon Dec 5 14:33:17 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * machine.h (RESETBIT, BITISSET): New macros. + (SETBIT, WORDSIZE): Change to use BITS_PER_WORD. + + * reduce.c: New file, by David Bakin. Reduces the grammar. + * Makefile: Compile it, link it, put it in dist. + + * main.c (main): Call reduce_grammar (in reduce.c). + +Thu Nov 17 18:33:04 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * conflicts.c: Don't declare alloca if including alloca.h. + + * bison.cld: Define qualifiers `nolines', `debug'. + * vmsgetargs.c (getargs): Handle them. + + * output.c (output_program): Notice `nolinesflag'. + + * output.c (output_parser): Simplify logic for -l and #line. + Avoid writing EOF char into output. + +Wed Oct 12 18:00:03 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Implement `-l' option. + * getopt.c: Set flag `nolinesflag'. + * reader.c (copy_definition, parse_union_decl, copy_guard, copy_action) + Obey that flag; don't generate #line. + * output.c (output_parser): Discard #line's when copying the parser. + +Mon Sep 12 16:33:17 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (copy_guard): Fix brace-counting for brace-surrounded guard. + +Thu Sep 8 20:09:53 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * bison.simple: Correct number in #line command. 
+ (yyparse): Call YYABORT instead of YYERROR, due to last change in + output_ltype. + +Mon Sep 5 14:55:30 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * Makefile: New variable LIBS. Alternatives for USG. + * conflicts.c [USG]: Define bcopy. + * symtab.c [USG]: Include string.h instead of strings.h. + + * conflicts.c [sparc]: Include alloca.h. + +Tue Aug 2 08:38:38 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (parse_token_decl): Ignore commas. + +Sat Jun 25 10:29:20 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * reader.c (output_ltype): Make YYERROR yacc-compatible (like YYFAIL). + +Fri Jun 24 11:25:11 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu) + + * getargs.c (getargs): -t sets debugflag. + Eliminate upper case duplicate options. + * output.c (output): If debugflag, output `#define YYDEBUG'. + +Thu May 26 06:04:21 1988 Richard Stallman (rms at frosted-flakes.ai.mit.edu) + + * allocate.c (mallocate): New name for `allocate' (which loses in VMS). + Calls changed in LR0.c, conflicts.c, symtab.c, new.h. + + * getargs.c (getargs): If argv[0] is "yacc", set fixed_outfiles. + +Tue May 17 12:15:30 1988 Richard Stallman (rms at frosted-flakes.ai.mit.edu) + + * conflicts.c: Declare alloca. + * reader.c: Declare realloc. + * warshall.c (TC): Fix one arithmetic op that was omitted last time. + +Thu May 5 14:36:03 1988 Richard Stallman (rms at frosted-flakes.ai.mit.edu) + + * bison.simple: Conditionalize most refs to yylsp on YYLSP_NEEDED. + * reader.c (copy_guard, copy_action): Notice if `@' is used. + (reader): If it was, output `#define YYLSP_NEEDED'. + +Mon Apr 18 04:54:32 1988 Richard Stallman (rms at rice-krispies.ai.mit.edu) + + * bison.simple: New variable yynerr counts calls to yyerror. + + * lex.c (lex, case '='): Update lineno when skipping a newline. + + * reader.c (parse_expect_decl): ungetc the char that ends the number; + don't read any further. 
This handles multi-line comments right + and avoids incorrect lineno. + + * reader.c: Delete duplicate decl of symval. + + * warshall.c (RTC, TC): Cast ptrs to char *, not unsigned, for arith. diff --git a/contrib/bison/INSTALL b/contrib/bison/INSTALL new file mode 100644 index 000000000000..a2c8722ccaff --- /dev/null +++ b/contrib/bison/INSTALL @@ -0,0 +1,181 @@ +Basic Installation +================== + + These are generic installation instructions. + + The `configure' shell script attempts to guess correct values for +various system-dependent variables used during compilation. It uses +those values to create a `Makefile' in each directory of the package. +It may also create one or more `.h' files containing system-dependent +definitions. Finally, it creates a shell script `config.status' that +you can run in the future to recreate the current configuration, a file +`config.cache' that saves the results of its tests to speed up +reconfiguring, and a file `config.log' containing compiler output +(useful mainly for debugging `configure'). + + If you need to do unusual things to compile the package, please try +to figure out how `configure' could check whether to do them, and mail +diffs or instructions to the address given in the `README' so they can +be considered for the next release. If at some point `config.cache' +contains results you don't want to keep, you may remove or edit it. + + The file `configure.in' is used to create `configure' by a program +called `autoconf'. You only need `configure.in' if you want to change +it or regenerate `configure' using a newer version of `autoconf'. + +The simplest way to compile this package is: + + 1. `cd' to the directory containing the package's source code and type + `./configure' to configure the package for your system. If you're + using `csh' on an old version of System V, you might need to type + `sh ./configure' instead to prevent `csh' from trying to execute + `configure' itself. 
+ + Running `configure' takes a while. While running, it prints some + messages telling which features it is checking for. + + 2. Type `make' to compile the package. + + 3. Optionally, type `make check' to run any self-tests that come with + the package. + + 4. Type `make install' to install the programs and any data files and + documentation. + + 5. You can remove the program binaries and object files from the + source code directory by typing `make clean'. To also remove the + files that `configure' created (so you can compile the package for + a different kind of computer), type `make distclean'. There is + also a `make maintainer-clean' target, but that is intended mainly + for the package's developers. If you use it, you may have to get + all sorts of other programs in order to regenerate files that came + with the distribution. + +Compilers and Options +===================== + + Some systems require unusual options for compilation or linking that +the `configure' script does not know about. You can give `configure' +initial values for variables by setting them in the environment. Using +a Bourne-compatible shell, you can do that on the command line like +this: + CC=c89 CFLAGS=-O2 LIBS=-lposix ./configure + +Or on systems that have the `env' program, you can do it like this: + env CPPFLAGS=-I/usr/local/include LDFLAGS=-s ./configure + +Compiling For Multiple Architectures +==================================== + + You can compile the package for more than one kind of computer at the +same time, by placing the object files for each architecture in their +own directory. To do this, you must use a version of `make' that +supports the `VPATH' variable, such as GNU `make'. `cd' to the +directory where you want the object files and executables to go and run +the `configure' script. `configure' automatically checks for the +source code in the directory that `configure' is in and in `..'.
+ + If you have to use a `make' that does not support the `VPATH' +variable, you have to compile the package for one architecture at a time +in the source code directory. After you have installed the package for +one architecture, use `make distclean' before reconfiguring for another +architecture. + +Installation Names +================== + + By default, `make install' will install the package's files in +`/usr/local/bin', `/usr/local/man', etc. You can specify an +installation prefix other than `/usr/local' by giving `configure' the +option `--prefix=PATH'. + + You can specify separate installation prefixes for +architecture-specific files and architecture-independent files. If you +give `configure' the option `--exec-prefix=PATH', the package will use +PATH as the prefix for installing programs and libraries. +Documentation and other data files will still use the regular prefix. + + In addition, if you use an unusual directory layout you can give +options like `--bindir=PATH' to specify different values for particular +kinds of files. Run `configure --help' for a list of the directories +you can set and what kinds of files go in them. + + If the package supports it, you can cause programs to be installed +with an extra prefix or suffix on their names by giving `configure' the +option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'. + +Optional Features +================= + + Some packages pay attention to `--enable-FEATURE' options to +`configure', where FEATURE indicates an optional part of the package. +They may also pay attention to `--with-PACKAGE' options, where PACKAGE +is something like `gnu-as' or `x' (for the X Window System). The +`README' should mention any `--enable-' and `--with-' options that the +package recognizes.
+ + For packages that use the X Window System, `configure' can usually +find the X include and library files automatically, but if it doesn't, +you can use the `configure' options `--x-includes=DIR' and +`--x-libraries=DIR' to specify their locations. + +Specifying the System Type +========================== + + There may be some features `configure' can not figure out +automatically, but needs to determine by the type of host the package +will run on. Usually `configure' can figure that out, but if it prints +a message saying it can not guess the host type, give it the +`--host=TYPE' option. TYPE can either be a short name for the system +type, such as `sun4', or a canonical name with three fields: + CPU-COMPANY-SYSTEM + +See the file `config.sub' for the possible values of each field. If +`config.sub' isn't included in this package, then this package doesn't +need to know the host type. + + If you are building compiler tools for cross-compiling, you can also +use the `--target=TYPE' option to select the type of system they will +produce code for and the `--build=TYPE' option to select the type of +system on which you are compiling the package. + +Sharing Defaults +================ + + If you want to set default values for `configure' scripts to share, +you can create a site shell script called `config.site' that gives +default values for variables like `CC', `cache_file', and `prefix'. +`configure' looks for `PREFIX/share/config.site' if it exists, then +`PREFIX/etc/config.site' if it exists. Or, you can set the +`CONFIG_SITE' environment variable to the location of the site script. +A warning: not all `configure' scripts look for a site script. + +Operation Controls +================== + + `configure' recognizes the following options to control how it +operates. + +`--cache-file=FILE' + Use and save the results of the tests in FILE instead of + `./config.cache'. Set FILE to `/dev/null' to disable caching, for + debugging `configure'. 
+ +`--help' + Print a summary of the options to `configure', and exit. + +`--quiet' +`--silent' +`-q' + Do not print messages saying which checks are being made. + +`--srcdir=DIR' + Look for the package's source code in directory DIR. Usually + `configure' can determine that directory automatically. + +`--version' + Print the version of Autoconf used to generate the `configure' + script, and exit. + +`configure' also accepts some other, not widely useful, options. + diff --git a/contrib/bison/LR0.c b/contrib/bison/LR0.c new file mode 100644 index 000000000000..77cc02514b87 --- /dev/null +++ b/contrib/bison/LR0.c @@ -0,0 +1,704 @@ +/* Generate the nondeterministic finite state machine for bison, + Copyright (C) 1984, 1986, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* See comments in state.h for the data structures that represent it. + The entry point is generate_states. 
*/ + +#include +#include "system.h" +#include "machine.h" +#include "new.h" +#include "gram.h" +#include "state.h" + + +extern char *nullable; +extern short *itemset; +extern short *itemsetend; + + +int nstates; +int final_state; +core *first_state; +shifts *first_shift; +reductions *first_reduction; + +int get_state(); +core *new_state(); + +void new_itemsets(); +void append_states(); +void initialize_states(); +void save_shifts(); +void save_reductions(); +void augment_automaton(); +void insert_start_shift(); +extern void initialize_closure(); +extern void closure(); +extern void finalize_closure(); +extern void toomany(); + +static core *this_state; +static core *last_state; +static shifts *last_shift; +static reductions *last_reduction; + +static int nshifts; +static short *shift_symbol; + +static short *redset; +static short *shiftset; + +static short **kernel_base; +static short **kernel_end; +static short *kernel_items; + +/* hash table for states, to recognize equivalent ones. */ + +#define STATE_TABLE_SIZE 1009 +static core **state_table; + + + +void +allocate_itemsets() +{ + register short *itemp; + register int symbol; + register int i; + register int count; + register short *symbol_count; + + count = 0; + symbol_count = NEW2(nsyms, short); + + itemp = ritem; + symbol = *itemp++; + while (symbol) + { + if (symbol > 0) + { + count++; + symbol_count[symbol]++; + } + symbol = *itemp++; + } + + /* see comments before new_itemsets. All the vectors of items + live inside kernel_items. The number of active items after + some symbol cannot be more than the number of times that symbol + appears as an item, which is symbol_count[symbol]. + We allocate that much space for each symbol. 
*/ + + kernel_base = NEW2(nsyms, short *); + kernel_items = NEW2(count, short); + + count = 0; + for (i = 0; i < nsyms; i++) + { + kernel_base[i] = kernel_items + count; + count += symbol_count[i]; + } + + shift_symbol = symbol_count; + kernel_end = NEW2(nsyms, short *); +} + + +void +allocate_storage() +{ + allocate_itemsets(); + + shiftset = NEW2(nsyms, short); + redset = NEW2(nrules + 1, short); + state_table = NEW2(STATE_TABLE_SIZE, core *); +} + + +void +free_storage() +{ + FREE(shift_symbol); + FREE(redset); + FREE(shiftset); + FREE(kernel_base); + FREE(kernel_end); + FREE(kernel_items); + FREE(state_table); +} + + + +/* compute the nondeterministic finite state machine (see state.h for details) +from the grammar. */ +void +generate_states() +{ + allocate_storage(); + initialize_closure(nitems); + initialize_states(); + + while (this_state) + { + /* Set up ruleset and itemset for the transitions out of this state. + ruleset gets a 1 bit for each rule that could reduce now. + itemset gets a vector of all the items that could be accepted next. */ + closure(this_state->items, this_state->nitems); + /* record the reductions allowed out of this state */ + save_reductions(); + /* find the itemsets of the states that shifts can reach */ + new_itemsets(); + /* find or create the core structures for those states */ + append_states(); + + /* create the shifts structures for the shifts to those states, + now that the state numbers transitioning to are known */ + if (nshifts > 0) + save_shifts(); + + /* states are queued when they are created; process them all */ + this_state = this_state->next; + } + + /* discard various storage */ + finalize_closure(); + free_storage(); + + /* set up initial and final states as parser wants them */ + augment_automaton(); +} + + + +/* Find which symbols can be shifted in the current state, + and for each one record which items would be active after that shift. + Uses the contents of itemset. 
+ shift_symbol is set to a vector of the symbols that can be shifted. + For each symbol in the grammar, kernel_base[symbol] points to + a vector of item numbers activated if that symbol is shifted, + and kernel_end[symbol] points after the end of that vector. */ +void +new_itemsets() +{ + register int i; + register int shiftcount; + register short *isp; + register short *ksp; + register int symbol; + +#ifdef TRACE + fprintf(stderr, "Entering new_itemsets\n"); +#endif + + for (i = 0; i < nsyms; i++) + kernel_end[i] = NULL; + + shiftcount = 0; + + isp = itemset; + + while (isp < itemsetend) + { + i = *isp++; + symbol = ritem[i]; + if (symbol > 0) + { + ksp = kernel_end[symbol]; + + if (!ksp) + { + shift_symbol[shiftcount++] = symbol; + ksp = kernel_base[symbol]; + } + + *ksp++ = i + 1; + kernel_end[symbol] = ksp; + } + } + + nshifts = shiftcount; +} + + + +/* Use the information computed by new_itemsets to find the state numbers + reached by each shift transition from the current state. + + shiftset is set up as a vector of state numbers of those states. */ +void +append_states() +{ + register int i; + register int j; + register int symbol; + +#ifdef TRACE + fprintf(stderr, "Entering append_states\n"); +#endif + + /* first sort shift_symbol into increasing order */ + + for (i = 1; i < nshifts; i++) + { + symbol = shift_symbol[i]; + j = i; + while (j > 0 && shift_symbol[j - 1] > symbol) + { + shift_symbol[j] = shift_symbol[j - 1]; + j--; + } + shift_symbol[j] = symbol; + } + + for (i = 0; i < nshifts; i++) + { + symbol = shift_symbol[i]; + shiftset[i] = get_state(symbol); + } +} + + + +/* find the state number for the state we would get to +(from the current state) by shifting symbol. +Create a new state if no equivalent one exists already. 
+Used by append_states */ + +int +get_state(symbol) +int symbol; +{ + register int key; + register short *isp1; + register short *isp2; + register short *iend; + register core *sp; + register int found; + + int n; + +#ifdef TRACE + fprintf(stderr, "Entering get_state, symbol = %d\n", symbol); +#endif + + isp1 = kernel_base[symbol]; + iend = kernel_end[symbol]; + n = iend - isp1; + + /* add up the target state's active item numbers to get a hash key */ + key = 0; + while (isp1 < iend) + key += *isp1++; + + key = key % STATE_TABLE_SIZE; + + sp = state_table[key]; + + if (sp) + { + found = 0; + while (!found) + { + if (sp->nitems == n) + { + found = 1; + isp1 = kernel_base[symbol]; + isp2 = sp->items; + + while (found && isp1 < iend) + { + if (*isp1++ != *isp2++) + found = 0; + } + } + + if (!found) + { + if (sp->link) + { + sp = sp->link; + } + else /* bucket exhausted and no match */ + { + sp = sp->link = new_state(symbol); + found = 1; + } + } + } + } + else /* bucket is empty */ + { + state_table[key] = sp = new_state(symbol); + } + + return (sp->number); +} + + + +/* subroutine of get_state. create a new state for those items, if necessary. 
*/ + +core * +new_state(symbol) +int symbol; +{ + register int n; + register core *p; + register short *isp1; + register short *isp2; + register short *iend; + +#ifdef TRACE + fprintf(stderr, "Entering new_state, symbol = %d\n", symbol); +#endif + + if (nstates >= MAXSHORT) + toomany("states"); + + isp1 = kernel_base[symbol]; + iend = kernel_end[symbol]; + n = iend - isp1; + + p = (core *) xmalloc((unsigned) (sizeof(core) + (n - 1) * sizeof(short))); + p->accessing_symbol = symbol; + p->number = nstates; + p->nitems = n; + + isp2 = p->items; + while (isp1 < iend) + *isp2++ = *isp1++; + + last_state->next = p; + last_state = p; + + nstates++; + + return (p); +} + + +void +initialize_states() +{ + register core *p; +/* register unsigned *rp1; JF unused */ +/* register unsigned *rp2; JF unused */ +/* register unsigned *rend; JF unused */ + + p = (core *) xmalloc((unsigned) (sizeof(core) - sizeof(short))); + first_state = last_state = this_state = p; + nstates = 1; +} + + +void +save_shifts() +{ + register shifts *p; + register short *sp1; + register short *sp2; + register short *send; + + p = (shifts *) xmalloc((unsigned) (sizeof(shifts) + + (nshifts - 1) * sizeof(short))); + + p->number = this_state->number; + p->nshifts = nshifts; + + sp1 = shiftset; + sp2 = p->shifts; + send = shiftset + nshifts; + + while (sp1 < send) + *sp2++ = *sp1++; + + if (last_shift) + { + last_shift->next = p; + last_shift = p; + } + else + { + first_shift = p; + last_shift = p; + } +} + + + +/* find which rules can be used for reduction transitions from the current state + and make a reductions structure for the state to record their rule numbers. 
*/ +void +save_reductions() +{ + register short *isp; + register short *rp1; + register short *rp2; + register int item; + register int count; + register reductions *p; + + short *rend; + + /* find and count the active items that represent ends of rules */ + + count = 0; + for (isp = itemset; isp < itemsetend; isp++) + { + item = ritem[*isp]; + if (item < 0) + { + redset[count++] = -item; + } + } + + /* make a reductions structure and copy the data into it. */ + + if (count) + { + p = (reductions *) xmalloc((unsigned) (sizeof(reductions) + + (count - 1) * sizeof(short))); + + p->number = this_state->number; + p->nreds = count; + + rp1 = redset; + rp2 = p->rules; + rend = rp1 + count; + + while (rp1 < rend) + *rp2++ = *rp1++; + + if (last_reduction) + { + last_reduction->next = p; + last_reduction = p; + } + else + { + first_reduction = p; + last_reduction = p; + } + } +} + + + +/* Make sure that the initial state has a shift that accepts the +grammar's start symbol and goes to the next-to-final state, +which has a shift going to the final state, which has a shift +to the termination state. +Create such states and shifts if they don't happen to exist already. */ +void +augment_automaton() +{ + register int i; + register int k; +/* register int found; JF unused */ + register core *statep; + register shifts *sp; + register shifts *sp2; + register shifts *sp1; + + sp = first_shift; + + if (sp) + { + if (sp->number == 0) + { + k = sp->nshifts; + statep = first_state->next; + + /* The states reached by shifts from first_state are numbered 1...K. + Look for one reached by start_symbol. */ + while (statep->accessing_symbol < start_symbol + && statep->number < k) + statep = statep->next; + + if (statep->accessing_symbol == start_symbol) + { + /* We already have a next-to-final state. + Make sure it has a shift to what will be the final state. 
*/ + k = statep->number; + + while (sp && sp->number < k) + { + sp1 = sp; + sp = sp->next; + } + + if (sp && sp->number == k) + { + sp2 = (shifts *) xmalloc((unsigned) (sizeof(shifts) + + sp->nshifts * sizeof(short))); + sp2->number = k; + sp2->nshifts = sp->nshifts + 1; + sp2->shifts[0] = nstates; + for (i = sp->nshifts; i > 0; i--) + sp2->shifts[i] = sp->shifts[i - 1]; + + /* Patch sp2 into the chain of shifts in place of sp, + following sp1. */ + sp2->next = sp->next; + sp1->next = sp2; + if (sp == last_shift) + last_shift = sp2; + FREE(sp); + } + else + { + sp2 = NEW(shifts); + sp2->number = k; + sp2->nshifts = 1; + sp2->shifts[0] = nstates; + + /* Patch sp2 into the chain of shifts between sp1 and sp. */ + sp2->next = sp; + sp1->next = sp2; + if (sp == 0) + last_shift = sp2; + } + } + else + { + /* There is no next-to-final state as yet. */ + /* Add one more shift in first_shift, + going to the next-to-final state (yet to be made). */ + sp = first_shift; + + sp2 = (shifts *) xmalloc(sizeof(shifts) + + sp->nshifts * sizeof(short)); + sp2->nshifts = sp->nshifts + 1; + + /* Stick this shift into the vector at the proper place. */ + statep = first_state->next; + for (k = 0, i = 0; i < sp->nshifts; k++, i++) + { + if (statep->accessing_symbol > start_symbol && i == k) + sp2->shifts[k++] = nstates; + sp2->shifts[k] = sp->shifts[i]; + statep = statep->next; + } + if (i == k) + sp2->shifts[k++] = nstates; + + /* Patch sp2 into the chain of shifts + in place of sp, at the beginning. */ + sp2->next = sp->next; + first_shift = sp2; + if (last_shift == sp) + last_shift = sp2; + + FREE(sp); + + /* Create the next-to-final state, with shift to + what will be the final state. */ + insert_start_shift(); + } + } + else + { + /* The initial state didn't even have any shifts. + Give it one shift, to the next-to-final state. */ + sp = NEW(shifts); + sp->nshifts = 1; + sp->shifts[0] = nstates; + + /* Patch sp into the chain of shifts at the beginning. 
*/ + sp->next = first_shift; + first_shift = sp; + + /* Create the next-to-final state, with shift to + what will be the final state. */ + insert_start_shift(); + } + } + else + { + /* There are no shifts for any state. + Make one shift, from the initial state to the next-to-final state. */ + + sp = NEW(shifts); + sp->nshifts = 1; + sp->shifts[0] = nstates; + + /* Initialize the chain of shifts with sp. */ + first_shift = sp; + last_shift = sp; + + /* Create the next-to-final state, with shift to + what will be the final state. */ + insert_start_shift(); + } + + /* Make the final state--the one that follows a shift from the + next-to-final state. + The symbol for that shift is 0 (end-of-file). */ + statep = (core *) xmalloc((unsigned) (sizeof(core) - sizeof(short))); + statep->number = nstates; + last_state->next = statep; + last_state = statep; + + /* Make the shift from the final state to the termination state. */ + sp = NEW(shifts); + sp->number = nstates++; + sp->nshifts = 1; + sp->shifts[0] = nstates; + last_shift->next = sp; + last_shift = sp; + + /* Note that the variable `final_state' refers to what we sometimes call + the termination state. */ + final_state = nstates; + + /* Make the termination state. */ + statep = (core *) xmalloc((unsigned) (sizeof(core) - sizeof(short))); + statep->number = nstates++; + last_state->next = statep; + last_state = statep; +} + + +/* subroutine of augment_automaton. + Create the next-to-final state, to which a shift has already been made in + the initial state. */ +void +insert_start_shift() +{ + register core *statep; + register shifts *sp; + + statep = (core *) xmalloc((unsigned) (sizeof(core) - sizeof(short))); + statep->number = nstates; + statep->accessing_symbol = start_symbol; + + last_state->next = statep; + last_state = statep; + + /* Make a shift from this state to (what will be) the final state. 
*/ + sp = NEW(shifts); + sp->number = nstates++; + sp->nshifts = 1; + sp->shifts[0] = nstates; + + last_shift->next = sp; + last_shift = sp; +} diff --git a/contrib/bison/Makefile.in b/contrib/bison/Makefile.in new file mode 100644 index 000000000000..baabbf963cbb --- /dev/null +++ b/contrib/bison/Makefile.in @@ -0,0 +1,191 @@ +# Makefile for bison +# Copyright (C) 1988, 1989, 1991, 1993 Bob Corbett and Free Software Foundation, Inc. +# +# This file is part of Bison, the GNU Compiler Compiler. +# +# Bison is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2, or (at your option) +# any later version. +# +# Bison is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with Bison; see the file COPYING. If not, write to +# the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. + +#### Start of system configuration section. #### + +srcdir = @srcdir@ +VPATH = @srcdir@ + +CC = @CC@ +INSTALL = @INSTALL@ +INSTALL_PROGRAM = @INSTALL_PROGRAM@ +INSTALL_DATA = @INSTALL_DATA@ +MAKEINFO = makeinfo + +# Things you might add to DEFS: +# -DSTDC_HEADERS If you have ANSI C headers and libraries. +# -DHAVE_STRING_H If you don't have ANSI C headers but have string.h. +# -DHAVE_MEMORY_H If you don't have ANSI C headers and have memory.h. +# -DHAVE_STRERROR If you have strerror function. +DEFS = @DEFS@ + +CFLAGS = -g +LDFLAGS = + +LIBS = @LIBS@ + +# Some System V machines do not come with libPW. If this is true, use +# the GNU alloca.o here. 
+ALLOCA = @ALLOCA@ + +prefix = @prefix@ +exec_prefix = @exec_prefix@ + +# where the installed binary goes +bindir = $(exec_prefix)/bin + +# where the parsers go +datadir = $(prefix)/share + +# where the info files go +infodir = $(prefix)/info + +# where manual pages go and what their extensions should be +mandir = $(prefix)/man/man$(manext) +manext = 1 + +#### End of system configuration section. #### + +DISTFILES = COPYING ChangeLog Makefile.in configure configure.in \ + REFERENCES bison.1 bison.rnh configure.bat \ + bison.simple bison.hairy \ + LR0.c allocate.c closure.c conflicts.c derives.c \ + files.c getargs.c gram.c lalr.c lex.c main.c nullable.c \ + output.c print.c reader.c reduce.c symtab.c version.c \ + warshall.c files.h gram.h lex.h machine.h new.h state.h \ + symtab.h system.h types.h bison.cld build.com vmsgetargs.c \ + vmshlp.mar README INSTALL NEWS bison.texinfo bison.info* texinfo.tex \ + getopt.c getopt.h getopt1.c alloca.c mkinstalldirs install-sh + + +SHELL = /bin/sh + +# This rule allows us to supply the necessary -D options +# in addition to whatever the user asks for. +.c.o: + $(CC) -c $(DEFS) -I$(srcdir)/../include $(CPPFLAGS) $(CFLAGS) $< + +# names of parser files +PFILE = bison.simple +PFILE1 = bison.hairy + +PFILES = -DXPFILE=\"$(datadir)/$(PFILE)\" \ + -DXPFILE1=\"$(datadir)/$(PFILE1)\" + +OBJECTS = LR0.o allocate.o closure.o conflicts.o derives.o files.o \ + getargs.o gram.o lalr.o lex.o \ + main.o nullable.o output.o print.o reader.o reduce.o symtab.o \ + warshall.o version.o \ + getopt.o getopt1.o $(ALLOCA) + +all: bison bison.info bison.s1 + +Makefile: Makefile.in config.status + ./config.status + +config.status: configure + ./config.status --recheck + +configure: configure.in + cd $(srcdir); autoconf + +# Copy bison.simple, inserting directory name into the #line commands. 
+bison.s1: bison.simple + -rm -f bison.s1 + sed -e "/^#line/ s|bison|$(datadir)/bison|" < $(srcdir)/$(PFILE) > bison.s1 + +clean: + rm -f *.o core bison bison.s1 + +mostlyclean: clean + +distclean: clean + rm -f Makefile config.status + +realclean: distclean + rm -f TAGS *.info* + +# Most of these deps are in case using RCS. +install: all bison.1 $(srcdir)/$(PFILE) $(srcdir)/$(PFILE1) installdirs uninstall + $(INSTALL_PROGRAM) bison $(bindir)/bison + $(INSTALL_DATA) bison.s1 $(datadir)/$(PFILE) + $(INSTALL_DATA) $(srcdir)/$(PFILE1) $(datadir)/$(PFILE1) + cd $(srcdir); for f in bison.info*; \ + do $(INSTALL_DATA) $$f $(infodir)/$$f; done + -$(INSTALL_DATA) $(srcdir)/bison.1 $(mandir)/bison.$(manext) + +# Make sure all installation directories, e.g. $(bindir) actually exist by +# making them if necessary. +installdirs: + -sh $(srcdir)/mkinstalldirs $(bindir) $(datadir) $(libdir) $(infodir) $(mandir) + +uninstall: + rm -f $(bindir)/bison + -cd $(datadir); rm -f $(PFILE) $(PFILE1) + rm -f $(mandir)/bison.$(manext) $(infodir)/bison.info* + +check: + @echo "No checks implemented (yet)." + +bison: $(OBJECTS) + $(CC) $(LDFLAGS) $(CFLAGS) -o $@ $(OBJECTS) $(LIBS) + +# We don't use $(srcdir) in this rule +# because it is normally used in the master source dir +# in which configure has not been run. +dist: bison.info + echo bison-`sed -e '/version_string/!d' -e 's/[^0-9.]*\([0-9.]*\).*/\1/' -e q version.c` > .fname + -rm -rf `cat .fname` + mkdir `cat .fname` + dst=`cat .fname`; for f in $(DISTFILES); do \ + ln $$f $$dst/$$f || { echo copying $$f; cp -p $$f $$dst/$$f ; } \ + done + tar --gzip -chf `cat .fname`.tar.gz `cat .fname` + -rm -rf `cat .fname` .fname + +bison.info: bison.texinfo + $(MAKEINFO) $(srcdir)/bison.texinfo + +TAGS: *.c *.h + etags *.c *.h + +# This file is different to pass the parser file names to the compiler. 
+files.o: files.c + $(CC) -c $(PFILES) $(DEFS) $(CPPFLAGS) $(CFLAGS) \ + $(srcdir)/files.c $(OUTPUT_OPTION) + +LR0.o: system.h machine.h new.h gram.h state.h +closure.o: system.h machine.h new.h gram.h +conflicts.o: system.h machine.h new.h files.h gram.h state.h +derives.o: system.h new.h types.h gram.h +files.o: system.h files.h new.h gram.h +getargs.o: system.h files.h +lalr.o: system.h machine.h types.h state.h new.h gram.h +lex.o: system.h files.h symtab.h lex.h +main.o: system.h machine.h +nullable.o: system.h types.h gram.h new.h +output.o: system.h machine.h new.h files.h gram.h state.h +print.o: system.h machine.h new.h files.h gram.h state.h +reader.o: system.h files.h new.h symtab.h lex.h gram.h +reduce.o: system.h machine.h files.h new.h gram.h +symtab.o: system.h new.h symtab.h gram.h +warshall.o: system.h machine.h + +# Prevent GNU make v3 from overflowing arg limit on SysV. +.NOEXPORT: diff --git a/contrib/bison/NEWS b/contrib/bison/NEWS new file mode 100644 index 000000000000..db839bdd6783 --- /dev/null +++ b/contrib/bison/NEWS @@ -0,0 +1,44 @@ +Bison News +---------- + +Change in version 1.25: + +* Errors in the input grammar are not fatal; Bison keeps reading +the grammar file, and reports all the errors found in it. + +* Tokens can now be specified as multiple-character strings: for +example, you could use "<=" for a token which looks like <=, instead +of choosing a name like LESSEQ. + +* The %token_table declaration says to write a table of tokens (names +and numbers) into the parser file. The yylex function can use this +table to recognize multiple-character string tokens, or for other +purposes. + +* The %no_lines declaration says not to generate any #line preprocessor +directives in the parser file. + +* The %raw declaration says to use internal Bison token numbers, not +Yacc-compatible token numbers, when token names are defined as macros.
+ +* The --no-parser option produces the parser tables without including +the parser engine; a project can now use its own parser engine. +The actions go into a separate file called NAME.act, in the form of +a switch statement body. + +Changes in version 1.23: + +The user can define YYPARSE_PARAM as the name of an argument to be +passed into yyparse. The argument should have type void *. It should +actually point to an object. Grammar actions can access the variable +by casting it to the proper pointer type. + +Line numbers in output file corrected. + +Changes in version 1.22: + +--help option added. + +Changes in version 1.20: + +Output file does not redefine const for C++. diff --git a/contrib/bison/README b/contrib/bison/README new file mode 100644 index 000000000000..ea5f743d43ad --- /dev/null +++ b/contrib/bison/README @@ -0,0 +1,15 @@ +This directory contains the Bison parser generator. + +See the file INSTALL for compilation and installation instructions. + +It was once true that, when installing Bison on Sequent (or Pyramid?) +systems, you had to be in the Berkeley universe. This may no longer +be true; we have no way to tell. + +On VMS, you will probably have to create Makefile from Makefile.in by +hand. Remember to do `SET COMMAND BISON' to install the data in +`BISON.CLD'. + +Send bug reports to bug-gnu-utils@prep.ai.mit.edu. Please include the +version number from `bison --version', and a complete, self-contained +test case in each bug report. diff --git a/contrib/bison/REFERENCES b/contrib/bison/REFERENCES new file mode 100644 index 000000000000..b02eb18c52a4 --- /dev/null +++ b/contrib/bison/REFERENCES @@ -0,0 +1,30 @@ +From phr Tue Jul 8 10:36:19 1986 +Date: Tue, 8 Jul 86 00:52:24 EDT +From: phr (Paul Rubin) +To: riferguson%watmath.waterloo.edu@CSNET-RELAY.ARPA, tower +Subject: Re: Bison documentation? 
+ +The main difference between Bison and Yacc that I know of is that +Bison supports the @N construction, which gives you access to +the starting and ending line number and character number associated +with any of the symbols in the current rule. + +Also, Bison supports the command `%expect N' which says not to mention +the conflicts if there are N shift/reduce conflicts and no reduce/reduce +conflicts. + +The differences in the algorithms stem mainly from the horrible +kludges that Johnson had to perpetrate to make Yacc fit in a PDP-11. + +Also, Bison uses a faster but less space-efficient encoding for the +parse tables (see Corbett's PhD thesis from Berkeley, "Static +Semantics in Compiler Error Recovery", June 1985, Report No. UCB/CSD +85/251), and more modern technique for generating the lookahead sets. +(See "Efficient Construction of LALR(1) Lookahead Sets" by F. DeRemer +and A. Pennello, in ACM TOPLS Vol 4 No 4, October 1982. Their +technique is the standard one now.) + + paul rubin + free software foundation + + diff --git a/contrib/bison/alloca.c b/contrib/bison/alloca.c new file mode 100644 index 000000000000..31fb4e0d4e06 --- /dev/null +++ b/contrib/bison/alloca.c @@ -0,0 +1,504 @@ +/* alloca.c -- allocate automatically reclaimed memory + (Mostly) portable public-domain implementation -- D A Gwyn + + This implementation of the PWB library alloca function, + which is used to allocate space off the run-time stack so + that it is automatically reclaimed upon procedure exit, + was inspired by discussions with J. Q. Johnson of Cornell. + J.Otto Tennant contributed the Cray support. + + There are some preprocessor constants that can + be defined when compiling for your specific system, for + improved efficiency; however, the defaults should be okay. + + The general concept of this implementation is to keep + track of all alloca-allocated blocks, and reclaim any + that are found to be deeper in the stack than the current + invocation. 
This heuristic does not reclaim storage as + soon as it becomes invalid, but it will do so eventually. + + As a special case, alloca(0) reclaims storage without + allocating any. It is a good idea to use alloca(0) in + your main control loop, etc. to force garbage collection. */ + +#ifdef HAVE_CONFIG_H +#include +#endif + +#ifdef HAVE_STRING_H +#include +#endif +#ifdef HAVE_STDLIB_H +#include +#endif + +#ifdef emacs +#include "blockinput.h" +#endif + +/* If compiling with GCC 2, this file's not needed. */ +#if !defined (__GNUC__) || __GNUC__ < 2 + +/* If someone has defined alloca as a macro, + there must be some other way alloca is supposed to work. */ +#ifndef alloca + +#ifdef emacs +#ifdef static +/* actually, only want this if static is defined as "" + -- this is for usg, in which emacs must undefine static + in order to make unexec workable + */ +#ifndef STACK_DIRECTION +you +lose +-- must know STACK_DIRECTION at compile-time +#endif /* STACK_DIRECTION undefined */ +#endif /* static */ +#endif /* emacs */ + +/* If your stack is a linked list of frames, you have to + provide an "address metric" ADDRESS_FUNCTION macro. */ + +#if defined (CRAY) && defined (CRAY_STACKSEG_END) +long i00afunc (); +#define ADDRESS_FUNCTION(arg) (char *) i00afunc (&(arg)) +#else +#define ADDRESS_FUNCTION(arg) &(arg) +#endif + +#if __STDC__ +typedef void *pointer; +#else +typedef char *pointer; +#endif + +#ifndef NULL +#define NULL 0 +#endif + +/* Different portions of Emacs need to call different versions of + malloc. The Emacs executable needs alloca to call xmalloc, because + ordinary malloc isn't protected from input signals. On the other + hand, the utilities in lib-src need alloca to call malloc; some of + them are very simple, and don't have an xmalloc routine. + + Non-Emacs programs expect this to call xmalloc. + + Callers below should use malloc.
*/ + +#ifndef emacs +#define malloc xmalloc +#endif +extern pointer malloc (); + +/* Define STACK_DIRECTION if you know the direction of stack + growth for your system; otherwise it will be automatically + deduced at run-time. + + STACK_DIRECTION > 0 => grows toward higher addresses + STACK_DIRECTION < 0 => grows toward lower addresses + STACK_DIRECTION = 0 => direction of growth unknown */ + +#ifndef STACK_DIRECTION +#define STACK_DIRECTION 0 /* Direction unknown. */ +#endif + +#if STACK_DIRECTION != 0 + +#define STACK_DIR STACK_DIRECTION /* Known at compile-time. */ + +#else /* STACK_DIRECTION == 0; need run-time code. */ + +static int stack_dir; /* 1 or -1 once known. */ +#define STACK_DIR stack_dir + +static void +find_stack_direction () +{ + static char *addr = NULL; /* Address of first `dummy', once known. */ + auto char dummy; /* To get stack address. */ + + if (addr == NULL) + { /* Initial entry. */ + addr = ADDRESS_FUNCTION (dummy); + + find_stack_direction (); /* Recurse once. */ + } + else + { + /* Second entry. */ + if (ADDRESS_FUNCTION (dummy) > addr) + stack_dir = 1; /* Stack grew upward. */ + else + stack_dir = -1; /* Stack grew downward. */ + } +} + +#endif /* STACK_DIRECTION == 0 */ + +/* An "alloca header" is used to: + (a) chain together all alloca'ed blocks; + (b) keep track of stack depth. + + It is very important that sizeof(header) agree with malloc + alignment chunk size. The following default should work okay. */ + +#ifndef ALIGN_SIZE +#define ALIGN_SIZE sizeof(double) +#endif + +typedef union hdr +{ + char align[ALIGN_SIZE]; /* To force sizeof(header). */ + struct + { + union hdr *next; /* For chaining headers. */ + char *deep; /* For stack depth measure. */ + } h; +} header; + +static header *last_alloca_header = NULL; /* -> last alloca header. */ + +/* Return a pointer to at least SIZE bytes of storage, + which will be automatically reclaimed upon exit from + the procedure that called alloca. 
Originally, this space + was supposed to be taken from the current stack frame of the + caller, but that method cannot be made to work for some + implementations of C, for example under Gould's UTX/32. */ + +pointer +alloca (size) + unsigned size; +{ + auto char probe; /* Probes stack depth: */ + register char *depth = ADDRESS_FUNCTION (probe); + +#if STACK_DIRECTION == 0 + if (STACK_DIR == 0) /* Unknown growth direction. */ + find_stack_direction (); +#endif + + /* Reclaim garbage, defined as all alloca'd storage that + was allocated from deeper in the stack than currently. */ + + { + register header *hp; /* Traverses linked list. */ + +#ifdef emacs + BLOCK_INPUT; +#endif + + for (hp = last_alloca_header; hp != NULL;) + if ((STACK_DIR > 0 && hp->h.deep > depth) + || (STACK_DIR < 0 && hp->h.deep < depth)) + { + register header *np = hp->h.next; + + free ((pointer) hp); /* Collect garbage. */ + + hp = np; /* -> next header. */ + } + else + break; /* Rest are not deeper. */ + + last_alloca_header = hp; /* -> last valid storage. */ + +#ifdef emacs + UNBLOCK_INPUT; +#endif + } + + if (size == 0) + return NULL; /* No allocation required. */ + + /* Allocate combined header + user data storage. */ + + { + register pointer new = malloc (sizeof (header) + size); + /* Address of header. */ + + if (new == 0) + abort(); + + ((header *) new)->h.next = last_alloca_header; + ((header *) new)->h.deep = depth; + + last_alloca_header = (header *) new; + + /* User storage begins just after header. */ + + return (pointer) ((char *) new + sizeof (header)); + } +} + +#if defined (CRAY) && defined (CRAY_STACKSEG_END) + +#ifdef DEBUG_I00AFUNC +#include <stdio.h> +#endif + +#ifndef CRAY_STACK +#define CRAY_STACK +#ifndef CRAY2 +/* Stack structures for CRAY-1, CRAY X-MP, and CRAY Y-MP */ +struct stack_control_header + { + long shgrow:32; /* Number of times stack has grown. */ + long shaseg:32; /* Size of increments to stack. */ + long shhwm:32; /* High water mark of stack.
*/ + long shsize:32; /* Current size of stack (all segments). */ + }; + +/* The stack segment linkage control information occurs at + the high-address end of a stack segment. (The stack + grows from low addresses to high addresses.) The initial + part of the stack segment linkage control information is + 0200 (octal) words. This provides for register storage + for the routine which overflows the stack. */ + +struct stack_segment_linkage + { + long ss[0200]; /* 0200 overflow words. */ + long sssize:32; /* Number of words in this segment. */ + long ssbase:32; /* Offset to stack base. */ + long:32; + long sspseg:32; /* Offset to linkage control of previous + segment of stack. */ + long:32; + long sstcpt:32; /* Pointer to task common address block. */ + long sscsnm; /* Private control structure number for + microtasking. */ + long ssusr1; /* Reserved for user. */ + long ssusr2; /* Reserved for user. */ + long sstpid; /* Process ID for pid based multi-tasking. */ + long ssgvup; /* Pointer to multitasking thread giveup. */ + long sscray[7]; /* Reserved for Cray Research. */ + long ssa0; + long ssa1; + long ssa2; + long ssa3; + long ssa4; + long ssa5; + long ssa6; + long ssa7; + long sss0; + long sss1; + long sss2; + long sss3; + long sss4; + long sss5; + long sss6; + long sss7; + }; + +#else /* CRAY2 */ +/* The following structure defines the vector of words + returned by the STKSTAT library routine. */ +struct stk_stat + { + long now; /* Current total stack size. */ + long maxc; /* Amount of contiguous space which would + be required to satisfy the maximum + stack demand to date. */ + long high_water; /* Stack high-water mark. */ + long overflows; /* Number of stack overflow ($STKOFEN) calls. */ + long hits; /* Number of internal buffer hits. */ + long extends; /* Number of block extensions. */ + long stko_mallocs; /* Block allocations by $STKOFEN. */ + long underflows; /* Number of stack underflow calls ($STKRETN). 
*/ + long stko_free; /* Number of deallocations by $STKRETN. */ + long stkm_free; /* Number of deallocations by $STKMRET. */ + long segments; /* Current number of stack segments. */ + long maxs; /* Maximum number of stack segments so far. */ + long pad_size; /* Stack pad size. */ + long current_address; /* Current stack segment address. */ + long current_size; /* Current stack segment size. This + number is actually corrupted by STKSTAT to + include the fifteen word trailer area. */ + long initial_address; /* Address of initial segment. */ + long initial_size; /* Size of initial segment. */ + }; + +/* The following structure describes the data structure which trails + any stack segment. I think that the description in 'asdef' is + out of date. I only describe the parts that I am sure about. */ + +struct stk_trailer + { + long this_address; /* Address of this block. */ + long this_size; /* Size of this block (does not include + this trailer). */ + long unknown2; + long unknown3; + long link; /* Address of trailer block of previous + segment. */ + long unknown5; + long unknown6; + long unknown7; + long unknown8; + long unknown9; + long unknown10; + long unknown11; + long unknown12; + long unknown13; + long unknown14; + }; + +#endif /* CRAY2 */ +#endif /* not CRAY_STACK */ + +#ifdef CRAY2 +/* Determine a "stack measure" for an arbitrary ADDRESS. + I doubt that "lint" will like this much. */ + +static long +i00afunc (long *address) +{ + struct stk_stat status; + struct stk_trailer *trailer; + long *block, size; + long result = 0; + + /* We want to iterate through all of the segments. The first + step is to get the stack status structure. We could do this + more quickly and more directly, perhaps, by referencing the + $LM00 common block, but I know that this works. */ + + STKSTAT (&status); + + /* Set up the iteration. */ + + trailer = (struct stk_trailer *) (status.current_address + + status.current_size + - 15); + + /* There must be at least one stack segment. 
Therefore it is + a fatal error if "trailer" is null. */ + + if (trailer == 0) + abort (); + + /* Discard segments that do not contain our argument address. */ + + while (trailer != 0) + { + block = (long *) trailer->this_address; + size = trailer->this_size; + if (block == 0 || size == 0) + abort (); + trailer = (struct stk_trailer *) trailer->link; + if ((block <= address) && (address < (block + size))) + break; + } + + /* Set the result to the offset in this segment and add the sizes + of all predecessor segments. */ + + result = address - block; + + if (trailer == 0) + { + return result; + } + + do + { + if (trailer->this_size <= 0) + abort (); + result += trailer->this_size; + trailer = (struct stk_trailer *) trailer->link; + } + while (trailer != 0); + + /* We are done. Note that if you present a bogus address (one + not in any segment), you will get a different number back, formed + from subtracting the address of the first block. This is probably + not what you want. */ + + return (result); +} + +#else /* not CRAY2 */ +/* Stack address function for a CRAY-1, CRAY X-MP, or CRAY Y-MP. + Determine the number of the cell within the stack, + given the address of the cell. The purpose of this + routine is to linearize, in some sense, stack addresses + for alloca. */ + +static long +i00afunc (long address) +{ + long stkl = 0; + + long size, pseg, this_segment, stack; + long result = 0; + + struct stack_segment_linkage *ssptr; + + /* Register B67 contains the address of the end of the + current stack segment. If you (as a subprogram) store + your registers on the stack and find that you are past + the contents of B67, you have overflowed the segment. + + B67 also points to the stack segment linkage control + area, which is what we are really interested in. */ + + stkl = CRAY_STACKSEG_END (); + ssptr = (struct stack_segment_linkage *) stkl; + + /* If one subtracts 'size' from the end of the segment, + one has the address of the first word of the segment. 
+ + If this is not the first segment, 'pseg' will be + nonzero. */ + + pseg = ssptr->sspseg; + size = ssptr->sssize; + + this_segment = stkl - size; + + /* It is possible that calling this routine itself caused + a stack overflow. Discard stack segments which do not + contain the target address. */ + + while (!(this_segment <= address && address <= stkl)) + { +#ifdef DEBUG_I00AFUNC + fprintf (stderr, "%011o %011o %011o\n", this_segment, address, stkl); +#endif + if (pseg == 0) + break; + stkl = stkl - pseg; + ssptr = (struct stack_segment_linkage *) stkl; + size = ssptr->sssize; + pseg = ssptr->sspseg; + this_segment = stkl - size; + } + + result = address - this_segment; + + /* If you subtract pseg from the current end of the stack, + you get the address of the previous stack segment's end. + This seems a little convoluted to me, but I'll bet you save + a cycle somewhere. */ + + while (pseg != 0) + { +#ifdef DEBUG_I00AFUNC + fprintf (stderr, "%011o %011o\n", pseg, size); +#endif + stkl = stkl - pseg; + ssptr = (struct stack_segment_linkage *) stkl; + size = ssptr->sssize; + pseg = ssptr->sspseg; + result += size; + } + return (result); +} + +#endif /* not CRAY2 */ +#endif /* CRAY */ + +#endif /* no alloca */ +#endif /* not GCC version 2 */ diff --git a/contrib/bison/allocate.c b/contrib/bison/allocate.c new file mode 100644 index 000000000000..a74dc1829767 --- /dev/null +++ b/contrib/bison/allocate.c @@ -0,0 +1,64 @@ +/* Allocate and clear storage for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. 
+ +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#include <stdio.h> + +extern char *calloc (); +extern char *realloc (); +extern void done (); + +extern char *program_name; + +char * +xmalloc (n) + register unsigned n; +{ + register char *block; + + /* Avoid uncertainty about what an arg of 0 will do. */ + if (n == 0) + n = 1; + block = calloc (n, 1); + if (block == NULL) + { + fprintf (stderr, "%s: memory exhausted\n", program_name); + done (1); + } + + return (block); +} + +char * +xrealloc (block, n) + register char *block; + register unsigned n; +{ + /* Avoid uncertainty about what an arg of 0 will do. */ + if (n == 0) + n = 1; + block = realloc (block, n); + if (block == NULL) + { + fprintf (stderr, "%s: memory exhausted\n", program_name); + done (1); + } + + return (block); +} diff --git a/contrib/bison/bison.1 b/contrib/bison/bison.1 new file mode 100644 index 000000000000..0cd949ee422e --- /dev/null +++ b/contrib/bison/bison.1 @@ -0,0 +1,342 @@ +.TH BISON 1 local +.SH NAME +bison \- GNU Project parser generator (yacc replacement) +.SH SYNOPSIS +.B bison +[ +.BI \-b " file-prefix" +] [ +.BI \-\-file-prefix= file-prefix +] [ +.B \-d +] [ +.B \-\-defines +] [ +.B \-k +] [ +.B \-\-token-table +] [ +.B \-l +] [ +.B \-\-no-lines +] [ +.B \-n +] [ +.B \-\-no-parser +] [ +.BI \-o " outfile" +] [ +.BI \-\-output-file= outfile +] [ +.BI \-p " prefix" +] [ +.BI \-\-name-prefix= prefix +] [ +.B \-r +] [ +.B \-\-raw +] [ +.B \-t +] [ +.B \-\-debug +] [ +.B \-v +] [ +.B \-\-verbose +] [ +.B \-V +] [ +.B \-\-version +] [ +.B \-y +] [ +.B \-\-yacc +] [ +.B \-h +] [ +.B \-\-help +] [ +.B
\-\-fixed-output-files +] +file +.SH DESCRIPTION +.I Bison +is a parser generator in the style of +.IR yacc (1). +It should be upwardly compatible with input files designed +for +.IR yacc . +.PP +Input files should follow the +.I yacc +convention of ending in +.BR .y . +Unlike +.IR yacc , +the generated files do not have fixed names, but instead use the prefix +of the input file. +For instance, a grammar description file named +.B parse.y +would produce the generated parser in a file named +.BR parse.tab.c , +instead of +.IR yacc 's +.BR y.tab.c . +.PP +This description of the options that can be given to +.I bison +is adapted from the node +.B Invocation +in the +.B bison.texinfo +manual, which should be taken as authoritative. +.PP +.I Bison +supports both traditional single-letter options and mnemonic long +option names. Long option names are indicated with +.B \-\- +instead of +.BR \- . +Abbreviations for option names are allowed as long as they +are unique. When a long option takes an argument, like +.BR \-\-file-prefix , +connect the option name and the argument with +.BR = . +.SS OPTIONS +.TP +.BI \-b " file-prefix" +.br +.ns +.TP +.BI \-\-file-prefix= file-prefix +Specify a prefix to use for all +.I bison +output file names. The names are +chosen as if the input file were named +\fIfile-prefix\fB.c\fR. +.TP +.B \-d +.br +.ns +.TP +.B \-\-defines +Write an extra output file containing macro definitions for the token +type names defined in the grammar and the semantic value type +.BR YYSTYPE , +as well as a few +.B extern +variable declarations. +.sp +If the parser output file is named +\fIname\fB.c\fR +then this file +is named +\fIname\fB.h\fR. +.sp +This output file is essential if you wish to put the definition of +.B yylex +in a separate source file, because +.B yylex +needs to be able to refer to token type codes and the variable +.BR yylval . 
+.TP +.B \-r +.br +.ns +.TP +.B \-\-raw +The token numbers in the \fIname\fB.h\fR file are usually the Yacc compatible +translations. If this switch is specified, Bison token numbers +are output instead. (Yacc numbers start at 257 except for single character +tokens; Bison assigns token numbers sequentially for all tokens +starting at 3.) +.TP +.B \-k +.br +.ns +.TP +.B \-\-token-table +This switch causes the \fIname\fB.tab.c\fR output to include a list of +token names in order by their token numbers; this is defined in the array +.IR yytname . +Also generated +are #defines for +.IR YYNTOKENS , +.IR YYNNTS , +.IR YYNRULES , +and +.IR YYNSTATES . +.TP +.B \-l +.br +.ns +.TP +.B \-\-no-lines +Don't put any +.B #line +preprocessor commands in the parser file. +Ordinarily +.I bison +puts them in the parser file so that the C compiler +and debuggers will associate errors with your source file, the +grammar file. This option causes them to associate errors with the +parser file, treating it as an independent source file in its own right. +.TP +.B \-n +.br +.ns +.TP +.B \-\-no-parser +Do not generate the parser code into the output; generate only +declarations. The generated \fIname\fB.tab.c\fR file will have only +constant declarations. In addition, a \fIname\fB.act\fR file is +generated containing a switch statement body containing all the +translated actions. +.TP +.BI \-o " outfile" +.br +.ns +.TP +.BI \-\-output-file= outfile +Specify the name +.I outfile +for the parser file. +.sp +The other output files' names are constructed from +.I outfile +as described under the +.B \-v +and +.B \-d +switches. +.TP +.BI \-p " prefix" +.br +.ns +.TP +.BI \-\-name-prefix= prefix +Rename the external symbols used in the parser so that they start with +.I prefix +instead of +.BR yy . +The precise list of symbols renamed is +.BR yyparse , +.BR yylex , +.BR yyerror , +.BR yylval , +.BR yychar , +and +.BR yydebug .
+.sp +For example, if you use +.BR "\-p c" , +the names become +.BR cparse , +.BR clex , +and so on. +.TP +.B \-t +.br +.ns +.TP +.B \-\-debug +Output a definition of the macro +.B YYDEBUG +into the parser file, +so that the debugging facilities are compiled. +.TP +.B \-v +.br +.ns +.TP +.B \-\-verbose +Write an extra output file containing verbose descriptions of the +parser states and what is done for each type of look-ahead token in +that state. +.sp +This file also describes all the conflicts, both those resolved by +operator precedence and the unresolved ones. +.sp +The file's name is made by removing +.B .tab.c +or +.B .c +from the parser output file name, and adding +.B .output +instead. +.sp +Therefore, if the input file is +.BR foo.y , +then the parser file is called +.B foo.tab.c +by default. As a consequence, the verbose +output file is called +.BR foo.output . +.TP +.B \-V +.br +.ns +.TP +.B \-\-version +Print the version number of +.I bison +and exit. +.TP +.B \-h +.br +.ns +.TP +.B \-\-help +Print a summary of the options to +.I bison +and exit. +.TP +.B \-y +.br +.ns +.TP +.B \-\-yacc +.br +.ns +.TP +.B \-\-fixed-output-files +Equivalent to +.BR "\-o y.tab.c" ; +the parser output file is called +.BR y.tab.c , +and the other outputs are called +.B y.output +and +.BR y.tab.h . +The purpose of this switch is to imitate +.IR yacc 's +output file name conventions. +Thus, the following shell script can substitute for +.IR yacc : +.sp +.RS +.ft B +bison \-y $* +.ft R +.sp +.RE +.PP +The long-named options can be introduced with `+' as well as `\-\-', +for compatibility with previous releases. Eventually support for `+' +will be removed, because it is incompatible with the POSIX.2 standard. +.SH FILES +/usr/local/lib/bison.simple simple parser +.br +/usr/local/lib/bison.hairy complicated parser +.SH SEE ALSO +.IR yacc (1) +.br +The +.IR "Bison Reference Manual" , +included as the file +.B bison.texinfo +in the +.I bison +source distribution. 
+.SH DIAGNOSTICS +Self explanatory. diff --git a/contrib/bison/bison.cld b/contrib/bison/bison.cld new file mode 100644 index 000000000000..ae424aaad387 --- /dev/null +++ b/contrib/bison/bison.cld @@ -0,0 +1,21 @@ +! +! VMS BISON command definition file +! +DEFINE VERB BISON + IMAGE GNU_BISON:[000000]BISON + + PARAMETER P1,Label=BISON$INFILE,Prompt="File" + value(required,type=$infile) + QUALIFIER VERBOSE,Label=BISON$VERBOSE + QUALIFIER DEFINES,Label=BISON$DEFINES + QUALIFIER FIXED_OUTFILES,Label=BISON$FIXED_OUTFILES + QUALIFIER NOPARSER,Label=BISON$NOPARSER + QUALIFIER RAW,LABEL=BISON$RAW + QUALIFIER TOKEN_TABLE,LABEL=BISON$TOKEN_TABLE + qualifier nolines,Label=BISON$NOLINES + qualifier debug,Label=BISON$DEBUG + qualifier output,value(type=$outfile),Label=BISON$OUTPUT + qualifier version,label=BISON$VERSION + qualifier yacc,label=BISON$YACC + qualifier file_prefix,value(type=$outfile),label=BISON$FILE_PREFIX + qualifier name_prefix,value(type=$outfile),LABEL=BISON$NAME_PREFIX diff --git a/contrib/bison/bison.hairy b/contrib/bison/bison.hairy new file mode 100644 index 000000000000..999b55591d01 --- /dev/null +++ b/contrib/bison/bison.hairy @@ -0,0 +1,334 @@ + +extern int timeclock; + + +int yyerror; /* Yyerror and yycost are set by guards. */ +int yycost; /* If yyerror is set to a nonzero value by a */ + /* guard, the reduction with which the guard */ + /* is associated is not performed, and the */ + /* error recovery mechanism is invoked. */ + /* Yycost indicates the cost of performing */ + /* the reduction given the attributes of the */ + /* symbols. */ + + +/* YYMAXDEPTH indicates the size of the parser's state and value */ +/* stacks. */ + +#ifndef YYMAXDEPTH +#define YYMAXDEPTH 500 +#endif + +/* YYMAXRULES must be at least as large as the number of rules that */ +/* could be placed in the rule queue. That number could be determined */ +/* from the grammar and the size of the stack, but, as yet, it is not. 
*/ + +#ifndef YYMAXRULES +#define YYMAXRULES 100 +#endif + +#ifndef YYMAXBACKUP +#define YYMAXBACKUP 100 +#endif + + +short yyss[YYMAXDEPTH]; /* the state stack */ +YYSTYPE yyvs[YYMAXDEPTH]; /* the semantic value stack */ +YYLTYPE yyls[YYMAXDEPTH]; /* the location stack */ +short yyrq[YYMAXRULES]; /* the rule queue */ +int yychar; /* the lookahead symbol */ + +YYSTYPE yylval; /* the semantic value of the */ + /* lookahead symbol */ + +YYSTYPE yytval; /* the semantic value for the state */ + /* at the top of the state stack. */ + +YYSTYPE yyval; /* the variable used to return */ + /* semantic values from the action */ + /* routines */ + +YYLTYPE yylloc; /* location data for the lookahead */ + /* symbol */ + +YYLTYPE yytloc; /* location data for the state at the */ + /* top of the state stack */ + + +int yynunlexed; +short yyunchar[YYMAXBACKUP]; +YYSTYPE yyunval[YYMAXBACKUP]; +YYLTYPE yyunloc[YYMAXBACKUP]; + +short *yygssp; /* a pointer to the top of the state */ + /* stack; only set during error */ + /* recovery. */ + +YYSTYPE *yygvsp; /* a pointer to the top of the value */ + /* stack; only set during error */ + /* recovery. */ + +YYLTYPE *yyglsp; /* a pointer to the top of the */ + /* location stack; only set during */ + /* error recovery. */ + + +/* Yyget is an interface between the parser and the lexical analyzer. */ +/* It is costly to provide such an interface, but it avoids requiring */ +/* the lexical analyzer to be able to back up the scan. 
*/ + +yyget() +{ + if (yynunlexed > 0) + { + yynunlexed--; + yychar = yyunchar[yynunlexed]; + yylval = yyunval[yynunlexed]; + yylloc = yyunloc[yynunlexed]; + } + else if (yychar <= 0) + yychar = 0; + else + { + yychar = yylex(); + if (yychar < 0) + yychar = 0; + else yychar = YYTRANSLATE(yychar); + } +} + + + +yyunlex(chr, val, loc) +int chr; +YYSTYPE val; +YYLTYPE loc; +{ + yyunchar[yynunlexed] = chr; + yyunval[yynunlexed] = val; + yyunloc[yynunlexed] = loc; + yynunlexed++; +} + + + +yyrestore(first, last) +register short *first; +register short *last; +{ + register short *ssp; + register short *rp; + register int symbol; + register int state; + register int tvalsaved; + + ssp = yygssp; + yyunlex(yychar, yylval, yylloc); + + tvalsaved = 0; + while (first != last) + { + symbol = yystos[*ssp]; + if (symbol < YYNTBASE) + { + yyunlex(symbol, yytval, yytloc); + tvalsaved = 1; + ssp--; + } + + ssp--; + + if (first == yyrq) + first = yyrq + YYMAXRULES; + + first--; + + for (rp = yyrhs + yyprhs[*first]; symbol = *rp; rp++) + { + if (symbol < YYNTBASE) + state = yytable[yypact[*ssp] + symbol]; + else + { + state = yypgoto[symbol - YYNTBASE] + *ssp; + + if (state >= 0 && state <= YYLAST && yycheck[state] == *ssp) + state = yytable[state]; + else + state = yydefgoto[symbol - YYNTBASE]; + } + + *++ssp = state; + } + } + + if ( ! 
tvalsaved && ssp > yyss) + { + yyunlex(yystos[*ssp], yytval, yytloc); + ssp--; + } + + yygssp = ssp; +} + + + +int +yyparse() +{ + register int yystate; + register int yyn; + register short *yyssp; + register short *yyrq0; + register short *yyptr; + register YYSTYPE *yyvsp; + + int yylen; + YYLTYPE *yylsp; + short *yyrq1; + short *yyrq2; + + yystate = 0; + yyssp = yyss - 1; + yyvsp = yyvs - 1; + yylsp = yyls - 1; + yyrq0 = yyrq; + yyrq1 = yyrq0; + yyrq2 = yyrq0; + + yychar = yylex(); + if (yychar < 0) + yychar = 0; + else yychar = YYTRANSLATE(yychar); + +yynewstate: + + if (yyssp >= yyss + YYMAXDEPTH - 1) + { + yyabort("Parser Stack Overflow"); + YYABORT; + } + + *++yyssp = yystate; + +yyresume: + + yyn = yypact[yystate]; + if (yyn == YYFLAG) + goto yydefault; + + yyn += yychar; + if (yyn < 0 || yyn > YYLAST || yycheck[yyn] != yychar) + goto yydefault; + + yyn = yytable[yyn]; + if (yyn < 0) + { + yyn = -yyn; + goto yyreduce; + } + else if (yyn == 0) + goto yyerrlab; + + yystate = yyn; + + yyptr = yyrq2; + while (yyptr != yyrq1) + { + yyn = *yyptr++; + yylen = yyr2[yyn]; + yyvsp -= yylen; + yylsp -= yylen; + + yyguard(yyn, yyvsp, yylsp); + if (yyerror) + goto yysemerr; + + yyaction(yyn, yyvsp, yylsp); + *++yyvsp = yyval; + + yylsp++; + if (yylen == 0) + { + yylsp->timestamp = timeclock; + yylsp->first_line = yytloc.first_line; + yylsp->first_column = yytloc.first_column; + yylsp->last_line = (yylsp-1)->last_line; + yylsp->last_column = (yylsp-1)->last_column; + yylsp->text = 0; + } + else + { + yylsp->last_line = (yylsp+yylen-1)->last_line; + yylsp->last_column = (yylsp+yylen-1)->last_column; + } + + if (yyptr == yyrq + YYMAXRULES) + yyptr = yyrq; + } + + if (yystate == YYFINAL) + YYACCEPT; + + yyrq2 = yyptr; + yyrq1 = yyrq0; + + *++yyvsp = yytval; + *++yylsp = yytloc; + yytval = yylval; + yytloc = yylloc; + yyget(); + + goto yynewstate; + +yydefault: + + yyn = yydefact[yystate]; + if (yyn == 0) + goto yyerrlab; + +yyreduce: + + *yyrq0++ = yyn; + + if (yyrq0 == yyrq 
+ YYMAXRULES) + yyrq0 = yyrq; + + if (yyrq0 == yyrq2) + { + yyabort("Parser Rule Queue Overflow"); + YYABORT; + } + + yyssp -= yyr2[yyn]; + yyn = yyr1[yyn]; + + yystate = yypgoto[yyn - YYNTBASE] + *yyssp; + if (yystate >= 0 && yystate <= YYLAST && yycheck[yystate] == *yyssp) + yystate = yytable[yystate]; + else + yystate = yydefgoto[yyn - YYNTBASE]; + + goto yynewstate; + +yysemerr: + *--yyptr = yyn; + yyrq2 = yyptr; + yyvsp += yyr2[yyn]; + +yyerrlab: + + yygssp = yyssp; + yygvsp = yyvsp; + yyglsp = yylsp; + yyrestore(yyrq0, yyrq2); + yyrecover(); + yystate = *yygssp; + yyssp = yygssp; + yyvsp = yygvsp; + yyrq0 = yyrq; + yyrq1 = yyrq0; + yyrq2 = yyrq0; + goto yyresume; +} + +$ diff --git a/contrib/bison/bison.rnh b/contrib/bison/bison.rnh new file mode 100644 index 000000000000..c90bd085a8dc --- /dev/null +++ b/contrib/bison/bison.rnh @@ -0,0 +1,191 @@ +.! +.! RUNOFF source file for BISON.HLP +.! +.! This is a RUNOFF input file which will produce a VMS help file +.! for the VMS HELP library. +.! +.! Eric Youngdale and Wilfred J. Hansen (wjh+@cmu.edu). +.! +.literal +.end literal +.no paging +.no flags all +.right margin 70 +.left margin 1 + +.indent -1 +1 BISON +.skip + The BISON command invokes the GNU BISON parser generator. +.skip +.literal + BISON file-spec +.end literal +.skip +.indent -1 +2 Parameters +.skip + file-spec +.skip +Here file-spec is the grammar file name, which usually ends in +.y. The parser file's name is made by replacing the .y +with _tab.c. Thus, the command bison foo.y yields +foo_tab.c. + +.skip +.indent -1 +2 Qualifiers +.skip + The following is the list of available qualifiers for BISON: +.literal + /DEBUG + /DEFINES + /FILE_PREFIX=prefix + /FIXED_OUTFILES + /NAME_PREFIX=prefix + /NOLINES + /NOPARSER + /OUTPUT=outfile + /RAW + /TOKEN_TABLE + /VERBOSE + /VERSION + /YACC +.end literal +.skip +.indent -1 +2 /DEBUG +.skip +Output a definition of the macro YYDEBUG into the parser file, +so that the debugging facilities are compiled.
+.skip +.indent -1 +2 /DEFINES +.skip +Write an extra output file containing macro definitions for the token +type names defined in the grammar and the semantic value type +YYSTYPE, as well as a few extern variable declarations. +.skip +If the parser output file is named "name.c" then this file +is named "name.h". +.skip +This output file is essential if you wish to put the definition of +yylex in a separate source file, because yylex needs to +be able to refer to token type codes and the variable +yylval. +.skip +.indent -1 +2 /FILE_PREFIX +.skip +.literal + /FILE_PREFIX=prefix +.end literal +.skip + Specify a prefix to use for all Bison output file names. The names are +chosen as if the input file were named prefix.c + +.skip +.indent -1 +2 /FIXED_OUTFILES +.skip +Equivalent to /OUTPUT=y_tab.c; the parser output file is called +y_tab.c, and the other outputs are called y.output and +y_tab.h. The purpose of this switch is to imitate Yacc's output +file name conventions. The /YACC qualifier is functionally equivalent +to /FIXED_OUTFILES. The following command definition will +work as a substitute for Yacc: + +.literal +$YACC:==BISON/FIXED_OUTFILES +.end literal +.skip +.indent -1 +2 /NAME_PREFIX +.skip +.literal + /NAME_PREFIX=prefix +.end literal +.skip +Rename the external symbols used in the parser so that they start with +"prefix" instead of "yy". The precise list of symbols renamed +is yyparse, yylex, yyerror, yylval, yychar and yydebug. + +For example, if you use /NAME_PREFIX="c", the names become cparse, +clex, and so on. + +.skip +.indent -1 +2 /NOLINES +.skip +Don't put any "#line" preprocessor commands in the parser file. +Ordinarily Bison puts them in the parser file so that the C compiler +and debuggers will associate errors with your source file, the +grammar file. This option causes them to associate errors with the +parser file, treating it as an independent source file in its own right.
+.skip +.indent -1 +2 /NOPARSER +.skip +Do not generate the parser code into the output; generate only +declarations. The generated name_tab.c file will have only +constant declarations. In addition, a name.act file is +generated containing a switch statement body containing all the +translated actions. +.skip +.indent -1 +2 /OUTPUT +.skip +.literal + /OUTPUT=outfile +.end literal +.skip +Specify the name "outfile" for the parser file. +.skip +.indent -1 +2 /RAW +.skip +When this switch is specified, the .tab.h file defines the tokens to +have the bison token numbers rather than the yacc compatible numbers. +To employ this switch you would have to have your own parser. +.skip +.indent -1 +2 /TOKEN_TABLE +.skip +This switch causes the name_tab.c output to include a list of +token names in order by their token numbers; this is defined in the array +yytname. Also generated are #defines for YYNTOKENS, YYNNTS, YYNRULES, +and YYNSTATES. + +.skip +.indent -1 +2 /VERBOSE +.skip +Write an extra output file containing verbose descriptions of the +parser states and what is done for each type of look-ahead token in +that state. +.skip +This file also describes all the conflicts, both those resolved by +operator precedence and the unresolved ones. +.skip +The file's name is made by removing _tab.c or .c from +the parser output file name, and adding .output instead. +.skip +Therefore, if the input file is foo.y, then the parser file is +called foo_tab.c by default. As a consequence, the verbose +output file is called foo.output. +.skip +.indent -1 +2 /VERSION +.skip +Print the version number of Bison. + +.skip +.indent -1 +2 /YACC +.skip +See /FIXED_OUTFILES. +.skip +.indent -1 + + + diff --git a/contrib/bison/bison.simple b/contrib/bison/bison.simple new file mode 100644 index 000000000000..aea6b78c20a0 --- /dev/null +++ b/contrib/bison/bison.simple @@ -0,0 +1,692 @@ +/* -*-C-*- Note some compilers choke on comments on `#line' lines. 
*/ +#line 3 "bison.simple" + +/* Skeleton output parser for bison, + Copyright (C) 1984, 1989, 1990 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */ + +/* As a special exception, when this file is copied by Bison into a + Bison output file, you may use that output file without restriction. + This special exception was added by the Free Software Foundation + in version 1.24 of Bison. */ + +#ifndef alloca +#ifdef __GNUC__ +#define alloca __builtin_alloca +#else /* not GNU C. */ +#if (!defined (__STDC__) && defined (sparc)) || defined (__sparc__) || defined (__sparc) || defined (__sgi) +#include <alloca.h> +#else /* not sparc */ +#if defined (MSDOS) && !defined (__TURBOC__) +#include <malloc.h> +#else /* not MSDOS, or __TURBOC__ */ +#if defined(_AIX) +#include <malloc.h> + #pragma alloca +#else /* not MSDOS, __TURBOC__, or _AIX */ +#ifdef __hpux +#ifdef __cplusplus +extern "C" { +void *alloca (unsigned int); +}; +#else /* not __cplusplus */ +void *alloca (); +#endif /* not __cplusplus */ +#endif /* __hpux */ +#endif /* not _AIX */ +#endif /* not MSDOS, or __TURBOC__ */ +#endif /* not sparc. */ +#endif /* not GNU C. */ +#endif /* alloca not defined. */ + +/* This is the parser code that is written into each bison parser + when the %semantic_parser declaration is not specified in the grammar.
+ It was written by Richard Stallman by simplifying the hairy parser + used when %semantic_parser is specified. */ + +/* Note: there must be only one dollar sign in this file. + It is replaced by the list of actions, each action + as one case of the switch. */ + +#define yyerrok (yyerrstatus = 0) +#define yyclearin (yychar = YYEMPTY) +#define YYEMPTY -2 +#define YYEOF 0 +#define YYACCEPT return(0) +#define YYABORT return(1) +#define YYERROR goto yyerrlab1 +/* Like YYERROR except do call yyerror. + This remains here temporarily to ease the + transition to the new meaning of YYERROR, for GCC. + Once GCC version 2 has supplanted version 1, this can go. */ +#define YYFAIL goto yyerrlab +#define YYRECOVERING() (!!yyerrstatus) +#define YYBACKUP(token, value) \ +do \ + if (yychar == YYEMPTY && yylen == 1) \ + { yychar = (token), yylval = (value); \ + yychar1 = YYTRANSLATE (yychar); \ + YYPOPSTACK; \ + goto yybackup; \ + } \ + else \ + { yyerror ("syntax error: cannot back up"); YYERROR; } \ +while (0) + +#define YYTERROR 1 +#define YYERRCODE 256 + +#ifndef YYPURE +#define YYLEX yylex() +#endif + +#ifdef YYPURE +#ifdef YYLSP_NEEDED +#ifdef YYLEX_PARAM +#define YYLEX yylex(&yylval, &yylloc, YYLEX_PARAM) +#else +#define YYLEX yylex(&yylval, &yylloc) +#endif +#else /* not YYLSP_NEEDED */ +#ifdef YYLEX_PARAM +#define YYLEX yylex(&yylval, YYLEX_PARAM) +#else +#define YYLEX yylex(&yylval) +#endif +#endif /* not YYLSP_NEEDED */ +#endif + +/* If nonreentrant, generate the variables here */ + +#ifndef YYPURE + +int yychar; /* the lookahead symbol */ +YYSTYPE yylval; /* the semantic value of the */ + /* lookahead symbol */ + +#ifdef YYLSP_NEEDED +YYLTYPE yylloc; /* location data for the lookahead */ + /* symbol */ +#endif + +int yynerrs; /* number of parse errors so far */ +#endif /* not YYPURE */ + +#if YYDEBUG != 0 +int yydebug; /* nonzero means print parse trace */ +/* Since this is uninitialized, it does not stop multiple parsers + from coexisting. 
*/ +#endif + +/* YYINITDEPTH indicates the initial size of the parser's stacks */ + +#ifndef YYINITDEPTH +#define YYINITDEPTH 200 +#endif + +/* YYMAXDEPTH is the maximum size the stacks can grow to + (effective only if the built-in stack extension method is used). */ + +#if YYMAXDEPTH == 0 +#undef YYMAXDEPTH +#endif + +#ifndef YYMAXDEPTH +#define YYMAXDEPTH 10000 +#endif + +/* Prevent warning if -Wstrict-prototypes. */ +#ifdef __GNUC__ +int yyparse (void); +#endif + +#if __GNUC__ > 1 /* GNU C and GNU C++ define this. */ +#define __yy_memcpy(TO,FROM,COUNT) __builtin_memcpy(TO,FROM,COUNT) +#else /* not GNU C or C++ */ +#ifndef __cplusplus + +/* This is the most reliable way to avoid incompatibilities + in available built-in functions on various systems. */ +static void +__yy_memcpy (to, from, count) + char *to; + char *from; + int count; +{ + register char *f = from; + register char *t = to; + register int i = count; + + while (i-- > 0) + *t++ = *f++; +} + +#else /* __cplusplus */ + +/* This is the most reliable way to avoid incompatibilities + in available built-in functions on various systems. */ +static void +__yy_memcpy (char *to, char *from, int count) +{ + register char *f = from; + register char *t = to; + register int i = count; + + while (i-- > 0) + *t++ = *f++; +} + +#endif +#endif + +#line 196 "bison.simple" + +/* The user can define YYPARSE_PARAM as the name of an argument to be passed + into yyparse. The argument should have type void *. + It should actually point to an object. + Grammar actions can access the variable by casting it + to the proper pointer type. 
*/ + +#ifdef YYPARSE_PARAM +#ifdef __cplusplus +#define YYPARSE_PARAM_ARG void *YYPARSE_PARAM +#define YYPARSE_PARAM_DECL +#else /* not __cplusplus */ +#define YYPARSE_PARAM_ARG YYPARSE_PARAM +#define YYPARSE_PARAM_DECL void *YYPARSE_PARAM; +#endif /* not __cplusplus */ +#else /* not YYPARSE_PARAM */ +#define YYPARSE_PARAM_ARG +#define YYPARSE_PARAM_DECL +#endif /* not YYPARSE_PARAM */ + +int +yyparse(YYPARSE_PARAM_ARG) + YYPARSE_PARAM_DECL +{ + register int yystate; + register int yyn; + register short *yyssp; + register YYSTYPE *yyvsp; + int yyerrstatus; /* number of tokens to shift before error messages enabled */ + int yychar1 = 0; /* lookahead token as an internal (translated) token number */ + + short yyssa[YYINITDEPTH]; /* the state stack */ + YYSTYPE yyvsa[YYINITDEPTH]; /* the semantic value stack */ + + short *yyss = yyssa; /* refer to the stacks thru separate pointers */ + YYSTYPE *yyvs = yyvsa; /* to allow yyoverflow to reallocate them elsewhere */ + +#ifdef YYLSP_NEEDED + YYLTYPE yylsa[YYINITDEPTH]; /* the location stack */ + YYLTYPE *yyls = yylsa; + YYLTYPE *yylsp; + +#define YYPOPSTACK (yyvsp--, yyssp--, yylsp--) +#else +#define YYPOPSTACK (yyvsp--, yyssp--) +#endif + + int yystacksize = YYINITDEPTH; + +#ifdef YYPURE + int yychar; + YYSTYPE yylval; + int yynerrs; +#ifdef YYLSP_NEEDED + YYLTYPE yylloc; +#endif +#endif + + YYSTYPE yyval; /* the variable used to return */ + /* semantic values from the action */ + /* routines */ + + int yylen; + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Starting parse\n"); +#endif + + yystate = 0; + yyerrstatus = 0; + yynerrs = 0; + yychar = YYEMPTY; /* Cause a token to be read. */ + + /* Initialize stack pointers. + Waste one element of value and location stack + so that they stay on the same level as the state stack. + The wasted elements are never initialized. */ + + yyssp = yyss - 1; + yyvsp = yyvs; +#ifdef YYLSP_NEEDED + yylsp = yyls; +#endif + +/* Push a new state, which is found in yystate . 
*/ +/* In all cases, when you get here, the value and location stacks + have just been pushed, so pushing a state here evens the stacks. */ +yynewstate: + + *++yyssp = yystate; + + if (yyssp >= yyss + yystacksize - 1) + { + /* Give user a chance to reallocate the stack */ + /* Use copies of these so that the &'s don't force the real ones into memory. */ + YYSTYPE *yyvs1 = yyvs; + short *yyss1 = yyss; +#ifdef YYLSP_NEEDED + YYLTYPE *yyls1 = yyls; +#endif + + /* Get the current used size of the three stacks, in elements. */ + int size = yyssp - yyss + 1; + +#ifdef yyoverflow + /* Each stack pointer address is followed by the size of + the data in use in that stack, in bytes. */ +#ifdef YYLSP_NEEDED + /* This used to be a conditional around just the two extra args, + but that might be undefined if yyoverflow is a macro. */ + yyoverflow("parser stack overflow", + &yyss1, size * sizeof (*yyssp), + &yyvs1, size * sizeof (*yyvsp), + &yyls1, size * sizeof (*yylsp), + &yystacksize); +#else + yyoverflow("parser stack overflow", + &yyss1, size * sizeof (*yyssp), + &yyvs1, size * sizeof (*yyvsp), + &yystacksize); +#endif + + yyss = yyss1; yyvs = yyvs1; +#ifdef YYLSP_NEEDED + yyls = yyls1; +#endif +#else /* no yyoverflow */ + /* Extend the stack our own way.
*/ + if (yystacksize >= YYMAXDEPTH) + { + yyerror("parser stack overflow"); + return 2; + } + yystacksize *= 2; + if (yystacksize > YYMAXDEPTH) + yystacksize = YYMAXDEPTH; + yyss = (short *) alloca (yystacksize * sizeof (*yyssp)); + __yy_memcpy ((char *)yyss, (char *)yyss1, size * sizeof (*yyssp)); + yyvs = (YYSTYPE *) alloca (yystacksize * sizeof (*yyvsp)); + __yy_memcpy ((char *)yyvs, (char *)yyvs1, size * sizeof (*yyvsp)); +#ifdef YYLSP_NEEDED + yyls = (YYLTYPE *) alloca (yystacksize * sizeof (*yylsp)); + __yy_memcpy ((char *)yyls, (char *)yyls1, size * sizeof (*yylsp)); +#endif +#endif /* no yyoverflow */ + + yyssp = yyss + size - 1; + yyvsp = yyvs + size - 1; +#ifdef YYLSP_NEEDED + yylsp = yyls + size - 1; +#endif + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Stack size increased to %d\n", yystacksize); +#endif + + if (yyssp >= yyss + yystacksize - 1) + YYABORT; + } + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Entering state %d\n", yystate); +#endif + + goto yybackup; + yybackup: + +/* Do appropriate processing given the current state. */ +/* Read a lookahead token if we need one and don't already have one. */ +/* yyresume: */ + + /* First try to decide what to do without reference to lookahead token. */ + + yyn = yypact[yystate]; + if (yyn == YYFLAG) + goto yydefault; + + /* Not known => get a lookahead token if don't already have one. */ + + /* yychar is either YYEMPTY or YYEOF + or a valid token in external form. */ + + if (yychar == YYEMPTY) + { +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Reading a token: "); +#endif + yychar = YYLEX; + } + + /* Convert token to internal form (in yychar1) for indexing tables with */ + + if (yychar <= 0) /* This means end of input. 
*/ + { + yychar1 = 0; + yychar = YYEOF; /* Don't call YYLEX any more */ + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Now at end of input.\n"); +#endif + } + else + { + yychar1 = YYTRANSLATE(yychar); + +#if YYDEBUG != 0 + if (yydebug) + { + fprintf (stderr, "Next token is %d (%s", yychar, yytname[yychar1]); + /* Give the individual parser a way to print the precise meaning + of a token, for further debugging info. */ +#ifdef YYPRINT + YYPRINT (stderr, yychar, yylval); +#endif + fprintf (stderr, ")\n"); + } +#endif + } + + yyn += yychar1; + if (yyn < 0 || yyn > YYLAST || yycheck[yyn] != yychar1) + goto yydefault; + + yyn = yytable[yyn]; + + /* yyn is what to do for this token type in this state. + Negative => reduce, -yyn is rule number. + Positive => shift, yyn is new state. + New state is final state => don't bother to shift, + just return success. + 0, or most negative number => error. */ + + if (yyn < 0) + { + if (yyn == YYFLAG) + goto yyerrlab; + yyn = -yyn; + goto yyreduce; + } + else if (yyn == 0) + goto yyerrlab; + + if (yyn == YYFINAL) + YYACCEPT; + + /* Shift the lookahead token. */ + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Shifting token %d (%s), ", yychar, yytname[yychar1]); +#endif + + /* Discard the token being shifted unless it is eof. */ + if (yychar != YYEOF) + yychar = YYEMPTY; + + *++yyvsp = yylval; +#ifdef YYLSP_NEEDED + *++yylsp = yylloc; +#endif + + /* count tokens shifted since error; after three, turn off error status. */ + if (yyerrstatus) yyerrstatus--; + + yystate = yyn; + goto yynewstate; + +/* Do the default action for the current state. */ +yydefault: + + yyn = yydefact[yystate]; + if (yyn == 0) + goto yyerrlab; + +/* Do a reduction. yyn is the number of a rule to reduce with. 
*/ +yyreduce: + yylen = yyr2[yyn]; + if (yylen > 0) + yyval = yyvsp[1-yylen]; /* implement default value of the action */ + +#if YYDEBUG != 0 + if (yydebug) + { + int i; + + fprintf (stderr, "Reducing via rule %d (line %d), ", + yyn, yyrline[yyn]); + + /* Print the symbols being reduced, and their result. */ + for (i = yyprhs[yyn]; yyrhs[i] > 0; i++) + fprintf (stderr, "%s ", yytname[yyrhs[i]]); + fprintf (stderr, " -> %s\n", yytname[yyr1[yyn]]); + } +#endif + +$ /* the action file gets copied in in place of this dollarsign */ +#line 498 "bison.simple" + + yyvsp -= yylen; + yyssp -= yylen; +#ifdef YYLSP_NEEDED + yylsp -= yylen; +#endif + +#if YYDEBUG != 0 + if (yydebug) + { + short *ssp1 = yyss - 1; + fprintf (stderr, "state stack now"); + while (ssp1 != yyssp) + fprintf (stderr, " %d", *++ssp1); + fprintf (stderr, "\n"); + } +#endif + + *++yyvsp = yyval; + +#ifdef YYLSP_NEEDED + yylsp++; + if (yylen == 0) + { + yylsp->first_line = yylloc.first_line; + yylsp->first_column = yylloc.first_column; + yylsp->last_line = (yylsp-1)->last_line; + yylsp->last_column = (yylsp-1)->last_column; + yylsp->text = 0; + } + else + { + yylsp->last_line = (yylsp+yylen-1)->last_line; + yylsp->last_column = (yylsp+yylen-1)->last_column; + } +#endif + + /* Now "shift" the result of the reduction. + Determine what state that goes to, + based on the state we popped back to + and the rule number reduced by. */ + + yyn = yyr1[yyn]; + + yystate = yypgoto[yyn - YYNTBASE] + *yyssp; + if (yystate >= 0 && yystate <= YYLAST && yycheck[yystate] == *yyssp) + yystate = yytable[yystate]; + else + yystate = yydefgoto[yyn - YYNTBASE]; + + goto yynewstate; + +yyerrlab: /* here on detecting error */ + + if (! yyerrstatus) + /* If not already recovering from an error, report this error. 
*/ + { + ++yynerrs; + +#ifdef YYERROR_VERBOSE + yyn = yypact[yystate]; + + if (yyn > YYFLAG && yyn < YYLAST) + { + int size = 0; + char *msg; + int x, count; + + count = 0; + /* Start X at -yyn if nec to avoid negative indexes in yycheck. */ + for (x = (yyn < 0 ? -yyn : 0); + x < (sizeof(yytname) / sizeof(char *)); x++) + if (yycheck[x + yyn] == x) + size += strlen(yytname[x]) + 15, count++; + msg = (char *) malloc(size + 15); + if (msg != 0) + { + strcpy(msg, "parse error"); + + if (count < 5) + { + count = 0; + for (x = (yyn < 0 ? -yyn : 0); + x < (sizeof(yytname) / sizeof(char *)); x++) + if (yycheck[x + yyn] == x) + { + strcat(msg, count == 0 ? ", expecting `" : " or `"); + strcat(msg, yytname[x]); + strcat(msg, "'"); + count++; + } + } + yyerror(msg); + free(msg); + } + else + yyerror ("parse error; also virtual memory exceeded"); + } + else +#endif /* YYERROR_VERBOSE */ + yyerror("parse error"); + } + + goto yyerrlab1; +yyerrlab1: /* here on error raised explicitly by an action */ + + if (yyerrstatus == 3) + { + /* if just tried and failed to reuse lookahead token after an error, discard it. */ + + /* return failure if at end of input */ + if (yychar == YYEOF) + YYABORT; + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Discarding token %d (%s).\n", yychar, yytname[yychar1]); +#endif + + yychar = YYEMPTY; + } + + /* Else will try to reuse lookahead token + after shifting the error token. */ + + yyerrstatus = 3; /* Each real token shifted decrements this */ + + goto yyerrhandle; + +yyerrdefault: /* current state does not do anything special for the error token. */ + +#if 0 + /* This is wrong; only states that explicitly want error tokens + should shift them. */ + yyn = yydefact[yystate]; /* If its default is to accept any token, ok. 
Otherwise pop it.*/ + if (yyn) goto yydefault; +#endif + +yyerrpop: /* pop the current state because it cannot handle the error token */ + + if (yyssp == yyss) YYABORT; + yyvsp--; + yystate = *--yyssp; +#ifdef YYLSP_NEEDED + yylsp--; +#endif + +#if YYDEBUG != 0 + if (yydebug) + { + short *ssp1 = yyss - 1; + fprintf (stderr, "Error: state stack now"); + while (ssp1 != yyssp) + fprintf (stderr, " %d", *++ssp1); + fprintf (stderr, "\n"); + } +#endif + +yyerrhandle: + + yyn = yypact[yystate]; + if (yyn == YYFLAG) + goto yyerrdefault; + + yyn += YYTERROR; + if (yyn < 0 || yyn > YYLAST || yycheck[yyn] != YYTERROR) + goto yyerrdefault; + + yyn = yytable[yyn]; + if (yyn < 0) + { + if (yyn == YYFLAG) + goto yyerrpop; + yyn = -yyn; + goto yyreduce; + } + else if (yyn == 0) + goto yyerrpop; + + if (yyn == YYFINAL) + YYACCEPT; + +#if YYDEBUG != 0 + if (yydebug) + fprintf(stderr, "Shifting error token, "); +#endif + + *++yyvsp = yylval; +#ifdef YYLSP_NEEDED + *++yylsp = yylloc; +#endif + + yystate = yyn; + goto yynewstate; +} diff --git a/contrib/bison/bison.texinfo b/contrib/bison/bison.texinfo new file mode 100644 index 000000000000..61f7b12f62cb --- /dev/null +++ b/contrib/bison/bison.texinfo @@ -0,0 +1,5435 @@ +\input texinfo @c -*-texinfo-*- +@comment %**start of header +@setfilename bison.info +@settitle Bison 1.25 +@setchapternewpage odd + +@iftex +@finalout +@end iftex + +@c SMALL BOOK version +@c This edition has been formatted so that you can format and print it in +@c the smallbook format. +@c @smallbook + +@c next time, consider using @set for edition number, etc... + +@c Set following if you have the new `shorttitlepage' command +@c @clear shorttitlepage-enabled +@c @set shorttitlepage-enabled + +@c ISPELL CHECK: done, 14 Jan 1993 --bob + +@c Check COPYRIGHT dates. should be updated in the titlepage, ifinfo +@c titlepage; should NOT be changed in the GPL. 
--mew + +@iftex +@syncodeindex fn cp +@syncodeindex vr cp +@syncodeindex tp cp +@end iftex +@ifinfo +@synindex fn cp +@synindex vr cp +@synindex tp cp +@end ifinfo +@comment %**end of header + +@ifinfo +This file documents the Bison parser generator. + +Copyright (C) 1988, 89, 90, 91, 92, 93, 1995 Free Software Foundation, Inc. + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +@ignore +Permission is granted to process this file through Tex and print the +results, provided the printed document carries copying permission +notice identical to this one except for the removal of this paragraph +(this paragraph not being relevant to the printed manual). + +@end ignore +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +sections entitled ``GNU General Public License'' and ``Conditions for +Using Bison'' are included exactly as in the original, and provided that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the sections entitled ``GNU General Public License'', +``Conditions for Using Bison'' and this permission notice may be +included in translations approved by the Free Software Foundation +instead of in the original English. 
+@end ifinfo + +@ifset shorttitlepage-enabled +@shorttitlepage Bison +@end ifset +@titlepage +@title Bison +@subtitle The YACC-compatible Parser Generator +@subtitle November 1995, Bison Version 1.25 + +@author by Charles Donnelly and Richard Stallman + +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 1988, 89, 90, 91, 92, 93, 1995 Free Software +Foundation + +@sp 2 +Published by the Free Software Foundation @* +59 Temple Place, Suite 330 @* +Boston, MA 02111-1307 USA @* +Printed copies are available for $15 each.@* +ISBN 1-882114-45-0 + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +@ignore +Permission is granted to process this file through TeX and print the +results, provided the printed document carries copying permission +notice identical to this one except for the removal of this paragraph +(this paragraph not being relevant to the printed manual). + +@end ignore +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +sections entitled ``GNU General Public License'' and ``Conditions for +Using Bison'' are included exactly as in the original, and provided that +the entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the sections entitled ``GNU General Public License'', +``Conditions for Using Bison'' and this permission notice may be +included in translations approved by the Free Software Foundation +instead of in the original English. +@sp 2 +Cover art by Etienne Suvasa. +@end titlepage +@page + +@node Top, Introduction, (dir), (dir) + +@ifinfo +This manual documents version 1.25 of Bison. 
+@end ifinfo + +@menu +* Introduction:: +* Conditions:: +* Copying:: The GNU General Public License says + how you can copy and share Bison + +Tutorial sections: +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. + +Reference sections: +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function @code{yyparse}. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. +* Context Dependency:: What to do if your language syntax is too + messy for Bison to handle straightforwardly. +* Debugging:: Debugging Bison parsers that parse wrong. +* Invocation:: How to run Bison (to produce the parser source file). +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Index:: Cross-references to the text. + + --- The Detailed Node Listing --- + +The Concepts of Bison + +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* Bison Parser:: What are Bison's input and output, + how is the output used? +* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. + +Examples + +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. +* Simple Error Recovery:: Continuing after syntax errors. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. 
+* Exercises:: Ideas for improving the multi-function calculator. + +Reverse Polish Notation Calculator + +* Decls: Rpcalc Decls. Bison and C declarations for rpcalc. +* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. +* Lexer: Rpcalc Lexer. The lexical analyzer. +* Main: Rpcalc Main. The controlling function. +* Error: Rpcalc Error. The error reporting function. +* Gen: Rpcalc Gen. Running Bison on the grammar file. +* Comp: Rpcalc Compile. Run the C compiler on the output code. + +Grammar Rules for @code{rpcalc} + +* Rpcalc Input:: +* Rpcalc Line:: +* Rpcalc Expr:: + +Multi-Function Calculator: @code{mfcalc} + +* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. +* Rules: Mfcalc Rules. Grammar rules for the calculator. +* Symtab: Mfcalc Symtab. Symbol table management subroutines. + +Bison Grammar Files + +* Grammar Outline:: Overall layout of the grammar file. +* Symbols:: Terminal and nonterminal symbols. +* Rules:: How to write grammar rules. +* Recursion:: Writing recursive rules. +* Semantics:: Semantic values and actions. +* Declarations:: All kinds of Bison declarations are described here. +* Multiple Parsers:: Putting more than one Bison parser in one program. + +Outline of a Bison Grammar + +* C Declarations:: Syntax and usage of the C declarations section. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* C Code:: Syntax and usage of the additional C code section. + +Defining Language Semantics + +* Value Type:: Specifying one data type for all semantic values. +* Multiple Types:: Specifying several alternative data types. +* Actions:: An action is the semantic definition of a grammar rule. +* Action Types:: Specifying data types for actions to operate on. +* Mid-Rule Actions:: Most actions go at the end of a rule. + This says when, why and how to use the exceptional + action in the middle of a rule. 
+ +Bison Declarations + +* Token Decl:: Declaring terminal symbols. +* Precedence Decl:: Declaring terminals with precedence and associativity. +* Union Decl:: Declaring the set of all semantic value types. +* Type Decl:: Declaring the choice of type for a nonterminal symbol. +* Expect Decl:: Suppressing warnings about shift/reduce conflicts. +* Start Decl:: Specifying the start symbol. +* Pure Decl:: Requesting a reentrant parser. +* Decl Summary:: Table of all Bison declarations. + +Parser C-Language Interface + +* Parser Function:: How to call @code{yyparse} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. + +The Lexical Analyzer Function @code{yylex} + +* Calling Convention:: How @code{yyparse} calls @code{yylex}. +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Positions:: How @code{yylex} must return the text position + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs + in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). + +The Bison Parser Algorithm + +* Look-Ahead:: Parser looks one token ahead when deciding what to do. +* Shift/Reduce:: Conflicts: when either shifting or reduction is valid. +* Precedence:: Operator precedence works by resolving conflicts. +* Contextual Precedence:: When an operator's precedence depends on context. +* Parser States:: The parser is a finite-state-machine with stack. +* Reduce/Reduce:: When two rules are applicable in the same situation. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Stack Overflow:: What happens when stack gets full. How to avoid it. + +Operator Precedence + +* Why Precedence:: An example showing why precedence is needed. 
+* Using Precedence:: How to specify precedence in Bison grammars. +* Precedence Examples:: How these features are used in the previous example. +* How Precedence:: How they work. + +Handling Context Dependencies + +* Semantic Tokens:: Token parsing can depend on the semantic context. +* Lexical Tie-ins:: Token parsing can depend on the syntactic context. +* Tie-in Recovery:: Lexical tie-ins have implications for how + error recovery rules must be written. + +Invoking Bison + +* Bison Options:: All the options described in detail, + in alphabetical order by short options. +* Option Cross Key:: Alphabetical list of long options. +* VMS Invocation:: Bison command syntax on VMS. +@end menu + +@node Introduction, Conditions, Top, Top +@unnumbered Introduction +@cindex introduction + +@dfn{Bison} is a general-purpose parser generator that converts a +grammar description for an LALR(1) context-free grammar into a C +program to parse that grammar. Once you are proficient with Bison, +you may use it to develop a wide range of language parsers, from those +used in simple desk calculators to complex programming languages. + +Bison is upward compatible with Yacc: all properly-written Yacc grammars +ought to work with Bison with no change. Anyone familiar with Yacc +should be able to use Bison with little trouble. You need to be fluent in +C programming in order to use Bison or to understand this manual. + +We begin with tutorial chapters that explain the basic concepts of using +Bison and show three explained examples, each building on the last. If you +don't know Bison or Yacc, start by reading these chapters. Reference +chapters follow which describe specific aspects of Bison in detail. + +Bison was written primarily by Robert Corbett; Richard Stallman made it +Yacc-compatible. Wilfred Hansen of Carnegie Mellon University added +multicharacter string literals and other features. + +This edition corresponds to version 1.25 of Bison. 
+ +@node Conditions, Copying, Introduction, Top +@unnumbered Conditions for Using Bison + +As of Bison version 1.24, we have changed the distribution terms for +@code{yyparse} to permit using Bison's output in non-free programs. +Formerly, Bison parsers could be used only in programs that were free +software. + +The other GNU programming tools, such as the GNU C compiler, have never +had such a requirement. They could always be used for non-free +software. The reason Bison was different was not due to a special +policy decision; it resulted from applying the usual General Public +License to all of the Bison source code. + +The output of the Bison utility---the Bison parser file---contains a +verbatim copy of a sizable piece of Bison, which is the code for the +@code{yyparse} function. (The actions from your grammar are inserted +into this function at one point, but the rest of the function is not +changed.) When we applied the GPL terms to the code for @code{yyparse}, +the effect was to restrict the use of Bison output to free software. + +We didn't change the terms because of sympathy for people who want to +make software proprietary. @strong{Software should be free.} But we +concluded that limiting Bison's use to free software was doing little to +encourage people to make other software free. So we decided to make the +practical conditions for using Bison match the practical conditions for +using the other GNU tools. + +@node Copying, Concepts, Conditions, Top +@unnumbered GNU GENERAL PUBLIC LICENSE +@center Version 2, June 1991 + +@display +Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc. +675 Mass Ave, Cambridge, MA 02139, USA + +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. +@end display + +@unnumberedsec Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. 
By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software---to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. 
If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + +@iftex +@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end iftex +@ifinfo +@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end ifinfo + +@enumerate 0 +@item +This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The ``Program'', below, +refers to any such program or work, and a ``work based on the Program'' +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term ``modification''.) Each licensee is addressed as ``you''. + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. 
+ +@item +You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + +@item +You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + +@enumerate a +@item +You must cause the modified files to carry prominent notices +stating that you changed the files and the date of any change. + +@item +You must cause any work that you distribute or publish, that in +whole or in part contains or is derived from the Program or any +part thereof, to be licensed as a whole at no charge to all third +parties under the terms of this License. + +@item +If the modified program normally reads commands interactively +when run, you must cause it, when started running for such +interactive use in the most ordinary way, to print or display an +announcement including an appropriate copyright notice and a +notice that there is no warranty (or else, saying that you provide +a warranty) and that users may redistribute the program under +these conditions, and telling the user how to view a copy of this +License. (Exception: if the Program itself is interactive but +does not normally print such an announcement, your work based on +the Program is not required to print an announcement.) +@end enumerate + +These requirements apply to the modified work as a whole. 
If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + +@item +You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + +@enumerate a +@item +Accompany it with the complete corresponding machine-readable +source code, which must be distributed under the terms of Sections +1 and 2 above on a medium customarily used for software interchange; or, + +@item +Accompany it with a written offer, valid for at least three +years, to give any third party, for a charge no more than your +cost of physically performing source distribution, a complete +machine-readable copy of the corresponding source code, to be +distributed under the terms of Sections 1 and 2 above on a medium +customarily used for software interchange; or, + +@item +Accompany it with the information you received as to the offer +to distribute corresponding source code. 
(This alternative is +allowed only for noncommercial distribution and only if you +received the program in object code or executable form with such +an offer, in accord with Subsection b above.) +@end enumerate + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + +@item +You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + +@item +You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + +@item +Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + +@item +If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + +@item +If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + +@item +The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and ``any +later version'', you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. 
+ +@item +If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + +@iftex +@heading NO WARRANTY +@end iftex +@ifinfo +@center NO WARRANTY +@end ifinfo + +@item +BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + +@item +IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. 
+@end enumerate + +@iftex +@heading END OF TERMS AND CONDITIONS +@end iftex +@ifinfo +@center END OF TERMS AND CONDITIONS +@end ifinfo + +@page +@unnumberedsec How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the ``copyright'' line and a pointer to where the full notice is found. + +@smallexample +@var{one line to give the program's name and a brief idea of what it does.} +Copyright (C) 19@var{yy} @var{name of author} + +This program is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2 of the License, or +(at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +@end smallexample + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + +@smallexample +Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author} +Gnomovision comes with ABSOLUTELY NO WARRANTY; for details +type `show w'. 
+This is free software, and you are welcome to redistribute it +under certain conditions; type `show c' for details. +@end smallexample + +The hypothetical commands @samp{show w} and @samp{show c} should show +the appropriate parts of the General Public License. Of course, the +commands you use may be called something other than @samp{show w} and +@samp{show c}; they could even be mouse-clicks or menu items---whatever +suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a ``copyright disclaimer'' for the program, if +necessary. Here is a sample; alter the names: + +@smallexample +Yoyodyne, Inc., hereby disclaims all copyright interest in the program +`Gnomovision' (which makes passes at compilers) written by James Hacker. + +@var{signature of Ty Coon}, 1 April 1989 +Ty Coon, President of Vice +@end smallexample + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. + +@node Concepts, Examples, Copying, Top +@chapter The Concepts of Bison + +This chapter introduces many of the basic concepts without which the +details of Bison will not make sense. If you do not already know how to +use Bison or Yacc, we suggest you start by reading this chapter carefully. + +@menu +* Language and Grammar:: Languages and context-free grammars, + as mathematical ideas. +* Grammar in Bison:: How we represent grammars for Bison's sake. +* Semantic Values:: Each token or syntactic grouping can have + a semantic value (the value of an integer, + the name of an identifier, etc.). +* Semantic Actions:: Each rule can have an action containing C code. +* Bison Parser:: What are Bison's input and output, + how is the output used? 
+* Stages:: Stages in writing and running Bison grammars. +* Grammar Layout:: Overall structure of a Bison grammar file. +@end menu + +@node Language and Grammar, Grammar in Bison, , Concepts +@section Languages and Context-Free Grammars + +@cindex context-free grammar +@cindex grammar, context-free +In order for Bison to parse a language, it must be described by a +@dfn{context-free grammar}. This means that you specify one or more +@dfn{syntactic groupings} and give rules for constructing them from their +parts. For example, in the C language, one kind of grouping is called an +`expression'. One rule for making an expression might be, ``An expression +can be made of a minus sign and another expression''. Another would be, +``An expression can be an integer''. As you can see, rules are often +recursive, but there must be at least one rule which leads out of the +recursion. + +@cindex BNF +@cindex Backus-Naur form +The most common formal system for presenting such rules for humans to read +is @dfn{Backus-Naur Form} or ``BNF'', which was developed in order to +specify the language Algol 60. Any grammar expressed in BNF is a +context-free grammar. The input to Bison is essentially machine-readable +BNF. + +Not all context-free languages can be handled by Bison, only those +that are LALR(1). In brief, this means that it must be possible to +tell how to parse any portion of an input string with just a single +token of look-ahead. Strictly speaking, that is a description of an +LR(1) grammar, and LALR(1) involves additional restrictions that are +hard to explain simply; but it is rare in actual practice to find an +LR(1) grammar that fails to be LALR(1). @xref{Mystery Conflicts, , +Mysterious Reduce/Reduce Conflicts}, for more information on this. + +@cindex symbols (abstract) +@cindex token +@cindex syntactic grouping +@cindex grouping, syntactic +In the formal grammatical rules for a language, each kind of syntactic unit +or grouping is named by a @dfn{symbol}. 
Those which are built by grouping +smaller constructs according to grammatical rules are called +@dfn{nonterminal symbols}; those which can't be subdivided are called +@dfn{terminal symbols} or @dfn{token types}. We call a piece of input +corresponding to a single terminal symbol a @dfn{token}, and a piece +corresponding to a single nonterminal symbol a @dfn{grouping}.@refill + +We can use the C language as an example of what symbols, terminal and +nonterminal, mean. The tokens of C are identifiers, constants (numeric and +string), and the various keywords, arithmetic operators and punctuation +marks. So the terminal symbols of a grammar for C include `identifier', +`number', `string', plus one symbol for each keyword, operator or +punctuation mark: `if', `return', `const', `static', `int', `char', +`plus-sign', `open-brace', `close-brace', `comma' and many more. (These +tokens can be subdivided into characters, but that is a matter of +lexicography, not grammar.) + +Here is a simple C function subdivided into tokens: + +@example +int /* @r{keyword `int'} */ +square (x) /* @r{identifier, open-paren,} */ + /* @r{identifier, close-paren} */ + int x; /* @r{keyword `int', identifier, semicolon} */ +@{ /* @r{open-brace} */ + return x * x; /* @r{keyword `return', identifier,} */ + /* @r{asterisk, identifier, semicolon} */ +@} /* @r{close-brace} */ +@end example + +The syntactic groupings of C include the expression, the statement, the +declaration, and the function definition. These are represented in the +grammar of C by nonterminal symbols `expression', `statement', +`declaration' and `function definition'. The full grammar uses dozens of +additional language constructs, each with its own nonterminal symbol, in +order to express the meanings of these four. The example above is a +function definition; it contains one declaration, and one statement. In +the statement, each @samp{x} is an expression and so is @samp{x * x}. 
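The subdivision of the sample function into tokens can be sketched in C. This is only an illustration under stated assumptions: the names @code{next_token}, @code{count_tokens} and the @code{enum symbol} codes are hypothetical, and the scanner recognizes just the two keywords used in the example, not the full lexical rules of C.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical terminal-symbol codes, for illustration only. */
enum symbol { END = 0, KEYWORD, IDENTIFIER, NUMBER, PUNCTUATION };

/* Scan one token starting at *p, advance *p past it, and return its
   terminal symbol.  Only a tiny subset of C is recognized. */
enum symbol
next_token (const char **p)
{
  const char *s;
  const char *start;
  size_t len;

  assert (p != NULL && *p != NULL);
  s = *p;
  while (isspace ((unsigned char) *s))
    s++;
  if (*s == '\0')
    {
      *p = s;
      return END;
    }
  start = s;
  if (isalpha ((unsigned char) *s) || *s == '_')
    {
      while (isalnum ((unsigned char) *s) || *s == '_')
        s++;
      len = (size_t) (s - start);
      *p = s;
      /* Assumption: only `int' and `return' count as keywords here. */
      if ((len == 3 && strncmp (start, "int", 3) == 0)
          || (len == 6 && strncmp (start, "return", 6) == 0))
        return KEYWORD;
      return IDENTIFIER;
    }
  if (isdigit ((unsigned char) *s))
    {
      while (isdigit ((unsigned char) *s))
        s++;
      *p = s;
      return NUMBER;
    }
  *p = s + 1;                   /* any other character is one token */
  return PUNCTUATION;
}

/* Count the tokens in TEXT. */
int
count_tokens (const char *text)
{
  int n = 0;

  while (next_token (&text) != END)
    n++;
  return n;
}
```

Applied to the text of the sample function, @code{count_tokens} finds the same fifteen tokens that the comments in the example list. Grouping those tokens into expressions, statements and declarations is the job of the grammar, not of this scanner.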
+ +Each nonterminal symbol must have grammatical rules showing how it is made +out of simpler constructs. For example, one kind of C statement is the +@code{return} statement; this would be described with a grammar rule which +reads informally as follows: + +@quotation +A `statement' can be made of a `return' keyword, an `expression' and a +`semicolon'. +@end quotation + +@noindent +There would be many other rules for `statement', one for each kind of +statement in C. + +@cindex start symbol +One nonterminal symbol must be distinguished as the special one which +defines a complete utterance in the language. It is called the @dfn{start +symbol}. In a compiler, this means a complete input program. In the C +language, the nonterminal symbol `sequence of definitions and declarations' +plays this role. + +For example, @samp{1 + 2} is a valid C expression---a valid part of a C +program---but it is not valid as an @emph{entire} C program. In the +context-free grammar of C, this follows from the fact that `expression' is +not the start symbol. + +The Bison parser reads a sequence of tokens as its input, and groups the +tokens using the grammar rules. If the input is valid, the end result is +that the entire token sequence reduces to a single grouping whose symbol is +the grammar's start symbol. If we use a grammar for C, the entire input +must be a `sequence of definitions and declarations'. If not, the parser +reports a syntax error. + +@node Grammar in Bison, Semantic Values, Language and Grammar, Concepts +@section From Formal Rules to Bison Input +@cindex Bison grammar +@cindex grammar, Bison +@cindex formal grammar + +A formal grammar is a mathematical construct. To define the language +for Bison, you must write a file expressing the grammar in Bison syntax: +a @dfn{Bison grammar} file. @xref{Grammar File, ,Bison Grammar Files}. + +A nonterminal symbol in the formal grammar is represented in Bison input +as an identifier, like an identifier in C. 
By convention, it should be +in lower case, such as @code{expr}, @code{stmt} or @code{declaration}. + +The Bison representation for a terminal symbol is also called a @dfn{token +type}. Token types as well can be represented as C-like identifiers. By +convention, these identifiers should be upper case to distinguish them from +nonterminals: for example, @code{INTEGER}, @code{IDENTIFIER}, @code{IF} or +@code{RETURN}. A terminal symbol that stands for a particular keyword in +the language should be named after that keyword converted to upper case. +The terminal symbol @code{error} is reserved for error recovery. +@xref{Symbols}. + +A terminal symbol can also be represented as a character literal, just like +a C character constant. You should do this whenever a token is just a +single character (parenthesis, plus-sign, etc.): use that same character in +a literal as the terminal symbol for that token. + +A third way to represent a terminal symbol is with a C string constant +containing several characters. @xref{Symbols}, for more information. + +The grammar rules also have an expression in Bison syntax. For example, +here is the Bison rule for a C @code{return} statement. The semicolon in +quotes is a literal character token, representing part of the C syntax for +the statement; the naked semicolon, and the colon, are Bison punctuation +used in every rule. + +@example +stmt: RETURN expr ';' + ; +@end example + +@noindent +@xref{Rules, ,Syntax of Grammar Rules}. + +@node Semantic Values, Semantic Actions, Grammar in Bison, Concepts +@section Semantic Values +@cindex semantic value +@cindex value, semantic + +A formal grammar selects tokens only by their classifications: for example, +if a rule mentions the terminal symbol `integer constant', it means that +@emph{any} integer constant is grammatically valid in that position. 
The +precise value of the constant is irrelevant to how to parse the input: if +@samp{x+4} is grammatical then @samp{x+1} or @samp{x+3989} is equally +grammatical.@refill + +But the precise value is very important for what the input means once it is +parsed. A compiler is useless if it fails to distinguish between 4, 1 and +3989 as constants in the program! Therefore, each token in a Bison grammar +has both a token type and a @dfn{semantic value}. @xref{Semantics, ,Defining Language Semantics}, +for details. + +The token type is a terminal symbol defined in the grammar, such as +@code{INTEGER}, @code{IDENTIFIER} or @code{','}. It tells everything +you need to know to decide where the token may validly appear and how to +group it with other tokens. The grammar rules know nothing about tokens +except their types.@refill + +The semantic value has all the rest of the information about the +meaning of the token, such as the value of an integer, or the name of an +identifier. (A token such as @code{','} which is just punctuation doesn't +need to have any semantic value.) + +For example, an input token might be classified as token type +@code{INTEGER} and have the semantic value 4. Another input token might +have the same token type @code{INTEGER} but value 3989. When a grammar +rule says that @code{INTEGER} is allowed, either of these tokens is +acceptable because each is an @code{INTEGER}. When the parser accepts the +token, it keeps track of the token's semantic value. + +Each grouping can also have a semantic value as well as its nonterminal +symbol. For example, in a calculator, an expression typically has a +semantic value that is a number. In a compiler for a programming +language, an expression typically has a semantic value that is a tree +structure describing the meaning of the expression. 
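The split between token type and semantic value can be sketched as a small C data structure. This is a minimal sketch, not the representation Bison itself uses: @code{struct token}, @code{token_matches}, @code{integer_value} and the numeric type codes are all hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Hypothetical token type codes; the numbers are arbitrary. */
enum token_type { INTEGER = 1, IDENTIFIER = 2 };

struct token
{
  enum token_type type;   /* all that the grammar rules look at */
  int value;              /* semantic value; meaningful for INTEGER */
};

/* The grammar's view: a token fits wherever its type is allowed,
   whatever its semantic value may be. */
int
token_matches (struct token tok, enum token_type wanted)
{
  return tok.type == wanted;
}

/* The view of an action: read the semantic value of an INTEGER token. */
int
integer_value (struct token tok)
{
  assert (tok.type == INTEGER);
  return tok.value;
}
```

Two tokens with values 4 and 3989 both satisfy @code{token_matches} for @code{INTEGER}, so they are interchangeable as far as parsing goes; only @code{integer_value} tells them apart.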
+ +@node Semantic Actions, Bison Parser, Semantic Values, Concepts +@section Semantic Actions +@cindex semantic actions +@cindex actions, semantic + +In order to be useful, a program must do more than parse input; it must +also produce some output based on the input. In a Bison grammar, a grammar +rule can have an @dfn{action} made up of C statements. Each time the +parser recognizes a match for that rule, the action is executed. +@xref{Actions}. + +Most of the time, the purpose of an action is to compute the semantic value +of the whole construct from the semantic values of its parts. For example, +suppose we have a rule which says an expression can be the sum of two +expressions. When the parser recognizes such a sum, each of the +subexpressions has a semantic value which describes how it was built up. +The action for this rule should create a similar sort of value for the +newly recognized larger expression. + +For example, here is a rule that says an expression can be the sum of +two subexpressions: + +@example +expr: expr '+' expr @{ $$ = $1 + $3; @} + ; +@end example + +@noindent +The action says how to produce the semantic value of the sum expression +from the values of the two subexpressions. + +@node Bison Parser, Stages, Semantic Actions, Concepts +@section Bison Output: the Parser File +@cindex Bison parser +@cindex Bison utility +@cindex lexical analyzer, purpose +@cindex parser + +When you run Bison, you give it a Bison grammar file as input. The output +is a C source file that parses the language described by the grammar. +This file is called a @dfn{Bison parser}. Keep in mind that the Bison +utility and the Bison parser are two distinct programs: the Bison utility +is a program whose output is the Bison parser that becomes part of your +program. + +The job of the Bison parser is to group tokens into groupings according to +the grammar rules---for example, to build identifiers and operators into +expressions. 
As it does this, it runs the actions for the grammar rules it +uses. + +The tokens come from a function called the @dfn{lexical analyzer} that you +must supply in some fashion (such as by writing it in C). The Bison parser +calls the lexical analyzer each time it wants a new token. It doesn't know +what is ``inside'' the tokens (though their semantic values may reflect +this). Typically the lexical analyzer makes the tokens by parsing +characters of text, but Bison does not depend on this. @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. + +The Bison parser file is C code which defines a function named +@code{yyparse} which implements that grammar. This function does not make +a complete C program: you must supply some additional functions. One is +the lexical analyzer. Another is an error-reporting function which the +parser calls to report an error. In addition, a complete C program must +start with a function called @code{main}; you have to provide this, and +arrange for it to call @code{yyparse} or the parser will never run. +@xref{Interface, ,Parser C-Language Interface}. + +Aside from the token type names and the symbols in the actions you +write, all variable and function names used in the Bison parser file +begin with @samp{yy} or @samp{YY}. This includes interface functions +such as the lexical analyzer function @code{yylex}, the error reporting +function @code{yyerror} and the parser function @code{yyparse} itself. +This also includes numerous identifiers used for internal purposes. +Therefore, you should avoid using C identifiers starting with @samp{yy} +or @samp{YY} in the Bison grammar file except for the ones defined in +this manual. 
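The supporting functions described above might look roughly like the following sketch. Several things here are assumptions made so that the fragment stands alone: the token code @code{NUM}, the helper @code{set_input}, the counter @code{error_count}, and reading from a string rather than from a file are all hypothetical. The variable @code{yylval} is defined here only because no generated parser file is present to define it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM 258                 /* hypothetical token type code */

double yylval;                  /* semantic value of the latest token */
int error_count = 0;            /* hypothetical, for illustration */
static const char *input = "";  /* text still to be scanned */

/* Point the lexical analyzer at a new input string (hypothetical). */
void
set_input (const char *text)
{
  assert (text != NULL);
  input = text;
}

/* Lexical analyzer: return the type of the next token, or 0 at end of
   input.  A number yields NUM with its value in yylval; any other
   character is returned as its own character code. */
int
yylex (void)
{
  char *end;

  while (*input == ' ' || *input == '\t')
    input++;
  if (*input == '\0')
    return 0;
  if (isdigit ((unsigned char) *input))
    {
      yylval = strtod (input, &end);
      input = end;
      return NUM;
    }
  return (unsigned char) *input++;
}

/* Error-reporting function, called by the parser on a syntax error. */
void
yyerror (const char *message)
{
  error_count++;
  fprintf (stderr, "%s\n", message);
}
```

In a complete program, @code{main} would call @code{yyparse}, which in turn calls @code{yylex} for each token and @code{yyerror} when the input does not fit the grammar.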
+ +@node Stages, Grammar Layout, Bison Parser, Concepts +@section Stages in Using Bison +@cindex stages in using Bison +@cindex using Bison + +The actual language-design process using Bison, from grammar specification +to a working compiler or interpreter, has these parts: + +@enumerate +@item +Formally specify the grammar in a form recognized by Bison +(@pxref{Grammar File, ,Bison Grammar Files}). For each grammatical rule in the language, +describe the action that is to be taken when an instance of that rule +is recognized. The action is described by a sequence of C statements. + +@item +Write a lexical analyzer to process input and pass tokens to the +parser. The lexical analyzer may be written by hand in C +(@pxref{Lexical, ,The Lexical Analyzer Function @code{yylex}}). It could also be produced using Lex, but the use +of Lex is not discussed in this manual. + +@item +Write a controlling function that calls the Bison-produced parser. + +@item +Write error-reporting routines. +@end enumerate + +To turn this source code as written into a runnable program, you +must follow these steps: + +@enumerate +@item +Run Bison on the grammar to produce the parser. + +@item +Compile the code output by Bison, as well as any other source files. + +@item +Link the object files to produce the finished product. +@end enumerate + +@node Grammar Layout, , Stages, Concepts +@section The Overall Layout of a Bison Grammar +@cindex grammar file +@cindex file format +@cindex format of grammar file +@cindex layout of Bison grammar + +The input file for the Bison utility is a @dfn{Bison grammar file}. The +general form of a Bison grammar file is as follows: + +@example +%@{ +@var{C declarations} +%@} + +@var{Bison declarations} + +%% +@var{Grammar rules} +%% +@var{Additional C code} +@end example + +@noindent +The @samp{%%}, @samp{%@{} and @samp{%@}} are punctuation that appears +in every Bison grammar file to separate the sections. 
+ +The C declarations may define types and variables used in the actions. +You can also use preprocessor commands to define macros used there, and use +@code{#include} to include header files that do any of these things. + +The Bison declarations declare the names of the terminal and nonterminal +symbols, and may also describe operator precedence and the data types of +semantic values of various symbols. + +The grammar rules define how to construct each nonterminal symbol from its +parts. + +The additional C code can contain any C code you want to use. Often the +definition of the lexical analyzer @code{yylex} goes here, plus subroutines +called by the actions in the grammar rules. In a simple program, all the +rest of the program can go here. + +@node Examples, Grammar File, Concepts, Top +@chapter Examples +@cindex simple examples +@cindex examples, simple + +Now we show and explain three sample programs written using Bison: a +reverse polish notation calculator, an algebraic (infix) notation +calculator, and a multi-function calculator. All three have been tested +under BSD Unix 4.3; each produces a usable, though limited, interactive +desk-top calculator. + +These examples are simple, but Bison grammars for real programming +languages are written the same way. +@ifinfo +You can copy these examples out of the Info file and into a source file +to try them. +@end ifinfo + +@menu +* RPN Calc:: Reverse polish notation calculator; + a first example with no operator precedence. +* Infix Calc:: Infix (algebraic) notation calculator. + Operator precedence is introduced. +* Simple Error Recovery:: Continuing after syntax errors. +* Multi-function Calc:: Calculator with memory and trig functions. + It uses multiple data-types for semantic values. +* Exercises:: Ideas for improving the multi-function calculator. 
+@end menu + +@node RPN Calc, Infix Calc, , Examples +@section Reverse Polish Notation Calculator +@cindex reverse polish notation +@cindex polish notation calculator +@cindex @code{rpcalc} +@cindex calculator, simple + +The first example is that of a simple double-precision @dfn{reverse polish +notation} calculator (a calculator using postfix operators). This example +provides a good starting point, since operator precedence is not an issue. +The second example will illustrate how operator precedence is handled. + +The source code for this calculator is named @file{rpcalc.y}. The +@samp{.y} extension is a convention used for Bison input files. + +@menu +* Decls: Rpcalc Decls. Bison and C declarations for rpcalc. +* Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation. +* Lexer: Rpcalc Lexer. The lexical analyzer. +* Main: Rpcalc Main. The controlling function. +* Error: Rpcalc Error. The error reporting function. +* Gen: Rpcalc Gen. Running Bison on the grammar file. +* Comp: Rpcalc Compile. Run the C compiler on the output code. +@end menu + +@node Rpcalc Decls, Rpcalc Rules, , RPN Calc +@subsection Declarations for @code{rpcalc} + +Here are the C and Bison declarations for the reverse polish notation +calculator. As in C, comments are placed between @samp{/*@dots{}*/}. + +@example +/* Reverse polish notation calculator. */ + +%@{ +#define YYSTYPE double +#include <math.h> +%@} + +%token NUM + +%% /* Grammar rules and actions follow */ +@end example + +The C declarations section (@pxref{C Declarations, ,The C Declarations Section}) contains two +preprocessor directives. + +The @code{#define} directive defines the macro @code{YYSTYPE}, thus +specifying the C data type for semantic values of both tokens and groupings +(@pxref{Value Type, ,Data Types of Semantic Values}). The Bison parser will use whatever type +@code{YYSTYPE} is defined as; if you don't define it, @code{int} is the +default.
Because we specify @code{double}, each token and each expression +has an associated value, which is a floating point number. + +The @code{#include} directive is used to declare the exponentiation +function @code{pow}. + +The second section, Bison declarations, provides information to Bison about +the token types (@pxref{Bison Declarations, ,The Bison Declarations Section}). Each terminal symbol that is +not a single-character literal must be declared here. (Single-character +literals normally don't need to be declared.) In this example, all the +arithmetic operators are designated by single-character literals, so the +only terminal symbol that needs to be declared is @code{NUM}, the token +type for numeric constants. + +@node Rpcalc Rules, Rpcalc Lexer, Rpcalc Decls, RPN Calc +@subsection Grammar Rules for @code{rpcalc} + +Here are the grammar rules for the reverse polish notation calculator. + +@example +input: /* empty */ + | input line +; + +line: '\n' + | exp '\n' @{ printf ("\t%.10g\n", $1); @} +; + +exp: NUM @{ $$ = $1; @} + | exp exp '+' @{ $$ = $1 + $2; @} + | exp exp '-' @{ $$ = $1 - $2; @} + | exp exp '*' @{ $$ = $1 * $2; @} + | exp exp '/' @{ $$ = $1 / $2; @} + /* Exponentiation */ + | exp exp '^' @{ $$ = pow ($1, $2); @} + /* Unary minus */ + | exp 'n' @{ $$ = -$1; @} +; +%% +@end example + +The groupings of the rpcalc ``language'' defined here are the expression +(given the name @code{exp}), the line of input (@code{line}), and the +complete input transcript (@code{input}). Each of these nonterminal +symbols has several alternate rules, joined by the @samp{|} punctuator +which is read as ``or''. The following sections explain what these rules +mean. + +The semantics of the language is determined by the actions taken when a +grouping is recognized. The actions are the C code that appears inside +braces. @xref{Actions}. + +You must specify these actions in C, but Bison provides the means for +passing semantic values between the rules. 
In each action, the +pseudo-variable @code{$$} stands for the semantic value for the grouping +that the rule is going to construct. Assigning a value to @code{$$} is the +main job of most actions. The semantic values of the components of the +rule are referred to as @code{$1}, @code{$2}, and so on. + +@menu +* Rpcalc Input:: +* Rpcalc Line:: +* Rpcalc Expr:: +@end menu + +@node Rpcalc Input, Rpcalc Line, , Rpcalc Rules +@subsubsection Explanation of @code{input} + +Consider the definition of @code{input}: + +@example +input: /* empty */ + | input line +; +@end example + +This definition reads as follows: ``A complete input is either an empty +string, or a complete input followed by an input line''. Notice that +``complete input'' is defined in terms of itself. This definition is said +to be @dfn{left recursive} since @code{input} appears always as the +leftmost symbol in the sequence. @xref{Recursion, ,Recursive Rules}. + +The first alternative is empty because there are no symbols between the +colon and the first @samp{|}; this means that @code{input} can match an +empty string of input (no tokens). We write the rules this way because it +is legitimate to type @kbd{Ctrl-d} right after you start the calculator. +It's conventional to put an empty alternative first and write the comment +@samp{/* empty */} in it. + +The second alternate rule (@code{input line}) handles all nontrivial input. +It means, ``After reading any number of lines, read one more line if +possible.'' The left recursion makes this rule into a loop. Since the +first alternative matches empty input, the loop can be executed zero or +more times. + +The parser function @code{yyparse} continues to process input until a +grammatical error is seen or the lexical analyzer says there are no more +input tokens; we will arrange for the latter to happen at end of file. 
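The remark that left recursion turns the rule into a loop can be made concrete with a few lines of C. This is a rough analogy, not what Bison generates: the hypothetical helper @code{count_lines} treats every newline as the end of one @code{line} and counts how often the @code{input line} alternative would apply.

```c
/* Hypothetical helper, for illustration only: counts how many times
   the `input: input line' alternative would be applied to TEXT,
   treating each newline as the end of one `line'.  */
int count_lines (const char *text)
{
  int lines = 0;                /* `input: empty' -- no lines yet */

  for (; *text != '\0'; text++)
    if (*text == '\n')
      lines++;                  /* `input: input line' -- one more */
  return lines;
}
```

An empty string yields zero, which corresponds to typing @kbd{Ctrl-d} immediately: the empty alternative matches and the loop body never runs.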
+ +@node Rpcalc Line, Rpcalc Expr, Rpcalc Input, Rpcalc Rules +@subsubsection Explanation of @code{line} + +Now consider the definition of @code{line}: + +@example +line: '\n' + | exp '\n' @{ printf ("\t%.10g\n", $1); @} +; +@end example + +The first alternative is a token which is a newline character; this means +that rpcalc accepts a blank line (and ignores it, since there is no +action). The second alternative is an expression followed by a newline. +This is the alternative that makes rpcalc useful. The semantic value of +the @code{exp} grouping is the value of @code{$1} because the @code{exp} in +question is the first symbol in the alternative. The action prints this +value, which is the result of the computation the user asked for. + +This action is unusual because it does not assign a value to @code{$$}. As +a consequence, the semantic value associated with the @code{line} is +uninitialized (its value will be unpredictable). This would be a bug if +that value were ever used, but we don't use it: once rpcalc has printed the +value of the user's input line, that value is no longer needed. + +@node Rpcalc Expr, , Rpcalc Line, Rpcalc Rules +@subsubsection Explanation of @code{exp} + +The @code{exp} grouping has several rules, one for each kind of expression. +The first rule handles the simplest expressions: those that are just numbers. +The second handles an addition-expression, which looks like two expressions +followed by a plus-sign. The third handles subtraction, and so on. + +@example +exp: NUM + | exp exp '+' @{ $$ = $1 + $2; @} + | exp exp '-' @{ $$ = $1 - $2; @} + @dots{} + ; +@end example + +We have used @samp{|} to join all the rules for @code{exp}, but we could +equally well have written them separately: + +@example +exp: NUM ; +exp: exp exp '+' @{ $$ = $1 + $2; @} ; +exp: exp exp '-' @{ $$ = $1 - $2; @} ; + @dots{} +@end example + +Most of the rules have actions that compute the value of the expression in +terms of the value of its parts.
For example, in the rule for addition, +@code{$1} refers to the first component @code{exp} and @code{$2} refers to +the second one. The third component, @code{'+'}, has no meaningful +associated semantic value, but if it had one you could refer to it as +@code{$3}. When @code{yyparse} recognizes a sum expression using this +rule, the sum of the two subexpressions' values is produced as the value of +the entire expression. @xref{Actions}. + +You don't have to give an action for every rule. When a rule has no +action, Bison by default copies the value of @code{$1} into @code{$$}. +This is what happens in the first rule (the one that uses @code{NUM}). + +The formatting shown here is the recommended convention, but Bison does +not require it. You can add or change whitespace as much as you wish. +For example, this: + +@example +exp : NUM | exp exp '+' @{$$ = $1 + $2; @} | @dots{} +@end example + +@noindent +means the same thing as this: + +@example +exp: NUM + | exp exp '+' @{ $$ = $1 + $2; @} + | @dots{} +@end example + +@noindent +The latter, however, is much more readable. + +@node Rpcalc Lexer, Rpcalc Main, Rpcalc Rules, RPN Calc +@subsection The @code{rpcalc} Lexical Analyzer +@cindex writing a lexical analyzer +@cindex lexical analyzer, writing + +The lexical analyzer's job is low-level parsing: converting characters or +sequences of characters into tokens. The Bison parser gets its tokens by +calling the lexical analyzer. @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. + +Only a simple lexical analyzer is needed for the RPN calculator. This +lexical analyzer skips blanks and tabs, then reads in numbers as +@code{double} and returns them as @code{NUM} tokens. Any other character +that isn't part of a number is a separate token. Note that the token-code +for such a single-character token is the character itself. + +The return value of the lexical analyzer function is a numeric code which +represents a token type. 
The same text used in Bison rules to stand for +this token type is also a C expression for the numeric code for the type. +This works in two ways. If the token type is a character literal, then its +numeric code is the ASCII code for that character; you can use the same +character literal in the lexical analyzer to express the number. If the +token type is an identifier, that identifier is defined by Bison as a C +macro whose definition is the appropriate number. In this example, +therefore, @code{NUM} becomes a macro for @code{yylex} to use. + +The semantic value of the token (if it has one) is stored into the global +variable @code{yylval}, which is where the Bison parser will look for it. +(The C data type of @code{yylval} is @code{YYSTYPE}, which was defined +at the beginning of the grammar; @pxref{Rpcalc Decls, ,Declarations for @code{rpcalc}}.) + +A token type code of zero is returned if the end-of-file is encountered. +(Bison recognizes any nonpositive value as indicating the end of the +input.) + +Here is the code for the lexical analyzer: + +@example +@group +/* Lexical analyzer returns a double floating point + number on the stack and the token NUM, or the ASCII + character read if not a number. Skips all blanks + and tabs, returns 0 for EOF. */ + +#include <ctype.h> +@end group + +@group +yylex () +@{ + int c; + + /* skip white space */ + while ((c = getchar ()) == ' ' || c == '\t') + ; +@end group +@group + /* process numbers */ + if (c == '.' || isdigit (c)) + @{ + ungetc (c, stdin); + scanf ("%lf", &yylval); + return NUM; + @} +@end group +@group + /* return end-of-file */ + if (c == EOF) + return 0; + /* return single chars */ + return c; +@} +@end group +@end example + +@node Rpcalc Main, Rpcalc Error, Rpcalc Lexer, RPN Calc +@subsection The Controlling Function +@cindex controlling function +@cindex main function in simple example + +In keeping with the spirit of this example, the controlling function is +kept to the bare minimum.
The only requirement is that it call +@code{yyparse} to start the process of parsing. + +@example +@group +main () +@{ + yyparse (); +@} +@end group +@end example + +@node Rpcalc Error, Rpcalc Gen, Rpcalc Main, RPN Calc +@subsection The Error Reporting Routine +@cindex error reporting routine + +When @code{yyparse} detects a syntax error, it calls the error reporting +function @code{yyerror} to print an error message (usually but not always +@code{"parse error"}). It is up to the programmer to supply @code{yyerror} +(@pxref{Interface, ,Parser C-Language Interface}), so here is the definition we will use: + +@example +@group +#include <stdio.h> + +yyerror (s) /* Called by yyparse on error */ + char *s; +@{ + printf ("%s\n", s); +@} +@end group +@end example + +After @code{yyerror} returns, the Bison parser may recover from the error +and continue parsing if the grammar contains a suitable error rule +(@pxref{Error Recovery}). Otherwise, @code{yyparse} returns nonzero. We +have not written any error rules in this example, so any invalid input will +cause the calculator program to exit. This is not clean behavior for a +real calculator, but it is adequate in the first example. + +@node Rpcalc Gen, Rpcalc Compile, Rpcalc Error, RPN Calc +@subsection Running Bison to Make the Parser +@cindex running Bison (introduction) + +Before running Bison to produce a parser, we need to decide how to arrange +all the source code in one or more source files. For such a simple example, +the easiest thing is to put everything in one file. The definitions of +@code{yylex}, @code{yyerror} and @code{main} go at the end, in the +``additional C code'' section of the file (@pxref{Grammar Layout, ,The Overall Layout of a Bison Grammar}). + +For a large project, you would probably have several source files, and use +@code{make} to arrange to recompile them.
+ +With all the source in a single file, you use the following command to +convert it into a parser file: + +@example +bison @var{file_name}.y +@end example + +@noindent +In this example the file was called @file{rpcalc.y} (for ``Reverse Polish +CALCulator''). Bison produces a file named @file{@var{file_name}.tab.c}, +removing the @samp{.y} from the original file name. The file output by +Bison contains the source code for @code{yyparse}. The additional +functions in the input file (@code{yylex}, @code{yyerror} and @code{main}) +are copied verbatim to the output. + +@node Rpcalc Compile, , Rpcalc Gen, RPN Calc +@subsection Compiling the Parser File +@cindex compiling the parser + +Here is how to compile and run the parser file: + +@example +@group +# @r{List files in current directory.} +% ls +rpcalc.tab.c rpcalc.y +@end group + +@group +# @r{Compile the Bison parser.} +# @r{@samp{-lm} tells compiler to search math library for @code{pow}.} +% cc rpcalc.tab.c -lm -o rpcalc +@end group + +@group +# @r{List files again.} +% ls +rpcalc rpcalc.tab.c rpcalc.y +@end group +@end example + +The file @file{rpcalc} now contains the executable code. Here is an +example session using @code{rpcalc}. + +@example +% rpcalc +4 9 + +13 +3 7 + 3 4 5 *+- +-13 +3 7 + 3 4 5 * + - n @r{Note the unary minus, @samp{n}} +13 +5 6 / 4 n + +-3.166666667 +3 4 ^ @r{Exponentiation} +81 +^D @r{End-of-file indicator} +% +@end example + +@node Infix Calc, Simple Error Recovery, RPN Calc, Examples +@section Infix Notation Calculator: @code{calc} +@cindex infix notation calculator +@cindex @code{calc} +@cindex calculator, infix notation + +We now modify rpcalc to handle infix operators instead of postfix. Infix +notation involves the concept of operator precedence and the need for +parentheses nested to arbitrary depth. Here is the Bison code for +@file{calc.y}, an infix desk-top calculator. 
+ +@example +/* Infix notation calculator--calc */ + +%@{ +#define YYSTYPE double +#include <math.h> +%@} + +/* BISON Declarations */ +%token NUM +%left '-' '+' +%left '*' '/' +%left NEG /* negation--unary minus */ +%right '^' /* exponentiation */ + +/* Grammar follows */ +%% +input: /* empty string */ + | input line +; + +line: '\n' + | exp '\n' @{ printf ("\t%.10g\n", $1); @} +; + +exp: NUM @{ $$ = $1; @} + | exp '+' exp @{ $$ = $1 + $3; @} + | exp '-' exp @{ $$ = $1 - $3; @} + | exp '*' exp @{ $$ = $1 * $3; @} + | exp '/' exp @{ $$ = $1 / $3; @} + | '-' exp %prec NEG @{ $$ = -$2; @} + | exp '^' exp @{ $$ = pow ($1, $3); @} + | '(' exp ')' @{ $$ = $2; @} +; +%% +@end example + +@noindent +The functions @code{yylex}, @code{yyerror} and @code{main} can be the same +as before. + +There are two important new features shown in this code. + +In the second section (Bison declarations), @code{%left} declares token +types and says they are left-associative operators. The declarations +@code{%left} and @code{%right} (right associativity) take the place of +@code{%token} which is used to declare a token type name without +associativity. (These tokens are single-character literals, which +ordinarily don't need to be declared. We declare them here to specify +the associativity.) + +Operator precedence is determined by the line ordering of the +declarations; the higher the line number of the declaration (lower on +the page or screen), the higher the precedence. Hence, exponentiation +has the highest precedence, unary minus (@code{NEG}) is next, followed +by @samp{*} and @samp{/}, and so on. @xref{Precedence, ,Operator Precedence}. + +The other important new feature is the @code{%prec} in the grammar section +for the unary minus operator. The @code{%prec} simply instructs Bison that +the rule @samp{| '-' exp} has the same precedence as @code{NEG}---in this +case the next-to-highest. @xref{Contextual Precedence, ,Context-Dependent Precedence}.
+ +Here is a sample run of @file{calc.y}: + +@need 500 +@example +% calc +4 + 4.5 - (34/(8*3+-3)) +6.880952381 +-56 + 2 +-54 +3 ^ 2 +9 +@end example + +@node Simple Error Recovery, Multi-function Calc, Infix Calc, Examples +@section Simple Error Recovery +@cindex error recovery, simple + +Up to this point, this manual has not addressed the issue of @dfn{error +recovery}---how to continue parsing after the parser detects a syntax +error. All we have handled is error reporting with @code{yyerror}. Recall +that by default @code{yyparse} returns after calling @code{yyerror}. This +means that an erroneous input line causes the calculator program to exit. +Now we show how to rectify this deficiency. + +The Bison language itself includes the reserved word @code{error}, which +may be included in the grammar rules. In the example below it has +been added to one of the alternatives for @code{line}: + +@example +@group +line: '\n' + | exp '\n' @{ printf ("\t%.10g\n", $1); @} + | error '\n' @{ yyerrok; @} +; +@end group +@end example + +This addition to the grammar allows for simple error recovery in the event +of a parse error. If an expression that cannot be evaluated is read, the +error will be recognized by the third rule for @code{line}, and parsing +will continue. (The @code{yyerror} function is still called upon to print +its message as well.) The action executes the statement @code{yyerrok}, a +macro defined automatically by Bison; its meaning is that error recovery is +complete (@pxref{Error Recovery}). Note the difference between +@code{yyerrok} and @code{yyerror}; neither one is a misprint.@refill + +This form of error recovery deals with syntax errors. There are other +kinds of errors; for example, division by zero, which raises an exception +signal that is normally fatal. 
A real calculator program must handle this +signal and use @code{longjmp} to return to @code{main} and resume parsing +input lines; it would also have to discard the rest of the current line of +input. We won't discuss this issue further because it is not specific to +Bison programs. + +@node Multi-function Calc, Exercises, Simple Error Recovery, Examples +@section Multi-Function Calculator: @code{mfcalc} +@cindex multi-function calculator +@cindex @code{mfcalc} +@cindex calculator, multi-function + +Now that the basics of Bison have been discussed, it is time to move on to +a more advanced problem. The above calculators provided only five +functions, @samp{+}, @samp{-}, @samp{*}, @samp{/} and @samp{^}. It would +be nice to have a calculator that provides other mathematical functions such +as @code{sin}, @code{cos}, etc. + +It is easy to add new operators to the infix calculator as long as they are +only single-character literals. The lexical analyzer @code{yylex} passes +back all non-number characters as tokens, so new grammar rules suffice for +adding a new operator. But we want something more flexible: built-in +functions whose syntax has this form: + +@example +@var{function_name} (@var{argument}) +@end example + +@noindent +At the same time, we will add memory to the calculator, by allowing you +to create named variables, store values in them, and use them later. +Here is a sample session with the multi-function calculator: + +@example +% mfcalc +pi = 3.141592653589 +3.1415926536 +sin(pi) +0.0000000000 +alpha = beta1 = 2.3 +2.3000000000 +alpha +2.3000000000 +ln(alpha) +0.8329091229 +exp(ln(beta1)) +2.3000000000 +% +@end example + +Note that multiple assignment and nested function calls are permitted. + +@menu +* Decl: Mfcalc Decl. Bison declarations for multi-function calculator. +* Rules: Mfcalc Rules. Grammar rules for the calculator. +* Symtab: Mfcalc Symtab. Symbol table management subroutines. 
+@end menu + +@node Mfcalc Decl, Mfcalc Rules, , Multi-function Calc +@subsection Declarations for @code{mfcalc} + +Here are the C and Bison declarations for the multi-function calculator. + +@smallexample +%@{ +#include <math.h> /* For math functions, cos(), sin(), etc. */ +#include "calc.h" /* Contains definition of `symrec' */ +%@} +%union @{ +double val; /* For returning numbers. */ +symrec *tptr; /* For returning symbol-table pointers */ +@} + +%token <val> NUM /* Simple double precision number */ +%token <tptr> VAR FNCT /* Variable and Function */ +%type <val> exp + +%right '=' +%left '-' '+' +%left '*' '/' +%left NEG /* Negation--unary minus */ +%right '^' /* Exponentiation */ + +/* Grammar follows */ + +%% +@end smallexample + +The above grammar introduces only two new features of the Bison language. +These features allow semantic values to have various data types +(@pxref{Multiple Types, ,More Than One Value Type}). + +The @code{%union} declaration specifies the entire list of possible types; +this is instead of defining @code{YYSTYPE}. The allowable types are now +double-floats (for @code{exp} and @code{NUM}) and pointers to entries in +the symbol table. @xref{Union Decl, ,The Collection of Value Types}. + +Since values can now have various types, it is necessary to associate a +type with each grammar symbol whose semantic value is used. These symbols +are @code{NUM}, @code{VAR}, @code{FNCT}, and @code{exp}. Their +declarations are augmented with information about their data type (placed +between angle brackets). + +The Bison construct @code{%type} is used for declaring nonterminal symbols, +just as @code{%token} is used for declaring token types. We have not used +@code{%type} before because nonterminal symbols are normally declared +implicitly by the rules that define them. But @code{exp} must be declared +explicitly so we can specify its value type. @xref{Type Decl, ,Nonterminal Symbols}.
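In C terms, the effect of the @code{%union} declaration can be pictured roughly as follows. This is a sketch under assumptions, not the text Bison emits: the @code{symrec} here is a cut-down stand-in for the one in @file{calc.h}, the token codes are illustrative values, and @code{return_number} and @code{return_variable} are hypothetical helpers showing which union member a lexical analyzer would fill in for each token type.

```c
/* Illustrative token codes; in a real program Bison defines them.  */
enum { NUM = 258, VAR = 259 };

/* Cut-down stand-in for the symrec declared in calc.h.  */
typedef struct symrec
{
  const char *name;
  double var;
} symrec;

/* Roughly what `%union' asks for: one type that can hold either kind
   of semantic value.  */
typedef union
{
  double val;                   /* for NUM and exp: the <val> tag */
  symrec *tptr;                 /* for VAR and FNCT: the <tptr> tag */
} YYSTYPE;

YYSTYPE yylval;

/* Hypothetical helper: hand a number to the parser.  */
int return_number (double d)
{
  yylval.val = d;
  return NUM;
}

/* Hypothetical helper: hand a symbol-table entry to the parser.  */
int return_variable (symrec *s)
{
  yylval.tptr = s;
  return VAR;
}
```

The name between angle brackets in a @code{%token} or @code{%type} declaration selects the union member, which is how @code{$1} in a rule for @code{VAR} can be used as a pointer while @code{$1} in a rule for @code{NUM} is a @code{double}.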
+ +@node Mfcalc Rules, Mfcalc Symtab, Mfcalc Decl, Multi-function Calc +@subsection Grammar Rules for @code{mfcalc} + +Here are the grammar rules for the multi-function calculator. +Most of them are copied directly from @code{calc}; three rules, +those which mention @code{VAR} or @code{FNCT}, are new. + +@smallexample +input: /* empty */ + | input line +; + +line: + '\n' + | exp '\n' @{ printf ("\t%.10g\n", $1); @} + | error '\n' @{ yyerrok; @} +; + +exp: NUM @{ $$ = $1; @} + | VAR @{ $$ = $1->value.var; @} + | VAR '=' exp @{ $$ = $3; $1->value.var = $3; @} + | FNCT '(' exp ')' @{ $$ = (*($1->value.fnctptr))($3); @} + | exp '+' exp @{ $$ = $1 + $3; @} + | exp '-' exp @{ $$ = $1 - $3; @} + | exp '*' exp @{ $$ = $1 * $3; @} + | exp '/' exp @{ $$ = $1 / $3; @} + | '-' exp %prec NEG @{ $$ = -$2; @} + | exp '^' exp @{ $$ = pow ($1, $3); @} + | '(' exp ')' @{ $$ = $2; @} +; +/* End of grammar */ +%% +@end smallexample + +@node Mfcalc Symtab, , Mfcalc Rules, Multi-function Calc +@subsection The @code{mfcalc} Symbol Table +@cindex symbol table example + +The multi-function calculator requires a symbol table to keep track of the +names and meanings of variables and functions. This doesn't affect the +grammar rules (except for the actions) or the Bison declarations, but it +requires some additional C functions for support. + +The symbol table itself consists of a linked list of records. Its +definition, which is kept in the header @file{calc.h}, is as follows. It +provides for either functions or variables to be placed in the table. + +@smallexample +@group +/* Data type for links in the chain of symbols. */ +struct symrec +@{ + char *name; /* name of symbol */ + int type; /* type of symbol: either VAR or FNCT */ + union @{ + double var; /* value of a VAR */ + double (*fnctptr)(); /* value of a FNCT */ + @} value; + struct symrec *next; /* link field */ +@}; +@end group + +@group +typedef struct symrec symrec; + +/* The symbol table: a chain of `struct symrec'. 
*/ +extern symrec *sym_table; + +symrec *putsym (); +symrec *getsym (); +@end group +@end smallexample + +The new version of @code{main} includes a call to @code{init_table}, a +function that initializes the symbol table. Here it is, and +@code{init_table} as well: + +@smallexample +@group +#include <stdio.h> + +main () +@{ + init_table (); + yyparse (); +@} +@end group + +@group +yyerror (s) /* Called by yyparse on error */ + char *s; +@{ + printf ("%s\n", s); +@} + +struct init +@{ + char *fname; + double (*fnct)(); +@}; +@end group + +@group +struct init arith_fncts[] + = @{ + "sin", sin, + "cos", cos, + "atan", atan, + "ln", log, + "exp", exp, + "sqrt", sqrt, + 0, 0 + @}; + +/* The symbol table: a chain of `struct symrec'. */ +symrec *sym_table = (symrec *)0; +@end group + +@group +init_table () /* puts arithmetic functions in table. */ +@{ + int i; + symrec *ptr; + for (i = 0; arith_fncts[i].fname != 0; i++) + @{ + ptr = putsym (arith_fncts[i].fname, FNCT); + ptr->value.fnctptr = arith_fncts[i].fnct; + @} +@} +@end group +@end smallexample + +By simply editing the initialization list and adding the necessary include +files, you can add additional functions to the calculator. + +Two important functions allow look-up and installation of symbols in the +symbol table. The function @code{putsym} is passed a name and the type +(@code{VAR} or @code{FNCT}) of the object to be installed. The object is +linked to the front of the list, and a pointer to the object is returned. +The function @code{getsym} is passed the name of the symbol to look up. If +found, a pointer to that symbol is returned; otherwise zero is returned. + +@smallexample +symrec * +putsym (sym_name,sym_type) + char *sym_name; + int sym_type; +@{ + symrec *ptr; + ptr = (symrec *) malloc (sizeof (symrec)); + ptr->name = (char *) malloc (strlen (sym_name) + 1); + strcpy (ptr->name,sym_name); + ptr->type = sym_type; + ptr->value.var = 0; /* set value to 0 even if fctn.
*/ + ptr->next = (struct symrec *)sym_table; + sym_table = ptr; + return ptr; +@} + +symrec * +getsym (sym_name) + char *sym_name; +@{ + symrec *ptr; + for (ptr = sym_table; ptr != (symrec *) 0; + ptr = (symrec *)ptr->next) + if (strcmp (ptr->name,sym_name) == 0) + return ptr; + return 0; +@} +@end smallexample + +The function @code{yylex} must now recognize variables, numeric values, and +the single-character arithmetic operators. Strings of alphanumeric +characters with a leading nondigit are recognized as either variables or +functions depending on what the symbol table says about them. + +The string is passed to @code{getsym} for look up in the symbol table. If +the name appears in the table, a pointer to its location and its type +(@code{VAR} or @code{FNCT}) is returned to @code{yyparse}. If it is not +already in the table, then it is installed as a @code{VAR} using +@code{putsym}. Again, a pointer and its type (which must be @code{VAR}) is +returned to @code{yyparse}.@refill + +No change is needed in the handling of numeric values and arithmetic +operators in @code{yylex}. + +@smallexample +@group +#include <ctype.h> +yylex () +@{ + int c; + + /* Ignore whitespace, get first nonwhite character. */ + while ((c = getchar ()) == ' ' || c == '\t'); + + if (c == EOF) + return 0; +@end group + +@group + /* Char starts a number => parse the number. */ + if (c == '.' || isdigit (c)) + @{ + ungetc (c, stdin); + scanf ("%lf", &yylval.val); + return NUM; + @} +@end group + +@group + /* Char starts an identifier => read the name. */ + if (isalpha (c)) + @{ + symrec *s; + static char *symbuf = 0; + static int length = 0; + int i; +@end group + +@group + /* Initially make the buffer long enough + for a 40-character symbol name. */ + if (length == 0) + length = 40, symbuf = (char *)malloc (length + 1); + + i = 0; + do +@end group +@group + @{ + /* If buffer is full, make it bigger.
*/ + if (i == length) + @{ + length *= 2; + symbuf = (char *)realloc (symbuf, length + 1); + @} + /* Add this character to the buffer. */ + symbuf[i++] = c; + /* Get another character. */ + c = getchar (); + @} +@end group +@group + while (c != EOF && isalnum (c)); + + ungetc (c, stdin); + symbuf[i] = '\0'; +@end group + +@group + s = getsym (symbuf); + if (s == 0) + s = putsym (symbuf, VAR); + yylval.tptr = s; + return s->type; + @} + + /* Any other character is a token by itself. */ + return c; +@} +@end group +@end smallexample + +This program is both powerful and flexible. You may easily add new +functions, and it is a simple job to modify this code to install predefined +variables such as @code{pi} or @code{e} as well. + +@node Exercises, , Multi-function Calc, Examples +@section Exercises +@cindex exercises + +@enumerate +@item +Add some new functions from @file{math.h} to the initialization list. + +@item +Add another array that contains constants and their values. Then +modify @code{init_table} to add these constants to the symbol table. +It will be easiest to give the constants type @code{VAR}. + +@item +Make the program report an error if the user refers to an +uninitialized variable in any way except to store a value in it. +@end enumerate + +@node Grammar File, Interface, Examples, Top +@chapter Bison Grammar Files + +Bison takes as input a context-free grammar specification and produces a +C-language function that recognizes correct instances of the grammar. + +The Bison grammar input file conventionally has a name ending in @samp{.y}. + +@menu +* Grammar Outline:: Overall layout of the grammar file. +* Symbols:: Terminal and nonterminal symbols. +* Rules:: How to write grammar rules. +* Recursion:: Writing recursive rules. +* Semantics:: Semantic values and actions. +* Declarations:: All kinds of Bison declarations are described here. +* Multiple Parsers:: Putting more than one Bison parser in one program. 
+@end menu + +@node Grammar Outline, Symbols, , Grammar File +@section Outline of a Bison Grammar + +A Bison grammar file has four main sections, shown here with the +appropriate delimiters: + +@example +%@{ +@var{C declarations} +%@} + +@var{Bison declarations} + +%% +@var{Grammar rules} +%% + +@var{Additional C code} +@end example + +Comments enclosed in @samp{/* @dots{} */} may appear in any of the sections. + +@menu +* C Declarations:: Syntax and usage of the C declarations section. +* Bison Declarations:: Syntax and usage of the Bison declarations section. +* Grammar Rules:: Syntax and usage of the grammar rules section. +* C Code:: Syntax and usage of the additional C code section. +@end menu + +@node C Declarations, Bison Declarations, , Grammar Outline +@subsection The C Declarations Section +@cindex C declarations section +@cindex declarations, C + +The @var{C declarations} section contains macro definitions and +declarations of functions and variables that are used in the actions in the +grammar rules. These are copied to the beginning of the parser file so +that they precede the definition of @code{yyparse}. You can use +@samp{#include} to get the declarations from a header file. If you don't +need any C declarations, you may omit the @samp{%@{} and @samp{%@}} +delimiters that bracket this section. + +@node Bison Declarations, Grammar Rules, C Declarations, Grammar Outline +@subsection The Bison Declarations Section +@cindex Bison declarations (introduction) +@cindex declarations, Bison (introduction) + +The @var{Bison declarations} section contains declarations that define +terminal and nonterminal symbols, specify precedence, and so on. +In some simple grammars you may not need any declarations. +@xref{Declarations, ,Bison Declarations}. 
+ +@node Grammar Rules, C Code, Bison Declarations, Grammar Outline +@subsection The Grammar Rules Section +@cindex grammar rules section +@cindex rules section for grammar + +The @dfn{grammar rules} section contains one or more Bison grammar +rules, and nothing else. @xref{Rules, ,Syntax of Grammar Rules}. + +There must always be at least one grammar rule, and the first +@samp{%%} (which precedes the grammar rules) may never be omitted even +if it is the first thing in the file. + +@node C Code, , Grammar Rules, Grammar Outline +@subsection The Additional C Code Section +@cindex additional C code section +@cindex C code, section for additional + +The @var{additional C code} section is copied verbatim to the end of +the parser file, just as the @var{C declarations} section is copied to +the beginning. This is the most convenient place to put anything +that you want to have in the parser file but which need not come before +the definition of @code{yyparse}. For example, the definitions of +@code{yylex} and @code{yyerror} often go here. @xref{Interface, ,Parser C-Language Interface}. + +If the last section is empty, you may omit the @samp{%%} that separates it +from the grammar rules. + +The Bison parser itself contains many static variables whose names start +with @samp{yy} and many macros whose names start with @samp{YY}. It is a +good idea to avoid using any such names (except those documented in this +manual) in the additional C code section of the grammar file. + +@node Symbols, Rules, Grammar Outline, Grammar File +@section Symbols, Terminal and Nonterminal +@cindex nonterminal symbol +@cindex terminal symbol +@cindex token type +@cindex symbol + +@dfn{Symbols} in Bison grammars represent the grammatical classifications +of the language. + +A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a +class of syntactically equivalent tokens. You use the symbol in grammar +rules to mean that a token in that class is allowed. 
The symbol is +represented in the Bison parser by a numeric code, and the @code{yylex} +function returns a token type code to indicate what kind of token has been +read. You don't need to know what the code value is; you can use the +symbol to stand for it. + +A @dfn{nonterminal symbol} stands for a class of syntactically equivalent +groupings. The symbol name is used in writing grammar rules. By convention, +it should be all lower case. + +Symbol names can contain letters, digits (not at the beginning), +underscores and periods. Periods make sense only in nonterminals. + +There are three ways of writing terminal symbols in the grammar: + +@itemize @bullet +@item +A @dfn{named token type} is written with an identifier, like an +identifier in C. By convention, it should be all upper case. Each +such name must be defined with a Bison declaration such as +@code{%token}. @xref{Token Decl, ,Token Type Names}. + +@item +@cindex character token +@cindex literal token +@cindex single-character literal +A @dfn{character token type} (or @dfn{literal character token}) is +written in the grammar using the same syntax used in C for character +constants; for example, @code{'+'} is a character token type. A +character token type doesn't need to be declared unless you need to +specify its semantic value data type (@pxref{Value Type, ,Data Types of +Semantic Values}), associativity, or precedence (@pxref{Precedence, +,Operator Precedence}). + +By convention, a character token type is used only to represent a +token that consists of that particular character. Thus, the token +type @code{'+'} is used to represent the character @samp{+} as a +token. Nothing enforces this convention, but if you depart from it, +your program will confuse other readers. 
+ +All the usual escape sequences used in character literals in C can be +used in Bison as well, but you must not use the null character as a +character literal because its ASCII code, zero, is the code @code{yylex} +returns for end-of-input (@pxref{Calling Convention, ,Calling Convention +for @code{yylex}}). + +@item +@cindex string token +@cindex literal string token +@cindex multi-character literal +A @dfn{literal string token} is written like a C string constant; for +example, @code{"<="} is a literal string token. A literal string token +doesn't need to be declared unless you need to specify its semantic +value data type (@pxref{Value Type}), associativity, or precedence +(@pxref{Precedence}). + +You can associate the literal string token with a symbolic name as an +alias, using the @code{%token} declaration (@pxref{Token Decl, ,Token +Declarations}). If you don't do that, the lexical analyzer has to +retrieve the token number for the literal string token from the +@code{yytname} table (@pxref{Calling Convention}). + +@strong{WARNING}: literal string tokens do not work in Yacc. + +By convention, a literal string token is used only to represent a token +that consists of that particular string. Thus, you should use the token +type @code{"<="} to represent the string @samp{<=} as a token. Bison +does not enforce this convention, but if you depart from it, people who +read your program will be confused. + +All the escape sequences used in string literals in C can be used in +Bison as well. A literal string token must contain two or more +characters; for a token containing just one character, use a character +token (see above). +@end itemize + +How you choose to write a terminal symbol has no effect on its +grammatical meaning. That depends only on where it appears in rules and +on when the parser function returns that symbol. + +The value returned by @code{yylex} is always one of the terminal symbols +(or 0 for end-of-input).
Whichever way you write the token type in the +grammar rules, you write it the same way in the definition of @code{yylex}. +The numeric code for a character token type is simply the ASCII code for +the character, so @code{yylex} can use the identical character constant to +generate the requisite code. Each named token type becomes a C macro in +the parser file, so @code{yylex} can use the name to stand for the code. +(This is why periods don't make sense in terminal symbols.) +@xref{Calling Convention, ,Calling Convention for @code{yylex}}. + +If @code{yylex} is defined in a separate file, you need to arrange for the +token-type macro definitions to be available there. Use the @samp{-d} +option when you run Bison, so that it will write these macro definitions +into a separate header file @file{@var{name}.tab.h} which you can include +in the other source files that need it. @xref{Invocation, ,Invoking Bison}. + +The symbol @code{error} is a terminal symbol reserved for error recovery +(@pxref{Error Recovery}); you shouldn't use it for any other purpose. +In particular, @code{yylex} should never return this value. + +@node Rules, Recursion, Symbols, Grammar File +@section Syntax of Grammar Rules +@cindex rule syntax +@cindex grammar rule syntax +@cindex syntax of grammar rules + +A Bison grammar rule has the following general form: + +@example +@group +@var{result}: @var{components}@dots{} + ; +@end group +@end example + +@noindent +where @var{result} is the nonterminal symbol that this rule describes +and @var{components} are various terminal and nonterminal symbols that +are put together by this rule (@pxref{Symbols}). + +For example, + +@example +@group +exp: exp '+' exp + ; +@end group +@end example + +@noindent +says that two groupings of type @code{exp}, with a @samp{+} token in between, +can be combined into a larger grouping of type @code{exp}. + +Whitespace in rules is significant only to separate symbols. You can add +extra whitespace as you wish. 
+ +Scattered among the components can be @var{actions} that determine +the semantics of the rule. An action looks like this: + +@example +@{@var{C statements}@} +@end example + +@noindent +Usually there is only one action and it follows the components. +@xref{Actions}. + +@findex | +Multiple rules for the same @var{result} can be written separately or can +be joined with the vertical-bar character @samp{|} as follows: + +@ifinfo +@example +@var{result}: @var{rule1-components}@dots{} + | @var{rule2-components}@dots{} + @dots{} + ; +@end example +@end ifinfo +@iftex +@example +@group +@var{result}: @var{rule1-components}@dots{} + | @var{rule2-components}@dots{} + @dots{} + ; +@end group +@end example +@end iftex + +@noindent +They are still considered distinct rules even when joined in this way. + +If @var{components} in a rule is empty, it means that @var{result} can +match the empty string. For example, here is how to define a +comma-separated sequence of zero or more @code{exp} groupings: + +@example +@group +expseq: /* empty */ + | expseq1 + ; +@end group + +@group +expseq1: exp + | expseq1 ',' exp + ; +@end group +@end example + +@noindent +It is customary to write a comment @samp{/* empty */} in each rule +with no components. + +@node Recursion, Semantics, Rules, Grammar File +@section Recursive Rules +@cindex recursive rule + +A rule is called @dfn{recursive} when its @var{result} nonterminal appears +also on its right hand side. Nearly all Bison grammars need to use +recursion, because that is the only way to define a sequence of any number +of somethings. Consider this recursive definition of a comma-separated +sequence of one or more expressions: + +@example +@group +expseq1: exp + | expseq1 ',' exp + ; +@end group +@end example + +@cindex left recursion +@cindex right recursion +@noindent +Since the recursive use of @code{expseq1} is the leftmost symbol in the +right hand side, we call this @dfn{left recursion}. 
By contrast, here +the same construct is defined using @dfn{right recursion}: + +@example +@group +expseq1: exp + | exp ',' expseq1 + ; +@end group +@end example + +@noindent +Any kind of sequence can be defined using either left recursion or +right recursion, but you should always use left recursion, because it +can parse a sequence of any number of elements with bounded stack +space. Right recursion uses up space on the Bison stack in proportion +to the number of elements in the sequence, because all the elements +must be shifted onto the stack before the rule can be applied even +once. @xref{Algorithm, ,The Bison Parser Algorithm }, for +further explanation of this. + +@cindex mutual recursion +@dfn{Indirect} or @dfn{mutual} recursion occurs when the result of the +rule does not appear directly on its right hand side, but does appear +in rules for other nonterminals which do appear on its right hand +side. + +For example: + +@example +@group +expr: primary + | primary '+' primary + ; +@end group + +@group +primary: constant + | '(' expr ')' + ; +@end group +@end example + +@noindent +defines two mutually-recursive nonterminals, since each refers to the +other. + +@node Semantics, Declarations, Recursion, Grammar File +@section Defining Language Semantics +@cindex defining language semantics +@cindex language semantics, defining + +The grammar rules for a language determine only the syntax. The semantics +are determined by the semantic values associated with various tokens and +groupings, and by the actions taken when various groupings are recognized. + +For example, the calculator calculates properly because the value +associated with each expression is the proper number; it adds properly +because the action for the grouping @w{@samp{@var{x} + @var{y}}} is to add +the numbers associated with @var{x} and @var{y}. + +@menu +* Value Type:: Specifying one data type for all semantic values. +* Multiple Types:: Specifying several alternative data types. 
+* Actions:: An action is the semantic definition of a grammar rule. +* Action Types:: Specifying data types for actions to operate on. +* Mid-Rule Actions:: Most actions go at the end of a rule. + This says when, why and how to use the exceptional + action in the middle of a rule. +@end menu + +@node Value Type, Multiple Types, , Semantics +@subsection Data Types of Semantic Values +@cindex semantic value type +@cindex value type, semantic +@cindex data types of semantic values +@cindex default data type + +In a simple program it may be sufficient to use the same data type for +the semantic values of all language constructs. This was true in the +RPN and infix calculator examples (@pxref{RPN Calc, ,Reverse Polish Notation Calculator}). + +Bison's default is to use type @code{int} for all semantic values. To +specify some other type, define @code{YYSTYPE} as a macro, like this: + +@example +#define YYSTYPE double +@end example + +@noindent +This macro definition must go in the C declarations section of the grammar +file (@pxref{Grammar Outline, ,Outline of a Bison Grammar}). + +@node Multiple Types, Actions, Value Type, Semantics +@subsection More Than One Value Type + +In most programs, you will need different data types for different kinds +of tokens and groupings. For example, a numeric constant may need type +@code{int} or @code{long}, while a string constant needs type @code{char *}, +and an identifier might need a pointer to an entry in the symbol table. + +To use more than one data type for semantic values in one parser, Bison +requires you to do two things: + +@itemize @bullet +@item +Specify the entire collection of possible data types, with the +@code{%union} Bison declaration (@pxref{Union Decl, ,The Collection of Value Types}). + +@item +Choose one of those types for each symbol (terminal or nonterminal) +for which semantic values are used. 
This is done for tokens with the +@code{%token} Bison declaration (@pxref{Token Decl, ,Token Type Names}) and for groupings +with the @code{%type} Bison declaration (@pxref{Type Decl, ,Nonterminal Symbols}). +@end itemize + +@node Actions, Action Types, Multiple Types, Semantics +@subsection Actions +@cindex action +@vindex $$ +@vindex $@var{n} + +An action accompanies a syntactic rule and contains C code to be executed +each time an instance of that rule is recognized. The task of most actions +is to compute a semantic value for the grouping built by the rule from the +semantic values associated with tokens or smaller groupings. + +An action consists of C statements surrounded by braces, much like a +compound statement in C. It can be placed at any position in the rule; it +is executed at that position. Most rules have just one action at the end +of the rule, following all the components. Actions in the middle of a rule +are tricky and used only for special purposes (@pxref{Mid-Rule Actions, ,Actions in Mid-Rule}). + +The C code in an action can refer to the semantic values of the components +matched by the rule with the construct @code{$@var{n}}, which stands for +the value of the @var{n}th component. The semantic value for the grouping +being constructed is @code{$$}. (Bison translates both of these constructs +into array element references when it copies the actions into the parser +file.) + +Here is a typical example: + +@example +@group +exp: @dots{} + | exp '+' exp + @{ $$ = $1 + $3; @} +@end group +@end example + +@noindent +This rule constructs an @code{exp} from two smaller @code{exp} groupings +connected by a plus-sign token. In the action, @code{$1} and @code{$3} +refer to the semantic values of the two component @code{exp} groupings, +which are the first and third symbols on the right hand side of the rule. +The sum is stored into @code{$$} so that it becomes the semantic value of +the addition-expression just recognized by the rule. 
If there were a +useful semantic value associated with the @samp{+} token, it could be +referred to as @code{$2}.@refill + +@cindex default action +If you don't specify an action for a rule, Bison supplies a default: +@w{@code{$$ = $1}.} Thus, the value of the first symbol in the rule becomes +the value of the whole rule. Of course, the default rule is valid only +if the two data types match. There is no meaningful default action for +an empty rule; every empty rule must have an explicit action unless the +rule's value does not matter. + +@code{$@var{n}} with @var{n} zero or negative is allowed for reference +to tokens and groupings on the stack @emph{before} those that match the +current rule. This is a very risky practice, and to use it reliably +you must be certain of the context in which the rule is applied. Here +is a case in which you can use this reliably: + +@example +@group +foo: expr bar '+' expr @{ @dots{} @} + | expr bar '-' expr @{ @dots{} @} + ; +@end group + +@group +bar: /* empty */ + @{ previous_expr = $0; @} + ; +@end group +@end example + +As long as @code{bar} is used only in the fashion shown here, @code{$0} +always refers to the @code{expr} which precedes @code{bar} in the +definition of @code{foo}. + +@node Action Types, Mid-Rule Actions, Actions, Semantics +@subsection Data Types of Values in Actions +@cindex action data types +@cindex data types in actions + +If you have chosen a single data type for semantic values, the @code{$$} +and @code{$@var{n}} constructs always have that data type. + +If you have used @code{%union} to specify a variety of data types, then you +must declare a choice among these types for each terminal or nonterminal +symbol that can have a semantic value. Then each time you use @code{$$} or +@code{$@var{n}}, its data type is determined by which symbol it refers to +in the rule. 
In this example,@refill + +@example +@group +exp: @dots{} + | exp '+' exp + @{ $$ = $1 + $3; @} +@end group +@end example + +@noindent +@code{$1} and @code{$3} refer to instances of @code{exp}, so they both +have the data type declared for the nonterminal symbol @code{exp}. If +@code{$2} were used, it would have the data type declared for the +terminal symbol @code{'+'}, whatever that might be.@refill + +Alternatively, you can specify the data type when you refer to the value, +by inserting @samp{<@var{type}>} after the @samp{$} at the beginning of the +reference. For example, if you have defined types as shown here: + +@example +@group +%union @{ + int itype; + double dtype; +@} +@end group +@end example + +@noindent +then you can write @code{$<itype>1} to refer to the first subunit of the +rule as an integer, or @code{$<dtype>1} to refer to it as a double. + +@node Mid-Rule Actions, , Action Types, Semantics +@subsection Actions in Mid-Rule +@cindex actions in mid-rule +@cindex mid-rule actions + +Occasionally it is useful to put an action in the middle of a rule. +These actions are written just like usual end-of-rule actions, but they +are executed before the parser even recognizes the following components. + +A mid-rule action may refer to the components preceding it using +@code{$@var{n}}, but it may not refer to subsequent components because +it is run before they are parsed. + +The mid-rule action itself counts as one of the components of the rule. +This makes a difference when there is another action later in the same rule +(and usually there is another at the end): you have to count the actions +along with the symbols when working out which number @var{n} to use in +@code{$@var{n}}. + +The mid-rule action can also have a semantic value. The action can set +its value with an assignment to @code{$$}, and actions later in the rule +can refer to the value using @code{$@var{n}}.
Since there is no symbol +to name the action, there is no way to declare a data type for the value +in advance, so you must use the @samp{$<@dots{}>} construct to specify a +data type each time you refer to this value. + +There is no way to set the value of the entire rule with a mid-rule +action, because assignments to @code{$$} do not have that effect. The +only way to set the value for the entire rule is with an ordinary action +at the end of the rule. + +Here is an example from a hypothetical compiler, handling a @code{let} +statement that looks like @samp{let (@var{variable}) @var{statement}} and +serves to create a variable named @var{variable} temporarily for the +duration of @var{statement}. To parse this construct, we must put +@var{variable} into the symbol table while @var{statement} is parsed, then +remove it afterward. Here is how it is done: + +@example +@group +stmt: LET '(' var ')' + @{ $<context>$ = push_context (); + declare_variable ($3); @} + stmt @{ $$ = $6; + pop_context ($<context>5); @} +@end group +@end example + +@noindent +As soon as @samp{let (@var{variable})} has been recognized, the first +action is run. It saves a copy of the current semantic context (the +list of accessible variables) as its semantic value, using alternative +@code{context} in the data-type union. Then it calls +@code{declare_variable} to add the new variable to that list. Once the +first action is finished, the embedded statement @code{stmt} can be +parsed. Note that the mid-rule action is component number 5, so the +@samp{stmt} is component number 6. + +After the embedded statement is parsed, its semantic value becomes the +value of the entire @code{let}-statement. Then the semantic value from the +earlier action is used to restore the prior list of variables. This +removes the temporary @code{let}-variable from the list so that it won't +appear to exist while the rest of the program is parsed.
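The manual names @code{push_context}, @code{declare_variable} and @code{pop_context} but does not show them. The following is only a minimal sketch of one way such helpers might be written, assuming that the semantic context is a linked list of variable names. The @code{struct var} layout, the global @code{context}, and the helper @code{is_declared} are all hypothetical and are not part of Bison.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical representation: the semantic context is a linked list
   of the variable names that are currently accessible.  */
struct var
{
  const char *name;
  struct var *next;
};

static struct var *context = NULL;      /* head of the current list */

/* Return a handle on the current context so it can be restored later.  */
struct var *
push_context (void)
{
  return context;
}

/* Make NAME accessible by adding it to the front of the list.  */
void
declare_variable (const char *name)
{
  struct var *v = malloc (sizeof *v);
  if (v == NULL)
    abort ();
  v->name = name;
  v->next = context;
  context = v;
}

/* Discard everything declared since SAVED was taken.  SAVED must be a
   value previously returned by push_context.  */
void
pop_context (struct var *saved)
{
  while (context != saved)
    {
      struct var *dead = context;
      context = context->next;
      free (dead);
    }
}

/* Return 1 if NAME is accessible in the current context, else 0.  */
int
is_declared (const char *name)
{
  struct var *v;
  for (v = context; v != NULL; v = v->next)
    if (strcmp (v->name, name) == 0)
      return 1;
  return 0;
}
```

With helpers of this shape, the value saved by the mid-rule action is simply the old head of the list, which is why passing it to @code{pop_context} in the final action is enough to remove the temporary variable.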
+ +Taking action before a rule is completely recognized often leads to +conflicts since the parser must commit to a parse in order to execute the +action. For example, the following two rules, without mid-rule actions, +can coexist in a working parser because the parser can shift the open-brace +token and look at what follows before deciding whether there is a +declaration or not: + +@example +@group +compound: '@{' declarations statements '@}' + | '@{' statements '@}' + ; +@end group +@end example + +@noindent +But when we add a mid-rule action as follows, the rules become nonfunctional: + +@example +@group +compound: @{ prepare_for_local_variables (); @} + '@{' declarations statements '@}' +@end group +@group + | '@{' statements '@}' + ; +@end group +@end example + +@noindent +Now the parser is forced to decide whether to run the mid-rule action +when it has read no farther than the open-brace. In other words, it +must commit to using one rule or the other, without sufficient +information to do it correctly. (The open-brace token is what is called +the @dfn{look-ahead} token at this time, since the parser is still +deciding what to do about it. @xref{Look-Ahead, ,Look-Ahead Tokens}.) + +You might think that you could correct the problem by putting identical +actions into the two rules, like this: + +@example +@group +compound: @{ prepare_for_local_variables (); @} + '@{' declarations statements '@}' + | @{ prepare_for_local_variables (); @} + '@{' statements '@}' + ; +@end group +@end example + +@noindent +But this does not help, because Bison does not realize that the two actions +are identical. (Bison never tries to understand the C code in an action.) 
+ +If the grammar is such that a declaration can be distinguished from a +statement by the first token (which is true in C), then one solution which +does work is to put the action after the open-brace, like this: + +@example +@group +compound: '@{' @{ prepare_for_local_variables (); @} + declarations statements '@}' + | '@{' statements '@}' + ; +@end group +@end example + +@noindent +Now the first token of the following declaration or statement, +which would in any case tell Bison which rule to use, can still do so. + +Another solution is to bury the action inside a nonterminal symbol which +serves as a subroutine: + +@example +@group +subroutine: /* empty */ + @{ prepare_for_local_variables (); @} + ; + +@end group + +@group +compound: subroutine + '@{' declarations statements '@}' + | subroutine + '@{' statements '@}' + ; +@end group +@end example + +@noindent +Now Bison can execute the action in the rule for @code{subroutine} without +deciding which rule for @code{compound} it will eventually use. Note that +the action is now at the end of its rule. Any mid-rule action can be +converted to an end-of-rule action in this way, and this is what Bison +actually does to implement mid-rule actions. + +@node Declarations, Multiple Parsers, Semantics, Grammar File +@section Bison Declarations +@cindex declarations, Bison +@cindex Bison declarations + +The @dfn{Bison declarations} section of a Bison grammar defines the symbols +used in formulating the grammar and the data types of semantic values. +@xref{Symbols}. + +All token type names (but not single-character literal tokens such as +@code{'+'} and @code{'*'}) must be declared. Nonterminal symbols must be +declared if you need to specify which data type to use for the semantic +value (@pxref{Multiple Types, ,More Than One Value Type}). + +The first rule in the file also specifies the start symbol, by default. 
+If you want some other symbol to be the start symbol, you must declare +it explicitly (@pxref{Language and Grammar, ,Languages and Context-Free Grammars}). + +@menu +* Token Decl:: Declaring terminal symbols. +* Precedence Decl:: Declaring terminals with precedence and associativity. +* Union Decl:: Declaring the set of all semantic value types. +* Type Decl:: Declaring the choice of type for a nonterminal symbol. +* Expect Decl:: Suppressing warnings about shift/reduce conflicts. +* Start Decl:: Specifying the start symbol. +* Pure Decl:: Requesting a reentrant parser. +* Decl Summary:: Table of all Bison declarations. +@end menu + +@node Token Decl, Precedence Decl, , Declarations +@subsection Token Type Names +@cindex declaring token type names +@cindex token type names, declaring +@cindex declaring literal string tokens +@findex %token + +The basic way to declare a token type name (terminal symbol) is as follows: + +@example +%token @var{name} +@end example + +Bison will convert this into a @code{#define} directive in +the parser, so that the function @code{yylex} (if it is in this file) +can use the name @var{name} to stand for this token type's code. + +Alternatively, you can use @code{%left}, @code{%right}, or @code{%nonassoc} +instead of @code{%token}, if you wish to specify precedence. +@xref{Precedence Decl, ,Operator Precedence}. + +You can explicitly specify the numeric code for a token type by appending +an integer value in the field immediately following the token name: + +@example +%token NUM 300 +@end example + +@noindent +It is generally best, however, to let Bison choose the numeric codes for +all token types. Bison will automatically select codes that don't conflict +with each other or with ASCII characters. + +In the event that the stack type is a union, you must augment the +@code{%token} or other token declaration to include the data type +alternative delimited by angle-brackets (@pxref{Multiple Types, ,More Than One Value Type}). 
+ +For example: + +@example +@group +%union @{ /* define stack type */ + double val; + symrec *tptr; +@} +%token <val> NUM /* define token NUM and its type */ +@end group +@end example + +You can associate a literal string token with a token type name by +writing the literal string at the end of a @code{%token} +declaration which declares the name. For example: + +@example +%token arrow "=>" +@end example + +@noindent +For example, a grammar for the C language might specify these names with +equivalent literal string tokens: + +@example +%token OR "||" +%token LE 134 "<=" +%left OR "<=" +@end example + +@noindent +Once you equate the literal string and the token name, you can use them +interchangeably in further declarations or the grammar rules. The +@code{yylex} function can use the token name or the literal string to +obtain the token type code number (@pxref{Calling Convention}). + +@node Precedence Decl, Union Decl, Token Decl, Declarations +@subsection Operator Precedence +@cindex precedence declarations +@cindex declaring operator precedence +@cindex operator precedence, declaring + +Use the @code{%left}, @code{%right} or @code{%nonassoc} declaration to +declare a token and specify its precedence and associativity, all at +once. These are called @dfn{precedence declarations}. +@xref{Precedence, ,Operator Precedence}, for general information on operator precedence. + +The syntax of a precedence declaration is the same as that of +@code{%token}: either + +@example +%left @var{symbols}@dots{} +@end example + +@noindent +or + +@example +%left <@var{type}> @var{symbols}@dots{} +@end example + +And indeed any of these declarations serves the purposes of @code{%token}.
+But in addition, they specify the associativity and relative precedence for +all the @var{symbols}: + +@itemize @bullet +@item +The associativity of an operator @var{op} determines how repeated uses +of the operator nest: whether @samp{@var{x} @var{op} @var{y} @var{op} +@var{z}} is parsed by grouping @var{x} with @var{y} first or by +grouping @var{y} with @var{z} first. @code{%left} specifies +left-associativity (grouping @var{x} with @var{y} first) and +@code{%right} specifies right-associativity (grouping @var{y} with +@var{z} first). @code{%nonassoc} specifies no associativity, which +means that @samp{@var{x} @var{op} @var{y} @var{op} @var{z}} is +considered a syntax error. + +@item +The precedence of an operator determines how it nests with other operators. +All the tokens declared in a single precedence declaration have equal +precedence and nest together according to their associativity. +When two tokens declared in different precedence declarations associate, +the one declared later has the higher precedence and is grouped first. +@end itemize + +@node Union Decl, Type Decl, Precedence Decl, Declarations +@subsection The Collection of Value Types +@cindex declaring value types +@cindex value types, declaring +@findex %union + +The @code{%union} declaration specifies the entire collection of possible +data types for semantic values. The keyword @code{%union} is followed by a +pair of braces containing the same thing that goes inside a @code{union} in +C. + +For example: + +@example +@group +%union @{ + double val; + symrec *tptr; +@} +@end group +@end example + +@noindent +This says that the two alternative types are @code{double} and @code{symrec +*}. They are given names @code{val} and @code{tptr}; these names are used +in the @code{%token} and @code{%type} declarations to pick one of the types +for a terminal or nonterminal symbol (@pxref{Type Decl, ,Nonterminal Symbols}). 
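To make the effect of the @code{%union} example concrete, here is a rough C sketch of the type it corresponds to in the parser file and of how a typed value reference selects one member. The cut-down @code{symrec} record and the two accessor functions are hypothetical, introduced only for illustration; the exact text Bison emits may differ.

```c
/* Hypothetical, cut-down stand-in for the calculator's symbol record;
   only the fields needed for this illustration are present.  */
typedef struct symrec
{
  const char *name;
  double value;
} symrec;

/* Approximately the C type that the %union declaration describes:
   one member per alternative, named as in the declaration.  */
typedef union
{
  double val;
  symrec *tptr;
} YYSTYPE;

/* A reference to the value of a symbol declared with <val> amounts to
   reading the `val' member of its stack slot.  */
double
value_of_num (YYSTYPE slot)
{
  return slot.val;
}

/* A reference to the value of a symbol declared with <tptr> reads the
   `tptr' member instead; here we follow it to the stored number.  */
double
value_of_var (YYSTYPE slot)
{
  return slot.tptr->value;
}
```

Because every slot on the value stack has the same union type, the @code{<@var{type}>} given in a declaration is what tells Bison which member to use for a particular symbol.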
+ +Note that, unlike making a @code{union} declaration in C, you do not write +a semicolon after the closing brace. + +@node Type Decl, Expect Decl, Union Decl, Declarations +@subsection Nonterminal Symbols +@cindex declaring value types, nonterminals +@cindex value types, nonterminals, declaring +@findex %type + +@noindent +When you use @code{%union} to specify multiple value types, you must +declare the value type of each nonterminal symbol for which values are +used. This is done with a @code{%type} declaration, like this: + +@example +%type <@var{type}> @var{nonterminal}@dots{} +@end example + +@noindent +Here @var{nonterminal} is the name of a nonterminal symbol, and @var{type} +is the name given in the @code{%union} to the alternative that you want +(@pxref{Union Decl, ,The Collection of Value Types}). You can give any number of nonterminal symbols in +the same @code{%type} declaration, if they have the same value type. Use +spaces to separate the symbol names. + +You can also declare the value type of a terminal symbol. To do this, +use the same @code{<@var{type}>} construction in a declaration for the +terminal symbol. All kinds of token declarations allow +@code{<@var{type}>}. + +@node Expect Decl, Start Decl, Type Decl, Declarations +@subsection Suppressing Conflict Warnings +@cindex suppressing conflict warnings +@cindex preventing warnings about conflicts +@cindex warnings, preventing +@cindex conflicts, suppressing warnings of +@findex %expect + +Bison normally warns if there are any conflicts in the grammar +(@pxref{Shift/Reduce, ,Shift/Reduce Conflicts}), but most real grammars have harmless shift/reduce +conflicts which are resolved in a predictable way and would be difficult to +eliminate. It is desirable to suppress the warning about these conflicts +unless the number of conflicts changes. You can do this with the +@code{%expect} declaration. 
+ +The declaration looks like this: + +@example +%expect @var{n} +@end example + +Here @var{n} is a decimal integer. The declaration says there should be no +warning if there are @var{n} shift/reduce conflicts and no reduce/reduce +conflicts. The usual warning is given if there are either more or fewer +conflicts, or if there are any reduce/reduce conflicts. + +In general, using @code{%expect} involves these steps: + +@itemize @bullet +@item +Compile your grammar without @code{%expect}. Use the @samp{-v} option +to get a verbose list of where the conflicts occur. Bison will also +print the number of conflicts. + +@item +Check each of the conflicts to make sure that Bison's default +resolution is what you really want. If not, rewrite the grammar and +go back to the beginning. + +@item +Add an @code{%expect} declaration, copying the number @var{n} from the +number which Bison printed. +@end itemize + +Now Bison will stop annoying you about the conflicts you have checked, but +it will warn you again if changes in the grammar result in additional +conflicts. + +@node Start Decl, Pure Decl, Expect Decl, Declarations +@subsection The Start-Symbol +@cindex declaring the start symbol +@cindex start symbol, declaring +@cindex default start symbol +@findex %start + +Bison assumes by default that the start symbol for the grammar is the first +nonterminal specified in the grammar specification section. The programmer +may override this restriction with the @code{%start} declaration as follows: + +@example +%start @var{symbol} +@end example + +@node Pure Decl, Decl Summary, Start Decl, Declarations +@subsection A Pure (Reentrant) Parser +@cindex reentrant parser +@cindex pure parser +@findex %pure_parser + +A @dfn{reentrant} program is one which does not alter in the course of +execution; in other words, it consists entirely of @dfn{pure} (read-only) +code. 
Reentrancy is important whenever asynchronous execution is possible; +for example, a nonreentrant program may not be safe to call from a signal +handler. In systems with multiple threads of control, a nonreentrant +program must be called only within interlocks. + +The Bison parser is not normally a reentrant program, because it uses +statically allocated variables for communication with @code{yylex}. These +variables include @code{yylval} and @code{yylloc}. + +The Bison declaration @code{%pure_parser} says that you want the parser +to be reentrant. It looks like this: + +@example +%pure_parser +@end example + +The effect is that the two communication variables become local +variables in @code{yyparse}, and a different calling convention is used +for the lexical analyzer function @code{yylex}. @xref{Pure Calling, +,Calling Conventions for Pure Parsers}, for the details of this. The +variable @code{yynerrs} also becomes local in @code{yyparse} +(@pxref{Error Reporting, ,The Error Reporting Function @code{yyerror}}). +The convention for calling @code{yyparse} itself is unchanged. + +@node Decl Summary, , Pure Decl, Declarations +@subsection Bison Declaration Summary +@cindex Bison declaration summary +@cindex declaration summary +@cindex summary, Bison declaration + +Here is a summary of all Bison declarations: + +@table @code +@item %union +Declare the collection of data types that semantic values may have +(@pxref{Union Decl, ,The Collection of Value Types}). + +@item %token +Declare a terminal symbol (token type name) with no precedence +or associativity specified (@pxref{Token Decl, ,Token Type Names}). + +@item %right +Declare a terminal symbol (token type name) that is right-associative +(@pxref{Precedence Decl, ,Operator Precedence}). + +@item %left +Declare a terminal symbol (token type name) that is left-associative +(@pxref{Precedence Decl, ,Operator Precedence}). 
+ +@item %nonassoc +Declare a terminal symbol (token type name) that is nonassociative +(using it in a way that would be associative is a syntax error) +(@pxref{Precedence Decl, ,Operator Precedence}). + +@item %type +Declare the type of semantic values for a nonterminal symbol +(@pxref{Type Decl, ,Nonterminal Symbols}). + +@item %start +Specify the grammar's start symbol (@pxref{Start Decl, ,The Start-Symbol}). + +@item %expect +Declare the expected number of shift/reduce conflicts +(@pxref{Expect Decl, ,Suppressing Conflict Warnings}). + +@item %pure_parser +Request a pure (reentrant) parser program (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). + +@item %no_lines +Don't generate any @code{#line} preprocessor commands in the parser +file. Ordinarily Bison writes these commands in the parser file so that +the C compiler and debuggers will associate errors and object code with +your source file (the grammar file). This directive causes them to +associate errors with the parser file, treating it as an independent source +file in its own right. + +@item %raw +The output file @file{@var{name}.h} normally defines the tokens with +Yacc-compatible token numbers. If this option is specified, the +internal Bison numbers are used instead. (Yacc-compatible numbers start +at 257 except for single character tokens; Bison assigns token numbers +sequentially for all tokens starting at 3.) + +@item %token_table +Generate an array of token names in the parser file. The name of the +array is @code{yytname}; @code{yytname[@var{i}]} is the name of the +token whose internal Bison token code number is @var{i}. The first three +elements of @code{yytname} are always @code{"$"}, @code{"error"}, and +@code{"$illegal"}; after these come the symbols defined in the grammar +file.
+ +For single-character literal tokens and literal string tokens, the name +in the table includes the single-quote or double-quote characters: for +example, @code{"'+'"} is a single-character literal and @code{"\"<=\""} +is a literal string token. All the characters of the literal string +token appear verbatim in the string found in the table; even +double-quote characters are not escaped. For example, if the token +consists of three characters @samp{*"*}, its string in @code{yytname} +contains @samp{"*"*"}. (In C, that would be written as +@code{"\"*\"*\""}). + +When you specify @code{%token_table}, Bison also generates macro +definitions for macros @code{YYNTOKENS}, @code{YYNNTS}, +@code{YYNRULES}, and @code{YYNSTATES}: + +@table @code +@item YYNTOKENS +The highest token number, plus one. +@item YYNNTS +The number of non-terminal symbols. +@item YYNRULES +The number of grammar rules. +@item YYNSTATES +The number of parser states (@pxref{Parser States}). +@end table +@end table + +@node Multiple Parsers,, Declarations, Grammar File +@section Multiple Parsers in the Same Program + +Most programs that use Bison parse only one language and therefore contain +only one Bison parser. But what if you want to parse more than one +language with the same program? Then you need to avoid a name conflict +between different definitions of @code{yyparse}, @code{yylval}, and so on. + +The easy way to do this is to use the option @samp{-p @var{prefix}} +(@pxref{Invocation, ,Invoking Bison}). This renames the interface functions and +variables of the Bison parser to start with @var{prefix} instead of +@samp{yy}. You can use this to give each parser distinct names that do +not conflict. + +The precise list of symbols renamed is @code{yyparse}, @code{yylex}, +@code{yyerror}, @code{yynerrs}, @code{yylval}, @code{yychar} and +@code{yydebug}. For example, if you use @samp{-p c}, the names become +@code{cparse}, @code{clex}, and so on.
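The effect of such renaming can be imitated by hand with preprocessor macros. The sketch below is only an illustration under stated assumptions: two stub "parsers" share one source file, which real generated parsers would not do, the second prefix @samp{p} is hypothetical, and the return values are placeholders that tell the two functions apart.

```c
#include <assert.h>

/* First stub parser: as if generated with `-p c'. */
#define yyparse cparse
#define yynerrs cnerrs
int yynerrs;                       /* really defines cnerrs */
int yyparse (void)                 /* really defines cparse */
{
  yynerrs = 0;
  return 0;
}
#undef yyparse
#undef yynerrs

/* Second stub parser: hypothetical prefix `p'. */
#define yyparse pparse
#define yynerrs pnerrs
int yynerrs;                       /* really defines pnerrs */
int yyparse (void)                 /* really defines pparse */
{
  yynerrs = 0;
  return 1;                        /* placeholder status, distinct from cparse */
}
#undef yyparse
#undef yynerrs

/* Both parsers can now be linked and called in one program. */
int
run_both (void)
{
  int c_status = cparse ();
  int p_status = pparse ();
  assert (cnerrs == 0 && pnerrs == 0);
  return c_status == 0 && p_status == 1;
}
```

Because every use of @code{yyparse} and @code{yynerrs} inside each part is rewritten by the macros, the two definitions never collide at link time.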
+ +@strong{All the other variables and macros associated with Bison are not +renamed.} These others are not global; there is no conflict if the same +name is used in different parsers. For example, @code{YYSTYPE} is not +renamed, but defining this in different ways in different parsers causes +no trouble (@pxref{Value Type, ,Data Types of Semantic Values}). + +The @samp{-p} option works by adding macro definitions to the beginning +of the parser source file, defining @code{yyparse} as +@code{@var{prefix}parse}, and so on. This effectively substitutes one +name for the other in the entire parser file. + +@node Interface, Algorithm, Grammar File, Top +@chapter Parser C-Language Interface +@cindex C-language interface +@cindex interface + +The Bison parser is actually a C function named @code{yyparse}. Here we +describe the interface conventions of @code{yyparse} and the other +functions that it needs to use. + +Keep in mind that the parser uses many C identifiers starting with +@samp{yy} and @samp{YY} for internal purposes. If you use such an +identifier (aside from those in this manual) in an action or in additional +C code in the grammar file, you are likely to run into trouble. + +@menu +* Parser Function:: How to call @code{yyparse} and what it returns. +* Lexical:: You must supply a function @code{yylex} + which reads tokens. +* Error Reporting:: You must supply a function @code{yyerror}. +* Action Features:: Special features for use in actions. +@end menu + +@node Parser Function, Lexical, , Interface +@section The Parser Function @code{yyparse} +@findex yyparse + +You call the function @code{yyparse} to cause parsing to occur. This +function reads tokens, executes actions, and ultimately returns when it +encounters end-of-input or an unrecoverable syntax error. You can also +write an action which directs @code{yyparse} to return immediately without +reading further. 
+ +The value returned by @code{yyparse} is 0 if parsing was successful (return +is due to end-of-input). + +The value is 1 if parsing failed (return is due to a syntax error). + +In an action, you can cause immediate return from @code{yyparse} by using +these macros: + +@table @code +@item YYACCEPT +@findex YYACCEPT +Return immediately with value 0 (to report success). + +@item YYABORT +@findex YYABORT +Return immediately with value 1 (to report failure). +@end table + +@node Lexical, Error Reporting, Parser Function, Interface +@section The Lexical Analyzer Function @code{yylex} +@findex yylex +@cindex lexical analyzer + +The @dfn{lexical analyzer} function, @code{yylex}, recognizes tokens from +the input stream and returns them to the parser. Bison does not create +this function automatically; you must write it so that @code{yyparse} can +call it. The function is sometimes referred to as a lexical scanner. + +In simple programs, @code{yylex} is often defined at the end of the Bison +grammar file. If @code{yylex} is defined in a separate source file, you +need to arrange for the token-type macro definitions to be available there. +To do this, use the @samp{-d} option when you run Bison, so that it will +write these macro definitions into a separate header file +@file{@var{name}.tab.h} which you can include in the other source files +that need it. @xref{Invocation, ,Invoking Bison}.@refill + +@menu +* Calling Convention:: How @code{yyparse} calls @code{yylex}. +* Token Values:: How @code{yylex} must return the semantic value + of the token it has read. +* Token Positions:: How @code{yylex} must return the text position + (line number, etc.) of the token, if the + actions want that. +* Pure Calling:: How the calling convention differs + in a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}). 
+@end menu + +@node Calling Convention, Token Values, , Lexical +@subsection Calling Convention for @code{yylex} + +The value that @code{yylex} returns must be the numeric code for the type +of token it has just found, or 0 for end-of-input. + +When a token is referred to in the grammar rules by a name, that name +in the parser file becomes a C macro whose definition is the proper +numeric code for that token type. So @code{yylex} can use the name +to indicate that type. @xref{Symbols}. + +When a token is referred to in the grammar rules by a character literal, +the numeric code for that character is also the code for the token type. +So @code{yylex} can simply return that character code. The null character +must not be used this way, because its code is zero and that is what +signifies end-of-input. + +Here is an example showing these things: + +@example +yylex () +@{ + @dots{} + if (c == EOF) /* Detect end of file. */ + return 0; + @dots{} + if (c == '+' || c == '-') + return c; /* Assume token type for `+' is '+'. */ + @dots{} + return INT; /* Return the type of the token. */ + @dots{} +@} +@end example + +@noindent +This interface has been designed so that the output from the @code{lex} +utility can be used without change as the definition of @code{yylex}. + +If the grammar uses literal string tokens, there are two ways that +@code{yylex} can determine the token type codes for them: + +@itemize @bullet +@item +If the grammar defines symbolic token names as aliases for the +literal string tokens, @code{yylex} can use these symbolic names like +all others. In this case, the use of the literal string tokens in +the grammar file has no effect on @code{yylex}. + +@item +@code{yylex} can find the multi-character token in the @code{yytname} +table. The index of the token in the table is the token type's code. +The name of a multi-character token is recorded in @code{yytname} with a +double-quote, the token's characters, and another double-quote. 
The +token's characters are not escaped in any way; they appear verbatim in +the contents of the string in the table. + +Here's code for looking up a token in @code{yytname}, assuming that the +characters of the token are stored in @code{token_buffer}. + +@smallexample +for (i = 0; i < YYNTOKENS; i++) + @{ + if (yytname[i] != 0 + && yytname[i][0] == '"' + && ! strncmp (yytname[i] + 1, token_buffer, strlen (token_buffer)) + && yytname[i][strlen (token_buffer) + 1] == '"' + && yytname[i][strlen (token_buffer) + 2] == 0) + break; + @} +@end smallexample + +The @code{yytname} table is generated only if you use the +@code{%token_table} declaration. @xref{Decl Summary}. +@end itemize + +@node Token Values, Token Positions, Calling Convention, Lexical +@subsection Semantic Values of Tokens + +@vindex yylval +In an ordinary (nonreentrant) parser, the semantic value of the token must +be stored into the global variable @code{yylval}. When you are using +just one data type for semantic values, @code{yylval} has that type. +Thus, if the type is @code{int} (the default), you might write this in +@code{yylex}: + +@example +@group + @dots{} + yylval = value; /* Put value onto Bison stack. */ + return INT; /* Return the type of the token. */ + @dots{} +@end group +@end example + +When you are using multiple data types, @code{yylval}'s type is a union +made from the @code{%union} declaration (@pxref{Union Decl, ,The Collection of Value Types}). So when +you store a token's value, you must use the proper member of the union. +If the @code{%union} declaration looks like this: + +@example +@group +%union @{ + int intval; + double val; + symrec *tptr; +@} +@end group +@end example + +@noindent +then the code in @code{yylex} might look like this: + +@example +@group + @dots{} + yylval.intval = value; /* Put value onto Bison stack. */ + return INT; /* Return the type of the token.
*/ + @dots{} +@end group +@end example + +@node Token Positions, Pure Calling, Token Values, Lexical +@subsection Textual Positions of Tokens + +@vindex yylloc +If you are using the @samp{@@@var{n}}-feature (@pxref{Action Features, ,Special Features for Use in Actions}) in +actions to keep track of the textual locations of tokens and groupings, +then you must provide this information in @code{yylex}. The function +@code{yyparse} expects to find the textual location of a token just parsed +in the global variable @code{yylloc}. So @code{yylex} must store the +proper data in that variable. The value of @code{yylloc} is a structure +and you need only initialize the members that are going to be used by the +actions. The four members are called @code{first_line}, +@code{first_column}, @code{last_line} and @code{last_column}. Note that +the use of this feature makes the parser noticeably slower. + +@tindex YYLTYPE +The data type of @code{yylloc} has the name @code{YYLTYPE}. + +@node Pure Calling, , Token Positions, Lexical +@subsection Calling Conventions for Pure Parsers + +When you use the Bison declaration @code{%pure_parser} to request a +pure, reentrant parser, the global communication variables @code{yylval} +and @code{yylloc} cannot be used. (@xref{Pure Decl, ,A Pure (Reentrant) +Parser}.) In such parsers the two global variables are replaced by +pointers passed as arguments to @code{yylex}. You must declare them as +shown here, and pass the information back by storing it through those +pointers. + +@example +yylex (lvalp, llocp) + YYSTYPE *lvalp; + YYLTYPE *llocp; +@{ + @dots{} + *lvalp = value; /* Put value onto Bison stack. */ + return INT; /* Return the type of the token. */ + @dots{} +@} +@end example + +If the grammar file does not use the @samp{@@} constructs to refer to +textual positions, then the type @code{YYLTYPE} will not be defined. In +this case, omit the second argument; @code{yylex} will be called with +only one argument. 
+ +@vindex YYPARSE_PARAM +If you use a reentrant parser, you can optionally pass additional +parameter information to it in a reentrant way. To do so, define the +macro @code{YYPARSE_PARAM} as a variable name. This modifies the +@code{yyparse} function to accept one argument, of type @code{void *}, +with that name. + +When you call @code{yyparse}, pass the address of an object, casting the +address to @code{void *}. The grammar actions can refer to the contents +of the object by casting the pointer value back to its proper type and +then dereferencing it. Here's an example. Write this in the parser: + +@example +%@{ +struct parser_control +@{ + int nastiness; + int randomness; +@}; + +#define YYPARSE_PARAM parm +%@} +@end example + +@noindent +Then call the parser like this: + +@example +struct parser_control +@{ + int nastiness; + int randomness; +@}; + +@dots{} + +@{ + struct parser_control foo; + @dots{} /* @r{Store proper data in @code{foo}.} */ + value = yyparse ((void *) &foo); + @dots{} +@} +@end example + +@noindent +In the grammar actions, use expressions like this to refer to the data: + +@example +((struct parser_control *) parm)->randomness +@end example + +@vindex YYLEX_PARAM +If you wish to pass the additional parameter data to @code{yylex}, +define the macro @code{YYLEX_PARAM} just like @code{YYPARSE_PARAM}, as +shown here: + +@example +%@{ +struct parser_control +@{ + int nastiness; + int randomness; +@}; + +#define YYPARSE_PARAM parm +#define YYLEX_PARAM parm +%@} +@end example + +You should then define @code{yylex} to accept one additional +argument---the value of @code{parm}. (This makes either two or three +arguments in total, depending on whether an argument of type +@code{YYLTYPE} is passed.) You can declare the argument as a pointer to +the proper object type, or you can declare it as @code{void *} and +access the contents as shown above. 
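A three-argument @code{yylex} of the kind described above might look like the following sketch. It is written in prototype style rather than the older style used elsewhere in this manual, and several names are assumptions made for illustration: the @code{struct lexer_state} passed through @code{parm}, the token code 258 for @code{INT}, and the stand-in definitions of @code{YYSTYPE} and @code{YYLTYPE}, which a real parser file would supply itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef int YYSTYPE;               /* stand-in semantic value type */
typedef struct
{
  int first_line, first_column, last_line, last_column;
} YYLTYPE;                         /* the four members named in the text */

enum { INT = 258 };                /* hypothetical token code */

/* Hypothetical object whose address is passed as `parm'. */
struct lexer_state
{
  const char *cursor;              /* next unread input character */
  int line;
  int column;
};

int
yylex (YYSTYPE *lvalp, YYLTYPE *llocp, void *parm)
{
  struct lexer_state *st = (struct lexer_state *) parm;
  assert (st != NULL && st->cursor != NULL);

  /* Skip white space, keeping line and column up to date. */
  while (*st->cursor == ' ' || *st->cursor == '\t' || *st->cursor == '\n')
    {
      if (*st->cursor == '\n')
        {
          st->line++;
          st->column = 1;
        }
      else
        st->column++;
      st->cursor++;
    }

  if (*st->cursor == '\0')
    return 0;                      /* 0 signifies end-of-input */

  llocp->first_line = llocp->last_line = st->line;
  llocp->first_column = st->column;

  if (isdigit ((unsigned char) *st->cursor))
    {
      int value = 0;
      while (isdigit ((unsigned char) *st->cursor))
        {
          value = value * 10 + (*st->cursor - '0');
          st->cursor++;
          st->column++;
        }
      llocp->last_column = st->column - 1;
      *lvalp = value;              /* pass the value back through the pointer */
      return INT;
    }

  /* Any other character is its own token type. */
  llocp->last_column = st->column;
  st->column++;
  return (unsigned char) *st->cursor++;
}
```

All scanner state lives in the object reached through @code{parm}, so two parses can run at once as long as each is given its own @code{struct lexer_state}.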
+ +You can use @samp{%pure_parser} to request a reentrant parser without +also using @code{YYPARSE_PARAM}. Then you should call @code{yyparse} +with no arguments, as usual. + +@node Error Reporting, Action Features, Lexical, Interface +@section The Error Reporting Function @code{yyerror} +@cindex error reporting function +@findex yyerror +@cindex parse error +@cindex syntax error + +The Bison parser detects a @dfn{parse error} or @dfn{syntax error} +whenever it reads a token which cannot satisfy any syntax rule. An +action in the grammar can also explicitly proclaim an error, using the +macro @code{YYERROR} (@pxref{Action Features, ,Special Features for Use in Actions}). + +The Bison parser expects to report the error by calling an error +reporting function named @code{yyerror}, which you must supply. It is +called by @code{yyparse} whenever a syntax error is found, and it +receives one argument, a string describing the error. For a parse error, the string is normally +@w{@code{"parse error"}}. + +@findex YYERROR_VERBOSE +If you define the macro @code{YYERROR_VERBOSE} in the Bison declarations +section (@pxref{Bison Declarations, ,The Bison Declarations Section}), then Bison provides a more verbose +and specific error message string instead of just plain @w{@code{"parse +error"}}. It doesn't matter what definition you use for +@code{YYERROR_VERBOSE}, just whether you define it. + +The parser can detect one other kind of error: stack overflow. This +happens when the input contains constructions that are very deeply +nested. It isn't likely you will encounter this, since the Bison +parser extends its stack automatically up to a very large limit. But +if overflow happens, @code{yyparse} calls @code{yyerror} in the usual +fashion, except that the argument string is @w{@code{"parser stack +overflow"}}.
+ +The following definition suffices in simple programs: + +@example +@group +yyerror (s) + char *s; +@{ +@end group +@group + fprintf (stderr, "%s\n", s); +@} +@end group +@end example + +After @code{yyerror} returns to @code{yyparse}, the latter will attempt +error recovery if you have written suitable error recovery grammar rules +(@pxref{Error Recovery}). If recovery is impossible, @code{yyparse} will +immediately return 1. + +@vindex yynerrs +The variable @code{yynerrs} contains the number of syntax errors +encountered so far. Normally this variable is global; but if you +request a pure parser (@pxref{Pure Decl, ,A Pure (Reentrant) Parser}) then it is a local variable +which only the actions can access. + +@node Action Features, , Error Reporting, Interface +@section Special Features for Use in Actions +@cindex summary, action features +@cindex action features summary + +Here is a table of Bison constructs, variables and macros that +are useful in actions. + +@table @samp +@item $$ +Acts like a variable that contains the semantic value for the +grouping made by the current rule. @xref{Actions}. + +@item $@var{n} +Acts like a variable that contains the semantic value for the +@var{n}th component of the current rule. @xref{Actions}. + +@item $<@var{typealt}>$ +Like @code{$$} but specifies alternative @var{typealt} in the union +specified by the @code{%union} declaration. @xref{Action Types, ,Data Types of Values in Actions}. + +@item $<@var{typealt}>@var{n} +Like @code{$@var{n}} but specifies alternative @var{typealt} in the +union specified by the @code{%union} declaration. +@xref{Action Types, ,Data Types of Values in Actions}.@refill + +@item YYABORT; +Return immediately from @code{yyparse}, indicating failure. +@xref{Parser Function, ,The Parser Function @code{yyparse}}. + +@item YYACCEPT; +Return immediately from @code{yyparse}, indicating success. +@xref{Parser Function, ,The Parser Function @code{yyparse}}. 
+ +@item YYBACKUP (@var{token}, @var{value}); +@findex YYBACKUP +Unshift a token. This macro is allowed only for rules that reduce +a single value, and only when there is no look-ahead token. +It installs a look-ahead token with token type @var{token} and +semantic value @var{value}; then it discards the value that was +going to be reduced by this rule. + +If the macro is used when it is not valid, such as when there is +a look-ahead token already, then it reports a syntax error with +a message @samp{cannot back up} and performs ordinary error +recovery. + +In either case, the rest of the action is not executed. + +@item YYEMPTY +@vindex YYEMPTY +Value stored in @code{yychar} when there is no look-ahead token. + +@item YYERROR; +@findex YYERROR +Cause an immediate syntax error. This statement initiates error +recovery just as if the parser itself had detected an error; however, it +does not call @code{yyerror}, and does not print any message. If you +want to print an error message, call @code{yyerror} explicitly before +the @samp{YYERROR;} statement. @xref{Error Recovery}. + +@item YYRECOVERING +This macro stands for an expression that has the value 1 when the parser +is recovering from a syntax error, and 0 the rest of the time. +@xref{Error Recovery}. + +@item yychar +Variable containing the current look-ahead token. (In a pure parser, +this is actually a local variable within @code{yyparse}.) When there is +no look-ahead token, the value @code{YYEMPTY} is stored in the variable. +@xref{Look-Ahead, ,Look-Ahead Tokens}. + +@item yyclearin; +Discard the current look-ahead token. This is useful primarily in +error rules. @xref{Error Recovery}. + +@item yyerrok; +Resume generating error messages immediately for subsequent syntax +errors. This is useful primarily in error rules. +@xref{Error Recovery}. 
+ +@item @@@var{n} +@findex @@@var{n} +Acts like a structure variable containing information on the line +numbers and column numbers of the @var{n}th component of the current +rule. The structure has four members, like this: + +@example +struct @{ + int first_line, last_line; + int first_column, last_column; +@}; +@end example + +Thus, to get the starting line number of the third component, use +@samp{@@3.first_line}. + +In order for the members of this structure to contain valid information, +you must make @code{yylex} supply this information about each token. +If you need only certain members, then @code{yylex} need only fill in +those members. + +The use of this feature makes the parser noticeably slower. +@end table + +@node Algorithm, Error Recovery, Interface, Top +@chapter The Bison Parser Algorithm +@cindex Bison parser algorithm +@cindex algorithm of parser +@cindex shifting +@cindex reduction +@cindex parser stack +@cindex stack, parser + +As Bison reads tokens, it pushes them onto a stack along with their +semantic values. The stack is called the @dfn{parser stack}. Pushing a +token is traditionally called @dfn{shifting}. + +For example, suppose the infix calculator has read @samp{1 + 5 *}, with a +@samp{3} to come. The stack will have four elements, one for each token +that was shifted. + +But the stack does not always have an element for each token read. When +the last @var{n} tokens and groupings shifted match the components of a +grammar rule, they can be combined according to that rule. This is called +@dfn{reduction}. Those tokens and groupings are replaced on the stack by a +single grouping whose symbol is the result (left hand side) of that rule. +Running the rule's action is part of the process of reduction, because this +is what computes the semantic value of the resulting grouping. 
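The shifting and reduction just described can be sketched with a toy value stack. This is a hand-written illustration, not the code Bison generates: the types and the functions @code{shift} and @code{reduce_binary} are hypothetical, the stack holds only integers and operator characters, and the rule action is hard-wired for @samp{+} and @samp{*}.

```c
#include <assert.h>

enum kind { K_NUM, K_OP };

struct elem
{
  enum kind kind;
  int value;                 /* a number, or an operator character */
};

#define STACK_MAX 32

struct stack
{
  struct elem item[STACK_MAX];
  int depth;
};

/* Shifting: push one token and its semantic value. */
void
shift (struct stack *s, enum kind kind, int value)
{
  assert (s->depth < STACK_MAX);
  s->item[s->depth].kind = kind;
  s->item[s->depth].value = value;
  s->depth++;
}

/* Reduction by a rule of the form  expr: expr OP expr.
   If the top three elements match, replace them by one element
   holding the computed value and return 1; otherwise return 0. */
int
reduce_binary (struct stack *s, int op)
{
  struct elem *top;
  int result;

  if (s->depth < 3)
    return 0;
  top = &s->item[s->depth - 3];
  if (top[0].kind != K_NUM || top[1].kind != K_OP
      || top[1].value != op || top[2].kind != K_NUM)
    return 0;

  switch (op)                /* plays the role of the rule's action */
    {
    case '+': result = top[0].value + top[2].value; break;
    case '*': result = top[0].value * top[2].value; break;
    default:  return 0;
    }

  top[0].value = result;     /* three elements become one grouping */
  s->depth -= 2;
  return 1;
}
```

Shifting @samp{1 + 5 * 3} leaves five elements; reducing by the @samp{*} rule and then by the @samp{+} rule leaves a single element.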
+ +For example, if the infix calculator's parser stack contains this: + +@example +1 + 5 * 3 +@end example + +@noindent +and the next input token is a newline character, then the last three +elements can be reduced to 15 via the rule: + +@example +expr: expr '*' expr; +@end example + +@noindent +Then the stack contains just these three elements: + +@example +1 + 15 +@end example + +@noindent +At this point, another reduction can be made, resulting in the single value +16. Then the newline token can be shifted. + +The parser tries, by shifts and reductions, to reduce the entire input down +to a single grouping whose symbol is the grammar's start-symbol +(@pxref{Language and Grammar, ,Languages and Context-Free Grammars}). + +This kind of parser is known in the literature as a bottom-up parser. + +@menu +* Look-Ahead:: Parser looks one token ahead when deciding what to do. +* Shift/Reduce:: Conflicts: when either shifting or reduction is valid. +* Precedence:: Operator precedence works by resolving conflicts. +* Contextual Precedence:: When an operator's precedence depends on context. +* Parser States:: The parser is a finite-state-machine with stack. +* Reduce/Reduce:: When two rules are applicable in the same situation. +* Mystery Conflicts:: Reduce/reduce conflicts that look unjustified. +* Stack Overflow:: What happens when stack gets full. How to avoid it. +@end menu + +@node Look-Ahead, Shift/Reduce, , Algorithm +@section Look-Ahead Tokens +@cindex look-ahead token + +The Bison parser does @emph{not} always reduce immediately as soon as the +last @var{n} tokens and groupings match a rule. This is because such a +simple strategy is inadequate to handle most languages. Instead, when a +reduction is possible, the parser sometimes ``looks ahead'' at the next +token in order to decide what to do. + +When a token is read, it is not immediately shifted; first it becomes the +@dfn{look-ahead token}, which is not on the stack. 
Now the parser can +perform one or more reductions of tokens and groupings on the stack, while +the look-ahead token remains off to the side. When no more reductions +should take place, the look-ahead token is shifted onto the stack. This +does not mean that all possible reductions have been done; depending on the +token type of the look-ahead token, some rules may choose to delay their +application. + +Here is a simple case where look-ahead is needed. These three rules define +expressions which contain binary addition operators and postfix unary +factorial operators (@samp{!}), and allow parentheses for grouping. + +@example +@group +expr: term '+' expr + | term + ; +@end group + +@group +term: '(' expr ')' + | term '!' + | NUMBER + ; +@end group +@end example + +Suppose that the tokens @w{@samp{1 + 2}} have been read and shifted; what +should be done? If the following token is @samp{)}, then the first three +tokens must be reduced to form an @code{expr}. This is the only valid +course, because shifting the @samp{)} would produce a sequence of symbols +@w{@code{term ')'}}, and no rule allows this. + +If the following token is @samp{!}, then it must be shifted immediately so +that @w{@samp{2 !}} can be reduced to make a @code{term}. If instead the +parser were to reduce before shifting, @w{@samp{1 + 2}} would become an +@code{expr}. It would then be impossible to shift the @samp{!} because +doing so would produce on the stack the sequence of symbols @code{expr +'!'}. No rule allows that sequence. + +@vindex yychar +The current look-ahead token is stored in the variable @code{yychar}. +@xref{Action Features, ,Special Features for Use in Actions}. 
+ +@node Shift/Reduce, Precedence, Look-Ahead, Algorithm +@section Shift/Reduce Conflicts +@cindex conflicts +@cindex shift/reduce conflicts +@cindex dangling @code{else} +@cindex @code{else}, dangling + +Suppose we are parsing a language which has if-then and if-then-else +statements, with a pair of rules like this: + +@example +@group +if_stmt: + IF expr THEN stmt + | IF expr THEN stmt ELSE stmt + ; +@end group +@end example + +@noindent +Here we assume that @code{IF}, @code{THEN} and @code{ELSE} are +terminal symbols for specific keyword tokens. + +When the @code{ELSE} token is read and becomes the look-ahead token, the +contents of the stack (assuming the input is valid) are just right for +reduction by the first rule. But it is also legitimate to shift the +@code{ELSE}, because that would lead to eventual reduction by the second +rule. + +This situation, where either a shift or a reduction would be valid, is +called a @dfn{shift/reduce conflict}. Bison is designed to resolve +these conflicts by choosing to shift, unless otherwise directed by +operator precedence declarations. To see the reason for this, let's +contrast it with the other alternative. + +Since the parser prefers to shift the @code{ELSE}, the result is to attach +the else-clause to the innermost if-statement, making these two inputs +equivalent: + +@example +if x then if y then win (); else lose; + +if x then do; if y then win (); else lose; end; +@end example + +But if the parser chose to reduce when possible rather than shift, the +result would be to attach the else-clause to the outermost if-statement, +making these two inputs equivalent: + +@example +if x then if y then win (); else lose; + +if x then do; if y then win (); end; else lose; +@end example + +The conflict exists because the grammar as written is ambiguous: either +parsing of the simple nested if-statement is legitimate. 
The established +convention is that these ambiguities are resolved by attaching the +else-clause to the innermost if-statement; this is what Bison accomplishes +by choosing to shift rather than reduce. (It would ideally be cleaner to +write an unambiguous grammar, but that is very hard to do in this case.) +This particular ambiguity was first encountered in the specifications of +Algol 60 and is called the ``dangling @code{else}'' ambiguity. + +To avoid warnings from Bison about predictable, legitimate shift/reduce +conflicts, use the @code{%expect @var{n}} declaration. There will be no +warning as long as the number of shift/reduce conflicts is exactly @var{n}. +@xref{Expect Decl, ,Suppressing Conflict Warnings}. + +The definition of @code{if_stmt} above is solely to blame for the +conflict, but the conflict does not actually appear without additional +rules. Here is a complete Bison input file that actually manifests the +conflict: + +@example +@group +%token IF THEN ELSE variable +%% +@end group +@group +stmt: expr + | if_stmt + ; +@end group + +@group +if_stmt: + IF expr THEN stmt + | IF expr THEN stmt ELSE stmt + ; +@end group + +expr: variable + ; +@end example + +@node Precedence, Contextual Precedence, Shift/Reduce, Algorithm +@section Operator Precedence +@cindex operator precedence +@cindex precedence of operators + +Another situation where shift/reduce conflicts appear is in arithmetic +expressions. Here shifting is not always the preferred resolution; the +Bison declarations for operator precedence allow you to specify when to +shift and when to reduce. + +@menu +* Why Precedence:: An example showing why precedence is needed. +* Using Precedence:: How to specify precedence in Bison grammars. +* Precedence Examples:: How these features are used in the previous example. +* How Precedence:: How they work. 
+@end menu + +@node Why Precedence, Using Precedence, , Precedence +@subsection When Precedence is Needed + +Consider the following ambiguous grammar fragment (ambiguous because the +input @w{@samp{1 - 2 * 3}} can be parsed in two different ways): + +@example +@group +expr: expr '-' expr + | expr '*' expr + | expr '<' expr + | '(' expr ')' + @dots{} + ; +@end group +@end example + +@noindent +Suppose the parser has seen the tokens @samp{1}, @samp{-} and @samp{2}; +should it reduce them via the rule for the subtraction operator? It depends +on the next token. Of course, if the next token is @samp{)}, we must +reduce; shifting is invalid because no single rule can reduce the token +sequence @w{@samp{- 2 )}} or anything starting with that. But if the next +token is @samp{*} or @samp{<}, we have a choice: either shifting or +reduction would allow the parse to complete, but with different +results. + +To decide which one Bison should do, we must consider the +results. If the next operator token @var{op} is shifted, then it +must be reduced first in order to permit another opportunity to +reduce the difference. The result is (in effect) @w{@samp{1 - (2 +@var{op} 3)}}. On the other hand, if the subtraction is reduced +before shifting @var{op}, the result is @w{@samp{(1 - 2) @var{op} +3}}. Clearly, then, the choice of shift or reduce should depend +on the relative precedence of the operators @samp{-} and +@var{op}: @samp{*} should be shifted first, but not @samp{<}. + +@cindex associativity +What about input such as @w{@samp{1 - 2 - 5}}; should this be +@w{@samp{(1 - 2) - 5}} or should it be @w{@samp{1 - (2 - 5)}}? For +most operators we prefer the former, which is called @dfn{left +association}. The latter alternative, @dfn{right association}, is +desirable for assignment operators.
The choice of left or right +association is a matter of whether the parser chooses to shift or +reduce when the stack contains @w{@samp{1 - 2}} and the look-ahead +token is @samp{-}: shifting makes right-associativity. + +@node Using Precedence, Precedence Examples, Why Precedence, Precedence +@subsection Specifying Operator Precedence +@findex %left +@findex %right +@findex %nonassoc + +Bison allows you to specify these choices with the operator precedence +declarations @code{%left} and @code{%right}. Each such declaration +contains a list of tokens, which are operators whose precedence and +associativity is being declared. The @code{%left} declaration makes all +those operators left-associative and the @code{%right} declaration makes +them right-associative. A third alternative is @code{%nonassoc}, which +declares that it is a syntax error to find the same operator twice ``in a +row''. + +The relative precedence of different operators is controlled by the +order in which they are declared. The first @code{%left} or +@code{%right} declaration in the file declares the operators whose +precedence is lowest, the next such declaration declares the operators +whose precedence is a little higher, and so on. + +@node Precedence Examples, How Precedence, Using Precedence, Precedence +@subsection Precedence Examples + +In our example, we would want the following declarations: + +@example +%left '<' +%left '-' +%left '*' +@end example + +In a more complete example, which supports other operators as well, we +would declare them in groups of equal precedence. For example, @code{'+'} is +declared with @code{'-'}: + +@example +%left '<' '>' '=' NE LE GE +%left '+' '-' +%left '*' '/' +@end example + +@noindent +(Here @code{NE} and so on stand for the operators for ``not equal'' +and so on. We assume that these tokens are more than one character long +and therefore are represented by names, not character literals.) 
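To see what the three @code{%left} declarations for @code{'<'}, @code{'-'} and @code{'*'} buy you, it can help to mimic the shift-or-reduce choice by hand. The sketch below is not how Bison is implemented; it is a small operator-precedence evaluator, with the invented names @code{prec}, @code{apply} and @code{eval}, that assumes well-formed input made of single-digit numbers and those three operators. It reduces whenever the operator on the stack has precedence higher than or equal to the look-ahead operator (equal means reduce because all three are left-associative), and shifts otherwise.

```c
#include <ctype.h>

/* Precedence levels mirroring  %left '<'  %left '-'  %left '*':
   later declarations bind tighter.  0 means "not an operator".  */
static int
prec (int op)
{
  switch (op)
    {
    case '<': return 1;
    case '-': return 2;
    case '*': return 3;
    default:  return 0;
    }
}

static long
apply (int op, long a, long b)
{
  switch (op)
    {
    case '<': return a < b;
    case '-': return a - b;
    default:  return a * b;
    }
}

/* Evaluate e.g. "1-2*3".  No error checking: this is only a sketch.  */
long
eval (const char *s)
{
  long vals[64] = { 0 };
  int ops[64];
  int nv = 0, no = 0;

  for (;; s++)
    {
      if (isdigit ((unsigned char) *s))
        {
          vals[nv++] = *s - '0';        /* shift a number */
          continue;
        }
      if (*s == ' ')
        continue;
      /* *s is an operator or the terminating NUL (precedence 0).
         Reduce while the stacked operator wins or ties.  */
      while (no > 0 && prec (ops[no - 1]) >= prec (*s))
        {
          long b = vals[--nv];
          long a = vals[--nv];
          vals[nv++] = apply (ops[--no], a, b);
        }
      if (*s == '\0')
        return vals[0];
      ops[no++] = *s;                   /* shift the operator */
    }
}
```

Under these assumptions @code{eval ("1-2*3")} groups as @w{@samp{1 - (2 * 3)}} and @code{eval ("1-2-5")} groups as @w{@samp{(1 - 2) - 5}}. Changing @code{>=} to @code{>} in the loop would make every operator right-associative instead.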
+ +@node How Precedence, , Precedence Examples, Precedence +@subsection How Precedence Works + +The first effect of the precedence declarations is to assign precedence +levels to the terminal symbols declared. The second effect is to assign +precedence levels to certain rules: each rule gets its precedence from the +last terminal symbol mentioned in the components. (You can also specify +explicitly the precedence of a rule. @xref{Contextual Precedence, ,Context-Dependent Precedence}.) + +Finally, the resolution of conflicts works by comparing the +precedence of the rule being considered with that of the +look-ahead token. If the token's precedence is higher, the +choice is to shift. If the rule's precedence is higher, the +choice is to reduce. If they have equal precedence, the choice +is made based on the associativity of that precedence level. The +verbose output file made by @samp{-v} (@pxref{Invocation, ,Invoking Bison}) says +how each conflict was resolved. + +Not all rules and not all tokens have precedence. If either the rule or +the look-ahead token has no precedence, then the default is to shift. + +@node Contextual Precedence, Parser States, Precedence, Algorithm +@section Context-Dependent Precedence +@cindex context-dependent precedence +@cindex unary operator precedence +@cindex precedence, context-dependent +@cindex precedence, unary operator +@findex %prec + +Often the precedence of an operator depends on the context. This sounds +outlandish at first, but it is really very common. For example, a minus +sign typically has a very high precedence as a unary operator, and a +somewhat lower precedence (lower than multiplication) as a binary operator. + +The Bison precedence declarations, @code{%left}, @code{%right} and +@code{%nonassoc}, can only be used once for a given token; so a token has +only one precedence declared in this way. 
For context-dependent +precedence, you need to use an additional mechanism: the @code{%prec} +modifier for rules.@refill + +The @code{%prec} modifier declares the precedence of a particular rule by +specifying a terminal symbol whose precedence should be used for that rule. +It's not necessary for that symbol to appear otherwise in the rule. The +modifier's syntax is: + +@example +%prec @var{terminal-symbol} +@end example + +@noindent +and it is written after the components of the rule. Its effect is to +assign the rule the precedence of @var{terminal-symbol}, overriding +the precedence that would be deduced for it in the ordinary way. The +altered rule precedence then affects how conflicts involving that rule +are resolved (@pxref{Precedence, ,Operator Precedence}). + +Here is how @code{%prec} solves the problem of unary minus. First, declare +a precedence for a fictitious terminal symbol named @code{UMINUS}. There +are no tokens of this type, but the symbol serves to stand for its +precedence: + +@example +@dots{} +%left '+' '-' +%left '*' +%left UMINUS +@end example + +Now the precedence of @code{UMINUS} can be used in specific rules: + +@example +@group +exp: @dots{} + | exp '-' exp + @dots{} + | '-' exp %prec UMINUS +@end group +@end example + +@node Parser States, Reduce/Reduce, Contextual Precedence, Algorithm +@section Parser States +@cindex finite-state machine +@cindex parser state +@cindex state (of parser) + +The function @code{yyparse} is implemented using a finite-state machine. +The values pushed on the parser stack are not simply token type codes; they +represent the entire sequence of terminal and nonterminal symbols at or +near the top of the stack. The current state collects all the information +about previous input which is relevant to deciding what to do next. + +Each time a look-ahead token is read, the current parser state together +with the type of look-ahead token are looked up in a table. 
This table +entry can say, ``Shift the look-ahead token.'' In this case, it also +specifies the new parser state, which is pushed onto the top of the +parser stack. Or it can say, ``Reduce using rule number @var{n}.'' +This means that a certain number of tokens or groupings are taken off +the top of the stack, and replaced by one grouping. In other words, +that number of states are popped from the stack, and one new state is +pushed. + +There is one other alternative: the table can say that the look-ahead token +is erroneous in the current state. This causes error processing to begin +(@pxref{Error Recovery}). + +@node Reduce/Reduce, Mystery Conflicts, Parser States, Algorithm +@section Reduce/Reduce Conflicts +@cindex reduce/reduce conflict +@cindex conflicts, reduce/reduce + +A reduce/reduce conflict occurs if there are two or more rules that apply +to the same sequence of input. This usually indicates a serious error +in the grammar. + +For example, here is an erroneous attempt to define a sequence +of zero or more @code{word} groupings. + +@example +sequence: /* empty */ + @{ printf ("empty sequence\n"); @} + | maybeword + | sequence word + @{ printf ("added word %s\n", $2); @} + ; + +maybeword: /* empty */ + @{ printf ("empty maybeword\n"); @} + | word + @{ printf ("single word %s\n", $1); @} + ; +@end example + +@noindent +The error is an ambiguity: there is more than one way to parse a single +@code{word} into a @code{sequence}. It could be reduced to a +@code{maybeword} and then into a @code{sequence} via the second rule. +Alternatively, nothing-at-all could be reduced into a @code{sequence} +via the first rule, and this could be combined with the @code{word} +using the third rule for @code{sequence}. + +There is also more than one way to reduce nothing-at-all into a +@code{sequence}. This can be done directly via the first rule, +or indirectly via @code{maybeword} and then the second rule. 
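The two ways of parsing a single @code{word} run different actions, which can be made concrete by writing out what each parse would print. This is a hand-written illustration with invented function names, not output captured from a generated parser; it simply replays the @code{printf} calls from the actions in the grammar above, in the order each parse would execute them.

```c
#include <stdio.h>

/* Parse 1: word -> maybeword -> sequence.  Only the maybeword action
   for a single word runs; the rule  sequence: maybeword  has no action.  */
void
trace_via_maybeword (char *out, size_t size, const char *word)
{
  snprintf (out, size, "single word %s\n", word);
}

/* Parse 2: nothing -> sequence, then  sequence word -> sequence.
   The empty-sequence action runs first, then the added-word action.  */
void
trace_via_empty_sequence (char *out, size_t size, const char *word)
{
  snprintf (out, size, "empty sequence\nadded word %s\n", word);
}
```

Both functions describe a valid parse of the same input, yet they fill @code{out} with different text.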
+ +You might think that this is a distinction without a difference, because it +does not change whether any particular input is valid or not. But it does +affect which actions are run. One parsing order runs the second rule's +action; the other runs the first rule's action and the third rule's action. +In this example, the output of the program changes. + +Bison resolves a reduce/reduce conflict by choosing to use the rule that +appears first in the grammar, but it is very risky to rely on this. Every +reduce/reduce conflict must be studied and usually eliminated. Here is the +proper way to define @code{sequence}: + +@example +sequence: /* empty */ + @{ printf ("empty sequence\n"); @} + | sequence word + @{ printf ("added word %s\n", $2); @} + ; +@end example + +Here is another common error that yields a reduce/reduce conflict: + +@example +sequence: /* empty */ + | sequence words + | sequence redirects + ; + +words: /* empty */ + | words word + ; + +redirects:/* empty */ + | redirects redirect + ; +@end example + +@noindent +The intention here is to define a sequence which can contain either +@code{word} or @code{redirect} groupings. The individual definitions of +@code{sequence}, @code{words} and @code{redirects} are error-free, but the +three together make a subtle ambiguity: even an empty input can be parsed +in infinitely many ways! + +Consider: nothing-at-all could be a @code{words}. Or it could be two +@code{words} in a row, or three, or any number. It could equally well be a +@code{redirects}, or two, or any number. Or it could be a @code{words} +followed by three @code{redirects} and another @code{words}. And so on. + +Here are two ways to correct these rules. 
First, to make it a single level +of sequence: + +@example +sequence: /* empty */ + | sequence word + | sequence redirect + ; +@end example + +Second, to prevent either a @code{words} or a @code{redirects} +from being empty: + +@example +sequence: /* empty */ + | sequence words + | sequence redirects + ; + +words: word + | words word + ; + +redirects:redirect + | redirects redirect + ; +@end example + +@node Mystery Conflicts, Stack Overflow, Reduce/Reduce, Algorithm +@section Mysterious Reduce/Reduce Conflicts + +Sometimes reduce/reduce conflicts can occur that don't look warranted. +Here is an example: + +@example +@group +%token ID + +%% +def: param_spec return_spec ',' + ; +param_spec: + type + | name_list ':' type + ; +@end group +@group +return_spec: + type + | name ':' type + ; +@end group +@group +type: ID + ; +@end group +@group +name: ID + ; +name_list: + name + | name ',' name_list + ; +@end group +@end example + +It would seem that this grammar can be parsed with only a single token +of look-ahead: when a @code{param_spec} is being read, an @code{ID} is +a @code{name} if a comma or colon follows, or a @code{type} if another +@code{ID} follows. In other words, this grammar is LR(1). + +@cindex LR(1) +@cindex LALR(1) +However, Bison, like most parser generators, cannot actually handle all +LR(1) grammars. In this grammar, two contexts, that after an @code{ID} +at the beginning of a @code{param_spec} and likewise at the beginning of +a @code{return_spec}, are similar enough that Bison assumes they are the +same. They appear similar because the same set of rules would be +active---the rule for reducing to a @code{name} and that for reducing to +a @code{type}. Bison is unable to determine at that stage of processing +that the rules would require different look-ahead tokens in the two +contexts, so it makes a single parser state for them both. Combining +the two contexts causes a conflict later. 
In parser terminology, this +occurrence means that the grammar is not LALR(1). + +In general, it is better to fix deficiencies than to document them. But +this particular deficiency is intrinsically hard to fix; parser +generators that can handle LR(1) grammars are hard to write and tend to +produce parsers that are very large. In practice, Bison is more useful +as it is now. + +When the problem arises, you can often fix it by identifying the two +parser states that are being confused, and adding something to make them +look distinct. In the above example, adding one rule to +@code{return_spec} as follows makes the problem go away: + +@example +@group +%token BOGUS +@dots{} +%% +@dots{} +return_spec: + type + | name ':' type + /* This rule is never used. */ + | ID BOGUS + ; +@end group +@end example + +This corrects the problem because it introduces the possibility of an +additional active rule in the context after the @code{ID} at the beginning of +@code{return_spec}. This rule is not active in the corresponding context +in a @code{param_spec}, so the two contexts receive distinct parser states. +As long as the token @code{BOGUS} is never generated by @code{yylex}, +the added rule cannot alter the way actual input is parsed. + +In this particular example, there is another way to solve the problem: +rewrite the rule for @code{return_spec} to use @code{ID} directly +instead of via @code{name}. This also causes the two confusing +contexts to have different sets of active rules, because the one for +@code{return_spec} activates the altered rule for @code{return_spec} +rather than the one for @code{name}. 
+ +@example +param_spec: + type + | name_list ':' type + ; +return_spec: + type + | ID ':' type + ; +@end example + +@node Stack Overflow, , Mystery Conflicts, Algorithm +@section Stack Overflow, and How to Avoid It +@cindex stack overflow +@cindex parser stack overflow +@cindex overflow of parser stack + +The Bison parser stack can overflow if too many tokens are shifted and +not reduced. When this happens, the parser function @code{yyparse} +returns a nonzero value, pausing only to call @code{yyerror} to report +the overflow. + +@vindex YYMAXDEPTH +By defining the macro @code{YYMAXDEPTH}, you can control how deep the +parser stack can become before a stack overflow occurs. Define the +macro with a value that is an integer. This value is the maximum number +of tokens that can be shifted (and not reduced) before overflow. +It must be a constant expression whose value is known at compile time. + +The stack space allowed is not necessarily allocated. If you specify a +large value for @code{YYMAXDEPTH}, the parser actually allocates a small +stack at first, and then makes it bigger by stages as needed. This +increasing allocation happens automatically and silently. Therefore, +you do not need to make @code{YYMAXDEPTH} painfully small merely to save +space for ordinary inputs that do not need much stack. + +@cindex default stack limit +The default value of @code{YYMAXDEPTH}, if you do not define it, is +10000. + +@vindex YYINITDEPTH +You can control how much stack is allocated initially by defining the +macro @code{YYINITDEPTH}. This value too must be a compile-time +constant integer. The default is 200. + +@node Error Recovery, Context Dependency, Algorithm, Top +@chapter Error Recovery +@cindex error recovery +@cindex recovery from errors + +It is not usually acceptable to have a program terminate on a parse +error. 
For example, a compiler should recover sufficiently to parse the +rest of the input file and check it for errors; a calculator should accept +another expression. + +In a simple interactive command parser where each input is one line, it may +be sufficient to allow @code{yyparse} to return 1 on error and have the +caller ignore the rest of the input line when that happens (and then call +@code{yyparse} again). But this is inadequate for a compiler, because it +forgets all the syntactic context leading up to the error. A syntax error +deep within a function in the compiler input should not cause the compiler +to treat the following line like the beginning of a source file. + +@findex error +You can define how to recover from a syntax error by writing rules to +recognize the special token @code{error}. This is a terminal symbol that +is always defined (you need not declare it) and reserved for error +handling. The Bison parser generates an @code{error} token whenever a +syntax error happens; if you have provided a rule to recognize this token +in the current context, the parse can continue. + +For example: + +@example +stmnts: /* empty string */ + | stmnts '\n' + | stmnts exp '\n' + | stmnts error '\n' +@end example + +The fourth rule in this example says that an error followed by a newline +makes a valid addition to any @code{stmnts}. + +What happens if a syntax error occurs in the middle of an @code{exp}? The +error recovery rule, interpreted strictly, applies to the precise sequence +of a @code{stmnts}, an @code{error} and a newline. If an error occurs in +the middle of an @code{exp}, there will probably be some additional tokens +and subexpressions on the stack after the last @code{stmnts}, and there +will be tokens to read before the next newline. So the rule is not +applicable in the ordinary way. + +But Bison can force the situation to fit the rule, by discarding part of +the semantic context and part of the input. 
First it discards states and +objects from the stack until it gets back to a state in which the +@code{error} token is acceptable. (This means that the subexpressions +already parsed are discarded, back to the last complete @code{stmnts}.) At +this point the @code{error} token can be shifted. Then, if the old +look-ahead token is not acceptable to be shifted next, the parser reads +tokens and discards them until it finds a token which is acceptable. In +this example, Bison reads and discards input until the next newline +so that the fourth rule can apply. + +The choice of error rules in the grammar is a choice of strategies for +error recovery. A simple and useful strategy is simply to skip the rest of +the current input line or current statement if an error is detected: + +@example +stmnt: error ';' /* on error, skip until ';' is read */ +@end example + +It is also useful to recover to the matching close-delimiter of an +opening-delimiter that has already been parsed. Otherwise the +close-delimiter will probably appear to be unmatched, and generate another, +spurious error message: + +@example +primary: '(' expr ')' + | '(' error ')' + @dots{} + ; +@end example + +Error recovery strategies are necessarily guesses. When they guess wrong, +one syntax error often leads to another. In the above example, the error +recovery rule guesses that an error is due to bad input within one +@code{stmnt}. Suppose that instead a spurious semicolon is inserted in the +middle of a valid @code{stmnt}. After the error recovery rule recovers +from the first error, another syntax error will be found straightaway, +since the text following the spurious semicolon is also an invalid +@code{stmnt}. + +To prevent an outpouring of error messages, the parser will output no error +message for another syntax error that happens shortly after the first; only +after three consecutive input tokens have been successfully shifted will +error messages resume. 
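The suppression rule just described can be modelled with a single counter. The sketch below uses invented names (@code{struct recovery}, @code{on_syntax_error}, @code{on_shift}) and shows only the counting rule as stated in the text; it is not a copy of the generated parser's code, and it ignores details such as how the @code{error} token itself is handled.

```c
/* Number of successful shifts still required before error messages
   resume; 0 means messages are not being suppressed.  */
struct recovery
{
  int errstatus;
};

/* Call on each syntax error.  Returns 1 if a message should be
   printed, 0 if it is suppressed because recovery is under way.  */
int
on_syntax_error (struct recovery *r)
{
  int report = (r->errstatus == 0);
  r->errstatus = 3;     /* three shifts needed from now on */
  return report;
}

/* Call each time an input token is successfully shifted.  */
void
on_shift (struct recovery *r)
{
  if (r->errstatus > 0)
    r->errstatus--;
}
```

With this model, a second error arriving after only one or two shifts is silent, and it also restarts the count of three.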
+ +Note that rules which accept the @code{error} token may have actions, just +as any other rules can. + +@findex yyerrok +You can make error messages resume immediately by using the macro +@code{yyerrok} in an action. If you do this in the error rule's action, no +error messages will be suppressed. This macro requires no arguments; +@samp{yyerrok;} is a valid C statement. + +@findex yyclearin +The previous look-ahead token is reanalyzed immediately after an error. If +this is unacceptable, then the macro @code{yyclearin} may be used to clear +this token. Write the statement @samp{yyclearin;} in the error rule's +action. + +For example, suppose that on a parse error, an error handling routine is +called that advances the input stream to some point where parsing should +once again commence. The next symbol returned by the lexical scanner is +probably correct. The previous look-ahead token ought to be discarded +with @samp{yyclearin;}. + +@vindex YYRECOVERING +The macro @code{YYRECOVERING} stands for an expression that has the +value 1 when the parser is recovering from a syntax error, and 0 the +rest of the time. A value of 1 indicates that error messages are +currently suppressed for new syntax errors. + +@node Context Dependency, Debugging, Error Recovery, Top +@chapter Handling Context Dependencies + +The Bison paradigm is to parse tokens first, then group them into larger +syntactic units. In many languages, the meaning of a token is affected by +its context. Although this violates the Bison paradigm, certain techniques +(known as @dfn{kludges}) may enable you to write Bison parsers for such +languages. + +@menu +* Semantic Tokens:: Token parsing can depend on the semantic context. +* Lexical Tie-ins:: Token parsing can depend on the syntactic context. +* Tie-in Recovery:: Lexical tie-ins have implications for how + error recovery rules must be written. 
+@end menu + +(Actually, ``kludge'' means any technique that gets its job done but is +neither clean nor robust.) + +@node Semantic Tokens, Lexical Tie-ins, , Context Dependency +@section Semantic Info in Token Types + +The C language has a context dependency: the way an identifier is used +depends on what its current meaning is. For example, consider this: + +@example +foo (x); +@end example + +This looks like a function call statement, but if @code{foo} is a typedef +name, then this is actually a declaration of @code{x}. How can a Bison +parser for C decide how to parse this input? + +The method used in GNU C is to have two different token types, +@code{IDENTIFIER} and @code{TYPENAME}. When @code{yylex} finds an +identifier, it looks up the current declaration of the identifier in order +to decide which token type to return: @code{TYPENAME} if the identifier is +declared as a typedef, @code{IDENTIFIER} otherwise. + +The grammar rules can then express the context dependency by the choice of +token type to recognize. @code{IDENTIFIER} is accepted as an expression, +but @code{TYPENAME} is not. @code{TYPENAME} can start a declaration, but +@code{IDENTIFIER} cannot. In contexts where the meaning of the identifier +is @emph{not} significant, such as in declarations that can shadow a +typedef name, either @code{TYPENAME} or @code{IDENTIFIER} is +accepted---there is one rule for each of the two token types. + +This technique is simple to use if the decision of which kinds of +identifiers to allow is made at a place close to where the identifier is +parsed. 
But in C this is not always so: C allows a declaration to +redeclare a typedef name provided an explicit type has been specified +earlier: + +@example +typedef int foo, bar, lose; +static foo (bar); /* @r{redeclare @code{bar} as static variable} */ +static int foo (lose); /* @r{redeclare @code{foo} as function} */ +@end example + +Unfortunately, the name being declared is separated from the declaration +construct itself by a complicated syntactic structure---the ``declarator''. + +As a result, part of the Bison parser for C needs to be duplicated, with +all the nonterminal names changed: once for parsing a declaration in which +a typedef name can be redefined, and once for parsing a declaration in +which that can't be done. Here is a part of the duplication, with actions +omitted for brevity: + +@example +initdcl: + declarator maybeasm '=' + init + | declarator maybeasm + ; + +notype_initdcl: + notype_declarator maybeasm '=' + init + | notype_declarator maybeasm + ; +@end example + +@noindent +Here @code{initdcl} can redeclare a typedef name, but @code{notype_initdcl} +cannot. The distinction between @code{declarator} and +@code{notype_declarator} is the same sort of thing. + +There is some similarity between this technique and a lexical tie-in +(described next), in that information which alters the lexical analysis is +changed during parsing by other parts of the program. The difference is +that here the information is global, and is used for other purposes in the +program. A true lexical tie-in has a special-purpose flag controlled by +the syntactic context. + +@node Lexical Tie-ins, Tie-in Recovery, Semantic Tokens, Context Dependency +@section Lexical Tie-ins +@cindex lexical tie-in + +One way to handle context-dependency is the @dfn{lexical tie-in}: a flag +which is set by Bison actions, whose purpose is to alter the way tokens are +parsed. + +For example, suppose we have a language vaguely like C, but with a special +construct @samp{hex (@var{hex-expr})}.
After the keyword @code{hex} comes +an expression in parentheses in which all integers are hexadecimal. In +particular, the token @samp{a1b} must be treated as an integer rather than +as an identifier if it appears in that context. Here is how you can do it: + +@example +@group +%@{ +int hexflag; +%@} +%% +@dots{} +@end group +@group +expr: IDENTIFIER + | constant + | HEX '(' + @{ hexflag = 1; @} + expr ')' + @{ hexflag = 0; + $$ = $4; @} + | expr '+' expr + @{ $$ = make_sum ($1, $3); @} + @dots{} + ; +@end group + +@group +constant: + INTEGER + | STRING + ; +@end group +@end example + +@noindent +Here we assume that @code{yylex} looks at the value of @code{hexflag}; when +it is nonzero, all integers are parsed in hexadecimal, and tokens starting +with letters are parsed as integers if possible. + +The declaration of @code{hexflag} shown in the C declarations section of +the parser file is needed to make it accessible to the actions +(@pxref{C Declarations, ,The C Declarations Section}). You must also write the code in @code{yylex} +to obey the flag. + +@node Tie-in Recovery, , Lexical Tie-ins, Context Dependency +@section Lexical Tie-ins and Error Recovery + +Lexical tie-ins make strict demands on any error recovery rules you have. +@xref{Error Recovery}. + +The reason for this is that the purpose of an error recovery rule is to +abort the parsing of one construct and resume in some larger construct. +For example, in C-like languages, a typical error recovery rule is to skip +tokens until the next semicolon, and then start a new statement, like this: + +@example +stmt: expr ';' + | IF '(' expr ')' stmt @{ @dots{} @} + @dots{} + error ';' + @{ hexflag = 0; @} + ; +@end example + +If there is a syntax error in the middle of a @samp{hex (@var{expr})} +construct, this error rule will apply, and then the action for the +completed @samp{hex (@var{expr})} will never run. 
So @code{hexflag} would +remain set for the entire rest of the input, or until the next @code{hex} +keyword, causing identifiers to be misinterpreted as integers. + +To avoid this problem the error recovery rule itself clears @code{hexflag}. + +There may also be an error recovery rule that works within expressions. +For example, there could be a rule which applies within parentheses +and skips to the close-parenthesis: + +@example +@group +expr: @dots{} + | '(' expr ')' + @{ $$ = $2; @} + | '(' error ')' + @dots{} +@end group +@end example + +If this rule acts within the @code{hex} construct, it is not going to abort +that construct (since it applies to an inner level of parentheses within +the construct). Therefore, it should not clear the flag: the rest of +the @code{hex} construct should be parsed with the flag still in effect. + +What if there is an error recovery rule which might abort out of the +@code{hex} construct or might not, depending on circumstances? There is no +way you can write the action to determine whether a @code{hex} construct is +being aborted or not. So if you are using a lexical tie-in, you had better +make sure your error recovery rules are not of this kind. Each rule must +be such that you can be sure that it always will, or always won't, have to +clear the flag. + +@node Debugging, Invocation, Context Dependency, Top +@chapter Debugging Your Parser +@findex YYDEBUG +@findex yydebug +@cindex debugging +@cindex tracing the parser + +If a Bison grammar compiles properly but doesn't do what you want when it +runs, the @code{yydebug} parser-trace feature can help you figure out why. + +To enable compilation of trace facilities, you must define the macro +@code{YYDEBUG} when you compile the parser. You could use +@samp{-DYYDEBUG=1} as a compiler option or you could put @samp{#define +YYDEBUG 1} in the C declarations section of the grammar file +(@pxref{C Declarations, ,The C Declarations Section}). 
Alternatively, use the @samp{-t} option when +you run Bison (@pxref{Invocation, ,Invoking Bison}). We always define @code{YYDEBUG} so that +debugging is always possible. + +The trace facility uses @code{stderr}, so you must add @w{@code{#include +<stdio.h>}} to the C declarations section unless it is already there. + +Once you have compiled the program with trace facilities, the way to +request a trace is to store a nonzero value in the variable @code{yydebug}. +You can do this by making the C code do it (in @code{main}, perhaps), or +you can alter the value with a C debugger. + +Each step taken by the parser when @code{yydebug} is nonzero produces a +line or two of trace information, written on @code{stderr}. The trace +messages tell you these things: + +@itemize @bullet +@item +Each time the parser calls @code{yylex}, what kind of token was read. + +@item +Each time a token is shifted, the depth and complete contents of the +state stack (@pxref{Parser States}). + +@item +Each time a rule is reduced, which rule it is, and the complete contents +of the state stack afterward. +@end itemize + +To make sense of this information, it helps to refer to the listing file +produced by the Bison @samp{-v} option (@pxref{Invocation, ,Invoking Bison}). This file +shows the meaning of each state in terms of positions in various rules, and +also what each state will do with each possible input token. As you read +the successive trace messages, you can see that the parser is functioning +according to its specification in the listing file. Eventually you will +arrive at the place where something undesirable happens, and you will see +which parts of the grammar are to blame. + +The parser file is a C program and you can use C debuggers on it, but it's +not easy to interpret what it is doing. The parser function is a +finite-state machine interpreter, and aside from the actions it executes +the same code over and over. Only the values of variables show where in +the grammar it is working.
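One way to store a nonzero value in @code{yydebug} from C code is to let the program's own command line request a trace. The sketch below is an assumption-laden illustration: the option name @samp{--trace} and the function @code{run_parser} are invented, and @code{yydebug} and @code{yyparse} are replaced by stand-ins so that the fragment compiles by itself. In a real program you would delete the stand-ins and use the definitions from the Bison-generated parser file, compiled with @code{YYDEBUG} defined as nonzero.

```c
#include <string.h>

/* Stand-ins so this sketch is self-contained.  In a real program
   these come from the parser file:
       extern int yydebug;
       int yyparse (void);  */
int yydebug;
static int yyparse (void) { return 0; }

/* Run the parser, enabling the trace first if the (hypothetical)
   option "--trace" appears among the arguments.  Returns whatever
   yyparse returns.  */
int
run_parser (int argc, char **argv)
{
  int i;

  for (i = 1; i < argc; i++)
    if (strcmp (argv[i], "--trace") == 0)
      yydebug = 1;      /* nonzero requests trace output on stderr */

  return yyparse ();
}
```

A program's @code{main} could then simply @code{return run_parser (argc, argv);}. Keeping the switch at run time means the same binary can be used with or without tracing.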
+ +@findex YYPRINT +The debugging information normally gives the token type of each token +read, but not its semantic value. You can optionally define a macro +named @code{YYPRINT} to provide a way to print the value. If you define +@code{YYPRINT}, it should take three arguments. The parser will pass a +standard I/O stream, the numeric code for the token type, and the token +value (from @code{yylval}). + +Here is an example of @code{YYPRINT} suitable for the multi-function +calculator (@pxref{Mfcalc Decl, ,Declarations for @code{mfcalc}}): + +@smallexample +#define YYPRINT(file, type, value) yyprint (file, type, value) + +static void +yyprint (file, type, value) + FILE *file; + int type; + YYSTYPE value; +@{ + if (type == VAR) + fprintf (file, " %s", value.tptr->name); + else if (type == NUM) + fprintf (file, " %d", value.val); +@} +@end smallexample + +@node Invocation, Table of Symbols, Debugging, Top +@chapter Invoking Bison +@cindex invoking Bison +@cindex Bison invocation +@cindex options for invoking Bison + +The usual way to invoke Bison is as follows: + +@example +bison @var{infile} +@end example + +Here @var{infile} is the grammar file name, which usually ends in +@samp{.y}. The parser file's name is made by replacing the @samp{.y} +with @samp{.tab.c}. Thus, the @samp{bison foo.y} filename yields +@file{foo.tab.c}, and the @samp{bison hack/foo.y} filename yields +@file{hack/foo.tab.c}.@refill + +@menu +* Bison Options:: All the options described in detail, + in alphabetical order by short options. +* Option Cross Key:: Alphabetical list of long options. +* VMS Invocation:: Bison command syntax on VMS. +@end menu + +@node Bison Options, Option Cross Key, , Invocation +@section Bison Options + +Bison supports both traditional single-letter options and mnemonic long +option names. Long option names are indicated with @samp{--} instead of +@samp{-}. Abbreviations for option names are allowed as long as they +are unique. 
When a long option takes an argument, like +@samp{--file-prefix}, connect the option name and the argument with +@samp{=}. + +Here is a list of options that can be used with Bison, alphabetized by +short option. It is followed by a cross key alphabetized by long +option. + +@table @samp +@item -b @var{file-prefix} +@itemx --file-prefix=@var{prefix} +Specify a prefix to use for all Bison output file names. The names are +chosen as if the input file were named @file{@var{prefix}.c}. + +@item -d +@itemx --defines +Write an extra output file containing macro definitions for the token +type names defined in the grammar and the semantic value type +@code{YYSTYPE}, as well as a few @code{extern} variable declarations. + +If the parser output file is named @file{@var{name}.c} then this file +is named @file{@var{name}.h}.@refill + +This output file is essential if you wish to put the definition of +@code{yylex} in a separate source file, because @code{yylex} needs to +be able to refer to token type codes and the variable +@code{yylval}. @xref{Token Values, ,Semantic Values of Tokens}.@refill + +@item -l +@itemx --no-lines +Don't put any @code{#line} preprocessor commands in the parser file. +Ordinarily Bison puts them in the parser file so that the C compiler +and debuggers will associate errors with your source file, the +grammar file. This option causes them to associate errors with the +parser file, treating it as an independent source file in its own right. + +@item -n +@itemx --no-parser +Do not include any C code in the parser file; generate tables only. The +parser file contains just @code{#define} directives and static variable +declarations. + +This option also tells Bison to write the C code for the grammar actions +into a file named @file{@var{filename}.act}, in the form of a +brace-surrounded body fit for a @code{switch} statement. + +@item -o @var{outfile} +@itemx --output-file=@var{outfile} +Specify the name @var{outfile} for the parser file. 
+ +The other output files' names are constructed from @var{outfile} +as described under the @samp{-v} and @samp{-d} options. + +@item -p @var{prefix} +@itemx --name-prefix=@var{prefix} +Rename the external symbols used in the parser so that they start with +@var{prefix} instead of @samp{yy}. The precise list of symbols renamed +is @code{yyparse}, @code{yylex}, @code{yyerror}, @code{yynerrs}, +@code{yylval}, @code{yychar} and @code{yydebug}. + +For example, if you use @samp{-p c}, the names become @code{cparse}, +@code{clex}, and so on. + +@xref{Multiple Parsers, ,Multiple Parsers in the Same Program}. + +@item -r +@itemx --raw +Pretend that @code{%raw} was specified. @xref{Decl Summary}. + +@item -t +@itemx --debug +Output a definition of the macro @code{YYDEBUG} into the parser file, +so that the debugging facilities are compiled. @xref{Debugging, ,Debugging Your Parser}. + +@item -v +@itemx --verbose +Write an extra output file containing verbose descriptions of the +parser states and what is done for each type of look-ahead token in +that state. + +This file also describes all the conflicts, both those resolved by +operator precedence and the unresolved ones. + +The file's name is made by removing @samp{.tab.c} or @samp{.c} from +the parser output file name, and adding @samp{.output} instead.@refill + +Therefore, if the input file is @file{foo.y}, then the parser file is +called @file{foo.tab.c} by default. As a consequence, the verbose +output file is called @file{foo.output}.@refill + +@item -V +@itemx --version +Print the version number of Bison and exit. + +@item -h +@itemx --help +Print a summary of the command-line options to Bison and exit. + +@need 1750 +@item -y +@itemx --yacc +@itemx --fixed-output-files +Equivalent to @samp{-o y.tab.c}; the parser output file is called +@file{y.tab.c}, and the other outputs are called @file{y.output} and +@file{y.tab.h}. The purpose of this option is to imitate Yacc's output +file name conventions. 
Thus, the following shell script can substitute +for Yacc:@refill + +@example +bison -y $* +@end example +@end table + +@node Option Cross Key, VMS Invocation, Bison Options, Invocation +@section Option Cross Key + +Here is a list of options, alphabetized by long option, to help you find +the corresponding short option. + +@tex +\def\leaderfill{\leaders\hbox to 1em{\hss.\hss}\hfill} + +{\tt +\line{ --debug \leaderfill -t} +\line{ --defines \leaderfill -d} +\line{ --file-prefix \leaderfill -b} +\line{ --fixed-output-files \leaderfill -y} +\line{ --help \leaderfill -h} +\line{ --name-prefix \leaderfill -p} +\line{ --no-lines \leaderfill -l} +\line{ --no-parser \leaderfill -n} +\line{ --output-file \leaderfill -o} +\line{ --raw \leaderfill -r} +\line{ --token-table \leaderfill -k} +\line{ --verbose \leaderfill -v} +\line{ --version \leaderfill -V} +\line{ --yacc \leaderfill -y} +} +@end tex + +@ifinfo +@example +--debug -t +--defines -d +--file-prefix=@var{prefix} -b @var{file-prefix} +--fixed-output-files --yacc -y +--help -h +--name-prefix=@var{prefix} -p @var{name-prefix} +--no-lines -l +--no-parser -n +--output-file=@var{outfile} -o @var{outfile} +--raw -r +--token-table -k +--verbose -v +--version -V +@end example +@end ifinfo + +@node VMS Invocation, , Option Cross Key, Invocation +@section Invoking Bison under VMS +@cindex invoking Bison under VMS +@cindex VMS + +The command line syntax for Bison on VMS is a variant of the usual +Bison command syntax---adapted to fit VMS conventions. + +To find the VMS equivalent for any Bison option, start with the long +option, and substitute a @samp{/} for the leading @samp{--}, and +substitute a @samp{_} for each @samp{-} in the name of the long option. +For example, the following invocation under VMS: + +@example +bison /debug/name_prefix=bar foo.y +@end example + +@noindent +is equivalent to the following command under POSIX. 
+ +@example +bison --debug --name-prefix=bar foo.y +@end example + +The VMS file system does not permit filenames such as +@file{foo.tab.c}. In the above example, the output file +would instead be named @file{foo_tab.c}. + +@node Table of Symbols, Glossary, Invocation, Top +@appendix Bison Symbols +@cindex Bison symbols, table of +@cindex symbols in Bison, table of + +@table @code +@item error +A token name reserved for error recovery. This token may be used in +grammar rules so as to allow the Bison parser to recognize an error in +the grammar without halting the process. In effect, a sentence +containing an error may be recognized as valid. On a parse error, the +token @code{error} becomes the current look-ahead token. Actions +corresponding to @code{error} are then executed, and the look-ahead +token is reset to the token that originally caused the violation. +@xref{Error Recovery}. + +@item YYABORT +Macro to pretend that an unrecoverable syntax error has occurred, by +making @code{yyparse} return 1 immediately. The error reporting +function @code{yyerror} is not called. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +@item YYACCEPT +Macro to pretend that a complete utterance of the language has been +read, by making @code{yyparse} return 0 immediately. +@xref{Parser Function, ,The Parser Function @code{yyparse}}. + +@item YYBACKUP +Macro to discard a value from the parser stack and fake a look-ahead +token. @xref{Action Features, ,Special Features for Use in Actions}. + +@item YYERROR +Macro to pretend that a syntax error has just been detected: call +@code{yyerror} and then perform normal error recovery if possible +(@pxref{Error Recovery}), or (if recovery is impossible) make +@code{yyparse} return 1. @xref{Error Recovery}. + +@item YYERROR_VERBOSE +Macro that you define with @code{#define} in the Bison declarations +section to request verbose, specific error message strings when +@code{yyerror} is called. 
+ +@item YYINITDEPTH +Macro for specifying the initial size of the parser stack. +@xref{Stack Overflow}. + +@item YYLEX_PARAM +Macro for specifying an extra argument (or list of extra arguments) for +@code{yyparse} to pass to @code{yylex}. @xref{Pure Calling,, Calling +Conventions for Pure Parsers}. + +@item YYLTYPE +Macro for the data type of @code{yylloc}; a structure with four +members. @xref{Token Positions, ,Textual Positions of Tokens}. + +@item yyltype +Default value for YYLTYPE. + +@item YYMAXDEPTH +Macro for specifying the maximum size of the parser stack. +@xref{Stack Overflow}. + +@item YYPARSE_PARAM +Macro for specifying the name of a parameter that @code{yyparse} should +accept. @xref{Pure Calling,, Calling Conventions for Pure Parsers}. + +@item YYRECOVERING +Macro whose value indicates whether the parser is recovering from a +syntax error. @xref{Action Features, ,Special Features for Use in Actions}. + +@item YYSTYPE +Macro for the data type of semantic values; @code{int} by default. +@xref{Value Type, ,Data Types of Semantic Values}. + +@item yychar +External integer variable that contains the integer value of the +current look-ahead token. (In a pure parser, it is a local variable +within @code{yyparse}.) Error-recovery rule actions may examine this +variable. @xref{Action Features, ,Special Features for Use in Actions}. + +@item yyclearin +Macro used in error-recovery rule actions. It clears the previous +look-ahead token. @xref{Error Recovery}. + +@item yydebug +External integer variable set to zero by default. If @code{yydebug} +is given a nonzero value, the parser will output information on input +symbols and parser action. @xref{Debugging, ,Debugging Your Parser}. + +@item yyerrok +Macro to cause parser to recover immediately to its normal mode +after a parse error. @xref{Error Recovery}. + +@item yyerror +User-supplied function to be called by @code{yyparse} on error. 
The +function receives one argument, a pointer to a character string +containing an error message. @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. + +@item yylex +User-supplied lexical analyzer function, called with no arguments +to get the next token. @xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. + +@item yylval +External variable in which @code{yylex} should place the semantic +value associated with a token. (In a pure parser, it is a local +variable within @code{yyparse}, and its address is passed to +@code{yylex}.) @xref{Token Values, ,Semantic Values of Tokens}. + +@item yylloc +External variable in which @code{yylex} should place the line and +column numbers associated with a token. (In a pure parser, it is a +local variable within @code{yyparse}, and its address is passed to +@code{yylex}.) You can ignore this variable if you don't use the +@samp{@@} feature in the grammar actions. @xref{Token Positions, ,Textual Positions of Tokens}. + +@item yynerrs +Global variable which Bison increments each time there is a parse +error. (In a pure parser, it is a local variable within +@code{yyparse}.) @xref{Error Reporting, ,The Error Reporting Function @code{yyerror}}. + +@item yyparse +The parser function produced by Bison; call this function to start +parsing. @xref{Parser Function, ,The Parser Function @code{yyparse}}. + +@item %left +Bison declaration to assign left associativity to token(s). +@xref{Precedence Decl, ,Operator Precedence}. + +@item %no_lines +Bison declaration to avoid generating @code{#line} directives in the +parser file. @xref{Decl Summary}. + +@item %nonassoc +Bison declaration to assign nonassociativity to token(s). +@xref{Precedence Decl, ,Operator Precedence}. + +@item %prec +Bison declaration to assign a precedence to a specific rule. +@xref{Contextual Precedence, ,Context-Dependent Precedence}. + +@item %pure_parser +Bison declaration to request a pure (reentrant) parser. 
+@xref{Pure Decl, ,A Pure (Reentrant) Parser}. + +@item %raw +Bison declaration to use Bison internal token code numbers in token +tables instead of the usual Yacc-compatible token code numbers. +@xref{Decl Summary}. + +@item %right +Bison declaration to assign right associativity to token(s). +@xref{Precedence Decl, ,Operator Precedence}. + +@item %start +Bison declaration to specify the start symbol. @xref{Start Decl, ,The Start-Symbol}. + +@item %token +Bison declaration to declare token(s) without specifying precedence. +@xref{Token Decl, ,Token Type Names}. + +@item %token_table +Bison declaration to include a token name table in the parser file. +@xref{Decl Summary}. + +@item %type +Bison declaration to declare nonterminals. @xref{Type Decl, ,Nonterminal Symbols}. + +@item %union +Bison declaration to specify several possible data types for semantic +values. @xref{Union Decl, ,The Collection of Value Types}. +@end table + +These are the punctuation and delimiters used in Bison input: + +@table @samp +@item %% +Delimiter used to separate the grammar rule section from the +Bison declarations section or the additional C code section. +@xref{Grammar Layout, ,The Overall Layout of a Bison Grammar}. + +@item %@{ %@} +All code listed between @samp{%@{} and @samp{%@}} is copied directly +to the output file uninterpreted. Such code forms the ``C +declarations'' section of the input file. @xref{Grammar Outline, ,Outline of a Bison Grammar}. + +@item /*@dots{}*/ +Comment delimiters, as in C. + +@item : +Separates a rule's result from its components. @xref{Rules, ,Syntax of Grammar Rules}. + +@item ; +Terminates a rule. @xref{Rules, ,Syntax of Grammar Rules}. + +@item | +Separates alternate rules for the same result nonterminal. +@xref{Rules, ,Syntax of Grammar Rules}. +@end table + +@node Glossary, Index, Table of Symbols, Top +@appendix Glossary +@cindex glossary + +@table @asis +@item Backus-Naur Form (BNF) +Formal method of specifying context-free grammars. 
BNF was first used +in the @cite{ALGOL-60} report, 1963. @xref{Language and Grammar, ,Languages and Context-Free Grammars}. + +@item Context-free grammars +Grammars specified as rules that can be applied regardless of context. +Thus, if there is a rule which says that an integer can be used as an +expression, integers are allowed @emph{anywhere} an expression is +permitted. @xref{Language and Grammar, ,Languages and Context-Free Grammars}. + +@item Dynamic allocation +Allocation of memory that occurs during execution, rather than at +compile time or on entry to a function. + +@item Empty string +Analogous to the empty set in set theory, the empty string is a +character string of length zero. + +@item Finite-state stack machine +A ``machine'' that has discrete states in which it is said to exist at +each instant in time. As input to the machine is processed, the +machine moves from state to state as specified by the logic of the +machine. In the case of the parser, the input is the language being +parsed, and the states correspond to various stages in the grammar +rules. @xref{Algorithm, ,The Bison Parser Algorithm }. + +@item Grouping +A language construct that is (in general) grammatically divisible; +for example, `expression' or `declaration' in C. +@xref{Language and Grammar, ,Languages and Context-Free Grammars}. + +@item Infix operator +An arithmetic operator that is placed between the operands on which it +performs some operation. + +@item Input stream +A continuous flow of data between devices or programs. + +@item Language construct +One of the typical usage schemas of the language. For example, one of +the constructs of the C language is the @code{if} statement. +@xref{Language and Grammar, ,Languages and Context-Free Grammars}. + +@item Left associativity +Operators having left associativity are analyzed from left to right: +@samp{a+b+c} first computes @samp{a+b} and then combines with +@samp{c}. @xref{Precedence, ,Operator Precedence}. 
+ +@item Left recursion +A rule whose result symbol is also its first component symbol; +for example, @samp{expseq1 : expseq1 ',' exp;}. @xref{Recursion, ,Recursive Rules}. + +@item Left-to-right parsing +Parsing a sentence of a language by analyzing it token by token from +left to right. @xref{Algorithm, ,The Bison Parser Algorithm }. + +@item Lexical analyzer (scanner) +A function that reads an input stream and returns tokens one by one. +@xref{Lexical, ,The Lexical Analyzer Function @code{yylex}}. + +@item Lexical tie-in +A flag, set by actions in the grammar rules, which alters the way +tokens are parsed. @xref{Lexical Tie-ins}. + +@item Literal string token +A token which consists of two or more fixed characters. +@xref{Symbols}. + +@item Look-ahead token +A token already read but not yet shifted. @xref{Look-Ahead, ,Look-Ahead Tokens}. + +@item LALR(1) +The class of context-free grammars that Bison (like most other parser +generators) can handle; a subset of LR(1). @xref{Mystery Conflicts, , +Mysterious Reduce/Reduce Conflicts}. + +@item LR(1) +The class of context-free grammars in which at most one token of +look-ahead is needed to disambiguate the parsing of any piece of input. + +@item Nonterminal symbol +A grammar symbol standing for a grammatical construct that can +be expressed through rules in terms of smaller constructs; in other +words, a construct that is not a token. @xref{Symbols}. + +@item Parse error +An error encountered during parsing of an input stream due to invalid +syntax. @xref{Error Recovery}. + +@item Parser +A function that recognizes valid sentences of a language by analyzing +the syntax structure of a set of tokens passed to it from a lexical +analyzer. + +@item Postfix operator +An arithmetic operator that is placed after the operands upon which it +performs some operation. + +@item Reduction +Replacing a string of nonterminals and/or terminals with a single +nonterminal, according to a grammar rule.
@xref{Algorithm, ,The Bison Parser Algorithm }. + +@item Reentrant +A reentrant subprogram is a subprogram which can be invoked any +number of times in parallel, without interference between the various +invocations. @xref{Pure Decl, ,A Pure (Reentrant) Parser}. + +@item Reverse Polish notation +A language in which all operators are postfix operators. + +@item Right recursion +A rule whose result symbol is also its last component symbol; +for example, @samp{expseq1: exp ',' expseq1;}. @xref{Recursion, ,Recursive Rules}. + +@item Semantics +In computer languages, the semantics are specified by the actions +taken for each instance of the language, i.e., the meaning of +each statement. @xref{Semantics, ,Defining Language Semantics}. + +@item Shift +A parser is said to shift when it makes the choice of analyzing +further input from the stream rather than reducing immediately some +already-recognized rule. @xref{Algorithm, ,The Bison Parser Algorithm }. + +@item Single-character literal +A single character that is recognized and interpreted as is. +@xref{Grammar in Bison, ,From Formal Rules to Bison Input}. + +@item Start symbol +The nonterminal symbol that stands for a complete valid utterance in +the language being parsed. The start symbol is usually listed as the +first nonterminal symbol in a language specification. +@xref{Start Decl, ,The Start-Symbol}. + +@item Symbol table +A data structure where symbol names and associated data are stored +during parsing to allow for recognition and use of existing +information in repeated uses of a symbol. @xref{Multi-function Calc}. + +@item Token +A basic, grammatically indivisible unit of a language. The symbol +that describes a token in the grammar is a terminal symbol. +The input of the Bison parser is a stream of tokens which comes from +the lexical analyzer. @xref{Symbols}. + +@item Terminal symbol +A grammar symbol that has no rules in the grammar and therefore +is grammatically indivisible.
The piece of text it represents +is a token. @xref{Language and Grammar, ,Languages and Context-Free Grammars}. +@end table + +@node Index, , Glossary, Top +@unnumbered Index + +@printindex cp + +@contents + +@bye + + + + +@c old menu + +* Introduction:: +* Conditions:: +* Copying:: The GNU General Public License says + how you can copy and share Bison + +Tutorial sections: +* Concepts:: Basic concepts for understanding Bison. +* Examples:: Three simple explained examples of using Bison. + +Reference sections: +* Grammar File:: Writing Bison declarations and rules. +* Interface:: C-language interface to the parser function @code{yyparse}. +* Algorithm:: How the Bison parser works at run-time. +* Error Recovery:: Writing rules for error recovery. +* Context Dependency::What to do if your language syntax is too + messy for Bison to handle straightforwardly. +* Debugging:: Debugging Bison parsers that parse wrong. +* Invocation:: How to run Bison (to produce the parser source file). +* Table of Symbols:: All the keywords of the Bison language are explained. +* Glossary:: Basic concepts are explained. +* Index:: Cross-references to the text. + diff --git a/contrib/bison/build.com b/contrib/bison/build.com new file mode 100644 index 000000000000..869ceb13d54a --- /dev/null +++ b/contrib/bison/build.com @@ -0,0 +1,83 @@ +$! Set the def dir to proper place for use in batch. Works for interactive too. +$flnm = f$enviroment("PROCEDURE") ! get current procedure name +$set default 'f$parse(flnm,,,"DEVICE")''f$parse(flnm,,,"DIRECTORY")' +$! +$! This command procedure compiles and links BISON for VMS. +$! BISON has been tested with VAXC version 2.3 and VMS version 4.5 +$! and on VMS 4.5 with GCC 1.12. +$! +$! Bj|rn Larsen blarsen@ifi.uio.no +$! With some contributions by Gabor Karsai, +$! KARSAIG1%VUENGVAX.BITNET@jade.berkeley.edu +$! All merged and cleaned by RMS. +$! +$! Adapted for both VAX-11 "C" and VMS/GCC compilation by +$! David L. Kashtan kashtan.iu.ai.sri.com +$! +$! 
First we try to sense which C compiler we have available. Sensing logic +$! borrowed from Emacs. +$! +$set noon !do not bomb if an error occurs. +$assign nla0: sys$output +$assign nla0: sys$error !so we do not get an error message about this. +$cc nla0:compiler_check.c +$if $status.eq.%x38090 then goto try_gcc +$ CC :== CC +$ cc_options:="/NOLIST/define=(""index=strchr"",""rindex=strrchr"")" +$ extra_linker_files:="VMSHLP," +$goto have_compiler +$! +$try_gcc: +$gcc nla0:compiler_check.c +$if $status.eq.%x38090 then goto whoops +$ CC :== GCC +$ cc_options:="/DEBUG" +$ extra_linker_files:="GNU_CC:[000000]GCCLIB/LIB," +$goto have_compiler +$! +$whoops: +$write sys$output "You must have a C compiler to build BISON. Sorry." +$deassign sys$output +$deassign sys$error +$exit %x38090 +$! +$! +$have_compiler: +$deassign sys$output +$deassign sys$error +$set on +$if f$search("compiler_check.obj").nes."" then dele/nolog compiler_check.obj; +$write sys$output "Building BISON with the ''cc' compiler." +$! +$! Do the compilation (compiler type is all set up) +$! +$ Compile: +$ if "''p1'" .eqs. "LINK" then goto Link +$ 'CC' 'cc_options' files.c +$ 'CC' 'cc_options' LR0.C +$ 'CC' 'cc_options' ALLOCATE.C +$ 'CC' 'cc_options' CLOSURE.C +$ 'CC' 'cc_options' CONFLICTS.C +$ 'CC' 'cc_options' DERIVES.C +$ 'CC' 'cc_options' VMSGETARGS.C +$ 'CC' 'cc_options' GRAM.C +$ 'CC' 'cc_options' LALR.C +$ 'CC' 'cc_options' LEX.C +$ 'CC' 'cc_options' MAIN.C +$ 'CC' 'cc_options' NULLABLE.C +$ 'CC' 'cc_options' OUTPUT.C +$ 'CC' 'cc_options' PRINT.C +$ 'CC' 'cc_options' READER.C +$ 'CC' 'cc_options' REDUCE.C +$ 'CC' 'cc_options' SYMTAB.C +$ 'CC' 'cc_options' WARSHALL.C +$ 'CC' 'cc_options' VERSION.C +$ if "''CC'" .eqs. "CC" then macro vmshlp.mar +$ Link: +$ link/exec=bison main,LR0,allocate,closure,conflicts,derives,files,- +vmsgetargs,gram,lalr,lex,nullable,output,print,reader,reduce,symtab,warshall,- +version,'extra_linker_files'sys$library:vaxcrtl/lib +$! +$! Generate bison.hlp (for online help). 
+$! +$runoff bison.rnh diff --git a/contrib/bison/closure.c b/contrib/bison/closure.c new file mode 100644 index 000000000000..b354458efd9f --- /dev/null +++ b/contrib/bison/closure.c @@ -0,0 +1,351 @@ +/* Subroutines for bison + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* subroutines of file LR0.c. + +Entry points: + + closure (items, n) + +Given a vector of item numbers items, of length n, +set up ruleset and itemset to indicate what rules could be run +and which items could be accepted when those items are the active ones. + +ruleset contains a bit for each rule. closure sets the bits +for all rules which could potentially describe the next input to be read. + +itemset is a vector of item numbers; itemsetend points to just beyond the end + of the part of it that is significant. +closure places there the indices of all items which represent units of +input that could arrive next. + + initialize_closure (n) + +Allocates the itemset and ruleset vectors, +and precomputes useful data so that closure can be called. +n is the number of elements to allocate for itemset. + + finalize_closure () + +Frees itemset, ruleset and internal data. 
+ +*/ + +#include <stdio.h> +#include "system.h" +#include "machine.h" +#include "new.h" +#include "gram.h" + + +extern short **derives; +extern char **tags; + +void set_fderives(); +void set_firsts(); + +extern void RTC(); + +short *itemset; +short *itemsetend; +static unsigned *ruleset; + +/* internal data. See comments before set_fderives and set_firsts. */ +static unsigned *fderives; +static unsigned *firsts; + +/* number of words required to hold a bit for each rule */ +static int rulesetsize; + +/* number of words required to hold a bit for each variable */ +static int varsetsize; + + +void +initialize_closure(n) +int n; +{ + itemset = NEW2(n, short); + + rulesetsize = WORDSIZE(nrules + 1); + ruleset = NEW2(rulesetsize, unsigned); + + set_fderives(); +} + + + +/* set fderives to an nvars by nrules matrix of bits + indicating which rules can help derive the beginning of the data + for each nonterminal. For example, if symbol 5 can be derived as + the sequence of symbols 8 3 20, and one of the rules for deriving + symbol 8 is rule 4, then the [5 - ntokens, 4] bit in fderives is set.
*/ +void +set_fderives() +{ + register unsigned *rrow; + register unsigned *vrow; + register int j; + register unsigned cword; + register short *rp; + register int b; + + int ruleno; + int i; + + fderives = NEW2(nvars * rulesetsize, unsigned) - ntokens * rulesetsize; + + set_firsts(); + + rrow = fderives + ntokens * rulesetsize; + + for (i = ntokens; i < nsyms; i++) + { + vrow = firsts + ((i - ntokens) * varsetsize); + cword = *vrow++; + b = 0; + for (j = ntokens; j < nsyms; j++) + { + if (cword & (1 << b)) + { + rp = derives[j]; + while ((ruleno = *rp++) > 0) + { + SETBIT(rrow, ruleno); + } + } + + b++; + if (b >= BITS_PER_WORD && j + 1 < nsyms) + { + cword = *vrow++; + b = 0; + } + } + + rrow += rulesetsize; + } + +#ifdef DEBUG + print_fderives(); +#endif + + FREE(firsts); +} + + + +/* set firsts to be an nvars by nvars bit matrix indicating which items + can represent the beginning of the input corresponding to which other items. + For example, if some rule expands symbol 5 into the sequence of symbols 8 3 20, + the symbol 8 can be the beginning of the data for symbol 5, + so the bit [8 - ntokens, 5 - ntokens] in firsts is set. 
*/ +void +set_firsts() +{ + register unsigned *row; +/* register int done; JF unused */ + register int symbol; + register short *sp; + register int rowsize; + + int i; + + varsetsize = rowsize = WORDSIZE(nvars); + + firsts = NEW2(nvars * rowsize, unsigned); + + row = firsts; + for (i = ntokens; i < nsyms; i++) + { + sp = derives[i]; + while (*sp >= 0) + { + symbol = ritem[rrhs[*sp++]]; + if (ISVAR(symbol)) + { + symbol -= ntokens; + SETBIT(row, symbol); + } + } + + row += rowsize; + } + + RTC(firsts, nvars); + +#ifdef DEBUG + print_firsts(); +#endif +} + + +void +closure(core, n) +short *core; +int n; +{ + register int ruleno; + register unsigned word; + register short *csp; + register unsigned *dsp; + register unsigned *rsp; + + short *csend; + unsigned *rsend; + int symbol; + int itemno; + + rsp = ruleset; + rsend = ruleset + rulesetsize; + csend = core + n; + + if (n == 0) + { + dsp = fderives + start_symbol * rulesetsize; + while (rsp < rsend) + *rsp++ = *dsp++; + } + else + { + while (rsp < rsend) + *rsp++ = 0; + + csp = core; + while (csp < csend) + { + symbol = ritem[*csp++]; + if (ISVAR(symbol)) + { + dsp = fderives + symbol * rulesetsize; + rsp = ruleset; + while (rsp < rsend) + *rsp++ |= *dsp++; + } + } + } + + ruleno = 0; + itemsetend = itemset; + csp = core; + rsp = ruleset; + while (rsp < rsend) + { + word = *rsp++; + if (word == 0) + { + ruleno += BITS_PER_WORD; + } + else + { + register int b; + + for (b = 0; b < BITS_PER_WORD; b++) + { + if (word & (1 << b)) + { + itemno = rrhs[ruleno]; + while (csp < csend && *csp < itemno) + *itemsetend++ = *csp++; + *itemsetend++ = itemno; + } + + ruleno++; + } + } + } + + while (csp < csend) + *itemsetend++ = *csp++; + +#ifdef DEBUG + print_closure(n); +#endif +} + + +void +finalize_closure() +{ + FREE(itemset); + FREE(ruleset); + FREE(fderives + ntokens * rulesetsize); +} + + + +#ifdef DEBUG + +print_closure(n) +int n; +{ + register short *isp; + + printf("\n\nn = %d\n\n", n); + for (isp = itemset; isp < 
itemsetend; isp++) + printf(" %d\n", *isp); +} + + + +print_firsts() +{ + register int i; + register int j; + register unsigned *rowp; + + printf("\n\n\nFIRSTS\n\n"); + + for (i = ntokens; i < nsyms; i++) + { + printf("\n\n%s firsts\n\n", tags[i]); + + rowp = firsts + ((i - ntokens) * varsetsize); + + for (j = 0; j < nvars; j++) + if (BITISSET (rowp, j)) + printf(" %s\n", tags[j + ntokens]); + } +} + + + +print_fderives() +{ + register int i; + register int j; + register unsigned *rp; + + printf("\n\n\nFDERIVES\n"); + + for (i = ntokens; i < nsyms; i++) + { + printf("\n\n%s derives\n\n", tags[i]); + rp = fderives + i * rulesetsize; + + for (j = 0; j <= nrules; j++) + if (BITISSET (rp, j)) + printf(" %d\n", j); + } + + fflush(stdout); +} + +#endif diff --git a/contrib/bison/configure b/contrib/bison/configure new file mode 100755 index 000000000000..3ed1f48c34e2 --- /dev/null +++ b/contrib/bison/configure @@ -0,0 +1,1498 @@ +#! /bin/sh + +# Guess values for system-dependent variables and create Makefiles. +# Generated automatically using autoconf version 2.7 +# Copyright (C) 1992, 1993, 1994 Free Software Foundation, Inc. +# +# This configure script is free software; the Free Software Foundation +# gives unlimited permission to copy, distribute and modify it. + +# Defaults: +ac_help= +ac_default_prefix=/usr/local +# Any additions from configure.in: + +# Initialize some variables set by options. +# The variables have the same names as the options, with +# dashes changed to underlines. 
+build=NONE +cache_file=./config.cache +exec_prefix=NONE +host=NONE +no_create= +nonopt=NONE +no_recursion= +prefix=NONE +program_prefix=NONE +program_suffix=NONE +program_transform_name=s,x,x, +silent= +site= +srcdir= +target=NONE +verbose= +x_includes=NONE +x_libraries=NONE +bindir='${exec_prefix}/bin' +sbindir='${exec_prefix}/sbin' +libexecdir='${exec_prefix}/libexec' +datadir='${prefix}/share' +sysconfdir='${prefix}/etc' +sharedstatedir='${prefix}/com' +localstatedir='${prefix}/var' +libdir='${exec_prefix}/lib' +includedir='${prefix}/include' +oldincludedir='/usr/include' +infodir='${prefix}/info' +mandir='${prefix}/man' + +# Initialize some other variables. +subdirs= +MFLAGS= MAKEFLAGS= + +ac_prev= +for ac_option +do + + # If the previous option needs an argument, assign it. + if test -n "$ac_prev"; then + eval "$ac_prev=\$ac_option" + ac_prev= + continue + fi + + case "$ac_option" in + -*=*) ac_optarg=`echo "$ac_option" | sed 's/[-_a-zA-Z0-9]*=//'` ;; + *) ac_optarg= ;; + esac + + # Accept the important Cygnus configure options, so we can diagnose typos. 
+ + case "$ac_option" in + + -bindir | --bindir | --bindi | --bind | --bin | --bi) + ac_prev=bindir ;; + -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) + bindir="$ac_optarg" ;; + + -build | --build | --buil | --bui | --bu) + ac_prev=build ;; + -build=* | --build=* | --buil=* | --bui=* | --bu=*) + build="$ac_optarg" ;; + + -cache-file | --cache-file | --cache-fil | --cache-fi \ + | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) + ac_prev=cache_file ;; + -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ + | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) + cache_file="$ac_optarg" ;; + + -datadir | --datadir | --datadi | --datad | --data | --dat | --da) + ac_prev=datadir ;; + -datadir=* | --datadir=* | --datadi=* | --datad=* | --data=* | --dat=* \ + | --da=*) + datadir="$ac_optarg" ;; + + -disable-* | --disable-*) + ac_feature=`echo $ac_option|sed -e 's/-*disable-//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_feature| sed 's/[-a-zA-Z0-9_]//g'`"; then + { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; } + fi + ac_feature=`echo $ac_feature| sed 's/-/_/g'` + eval "enable_${ac_feature}=no" ;; + + -enable-* | --enable-*) + ac_feature=`echo $ac_option|sed -e 's/-*enable-//' -e 's/=.*//'` + # Reject names that are not valid shell variable names. 
+ if test -n "`echo $ac_feature| sed 's/[-_a-zA-Z0-9]//g'`"; then + { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; } + fi + ac_feature=`echo $ac_feature| sed 's/-/_/g'` + case "$ac_option" in + *=*) ;; + *) ac_optarg=yes ;; + esac + eval "enable_${ac_feature}='$ac_optarg'" ;; + + -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ + | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ + | --exec | --exe | --ex) + ac_prev=exec_prefix ;; + -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ + | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ + | --exec=* | --exe=* | --ex=*) + exec_prefix="$ac_optarg" ;; + + -gas | --gas | --ga | --g) + # Obsolete; use --with-gas. + with_gas=yes ;; + + -help | --help | --hel | --he) + # Omit some internal or obsolete options to make the list less imposing. + # This message is too long to be a string in the A/UX 3.1 sh. + cat << EOF +Usage: configure [options] [host] +Options: [defaults in brackets after descriptions] +Configuration: + --cache-file=FILE cache test results in FILE + --help print this message + --no-create do not create output files + --quiet, --silent do not print \`checking...' 
messages + --version print the version of autoconf that created configure +Directory and file names: + --prefix=PREFIX install architecture-independent files in PREFIX + [$ac_default_prefix] + --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX + [same as prefix] + --bindir=DIR user executables in DIR [EPREFIX/bin] + --sbindir=DIR system admin executables in DIR [EPREFIX/sbin] + --libexecdir=DIR program executables in DIR [EPREFIX/libexec] + --datadir=DIR read-only architecture-independent data in DIR + [PREFIX/share] + --sysconfdir=DIR read-only single-machine data in DIR [PREFIX/etc] + --sharedstatedir=DIR modifiable architecture-independent data in DIR + [PREFIX/com] + --localstatedir=DIR modifiable single-machine data in DIR [PREFIX/var] + --libdir=DIR object code libraries in DIR [EPREFIX/lib] + --includedir=DIR C header files in DIR [PREFIX/include] + --oldincludedir=DIR C header files for non-gcc in DIR [/usr/include] + --infodir=DIR info documentation in DIR [PREFIX/info] + --mandir=DIR man documentation in DIR [PREFIX/man] + --srcdir=DIR find the sources in DIR [configure dir or ..] 
+ --program-prefix=PREFIX prepend PREFIX to installed program names + --program-suffix=SUFFIX append SUFFIX to installed program names + --program-transform-name=PROGRAM + run sed PROGRAM on installed program names +EOF + cat << EOF +Host type: + --build=BUILD configure for building on BUILD [BUILD=HOST] + --host=HOST configure for HOST [guessed] + --target=TARGET configure for TARGET [TARGET=HOST] +Features and packages: + --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) + --enable-FEATURE[=ARG] include FEATURE [ARG=yes] + --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] + --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) + --x-includes=DIR X include files are in DIR + --x-libraries=DIR X library files are in DIR +EOF + if test -n "$ac_help"; then + echo "--enable and --with options recognized:$ac_help" + fi + exit 0 ;; + + -host | --host | --hos | --ho) + ac_prev=host ;; + -host=* | --host=* | --hos=* | --ho=*) + host="$ac_optarg" ;; + + -includedir | --includedir | --includedi | --included | --include \ + | --includ | --inclu | --incl | --inc) + ac_prev=includedir ;; + -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ + | --includ=* | --inclu=* | --incl=* | --inc=*) + includedir="$ac_optarg" ;; + + -infodir | --infodir | --infodi | --infod | --info | --inf) + ac_prev=infodir ;; + -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) + infodir="$ac_optarg" ;; + + -libdir | --libdir | --libdi | --libd) + ac_prev=libdir ;; + -libdir=* | --libdir=* | --libdi=* | --libd=*) + libdir="$ac_optarg" ;; + + -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ + | --libexe | --libex | --libe) + ac_prev=libexecdir ;; + -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ + | --libexe=* | --libex=* | --libe=*) + libexecdir="$ac_optarg" ;; + + -localstatedir | --localstatedir | --localstatedi | --localstated \ + | --localstate | --localstat | --localsta | 
--localst \ + | --locals | --local | --loca | --loc | --lo) + ac_prev=localstatedir ;; + -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ + | --localstate=* | --localstat=* | --localsta=* | --localst=* \ + | --locals=* | --local=* | --loca=* | --loc=* | --lo=*) + localstatedir="$ac_optarg" ;; + + -mandir | --mandir | --mandi | --mand | --man | --ma | --m) + ac_prev=mandir ;; + -mandir=* | --mandir=* | --mandi=* | --mand=* | --man=* | --ma=* | --m=*) + mandir="$ac_optarg" ;; + + -nfp | --nfp | --nf) + # Obsolete; use --without-fp. + with_fp=no ;; + + -no-create | --no-create | --no-creat | --no-crea | --no-cre \ + | --no-cr | --no-c) + no_create=yes ;; + + -no-recursion | --no-recursion | --no-recursio | --no-recursi \ + | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) + no_recursion=yes ;; + + -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ + | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ + | --oldin | --oldi | --old | --ol | --o) + ac_prev=oldincludedir ;; + -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ + | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ + | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) + oldincludedir="$ac_optarg" ;; + + -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) + ac_prev=prefix ;; + -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) + prefix="$ac_optarg" ;; + + -program-prefix | --program-prefix | --program-prefi | --program-pref \ + | --program-pre | --program-pr | --program-p) + ac_prev=program_prefix ;; + -program-prefix=* | --program-prefix=* | --program-prefi=* \ + | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) + program_prefix="$ac_optarg" ;; + + -program-suffix | --program-suffix | --program-suffi | --program-suff \ + | --program-suf | --program-su | --program-s) + ac_prev=program_suffix ;; + -program-suffix=* | 
--program-suffix=* | --program-suffi=* \ + | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*) + program_suffix="$ac_optarg" ;; + + -program-transform-name | --program-transform-name \ + | --program-transform-nam | --program-transform-na \ + | --program-transform-n | --program-transform- \ + | --program-transform | --program-transfor \ + | --program-transfo | --program-transf \ + | --program-trans | --program-tran \ + | --progr-tra | --program-tr | --program-t) + ac_prev=program_transform_name ;; + -program-transform-name=* | --program-transform-name=* \ + | --program-transform-nam=* | --program-transform-na=* \ + | --program-transform-n=* | --program-transform-=* \ + | --program-transform=* | --program-transfor=* \ + | --program-transfo=* | --program-transf=* \ + | --program-trans=* | --program-tran=* \ + | --progr-tra=* | --program-tr=* | --program-t=*) + program_transform_name="$ac_optarg" ;; + + -q | -quiet | --quiet | --quie | --qui | --qu | --q \ + | -silent | --silent | --silen | --sile | --sil) + silent=yes ;; + + -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) + ac_prev=sbindir ;; + -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ + | --sbi=* | --sb=*) + sbindir="$ac_optarg" ;; + + -sharedstatedir | --sharedstatedir | --sharedstatedi \ + | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ + | --sharedst | --shareds | --shared | --share | --shar \ + | --sha | --sh) + ac_prev=sharedstatedir ;; + -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ + | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ + | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ + | --sha=* | --sh=*) + sharedstatedir="$ac_optarg" ;; + + -site | --site | --sit) + ac_prev=site ;; + -site=* | --site=* | --sit=*) + site="$ac_optarg" ;; + + -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) + ac_prev=srcdir ;; + -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) + 
srcdir="$ac_optarg" ;; + + -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ + | --syscon | --sysco | --sysc | --sys | --sy) + ac_prev=sysconfdir ;; + -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ + | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) + sysconfdir="$ac_optarg" ;; + + -target | --target | --targe | --targ | --tar | --ta | --t) + ac_prev=target ;; + -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) + target="$ac_optarg" ;; + + -v | -verbose | --verbose | --verbos | --verbo | --verb) + verbose=yes ;; + + -version | --version | --versio | --versi | --vers) + echo "configure generated by autoconf version 2.7" + exit 0 ;; + + -with-* | --with-*) + ac_package=`echo $ac_option|sed -e 's/-*with-//' -e 's/=.*//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_package| sed 's/[-_a-zA-Z0-9]//g'`"; then + { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; } + fi + ac_package=`echo $ac_package| sed 's/-/_/g'` + case "$ac_option" in + *=*) ;; + *) ac_optarg=yes ;; + esac + eval "with_${ac_package}='$ac_optarg'" ;; + + -without-* | --without-*) + ac_package=`echo $ac_option|sed -e 's/-*without-//'` + # Reject names that are not valid shell variable names. + if test -n "`echo $ac_package| sed 's/[-a-zA-Z0-9_]//g'`"; then + { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; } + fi + ac_package=`echo $ac_package| sed 's/-/_/g'` + eval "with_${ac_package}=no" ;; + + --x) + # Obsolete; use --with-x. 
+ with_x=yes ;; + + -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ + | --x-incl | --x-inc | --x-in | --x-i) + ac_prev=x_includes ;; + -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ + | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) + x_includes="$ac_optarg" ;; + + -x-libraries | --x-libraries | --x-librarie | --x-librari \ + | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) + ac_prev=x_libraries ;; + -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ + | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) + x_libraries="$ac_optarg" ;; + + -*) { echo "configure: error: $ac_option: invalid option; use --help to show usage" 1>&2; exit 1; } + ;; + + *) + if test -n "`echo $ac_option| sed 's/[-a-z0-9.]//g'`"; then + echo "configure: warning: $ac_option: invalid host type" 1>&2 + fi + if test "x$nonopt" != xNONE; then + { echo "configure: error: can only configure for one host and one target at a time" 1>&2; exit 1; } + fi + nonopt="$ac_option" + ;; + + esac +done + +if test -n "$ac_prev"; then + { echo "configure: error: missing argument to --`echo $ac_prev | sed 's/_/-/g'`" 1>&2; exit 1; } +fi + +trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15 + +# File descriptor usage: +# 0 standard input +# 1 file creation +# 2 errors and warnings +# 3 some systems may open it to /dev/tty +# 4 used on the Kubota Titan +# 6 checking for... messages and results +# 5 compiler messages saved in config.log +if test "$silent" = yes; then + exec 6>/dev/null +else + exec 6>&1 +fi +exec 5>./config.log + +echo "\ +This file contains any messages produced by compilers while +running configure, to aid debugging if configure makes a mistake. +" 1>&5 + +# Strip out --no-create and --no-recursion so they do not pile up. +# Also quote any args containing shell metacharacters. 
+ac_configure_args= +for ac_arg +do + case "$ac_arg" in + -no-create | --no-create | --no-creat | --no-crea | --no-cre \ + | --no-cr | --no-c) ;; + -no-recursion | --no-recursion | --no-recursio | --no-recursi \ + | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) ;; + *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?]*) + ac_configure_args="$ac_configure_args '$ac_arg'" ;; + *) ac_configure_args="$ac_configure_args $ac_arg" ;; + esac +done + +# NLS nuisances. +# Only set LANG and LC_ALL to C if already set. +# These must not be set unconditionally because not all systems understand +# e.g. LANG=C (notably SCO). +if test "${LC_ALL+set}" = set; then LC_ALL=C; export LC_ALL; fi +if test "${LANG+set}" = set; then LANG=C; export LANG; fi + +# confdefs.h avoids OS command line length limits that DEFS can exceed. +rm -rf conftest* confdefs.h +# AIX cpp loses on an empty file, so make sure it contains at least a newline. +echo > confdefs.h + +# A filename unique to this package, relative to the directory that +# configure is in, which we can look for to find out if srcdir is correct. +ac_unique_file=reduce.c + +# Find the source files, if location was not specified. +if test -z "$srcdir"; then + ac_srcdir_defaulted=yes + # Try the directory containing this script, then its parent. + ac_prog=$0 + ac_confdir=`echo $ac_prog|sed 's%/[^/][^/]*$%%'` + test "x$ac_confdir" = "x$ac_prog" && ac_confdir=. + srcdir=$ac_confdir + if test ! -r $srcdir/$ac_unique_file; then + srcdir=.. + fi +else + ac_srcdir_defaulted=no +fi +if test ! -r $srcdir/$ac_unique_file; then + if test "$ac_srcdir_defaulted" = yes; then + { echo "configure: error: can not find sources in $ac_confdir or .." 1>&2; exit 1; } + else + { echo "configure: error: can not find sources in $srcdir" 1>&2; exit 1; } + fi +fi +srcdir=`echo "${srcdir}" | sed 's%\([^/]\)/*$%\1%'` + +# Prefer explicitly selected file to automatically selected ones. 
+if test -z "$CONFIG_SITE"; then + if test "x$prefix" != xNONE; then + CONFIG_SITE="$prefix/share/config.site $prefix/etc/config.site" + else + CONFIG_SITE="$ac_default_prefix/share/config.site $ac_default_prefix/etc/config.site" + fi +fi +for ac_site_file in $CONFIG_SITE; do + if test -r "$ac_site_file"; then + echo "loading site script $ac_site_file" + . "$ac_site_file" + fi +done + +if test -r "$cache_file"; then + echo "loading cache $cache_file" + . $cache_file +else + echo "creating cache $cache_file" + > $cache_file +fi + +ac_ext=c +# CFLAGS is not in ac_cpp because -g, -O, etc. are not valid cpp options. +ac_cpp='echo $CPP $CPPFLAGS 1>&5; +$CPP $CPPFLAGS' +ac_compile='echo ${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5; +${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5 2>&5' +ac_link='echo ${CC-cc} -o conftest $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5; +${CC-cc} -o conftest $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5 2>&5' + +if (echo "testing\c"; echo 1,2,3) | grep c >/dev/null; then + # Stardent Vistra SVR4 grep lacks -e, says ghazi@caip.rutgers.edu. + if (echo -n testing; echo 1,2,3) | sed s/-n/xn/ | grep xn >/dev/null; then + ac_n= ac_c=' +' ac_t=' ' + else + ac_n=-n ac_c= ac_t= + fi +else + ac_n= ac_c='\c' ac_t= +fi + + + +# Extract the first word of "gcc", so it can be a program name with args. +set dummy gcc; ac_word=$2 +echo $ac_n "checking for $ac_word""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_CC'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test -n "$CC"; then + ac_cv_prog_CC="$CC" # Let the user override the test. +else + IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS="${IFS}:" + for ac_dir in $PATH; do + test -z "$ac_dir" && ac_dir=. 
+ if test -f $ac_dir/$ac_word; then + ac_cv_prog_CC="gcc" + break + fi + done + IFS="$ac_save_ifs" + test -z "$ac_cv_prog_CC" && ac_cv_prog_CC="cc" +fi +fi +CC="$ac_cv_prog_CC" +if test -n "$CC"; then + echo "$ac_t""$CC" 1>&6 +else + echo "$ac_t""no" 1>&6 +fi + + +echo $ac_n "checking whether we are using GNU C""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_gcc'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.c <<EOF +#ifdef __GNUC__ + yes; +#endif +EOF +if ${CC-cc} -E conftest.c 2>&5 | egrep yes >/dev/null 2>&1; then + ac_cv_prog_gcc=yes +else + ac_cv_prog_gcc=no +fi +fi + +echo "$ac_t""$ac_cv_prog_gcc" 1>&6 +if test $ac_cv_prog_gcc = yes; then + GCC=yes + if test "${CFLAGS+set}" != set; then + echo $ac_n "checking whether ${CC-cc} accepts -g""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_prog_gcc_g'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + echo 'void f(){}' > conftest.c +if test -z "`${CC-cc} -g -c conftest.c 2>&1`"; then + ac_cv_prog_gcc_g=yes +else + ac_cv_prog_gcc_g=no +fi +rm -f conftest* + +fi + +echo "$ac_t""$ac_cv_prog_gcc_g" 1>&6 + if test $ac_cv_prog_gcc_g = yes; then + CFLAGS="-g -O" + else + CFLAGS="-O" + fi + fi +else + GCC= + test "${CFLAGS+set}" = set || CFLAGS="-g" +fi + +ac_aux_dir= +for ac_dir in $srcdir $srcdir/.. $srcdir/../..; do + if test -f $ac_dir/install-sh; then + ac_aux_dir=$ac_dir + ac_install_sh="$ac_aux_dir/install-sh -c" + break + elif test -f $ac_dir/install.sh; then + ac_aux_dir=$ac_dir + ac_install_sh="$ac_aux_dir/install.sh -c" + break + fi +done +if test -z "$ac_aux_dir"; then + { echo "configure: error: can not find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." 1>&2; exit 1; } +fi +ac_config_guess=$ac_aux_dir/config.guess +ac_config_sub=$ac_aux_dir/config.sub +ac_configure=$ac_aux_dir/configure # This should be Cygnus configure. + +# Find a good install program. We prefer a C program (faster), +# so one script is as good as another.
But avoid the broken or +# incompatible versions: +# SysV /etc/install, /usr/sbin/install +# SunOS /usr/etc/install +# IRIX /sbin/install +# AIX /bin/install +# AFS /usr/afsws/bin/install, which mishandles nonexistent args +# SVR4 /usr/ucb/install, which tries to use the nonexistent group "staff" +# ./install, which can be erroneously created by make from ./install.sh. +echo $ac_n "checking for a BSD compatible install""... $ac_c" 1>&6 +if test -z "$INSTALL"; then +if eval "test \"`echo '$''{'ac_cv_path_install'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS="${IFS}:" + for ac_dir in $PATH; do + # Account for people who put trailing slashes in PATH elements. + case "$ac_dir/" in + /|./|.//|/etc/*|/usr/sbin/*|/usr/etc/*|/sbin/*|/usr/afsws/bin/*|/usr/ucb/*) ;; + *) + # OSF1 and SCO ODT 3.0 have their own names for install. + for ac_prog in ginstall installbsd scoinst install; do + if test -f $ac_dir/$ac_prog; then + if test $ac_prog = install && + grep dspmsg $ac_dir/$ac_prog >/dev/null 2>&1; then + # AIX install. It has an incompatible calling convention. + # OSF/1 installbsd also uses dspmsg, but is usable. + : + else + ac_cv_path_install="$ac_dir/$ac_prog -c" + break 2 + fi + fi + done + ;; + esac + done + IFS="$ac_save_ifs" + +fi + if test "${ac_cv_path_install+set}" = set; then + INSTALL="$ac_cv_path_install" + else + # As a last resort, use the slow shell script. We don't cache a + # path for INSTALL within a source directory, because that will + # break other packages using the cache if that directory is + # removed, or if the path is relative. + INSTALL="$ac_install_sh" + fi +fi +echo "$ac_t""$INSTALL" 1>&6 + +# Use test -z because SunOS4 sh mishandles braces in ${var-val}. +# It thinks the first close brace ends the variable substitution. 
+test -z "$INSTALL_PROGRAM" && INSTALL_PROGRAM='${INSTALL}' + +test -z "$INSTALL_DATA" && INSTALL_DATA='${INSTALL} -m 644' + + +echo $ac_n "checking how to run the C preprocessor""... $ac_c" 1>&6 +# On Suns, sometimes $CPP names a directory. +if test -n "$CPP" && test -d "$CPP"; then + CPP= +fi +if test -z "$CPP"; then +if eval "test \"`echo '$''{'ac_cv_prog_CPP'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + # This must be in double quotes, not single quotes, because CPP may get + # substituted into the Makefile and "${CC-cc}" will confuse make. + CPP="${CC-cc} -E" + # On the NeXT, cc -E runs the code through the compiler's parser, + # not just through cpp. + cat > conftest.$ac_ext < +Syntax Error +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + : +else + echo "$ac_err" >&5 + rm -rf conftest* + CPP="${CC-cc} -E -traditional-cpp" + cat > conftest.$ac_ext < +Syntax Error +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + : +else + echo "$ac_err" >&5 + rm -rf conftest* + CPP=/lib/cpp +fi +rm -f conftest* +fi +rm -f conftest* + ac_cv_prog_CPP="$CPP" +fi + CPP="$ac_cv_prog_CPP" +else + ac_cv_prog_CPP="$CPP" +fi +echo "$ac_t""$CPP" 1>&6 + +ac_safe=`echo "minix/config.h" | tr './\055' '___'` +echo $ac_n "checking for minix/config.h""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + MINIX=yes +else + echo "$ac_t""no" 1>&6 +MINIX= +fi + +if test "$MINIX" = yes; then + cat >> confdefs.h <<\EOF +#define _POSIX_SOURCE 1 +EOF + + cat >> confdefs.h <<\EOF +#define _POSIX_1_SOURCE 2 +EOF + + cat >> confdefs.h <<\EOF +#define _MINIX 1 +EOF + +fi + +echo $ac_n "checking for POSIXized ISC""... $ac_c" 1>&6 +if test -d /etc/conf/kconfig.d && + grep _POSIX_VERSION /usr/include/sys/unistd.h >/dev/null 2>&1 +then + echo "$ac_t""yes" 1>&6 + ISC=yes # If later tests want to check for ISC. + cat >> confdefs.h <<\EOF +#define _POSIX_SOURCE 1 +EOF + + if test "$GCC" = yes; then + CC="$CC -posix" + else + CC="$CC -Xp" + fi +else + echo "$ac_t""no" 1>&6 + ISC= +fi + + +# If we cannot run a trivial program, we must be cross compiling. +echo $ac_n "checking whether cross-compiling""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_c_cross'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test "$cross_compiling" = yes; then + ac_cv_c_cross=yes +else +cat > conftest.$ac_ext </dev/null; then + ac_cv_c_cross=no +else + ac_cv_c_cross=yes +fi +fi +rm -fr conftest* +fi + +echo "$ac_t""$ac_cv_c_cross" 1>&6 +cross_compiling=$ac_cv_c_cross + +echo $ac_n "checking for ANSI C header files""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_stdc'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +#include +#include +#include +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + ac_cv_header_stdc=yes +else + echo "$ac_err" >&5 + rm -rf conftest* + ac_cv_header_stdc=no +fi +rm -f conftest* + +if test $ac_cv_header_stdc = yes; then + # SunOS 4.x string.h does not declare mem*, contrary to ANSI. +cat > conftest.$ac_ext < +EOF +if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | + egrep "memchr" >/dev/null 2>&1; then + : +else + rm -rf conftest* + ac_cv_header_stdc=no +fi +rm -f conftest* + +fi + +if test $ac_cv_header_stdc = yes; then + # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. +cat > conftest.$ac_ext < +EOF +if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | + egrep "free" >/dev/null 2>&1; then + : +else + rm -rf conftest* + ac_cv_header_stdc=no +fi +rm -f conftest* + +fi + +if test $ac_cv_header_stdc = yes; then + # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. +if test "$cross_compiling" = yes; then + : +else +cat > conftest.$ac_ext < +#define ISLOWER(c) ('a' <= (c) && (c) <= 'z') +#define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) +#define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) +int main () { int i; for (i = 0; i < 256; i++) +if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) exit(2); +exit (0); } + +EOF +eval $ac_link +if test -s conftest && (./conftest; exit) 2>/dev/null; then + : +else + ac_cv_header_stdc=no +fi +fi +rm -fr conftest* +fi +fi + +echo "$ac_t""$ac_cv_header_stdc" 1>&6 +if test $ac_cv_header_stdc = yes; then + cat >> confdefs.h <<\EOF +#define STDC_HEADERS 1 +EOF + +fi + +for ac_hdr in string.h stdlib.h memory.h +do +ac_safe=`echo "$ac_hdr" | tr './\055' '___'` +echo $ac_n "checking for $ac_hdr""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +EOF +eval "$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out" +ac_err=`grep -v '^ *+' conftest.out` +if test -z "$ac_err"; then + rm -rf conftest* + eval "ac_cv_header_$ac_safe=yes" +else + echo "$ac_err" >&5 + rm -rf conftest* + eval "ac_cv_header_$ac_safe=no" +fi +rm -f conftest* +fi +if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then + echo "$ac_t""yes" 1>&6 + ac_tr_hdr=HAVE_`echo $ac_hdr | tr 'abcdefghijklmnopqrstuvwxyz./\055' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ___'` + cat >> confdefs.h <&6 +fi +done + + +echo $ac_n "checking for working const""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_c_const'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext <j = 5; +} +{ /* ULTRIX-32 V3.1 (Rev 9) vcc rejects this */ + const int foo = 10; +} + +; return 0; } +EOF +if eval $ac_compile; then + rm -rf conftest* + ac_cv_c_const=yes +else + rm -rf conftest* + ac_cv_c_const=no +fi +rm -f conftest* + +fi + +echo "$ac_t""$ac_cv_c_const" 1>&6 +if test $ac_cv_c_const = no; then + cat >> confdefs.h <<\EOF +#define const +EOF + +fi + + +# The Ultrix 4.2 mips builtin alloca declared by alloca.h only works +# for constant arguments. Useless! +echo $ac_n "checking for working alloca.h""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_header_alloca_h'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +int main() { return 0; } +int t() { +char *p = alloca(2 * sizeof(int)); +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + ac_cv_header_alloca_h=yes +else + rm -rf conftest* + ac_cv_header_alloca_h=no +fi +rm -f conftest* + +fi + +echo "$ac_t""$ac_cv_header_alloca_h" 1>&6 +if test $ac_cv_header_alloca_h = yes; then + cat >> confdefs.h <<\EOF +#define HAVE_ALLOCA_H 1 +EOF + +fi + +echo $ac_n "checking for alloca""... 
$ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_alloca'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +# else +# ifdef _AIX + #pragma alloca +# else +# ifndef alloca /* predefined by HP cc +Olibcalls */ +char *alloca (); +# endif +# endif +# endif +#endif + +int main() { return 0; } +int t() { +char *p = (char *) alloca(1); +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + ac_cv_func_alloca=yes +else + rm -rf conftest* + ac_cv_func_alloca=no +fi +rm -f conftest* + +fi + +echo "$ac_t""$ac_cv_func_alloca" 1>&6 +if test $ac_cv_func_alloca = yes; then + cat >> confdefs.h <<\EOF +#define HAVE_ALLOCA 1 +EOF + +fi + +if test $ac_cv_func_alloca = no; then + # The SVR3 libPW and SVR4 libucb both contain incompatible functions + # that cause trouble. Some versions do not even contain alloca or + # contain a buggy version. If you still want to use their alloca, + # use ar to extract alloca.o from them instead of compiling alloca.c. + ALLOCA=alloca.o + cat >> confdefs.h <<\EOF +#define C_ALLOCA 1 +EOF + + +echo $ac_n "checking whether alloca needs Cray hooks""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_os_cray'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext <&5 | + egrep "webecray" >/dev/null 2>&1; then + rm -rf conftest* + ac_cv_os_cray=yes +else + rm -rf conftest* + ac_cv_os_cray=no +fi +rm -f conftest* + +fi + +echo "$ac_t""$ac_cv_os_cray" 1>&6 +if test $ac_cv_os_cray = yes; then +for ac_func in _getb67 GETB67 getb67; do + echo $ac_n "checking for $ac_func""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_func_$ac_func'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char $ac_func(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. 
Some functions are actually named + something starting with __ and the normal name is an alias. */ +#if defined (__stub_$ac_func) || defined (__stub___$ac_func) +choke me +#else +$ac_func(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_$ac_func=yes" +else + rm -rf conftest* + eval "ac_cv_func_$ac_func=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'$ac_func`\" = yes"; then + echo "$ac_t""yes" 1>&6 + cat >> confdefs.h <&6 +fi + +done +fi + +echo $ac_n "checking stack direction for C alloca""... $ac_c" 1>&6 +if eval "test \"`echo '$''{'ac_cv_c_stack_direction'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + if test "$cross_compiling" = yes; then + ac_cv_c_stack_direction=0 +else +cat > conftest.$ac_ext < addr) ? 1 : -1; +} +main () +{ + exit (find_stack_direction() < 0); +} +EOF +eval $ac_link +if test -s conftest && (./conftest; exit) 2>/dev/null; then + ac_cv_c_stack_direction=1 +else + ac_cv_c_stack_direction=-1 +fi +fi +rm -fr conftest* +fi + +echo "$ac_t""$ac_cv_c_stack_direction" 1>&6 +cat >> confdefs.h <&6 +if eval "test \"`echo '$''{'ac_cv_func_$ac_func'+set}'`\" = set"; then + echo $ac_n "(cached) $ac_c" 1>&6 +else + cat > conftest.$ac_ext < +/* Override any gcc2 internal prototype to avoid an error. */ +char $ac_func(); + +int main() { return 0; } +int t() { + +/* The GNU C library defines this for functions which it implements + to always fail with ENOSYS. Some functions are actually named + something starting with __ and the normal name is an alias. 
*/ +#if defined (__stub_$ac_func) || defined (__stub___$ac_func) +choke me +#else +$ac_func(); +#endif + +; return 0; } +EOF +if eval $ac_link; then + rm -rf conftest* + eval "ac_cv_func_$ac_func=yes" +else + rm -rf conftest* + eval "ac_cv_func_$ac_func=no" +fi +rm -f conftest* + +fi +if eval "test \"`echo '$ac_cv_func_'$ac_func`\" = yes"; then + echo "$ac_t""yes" 1>&6 + ac_tr_func=HAVE_`echo $ac_func | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'` + cat >> confdefs.h <&6 +fi +done + + +trap '' 1 2 15 +cat > confcache <<\EOF +# This file is a shell script that caches the results of configure +# tests run on this system so they can be shared between configure +# scripts and configure runs. It is not useful on other systems. +# If it contains results you don't want to keep, you may remove or edit it. +# +# By default, configure uses ./config.cache as the cache file, +# creating it if it does not exist already. You can give configure +# the --cache-file=FILE option to use a different cache file; that is +# what configure does when it calls configure scripts in +# subdirectories, so they share the cache. +# Giving --cache-file=/dev/null disables caching, for debugging configure. +# config.status only pays attention to the cache file if you give it the +# --recheck option to rerun configure. +# +EOF +# Ultrix sh set writes to stderr and can't be redirected directly, +# and sets the high bit in the cache file unless we assign to the vars. +(set) 2>&1 | + sed -n "s/^\([a-zA-Z0-9_]*_cv_[a-zA-Z0-9_]*\)=\(.*\)/\1=\${\1='\2'}/p" \ + >> confcache +if cmp -s $cache_file confcache; then + : +else + if test -w $cache_file; then + echo "updating cache $cache_file" + cat confcache > $cache_file + else + echo "not updating unwritable cache $cache_file" + fi +fi +rm -f confcache + +trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15 + +test "x$prefix" = xNONE && prefix=$ac_default_prefix +# Let make expand exec_prefix. 
+test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' + +# Any assignment to VPATH causes Sun make to only execute +# the first set of double-colon rules, so remove it if not needed. +# If there is a colon in the path, we need to keep it. +if test "x$srcdir" = x.; then + ac_vpsub='/^[ ]*VPATH[ ]*=[^:]*$/d' +fi + +trap 'rm -f $CONFIG_STATUS conftest*; exit 1' 1 2 15 + +# Transform confdefs.h into DEFS. +# Protect against shell expansion while executing Makefile rules. +# Protect against Makefile macro expansion. +cat > conftest.defs <<\EOF +s%#define \([A-Za-z_][A-Za-z0-9_]*\) \(.*\)%-D\1=\2%g +s%[ `~#$^&*(){}\\|;'"<>?]%\\&%g +s%\[%\\&%g +s%\]%\\&%g +s%\$%$$%g +EOF +DEFS=`sed -f conftest.defs confdefs.h | tr '\012' ' '` +rm -f conftest.defs + + +# Without the "./", some shells look in PATH for config.status. +: ${CONFIG_STATUS=./config.status} + +echo creating $CONFIG_STATUS +rm -f $CONFIG_STATUS +cat > $CONFIG_STATUS </dev/null | sed 1q`: +# +# $0 $ac_configure_args +# +# Compiler output produced by configure, useful for debugging +# configure, is in ./config.log if it exists. 
+ +ac_cs_usage="Usage: $CONFIG_STATUS [--recheck] [--version] [--help]" +for ac_option +do + case "\$ac_option" in + -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) + echo "running \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion" + exec \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion ;; + -version | --version | --versio | --versi | --vers | --ver | --ve | --v) + echo "$CONFIG_STATUS generated by autoconf version 2.7" + exit 0 ;; + -help | --help | --hel | --he | --h) + echo "\$ac_cs_usage"; exit 0 ;; + *) echo "\$ac_cs_usage"; exit 1 ;; + esac +done + +ac_given_srcdir=$srcdir +ac_given_INSTALL="$INSTALL" + +trap 'rm -fr `echo "Makefile" | sed "s/:[^ ]*//g"` conftest*; exit 1' 1 2 15 +EOF +cat >> $CONFIG_STATUS < conftest.subs <<\\CEOF +$ac_vpsub +$extrasub +s%@CFLAGS@%$CFLAGS%g +s%@CPPFLAGS@%$CPPFLAGS%g +s%@CXXFLAGS@%$CXXFLAGS%g +s%@DEFS@%$DEFS%g +s%@LDFLAGS@%$LDFLAGS%g +s%@LIBS@%$LIBS%g +s%@exec_prefix@%$exec_prefix%g +s%@prefix@%$prefix%g +s%@program_transform_name@%$program_transform_name%g +s%@bindir@%$bindir%g +s%@sbindir@%$sbindir%g +s%@libexecdir@%$libexecdir%g +s%@datadir@%$datadir%g +s%@sysconfdir@%$sysconfdir%g +s%@sharedstatedir@%$sharedstatedir%g +s%@localstatedir@%$localstatedir%g +s%@libdir@%$libdir%g +s%@includedir@%$includedir%g +s%@oldincludedir@%$oldincludedir%g +s%@infodir@%$infodir%g +s%@mandir@%$mandir%g +s%@CC@%$CC%g +s%@INSTALL_PROGRAM@%$INSTALL_PROGRAM%g +s%@INSTALL_DATA@%$INSTALL_DATA%g +s%@CPP@%$CPP%g +s%@ALLOCA@%$ALLOCA%g + +CEOF +EOF +cat >> $CONFIG_STATUS <> $CONFIG_STATUS <<\EOF +for ac_file in .. $CONFIG_FILES; do if test "x$ac_file" != x..; then + # Support "outfile[:infile]", defaulting infile="outfile.in". + case "$ac_file" in + *:*) ac_file_in=`echo "$ac_file"|sed 's%.*:%%'` + ac_file=`echo "$ac_file"|sed 's%:.*%%'` ;; + *) ac_file_in="${ac_file}.in" ;; + esac + + # Adjust relative srcdir, etc. for subdirectories. 
+ + # Remove last slash and all that follows it. Not all systems have dirname. + ac_dir=`echo $ac_file|sed 's%/[^/][^/]*$%%'` + if test "$ac_dir" != "$ac_file" && test "$ac_dir" != .; then + # The file is in a subdirectory. + test ! -d "$ac_dir" && mkdir "$ac_dir" + ac_dir_suffix="/`echo $ac_dir|sed 's%^\./%%'`" + # A "../" for each directory in $ac_dir_suffix. + ac_dots=`echo $ac_dir_suffix|sed 's%/[^/]*%../%g'` + else + ac_dir_suffix= ac_dots= + fi + + case "$ac_given_srcdir" in + .) srcdir=. + if test -z "$ac_dots"; then top_srcdir=. + else top_srcdir=`echo $ac_dots|sed 's%/$%%'`; fi ;; + /*) srcdir="$ac_given_srcdir$ac_dir_suffix"; top_srcdir="$ac_given_srcdir" ;; + *) # Relative path. + srcdir="$ac_dots$ac_given_srcdir$ac_dir_suffix" + top_srcdir="$ac_dots$ac_given_srcdir" ;; + esac + + case "$ac_given_INSTALL" in + [/$]*) INSTALL="$ac_given_INSTALL" ;; + *) INSTALL="$ac_dots$ac_given_INSTALL" ;; + esac + echo creating "$ac_file" + rm -f "$ac_file" + configure_input="Generated automatically from `echo $ac_file_in|sed 's%.*/%%'` by configure." 
+ case "$ac_file" in + *Makefile*) ac_comsub="1i\\ +# $configure_input" ;; + *) ac_comsub= ;; + esac + sed -e "$ac_comsub +s%@configure_input@%$configure_input%g +s%@srcdir@%$srcdir%g +s%@top_srcdir@%$top_srcdir%g +s%@INSTALL@%$INSTALL%g +" -f conftest.subs $ac_given_srcdir/$ac_file_in > $ac_file +fi; done +rm -f conftest.subs + + + +exit 0 +EOF +chmod +x $CONFIG_STATUS +rm -fr confdefs* $ac_clean_files +test "$no_create" = yes || ${CONFIG_SHELL-/bin/sh} $CONFIG_STATUS || exit 1 + diff --git a/contrib/bison/configure.bat b/contrib/bison/configure.bat new file mode 100644 index 000000000000..f92b00ad56a8 --- /dev/null +++ b/contrib/bison/configure.bat @@ -0,0 +1,28 @@ +@echo off +echo Configuring bison for go32 +rem This batch file assumes a unix-type "sed" program + +echo # Makefile generated by "configure.bat"> Makefile +echo all.dos : bison >> Makefile + +if exist config.sed del config.sed + +echo "s/@srcdir@/./g ">> config.sed +echo "s/@CC@/gcc/g ">> config.sed +echo "s/@INSTALL@//g ">> config.sed +echo "s/@INSTALL_PROGRAM@//g ">> config.sed +echo "s/@INSTALL_DATA@//g ">> config.sed +echo "s/@DEFS@/-DHAVE_STRERROR/g ">> config.sed +echo "s/@LIBS@//g ">> config.sed +echo "s/@ALLOCA@//g ">> config.sed + +echo "/^bison[ ]*:/,/-o/ { ">> config.sed +echo " s/ \$(CC)/ >bison.rf/ ">> config.sed +echo " /-o/ a\ ">> config.sed +echo " $(CC) @bison.rf ">> config.sed +echo "} ">> config.sed + +sed -e "s/^\"//" -e "s/\"$//" -e "s/[ ]*$//" config.sed > config2.sed +sed -f config2.sed Makefile.in >> Makefile +del config.sed +del config2.sed diff --git a/contrib/bison/configure.in b/contrib/bison/configure.in new file mode 100644 index 000000000000..4456254d9ea7 --- /dev/null +++ b/contrib/bison/configure.in @@ -0,0 +1,22 @@ +dnl Process this file with autoconf to produce a configure script. +AC_INIT(reduce.c) + +dnl Checks for programs. +AC_PROG_CC +AC_PROG_INSTALL + +AC_MINIX +AC_ISC_POSIX + +dnl Checks for header files. 
+AC_HEADER_STDC +AC_CHECK_HEADERS(string.h stdlib.h memory.h) + +dnl Checks for typedefs, structures, and compiler characteristics. +AC_C_CONST + +dnl Checks for library functions. +AC_FUNC_ALLOCA +AC_CHECK_FUNCS(strerror) + +AC_OUTPUT(Makefile) diff --git a/contrib/bison/conflicts.c b/contrib/bison/conflicts.c new file mode 100644 index 000000000000..29656f320782 --- /dev/null +++ b/contrib/bison/conflicts.c @@ -0,0 +1,753 @@ +/* Find and resolve or report look-ahead conflicts for bison, + Copyright (C) 1984, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + +#include <stdio.h> +#include "system.h" +#include "machine.h" +#include "new.h" +#include "files.h" +#include "gram.h" +#include "state.h" + + +extern char **tags; +extern int tokensetsize; +extern char *consistent; +extern short *accessing_symbol; +extern shifts **shift_table; +extern unsigned *LA; +extern short *LAruleno; +extern short *lookaheads; +extern int verboseflag; + +void set_conflicts(); +void resolve_sr_conflict(); +void flush_shift(); +void log_resolution(); +void total_conflicts(); +void count_sr_conflicts(); +void count_rr_conflicts(); + +char any_conflicts; +char *conflicts; +errs **err_table; +int expected_conflicts; + + +static unsigned *shiftset; +static unsigned *lookaheadset; +static int src_total; +static int rrc_total; +static int src_count; +static int rrc_count; + + +void +initialize_conflicts() +{ + register int i; +/* register errs *sp; JF unused */ + + conflicts = NEW2(nstates, char); + shiftset = NEW2(tokensetsize, unsigned); + lookaheadset = NEW2(tokensetsize, unsigned); + + err_table = NEW2(nstates, errs *); + + any_conflicts = 0; + + for (i = 0; i < nstates; i++) + set_conflicts(i); +} + + +void +set_conflicts(state) +int state; +{ + register int i; + register int k; + register shifts *shiftp; + register unsigned *fp2; + register unsigned *fp3; + register unsigned *fp4; + register unsigned *fp1; + register int symbol; + + if (consistent[state]) return; + + for (i = 0; i < tokensetsize; i++) + lookaheadset[i] = 0; + + shiftp = shift_table[state]; + if (shiftp) + { + k = shiftp->nshifts; + for (i = 0; i < k; i++) + { + symbol = accessing_symbol[shiftp->shifts[i]]; + if (ISVAR(symbol)) break; + SETBIT(lookaheadset, symbol); + } + } + + k = lookaheads[state + 1]; + fp4 = lookaheadset + tokensetsize; + + /* loop over all rules which require lookahead in this state */ + /* first check for shift-reduce conflict, and try to resolve using precedence */ + + for (i = lookaheads[state]; i < k; i++) + if (rprec[LAruleno[i]]) + { + fp1 = LA + i *
tokensetsize; + fp2 = fp1; + fp3 = lookaheadset; + + while (fp3 < fp4) + { + if (*fp2++ & *fp3++) + { + resolve_sr_conflict(state, i); + break; + } + } + } + + /* loop over all rules which require lookahead in this state */ + /* Check for conflicts not resolved above. */ + + for (i = lookaheads[state]; i < k; i++) + { + fp1 = LA + i * tokensetsize; + fp2 = fp1; + fp3 = lookaheadset; + + while (fp3 < fp4) + { + if (*fp2++ & *fp3++) + { + conflicts[state] = 1; + any_conflicts = 1; + } + } + + fp2 = fp1; + fp3 = lookaheadset; + + while (fp3 < fp4) + *fp3++ |= *fp2++; + } +} + + + +/* Attempt to resolve shift-reduce conflict for one rule +by means of precedence declarations. +It has already been checked that the rule has a precedence. +A conflict is resolved by modifying the shift or reduce tables +so that there is no longer a conflict. */ + +void +resolve_sr_conflict(state, lookaheadnum) +int state; +int lookaheadnum; +{ + register int i; + register int mask; + register unsigned *fp1; + register unsigned *fp2; + register int redprec; + errs *errp = (errs *) xmalloc (sizeof(errs) + ntokens * sizeof(short)); + short *errtokens = errp->errs; + + /* find the rule to reduce by to get precedence of reduction */ + redprec = rprec[LAruleno[lookaheadnum]]; + + mask = 1; + fp1 = LA + lookaheadnum * tokensetsize; + fp2 = lookaheadset; + for (i = 0; i < ntokens; i++) + { + if ((mask & *fp2 & *fp1) && sprec[i]) + /* Shift-reduce conflict occurs for token number i + and it has a precedence. + The precedence of shifting is that of token i. */ + { + if (sprec[i] < redprec) + { + if (verboseflag) log_resolution(state, lookaheadnum, i, "reduce"); + *fp2 &= ~mask; /* flush the shift for this token */ + flush_shift(state, i); + } + else if (sprec[i] > redprec) + { + if (verboseflag) log_resolution(state, lookaheadnum, i, "shift"); + *fp1 &= ~mask; /* flush the reduce for this token */ + } + else + { + /* Matching precedence levels. + For left association, keep only the reduction. 
+ For right association, keep only the shift. + For nonassociation, keep neither. */ + + switch (sassoc[i]) + { + + case RIGHT_ASSOC: + if (verboseflag) log_resolution(state, lookaheadnum, i, "shift"); + break; + + case LEFT_ASSOC: + if (verboseflag) log_resolution(state, lookaheadnum, i, "reduce"); + break; + + case NON_ASSOC: + if (verboseflag) log_resolution(state, lookaheadnum, i, "an error"); + break; + } + + if (sassoc[i] != RIGHT_ASSOC) + { + *fp2 &= ~mask; /* flush the shift for this token */ + flush_shift(state, i); + } + if (sassoc[i] != LEFT_ASSOC) + { + *fp1 &= ~mask; /* flush the reduce for this token */ + } + if (sassoc[i] == NON_ASSOC) + { + /* Record an explicit error for this token. */ + *errtokens++ = i; + } + } + } + + mask <<= 1; + if (mask == 0) + { + mask = 1; + fp2++; fp1++; + } + } + errp->nerrs = errtokens - errp->errs; + if (errp->nerrs) + { + /* Some tokens have been explicitly made errors. Allocate + a permanent errs structure for this state, to record them. */ + i = (char *) errtokens - (char *) errp; + err_table[state] = (errs *) xmalloc ((unsigned int)i); + bcopy (errp, err_table[state], i); + } + else + err_table[state] = 0; + free(errp); +} + + + +/* turn off the shift recorded for the specified token in the specified state. +Used when we resolve a shift-reduce conflict in favor of the reduction. 
*/ + +void +flush_shift(state, token) +int state; +int token; +{ + register shifts *shiftp; + register int k, i; +/* register unsigned symbol; JF unused */ + + shiftp = shift_table[state]; + + if (shiftp) + { + k = shiftp->nshifts; + for (i = 0; i < k; i++) + { + if (shiftp->shifts[i] && token == accessing_symbol[shiftp->shifts[i]]) + (shiftp->shifts[i]) = 0; + } + } +} + + +void +log_resolution(state, LAno, token, resolution) +int state, LAno, token; +char *resolution; +{ + fprintf(foutput, + "Conflict in state %d between rule %d and token %s resolved as %s.\n", + state, LAruleno[LAno], tags[token], resolution); +} + + +void +conflict_log() +{ + register int i; + + src_total = 0; + rrc_total = 0; + + for (i = 0; i < nstates; i++) + { + if (conflicts[i]) + { + count_sr_conflicts(i); + count_rr_conflicts(i); + src_total += src_count; + rrc_total += rrc_count; + } + } + + total_conflicts(); +} + + +void +verbose_conflict_log() +{ + register int i; + + src_total = 0; + rrc_total = 0; + + for (i = 0; i < nstates; i++) + { + if (conflicts[i]) + { + count_sr_conflicts(i); + count_rr_conflicts(i); + src_total += src_count; + rrc_total += rrc_count; + + fprintf(foutput, "State %d contains", i); + + if (src_count == 1) + fprintf(foutput, " 1 shift/reduce conflict"); + else if (src_count > 1) + fprintf(foutput, " %d shift/reduce conflicts", src_count); + + if (src_count > 0 && rrc_count > 0) + fprintf(foutput, " and"); + + if (rrc_count == 1) + fprintf(foutput, " 1 reduce/reduce conflict"); + else if (rrc_count > 1) + fprintf(foutput, " %d reduce/reduce conflicts", rrc_count); + + putc('.', foutput); + putc('\n', foutput); + } + } + + total_conflicts(); +} + + +void +total_conflicts() +{ + extern int fixed_outfiles; + + if (src_total == expected_conflicts && rrc_total == 0) + return; + + if (fixed_outfiles) + { + /* If invoked under the name `yacc', use the output format + specified by POSIX. 
*/ + fprintf(stderr, "conflicts: "); + if (src_total > 0) + fprintf(stderr, " %d shift/reduce", src_total); + if (src_total > 0 && rrc_total > 0) + fprintf(stderr, ","); + if (rrc_total > 0) + fprintf(stderr, " %d reduce/reduce", rrc_total); + putc('\n', stderr); + } + else + { + fprintf(stderr, "%s contains", infile); + + if (src_total == 1) + fprintf(stderr, " 1 shift/reduce conflict"); + else if (src_total > 1) + fprintf(stderr, " %d shift/reduce conflicts", src_total); + + if (src_total > 0 && rrc_total > 0) + fprintf(stderr, " and"); + + if (rrc_total == 1) + fprintf(stderr, " 1 reduce/reduce conflict"); + else if (rrc_total > 1) + fprintf(stderr, " %d reduce/reduce conflicts", rrc_total); + + putc('.', stderr); + putc('\n', stderr); + } +} + + +void +count_sr_conflicts(state) +int state; +{ + register int i; + register int k; + register int mask; + register shifts *shiftp; + register unsigned *fp1; + register unsigned *fp2; + register unsigned *fp3; + register int symbol; + + src_count = 0; + + shiftp = shift_table[state]; + if (!shiftp) return; + + for (i = 0; i < tokensetsize; i++) + { + shiftset[i] = 0; + lookaheadset[i] = 0; + } + + k = shiftp->nshifts; + for (i = 0; i < k; i++) + { + if (! 
shiftp->shifts[i]) continue; + symbol = accessing_symbol[shiftp->shifts[i]]; + if (ISVAR(symbol)) break; + SETBIT(shiftset, symbol); + } + + k = lookaheads[state + 1]; + fp3 = lookaheadset + tokensetsize; + + for (i = lookaheads[state]; i < k; i++) + { + fp1 = LA + i * tokensetsize; + fp2 = lookaheadset; + + while (fp2 < fp3) + *fp2++ |= *fp1++; + } + + fp1 = shiftset; + fp2 = lookaheadset; + + while (fp2 < fp3) + *fp2++ &= *fp1++; + + mask = 1; + fp2 = lookaheadset; + for (i = 0; i < ntokens; i++) + { + if (mask & *fp2) + src_count++; + + mask <<= 1; + if (mask == 0) + { + mask = 1; + fp2++; + } + } +} + + +void +count_rr_conflicts(state) +int state; +{ + register int i; + register int j; + register int count; + register unsigned mask; + register unsigned *baseword; + register unsigned *wordp; + register int m; + register int n; + + rrc_count = 0; + + m = lookaheads[state]; + n = lookaheads[state + 1]; + + if (n - m < 2) return; + + mask = 1; + baseword = LA + m * tokensetsize; + for (i = 0; i < ntokens; i++) + { + wordp = baseword; + + count = 0; + for (j = m; j < n; j++) + { + if (mask & *wordp) + count++; + + wordp += tokensetsize; + } + + if (count >= 2) rrc_count++; + + mask <<= 1; + if (mask == 0) + { + mask = 1; + baseword++; + } + } +} + + +void +print_reductions(state) +int state; +{ + register int i; + register int j; + register int k; + register unsigned *fp1; + register unsigned *fp2; + register unsigned *fp3; + register unsigned *fp4; + register int rule; + register int symbol; + register unsigned mask; + register int m; + register int n; + register int default_LA; + register int default_rule; + register int cmax; + register int count; + register shifts *shiftp; + register errs *errp; + int nodefault = 0; + + for (i = 0; i < tokensetsize; i++) + shiftset[i] = 0; + + shiftp = shift_table[state]; + if (shiftp) + { + k = shiftp->nshifts; + for (i = 0; i < k; i++) + { + if (! 
shiftp->shifts[i]) continue; + symbol = accessing_symbol[shiftp->shifts[i]]; + if (ISVAR(symbol)) break; + /* if this state has a shift for the error token, + don't use a default rule. */ + if (symbol == error_token_number) nodefault = 1; + SETBIT(shiftset, symbol); + } + } + + errp = err_table[state]; + if (errp) + { + k = errp->nerrs; + for (i = 0; i < k; i++) + { + if (! errp->errs[i]) continue; + symbol = errp->errs[i]; + SETBIT(shiftset, symbol); + } + } + + m = lookaheads[state]; + n = lookaheads[state + 1]; + + if (n - m == 1 && ! nodefault) + { + default_rule = LAruleno[m]; + + fp1 = LA + m * tokensetsize; + fp2 = shiftset; + fp3 = lookaheadset; + fp4 = lookaheadset + tokensetsize; + + while (fp3 < fp4) + *fp3++ = *fp1++ & *fp2++; + + mask = 1; + fp3 = lookaheadset; + + for (i = 0; i < ntokens; i++) + { + if (mask & *fp3) + fprintf(foutput, " %-4s\t[reduce using rule %d (%s)]\n", + tags[i], default_rule, tags[rlhs[default_rule]]); + + mask <<= 1; + if (mask == 0) + { + mask = 1; + fp3++; + } + } + + fprintf(foutput, " $default\treduce using rule %d (%s)\n\n", + default_rule, tags[rlhs[default_rule]]); + } + else if (n - m >= 1) + { + cmax = 0; + default_LA = -1; + fp4 = lookaheadset + tokensetsize; + + if (! nodefault) + for (i = m; i < n; i++) + { + fp1 = LA + i * tokensetsize; + fp2 = shiftset; + fp3 = lookaheadset; + + while (fp3 < fp4) + *fp3++ = *fp1++ & (~(*fp2++)); + + count = 0; + mask = 1; + fp3 = lookaheadset; + for (j = 0; j < ntokens; j++) + { + if (mask & *fp3) + count++; + + mask <<= 1; + if (mask == 0) + { + mask = 1; + fp3++; + } + } + + if (count > cmax) + { + cmax = count; + default_LA = i; + default_rule = LAruleno[i]; + } + + fp2 = shiftset; + fp3 = lookaheadset; + + while (fp3 < fp4) + *fp2++ |= *fp3++; + } + + for (i = 0; i < tokensetsize; i++) + shiftset[i] = 0; + + if (shiftp) + { + k = shiftp->nshifts; + for (i = 0; i < k; i++) + { + if (! 
shiftp->shifts[i]) continue; + symbol = accessing_symbol[shiftp->shifts[i]]; + if (ISVAR(symbol)) break; + SETBIT(shiftset, symbol); + } + } + + mask = 1; + fp1 = LA + m * tokensetsize; + fp2 = shiftset; + for (i = 0; i < ntokens; i++) + { + int defaulted = 0; + + if (mask & *fp2) + count = 1; + else + count = 0; + + fp3 = fp1; + for (j = m; j < n; j++) + { + if (mask & *fp3) + { + if (count == 0) + { + if (j != default_LA) + { + rule = LAruleno[j]; + fprintf(foutput, " %-4s\treduce using rule %d (%s)\n", + tags[i], rule, tags[rlhs[rule]]); + } + else defaulted = 1; + + count++; + } + else + { + if (defaulted) + { + rule = LAruleno[default_LA]; + fprintf(foutput, " %-4s\treduce using rule %d (%s)\n", + tags[i], rule, tags[rlhs[rule]]); + defaulted = 0; + } + rule = LAruleno[j]; + fprintf(foutput, " %-4s\t[reduce using rule %d (%s)]\n", + tags[i], rule, tags[rlhs[rule]]); + } + } + + fp3 += tokensetsize; + } + + mask <<= 1; + if (mask == 0) + { + mask = 1; + /* We tried incrementing just fp1, and just fp2; both seem wrong. + It seems necessary to increment both in sync. */ + fp1++; + fp2++; + } + } + + if (default_LA >= 0) + { + fprintf(foutput, " $default\treduce using rule %d (%s)\n", + default_rule, tags[rlhs[default_rule]]); + } + + putc('\n', foutput); + } +} + + +void +finalize_conflicts() +{ + FREE(conflicts); + FREE(shiftset); + FREE(lookaheadset); +} diff --git a/contrib/bison/derives.c b/contrib/bison/derives.c new file mode 100644 index 000000000000..f7dfaf736e13 --- /dev/null +++ b/contrib/bison/derives.c @@ -0,0 +1,118 @@ +/* Match rules with nonterminals for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. 
+ +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* set_derives finds, for each variable (nonterminal), which rules can derive it. + It sets up the value of derives so that + derives[i - ntokens] points to a vector of rule numbers, + terminated with -1. */ + +#include <stdio.h> +#include "system.h" +#include "new.h" +#include "types.h" +#include "gram.h" + + +short **derives; + +void +set_derives() +{ + register int i; + register int lhs; + register shorts *p; + register short *q; + register shorts **dset; + register shorts *delts; + + dset = NEW2(nvars, shorts *) - ntokens; + delts = NEW2(nrules + 1, shorts); + + p = delts; + for (i = nrules; i > 0; i--) + { + lhs = rlhs[i]; + if (lhs >= 0) + { + p->next = dset[lhs]; + p->value = i; + dset[lhs] = p; + p++; + } + } + + derives = NEW2(nvars, short *) - ntokens; + q = NEW2(nvars + nrules, short); + + for (i = ntokens; i < nsyms; i++) + { + derives[i] = q; + p = dset[i]; + while (p) + { + *q++ = p->value; + p = p->next; + } + *q++ = -1; + } + +#ifdef DEBUG + print_derives(); +#endif + + FREE(dset + ntokens); + FREE(delts); +} + +void +free_derives() +{ + FREE(derives[ntokens]); + FREE(derives + ntokens); +} + + + +#ifdef DEBUG + +print_derives() +{ + register int i; + register short *sp; + + extern char **tags; + + printf("\n\n\nDERIVES\n\n"); + + for (i = ntokens; i < nsyms; i++) + { + printf("%s derives", tags[i]); + for (sp = derives[i]; *sp > 0; sp++) + { + printf(" %d", *sp); + } + putchar('\n'); + } + + putchar('\n'); +} + +#endif + diff --git a/contrib/bison/files.c b/contrib/bison/files.c new file mode 100644 index
000000000000..e47b0eeb6d58 --- /dev/null +++ b/contrib/bison/files.c @@ -0,0 +1,414 @@ +/* Open and close files for bison, + Copyright (C) 1984, 1986, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#if defined (VMS) & !defined (__VMS_POSIX) +#include <ssdef.h> +#define unlink delete +#ifndef XPFILE +#define XPFILE "GNU_BISON:[000000]BISON.SIMPLE" +#endif +#ifndef XPFILE1 +#define XPFILE1 "GNU_BISON:[000000]BISON.HAIRY" +#endif +#endif + +#include <stdio.h> +#include "system.h" +#include "files.h" +#include "new.h" +#include "gram.h" + +FILE *finput = NULL; +FILE *foutput = NULL; +FILE *fdefines = NULL; +FILE *ftable = NULL; +FILE *fattrs = NULL; +FILE *fguard = NULL; +FILE *faction = NULL; +FILE *fparser = NULL; + +/* File name specified with -o for the output file, or 0 if no -o.
*/ +char *spec_outfile; + +char *infile; +char *outfile; +char *defsfile; +char *tabfile; +char *attrsfile; +char *guardfile; +char *actfile; +char *tmpattrsfile; +char *tmptabfile; +char *tmpdefsfile; + +extern int noparserflag; + +extern char *mktemp(); /* So the compiler won't complain */ +extern char *getenv(); +extern void perror(); +FILE *tryopen(); /* This might be a good idea */ +void done(); + +extern char *program_name; +extern int verboseflag; +extern int definesflag; +int fixed_outfiles = 0; + + +char* +stringappend(string1, end1, string2) +char *string1; +int end1; +char *string2; +{ + register char *ostring; + register char *cp, *cp1; + register int i; + + cp = string2; i = 0; + while (*cp++) i++; + + ostring = NEW2(i+end1+1, char); + + cp = ostring; + cp1 = string1; + for (i = 0; i < end1; i++) + *cp++ = *cp1++; + + cp1 = string2; + while (*cp++ = *cp1++) ; + + return ostring; +} + + +/* JF this has been hacked to death. Nowaday it sets up the file names for + the output files, and opens the tmp files and the parser */ +void +openfiles() +{ + char *name_base; + register char *cp; + char *filename; + int base_length; + int short_base_length; + +#if defined (VMS) & !defined (__VMS_POSIX) + char *tmp_base = "sys$scratch:b_"; +#else + char *tmp_base = "/tmp/b."; +#endif + int tmp_len; + +#ifdef MSDOS + tmp_base = getenv ("TMP"); + if (tmp_base == 0) + tmp_base = ""; + strlwr (infile); +#endif /* MSDOS */ + + tmp_len = strlen (tmp_base); + + if (spec_outfile) + { + /* -o was specified. The precise -o name will be used for ftable. + For other output files, remove the ".c" or ".tab.c" suffix. */ + name_base = spec_outfile; +#ifdef MSDOS + strlwr (name_base); +#endif /* MSDOS */ + /* BASE_LENGTH includes ".tab" but not ".c". */ + base_length = strlen (name_base); + if (!strcmp (name_base + base_length - 2, ".c")) + base_length -= 2; + /* SHORT_BASE_LENGTH includes neither ".tab" nor ".c". 
*/ + short_base_length = base_length; + if (!strncmp (name_base + short_base_length - 4, ".tab", 4)) + short_base_length -= 4; + else if (!strncmp (name_base + short_base_length - 4, "_tab", 4)) + short_base_length -= 4; + } + else if (spec_file_prefix) + { + /* -b was specified. Construct names from it. */ + /* SHORT_BASE_LENGTH includes neither ".tab" nor ".c". */ + short_base_length = strlen (spec_file_prefix); + /* Count room for `.tab'. */ + base_length = short_base_length + 4; + name_base = (char *) xmalloc (base_length + 1); + /* Append `.tab'. */ + strcpy (name_base, spec_file_prefix); +#ifdef VMS + strcat (name_base, "_tab"); +#else + strcat (name_base, ".tab"); +#endif +#ifdef MSDOS + strlwr (name_base); +#endif /* MSDOS */ + } + else + { + /* -o was not specified; compute output file name from input + or use y.tab.c, etc., if -y was specified. */ + + name_base = fixed_outfiles ? "y.y" : infile; + + /* BASE_LENGTH gets length of NAME_BASE, sans ".y" suffix if any. */ + + base_length = strlen (name_base); + if (!strcmp (name_base + base_length - 2, ".y")) + base_length -= 2; + short_base_length = base_length; + +#ifdef VMS + name_base = stringappend(name_base, short_base_length, "_tab"); +#else +#ifdef MSDOS + name_base = stringappend(name_base, short_base_length, "_tab"); +#else + name_base = stringappend(name_base, short_base_length, ".tab"); +#endif /* not MSDOS */ +#endif + base_length = short_base_length + 4; + } + + finput = tryopen(infile, "r"); + + if (! noparserflag) + { + filename = getenv("BISON_SIMPLE"); +#ifdef MSDOS + /* File doesn't exist in current directory; try in INIT directory. */ + cp = getenv("INIT"); + if (filename == 0 && cp != NULL) + { + filename = xmalloc(strlen(cp) + strlen(PFILE) + 2); + strcpy(filename, cp); + cp = filename + strlen(filename); + *cp++ = '/'; + strcpy(cp, PFILE); + } +#endif /* MSDOS */ + fparser = tryopen(filename ? 
filename : PFILE, "r"); + } + + if (verboseflag) + { +#ifdef MSDOS + outfile = stringappend(name_base, short_base_length, ".out"); +#else + /* We used to use just .out if spec_name_prefix (-p) was used, + but that conflicts with Posix. */ + outfile = stringappend(name_base, short_base_length, ".output"); +#endif + foutput = tryopen(outfile, "w"); + } + + if (noparserflag) + { + /* use permanent name for actions file */ + actfile = stringappend(name_base, short_base_length, ".act"); + faction = tryopen(actfile, "w"); + } + +#ifdef MSDOS + if (! noparserflag) + actfile = mktemp(stringappend(tmp_base, tmp_len, "acXXXXXX")); + tmpattrsfile = mktemp(stringappend(tmp_base, tmp_len, "atXXXXXX")); + tmptabfile = mktemp(stringappend(tmp_base, tmp_len, "taXXXXXX")); + tmpdefsfile = mktemp(stringappend(tmp_base, tmp_len, "deXXXXXX")); +#else + if (! noparserflag) + actfile = mktemp(stringappend(tmp_base, tmp_len, "act.XXXXXX")); + tmpattrsfile = mktemp(stringappend(tmp_base, tmp_len, "attrs.XXXXXX")); + tmptabfile = mktemp(stringappend(tmp_base, tmp_len, "tab.XXXXXX")); + tmpdefsfile = mktemp(stringappend(tmp_base, tmp_len, "defs.XXXXXX")); +#endif /* not MSDOS */ + + if (! noparserflag) + faction = tryopen(actfile, "w+"); + fattrs = tryopen(tmpattrsfile,"w+"); + ftable = tryopen(tmptabfile, "w+"); + + if (definesflag) + { + defsfile = stringappend(name_base, base_length, ".h"); + fdefines = tryopen(tmpdefsfile, "w+"); + } + +#ifndef MSDOS + if (! 
noparserflag) + unlink(actfile); + unlink(tmpattrsfile); + unlink(tmptabfile); + unlink(tmpdefsfile); +#endif + + /* These are opened by `done' or `open_extra_files', if at all */ + if (spec_outfile) + tabfile = spec_outfile; + else + tabfile = stringappend(name_base, base_length, ".c"); + +#ifdef VMS + attrsfile = stringappend(name_base, short_base_length, "_stype.h"); + guardfile = stringappend(name_base, short_base_length, "_guard.c"); +#else +#ifdef MSDOS + attrsfile = stringappend(name_base, short_base_length, ".sth"); + guardfile = stringappend(name_base, short_base_length, ".guc"); +#else + attrsfile = stringappend(name_base, short_base_length, ".stype.h"); + guardfile = stringappend(name_base, short_base_length, ".guard.c"); +#endif /* not MSDOS */ +#endif /* not VMS */ +} + + + +/* open the output files needed only for the semantic parser. +This is done when %semantic_parser is seen in the declarations section. */ + +void +open_extra_files() +{ + FILE *ftmp; + int c; + char *filename, *cp; + + if (fparser) + fclose(fparser); + + if (! noparserflag) + { + filename = (char *) getenv ("BISON_HAIRY"); +#ifdef MSDOS + /* File doesn't exist in current directory; try in INIT directory. */ + cp = getenv("INIT"); + if (filename == 0 && cp != NULL) + { + filename = xmalloc(strlen(cp) + strlen(PFILE1) + 2); + strcpy(filename, cp); + cp = filename + strlen(filename); + *cp++ = '/'; + strcpy(cp, PFILE1); + } +#endif + fparser= tryopen(filename ? filename : PFILE1, "r"); + } + + /* JF change from inline attrs file to separate one */ + ftmp = tryopen(attrsfile, "w"); + rewind(fattrs); + while((c=getc(fattrs))!=EOF) /* Thank god for buffering */ + putc(c,ftmp); + fclose(fattrs); + fattrs=ftmp; + + fguard = tryopen(guardfile, "w"); + +} + + /* JF to make file opening easier. This func tries to open file + NAME with mode MODE, and prints an error message if it fails. 
*/ +FILE * +tryopen(name, mode) +char *name; +char *mode; +{ + FILE *ptr; + + ptr = fopen(name, mode); + if (ptr == NULL) + { + fprintf(stderr, "%s: ", program_name); + perror(name); + done(2); + } + return ptr; +} + +void +done(k) +int k; +{ + if (faction) + fclose(faction); + + if (fattrs) + fclose(fattrs); + + if (fguard) + fclose(fguard); + + if (finput) + fclose(finput); + + if (fparser) + fclose(fparser); + + if (foutput) + fclose(foutput); + + /* JF write out the output file */ + if (k == 0 && ftable) + { + FILE *ftmp; + register int c; + + ftmp=tryopen(tabfile, "w"); + rewind(ftable); + while((c=getc(ftable)) != EOF) + putc(c,ftmp); + fclose(ftmp); + fclose(ftable); + + if (definesflag) + { + ftmp = tryopen(defsfile, "w"); + fflush(fdefines); + rewind(fdefines); + while((c=getc(fdefines)) != EOF) + putc(c,ftmp); + fclose(ftmp); + fclose(fdefines); + } + } + +#if defined (VMS) & !defined (__VMS_POSIX) + if (faction && ! noparserflag) + delete(actfile); + if (fattrs) + delete(tmpattrsfile); + if (ftable) + delete(tmptabfile); + if (k==0) sys$exit(SS$_NORMAL); + sys$exit(SS$_ABORT); +#else +#ifdef MSDOS + if (actfile && ! noparserflag) unlink(actfile); + if (tmpattrsfile) unlink(tmpattrsfile); + if (tmptabfile) unlink(tmptabfile); + if (tmpdefsfile) unlink(tmpdefsfile); +#endif /* MSDOS */ + exit(k); +#endif /* not VMS, or __VMS_POSIX */ +} diff --git a/contrib/bison/files.h b/contrib/bison/files.h new file mode 100644 index 000000000000..8d24890e5daf --- /dev/null +++ b/contrib/bison/files.h @@ -0,0 +1,52 @@ +/* File names and variables for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. 
+ +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* These two should be pathnames for opening the sample parser files. + When bison is installed, they should be absolute pathnames. + XPFILE and XPFILE1 normally come from the Makefile. */ + +#define PFILE XPFILE /* Simple parser */ +#define PFILE1 XPFILE1 /* Semantic parser */ + +extern FILE *finput; /* read grammar specifications */ +extern FILE *foutput; /* optionally output messages describing the actions taken */ +extern FILE *fdefines; /* optionally output #define's for token numbers. */ +extern FILE *ftable; /* output the tables and the parser */ +extern FILE *fattrs; /* if semantic parser, output a .h file that defines YYSTYPE */ + /* and also contains all the %{ ... %} definitions. */ +extern FILE *fguard; /* if semantic parser, output yyguard, containing all the guard code */ +extern FILE *faction; /* output all the action code; precise form depends on which parser */ +extern FILE *fparser; /* read the parser to copy into ftable */ + +/* File name specified with -o for the output file, or 0 if no -o. */ +extern char *spec_outfile; + +extern char *spec_name_prefix; /* for -p, from getargs.c */ + +/* File name pfx specified with -b, or 0 if no -b.
*/ +extern char *spec_file_prefix; + +extern char *infile; +extern char *outfile; +extern char *defsfile; +extern char *tabfile; +extern char *attrsfile; +extern char *guardfile; +extern char *actfile; diff --git a/contrib/bison/getargs.c b/contrib/bison/getargs.c new file mode 100644 index 000000000000..ce6d77962d28 --- /dev/null +++ b/contrib/bison/getargs.c @@ -0,0 +1,168 @@ +/* Parse command line arguments for bison, + Copyright (C) 1984, 1986, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#include <stdio.h> +#include "getopt.h" +#include "system.h" +#include "files.h" + +int verboseflag; +int definesflag; +int debugflag; +int nolinesflag; +int noparserflag = 0; +int toknumflag = 0; +int rawtoknumflag = 0; +char *spec_name_prefix; /* for -p. */ +char *spec_file_prefix; /* for -b.
*/ +extern int fixed_outfiles;/* for -y */ + +extern char *program_name; +extern char *version_string; + +extern void warns(); /* main.c */ + +struct option longopts[] = +{ + {"debug", 0, &debugflag, 1}, + {"defines", 0, &definesflag, 1}, + {"file-prefix", 1, 0, 'b'}, + {"fixed-output-files", 0, &fixed_outfiles, 1}, + {"help", 0, 0, 'h'}, + {"name-prefix", 1, 0, 'p'}, /* was 'a'; apparently unused -wjh */ + {"no-lines", 0, &nolinesflag, 1}, + {"no-parser", 0, &noparserflag, 1}, + {"output", 1, 0, 'o'}, + {"output-file", 1, 0, 'o'}, + {"raw", 0, &rawtoknumflag, 1}, + {"token-table", 0, &toknumflag, 1}, + {"verbose", 0, &verboseflag, 1}, + {"version", 0, 0, 'V'}, + {"yacc", 0, &fixed_outfiles, 1}, + {0, 0, 0, 0} +}; + +void +usage (stream) + FILE *stream; +{ + fprintf (stream, "\ +Usage: %s [-dhklntvyV] [-b file-prefix] [-o outfile] [-p name-prefix]\n\ + [--debug] [--defines] [--fixed-output-files] [--no-lines]\n\ + [--verbose] [--version] [--help] [--yacc]\n\ + [--no-parser] [--token-table]\n\ + [--file-prefix=prefix] [--name-prefix=prefix]\n\ + [--output=outfile] grammar-file\n", + program_name); +} + +void +getargs(argc, argv) + int argc; + char *argv[]; +{ + register int c; + + verboseflag = 0; + definesflag = 0; + debugflag = 0; + noparserflag = 0; + rawtoknumflag = 0; + toknumflag = 0; + fixed_outfiles = 0; + + while ((c = getopt_long (argc, argv, "yvdhrltknVo:b:p:", longopts, (int *)0)) + != EOF) + { + switch (c) + { + case 0: + /* Certain long options cause getopt_long to return 0. 
*/ + break; + + case 'y': + fixed_outfiles = 1; + break; + + case 'h': + usage (stdout); + exit (0); + + case 'V': + printf ("%s", version_string); + exit (0); + + case 'v': + verboseflag = 1; + break; + + case 'd': + definesflag = 1; + break; + + case 'l': + nolinesflag = 1; + break; + + case 'k': + toknumflag = 1; + break; + + case 'r': + rawtoknumflag = 1; + break; + + case 'n': + noparserflag = 1; + break; + + case 't': + debugflag = 1; + break; + + case 'o': + spec_outfile = optarg; + break; + + case 'b': + spec_file_prefix = optarg; + break; + + case 'p': + spec_name_prefix = optarg; + break; + + default: + usage (stderr); + exit (1); + } + } + + if (optind == argc) + { + fprintf(stderr, "%s: no grammar file given\n", program_name); + exit(1); + } + if (optind < argc - 1) + fprintf(stderr, "%s: extra arguments ignored after '%s'\n", + program_name, argv[optind]); + + infile = argv[optind]; +} diff --git a/contrib/bison/getopt.c b/contrib/bison/getopt.c new file mode 100644 index 000000000000..ab534de5e48a --- /dev/null +++ b/contrib/bison/getopt.c @@ -0,0 +1,813 @@ +/* Getopt for GNU. + NOTE: getopt is now part of the C library, so if you don't know what + "Keep this file name-space clean" means, talk to roland@gnu.ai.mit.edu + before changing it! + + Copyright (C) 1987, 88, 89, 90, 91, 92, 93, 94, 95, 1996 + Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 2, or (at your option) any + later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, + USA. */ + +/* This tells Alpha OSF/1 not to define a getopt prototype in <stdio.h>. + Ditto for AIX 3.2 and <stdlib.h>. */ +#ifndef _NO_PROTO +#define _NO_PROTO +#endif + +#ifdef HAVE_CONFIG_H +#include <config.h> +#endif + +#if !defined (__STDC__) || !__STDC__ +/* This is a separate conditional since some stdc systems + reject `defined (const)'. */ +#ifndef const +#define const +#endif +#endif + +#include <stdio.h> + +/* Comment out all this code if we are using the GNU C Library, and are not + actually compiling the library itself. This code is part of the GNU C + Library, but also included in many other GNU distributions. Compiling + and linking in this code is a waste when using the GNU C library + (especially if it is a shared library). Rather than having every GNU + program understand `configure --with-gnu-libc' and omit the object files, + it is simpler to just do this in the source for each such file. */ + +#if defined (_LIBC) || !defined (__GNU_LIBRARY__) + + +/* This needs to come after some library #include + to get __GNU_LIBRARY__ defined. */ +#ifdef __GNU_LIBRARY__ +/* Don't include stdlib.h for non-GNU C libraries because some of them + contain conflicting prototypes for getopt. */ +#include <stdlib.h> +#include <unistd.h> +#endif /* GNU C library. */ + +#ifndef _ +/* This is for other GNU distributions with internationalized messages. + When compiling libc, the _ macro is predefined. */ +#ifdef HAVE_LIBINTL_H +# include <libintl.h> +# define _(msgid) gettext (msgid) +#else +# define _(msgid) (msgid) +#endif +#endif + +/* This version of `getopt' appears to the caller like standard Unix `getopt' + but it behaves differently for the user, since it allows the user + to intersperse the options with the other arguments.
+ + As `getopt' works, it permutes the elements of ARGV so that, + when it is done, all the options precede everything else. Thus + all application programs are extended to handle flexible argument order. + + Setting the environment variable POSIXLY_CORRECT disables permutation. + Then the behavior is completely standard. + + GNU application programs can use a third alternative mode in which + they can distinguish the relative order of options and other arguments. */ + +#include "getopt.h" + +/* For communication from `getopt' to the caller. + When `getopt' finds an option that takes an argument, + the argument value is returned here. + Also, when `ordering' is RETURN_IN_ORDER, + each non-option ARGV-element is returned here. */ + +char *optarg = NULL; + +/* Index in ARGV of the next element to be scanned. + This is used for communication to and from the caller + and for communication between successive calls to `getopt'. + + On entry to `getopt', zero means this is the first call; initialize. + + When `getopt' returns EOF, this is the index of the first of the + non-option elements that the caller should itself scan. + + Otherwise, `optind' communicates from one call to the next + how much of ARGV has been scanned so far. */ + +/* XXX 1003.2 says this must be 1 before any call. */ +int optind = 0; + +/* The next char to be scanned in the option-element + in which the last option character we returned was found. + This allows us to pick up the scan where we left off. + + If this is zero, or a null string, it means resume the scan + by advancing to the next ARGV-element. */ + +static char *nextchar; + +/* Callers store zero here to inhibit the error message + for unrecognized options. */ + +int opterr = 1; + +/* Set to an option character which was unrecognized. + This must be initialized on some systems to avoid linking in the + system's own getopt implementation. */ + +int optopt = '?'; + +/* Describe how to deal with options that follow non-option ARGV-elements. 
+ + If the caller did not specify anything, + the default is REQUIRE_ORDER if the environment variable + POSIXLY_CORRECT is defined, PERMUTE otherwise. + + REQUIRE_ORDER means don't recognize them as options; + stop option processing when the first non-option is seen. + This is what Unix does. + This mode of operation is selected by either setting the environment + variable POSIXLY_CORRECT, or using `+' as the first character + of the list of option characters. + + PERMUTE is the default. We permute the contents of ARGV as we scan, + so that eventually all the non-options are at the end. This allows options + to be given in any order, even with programs that were not written to + expect this. + + RETURN_IN_ORDER is an option available to programs that were written + to expect options and other ARGV-elements in any order and that care about + the ordering of the two. We describe each non-option ARGV-element + as if it were the argument of an option with character code 1. + Using `-' as the first character of the list of option characters + selects this mode of operation. + + The special argument `--' forces an end of option-scanning regardless + of the value of `ordering'. In the case of RETURN_IN_ORDER, only + `--' can cause `getopt' to return EOF with `optind' != ARGC. */ + +static enum +{ + REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER +} ordering; + +/* Value of POSIXLY_CORRECT environment variable. */ +static char *posixly_correct; + +#ifdef __GNU_LIBRARY__ +/* We want to avoid inclusion of string.h with non-GNU libraries + because there are many ways it can cause trouble. + On some systems, it contains special magic macros that don't work + in GCC. */ +#include <string.h> +#define my_index strchr +#else + +/* Avoid depending on library functions or files + whose names are inconsistent.
*/ + +char *getenv (); + +static char * +my_index (str, chr) + const char *str; + int chr; +{ + while (*str) + { + if (*str == chr) + return (char *) str; + str++; + } + return 0; +} + +/* If using GCC, we can safely declare strlen this way. + If not using GCC, it is ok not to declare it. */ +#ifdef __GNUC__ +/* Note that Motorola Delta 68k R3V7 comes with GCC but not stddef.h. + That was relevant to code that was here before. */ +#if !defined (__STDC__) || !__STDC__ +/* gcc with -traditional declares the built-in strlen to return int, + and has done so at least since version 2.4.5. -- rms. */ +extern int strlen (const char *); +#endif /* not __STDC__ */ +#endif /* __GNUC__ */ + +#endif /* not __GNU_LIBRARY__ */ + +/* Handle permutation of arguments. */ + +/* Describe the part of ARGV that contains non-options that have + been skipped. `first_nonopt' is the index in ARGV of the first of them; + `last_nonopt' is the index after the last of them. */ + +static int first_nonopt; +static int last_nonopt; + +/* Bash 2.0 gives us an environment variable containing flags + indicating ARGV elements that should not be considered arguments. */ + +static const char *nonoption_flags; +static int nonoption_flags_len; + +/* Exchange two adjacent subsequences of ARGV. + One subsequence is elements [first_nonopt,last_nonopt) + which contains all the non-options that have been skipped so far. + The other is elements [last_nonopt,optind), which contains all + the options processed since those non-options were skipped. + + `first_nonopt' and `last_nonopt' are relocated so that they describe + the new indices of the non-options in ARGV after they are moved. */ + +#if defined (__STDC__) && __STDC__ +static void exchange (char **); +#endif + +static void +exchange (argv) + char **argv; +{ + int bottom = first_nonopt; + int middle = last_nonopt; + int top = optind; + char *tem; + + /* Exchange the shorter segment with the far end of the longer segment. 
+ That puts the shorter segment into the right place. + It leaves the longer segment in the right place overall, + but it consists of two parts that need to be swapped next. */ + + while (top > middle && middle > bottom) + { + if (top - middle > middle - bottom) + { + /* Bottom segment is the short one. */ + int len = middle - bottom; + register int i; + + /* Swap it with the top part of the top segment. */ + for (i = 0; i < len; i++) + { + tem = argv[bottom + i]; + argv[bottom + i] = argv[top - (middle - bottom) + i]; + argv[top - (middle - bottom) + i] = tem; + } + /* Exclude the moved bottom segment from further swapping. */ + top -= len; + } + else + { + /* Top segment is the short one. */ + int len = top - middle; + register int i; + + /* Swap it with the bottom part of the bottom segment. */ + for (i = 0; i < len; i++) + { + tem = argv[bottom + i]; + argv[bottom + i] = argv[middle + i]; + argv[middle + i] = tem; + } + /* Exclude the moved top segment from further swapping. */ + bottom += len; + } + } + + /* Update records for the slots the non-options now occupy. */ + + first_nonopt += (optind - last_nonopt); + last_nonopt = optind; +} + +/* Initialize the internal data when the first call is made. */ + +#if defined (__STDC__) && __STDC__ +static const char *_getopt_initialize (const char *); +#endif +static const char * +_getopt_initialize (optstring) + const char *optstring; +{ + /* Start processing options with ARGV-element 1 (since ARGV-element 0 + is the program name); the sequence of previously skipped + non-option ARGV-elements is empty. */ + + first_nonopt = last_nonopt = optind = 1; + + nextchar = NULL; + + posixly_correct = getenv ("POSIXLY_CORRECT"); + + /* Determine how to handle the ordering of options and nonoptions. 
*/ + + if (optstring[0] == '-') + { + ordering = RETURN_IN_ORDER; + ++optstring; + } + else if (optstring[0] == '+') + { + ordering = REQUIRE_ORDER; + ++optstring; + } + else if (posixly_correct != NULL) + ordering = REQUIRE_ORDER; + else + ordering = PERMUTE; + + if (posixly_correct == NULL) + { + /* Bash 2.0 puts a special variable in the environment for each + command it runs, specifying which ARGV elements are the results of + file name wildcard expansion and therefore should not be + considered as options. */ + char var[100]; + sprintf (var, "_%d_GNU_nonoption_argv_flags_", getpid ()); + nonoption_flags = getenv (var); + if (nonoption_flags == NULL) + nonoption_flags_len = 0; + else + nonoption_flags_len = strlen (nonoption_flags); + } + + return optstring; +} + +/* Scan elements of ARGV (whose length is ARGC) for option characters + given in OPTSTRING. + + If an element of ARGV starts with '-', and is not exactly "-" or "--", + then it is an option element. The characters of this element + (aside from the initial '-') are option characters. If `getopt' + is called repeatedly, it returns successively each of the option characters + from each of the option elements. + + If `getopt' finds another option character, it returns that character, + updating `optind' and `nextchar' so that the next call to `getopt' can + resume the scan with the following option character or ARGV-element. + + If there are no more option characters, `getopt' returns `EOF'. + Then `optind' is the index in ARGV of the first ARGV-element + that is not an option. (The ARGV-elements have been permuted + so that those that are not options now come last.) + + OPTSTRING is a string containing the legitimate option characters. + If an option character is seen that is not listed in OPTSTRING, + return '?' after printing an error message. If you set `opterr' to + zero, the error message is suppressed but we still return '?'. 
+ + If a char in OPTSTRING is followed by a colon, that means it wants an arg, + so the following text in the same ARGV-element, or the text of the following + ARGV-element, is returned in `optarg'. Two colons mean an option that + wants an optional arg; if there is text in the current ARGV-element, + it is returned in `optarg', otherwise `optarg' is set to zero. + + If OPTSTRING starts with `-' or `+', it requests different methods of + handling the non-option ARGV-elements. + See the comments about RETURN_IN_ORDER and REQUIRE_ORDER, above. + + Long-named options begin with `--' instead of `-'. + Their names may be abbreviated as long as the abbreviation is unique + or is an exact match for some defined option. If they have an + argument, it follows the option name in the same ARGV-element, separated + from the option name by a `=', or else in the next ARGV-element. + When `getopt' finds a long-named option, it returns 0 if that option's + `flag' field is nonzero, the value of the option's `val' field + if the `flag' field is zero. + + The elements of ARGV aren't really const, because we permute them. + But we pretend they're const in the prototype to be compatible + with other systems. + + LONGOPTS is a vector of `struct option' terminated by an + element containing a name which is zero. + + LONGIND returns the index in LONGOPT of the long-named option found. + It is only valid when a long-named option has been found by the most + recent call. + + If LONG_ONLY is nonzero, '-' as well as '--' can introduce + long-named options. */ + +int +_getopt_internal (argc, argv, optstring, longopts, longind, long_only) + int argc; + char *const *argv; + const char *optstring; + const struct option *longopts; + int *longind; + int long_only; +{ + optarg = NULL; + + if (optind == 0) + { + optstring = _getopt_initialize (optstring); + optind = 1; /* Don't scan ARGV[0], the program name. */ + } + + /* Test whether ARGV[optind] points to a non-option argument.
+ Either it does not have option syntax, or there is an environment flag + from the shell indicating it is not an option. */ +#define NONOPTION_P (argv[optind][0] != '-' || argv[optind][1] == '\0' \ + || (optind < nonoption_flags_len \ + && nonoption_flags[optind] == '1')) + + if (nextchar == NULL || *nextchar == '\0') + { + /* Advance to the next ARGV-element. */ + + /* Give FIRST_NONOPT & LAST_NONOPT rational values if OPTIND has been + moved back by the user (who may also have changed the arguments). */ + if (last_nonopt > optind) + last_nonopt = optind; + if (first_nonopt > optind) + first_nonopt = optind; + + if (ordering == PERMUTE) + { + /* If we have just processed some options following some non-options, + exchange them so that the options come first. */ + + if (first_nonopt != last_nonopt && last_nonopt != optind) + exchange ((char **) argv); + else if (last_nonopt != optind) + first_nonopt = optind; + + /* Skip any additional non-options + and extend the range of non-options previously skipped. */ + + while (optind < argc && NONOPTION_P) + optind++; + last_nonopt = optind; + } + + /* The special ARGV-element `--' means premature end of options. + Skip it like a null option, + then exchange with previous non-options as if it were an option, + then skip everything else like a non-option. */ + + if (optind != argc && !strcmp (argv[optind], "--")) + { + optind++; + + if (first_nonopt != last_nonopt && last_nonopt != optind) + exchange ((char **) argv); + else if (first_nonopt == last_nonopt) + first_nonopt = optind; + last_nonopt = argc; + + optind = argc; + } + + /* If we have done all the ARGV-elements, stop the scan + and back over any non-options that we skipped and permuted. */ + + if (optind == argc) + { + /* Set the next-arg-index to point at the non-options + that we previously skipped, so the caller will digest them. 
*/ + if (first_nonopt != last_nonopt) + optind = first_nonopt; + return EOF; + } + + /* If we have come to a non-option and did not permute it, + either stop the scan or describe it to the caller and pass it by. */ + + if (NONOPTION_P) + { + if (ordering == REQUIRE_ORDER) + return EOF; + optarg = argv[optind++]; + return 1; + } + + /* We have found another option-ARGV-element. + Skip the initial punctuation. */ + + nextchar = (argv[optind] + 1 + + (longopts != NULL && argv[optind][1] == '-')); + } + + /* Decode the current option-ARGV-element. */ + + /* Check whether the ARGV-element is a long option. + + If long_only and the ARGV-element has the form "-f", where f is + a valid short option, don't consider it an abbreviated form of + a long option that starts with f. Otherwise there would be no + way to give the -f short option. + + On the other hand, if there's a long option "fubar" and + the ARGV-element is "-fu", do consider that an abbreviation of + the long option, just like "--fu", and not "-f" with arg "u". + + This distinction seems to be the most useful approach. */ + + if (longopts != NULL + && (argv[optind][1] == '-' + || (long_only && (argv[optind][2] || !my_index (optstring, argv[optind][1]))))) + { + char *nameend; + const struct option *p; + const struct option *pfound = NULL; + int exact = 0; + int ambig = 0; + int indfound; + int option_index; + + for (nameend = nextchar; *nameend && *nameend != '='; nameend++) + /* Do nothing. */ ; + + /* Test all long options for either exact match + or abbreviated matches. */ + for (p = longopts, option_index = 0; p->name; p++, option_index++) + if (!strncmp (p->name, nextchar, nameend - nextchar)) + { + if (nameend - nextchar == strlen (p->name)) + { + /* Exact match found. */ + pfound = p; + indfound = option_index; + exact = 1; + break; + } + else if (pfound == NULL) + { + /* First nonexact match found. */ + pfound = p; + indfound = option_index; + } + else + /* Second or later nonexact match found. 
*/ + ambig = 1; + } + + if (ambig && !exact) + { + if (opterr) + fprintf (stderr, _("%s: option `%s' is ambiguous\n"), + argv[0], argv[optind]); + nextchar += strlen (nextchar); + optind++; + optopt = 0; + return '?'; + } + + if (pfound != NULL) + { + option_index = indfound; + optind++; + if (*nameend) + { + /* Don't test has_arg with >, because some C compilers don't + allow it to be used on enums. */ + if (pfound->has_arg) + optarg = nameend + 1; + else + { + if (opterr) + if (argv[optind - 1][1] == '-') + /* --option */ + fprintf (stderr, + _("%s: option `--%s' doesn't allow an argument\n"), + argv[0], pfound->name); + else + /* +option or -option */ + fprintf (stderr, + _("%s: option `%c%s' doesn't allow an argument\n"), + argv[0], argv[optind - 1][0], pfound->name); + + nextchar += strlen (nextchar); + + optopt = pfound->val; + return '?'; + } + } + else if (pfound->has_arg == 1) + { + if (optind < argc) + optarg = argv[optind++]; + else + { + if (opterr) + fprintf (stderr, + _("%s: option `%s' requires an argument\n"), + argv[0], argv[optind - 1]); + nextchar += strlen (nextchar); + optopt = pfound->val; + return optstring[0] == ':' ? ':' : '?'; + } + } + nextchar += strlen (nextchar); + if (longind != NULL) + *longind = option_index; + if (pfound->flag) + { + *(pfound->flag) = pfound->val; + return 0; + } + return pfound->val; + } + + /* Can't find it as a long option. If this is not getopt_long_only, + or the option starts with '--' or is not a valid short + option, then it's an error. + Otherwise interpret it as a short option. 
*/ + if (!long_only || argv[optind][1] == '-' + || my_index (optstring, *nextchar) == NULL) + { + if (opterr) + { + if (argv[optind][1] == '-') + /* --option */ + fprintf (stderr, _("%s: unrecognized option `--%s'\n"), + argv[0], nextchar); + else + /* +option or -option */ + fprintf (stderr, _("%s: unrecognized option `%c%s'\n"), + argv[0], argv[optind][0], nextchar); + } + nextchar = (char *) ""; + optind++; + optopt = 0; + return '?'; + } + } + + /* Look at and handle the next short option-character. */ + + { + char c = *nextchar++; + char *temp = my_index (optstring, c); + + /* Increment `optind' when we start to process its last character. */ + if (*nextchar == '\0') + ++optind; + + if (temp == NULL || c == ':') + { + if (opterr) + { + if (posixly_correct) + /* 1003.2 specifies the format of this message. */ + fprintf (stderr, _("%s: illegal option -- %c\n"), + argv[0], c); + else + fprintf (stderr, _("%s: invalid option -- %c\n"), + argv[0], c); + } + optopt = c; + return '?'; + } + if (temp[1] == ':') + { + if (temp[2] == ':') + { + /* This is an option that accepts an argument optionally. */ + if (*nextchar != '\0') + { + optarg = nextchar; + optind++; + } + else + optarg = NULL; + nextchar = NULL; + } + else + { + /* This is an option that requires an argument. */ + if (*nextchar != '\0') + { + optarg = nextchar; + /* If we end this ARGV-element by taking the rest as an arg, + we must advance to the next element now. */ + optind++; + } + else if (optind == argc) + { + if (opterr) + { + /* 1003.2 specifies the format of this message. */ + fprintf (stderr, + _("%s: option requires an argument -- %c\n"), + argv[0], c); + } + optopt = c; + if (optstring[0] == ':') + c = ':'; + else + c = '?'; + } + else + /* We already incremented `optind' once; + increment it again when taking next ARGV-elt as argument. 
*/ + optarg = argv[optind++]; + nextchar = NULL; + } + } + return c; + } +} + +int +getopt (argc, argv, optstring) + int argc; + char *const *argv; + const char *optstring; +{ + return _getopt_internal (argc, argv, optstring, + (const struct option *) 0, + (int *) 0, + 0); +} + +#endif /* _LIBC or not __GNU_LIBRARY__. */ + +#ifdef TEST + +/* Compile with -DTEST to make an executable for use in testing + the above definition of `getopt'. */ + +int +main (argc, argv) + int argc; + char **argv; +{ + int c; + int digit_optind = 0; + + while (1) + { + int this_option_optind = optind ? optind : 1; + + c = getopt (argc, argv, "abc:d:0123456789"); + if (c == EOF) + break; + + switch (c) + { + case '0': + case '1': + case '2': + case '3': + case '4': + case '5': + case '6': + case '7': + case '8': + case '9': + if (digit_optind != 0 && digit_optind != this_option_optind) + printf ("digits occur in two different argv-elements.\n"); + digit_optind = this_option_optind; + printf ("option %c\n", c); + break; + + case 'a': + printf ("option a\n"); + break; + + case 'b': + printf ("option b\n"); + break; + + case 'c': + printf ("option c with value `%s'\n", optarg); + break; + + case '?': + break; + + default: + printf ("?? getopt returned character code 0%o ??\n", c); + } + } + + if (optind < argc) + { + printf ("non-option ARGV-elements: "); + while (optind < argc) + printf ("%s ", argv[optind++]); + printf ("\n"); + } + + exit (0); +} + +#endif /* TEST */ diff --git a/contrib/bison/getopt.h b/contrib/bison/getopt.h new file mode 100644 index 000000000000..0de18814f789 --- /dev/null +++ b/contrib/bison/getopt.h @@ -0,0 +1,130 @@ +/* Declarations for getopt. + Copyright (C) 1989, 90, 91, 92, 93, 94 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 2, or (at your option) any + later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, + USA. */ + +#ifndef _GETOPT_H +#define _GETOPT_H 1 + +#ifdef __cplusplus +extern "C" { +#endif + +/* For communication from `getopt' to the caller. + When `getopt' finds an option that takes an argument, + the argument value is returned here. + Also, when `ordering' is RETURN_IN_ORDER, + each non-option ARGV-element is returned here. */ + +extern char *optarg; + +/* Index in ARGV of the next element to be scanned. + This is used for communication to and from the caller + and for communication between successive calls to `getopt'. + + On entry to `getopt', zero means this is the first call; initialize. + + When `getopt' returns EOF, this is the index of the first of the + non-option elements that the caller should itself scan. + + Otherwise, `optind' communicates from one call to the next + how much of ARGV has been scanned so far. */ + +extern int optind; + +/* Callers store zero here to inhibit the error message `getopt' prints + for unrecognized options. */ + +extern int opterr; + +/* Set to an option character which was unrecognized. */ + +extern int optopt; + +/* Describe the long-named options requested by the application. + The LONG_OPTIONS argument to getopt_long or getopt_long_only is a vector + of `struct option' terminated by an element containing a name which is + zero. + + The field `has_arg' is: + no_argument (or 0) if the option does not take an argument, + required_argument (or 1) if the option requires an argument, + optional_argument (or 2) if the option takes an optional argument. 
+ + If the field `flag' is not NULL, it points to a variable that is set + to the value given in the field `val' when the option is found, but + left unchanged if the option is not found. + + To have a long-named option do something other than set an `int' to + a compiled-in constant, such as set a value from `optarg', set the + option's `flag' field to zero and its `val' field to a nonzero + value (the equivalent single-letter option character, if there is + one). For long options that have a zero `flag' field, `getopt' + returns the contents of the `val' field. */ + +struct option +{ +#if defined (__STDC__) && __STDC__ + const char *name; +#else + char *name; +#endif + /* has_arg can't be an enum because some compilers complain about + type mismatches in all the code that assumes it is an int. */ + int has_arg; + int *flag; + int val; +}; + +/* Names for the values of the `has_arg' field of `struct option'. */ + +#define no_argument 0 +#define required_argument 1 +#define optional_argument 2 + +#if defined (__STDC__) && __STDC__ +#ifdef __GNU_LIBRARY__ +/* Many other libraries have conflicting prototypes for getopt, with + differences in the consts, in stdlib.h. To avoid compilation + errors, only prototype getopt for the GNU C library. */ +extern int getopt (int argc, char *const *argv, const char *shortopts); +#else /* not __GNU_LIBRARY__ */ +extern int getopt (); +#endif /* __GNU_LIBRARY__ */ +extern int getopt_long (int argc, char *const *argv, const char *shortopts, + const struct option *longopts, int *longind); +extern int getopt_long_only (int argc, char *const *argv, + const char *shortopts, + const struct option *longopts, int *longind); + +/* Internal only. Users should not call this directly. 
*/ +extern int _getopt_internal (int argc, char *const *argv, + const char *shortopts, + const struct option *longopts, int *longind, + int long_only); +#else /* not __STDC__ */ +extern int getopt (); +extern int getopt_long (); +extern int getopt_long_only (); + +extern int _getopt_internal (); +#endif /* __STDC__ */ + +#ifdef __cplusplus +} +#endif + +#endif /* _GETOPT_H */ diff --git a/contrib/bison/getopt1.c b/contrib/bison/getopt1.c new file mode 100644 index 000000000000..9d394e683938 --- /dev/null +++ b/contrib/bison/getopt1.c @@ -0,0 +1,181 @@ +/* getopt_long and getopt_long_only entry points for GNU getopt. + Copyright (C) 1987, 88, 89, 90, 91, 92, 1993, 1994 + Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the + Free Software Foundation; either version 2, or (at your option) any + later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, + USA. */ + +#ifdef HAVE_CONFIG_H +#include <config.h> +#endif + +#include "getopt.h" + +#if !defined (__STDC__) || !__STDC__ +/* This is a separate conditional since some stdc systems + reject `defined (const)'. */ +#ifndef const +#define const +#endif +#endif + +#include <stdio.h> + +/* Comment out all this code if we are using the GNU C Library, and are not + actually compiling the library itself. This code is part of the GNU C + Library, but also included in many other GNU distributions. Compiling + and linking in this code is a waste when using the GNU C library + (especially if it is a shared library).
Rather than having every GNU + program understand `configure --with-gnu-libc' and omit the object files, + it is simpler to just do this in the source for each such file. */ + +#if defined (_LIBC) || !defined (__GNU_LIBRARY__) + + +/* This needs to come after some library #include + to get __GNU_LIBRARY__ defined. */ +#ifdef __GNU_LIBRARY__ +#include <stdlib.h> +#else +char *getenv (); +#endif + +#ifndef NULL +#define NULL 0 +#endif + +int +getopt_long (argc, argv, options, long_options, opt_index) + int argc; + char *const *argv; + const char *options; + const struct option *long_options; + int *opt_index; +{ + return _getopt_internal (argc, argv, options, long_options, opt_index, 0); +} + +/* Like getopt_long, but '-' as well as '--' can indicate a long option. + If an option that starts with '-' (not '--') doesn't match a long option, + but does match a short option, it is parsed as a short option + instead. */ + +int +getopt_long_only (argc, argv, options, long_options, opt_index) + int argc; + char *const *argv; + const char *options; + const struct option *long_options; + int *opt_index; +{ + return _getopt_internal (argc, argv, options, long_options, opt_index, 1); +} + + +#endif /* _LIBC or not __GNU_LIBRARY__. */ + +#ifdef TEST + +#include <stdio.h> + +int +main (argc, argv) + int argc; + char **argv; +{ + int c; + int digit_optind = 0; + + while (1) + { + int this_option_optind = optind ?
optind : 1; + int option_index = 0; + static struct option long_options[] = + { + {"add", 1, 0, 0}, + {"append", 0, 0, 0}, + {"delete", 1, 0, 0}, + {"verbose", 0, 0, 0}, + {"create", 0, 0, 0}, + {"file", 1, 0, 0}, + {0, 0, 0, 0} + }; + + c = getopt_long (argc, argv, "abc:d:0123456789", + long_options, &option_index); + if (c == EOF) + break; + + switch (c) + { + case 0: + printf ("option %s", long_options[option_index].name); + if (optarg) + printf (" with arg %s", optarg); + printf ("\n"); + break; + + case '0': + case '1': + case '2': + case '3': + case '4': + case '5': + case '6': + case '7': + case '8': + case '9': + if (digit_optind != 0 && digit_optind != this_option_optind) + printf ("digits occur in two different argv-elements.\n"); + digit_optind = this_option_optind; + printf ("option %c\n", c); + break; + + case 'a': + printf ("option a\n"); + break; + + case 'b': + printf ("option b\n"); + break; + + case 'c': + printf ("option c with value `%s'\n", optarg); + break; + + case 'd': + printf ("option d with value `%s'\n", optarg); + break; + + case '?': + break; + + default: + printf ("?? getopt returned character code 0%o ??\n", c); + } + } + + if (optind < argc) + { + printf ("non-option ARGV-elements: "); + while (optind < argc) + printf ("%s ", argv[optind++]); + printf ("\n"); + } + + exit (0); +} + +#endif /* TEST */ diff --git a/contrib/bison/gram.c b/contrib/bison/gram.c new file mode 100644 index 000000000000..cc1418d9db6d --- /dev/null +++ b/contrib/bison/gram.c @@ -0,0 +1,58 @@ +/* Allocate input grammar variables for bison, + Copyright (C) 1984, 1986, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. 
+ +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* comments for these variables are in gram.h */ + +int nitems; +int nrules; +int nsyms; +int ntokens; +int nvars; + +short *ritem; +short *rlhs; +short *rrhs; +short *rprec; +short *rprecsym; +short *sprec; +short *rassoc; +short *sassoc; +short *token_translations; +short *rline; + +int start_symbol; + +int translations; + +int max_user_token_number; + +int semantic_parser; + +int pure_parser; + +int error_token_number; + +/* This is to avoid linker problems which occur on VMS when using GCC, + when the file in question contains data definitions only. */ + +void +dummy() +{ +} diff --git a/contrib/bison/gram.h b/contrib/bison/gram.h new file mode 100644 index 000000000000..080ce0d9fe4c --- /dev/null +++ b/contrib/bison/gram.h @@ -0,0 +1,125 @@ +/* Data definitions for internal representation of bison's input, + Copyright (C) 1984, 1986, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. 
If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* representation of the grammar rules: + +ntokens is the number of tokens, and nvars is the number of variables +(nonterminals). nsyms is the total number, ntokens + nvars. + + (the true number of token values assigned is ntokens + reduced by one for each alias declaration) + +Each symbol (either token or variable) receives a symbol number. +Numbers 0 to ntokens-1 are for tokens, and ntokens to nsyms-1 are for +variables. Symbol number zero is the end-of-input token. This token +is counted in ntokens. + +The rules receive rule numbers 1 to nrules in the order they are written. +Actions and guards are accessed via the rule number. + +The rules themselves are described by three arrays: rrhs, rlhs and +ritem. rlhs[R] is the symbol number of the left hand side of rule R. +The right hand side is stored as symbol numbers in a portion of +ritem. rrhs[R] contains the index in ritem of the beginning of the +portion for rule R. + +If rlhs[R] is -1, the rule has been thrown out by reduce.c +and should be ignored. + +The length of the portion is one greater + than the number of symbols in the rule's right hand side. +The last element in the portion contains minus R, which +identifies it as the end of a portion and says which rule it is for. + +The portions of ritem come in order of increasing rule number and are +followed by an element which is zero to mark the end. nitems is the +total length of ritem, not counting the final zero. Each element of +ritem is called an "item" and its index in ritem is an item number. + +Item numbers are used in the finite state machine to represent +places that parsing can get to. + +Precedence levels are recorded in the vectors sprec and rprec. +sprec records the precedence level of each symbol, +rprec the precedence level of each rule. +rprecsym is the symbol-number of the symbol in %prec for this rule (if any). 
+ +Precedence levels are assigned in increasing order starting with 1 so +that numerically higher precedence values mean tighter binding as they +ought to. Zero as a symbol or rule's precedence means none is +assigned. + +Associativities are recorded similarly in rassoc and sassoc. */ + + +#define ISTOKEN(s) ((s) < ntokens) +#define ISVAR(s) ((s) >= ntokens) + + +extern int nitems; +extern int nrules; +extern int nsyms; +extern int ntokens; +extern int nvars; + +extern short *ritem; +extern short *rlhs; +extern short *rrhs; +extern short *rprec; +extern short *rprecsym; +extern short *sprec; +extern short *rassoc; +extern short *sassoc; +extern short *rline; /* Source line number of each rule */ + +extern int start_symbol; + + +/* associativity values in elements of rassoc, sassoc. */ + +#define RIGHT_ASSOC 1 +#define LEFT_ASSOC 2 +#define NON_ASSOC 3 + +/* token translation table: +indexed by a token number as returned by the user's yylex routine, +it yields the internal token number used by the parser and throughout bison. +If translations is zero, the translation table is not used because +the two kinds of token numbers are the same. +(It is noted in reader.c that "Nowadays translations is always set to 1...") +*/ + +extern short *token_translations; +extern int translations; +extern int max_user_token_number; + +/* semantic_parser is nonzero if the input file says to use the hairy parser +that provides for semantic error recovery. If it is zero, the yacc-compatible +simplified parser is used. */ + +extern int semantic_parser; + +/* pure_parser is nonzero if should generate a parser that is all pure and reentrant. */ + +extern int pure_parser; + +/* error_token_number is the token number of the error token. */ + +extern int error_token_number; diff --git a/contrib/bison/install-sh b/contrib/bison/install-sh new file mode 100755 index 000000000000..58719246f040 --- /dev/null +++ b/contrib/bison/install-sh @@ -0,0 +1,238 @@ +#! 
/bin/sh +# +# install - install a program, script, or datafile +# This comes from X11R5. +# +# Calling this script install-sh is preferred over install.sh, to prevent +# `make' implicit rules from creating a file called install from it +# when there is no Makefile. +# +# This script is compatible with the BSD install script, but was written +# from scratch. +# + + +# set DOITPROG to echo to test this script + +# Don't use :- since 4.3BSD and earlier shells don't like it. +doit="${DOITPROG-}" + + +# put in absolute paths if you don't have them in your path; or use env. vars. + +mvprog="${MVPROG-mv}" +cpprog="${CPPROG-cp}" +chmodprog="${CHMODPROG-chmod}" +chownprog="${CHOWNPROG-chown}" +chgrpprog="${CHGRPPROG-chgrp}" +stripprog="${STRIPPROG-strip}" +rmprog="${RMPROG-rm}" +mkdirprog="${MKDIRPROG-mkdir}" + +transformbasename="" +transform_arg="" +instcmd="$mvprog" +chmodcmd="$chmodprog 0755" +chowncmd="" +chgrpcmd="" +stripcmd="" +rmcmd="$rmprog -f" +mvcmd="$mvprog" +src="" +dst="" +dir_arg="" + +while [ x"$1" != x ]; do + case $1 in + -c) instcmd="$cpprog" + shift + continue;; + + -d) dir_arg=true + shift + continue;; + + -m) chmodcmd="$chmodprog $2" + shift + shift + continue;; + + -o) chowncmd="$chownprog $2" + shift + shift + continue;; + + -g) chgrpcmd="$chgrpprog $2" + shift + shift + continue;; + + -s) stripcmd="$stripprog" + shift + continue;; + + -t=*) transformarg=`echo $1 | sed 's/-t=//'` + shift + continue;; + + -b=*) transformbasename=`echo $1 | sed 's/-b=//'` + shift + continue;; + + *) if [ x"$src" = x ] + then + src=$1 + else + # this colon is to work around a 386BSD /bin/sh bug + : + dst=$1 + fi + shift + continue;; + esac +done + +if [ x"$src" = x ] +then + echo "install: no input file specified" + exit 1 +else + true +fi + +if [ x"$dir_arg" != x ]; then + dst=$src + src="" + + if [ -d $dst ]; then + instcmd=: + else + instcmd=mkdir + fi +else + +# Waiting for this to be detected by the "$instcmd $src $dsttmp" command +# might cause directories to be 
created, which would be especially bad +# if $src (and thus $dsttmp) contains '*'. + + if [ -f $src -o -d $src ] + then + true + else + echo "install: $src does not exist" + exit 1 + fi + + if [ x"$dst" = x ] + then + echo "install: no destination specified" + exit 1 + else + true + fi + +# If destination is a directory, append the input filename; if your system +# does not like double slashes in filenames, you may need to add some logic + + if [ -d $dst ] + then + dst="$dst"/`basename $src` + else + true + fi +fi + +## this sed command emulates the dirname command +dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` + +# Make sure that the destination directory exists. +# this part is taken from Noah Friedman's mkinstalldirs script + +# Skip lots of stat calls in the usual case. +if [ ! -d "$dstdir" ]; then +defaultIFS=' +' +IFS="${IFS-${defaultIFS}}" + +oIFS="${IFS}" +# Some sh's can't handle IFS=/ for some reason. +IFS='%' +set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'` +IFS="${oIFS}" + +pathcomp='' + +while [ $# -ne 0 ] ; do + pathcomp="${pathcomp}${1}" + shift + + if [ ! -d "${pathcomp}" ] ; + then + $mkdirprog "${pathcomp}" + else + true + fi + + pathcomp="${pathcomp}/" +done +fi + +if [ x"$dir_arg" != x ] +then + $doit $instcmd $dst && + + if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi && + if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi && + if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi && + if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi +else + +# If we're going to rename the final executable, determine the name now. 
+ + if [ x"$transformarg" = x ] + then + dstfile=`basename $dst` + else + dstfile=`basename $dst $transformbasename | + sed $transformarg`$transformbasename + fi + +# don't allow the sed command to completely eliminate the filename + + if [ x"$dstfile" = x ] + then + dstfile=`basename $dst` + else + true + fi + +# Make a temp file name in the proper directory. + + dsttmp=$dstdir/#inst.$$# + +# Move or copy the file name to the temp name + + $doit $instcmd $src $dsttmp && + + trap "rm -f ${dsttmp}" 0 && + +# and set any options; do chmod last to preserve setuid bits + +# If any of these fail, we abort the whole thing. If we want to +# ignore errors from any of these, just make sure not to ignore +# errors from the above "$doit $instcmd $src $dsttmp" command. + + if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi && + if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi && + if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi && + if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi && + +# Now rename the file to the real destination. + + $doit $rmcmd -f $dstdir/$dstfile && + $doit $mvcmd $dsttmp $dstdir/$dstfile + +fi && + + +exit 0 diff --git a/contrib/bison/lalr.c b/contrib/bison/lalr.c new file mode 100644 index 000000000000..32a5f29dd5cb --- /dev/null +++ b/contrib/bison/lalr.c @@ -0,0 +1,770 @@ +/* Compute look-ahead criteria for bison, + Copyright (C) 1984, 1986, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* Compute how to make the finite state machine deterministic; + find which rules need lookahead in each state, and which lookahead tokens they accept. + +lalr(), the entry point, builds these data structures: + +goto_map, from_state and to_state + record each shift transition which accepts a variable (a nonterminal). +ngotos is the number of such transitions. +from_state[t] is the state number which a transition leads from +and to_state[t] is the state number it leads to. +All the transitions that accept a particular variable are grouped together and +goto_map[i - ntokens] is the index in from_state and to_state of the first of them. + +consistent[s] is nonzero if no lookahead is needed to decide what to do in state s. + +LAruleno is a vector which records the rules that need lookahead in various states. +The elements of LAruleno that apply to state s are those from + lookaheads[s] through lookaheads[s+1]-1. +Each element of LAruleno is a rule number. + +If lr is the length of LAruleno, then a number from 0 to lr-1 +can specify both a rule and a state where the rule might be applied. + +LA is a lr by ntokens matrix of bits. +LA[l, i] is 1 if the rule LAruleno[l] is applicable in the appropriate state + when the next token is symbol i. +If LA[l, i] and LA[l, j] are both 1 for i != j, it is a conflict. 
*/ + +#include <stdio.h> +#include "system.h" +#include "machine.h" +#include "types.h" +#include "state.h" +#include "new.h" +#include "gram.h" + + +extern short **derives; +extern char *nullable; + + +int tokensetsize; +short *lookaheads; +short *LAruleno; +unsigned *LA; +short *accessing_symbol; +char *consistent; +core **state_table; +shifts **shift_table; +reductions **reduction_table; +short *goto_map; +short *from_state; +short *to_state; + +short **transpose(); +void set_state_table(); +void set_accessing_symbol(); +void set_shift_table(); +void set_reduction_table(); +void set_maxrhs(); +void initialize_LA(); +void set_goto_map(); +void initialize_F(); +void build_relations(); +void add_lookback_edge(); +void compute_FOLLOWS(); +void compute_lookaheads(); +void digraph(); +void traverse(); + +extern void toomany(); +extern void berror(); + +static int infinity; +static int maxrhs; +static int ngotos; +static unsigned *F; +static short **includes; +static shorts **lookback; +static short **R; +static short *INDEX; +static short *VERTICES; +static int top; + + +void +lalr() +{ + tokensetsize = WORDSIZE(ntokens); + + set_state_table(); + set_accessing_symbol(); + set_shift_table(); + set_reduction_table(); + set_maxrhs(); + initialize_LA(); + set_goto_map(); + initialize_F(); + build_relations(); + compute_FOLLOWS(); + compute_lookaheads(); +} + + +void +set_state_table() +{ + register core *sp; + + state_table = NEW2(nstates, core *); + + for (sp = first_state; sp; sp = sp->next) + state_table[sp->number] = sp; +} + + +void +set_accessing_symbol() +{ + register core *sp; + + accessing_symbol = NEW2(nstates, short); + + for (sp = first_state; sp; sp = sp->next) + accessing_symbol[sp->number] = sp->accessing_symbol; +} + + +void +set_shift_table() +{ + register shifts *sp; + + shift_table = NEW2(nstates, shifts *); + + for (sp = first_shift; sp; sp = sp->next) + shift_table[sp->number] = sp; +} + + +void +set_reduction_table() +{ + register reductions *rp; + +
reduction_table = NEW2(nstates, reductions *); + + for (rp = first_reduction; rp; rp = rp->next) + reduction_table[rp->number] = rp; +} + + +void +set_maxrhs() +{ + register short *itemp; + register int length; + register int max; + + length = 0; + max = 0; + for (itemp = ritem; *itemp; itemp++) + { + if (*itemp > 0) + { + length++; + } + else + { + if (length > max) max = length; + length = 0; + } + } + + maxrhs = max; +} + + +void +initialize_LA() +{ + register int i; + register int j; + register int count; + register reductions *rp; + register shifts *sp; + register short *np; + + consistent = NEW2(nstates, char); + lookaheads = NEW2(nstates + 1, short); + + count = 0; + for (i = 0; i < nstates; i++) + { + register int k; + + lookaheads[i] = count; + + rp = reduction_table[i]; + sp = shift_table[i]; + if (rp && (rp->nreds > 1 + || (sp && ! ISVAR(accessing_symbol[sp->shifts[0]])))) + count += rp->nreds; + else + consistent[i] = 1; + + if (sp) + for (k = 0; k < sp->nshifts; k++) + { + if (accessing_symbol[sp->shifts[k]] == error_token_number) + { + consistent[i] = 0; + break; + } + } + } + + lookaheads[nstates] = count; + + if (count == 0) + { + LA = NEW2(1 * tokensetsize, unsigned); + LAruleno = NEW2(1, short); + lookback = NEW2(1, shorts *); + } + else + { + LA = NEW2(count * tokensetsize, unsigned); + LAruleno = NEW2(count, short); + lookback = NEW2(count, shorts *); + } + + np = LAruleno; + for (i = 0; i < nstates; i++) + { + if (!consistent[i]) + { + if (rp = reduction_table[i]) + for (j = 0; j < rp->nreds; j++) + *np++ = rp->rules[j]; + } + } +} + + +void +set_goto_map() +{ + register shifts *sp; + register int i; + register int symbol; + register int k; + register short *temp_map; + register int state2; + register int state1; + + goto_map = NEW2(nvars + 1, short) - ntokens; + temp_map = NEW2(nvars + 1, short) - ntokens; + + ngotos = 0; + for (sp = first_shift; sp; sp = sp->next) + { + for (i = sp->nshifts - 1; i >= 0; i--) + { + symbol = 
accessing_symbol[sp->shifts[i]]; + + if (ISTOKEN(symbol)) break; + + if (ngotos == MAXSHORT) + toomany("gotos"); + + ngotos++; + goto_map[symbol]++; + } + } + + k = 0; + for (i = ntokens; i < nsyms; i++) + { + temp_map[i] = k; + k += goto_map[i]; + } + + for (i = ntokens; i < nsyms; i++) + goto_map[i] = temp_map[i]; + + goto_map[nsyms] = ngotos; + temp_map[nsyms] = ngotos; + + from_state = NEW2(ngotos, short); + to_state = NEW2(ngotos, short); + + for (sp = first_shift; sp; sp = sp->next) + { + state1 = sp->number; + for (i = sp->nshifts - 1; i >= 0; i--) + { + state2 = sp->shifts[i]; + symbol = accessing_symbol[state2]; + + if (ISTOKEN(symbol)) break; + + k = temp_map[symbol]++; + from_state[k] = state1; + to_state[k] = state2; + } + } + + FREE(temp_map + ntokens); +} + + + +/* Map_goto maps a state/symbol pair into its numeric representation. */ + +int +map_goto(state, symbol) +int state; +int symbol; +{ + register int high; + register int low; + register int middle; + register int s; + + low = goto_map[symbol]; + high = goto_map[symbol + 1] - 1; + + while (low <= high) + { + middle = (low + high) / 2; + s = from_state[middle]; + if (s == state) + return (middle); + else if (s < state) + low = middle + 1; + else + high = middle - 1; + } + + berror("map_goto"); +/* NOTREACHED */ + return 0; +} + + +void +initialize_F() +{ + register int i; + register int j; + register int k; + register shifts *sp; + register short *edge; + register unsigned *rowp; + register short *rp; + register short **reads; + register int nedges; + register int stateno; + register int symbol; + register int nwords; + + nwords = ngotos * tokensetsize; + F = NEW2(nwords, unsigned); + + reads = NEW2(ngotos, short *); + edge = NEW2(ngotos + 1, short); + nedges = 0; + + rowp = F; + for (i = 0; i < ngotos; i++) + { + stateno = to_state[i]; + sp = shift_table[stateno]; + + if (sp) + { + k = sp->nshifts; + + for (j = 0; j < k; j++) + { + symbol = accessing_symbol[sp->shifts[j]]; + if (ISVAR(symbol)) + 
break; + SETBIT(rowp, symbol); + } + + for (; j < k; j++) + { + symbol = accessing_symbol[sp->shifts[j]]; + if (nullable[symbol]) + edge[nedges++] = map_goto(stateno, symbol); + } + + if (nedges) + { + reads[i] = rp = NEW2(nedges + 1, short); + + for (j = 0; j < nedges; j++) + rp[j] = edge[j]; + + rp[nedges] = -1; + nedges = 0; + } + } + + rowp += tokensetsize; + } + + digraph(reads); + + for (i = 0; i < ngotos; i++) + { + if (reads[i]) + FREE(reads[i]); + } + + FREE(reads); + FREE(edge); +} + + +void +build_relations() +{ + register int i; + register int j; + register int k; + register short *rulep; + register short *rp; + register shifts *sp; + register int length; + register int nedges; + register int done; + register int state1; + register int stateno; + register int symbol1; + register int symbol2; + register short *shortp; + register short *edge; + register short *states; + register short **new_includes; + + includes = NEW2(ngotos, short *); + edge = NEW2(ngotos + 1, short); + states = NEW2(maxrhs + 1, short); + + for (i = 0; i < ngotos; i++) + { + nedges = 0; + state1 = from_state[i]; + symbol1 = accessing_symbol[to_state[i]]; + + for (rulep = derives[symbol1]; *rulep > 0; rulep++) + { + length = 1; + states[0] = state1; + stateno = state1; + + for (rp = ritem + rrhs[*rulep]; *rp > 0; rp++) + { + symbol2 = *rp; + sp = shift_table[stateno]; + k = sp->nshifts; + + for (j = 0; j < k; j++) + { + stateno = sp->shifts[j]; + if (accessing_symbol[stateno] == symbol2) break; + } + + states[length++] = stateno; + } + + if (!consistent[stateno]) + add_lookback_edge(stateno, *rulep, i); + + length--; + done = 0; + while (!done) + { + done = 1; + rp--; + /* JF added rp>=ritem && I hope to god its right! 
*/ + if (rp>=ritem && ISVAR(*rp)) + { + stateno = states[--length]; + edge[nedges++] = map_goto(stateno, *rp); + if (nullable[*rp]) done = 0; + } + } + } + + if (nedges) + { + includes[i] = shortp = NEW2(nedges + 1, short); + for (j = 0; j < nedges; j++) + shortp[j] = edge[j]; + shortp[nedges] = -1; + } + } + + new_includes = transpose(includes, ngotos); + + for (i = 0; i < ngotos; i++) + if (includes[i]) + FREE(includes[i]); + + FREE(includes); + + includes = new_includes; + + FREE(edge); + FREE(states); +} + + +void +add_lookback_edge(stateno, ruleno, gotono) +int stateno; +int ruleno; +int gotono; +{ + register int i; + register int k; + register int found; + register shorts *sp; + + i = lookaheads[stateno]; + k = lookaheads[stateno + 1]; + found = 0; + while (!found && i < k) + { + if (LAruleno[i] == ruleno) + found = 1; + else + i++; + } + + if (found == 0) + berror("add_lookback_edge"); + + sp = NEW(shorts); + sp->next = lookback[i]; + sp->value = gotono; + lookback[i] = sp; +} + + + +short ** +transpose(R_arg, n) +short **R_arg; +int n; +{ + register short **new_R; + register short **temp_R; + register short *nedges; + register short *sp; + register int i; + register int k; + + nedges = NEW2(n, short); + + for (i = 0; i < n; i++) + { + sp = R_arg[i]; + if (sp) + { + while (*sp >= 0) + nedges[*sp++]++; + } + } + + new_R = NEW2(n, short *); + temp_R = NEW2(n, short *); + + for (i = 0; i < n; i++) + { + k = nedges[i]; + if (k > 0) + { + sp = NEW2(k + 1, short); + new_R[i] = sp; + temp_R[i] = sp; + sp[k] = -1; + } + } + + FREE(nedges); + + for (i = 0; i < n; i++) + { + sp = R_arg[i]; + if (sp) + { + while (*sp >= 0) + *temp_R[*sp++]++ = i; + } + } + + FREE(temp_R); + + return (new_R); +} + + +void +compute_FOLLOWS() +{ + register int i; + + digraph(includes); + + for (i = 0; i < ngotos; i++) + { + if (includes[i]) FREE(includes[i]); + } + + FREE(includes); +} + + +void +compute_lookaheads() +{ + register int i; + register int n; + register unsigned *fp1; + 
register unsigned *fp2; + register unsigned *fp3; + register shorts *sp; + register unsigned *rowp; +/* register short *rulep; JF unused */ +/* register int count; JF unused */ + register shorts *sptmp;/* JF */ + + rowp = LA; + n = lookaheads[nstates]; + for (i = 0; i < n; i++) + { + fp3 = rowp + tokensetsize; + for (sp = lookback[i]; sp; sp = sp->next) + { + fp1 = rowp; + fp2 = F + tokensetsize * sp->value; + while (fp1 < fp3) + *fp1++ |= *fp2++; + } + + rowp = fp3; + } + + for (i = 0; i < n; i++) + {/* JF removed ref to freed storage */ + for (sp = lookback[i]; sp; sp = sptmp) { + sptmp=sp->next; + FREE(sp); + } + } + + FREE(lookback); + FREE(F); +} + + +void +digraph(relation) +short **relation; +{ + register int i; + + infinity = ngotos + 2; + INDEX = NEW2(ngotos + 1, short); + VERTICES = NEW2(ngotos + 1, short); + top = 0; + + R = relation; + + for (i = 0; i < ngotos; i++) + INDEX[i] = 0; + + for (i = 0; i < ngotos; i++) + { + if (INDEX[i] == 0 && R[i]) + traverse(i); + } + + FREE(INDEX); + FREE(VERTICES); +} + + +void +traverse(i) +register int i; +{ + register unsigned *fp1; + register unsigned *fp2; + register unsigned *fp3; + register int j; + register short *rp; + + int height; + unsigned *base; + + VERTICES[++top] = i; + INDEX[i] = height = top; + + base = F + i * tokensetsize; + fp3 = base + tokensetsize; + + rp = R[i]; + if (rp) + { + while ((j = *rp++) >= 0) + { + if (INDEX[j] == 0) + traverse(j); + + if (INDEX[i] > INDEX[j]) + INDEX[i] = INDEX[j]; + + fp1 = base; + fp2 = F + j * tokensetsize; + + while (fp1 < fp3) + *fp1++ |= *fp2++; + } + } + + if (INDEX[i] == height) + { + for (;;) + { + j = VERTICES[top--]; + INDEX[j] = infinity; + + if (i == j) + break; + + fp1 = base; + fp2 = F + j * tokensetsize; + + while (fp1 < fp3) + *fp2++ = *fp1++; + } + } +} diff --git a/contrib/bison/lex.c b/contrib/bison/lex.c new file mode 100644 index 000000000000..7ae1387e4d5c --- /dev/null +++ b/contrib/bison/lex.c @@ -0,0 +1,620 @@ +/* Token-reader for Bison's 
input parser, + Copyright (C) 1984, 1986, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* + lex is the entry point. It is called from reader.c. + It returns one of the token-type codes defined in lex.h. + When an identifier is seen, the code IDENTIFIER is returned + and the name is looked up in the symbol table using symtab.c; + symval is set to a pointer to the entry found. */ + +#include <stdio.h> +#include <ctype.h> +#include "system.h" +#include "files.h" +#include "getopt.h" /* for optarg */ +#include "symtab.h" +#include "lex.h" +#include "new.h" + +/* flags set by % directives */ +extern int definesflag; /* for -d */ +extern int toknumflag; /* for -k */ +extern int noparserflag; /* for -n */ +extern int fixed_outfiles; /* for -y */ +extern int nolinesflag; /* for -l */ +extern int rawtoknumflag; /* for -r */ +extern int verboseflag; /* for -v */ +extern int debugflag; /* for -t */ +extern char *spec_name_prefix; /* for -p */ +extern char *spec_file_prefix; /* for -b */ +/*spec_outfile is declared in files.h, for -o */ + +extern int lineno; +extern int translations; + +int parse_percent_token(); + +/* functions from main.c */ +extern char *printable_version(); +extern void fatal(); +extern void warni(); +extern void warn(); + +/* Buffer for storing the current token. 
*/ +char *token_buffer; + +/* Allocated size of token_buffer, not including space for terminator. */ +static int maxtoken; + +bucket *symval; +int numval; + +static int unlexed; /* these two describe a token to be reread */ +static bucket *unlexed_symval; /* by the next call to lex */ + + +void +init_lex() +{ + maxtoken = 100; + token_buffer = NEW2 (maxtoken + 1, char); + unlexed = -1; +} + + +static char * +grow_token_buffer (p) + char *p; +{ + int offset = p - token_buffer; + maxtoken *= 2; + token_buffer = (char *) xrealloc(token_buffer, maxtoken + 1); + return token_buffer + offset; +} + + +int +skip_white_space() +{ + register int c; + register int inside; + + c = getc(finput); + + for (;;) + { + int cplus_comment; + + switch (c) + { + case '/': + c = getc(finput); + if (c != '*' && c != '/') + { + warn("unexpected `/' found and ignored"); + break; + } + cplus_comment = (c == '/'); + + c = getc(finput); + + inside = 1; + while (inside) + { + if (!cplus_comment && c == '*') + { + while (c == '*') + c = getc(finput); + + if (c == '/') + { + inside = 0; + c = getc(finput); + } + } + else if (c == '\n') + { + lineno++; + if (cplus_comment) + inside = 0; + c = getc(finput); + } + else if (c == EOF) + fatal("unterminated comment"); + else + c = getc(finput); + } + + break; + + case '\n': + lineno++; + + case ' ': + case '\t': + case '\f': + c = getc(finput); + break; + + default: + return (c); + } + } +} + +/* do a getc, but give error message if EOF encountered */ +int +safegetc(f) + FILE *f; +{ + register int c = getc(f); + if (c == EOF) + fatal("Unexpected end of file"); + return c; +} + +/* read one literal character from finput. process \ escapes. + append the normalized string version of the char to *pp. 
+ assign the character code to *pcode + return 1 unless the character is an unescaped `term' or \n + report error for \n +*/ +int +literalchar(pp, pcode, term) + char **pp; + int *pcode; + char term; +{ + register int c; + register char *p; + register int code; + int wasquote = 0; + + c = safegetc(finput); + if (c == '\n') + { + warn("unescaped newline in constant"); + ungetc(c, finput); + code = '?'; + wasquote = 1; + } + else if (c != '\\') + { + code = c; + if (c == term) + wasquote = 1; + } + else + { + c = safegetc(finput); + if (c == 't') code = '\t'; + else if (c == 'n') code = '\n'; + else if (c == 'a') code = '\007'; + else if (c == 'r') code = '\r'; + else if (c == 'f') code = '\f'; + else if (c == 'b') code = '\b'; + else if (c == 'v') code = 013; + else if (c == '\\') code = '\\'; + else if (c == '\'') code = '\''; + else if (c == '\"') code = '\"'; + else if (c <= '7' && c >= '0') + { + code = 0; + while (c <= '7' && c >= '0') + { + code = (code * 8) + (c - '0'); + if (code >= 256 || code < 0) + { + warni("octal value outside range 0...255: `\\%o'", code); + code &= 0xFF; + break; + } + c = safegetc(finput); + } + ungetc(c, finput); + } + else if (c == 'x') + { + c = safegetc(finput); + code = 0; + while (1) + { + if (c >= '0' && c <= '9') + code *= 16, code += c - '0'; + else if (c >= 'a' && c <= 'f') + code *= 16, code += c - 'a' + 10; + else if (c >= 'A' && c <= 'F') + code *= 16, code += c - 'A' + 10; + else + break; + if (code >= 256 || code<0) + { + warni("hexadecimal value above 255: `\\x%x'", code); + code &= 0xFF; + break; + } + c = safegetc(finput); + } + ungetc(c, finput); + } + else + { + warni ("unknown escape sequence: `\\' followed by `%s'", + printable_version(c)); + code = '?'; + } + } /* has \ */ + + /* now fill token_buffer with the canonical name for this character + as a literal token. Do not use what the user typed, + so that `\012' and `\n' can be interchangeable. 
*/ + + p = *pp; + if (code >= 040 && code < 0177) + *p++ = code; + else if (code == '\\') {*p++ = '\\'; *p++ = '\\';} + else if (code == '\'') {*p++ = '\\'; *p++ = '\'';} + else if (code == '\"') {*p++ = '\\'; *p++ = '\"';} + else if (code == '\t') {*p++ = '\\'; *p++ = 't';} + else if (code == '\n') {*p++ = '\\'; *p++ = 'n';} + else if (code == '\r') {*p++ = '\\'; *p++ = 'r';} + else if (code == '\v') {*p++ = '\\'; *p++ = 'v';} + else if (code == '\b') {*p++ = '\\'; *p++ = 'b';} + else if (code == '\f') {*p++ = '\\'; *p++ = 'f';} + else + { + *p++ = '\\'; + *p++ = code / 0100 + '0'; + *p++ = ((code / 010) & 07) + '0'; + *p++ = (code & 07) + '0'; + } + *pp = p; + *pcode = code; + return ! wasquote; +} + + +void +unlex(token) + int token; +{ + unlexed = token; + unlexed_symval = symval; +} + + +int +lex() +{ + register int c; + char *p; + + if (unlexed >= 0) + { + symval = unlexed_symval; + c = unlexed; + unlexed = -1; + return (c); + } + + c = skip_white_space(); + *token_buffer = c; /* for error messages (token buffer always valid) */ + token_buffer[1] = 0; + + switch (c) + { + case EOF: + strcpy(token_buffer, "EOF"); + return (ENDFILE); + + case 'A': case 'B': case 'C': case 'D': case 'E': + case 'F': case 'G': case 'H': case 'I': case 'J': + case 'K': case 'L': case 'M': case 'N': case 'O': + case 'P': case 'Q': case 'R': case 'S': case 'T': + case 'U': case 'V': case 'W': case 'X': case 'Y': + case 'Z': + case 'a': case 'b': case 'c': case 'd': case 'e': + case 'f': case 'g': case 'h': case 'i': case 'j': + case 'k': case 'l': case 'm': case 'n': case 'o': + case 'p': case 'q': case 'r': case 's': case 't': + case 'u': case 'v': case 'w': case 'x': case 'y': + case 'z': + case '.': case '_': + p = token_buffer; + while (isalnum(c) || c == '_' || c == '.') + { + if (p == token_buffer + maxtoken) + p = grow_token_buffer(p); + + *p++ = c; + c = getc(finput); + } + + *p = 0; + ungetc(c, finput); + symval = getsym(token_buffer); + return (IDENTIFIER); + + case '0': 
case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + { + numval = 0; + + p = token_buffer; + while (isdigit(c)) + { + if (p == token_buffer + maxtoken) + p = grow_token_buffer(p); + + *p++ = c; + numval = numval*10 + c - '0'; + c = getc(finput); + } + *p = 0; + ungetc(c, finput); + return (NUMBER); + } + + case '\'': + + /* parse the literal token and compute character code in code */ + + translations = -1; + { + int code, discode; + char discard[10], *dp; + p = token_buffer; + *p++ = '\''; + literalchar(&p, &code, '\''); + + c = getc(finput); + if (c != '\'') + { + warn("use \"...\" for multi-character literal tokens"); + dp = discard; + while (literalchar(&dp, &discode, '\'')) {} + } + *p++ = '\''; + *p = 0; + symval = getsym(token_buffer); + symval->class = STOKEN; + if (! symval->user_token_number) + symval->user_token_number = code; + return (IDENTIFIER); + } + + case '\"': + + /* parse the literal string token and treat as an identifier */ + + translations = -1; + { + int code; /* ignored here */ + p = token_buffer; + *p++ = '\"'; + while (literalchar(&p, &code, '\"')) /* read up to and including " */ + { + if (p >= token_buffer + maxtoken - 4) + p = grow_token_buffer(p); + } + *p = 0; + + symval = getsym(token_buffer); + symval->class = STOKEN; + + return (IDENTIFIER); + } + + case ',': + return (COMMA); + + case ':': + return (COLON); + + case ';': + return (SEMICOLON); + + case '|': + return (BAR); + + case '{': + return (LEFT_CURLY); + + case '=': + do + { + c = getc(finput); + if (c == '\n') lineno++; + } + while(c==' ' || c=='\n' || c=='\t'); + + if (c == '{') + { + strcpy(token_buffer, "={"); + return(LEFT_CURLY); + } + else + { + ungetc(c, finput); + return(ILLEGAL); + } + + case '<': + p = token_buffer; + c = getc(finput); + while (c != '>') + { + if (c == EOF) + fatal("unterminated type name at end of file"); + if (c == '\n') + { + warn("unterminated type name"); + ungetc(c, finput); + break; + } + + if (p == 
token_buffer + maxtoken) + p = grow_token_buffer(p); + + *p++ = c; + c = getc(finput); + } + *p = 0; + return (TYPENAME); + + + case '%': + return (parse_percent_token()); + + default: + return (ILLEGAL); + } +} + +/* the following table dictates the action taken for the various + % directives. A setflag value causes the named flag to be + set. A retval action returns the code. +*/ +struct percent_table_struct { + char *name; + void *setflag; + int retval; +} percent_table[] = +{ + {"token", NULL, TOKEN}, + {"term", NULL, TOKEN}, + {"nterm", NULL, NTERM}, + {"type", NULL, TYPE}, + {"guard", NULL, GUARD}, + {"union", NULL, UNION}, + {"expect", NULL, EXPECT}, + {"thong", NULL, THONG}, + {"start", NULL, START}, + {"left", NULL, LEFT}, + {"right", NULL, RIGHT}, + {"nonassoc", NULL, NONASSOC}, + {"binary", NULL, NONASSOC}, + {"semantic_parser", NULL, SEMANTIC_PARSER}, + {"pure_parser", NULL, PURE_PARSER}, + {"prec", NULL, PREC}, + + {"no_lines", &nolinesflag, NOOP}, /* -l */ + {"raw", &rawtoknumflag, NOOP}, /* -r */ + {"token_table", &toknumflag, NOOP}, /* -k */ + +#if 0 + /* These can be utilized after main is reorganized so + open_files() is deferred 'til after read_declarations(). + But %{ and %union both put information into files + that have to be opened before read_declarations(). + */ + {"yacc", &fixed_outfiles, NOOP}, /* -y */ + {"fixed_output_files", &fixed_outfiles, NOOP}, /* -y */ + {"defines", &definesflag, NOOP}, /* -d */ + {"no_parser", &noparserflag, NOOP}, /* -n */ + {"output_file", &spec_outfile, SETOPT}, /* -o */ + {"file_prefix", &spec_file_prefix, SETOPT}, /* -b */ + {"name_prefix", &spec_name_prefix, SETOPT}, /* -p */ + + /* These would be acceptable, but they do not affect processing */ + {"verbose", &verboseflag, NOOP}, /* -v */ + {"debug", &debugflag, NOOP}, /* -t */ + /* {"help", , NOOP}, /* -h */ + /* {"version", , NOOP}, /* -V */ +#endif + + {NULL, NULL, ILLEGAL} +}; + +/* Parse a token which starts with %. 
+ Assumes the % has already been read and discarded. */ + +int +parse_percent_token () +{ + register int c; + register char *p; + register struct percent_table_struct *tx; + + p = token_buffer; + c = getc(finput); + *p++ = '%'; + *p++ = c; /* for error msg */ + *p = 0; + + switch (c) + { + case '%': + return (TWO_PERCENTS); + + case '{': + return (PERCENT_LEFT_CURLY); + + case '<': + return (LEFT); + + case '>': + return (RIGHT); + + case '2': + return (NONASSOC); + + case '0': + return (TOKEN); + + case '=': + return (PREC); + } + if (!isalpha(c)) + return (ILLEGAL); + + p = token_buffer; + *p++ = '%'; + while (isalpha(c) || c == '_' || c == '-') + { + if (p == token_buffer + maxtoken) + p = grow_token_buffer(p); + + if (c == '-') c = '_'; + *p++ = c; + c = getc(finput); + } + + ungetc(c, finput); + + *p = 0; + + /* table lookup % directive */ + for (tx = percent_table; tx->name; tx++) + if (strcmp(token_buffer+1, tx->name) == 0) + break; + if (tx->retval == SETOPT) + { + *((char **)(tx->setflag)) = optarg; + return NOOP; + } + if (tx->setflag) + { + *((int *)(tx->setflag)) = 1; + return NOOP; + } + return tx->retval; +} diff --git a/contrib/bison/lex.h b/contrib/bison/lex.h new file mode 100644 index 000000000000..6cbaebcbb4a6 --- /dev/null +++ b/contrib/bison/lex.h @@ -0,0 +1,50 @@ +/* Token type definitions for bison's input reader, + Copyright (C) 1984, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#define ENDFILE 0 +#define IDENTIFIER 1 +#define COMMA 2 +#define COLON 3 +#define SEMICOLON 4 +#define BAR 5 +#define LEFT_CURLY 6 +#define TWO_PERCENTS 7 +#define PERCENT_LEFT_CURLY 8 +#define TOKEN 9 +#define NTERM 10 +#define GUARD 11 +#define TYPE 12 +#define UNION 13 +#define START 14 +#define LEFT 15 +#define RIGHT 16 +#define NONASSOC 17 +#define PREC 18 +#define SEMANTIC_PARSER 19 +#define PURE_PARSER 20 +#define TYPENAME 21 +#define NUMBER 22 +#define EXPECT 23 +#define THONG 24 +#define NOOP 25 +#define SETOPT 26 +#define ILLEGAL 27 + +#define MAXTOKEN 1024 diff --git a/contrib/bison/machine.h b/contrib/bison/machine.h new file mode 100644 index 000000000000..6c05691f793d --- /dev/null +++ b/contrib/bison/machine.h @@ -0,0 +1,39 @@ +/* Define machine-dependencies for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + +#ifdef eta10 +#define MAXSHORT 2147483647 +#define MINSHORT -2147483648 +#else +#define MAXSHORT 32767 +#define MINSHORT -32768 +#endif + +#if defined (MSDOS) && !defined (__GO32__) +#define BITS_PER_WORD 16 +#define MAXTABLE 16383 +#else +#define BITS_PER_WORD 32 +#define MAXTABLE 32767 +#endif + +#define WORDSIZE(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD) +#define SETBIT(x, i) ((x)[(i)/BITS_PER_WORD] |= (1<<((i) % BITS_PER_WORD))) +#define RESETBIT(x, i) ((x)[(i)/BITS_PER_WORD] &= ~(1<<((i) % BITS_PER_WORD))) +#define BITISSET(x, i) (((x)[(i)/BITS_PER_WORD] & (1<<((i) % BITS_PER_WORD))) != 0) diff --git a/contrib/bison/main.c b/contrib/bison/main.c new file mode 100644 index 000000000000..e27e1a547544 --- /dev/null +++ b/contrib/bison/main.c @@ -0,0 +1,238 @@ +/* Top level entry point of bison, + Copyright (C) 1984, 1986, 1989, 1992, 1995 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#include <stdio.h> +#include "system.h" +#include "machine.h" /* for MAXSHORT */ + +extern int lineno; +extern int verboseflag; + +/* Nonzero means failure has been detected; don't write a parser file. */ +int failure; + +/* The name this program was run with, for messages. 
*/ +char *program_name; + +extern void getargs(), openfiles(), reader(), reduce_grammar(); +extern void set_derives(), set_nullable(), generate_states(); +extern void lalr(), initialize_conflicts(), verbose(), terse(); +extern void output(), done(); + + +/* VMS complained about using `int'. */ + +int +main(argc, argv) + int argc; + char *argv[]; +{ + program_name = argv[0]; + failure = 0; + lineno = 0; + getargs(argc, argv); + openfiles(); + + /* read the input. Copy some parts of it to fguard, faction, ftable and fattrs. + In file reader.c. + The other parts are recorded in the grammar; see gram.h. */ + reader(); + if (failure) + done(failure); + + /* find useless nonterminals and productions and reduce the grammar. In + file reduce.c */ + reduce_grammar(); + + /* record other info about the grammar. In files derives and nullable. */ + set_derives(); + set_nullable(); + + /* convert to nondeterministic finite state machine. In file LR0. + See state.h for more info. */ + generate_states(); + + /* make it deterministic. In file lalr. */ + lalr(); + + /* Find and record any conflicts: places where one token of lookahead is not + enough to disambiguate the parsing. In file conflicts. + Also resolve s/r conflicts based on precedence declarations. */ + initialize_conflicts(); + + /* print information about results, if requested. In file print. */ + if (verboseflag) + verbose(); + else + terse(); + + /* output the tables and the parser to ftable. In file output. */ + output(); + done(failure); +} + +/* functions to report errors which prevent a parser from being generated */ + + +/* Return a string containing a printable version of C: + either C itself, or the corresponding \DDD code. */ + +char * +printable_version(c) + char c; +{ + static char buf[10]; + if (c < ' ' || c >= '\177') + sprintf(buf, "\\%o", c); + else + { + buf[0] = c; + buf[1] = '\0'; + } + return buf; +} + +/* Generate a string from the integer I. + Return a ptr to internal memory containing the string. 
*/ + +char * +int_to_string(i) + int i; +{ + static char buf[20]; + sprintf(buf, "%d", i); + return buf; +} + +/* Print the message S for a fatal error. */ + +void +fatal(s) + char *s; +{ + extern char *infile; + + if (infile == 0) + fprintf(stderr, "fatal error: %s\n", s); + else + fprintf(stderr, "\"%s\", line %d: %s\n", infile, lineno, s); + done(1); +} + + +/* Print a message for a fatal error. Use FMT to construct the message + and incorporate string X1. */ + +void +fatals(fmt, x1) + char *fmt, *x1; +{ + char buffer[200]; + sprintf(buffer, fmt, x1); + fatal(buffer); +} + +/* Print a warning message S. */ + +void +warn(s) + char *s; +{ + extern char *infile; + + if (infile == 0) + fprintf(stderr, "error: %s\n", s); + else + fprintf(stderr, "(\"%s\", line %d) error: %s\n", + infile, lineno, s); + + failure = 1; +} + +/* Print a warning message containing the string for the integer X1. + The message is given by the format FMT. */ + +void +warni(fmt, x1) + char *fmt; + int x1; +{ + char buffer[200]; + sprintf(buffer, fmt, x1); + warn(buffer); +} + +/* Print a warning message containing the string X1. + The message is given by the format FMT. */ + +void +warns(fmt, x1) + char *fmt, *x1; +{ + char buffer[200]; + sprintf(buffer, fmt, x1); + warn(buffer); +} + +/* Print a warning message containing the two strings X1 and X2. + The message is given by the format FMT. */ + +void +warnss(fmt, x1, x2) + char *fmt, *x1, *x2; +{ + char buffer[200]; + sprintf(buffer, fmt, x1, x2); + warn(buffer); +} + +/* Print a warning message containing the 3 strings X1, X2, X3. + The message is given by the format FMT. */ + +void +warnsss(fmt, x1, x2, x3) + char *fmt, *x1, *x2, *x3; +{ + char buffer[200]; + sprintf(buffer, fmt, x1, x2, x3); + warn(buffer); +} + +/* Print a message for the fatal occurrence of more than MAXSHORT + instances of whatever is denoted by the string S. 
*/ + +void +toomany(s) + char *s; +{ + char buffer[200]; + sprintf(buffer, "limit of %d exceeded, too many %s", MAXSHORT, s); + fatal(buffer); +} + +/* Abort for an internal error denoted by string S. */ + +void +berror(s) + char *s; +{ + fprintf(stderr, "internal error, %s\n", s); + abort(); +} diff --git a/contrib/bison/mkinstalldirs b/contrib/bison/mkinstalldirs new file mode 100755 index 000000000000..a01481be4367 --- /dev/null +++ b/contrib/bison/mkinstalldirs @@ -0,0 +1,40 @@ +#! /bin/sh +# mkinstalldirs --- make directory hierarchy +# Author: Noah Friedman +# Created: 1993-05-16 +# Public domain + +# $Id: mkinstalldirs,v 1.10 1996/05/03 07:37:52 friedman Exp $ + +errstatus=0 + +for file +do + set fnord `echo ":$file" | sed -ne 's/^:\//#/;s/^://;s/\// /g;s/^#/\//;p'` + shift + + pathcomp= + for d + do + pathcomp="$pathcomp$d" + case "$pathcomp" in + -* ) pathcomp=./$pathcomp ;; + esac + + if test ! -d "$pathcomp"; then + echo "mkdir $pathcomp" 1>&2 + + mkdir "$pathcomp" || lasterr=$? + + if test ! -d "$pathcomp"; then + errstatus=$lasterr + fi + fi + + pathcomp="$pathcomp/" + done +done + +exit $errstatus + +# mkinstalldirs ends here diff --git a/contrib/bison/new.h b/contrib/bison/new.h new file mode 100644 index 000000000000..ab045b360b02 --- /dev/null +++ b/contrib/bison/new.h @@ -0,0 +1,31 @@ +/* Storage allocation interface for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#define NEW(t) ((t *) xmalloc((unsigned) sizeof(t))) +#define NEW2(n, t) ((t *) xmalloc((unsigned) ((n) * sizeof(t)))) + +#ifdef __STDC__ +#define FREE(x) (x ? (void) free((char *) (x)) : (void)0) +#else +#define FREE(x) ((x) != 0 && (free ((char *) (x)), 0)) +#endif + +extern char *xmalloc(); +extern char *xrealloc(); diff --git a/contrib/bison/nullable.c b/contrib/bison/nullable.c new file mode 100644 index 000000000000..b85dec610d51 --- /dev/null +++ b/contrib/bison/nullable.c @@ -0,0 +1,136 @@ +/* Part of the bison parser generator, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* set up nullable, a vector saying which nonterminals can expand into the null string. + nullable[i - ntokens] is nonzero if symbol i can do so. 
*/ + +#include <stdio.h> +#include "system.h" +#include "types.h" +#include "gram.h" +#include "new.h" + + +char *nullable; + + +void +set_nullable() +{ + register short *r; + register short *s1; + register short *s2; + register int ruleno; + register int symbol; + register shorts *p; + + short *squeue; + short *rcount; + shorts **rsets; + shorts *relts; + char any_tokens; + short *r1; + +#ifdef TRACE + fprintf(stderr, "Entering set_nullable"); +#endif + + nullable = NEW2(nvars, char) - ntokens; + + squeue = NEW2(nvars, short); + s1 = s2 = squeue; + + rcount = NEW2(nrules + 1, short); + rsets = NEW2(nvars, shorts *) - ntokens; + /* This is said to be more elements than we actually use. + Supposedly nitems - nrules is enough. + But why take the risk? */ + relts = NEW2(nitems + nvars + 1, shorts); + p = relts; + + r = ritem; + while (*r) + { + if (*r < 0) + { + symbol = rlhs[-(*r++)]; + if (symbol >= 0 && !nullable[symbol]) + { + nullable[symbol] = 1; + *s2++ = symbol; + } + } + else + { + r1 = r; + any_tokens = 0; + for (symbol = *r++; symbol > 0; symbol = *r++) + { + if (ISTOKEN(symbol)) + any_tokens = 1; + } + + if (!any_tokens) + { + ruleno = -symbol; + r = r1; + for (symbol = *r++; symbol > 0; symbol = *r++) + { + rcount[ruleno]++; + p->next = rsets[symbol]; + p->value = ruleno; + rsets[symbol] = p; + p++; + } + } + } + } + + while (s1 < s2) + { + p = rsets[*s1++]; + while (p) + { + ruleno = p->value; + p = p->next; + if (--rcount[ruleno] == 0) + { + symbol = rlhs[ruleno]; + if (symbol >= 0 && !nullable[symbol]) + { + nullable[symbol] = 1; + *s2++ = symbol; + } + } + } + } + + FREE(squeue); + FREE(rcount); + FREE(rsets + ntokens); + FREE(relts); +} + + +void +free_nullable() +{ + FREE(nullable + ntokens); +} diff --git a/contrib/bison/output.c b/contrib/bison/output.c new file mode 100644 index 000000000000..8b2d314f6bf3 --- /dev/null +++ b/contrib/bison/output.c @@ -0,0 +1,1484 @@ +/* Output the generated parsing program for bison, + Copyright (C) 1984, 1986, 1989, 1992 
Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* functions to output parsing data to various files. Entries are: + + output_headers () + +Output constant strings to the beginning of certain files. + + output_trailers() + +Output constant strings to the ends of certain files. + + output () + +Output the parsing tables and the parser code to ftable. + +The parser tables consist of these tables. +Starred ones needed only for the semantic parser. +Double starred are output only if switches are set. + +yytranslate = vector mapping yylex's token numbers into bison's token numbers. + +** yytname = vector of string-names indexed by bison token number + +** yytoknum = vector of yylex token numbers corresponding to entries in yytname + +yyrline = vector of line-numbers of all rules. For yydebug printouts. + +yyrhs = vector of items of all rules. + This is exactly what ritems contains. For yydebug and for semantic + parser. + +yyprhs[r] = index in yyrhs of first item for rule r. + +yyr1[r] = symbol number of symbol that rule r derives. + +yyr2[r] = number of symbols composing right hand side of rule r. + +* yystos[s] = the symbol number of the symbol that leads to state s. + +yydefact[s] = default rule to reduce with in state s, + when yytable doesn't specify something else to do. 
+ Zero means the default is an error. + +yydefgoto[i] = default state to go to after a reduction of a rule that + generates variable ntokens + i, except when yytable + specifies something else to do. + +yypact[s] = index in yytable of the portion describing state s. + The lookahead token's type is used to index that portion + to find out what to do. + + If the value in yytable is positive, + we shift the token and go to that state. + + If the value is negative, it is minus a rule number to reduce by. + + If the value is zero, the default action from yydefact[s] is used. + +yypgoto[i] = the index in yytable of the portion describing + what to do after reducing a rule that derives variable i + ntokens. + This portion is indexed by the parser state number, s, + as of before the text for this nonterminal was read. + The value from yytable is the state to go to if + the corresponding value in yycheck is s. + +yytable = a vector filled with portions for different uses, + found via yypact and yypgoto. + +yycheck = a vector indexed in parallel with yytable. + It indicates, in a roundabout way, the bounds of the + portion you are trying to examine. + + Suppose that the portion of yytable starts at index p + and the index to be examined within the portion is i. + Then if yycheck[p+i] != i, i is outside the bounds + of what is actually allocated, and the default + (from yydefact or yydefgoto) should be used. + Otherwise, yytable[p+i] should be used. + +YYFINAL = the state number of the termination state. +YYFLAG = most negative short int. Used to flag ?? +YYNTBASE = ntokens. 
+*/ + +#include <stdio.h> +#include "system.h" +#include "machine.h" +#include "new.h" +#include "files.h" +#include "gram.h" +#include "state.h" + + +extern int debugflag; +extern int nolinesflag; +extern int noparserflag; +extern int toknumflag; + +extern char **tags; +extern int *user_toknums; +extern int tokensetsize; +extern int final_state; +extern core **state_table; +extern shifts **shift_table; +extern errs **err_table; +extern reductions **reduction_table; +extern short *accessing_symbol; +extern unsigned *LA; +extern short *LAruleno; +extern short *lookaheads; +extern char *consistent; +extern short *goto_map; +extern short *from_state; +extern short *to_state; + +void output_token_translations(); +void output_gram(); +void output_stos(); +void output_rule_data(); +void output_defines(); +void output_actions(); +void token_actions(); +void save_row(); +void goto_actions(); +void save_column(); +void sort_actions(); +void pack_table(); +void output_base(); +void output_table(); +void output_check(); +void output_parser(); +void output_program(); +void free_itemset(); +void free_shifts(); +void free_reductions(); +void free_itemsets(); +int action_row(); +int default_goto(); +int matching_state(); +int pack_vector(); + +extern void berror(); +extern void fatals(); +extern char *int_to_string(); +extern void reader_output_yylsp(); + +static int nvectors; +static int nentries; +static short **froms; +static short **tos; +static short *tally; +static short *width; +static short *actrow; +static short *state_count; +static short *order; +static short *base; +static short *pos; +static short *table; +static short *check; +static int lowzero; +static int high; + + + +#define GUARDSTR "\n#include \"%s\"\nextern int yyerror;\n\ +extern int yycost;\nextern char * yymsg;\nextern YYSTYPE yyval;\n\n\ +yyguard(n, yyvsp, yylsp)\nregister int n;\nregister YYSTYPE *yyvsp;\n\ +register YYLTYPE *yylsp;\n\ +{\n yyerror = 0;\nyycost = 0;\n yymsg = 0;\nswitch (n)\n {" + +#define 
ACTSTR "\n#include \"%s\"\nextern YYSTYPE yyval;\ +\nextern int yychar;\ +yyaction(n, yyvsp, yylsp)\nregister int n;\nregister YYSTYPE *yyvsp;\n\ +register YYLTYPE *yylsp;\n{\n switch (n)\n{" + +#define ACTSTR_SIMPLE "\n switch (yyn) {\n" + + +void +output_headers() +{ + if (semantic_parser) + fprintf(fguard, GUARDSTR, attrsfile); + + if (noparserflag) + return; + + fprintf(faction, (semantic_parser ? ACTSTR : ACTSTR_SIMPLE), attrsfile); +/* if (semantic_parser) JF moved this below + fprintf(ftable, "#include \"%s\"\n", attrsfile); + fprintf(ftable, "#include <stdio.h>\n\n"); +*/ + + /* Rename certain symbols if -p was specified. */ + if (spec_name_prefix) + { + fprintf(ftable, "#define yyparse %sparse\n", spec_name_prefix); + fprintf(ftable, "#define yylex %slex\n", spec_name_prefix); + fprintf(ftable, "#define yyerror %serror\n", spec_name_prefix); + fprintf(ftable, "#define yylval %slval\n", spec_name_prefix); + fprintf(ftable, "#define yychar %schar\n", spec_name_prefix); + fprintf(ftable, "#define yydebug %sdebug\n", spec_name_prefix); + fprintf(ftable, "#define yynerrs %snerrs\n", spec_name_prefix); + } +} + + +void +output_trailers() +{ + if (semantic_parser) + fprintf(fguard, "\n }\n}\n"); + + fprintf(faction, "\n"); + + if (noparserflag) + return; + + if (semantic_parser) + fprintf(faction, " }\n"); + fprintf(faction, "}\n"); +} + + +void +output() +{ + int c; + + /* output_token_defines(ftable); / * JF put out token defines FIRST */ + if (!semantic_parser) /* JF Put out other stuff */ + { + rewind(fattrs); + while ((c=getc(fattrs))!=EOF) + putc(c,ftable); + } + reader_output_yylsp(ftable); + if (debugflag) + fprintf(ftable, "#ifndef YYDEBUG\n#define YYDEBUG %d\n#endif\n\n", + !!debugflag); + + if (semantic_parser) + fprintf(ftable, "#include \"%s\"\n", attrsfile); + + if (! noparserflag) + fprintf(ftable, "#include <stdio.h>\n\n"); + + /* Make "const" do nothing if not in ANSI C.
*/ + fprintf (ftable, "#ifndef __cplusplus\n#ifndef __STDC__\n#define const\n#endif\n#endif\n\n"); + + free_itemsets(); + output_defines(); + output_token_translations(); +/* if (semantic_parser) */ + /* This is now unconditional because debugging printouts can use it. */ + output_gram(); + FREE(ritem); + if (semantic_parser) + output_stos(); + output_rule_data(); + output_actions(); + if (! noparserflag) + output_parser(); + output_program(); +} + + +void +output_token_translations() +{ + register int i, j; +/* register short *sp; JF unused */ + + if (translations) + { + fprintf(ftable, + "\n#define YYTRANSLATE(x) ((unsigned)(x) <= %d ? yytranslate[x] : %d)\n", + max_user_token_number, nsyms); + + if (ntokens < 127) /* play it very safe; check maximum element value. */ + fprintf(ftable, "\nstatic const char yytranslate[] = { 0"); + else + fprintf(ftable, "\nstatic const short yytranslate[] = { 0"); + + j = 10; + for (i = 1; i <= max_user_token_number; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", token_translations[i]); + } + + fprintf(ftable, "\n};\n"); + } + else + { + fprintf(ftable, "\n#define YYTRANSLATE(x) (x)\n"); + } +} + + +void +output_gram() +{ + register int i; + register int j; + register short *sp; + + /* With the ordinary parser, + yyprhs and yyrhs are needed only for yydebug. */ + /* With the noparser option, all tables are generated */ + if (! semantic_parser && ! 
noparserflag) + fprintf(ftable, "\n#if YYDEBUG != 0"); + + fprintf(ftable, "\nstatic const short yyprhs[] = { 0"); + + j = 10; + for (i = 1; i <= nrules; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", rrhs[i]); + } + + fprintf(ftable, "\n};\n"); + + fprintf(ftable, "\nstatic const short yyrhs[] = {%6d", ritem[0]); + + j = 10; + for (sp = ritem + 1; *sp; sp++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + if (*sp > 0) + fprintf(ftable, "%6d", *sp); + else + fprintf(ftable, " 0"); + } + + fprintf(ftable, "\n};\n"); + + if (! semantic_parser && ! noparserflag) + fprintf(ftable, "\n#endif\n"); +} + + +void +output_stos() +{ + register int i; + register int j; + + fprintf(ftable, "\nstatic const short yystos[] = { 0"); + + j = 10; + for (i = 1; i < nstates; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", accessing_symbol[i]); + } + + fprintf(ftable, "\n};\n"); +} + + +void +output_rule_data() +{ + register int i; + register int j; + + fprintf(ftable, "\n#if YYDEBUG != 0\n"); + fprintf(ftable, "static const short yyrline[] = { 0"); + + j = 10; + for (i = 1; i <= nrules; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", rline[i]); + } + fprintf(ftable, "\n};\n#endif\n\n"); + + if (toknumflag || noparserflag) + { + fprintf(ftable, "#define YYNTOKENS %d\n", ntokens); + fprintf(ftable, "#define YYNNTS %d\n", nvars); + fprintf(ftable, "#define YYNRULES %d\n", nrules); + fprintf(ftable, "#define YYNSTATES %d\n", nstates); + fprintf(ftable, "#define YYMAXUTOK %d\n\n", max_user_token_number); + } + + if (! toknumflag && ! noparserflag) + fprintf(ftable, "\n#if YYDEBUG != 0 || defined (YYERROR_VERBOSE)\n\n"); + + /* Output the table of symbol names. 
*/ + + fprintf(ftable, + "static const char * const yytname[] = { \"%s\"", + tags[0]); + + j = strlen (tags[0]) + 44; + for (i = 1; i < nsyms; i++) + /* this used to be i<=nsyms, but that output a final "" symbol + almost by accident */ + { + register char *p; + putc(',', ftable); + j++; + + if (j > 75) + { + putc('\n', ftable); + j = 0; + } + + putc ('\"', ftable); + j++; + + for (p = tags[i]; p && *p; p++) + { + if (*p == '"' || *p == '\\') + { + fprintf(ftable, "\\%c", *p); + j += 2; + } + else if (*p == '\n') + { + fprintf(ftable, "\\n"); + j += 2; + } + else if (*p == '\t') + { + fprintf(ftable, "\\t"); + j += 2; + } + else if (*p == '\b') + { + fprintf(ftable, "\\b"); + j += 2; + } + else if (*p < 040 || *p >= 0177) + { + fprintf(ftable, "\\%03o", *p); + j += 4; + } + else + { + putc(*p, ftable); + j++; + } + } + + putc ('\"', ftable); + j++; + } + fprintf(ftable, ", NULL\n};\n"); /* add a NULL entry to list of tokens */ + + if (! toknumflag && ! noparserflag) + fprintf(ftable, "#endif\n\n"); + + if (toknumflag) + { + fprintf(ftable, "static const short yytoknum[] = { 0"); + j = 10; + for (i = 1; i <= ntokens; i++) { + putc(',', ftable); + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + j++; + fprintf(ftable, "%6d", user_toknums[i]); + } + fprintf(ftable, "\n};\n\n"); + } + + fprintf(ftable, "static const short yyr1[] = { 0"); + + j = 10; + for (i = 1; i <= nrules; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", rlhs[i]); + } + + FREE(rlhs + 1); + + fprintf(ftable, "\n};\n\nstatic const short yyr2[] = { 0"); + + j = 10; + for (i = 1; i < nrules; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", rrhs[i + 1] - rrhs[i] - 1); + } + + putc(',', ftable); + if (j >= 10) + putc('\n', ftable); + + fprintf(ftable, "%6d\n};\n", nitems - rrhs[nrules] - 1); + FREE(rrhs + 1); +} + + +void 
+output_defines() +{ + fprintf(ftable, "\n\n#define\tYYFINAL\t\t%d\n", final_state); + fprintf(ftable, "#define\tYYFLAG\t\t%d\n", MINSHORT); + fprintf(ftable, "#define\tYYNTBASE\t%d\n", ntokens); +} + + + +/* compute and output yydefact, yydefgoto, yypact, yypgoto, yytable and yycheck. */ + +void +output_actions() +{ + nvectors = nstates + nvars; + + froms = NEW2(nvectors, short *); + tos = NEW2(nvectors, short *); + tally = NEW2(nvectors, short); + width = NEW2(nvectors, short); + + token_actions(); + free_shifts(); + free_reductions(); + FREE(lookaheads); + FREE(LA); + FREE(LAruleno); + FREE(accessing_symbol); + + goto_actions(); + FREE(goto_map + ntokens); + FREE(from_state); + FREE(to_state); + + sort_actions(); + pack_table(); + output_base(); + output_table(); + output_check(); +} + + + +/* figure out the actions for the specified state, indexed by lookahead token type. + + The yydefact table is output now. The detailed info + is saved for putting into yytable later. */ + +void +token_actions() +{ + register int i; + register int j; + register int k; + + actrow = NEW2(ntokens, short); + + k = action_row(0); + fprintf(ftable, "\nstatic const short yydefact[] = {%6d", k); + save_row(0); + + j = 10; + for (i = 1; i < nstates; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + k = action_row(i); + fprintf(ftable, "%6d", k); + save_row(i); + } + + fprintf(ftable, "\n};\n"); + FREE(actrow); +} + + + +/* Decide what to do for each type of token if seen as the lookahead token in specified state. + The value returned is used as the default action (yydefact) for the state. + In addition, actrow is filled with what to do for each kind of token, + index by symbol number, with zero meaning do the default action. + The value MINSHORT, a very negative number, means this situation + is an error. The parser recognizes this value specially. + + This is where conflicts are resolved. 
The loop over lookahead rules + considered lower-numbered rules last, and the last rule considered that likes + a token gets to handle it. */ + +int +action_row(state) +int state; +{ + register int i; + register int j; + register int k; + register int m; + register int n; + register int count; + register int default_rule; + register int nreds; + register int max; + register int rule; + register int shift_state; + register int symbol; + register unsigned mask; + register unsigned *wordp; + register reductions *redp; + register shifts *shiftp; + register errs *errp; + int nodefault = 0; /* set nonzero to inhibit having any default reduction */ + + for (i = 0; i < ntokens; i++) + actrow[i] = 0; + + default_rule = 0; + nreds = 0; + redp = reduction_table[state]; + + if (redp) + { + nreds = redp->nreds; + + if (nreds >= 1) + { + /* loop over all the rules available here which require lookahead */ + m = lookaheads[state]; + n = lookaheads[state + 1]; + + for (i = n - 1; i >= m; i--) + { + rule = - LAruleno[i]; + wordp = LA + i * tokensetsize; + mask = 1; + + /* and find each token which the rule finds acceptable to come next */ + for (j = 0; j < ntokens; j++) + { + /* and record this rule as the rule to use if that token follows. */ + if (mask & *wordp) + actrow[j] = rule; + + mask <<= 1; + if (mask == 0) + { + mask = 1; + wordp++; + } + } + } + } + } + + shiftp = shift_table[state]; + + /* now see which tokens are allowed for shifts in this state. + For them, record the shift as the thing to do. So shift is preferred to reduce. */ + + if (shiftp) + { + k = shiftp->nshifts; + + for (i = 0; i < k; i++) + { + shift_state = shiftp->shifts[i]; + if (! 
shift_state) continue; + + symbol = accessing_symbol[shift_state]; + + if (ISVAR(symbol)) + break; + + actrow[symbol] = shift_state; + + /* do not use any default reduction if there is a shift for error */ + + if (symbol == error_token_number) nodefault = 1; + } + } + + errp = err_table[state]; + + /* See which tokens are an explicit error in this state + (due to %nonassoc). For them, record MINSHORT as the action. */ + + if (errp) + { + k = errp->nerrs; + + for (i = 0; i < k; i++) + { + symbol = errp->errs[i]; + actrow[symbol] = MINSHORT; + } + } + + /* now find the most common reduction and make it the default action for this state. */ + + if (nreds >= 1 && ! nodefault) + { + if (consistent[state]) + default_rule = redp->rules[0]; + else + { + max = 0; + for (i = m; i < n; i++) + { + count = 0; + rule = - LAruleno[i]; + + for (j = 0; j < ntokens; j++) + { + if (actrow[j] == rule) + count++; + } + + if (count > max) + { + max = count; + default_rule = rule; + } + } + + /* actions which match the default are replaced with zero, + which means "use the default" */ + + if (max > 0) + { + for (j = 0; j < ntokens; j++) + { + if (actrow[j] == default_rule) + actrow[j] = 0; + } + + default_rule = - default_rule; + } + } + } + + /* If have no default rule, the default is an error. + So replace any action which says "error" with "use default". 
*/ + + if (default_rule == 0) + for (j = 0; j < ntokens; j++) + { + if (actrow[j] == MINSHORT) + actrow[j] = 0; + } + + return (default_rule); +} + + +void +save_row(state) +int state; +{ + register int i; + register int count; + register short *sp; + register short *sp1; + register short *sp2; + + count = 0; + for (i = 0; i < ntokens; i++) + { + if (actrow[i] != 0) + count++; + } + + if (count == 0) + return; + + froms[state] = sp1 = sp = NEW2(count, short); + tos[state] = sp2 = NEW2(count, short); + + for (i = 0; i < ntokens; i++) + { + if (actrow[i] != 0) + { + *sp1++ = i; + *sp2++ = actrow[i]; + } + } + + tally[state] = count; + width[state] = sp1[-1] - sp[0] + 1; +} + + + +/* figure out what to do after reducing with each rule, + depending on the saved state from before the beginning + of parsing the data that matched this rule. + + The yydefgoto table is output now. The detailed info + is saved for putting into yytable later. */ + +void +goto_actions() +{ + register int i; + register int j; + register int k; + + state_count = NEW2(nstates, short); + + k = default_goto(ntokens); + fprintf(ftable, "\nstatic const short yydefgoto[] = {%6d", k); + save_column(ntokens, k); + + j = 10; + for (i = ntokens + 1; i < nsyms; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + k = default_goto(i); + fprintf(ftable, "%6d", k); + save_column(i, k); + } + + fprintf(ftable, "\n};\n"); + FREE(state_count); +} + + + +int +default_goto(symbol) +int symbol; +{ + register int i; + register int m; + register int n; + register int default_state; + register int max; + + m = goto_map[symbol]; + n = goto_map[symbol + 1]; + + if (m == n) + return (-1); + + for (i = 0; i < nstates; i++) + state_count[i] = 0; + + for (i = m; i < n; i++) + state_count[to_state[i]]++; + + max = 0; + default_state = -1; + + for (i = 0; i < nstates; i++) + { + if (state_count[i] > max) + { + max = state_count[i]; + default_state = i; + } + } + + 
return (default_state); +} + + +void +save_column(symbol, default_state) +int symbol; +int default_state; +{ + register int i; + register int m; + register int n; + register short *sp; + register short *sp1; + register short *sp2; + register int count; + register int symno; + + m = goto_map[symbol]; + n = goto_map[symbol + 1]; + + count = 0; + for (i = m; i < n; i++) + { + if (to_state[i] != default_state) + count++; + } + + if (count == 0) + return; + + symno = symbol - ntokens + nstates; + + froms[symno] = sp1 = sp = NEW2(count, short); + tos[symno] = sp2 = NEW2(count, short); + + for (i = m; i < n; i++) + { + if (to_state[i] != default_state) + { + *sp1++ = from_state[i]; + *sp2++ = to_state[i]; + } + } + + tally[symno] = count; + width[symno] = sp1[-1] - sp[0] + 1; +} + + + +/* the next few functions decide how to pack + the actions and gotos information into yytable. */ + +void +sort_actions() +{ + register int i; + register int j; + register int k; + register int t; + register int w; + + order = NEW2(nvectors, short); + nentries = 0; + + for (i = 0; i < nvectors; i++) + { + if (tally[i] > 0) + { + t = tally[i]; + w = width[i]; + j = nentries - 1; + + while (j >= 0 && (width[order[j]] < w)) + j--; + + while (j >= 0 && (width[order[j]] == w) && (tally[order[j]] < t)) + j--; + + for (k = nentries - 1; k > j; k--) + order[k + 1] = order[k]; + + order[j + 1] = i; + nentries++; + } + } +} + + +void +pack_table() +{ + register int i; + register int place; + register int state; + + base = NEW2(nvectors, short); + pos = NEW2(nentries, short); + table = NEW2(MAXTABLE, short); + check = NEW2(MAXTABLE, short); + + lowzero = 0; + high = 0; + + for (i = 0; i < nvectors; i++) + base[i] = MINSHORT; + + for (i = 0; i < MAXTABLE; i++) + check[i] = -1; + + for (i = 0; i < nentries; i++) + { + state = matching_state(i); + + if (state < 0) + place = pack_vector(i); + else + place = base[state]; + + pos[i] = place; + base[order[i]] = place; + } + + for (i = 0; i < nvectors; i++) + 
{ + if (froms[i]) + FREE(froms[i]); + if (tos[i]) + FREE(tos[i]); + } + + FREE(froms); + FREE(tos); + FREE(pos); +} + + + +int +matching_state(vector) +int vector; +{ + register int i; + register int j; + register int k; + register int t; + register int w; + register int match; + register int prev; + + i = order[vector]; + if (i >= nstates) + return (-1); + + t = tally[i]; + w = width[i]; + + for (prev = vector - 1; prev >= 0; prev--) + { + j = order[prev]; + if (width[j] != w || tally[j] != t) + return (-1); + + match = 1; + for (k = 0; match && k < t; k++) + { + if (tos[j][k] != tos[i][k] || froms[j][k] != froms[i][k]) + match = 0; + } + + if (match) + return (j); + } + + return (-1); +} + + + +int +pack_vector(vector) +int vector; +{ + register int i; + register int j; + register int k; + register int t; + register int loc; + register int ok; + register short *from; + register short *to; + + i = order[vector]; + t = tally[i]; + + if (t == 0) + berror("pack_vector"); + + from = froms[i]; + to = tos[i]; + + for (j = lowzero - from[0]; j < MAXTABLE; j++) + { + ok = 1; + + for (k = 0; ok && k < t; k++) + { + loc = j + from[k]; + if (loc > MAXTABLE) + fatals("maximum table size (%s) exceeded", int_to_string(MAXTABLE)); + + if (table[loc] != 0) + ok = 0; + } + + for (k = 0; ok && k < vector; k++) + { + if (pos[k] == j) + ok = 0; + } + + if (ok) + { + for (k = 0; k < t; k++) + { + loc = j + from[k]; + table[loc] = to[k]; + check[loc] = from[k]; + } + + while (table[lowzero] != 0) + lowzero++; + + if (loc > high) + high = loc; + + return (j); + } + } + + berror("pack_vector"); + return 0; /* JF keep lint happy */ +} + + + +/* the following functions output yytable, yycheck + and the vectors whose elements index the portion starts */ + +void +output_base() +{ + register int i; + register int j; + + fprintf(ftable, "\nstatic const short yypact[] = {%6d", base[0]); + + j = 10; + for (i = 1; i < nstates; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', 
ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", base[i]); + } + + fprintf(ftable, "\n};\n\nstatic const short yypgoto[] = {%6d", base[nstates]); + + j = 10; + for (i = nstates + 1; i < nvectors; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", base[i]); + } + + fprintf(ftable, "\n};\n"); + FREE(base); +} + + +void +output_table() +{ + register int i; + register int j; + + fprintf(ftable, "\n\n#define\tYYLAST\t\t%d\n\n", high); + fprintf(ftable, "\nstatic const short yytable[] = {%6d", table[0]); + + j = 10; + for (i = 1; i <= high; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", table[i]); + } + + fprintf(ftable, "\n};\n"); + FREE(table); +} + + +void +output_check() +{ + register int i; + register int j; + + fprintf(ftable, "\nstatic const short yycheck[] = {%6d", check[0]); + + j = 10; + for (i = 1; i <= high; i++) + { + putc(',', ftable); + + if (j >= 10) + { + putc('\n', ftable); + j = 1; + } + else + { + j++; + } + + fprintf(ftable, "%6d", check[i]); + } + + fprintf(ftable, "\n};\n"); + FREE(check); +} + + + +/* copy the parser code into the ftable file at the end. */ + +void +output_parser() +{ + register int c; +#ifdef DONTDEF + FILE *fpars; +#else +#define fpars fparser +#endif + + if (pure_parser) + fprintf(ftable, "#define YYPURE 1\n\n"); + +#ifdef DONTDEF /* JF no longer needed 'cuz open_extra_files changes the + currently open parser from bison.simple to bison.hairy */ + if (semantic_parser) + fpars = fparser; + else fpars = fparser1; +#endif + + /* Loop over lines in the standard parser file. */ + + while (1) + { + int write_line = 1; + + c = getc(fpars); + + /* See if the line starts with `#line. + If so, set write_line to 0. 
*/ + if (nolinesflag) + if (c == '#') + { + c = getc(fpars); + if (c == 'l') + { + c = getc(fpars); + if (c == 'i') + { + c = getc(fpars); + if (c == 'n') + { + c = getc(fpars); + if (c == 'e') + write_line = 0; + else + fprintf(ftable, "#lin"); + } + else + fprintf(ftable, "#li"); + } + else + fprintf(ftable, "#l"); + } + else + fprintf(ftable, "#"); + } + + /* now write out the line... */ + for (; c != '\n' && c != EOF; c = getc(fpars)) + if (write_line) + if (c == '$') + { + /* `$' in the parser file indicates where to put the actions. + Copy them in at this point. */ + rewind(faction); + for(c=getc(faction);c!=EOF;c=getc(faction)) + putc(c,ftable); + } + else + putc(c, ftable); + if (c == EOF) + break; + putc(c, ftable); + } +} + +void +output_program() +{ + register int c; + extern int lineno; + + if (!nolinesflag) + fprintf(ftable, "#line %d \"%s\"\n", lineno, infile); + + c = getc(finput); + while (c != EOF) + { + putc(c, ftable); + c = getc(finput); + } +} + + +void +free_itemsets() +{ + register core *cp,*cptmp; + + FREE(state_table); + + for (cp = first_state; cp; cp = cptmp) { + cptmp=cp->next; + FREE(cp); + } +} + + +void +free_shifts() +{ + register shifts *sp,*sptmp;/* JF derefrenced freed ptr */ + + FREE(shift_table); + + for (sp = first_shift; sp; sp = sptmp) { + sptmp=sp->next; + FREE(sp); + } +} + + +void +free_reductions() +{ + register reductions *rp,*rptmp;/* JF fixed freed ptr */ + + FREE(reduction_table); + + for (rp = first_reduction; rp; rp = rptmp) { + rptmp=rp->next; + FREE(rp); + } +} diff --git a/contrib/bison/print.c b/contrib/bison/print.c new file mode 100644 index 000000000000..17a47ab4fdad --- /dev/null +++ b/contrib/bison/print.c @@ -0,0 +1,373 @@ +/* Print information on generated parser, for bison, + Copyright (C) 1984, 1986, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. 
+ +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +#include <stdio.h> +#include "system.h" +#include "machine.h" +#include "new.h" +#include "files.h" +#include "gram.h" +#include "state.h" + + +extern char **tags; +extern int nstates; +extern short *accessing_symbol; +extern core **state_table; +extern shifts **shift_table; +extern errs **err_table; +extern reductions **reduction_table; +extern char *consistent; +extern char any_conflicts; +extern char *conflicts; +extern int final_state; + +extern void conflict_log(); +extern void verbose_conflict_log(); +extern void print_reductions(); + +void print_token(); +void print_state(); +void print_core(); +void print_actions(); +void print_grammar(); + +void +terse() +{ + if (any_conflicts) + { + conflict_log(); + } +} + + +void +verbose() +{ + register int i; + + if (any_conflicts) + verbose_conflict_log(); + + print_grammar(); + + for (i = 0; i < nstates; i++) + { + print_state(i); + } +} + + +void +print_token(extnum, token) +int extnum, token; +{ + fprintf(foutput, " type %d is %s\n", extnum, tags[token]); +} + + +void +print_state(state) +int state; +{ + fprintf(foutput, "\n\nstate %d\n\n", state); + print_core(state); + print_actions(state); +} + + +void +print_core(state) +int state; +{ + register int i; + register int k; + register int rule; + register core *statep; + register short *sp; + register short *sp1; + + statep
= state_table[state]; + k = statep->nitems; + + if (k == 0) return; + + for (i = 0; i < k; i++) + { + sp1 = sp = ritem + statep->items[i]; + + while (*sp > 0) + sp++; + + rule = -(*sp); + fprintf(foutput, " %s -> ", tags[rlhs[rule]]); + + for (sp = ritem + rrhs[rule]; sp < sp1; sp++) + { + fprintf(foutput, "%s ", tags[*sp]); + } + + putc('.', foutput); + + while (*sp > 0) + { + fprintf(foutput, " %s", tags[*sp]); + sp++; + } + + fprintf (foutput, " (rule %d)", rule); + putc('\n', foutput); + } + + putc('\n', foutput); +} + + +void +print_actions(state) +int state; +{ + register int i; + register int k; + register int state1; + register int symbol; + register shifts *shiftp; + register errs *errp; + register reductions *redp; + register int rule; + + shiftp = shift_table[state]; + redp = reduction_table[state]; + errp = err_table[state]; + + if (!shiftp && !redp) + { + if (final_state == state) + fprintf(foutput, " $default\taccept\n"); + else + fprintf(foutput, " NO ACTIONS\n"); + return; + } + + if (shiftp) + { + k = shiftp->nshifts; + + for (i = 0; i < k; i++) + { + if (! shiftp->shifts[i]) continue; + state1 = shiftp->shifts[i]; + symbol = accessing_symbol[state1]; + /* The following line used to be turned off. */ + if (ISVAR(symbol)) break; + if (symbol==0) /* I.e. strcmp(tags[symbol],"$")==0 */ + fprintf(foutput, " $ \tgo to state %d\n", state1); + else + fprintf(foutput, " %-4s\tshift, and go to state %d\n", + tags[symbol], state1); + } + + if (i > 0) + putc('\n', foutput); + } + else + { + i = 0; + k = 0; + } + + if (errp) + { + int j, nerrs; + + nerrs = errp->nerrs; + + for (j = 0; j < nerrs; j++) + { + if (! 
errp->errs[j]) continue; + symbol = errp->errs[j]; + fprintf(foutput, " %-4s\terror (nonassociative)\n", tags[symbol]); + } + + if (j > 0) + putc('\n', foutput); + } + + if (consistent[state] && redp) + { + rule = redp->rules[0]; + symbol = rlhs[rule]; + fprintf(foutput, " $default\treduce using rule %d (%s)\n\n", + rule, tags[symbol]); + } + else if (redp) + { + print_reductions(state); + } + + if (i < k) + { + for (; i < k; i++) + { + if (! shiftp->shifts[i]) continue; + state1 = shiftp->shifts[i]; + symbol = accessing_symbol[state1]; + fprintf(foutput, " %-4s\tgo to state %d\n", tags[symbol], state1); + } + + putc('\n', foutput); + } +} + +#define END_TEST(end) \ + if (column + strlen(buffer) > (end)) \ + { fprintf (foutput, "%s\n ", buffer); column = 3; buffer[0] = 0; } \ + else + +void +print_grammar() +{ + int i, j; + short* rule; + char buffer[90]; + int column = 0; + + /* rule # : LHS -> RHS */ + fputs("\nGrammar\n", foutput); + for (i = 1; i <= nrules; i++) + /* Don't print rules disabled in reduce_grammar_tables. 
*/ + if (rlhs[i] >= 0) + { + fprintf(foutput, "rule %-4d %s ->", i, tags[rlhs[i]]); + rule = &ritem[rrhs[i]]; + if (*rule > 0) + while (*rule > 0) + fprintf(foutput, " %s", tags[*rule++]); + else + fputs (" /* empty */", foutput); + putc('\n', foutput); + } + + /* TERMINAL (type #) : rule #s terminal is on RHS */ + fputs("\nTerminals, with rules where they appear\n\n", foutput); + fprintf(foutput, "%s (-1)\n", tags[0]); + if (translations) + { + for (i = 0; i <= max_user_token_number; i++) + if (token_translations[i] != 2) + { + buffer[0] = 0; + column = strlen (tags[token_translations[i]]); + fprintf(foutput, "%s", tags[token_translations[i]]); + END_TEST (50); + sprintf (buffer, " (%d)", i); + + for (j = 1; j <= nrules; j++) + { + for (rule = &ritem[rrhs[j]]; *rule > 0; rule++) + if (*rule == token_translations[i]) + { + END_TEST (65); + sprintf (buffer + strlen(buffer), " %d", j); + break; + } + } + fprintf (foutput, "%s\n", buffer); + } + } + else + for (i = 1; i < ntokens; i++) + { + buffer[0] = 0; + column = strlen (tags[i]); + fprintf(foutput, "%s", tags[i]); + END_TEST (50); + sprintf (buffer, " (%d)", i); + + for (j = 1; j <= nrules; j++) + { + for (rule = &ritem[rrhs[j]]; *rule > 0; rule++) + if (*rule == i) + { + END_TEST (65); + sprintf (buffer + strlen(buffer), " %d", j); + break; + } + } + fprintf (foutput, "%s\n", buffer); + } + + fputs("\nNonterminals, with rules where they appear\n\n", foutput); + for (i = ntokens; i <= nsyms - 1; i++) + { + int left_count = 0, right_count = 0; + + for (j = 1; j <= nrules; j++) + { + if (rlhs[j] == i) + left_count++; + for (rule = &ritem[rrhs[j]]; *rule > 0; rule++) + if (*rule == i) + { + right_count++; + break; + } + } + + buffer[0] = 0; + fprintf(foutput, "%s", tags[i]); + column = strlen (tags[i]); + sprintf (buffer, " (%d)", i); + END_TEST (0); + + if (left_count > 0) + { + END_TEST (50); + sprintf (buffer + strlen(buffer), " on left:"); + + for (j = 1; j <= nrules; j++) + { + END_TEST (65); + if (rlhs[j] == 
i) + sprintf (buffer + strlen(buffer), " %d", j); + } + } + + if (right_count > 0) + { + if (left_count > 0) + sprintf (buffer + strlen(buffer), ","); + END_TEST (50); + sprintf (buffer + strlen(buffer), " on right:"); + for (j = 1; j <= nrules; j++) + { + for (rule = &ritem[rrhs[j]]; *rule > 0; rule++) + if (*rule == i) + { + END_TEST (65); + sprintf (buffer + strlen(buffer), " %d", j); + break; + } + } + } + fprintf (foutput, "%s\n", buffer); + } +} diff --git a/contrib/bison/reader.c b/contrib/bison/reader.c new file mode 100644 index 000000000000..662353b307ef --- /dev/null +++ b/contrib/bison/reader.c @@ -0,0 +1,2073 @@ +/* Input parser for bison + Copyright (C) 1984, 1986, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* read in the grammar specification and record it in the format described in gram.h. + All guards are copied into the fguard file and all actions into faction, + in each case forming the body of a C function (yyguard or yyaction) + which contains a switch statement to decide which guard or action to execute. + +The entry point is reader(). 
*/ + +#include +#include +#include "system.h" +#include "files.h" +#include "new.h" +#include "symtab.h" +#include "lex.h" +#include "gram.h" +#include "machine.h" + +#define LTYPESTR "\n#ifndef YYLTYPE\ntypedef\n struct yyltype\n\ + {\n int timestamp;\n int first_line;\n int first_column;\ +\n int last_line;\n int last_column;\n char *text;\n }\n\ + yyltype;\n\n#define YYLTYPE yyltype\n#endif\n\n" + +/* Number of slots allocated (but not necessarily used yet) in `rline' */ +int rline_allocated; + +extern char *program_name; +extern int definesflag; +extern int nolinesflag; +extern int noparserflag; +extern int rawtoknumflag; +extern bucket *symval; +extern int numval; +extern int expected_conflicts; +extern char *token_buffer; + +extern void init_lex(); +extern void tabinit(); +extern void output_headers(); +extern void output_trailers(); +extern void free_symtab(); +extern void open_extra_files(); +extern char *int_to_string(); +extern void fatal(); +extern void fatals(); +extern void warn(); +extern void warni(); +extern void warns(); +extern void warnss(); +extern void warnsss(); +extern void unlex(); +extern void done(); + +extern int skip_white_space(); +extern int parse_percent_token(); +extern int lex(); + +void reader_output_yylsp(); +void read_declarations(); +void copy_definition(); +void parse_token_decl(); +void parse_start_decl(); +void parse_type_decl(); +void parse_assoc_decl(); +void parse_union_decl(); +void parse_expect_decl(); +void parse_thong_decl(); +void copy_action(); +void readgram(); +void record_rule_line(); +void packsymbols(); +void output_token_defines(); +void packgram(); +int read_signed_integer(); +static int get_type(); + +typedef + struct symbol_list + { + struct symbol_list *next; + bucket *sym; + bucket *ruleprec; + } + symbol_list; + + + +int lineno; +symbol_list *grammar; +int start_flag; +bucket *startval; +char **tags; +int *user_toknums; + +/* Nonzero if components of semantic values are used, implying + they must be 
unions. */ +static int value_components_used; + +static int typed; /* nonzero if %union has been seen. */ + +static int lastprec; /* incremented for each %left, %right or %nonassoc seen */ + +static int gensym_count; /* incremented for each generated symbol */ + +static bucket *errtoken; + +/* Nonzero if any action or guard uses the @n construct. */ +static int yylsp_needed; + +extern char *version_string; + + +static void +skip_to_char(target) + int target; +{ + int c; + if (target == '\n') + warn(" Skipping to next \\n"); + else + warni(" Skipping to next %c", target); + + do + c = skip_white_space(); + while (c != target && c != EOF); + if (c != EOF) + ungetc(c, finput); +} + + +void +reader() +{ + start_flag = 0; + startval = NULL; /* start symbol not specified yet. */ + +#if 0 + translations = 0; /* initially assume token number translation not needed. */ +#endif + /* Nowadays translations is always set to 1, + since we give `error' a user-token-number + to satisfy the Posix demand for YYERRCODE==256. */ + translations = 1; + + nsyms = 1; + nvars = 0; + nrules = 0; + nitems = 0; + rline_allocated = 10; + rline = NEW2(rline_allocated, short); + + typed = 0; + lastprec = 0; + + gensym_count = 0; + + semantic_parser = 0; + pure_parser = 0; + yylsp_needed = 0; + + grammar = NULL; + + init_lex(); + lineno = 1; + + /* initialize the symbol table. */ + tabinit(); + /* construct the error token */ + errtoken = getsym("error"); + errtoken->class = STOKEN; + errtoken->user_token_number = 256; /* Value specified by posix. */ + /* construct a token that represents all undefined literal tokens. */ + /* it is always token number 2. */ + getsym("$undefined.")->class = STOKEN; + /* Read the declaration section. Copy %{ ... %} groups to ftable and fdefines file. + Also notice any %token, %left, etc. found there. 
*/ + if (noparserflag) + fprintf(ftable, "\n/* Bison-generated parse tables, made from %s\n", + infile); + else + fprintf(ftable, "\n/* A Bison parser, made from %s\n", infile); + fprintf(ftable, " by %s */\n\n", version_string); + fprintf(ftable, "#define YYBISON 1 /* Identify Bison output. */\n\n"); + read_declarations(); + /* start writing the guard and action files, if they are needed. */ + output_headers(); + /* read in the grammar, build grammar in list form. write out guards and actions. */ + readgram(); + /* Now we know whether we need the line-number stack. + If we do, write its type into the .tab.h file. */ + if (fdefines) + reader_output_yylsp(fdefines); + /* write closing delimiters for actions and guards. */ + output_trailers(); + if (yylsp_needed) + fprintf(ftable, "#define YYLSP_NEEDED\n\n"); + /* assign the symbols their symbol numbers. + Write #defines for the token symbols into fdefines if requested. */ + packsymbols(); + /* convert the grammar into the format described in gram.h. */ + packgram(); + /* free the symbol table data structure + since symbols are now all referred to by symbol number. */ + free_symtab(); +} + +void +reader_output_yylsp(f) + FILE *f; +{ + if (yylsp_needed) + fprintf(f, LTYPESTR); +} + +/* read from finput until %% is seen. Discard the %%. +Handle any % declarations, +and copy the contents of any %{ ... %} groups to fattrs. 
*/ + +void +read_declarations () +{ + register int c; + register int tok; + + for (;;) + { + c = skip_white_space(); + + if (c == '%') + { + tok = parse_percent_token(); + + switch (tok) + { + case TWO_PERCENTS: + return; + + case PERCENT_LEFT_CURLY: + copy_definition(); + break; + + case TOKEN: + parse_token_decl (STOKEN, SNTERM); + break; + + case NTERM: + parse_token_decl (SNTERM, STOKEN); + break; + + case TYPE: + parse_type_decl(); + break; + + case START: + parse_start_decl(); + break; + + case UNION: + parse_union_decl(); + break; + + case EXPECT: + parse_expect_decl(); + break; + case THONG: + parse_thong_decl(); + break; + case LEFT: + parse_assoc_decl(LEFT_ASSOC); + break; + + case RIGHT: + parse_assoc_decl(RIGHT_ASSOC); + break; + + case NONASSOC: + parse_assoc_decl(NON_ASSOC); + break; + + case SEMANTIC_PARSER: + if (semantic_parser == 0) + { + semantic_parser = 1; + open_extra_files(); + } + break; + + case PURE_PARSER: + pure_parser = 1; + break; + + case NOOP: + break; + + default: + warns("unrecognized: %s", token_buffer); + skip_to_char('%'); + } + } + else if (c == EOF) + fatal("no input grammar"); + else + { + char buff[100]; + sprintf(buff, "unknown character: %s", printable_version(c)); + warn(buff); + skip_to_char('%'); + } + } +} + + +/* copy the contents of a %{ ... %} into the definitions file. +The %{ has already been read. Return after reading the %}. 
*/ + +void +copy_definition () +{ + register int c; + register int match; + register int ended; + register int after_percent; /* -1 while reading a character if prev char was % */ + int cplus_comment; + + if (!nolinesflag) + fprintf(fattrs, "#line %d \"%s\"\n", lineno, infile); + + after_percent = 0; + + c = getc(finput); + + for (;;) + { + switch (c) + { + case '\n': + putc(c, fattrs); + lineno++; + break; + + case '%': + after_percent = -1; + break; + + case '\'': + case '"': + match = c; + putc(c, fattrs); + c = getc(finput); + + while (c != match) + { + if (c == EOF) + fatal("unterminated string at end of file"); + if (c == '\n') + { + warn("unterminated string"); + ungetc(c, finput); + c = match; + continue; + } + + putc(c, fattrs); + + if (c == '\\') + { + c = getc(finput); + if (c == EOF) + fatal("unterminated string at end of file"); + putc(c, fattrs); + if (c == '\n') + lineno++; + } + + c = getc(finput); + } + + putc(c, fattrs); + break; + + case '/': + putc(c, fattrs); + c = getc(finput); + if (c != '*' && c != '/') + continue; + + cplus_comment = (c == '/'); + putc(c, fattrs); + c = getc(finput); + + ended = 0; + while (!ended) + { + if (!cplus_comment && c == '*') + { + while (c == '*') + { + putc(c, fattrs); + c = getc(finput); + } + + if (c == '/') + { + putc(c, fattrs); + ended = 1; + } + } + else if (c == '\n') + { + lineno++; + putc(c, fattrs); + if (cplus_comment) + ended = 1; + else + c = getc(finput); + } + else if (c == EOF) + fatal("unterminated comment in `%{' definition"); + else + { + putc(c, fattrs); + c = getc(finput); + } + } + + break; + + case EOF: + fatal("unterminated `%{' definition"); + + default: + putc(c, fattrs); + } + + c = getc(finput); + + if (after_percent) + { + if (c == '}') + return; + putc('%', fattrs); + } + after_percent = 0; + + } + +} + + + +/* parse what comes after %token or %nterm. +For %token, what_is is STOKEN and what_is_not is SNTERM. +For %nterm, the arguments are reversed. 
*/ + +void +parse_token_decl (what_is, what_is_not) + int what_is, what_is_not; +{ + register int token = 0; + register char *typename = 0; + register struct bucket *symbol = NULL; /* pts to symbol being defined */ + int k; + + for (;;) + { + if(ungetc(skip_white_space(), finput) == '%') + return; + token = lex(); + if (token == COMMA) + { + symbol = NULL; + continue; + } + if (token == TYPENAME) + { + k = strlen(token_buffer); + typename = NEW2(k + 1, char); + strcpy(typename, token_buffer); + value_components_used = 1; + symbol = NULL; + } + else if (token == IDENTIFIER && *symval->tag == '\"' + && symbol) + { + translations = 1; + symval->class = STOKEN; + symval->type_name = typename; + symval->user_token_number = symbol->user_token_number; + symbol->user_token_number = SALIAS; + + symval->alias = symbol; + symbol->alias = symval; + symbol = NULL; + + nsyms--; /* symbol and symval combined are only one symbol */ + } + else if (token == IDENTIFIER) + { + int oldclass = symval->class; + symbol = symval; + + if (symbol->class == what_is_not) + warns("symbol %s redefined", symbol->tag); + symbol->class = what_is; + if (what_is == SNTERM && oldclass != SNTERM) + symbol->value = nvars++; + + if (typename) + { + if (symbol->type_name == NULL) + symbol->type_name = typename; + else if (strcmp(typename, symbol->type_name) != 0) + warns("type redeclaration for %s", symbol->tag); + } + } + else if (symbol && token == NUMBER) + { + symbol->user_token_number = numval; + translations = 1; + } + else + { + warnss("`%s' is invalid in %s", + token_buffer, + (what_is == STOKEN) ? "%token" : "%nterm"); + skip_to_char('%'); + } + } + +} + +/* parse what comes after %thong + the full syntax is + %thong <type> token number literal + the <type> or number may be omitted. The number specifies the + user_token_number. + + Two symbols are entered in the table, one for the token symbol and + one for the literal. Both are given the <type>, if any, from the declaration.
+ The ->user_token_number of the first is SALIAS and the ->user_token_number + of the second is set to the number, if any, from the declaration. + The two symbols are linked via pointers in their ->alias fields. + + during output_defines_table, the symbol is reported + thereafter, only the literal string is retained + it is the literal string that is output to yytname +*/ + +void +parse_thong_decl () +{ + register int token; + register struct bucket *symbol; + register char *typename = 0; + int k, usrtoknum; + + translations = 1; + token = lex(); /* fetch typename or first token */ + if (token == TYPENAME) { + k = strlen(token_buffer); + typename = NEW2(k + 1, char); + strcpy(typename, token_buffer); + value_components_used = 1; + token = lex(); /* fetch first token */ + } + + /* process first token */ + + if (token != IDENTIFIER) + { + warns("unrecognized item %s, expected an identifier", + token_buffer); + skip_to_char('%'); + return; + } + symval->class = STOKEN; + symval->type_name = typename; + symval->user_token_number = SALIAS; + symbol = symval; + + token = lex(); /* get number or literal string */ + + if (token == NUMBER) { + usrtoknum = numval; + token = lex(); /* okay, did number, now get literal */ + } + else usrtoknum = 0; + + /* process literal string token */ + + if (token != IDENTIFIER || *symval->tag != '\"') + { + warns("expected string constant instead of %s", + token_buffer); + skip_to_char('%'); + return; + } + symval->class = STOKEN; + symval->type_name = typename; + symval->user_token_number = usrtoknum; + + symval->alias = symbol; + symbol->alias = symval; + + nsyms--; /* symbol and symval combined are only one symbol */ +} + + +/* parse what comes after %start */ + +void +parse_start_decl () +{ + if (start_flag) + warn("multiple %start declarations"); + if (lex() != IDENTIFIER) + warn("invalid %start declaration"); + else + { + start_flag = 1; + startval = symval; + } +} + + + +/* read in a %type declaration and record its information for 
get_type_name to access */ + +void +parse_type_decl () +{ + register int k; + register char *name; + + if (lex() != TYPENAME) + { + warn("%type declaration has no <typename>"); + skip_to_char('%'); + return; + } + + k = strlen(token_buffer); + name = NEW2(k + 1, char); + strcpy(name, token_buffer); + + for (;;) + { + register int t; + + if(ungetc(skip_white_space(), finput) == '%') + return; + + t = lex(); + + switch (t) + { + + case COMMA: + case SEMICOLON: + break; + + case IDENTIFIER: + if (symval->type_name == NULL) + symval->type_name = name; + else if (strcmp(name, symval->type_name) != 0) + warns("type redeclaration for %s", symval->tag); + + break; + + default: + warns("invalid %%type declaration due to item: `%s'", token_buffer); + skip_to_char('%'); + } + } +} + + + +/* read in a %left, %right or %nonassoc declaration and record its information. */ +/* assoc is either LEFT_ASSOC, RIGHT_ASSOC or NON_ASSOC. */ + +void +parse_assoc_decl (assoc) +int assoc; +{ + register int k; + register char *name = NULL; + register int prev = 0; + + lastprec++; /* Assign a new precedence level, never 0.
*/ + + for (;;) + { + register int t; + + if(ungetc(skip_white_space(), finput) == '%') + return; + + t = lex(); + + switch (t) + { + + case TYPENAME: + k = strlen(token_buffer); + name = NEW2(k + 1, char); + strcpy(name, token_buffer); + break; + + case COMMA: + break; + + case IDENTIFIER: + if (symval->prec != 0) + warns("redefining precedence of %s", symval->tag); + symval->prec = lastprec; + symval->assoc = assoc; + if (symval->class == SNTERM) + warns("symbol %s redefined", symval->tag); + symval->class = STOKEN; + if (name) + { /* record the type, if one is specified */ + if (symval->type_name == NULL) + symval->type_name = name; + else if (strcmp(name, symval->type_name) != 0) + warns("type redeclaration for %s", symval->tag); + } + break; + + case NUMBER: + if (prev == IDENTIFIER) + { + symval->user_token_number = numval; + translations = 1; + } + else + { + warns("invalid text (%s) - number should be after identifier", + token_buffer); + skip_to_char('%'); + } + break; + + case SEMICOLON: + return; + + default: + warns("unexpected item: %s", token_buffer); + skip_to_char('%'); + } + + prev = t; + + } +} + + + +/* copy the union declaration into fattrs (and fdefines), + where it is made into the + definition of YYSTYPE, the type of elements of the parser value stack. 
*/ + +void +parse_union_decl() +{ + register int c; + register int count; + register int in_comment; + int cplus_comment; + + if (typed) + warn("multiple %union declarations"); + + typed = 1; + + if (!nolinesflag) + fprintf(fattrs, "\n#line %d \"%s\"\n", lineno, infile); + else + fprintf(fattrs, "\n"); + + fprintf(fattrs, "typedef union"); + if (fdefines) + fprintf(fdefines, "typedef union"); + + count = 0; + in_comment = 0; + + c = getc(finput); + + while (c != EOF) + { + putc(c, fattrs); + if (fdefines) + putc(c, fdefines); + + switch (c) + { + case '\n': + lineno++; + break; + + case '/': + c = getc(finput); + if (c != '*' && c != '/') + ungetc(c, finput); + else + { + putc(c, fattrs); + if (fdefines) + putc(c, fdefines); + cplus_comment = (c == '/'); + in_comment = 1; + c = getc(finput); + while (in_comment) + { + putc(c, fattrs); + if (fdefines) + putc(c, fdefines); + + if (c == '\n') + { + lineno++; + if (cplus_comment) + { + in_comment = 0; + break; + } + } + if (c == EOF) + fatal("unterminated comment at end of file"); + + if (!cplus_comment && c == '*') + { + c = getc(finput); + if (c == '/') + { + putc('/', fattrs); + if (fdefines) + putc('/', fdefines); + in_comment = 0; + } + } + else + c = getc(finput); + } + } + break; + + + case '{': + count++; + break; + + case '}': + if (count == 0) + warn ("unmatched close-brace (`}')"); + count--; + if (count <= 0) + { + fprintf(fattrs, " YYSTYPE;\n"); + if (fdefines) + fprintf(fdefines, " YYSTYPE;\n"); + /* JF don't choke on trailing semi */ + c=skip_white_space(); + if(c!=';') ungetc(c,finput); + return; + } + } + + c = getc(finput); + } +} + +/* parse the declaration %expect N which says to expect N + shift-reduce conflicts. 
*/ + +void +parse_expect_decl() +{ + register int c; + register int count; + char buffer[20]; + + c = getc(finput); + while (c == ' ' || c == '\t') + c = getc(finput); + + count = 0; + while (c >= '0' && c <= '9') + { + if (count < (int) sizeof (buffer) - 1) /* leave room for the terminating 0 */ + buffer[count++] = c; + c = getc(finput); + } + buffer[count] = 0; + + ungetc (c, finput); + + if (count <= 0 || count > 10) + warn("argument of %expect is not an integer"); + expected_conflicts = atoi (buffer); +} + +/* that's all of parsing the declaration section */ + +/* Get the data type (alternative in the union) of the value for symbol n in rule rule. */ + +char * +get_type_name(n, rule) +int n; +symbol_list *rule; +{ + static char *msg = "invalid $ value"; + + register int i; + register symbol_list *rp; + + if (n < 0) + { + warn(msg); + return NULL; + } + + rp = rule; + i = 0; + + while (i < n) + { + rp = rp->next; + if (rp == NULL || rp->sym == NULL) + { + warn(msg); + return NULL; + } + i++; + } + + return (rp->sym->type_name); +} + + +/* after %guard is seen in the input file, +copy the actual guard into the guards file. +If the guard is followed by an action, copy that into the actions file. +stack_offset is the number of values in the current rule so far, +which says where to find $0 with respect to the top of the stack, +for the simple parser in which the stack is not popped until after the guard is run. */ + +void +copy_guard(rule, stack_offset) +symbol_list *rule; +int stack_offset; +{ + register int c; + register int n; + register int count; + register int match; + register int ended; + register char *type_name; + int brace_flag = 0; + int cplus_comment; + + /* offset is always 0 if parser has already popped the stack pointer */ + if (semantic_parser) stack_offset = 0; + + fprintf(fguard, "\ncase %d:\n", nrules); + if (!nolinesflag) + fprintf(fguard, "#line %d \"%s\"\n", lineno, infile); + putc('{', fguard); + + count = 0; + c = getc(finput); + + while (brace_flag ?
(count > 0) : (c != ';')) + { + switch (c) + { + case '\n': + putc(c, fguard); + lineno++; + break; + + case '{': + putc(c, fguard); + brace_flag = 1; + count++; + break; + + case '}': + putc(c, fguard); + if (count > 0) + count--; + else + { + warn("unmatched right brace (`}')"); + c = getc(finput); /* skip it */ + } + break; + + case '\'': + case '"': + match = c; + putc(c, fguard); + c = getc(finput); + + while (c != match) + { + if (c == EOF) + fatal("unterminated string at end of file"); + if (c == '\n') + { + warn("unterminated string"); + ungetc(c, finput); + c = match; /* invent terminator */ + continue; + } + + putc(c, fguard); + + if (c == '\\') + { + c = getc(finput); + if (c == EOF) + fatal("unterminated string"); + putc(c, fguard); + if (c == '\n') + lineno++; + } + + c = getc(finput); + } + + putc(c, fguard); + break; + + case '/': + putc(c, fguard); + c = getc(finput); + if (c != '*' && c != '/') + continue; + + cplus_comment = (c == '/'); + putc(c, fguard); + c = getc(finput); + + ended = 0; + while (!ended) + { + if (!cplus_comment && c == '*') + { + while (c == '*') + { + putc(c, fguard); + c = getc(finput); + } + + if (c == '/') + { + putc(c, fguard); + ended = 1; + } + } + else if (c == '\n') + { + lineno++; + putc(c, fguard); + if (cplus_comment) + ended = 1; + else + c = getc(finput); + } + else if (c == EOF) + fatal("unterminated comment"); + else + { + putc(c, fguard); + c = getc(finput); + } + } + + break; + + case '$': + c = getc(finput); + type_name = NULL; + + if (c == '<') + { + register char *cp = token_buffer; + + while ((c = getc(finput)) != '>' && c > 0) + *cp++ = c; + *cp = 0; + type_name = token_buffer; + + c = getc(finput); + } + + if (c == '$') + { + fprintf(fguard, "yyval"); + if (!type_name) type_name = rule->sym->type_name; + if (type_name) + fprintf(fguard, ".%s", type_name); + if(!type_name && typed) + warns("$$ of `%s' has no declared type", rule->sym->tag); + } + + else if (isdigit(c) || c == '-') + { + ungetc (c, 
finput); + n = read_signed_integer(finput); + c = getc(finput); + + if (!type_name && n > 0) + type_name = get_type_name(n, rule); + + fprintf(fguard, "yyvsp[%d]", n - stack_offset); + if (type_name) + fprintf(fguard, ".%s", type_name); + if(!type_name && typed) + warnss("$%s of `%s' has no declared type", int_to_string(n), rule->sym->tag); + continue; + } + else + warni("$%s is invalid", printable_version(c)); + + break; + + case '@': + c = getc(finput); + if (isdigit(c) || c == '-') + { + ungetc (c, finput); + n = read_signed_integer(finput); + c = getc(finput); + } + else + { + warni("@%s is invalid", printable_version(c)); + n = 1; + } + + fprintf(fguard, "yylsp[%d]", n - stack_offset); + yylsp_needed = 1; + + continue; + + case EOF: + fatal("unterminated %%guard clause"); + + default: + putc(c, fguard); + } + + if (c != '}' || count != 0) + c = getc(finput); + } + + c = skip_white_space(); + + fprintf(fguard, ";\n break;}"); + if (c == '{') + copy_action(rule, stack_offset); + else if (c == '=') + { + c = getc(finput); /* why not skip_white_space -wjh */ + if (c == '{') + copy_action(rule, stack_offset); + } + else + ungetc(c, finput); +} + + + +/* Assuming that a { has just been seen, copy everything up to the matching } +into the actions file. +stack_offset is the number of values in the current rule so far, +which says where to find $0 with respect to the top of the stack. 
*/ + +void +copy_action(rule, stack_offset) +symbol_list *rule; +int stack_offset; +{ + register int c; + register int n; + register int count; + register int match; + register int ended; + register char *type_name; + int cplus_comment; + + /* offset is always 0 if parser has already popped the stack pointer */ + if (semantic_parser) stack_offset = 0; + + fprintf(faction, "\ncase %d:\n", nrules); + if (!nolinesflag) + fprintf(faction, "#line %d \"%s\"\n", lineno, infile); + putc('{', faction); + + count = 1; + c = getc(finput); + + while (count > 0) + { + while (c != '}') + { + switch (c) + { + case '\n': + putc(c, faction); + lineno++; + break; + + case '{': + putc(c, faction); + count++; + break; + + case '\'': + case '"': + match = c; + putc(c, faction); + c = getc(finput); + + while (c != match) + { + if (c == '\n') + { + warn("unterminated string"); + ungetc(c, finput); + c = match; + continue; + } + else if (c == EOF) + fatal("unterminated string at end of file"); + + putc(c, faction); + + if (c == '\\') + { + c = getc(finput); + if (c == EOF) + fatal("unterminated string"); + putc(c, faction); + if (c == '\n') + lineno++; + } + + c = getc(finput); + } + + putc(c, faction); + break; + + case '/': + putc(c, faction); + c = getc(finput); + if (c != '*' && c != '/') + continue; + + cplus_comment = (c == '/'); + putc(c, faction); + c = getc(finput); + + ended = 0; + while (!ended) + { + if (!cplus_comment && c == '*') + { + while (c == '*') + { + putc(c, faction); + c = getc(finput); + } + + if (c == '/') + { + putc(c, faction); + ended = 1; + } + } + else if (c == '\n') + { + lineno++; + putc(c, faction); + if (cplus_comment) + ended = 1; + else + c = getc(finput); + } + else if (c == EOF) + fatal("unterminated comment"); + else + { + putc(c, faction); + c = getc(finput); + } + } + + break; + + case '$': + c = getc(finput); + type_name = NULL; + + if (c == '<') + { + register char *cp = token_buffer; + + while ((c = getc(finput)) != '>' && c > 0) + *cp++ = c; + 
*cp = 0; + type_name = token_buffer; + value_components_used = 1; + + c = getc(finput); + } + if (c == '$') + { + fprintf(faction, "yyval"); + if (!type_name) type_name = get_type_name(0, rule); + if (type_name) + fprintf(faction, ".%s", type_name); + if(!type_name && typed) + warns("$$ of `%s' has no declared type", rule->sym->tag); + } + else if (isdigit(c) || c == '-') + { + ungetc (c, finput); + n = read_signed_integer(finput); + c = getc(finput); + + if (!type_name && n > 0) + type_name = get_type_name(n, rule); + + fprintf(faction, "yyvsp[%d]", n - stack_offset); + if (type_name) + fprintf(faction, ".%s", type_name); + if(!type_name && typed) + warnss("$%s of `%s' has no declared type", + int_to_string(n), rule->sym->tag); + continue; + } + else + warni("$%s is invalid", printable_version(c)); + + break; + + case '@': + c = getc(finput); + if (isdigit(c) || c == '-') + { + ungetc (c, finput); + n = read_signed_integer(finput); + c = getc(finput); + } + else + { + warn("invalid @-construct"); + n = 1; + } + + fprintf(faction, "yylsp[%d]", n - stack_offset); + yylsp_needed = 1; + + continue; + + case EOF: + fatal("unmatched `{'"); + + default: + putc(c, faction); + } + + c = getc(finput); + } + + /* above loop exits when c is '}' */ + + if (--count) + { + putc(c, faction); + c = getc(finput); + } + } + + fprintf(faction, ";\n break;}"); +} + + + +/* generate a dummy symbol, a nonterminal, +whose name cannot conflict with the user's names. */ + +bucket * +gensym() +{ + register bucket *sym; + + sprintf (token_buffer, "@%d", ++gensym_count); + sym = getsym(token_buffer); + sym->class = SNTERM; + sym->value = nvars++; + return (sym); +} + +/* Parse the input grammar into a one symbol_list structure. +Each rule is represented by a sequence of symbols: the left hand side +followed by the contents of the right hand side, followed by a null pointer +instead of a symbol to terminate the rule. +The next symbol is the lhs of the following rule. 
+ +All guards and actions are copied out to the appropriate files, +labelled by the rule number they apply to. */ + +void +readgram() +{ + register int t; + register bucket *lhs; + register symbol_list *p; + register symbol_list *p1; + register bucket *bp; + + symbol_list *crule; /* points to first symbol_list of current rule. */ + /* its symbol is the lhs of the rule. */ + symbol_list *crule1; /* points to the symbol_list preceding crule. */ + + p1 = NULL; + + t = lex(); + + while (t != TWO_PERCENTS && t != ENDFILE) + { + if (t == IDENTIFIER || t == BAR) + { + register int actionflag = 0; + int rulelength = 0; /* number of symbols in rhs of this rule so far */ + int xactions = 0; /* JF for error checking */ + bucket *first_rhs = 0; + + if (t == IDENTIFIER) + { + lhs = symval; + + if (!start_flag) + { + startval = lhs; + start_flag = 1; + } + + t = lex(); + if (t != COLON) + { + warn("ill-formed rule: initial symbol not followed by colon"); + unlex(t); + } + } + + if (nrules == 0 && t == BAR) + { + warn("grammar starts with vertical bar"); + lhs = symval; /* BOGUS: use a random symval */ + } + /* start a new rule and record its lhs. */ + + nrules++; + nitems++; + + record_rule_line (); + + p = NEW(symbol_list); + p->sym = lhs; + + crule1 = p1; + if (p1) + p1->next = p; + else + grammar = p; + + p1 = p; + crule = p; + + /* mark the rule's lhs as a nonterminal if not already so. */ + + if (lhs->class == SUNKNOWN) + { + lhs->class = SNTERM; + lhs->value = nvars; + nvars++; + } + else if (lhs->class == STOKEN) + warns("rule given for %s, which is a token", lhs->tag); + + /* read the rhs of the rule. */ + + for (;;) + { + t = lex(); + if (t == PREC) + { + t = lex(); + crule->ruleprec = symval; + t = lex(); + } + + if (! (t == IDENTIFIER || t == LEFT_CURLY)) break; + + /* If next token is an identifier, see if a colon follows it. + If one does, exit this rule now. 
*/ + if (t == IDENTIFIER) + { + register bucket *ssave; + register int t1; + + ssave = symval; + t1 = lex(); + unlex(t1); + symval = ssave; + if (t1 == COLON) break; + + if(!first_rhs) /* JF */ + first_rhs = symval; + /* Not followed by colon => + process as part of this rule's rhs. */ + } + + /* If we just passed an action, that action was in the middle + of a rule, so make a dummy rule to reduce it to a + non-terminal. */ + if (actionflag) + { + register bucket *sdummy; + + /* Since the action was written out with this rule's */ + /* number, we must give the new rule this number */ + /* by inserting the new rule before it. */ + + /* Make a dummy nonterminal, a gensym. */ + sdummy = gensym(); + + /* Make a new rule, whose body is empty, + before the current one, so that the action + just read can belong to it. */ + nrules++; + nitems++; + record_rule_line (); + p = NEW(symbol_list); + if (crule1) + crule1->next = p; + else grammar = p; + p->sym = sdummy; + crule1 = NEW(symbol_list); + p->next = crule1; + crule1->next = crule; + + /* insert the dummy generated by that rule into this rule. */ + nitems++; + p = NEW(symbol_list); + p->sym = sdummy; + p1->next = p; + p1 = p; + + actionflag = 0; + } + + if (t == IDENTIFIER) + { + nitems++; + p = NEW(symbol_list); + p->sym = symval; + p1->next = p; + p1 = p; + } + else /* handle an action. */ + { + copy_action(crule, rulelength); + actionflag = 1; + xactions++; /* JF */ + } + rulelength++; + } /* end of read rhs of rule */ + + /* Put an empty link in the list to mark the end of this rule */ + p = NEW(symbol_list); + p1->next = p; + p1 = p; + + if (t == PREC) + { + warn("two @prec's in a row"); + t = lex(); + crule->ruleprec = symval; + t = lex(); + } + if (t == GUARD) + { + if (! 
semantic_parser) + warn("%%guard present but %%semantic_parser not specified"); + + copy_guard(crule, rulelength); + t = lex(); + } + else if (t == LEFT_CURLY) + { + /* This case never occurs -wjh */ + if (actionflag) warn("two actions at end of one rule"); + copy_action(crule, rulelength); + actionflag = 1; + xactions++; /* -wjh */ + t = lex(); + } + /* If $$ is being set in default way, + warn if any type mismatch. */ + else if (!xactions && first_rhs && lhs->type_name != first_rhs->type_name) + { + if (lhs->type_name == 0 || first_rhs->type_name == 0 + || strcmp(lhs->type_name,first_rhs->type_name)) + warnss("type clash (`%s' `%s') on default action", + lhs->type_name ? lhs->type_name : "", + first_rhs->type_name ? first_rhs->type_name : ""); + } + /* Warn if there is no default for $$ but we need one. */ + else if (!xactions && !first_rhs && lhs->type_name != 0) + warn("empty rule for typed nonterminal, and no action"); + if (t == SEMICOLON) + t = lex(); + } +#if 0 + /* these things can appear as alternatives to rules. */ +/* NO, they cannot. + a) none of the documentation allows them + b) most of them scan forward until finding a next % + thus they may swallow lots of intervening rules +*/ + else if (t == TOKEN) + { + parse_token_decl(STOKEN, SNTERM); + t = lex(); + } + else if (t == NTERM) + { + parse_token_decl(SNTERM, STOKEN); + t = lex(); + } + else if (t == TYPE) + { + t = get_type(); + } + else if (t == UNION) + { + parse_union_decl(); + t = lex(); + } + else if (t == EXPECT) + { + parse_expect_decl(); + t = lex(); + } + else if (t == START) + { + parse_start_decl(); + t = lex(); + } +#endif + + else + { + warns("invalid input: %s", token_buffer); + t = lex(); + } + } + + /* grammar has been read. 
Do some checking */ + + if (nsyms > MAXSHORT) + fatals("too many symbols (tokens plus nonterminals); maximum %s", + int_to_string(MAXSHORT)); + if (nrules == 0) + fatal("no rules in the input grammar"); + + if (typed == 0 /* JF put out same default YYSTYPE as YACC does */ + && !value_components_used) + { + /* We used to use `unsigned long' as YYSTYPE on MSDOS, + but it seems better to be consistent. + Most programs should declare their own type anyway. */ + fprintf(fattrs, "#ifndef YYSTYPE\n#define YYSTYPE int\n#endif\n"); + if (fdefines) + fprintf(fdefines, "#ifndef YYSTYPE\n#define YYSTYPE int\n#endif\n"); + } + + /* Report any undefined symbols and consider them nonterminals. */ + + for (bp = firstsymbol; bp; bp = bp->next) + if (bp->class == SUNKNOWN) + { + warns("symbol %s is used, but is not defined as a token and has no rules", + bp->tag); + bp->class = SNTERM; + bp->value = nvars++; + } + + ntokens = nsyms - nvars; +} + + +void +record_rule_line () +{ + /* Record each rule's source line number in rline table. */ + + if (nrules >= rline_allocated) + { + rline_allocated = nrules * 2; + rline = (short *) xrealloc (rline, + rline_allocated * sizeof (short)); + } + rline[nrules] = lineno; +} + + +/* read in a %type declaration and record its information for get_type_name to access */ +/* this is unused. 
it is only called from the #if 0 part of readgram */ +static int +get_type() +{ + register int k; + register int t; + register char *name; + + t = lex(); + + if (t != TYPENAME) + { + warn("ill-formed %type declaration"); + return t; + } + + k = strlen(token_buffer); + name = NEW2(k + 1, char); + strcpy(name, token_buffer); + + for (;;) + { + t = lex(); + + switch (t) + { + case SEMICOLON: + return (lex()); + + case COMMA: + break; + + case IDENTIFIER: + if (symval->type_name == NULL) + symval->type_name = name; + else if (strcmp(name, symval->type_name) != 0) + warns("type redeclaration for %s", symval->tag); + + break; + + default: + return (t); + } + } +} + + + +/* assign symbol numbers, and write definition of token names into fdefines. +Set up vectors tags and sprec of names and precedences of symbols. */ + +void +packsymbols() +{ + register bucket *bp; + register int tokno = 1; + register int i; + register int last_user_token_number; + + /* int lossage = 0; JF set but not used */ + + tags = NEW2(nsyms + 1, char *); + tags[0] = "$"; + user_toknums = NEW2(nsyms + 1, int); + user_toknums[0] = 0; + + sprec = NEW2(nsyms, short); + sassoc = NEW2(nsyms, short); + + max_user_token_number = 256; + last_user_token_number = 256; + + for (bp = firstsymbol; bp; bp = bp->next) + { + if (bp->class == SNTERM) + { + bp->value += ntokens; + } + else if (bp->alias) + { + /* this symbol and its alias are a single token defn. 
+ allocate a tokno, and assign to both + check agreement of ->prec and ->assoc fields + and make both the same + */ + if (bp->value == 0) + bp->value = bp->alias->value = tokno++; + + if (bp->prec != bp->alias->prec) { + if (bp->prec != 0 && bp->alias->prec != 0 + && bp->user_token_number == SALIAS) + warnss("conflicting precedences for %s and %s", + bp->tag, bp->alias->tag); + if (bp->prec != 0) bp->alias->prec = bp->prec; + else bp->prec = bp->alias->prec; + } + + if (bp->assoc != bp->alias->assoc) { + if (bp->assoc != 0 && bp->alias->assoc != 0 + && bp->user_token_number == SALIAS) + warnss("conflicting assoc values for %s and %s", + bp->tag, bp->alias->tag); + if (bp->assoc != 0) bp->alias->assoc = bp->assoc; + else bp->assoc = bp->alias->assoc; + } + + if (bp->user_token_number == SALIAS) + continue; /* do not do processing below for SALIASs */ + + } + else /* bp->class == STOKEN */ + { + bp->value = tokno++; + } + + if (bp->class == STOKEN) + { + if (translations && !(bp->user_token_number)) + bp->user_token_number = ++last_user_token_number; + if (bp->user_token_number > max_user_token_number) + max_user_token_number = bp->user_token_number; + } + + tags[bp->value] = bp->tag; + user_toknums[bp->value] = bp->user_token_number; + sprec[bp->value] = bp->prec; + sassoc[bp->value] = bp->assoc; + + } + + if (translations) + { + register int i; + + token_translations = NEW2(max_user_token_number+1, short); + + /* initialize all entries for literal tokens to 2, + the internal token number for $undefined., + which represents all invalid inputs. 
*/ + for (i = 0; i <= max_user_token_number; i++) + token_translations[i] = 2; + + for (bp = firstsymbol; bp; bp = bp->next) + { + if (bp->value >= ntokens) continue; /* non-terminal */ + if (bp->user_token_number == SALIAS) continue; + if (token_translations[bp->user_token_number] != 2) + warnsss("tokens %s and %s both assigned number %s", + tags[token_translations[bp->user_token_number]], + bp->tag, + int_to_string(bp->user_token_number)); + token_translations[bp->user_token_number] = bp->value; + } + } + + error_token_number = errtoken->value; + + if (! noparserflag) + output_token_defines(ftable); + + if (startval->class == SUNKNOWN) + fatals("the start symbol %s is undefined", startval->tag); + else if (startval->class == STOKEN) + fatals("the start symbol %s is a token", startval->tag); + + start_symbol = startval->value; + + if (definesflag) + { + output_token_defines(fdefines); + + if (!pure_parser) + { + if (spec_name_prefix) + fprintf(fdefines, "\nextern YYSTYPE %slval;\n", spec_name_prefix); + else + fprintf(fdefines, "\nextern YYSTYPE yylval;\n"); + } + + if (semantic_parser) + for (i = ntokens; i < nsyms; i++) + { + /* don't make these for dummy nonterminals made by gensym. */ + if (*tags[i] != '@') + fprintf(fdefines, "#define\tNT%s\t%d\n", tags[i], i); + } +#if 0 + /* `fdefines' is now a temporary file, so we need to copy its + contents in `done', so we can't close it here. */ + fclose(fdefines); + fdefines = NULL; +#endif + } +} + +/* For named tokens, but not literal ones, define the name. + The value is the user token number. 
+*/ +void +output_token_defines(file) +FILE *file; +{ + bucket *bp; + register char *cp, *symbol; + register char c; + + for (bp = firstsymbol; bp; bp = bp->next) + { + symbol = bp->tag; /* get symbol */ + + if (bp->value >= ntokens) continue; + if (bp->user_token_number == SALIAS) continue; + if ('\'' == *symbol) continue; /* skip literal character */ + if (bp == errtoken) continue; /* skip error token */ + if ('\"' == *symbol) + { + /* use literal string only if given a symbol with an alias */ + if (bp->alias) + symbol = bp->alias->tag; + else + continue; + } + + /* Don't #define nonliteral tokens whose names contain periods. */ + cp = symbol; + while ((c = *cp++) && c != '.'); + if (c != '\0') continue; + + fprintf(file, "#define\t%s\t%d\n", symbol, + ((translations && ! rawtoknumflag) + ? bp->user_token_number + : bp->value)); + if (semantic_parser) + fprintf(file, "#define\tT%s\t%d\n", symbol, bp->value); + } + + putc('\n', file); +} + + + +/* convert the rules into the representation using rrhs, rlhs and ritems. */ + +void +packgram() +{ + register int itemno; + register int ruleno; + register symbol_list *p; +/* register bucket *bp; JF unused */ + + bucket *ruleprec; + + ritem = NEW2(nitems + 1, short); + rlhs = NEW2(nrules, short) - 1; + rrhs = NEW2(nrules, short) - 1; + rprec = NEW2(nrules, short) - 1; + rprecsym = NEW2(nrules, short) - 1; + rassoc = NEW2(nrules, short) - 1; + + itemno = 0; + ruleno = 1; + + p = grammar; + while (p) + { + rlhs[ruleno] = p->sym->value; + rrhs[ruleno] = itemno; + ruleprec = p->ruleprec; + + p = p->next; + while (p && p->sym) + { + ritem[itemno++] = p->sym->value; + /* A rule gets by default the precedence and associativity + of the last token in it. */ + if (p->sym->class == STOKEN) + { + rprec[ruleno] = p->sym->prec; + rassoc[ruleno] = p->sym->assoc; + } + if (p) p = p->next; + } + + /* If this rule has a %prec, + the specified symbol's precedence replaces the default. 
*/ + if (ruleprec) + { + rprec[ruleno] = ruleprec->prec; + rassoc[ruleno] = ruleprec->assoc; + rprecsym[ruleno] = ruleprec->value; + } + + ritem[itemno++] = -ruleno; + ruleno++; + + if (p) p = p->next; + } + + ritem[itemno] = 0; +} + +/* Read a signed integer from STREAM and return its value. */ + +int +read_signed_integer (stream) + FILE *stream; +{ + register int c = getc(stream); + register int sign = 1; + register int n; + + if (c == '-') + { + c = getc(stream); + sign = -1; + } + n = 0; + while (isdigit(c)) + { + n = 10*n + (c - '0'); + c = getc(stream); + } + + ungetc(c, stream); + + return n * sign; +} diff --git a/contrib/bison/reduce.c b/contrib/bison/reduce.c new file mode 100644 index 000000000000..13e62ca357b8 --- /dev/null +++ b/contrib/bison/reduce.c @@ -0,0 +1,598 @@ +/* Grammar reduction for Bison. + Copyright (C) 1988, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +/* + * Reduce the grammar: Find and eliminate unreachable terminals, + * nonterminals, and productions. David S. Bakin. + */ + +/* + * Don't eliminate unreachable terminals: They may be used by the user's + * parser. 
+ */ + +#include +#include "system.h" +#include "files.h" +#include "gram.h" +#include "machine.h" +#include "new.h" + + +extern char **tags; /* reader.c */ +extern int verboseflag; /* getargs.c */ +static int statisticsflag; /* XXXXXXX */ + +#ifndef TRUE +#define TRUE (1) +#define FALSE (0) +#endif +typedef int bool; +typedef unsigned *BSet; +typedef short *rule; + + +/* + * N is set of all nonterminals which are not useless. P is set of all rules + * which have no useless nonterminals in their RHS. V is the set of all + * accessible symbols. + */ + +static BSet N, P, V, V1; + +static int nuseful_productions, nuseless_productions, + nuseful_nonterminals, nuseless_nonterminals; + + +static void useless_nonterminals(); +static void inaccessable_symbols(); +static void reduce_grammar_tables(); +static void print_results(); +static void print_notices(); +void dump_grammar(); + +extern void fatals (); + + +bool +bits_equal (L, R, n) +BSet L; +BSet R; +int n; +{ + int i; + + for (i = n - 1; i >= 0; i--) + if (L[i] != R[i]) + return FALSE; + return TRUE; +} + + +int +nbits (i) +unsigned i; +{ + int count = 0; + + while (i != 0) { + i ^= (i & -i); + ++count; + } + return count; +} + + +int +bits_size (S, n) +BSet S; +int n; +{ + int i, count = 0; + + for (i = n - 1; i >= 0; i--) + count += nbits(S[i]); + return count; +} + +void +reduce_grammar () +{ + bool reduced; + + /* Allocate the global sets used to compute the reduced grammar */ + + N = NEW2(WORDSIZE(nvars), unsigned); + P = NEW2(WORDSIZE(nrules + 1), unsigned); + V = NEW2(WORDSIZE(nsyms), unsigned); + V1 = NEW2(WORDSIZE(nsyms), unsigned); + + useless_nonterminals(); + inaccessable_symbols(); + + reduced = (bool) (nuseless_nonterminals + nuseless_productions > 0); + + if (verboseflag) + print_results(); + + if (reduced == FALSE) + goto done_reducing; + + print_notices(); + + if (!BITISSET(N, start_symbol - ntokens)) + fatals("Start symbol %s does not derive any sentence", + tags[start_symbol]); + + 
reduce_grammar_tables(); + /* if (verboseflag) { + fprintf(foutput, "REDUCED GRAMMAR\n\n"); + dump_grammar(); + } + */ + + /**/ statisticsflag = FALSE; /* someday getopts should handle this */ + if (statisticsflag == TRUE) + fprintf(stderr, + "reduced %s defines %d terminal%s, %d nonterminal%s\ +, and %d production%s.\n", infile, + ntokens, (ntokens == 1 ? "" : "s"), + nvars, (nvars == 1 ? "" : "s"), + nrules, (nrules == 1 ? "" : "s")); + + done_reducing: + + /* Free the global sets used to compute the reduced grammar */ + + FREE(N); + FREE(V); + FREE(P); + +} + +/* + * Another way to do this would be with a set for each production and then do + * subset tests against N, but even for the C grammar the whole reducing + * process takes only 2 seconds on my 8Mhz AT. + */ + +static bool +useful_production (i, N) +int i; +BSet N; +{ + rule r; + short n; + + /* + * A production is useful if all of the nonterminals in its RHS + * appear in the set of useful nonterminals. + */ + + for (r = &ritem[rrhs[i]]; *r > 0; r++) + if (ISVAR(n = *r)) + if (!BITISSET(N, n - ntokens)) + return FALSE; + return TRUE; +} + + +/* Remember that rules are 1-origin, symbols are 0-origin. */ + +static void +useless_nonterminals () +{ + BSet Np, Ns; + int i, n; + + /* + * N is set as built. Np is set being built this iteration. P is set + * of all productions which have a RHS all in N. + */ + + Np = NEW2(WORDSIZE(nvars), unsigned); + + /* + * The set being computed is a set of nonterminals which can derive + * the empty string or strings consisting of all terminals. At each + * iteration a nonterminal is added to the set if there is a + * production with that nonterminal as its LHS for which all the + * nonterminals in its RHS are already in the set. Iterate until the + * set being computed remains unchanged. Any nonterminals not in the + * set at that point are useless in that they will never be used in + * deriving a sentence of the language. 
+ * + * This iteration doesn't use any special traversal over the + * productions. A set is kept of all productions for which all the + * nonterminals in the RHS are in useful. Only productions not in + * this set are scanned on each iteration. At the end, this set is + * saved to be used when finding useful productions: only productions + * in this set will appear in the final grammar. + */ + + n = 0; + while (1) + { + for (i = WORDSIZE(nvars) - 1; i >= 0; i--) + Np[i] = N[i]; + for (i = 1; i <= nrules; i++) + { + if (!BITISSET(P, i)) + { + if (useful_production(i, N)) + { + SETBIT(Np, rlhs[i] - ntokens); + SETBIT(P, i); + } + } + } + if (bits_equal(N, Np, WORDSIZE(nvars))) + break; + Ns = Np; + Np = N; + N = Ns; + } + FREE(N); + N = Np; +} + +static void +inaccessable_symbols () +{ + BSet Vp, Vs, Pp; + int i, n; + short t; + rule r; + + /* + * Find out which productions are reachable and which symbols are + * used. Starting with an empty set of productions and a set of + * symbols which only has the start symbol in it, iterate over all + * productions until the set of productions remains unchanged for an + * iteration. For each production which has a LHS in the set of + * reachable symbols, add the production to the set of reachable + * productions, and add all of the nonterminals in the RHS of the + * production to the set of reachable symbols. + * + * Consider only the (partially) reduced grammar which has only + * nonterminals in N and productions in P. + * + * The result is the set P of productions in the reduced grammar, and + * the set V of symbols in the reduced grammar. + * + * Although this algorithm also computes the set of terminals which are + * reachable, no terminal will be deleted from the grammar. Some + * terminals might not be in the grammar but might be generated by + * semantic routines, and so the user might want them available with + * specified numbers. (Is this true?) 
However, the nonreachable + * terminals are printed (if running in verbose mode) so that the user + * can know. + */ + + Vp = NEW2(WORDSIZE(nsyms), unsigned); + Pp = NEW2(WORDSIZE(nrules + 1), unsigned); + + /* If the start symbol isn't useful, then nothing will be useful. */ + if (!BITISSET(N, start_symbol - ntokens)) + goto end_iteration; + + SETBIT(V, start_symbol); + + n = 0; + while (1) + { + for (i = WORDSIZE(nsyms) - 1; i >= 0; i--) + Vp[i] = V[i]; + for (i = 1; i <= nrules; i++) + { + if (!BITISSET(Pp, i) && BITISSET(P, i) && + BITISSET(V, rlhs[i])) + { + for (r = &ritem[rrhs[i]]; *r >= 0; r++) + { + if (ISTOKEN(t = *r) + || BITISSET(N, t - ntokens)) + { + SETBIT(Vp, t); + } + } + SETBIT(Pp, i); + } + } + if (bits_equal(V, Vp, WORDSIZE(nsyms))) + { + break; + } + Vs = Vp; + Vp = V; + V = Vs; + } + end_iteration: + + FREE(V); + V = Vp; + + /* Tokens 0, 1, and 2 are internal to Bison. Consider them useful. */ + SETBIT(V, 0); /* end-of-input token */ + SETBIT(V, 1); /* error token */ + SETBIT(V, 2); /* some undefined token */ + + FREE(P); + P = Pp; + + nuseful_productions = bits_size(P, WORDSIZE(nrules + 1)); + nuseless_productions = nrules - nuseful_productions; + + nuseful_nonterminals = 0; + for (i = ntokens; i < nsyms; i++) + if (BITISSET(V, i)) + nuseful_nonterminals++; + nuseless_nonterminals = nvars - nuseful_nonterminals; + + /* A token that was used in %prec should not be warned about. */ + for (i = 1; i < nrules; i++) + if (rprecsym[i] != 0) + SETBIT(V1, rprecsym[i]); +} + +static void +reduce_grammar_tables () +{ +/* This is turned off because we would need to change the numbers + in the case statements in the actions file. 
*/ +#if 0 + /* remove useless productions */ + if (nuseless_productions > 0) + { + short np, pn, ni, pi; + + np = 0; + ni = 0; + for (pn = 1; pn <= nrules; pn++) + { + if (BITISSET(P, pn)) + { + np++; + if (pn != np) + { + rlhs[np] = rlhs[pn]; + rline[np] = rline[pn]; + rprec[np] = rprec[pn]; + rassoc[np] = rassoc[pn]; + rrhs[np] = rrhs[pn]; + if (rrhs[np] != ni) + { + pi = rrhs[np]; + rrhs[np] = ni; + while (ritem[pi] >= 0) + ritem[ni++] = ritem[pi++]; + ritem[ni++] = -np; + } + } else { + while (ritem[ni++] >= 0); + } + } + } + ritem[ni] = 0; + nrules -= nuseless_productions; + nitems = ni; + + /* + * Is it worth it to reduce the amount of memory for the + * grammar? Probably not. + */ + + } +#endif /* 0 */ + /* Disable useless productions, + since they may contain useless nonterms + that would get mapped below to -1 and confuse everyone. */ + if (nuseless_productions > 0) + { + int pn; + + for (pn = 1; pn <= nrules; pn++) + { + if (!BITISSET(P, pn)) + { + rlhs[pn] = -1; + } + } + } + + /* remove useless symbols */ + if (nuseless_nonterminals > 0) + { + + int i, n; +/* short j; JF unused */ + short *nontermmap; + rule r; + + /* + * create a map of nonterminal number to new nonterminal + * number. -1 in the map means it was useless and is being + * eliminated. + */ + + nontermmap = NEW2(nvars, short) - ntokens; + for (i = ntokens; i < nsyms; i++) + nontermmap[i] = -1; + + n = ntokens; + for (i = ntokens; i < nsyms; i++) + if (BITISSET(V, i)) + nontermmap[i] = n++; + + /* Shuffle elements of tables indexed by symbol number. */ + + for (i = ntokens; i < nsyms; i++) + { + n = nontermmap[i]; + if (n >= 0) + { + sassoc[n] = sassoc[i]; + sprec[n] = sprec[i]; + tags[n] = tags[i]; + } else { + free(tags[i]); + } + } + + /* Replace all symbol numbers in valid data structures. */ + + for (i = 1; i <= nrules; i++) + { + /* Ignore the rules disabled above. */ + if (rlhs[i] >= 0) + rlhs[i] = nontermmap[rlhs[i]]; + if (ISVAR (rprecsym[i])) + /* Can this happen? 
*/ + rprecsym[i] = nontermmap[rprecsym[i]]; + } + + for (r = ritem; *r; r++) + if (ISVAR(*r)) + *r = nontermmap[*r]; + + start_symbol = nontermmap[start_symbol]; + + nsyms -= nuseless_nonterminals; + nvars -= nuseless_nonterminals; + + free(&nontermmap[ntokens]); + } +} + +static void +print_results () +{ + int i; +/* short j; JF unused */ + rule r; + bool b; + + if (nuseless_nonterminals > 0) + { + fprintf(foutput, "Useless nonterminals:\n\n"); + for (i = ntokens; i < nsyms; i++) + if (!BITISSET(V, i)) + fprintf(foutput, " %s\n", tags[i]); + } + b = FALSE; + for (i = 0; i < ntokens; i++) + { + if (!BITISSET(V, i) && !BITISSET(V1, i)) + { + if (!b) + { + fprintf(foutput, "\n\nTerminals which are not used:\n\n"); + b = TRUE; + } + fprintf(foutput, " %s\n", tags[i]); + } + } + + if (nuseless_productions > 0) + { + fprintf(foutput, "\n\nUseless rules:\n\n"); + for (i = 1; i <= nrules; i++) + { + if (!BITISSET(P, i)) + { + fprintf(foutput, "#%-4d ", i); + fprintf(foutput, "%s :\t", tags[rlhs[i]]); + for (r = &ritem[rrhs[i]]; *r >= 0; r++) + { + fprintf(foutput, " %s", tags[*r]); + } + fprintf(foutput, ";\n"); + } + } + } + if (nuseless_nonterminals > 0 || nuseless_productions > 0 || b) + fprintf(foutput, "\n\n"); +} + +void +dump_grammar () +{ + int i; + rule r; + + fprintf(foutput, + "ntokens = %d, nvars = %d, nsyms = %d, nrules = %d, nitems = %d\n\n", + ntokens, nvars, nsyms, nrules, nitems); + fprintf(foutput, "Variables\n---------\n\n"); + fprintf(foutput, "Value Sprec Sassoc Tag\n"); + for (i = ntokens; i < nsyms; i++) + fprintf(foutput, "%5d %5d %5d %s\n", + i, sprec[i], sassoc[i], tags[i]); + fprintf(foutput, "\n\n"); + fprintf(foutput, "Rules\n-----\n\n"); + for (i = 1; i <= nrules; i++) + { + fprintf(foutput, "%-5d(%5d%5d)%5d : (@%-5d)", + i, rprec[i], rassoc[i], rlhs[i], rrhs[i]); + for (r = &ritem[rrhs[i]]; *r > 0; r++) + fprintf(foutput, "%5d", *r); + fprintf(foutput, " [%d]\n", -(*r)); + } + fprintf(foutput, "\n\n"); + fprintf(foutput, "Rules 
interpreted\n-----------------\n\n"); + for (i = 1; i <= nrules; i++) + { + fprintf(foutput, "%-5d %s :", i, tags[rlhs[i]]); + for (r = &ritem[rrhs[i]]; *r > 0; r++) + fprintf(foutput, " %s", tags[*r]); + fprintf(foutput, "\n"); + } + fprintf(foutput, "\n\n"); +} + + +static void +print_notices () +{ + extern int fixed_outfiles; + + if (fixed_outfiles && nuseless_productions) + fprintf(stderr, "%d rules never reduced\n", nuseless_productions); + + fprintf(stderr, "%s contains ", infile); + + if (nuseless_nonterminals > 0) + { + fprintf(stderr, "%d useless nonterminal%s", + nuseless_nonterminals, + (nuseless_nonterminals == 1 ? "" : "s")); + } + if (nuseless_nonterminals > 0 && nuseless_productions > 0) + fprintf(stderr, " and "); + + if (nuseless_productions > 0) + { + fprintf(stderr, "%d useless rule%s", + nuseless_productions, + (nuseless_productions == 1 ? "" : "s")); + } + fprintf(stderr, "\n"); + fflush(stderr); +} diff --git a/contrib/bison/state.h b/contrib/bison/state.h new file mode 100644 index 000000000000..53f9d094bcbe --- /dev/null +++ b/contrib/bison/state.h @@ -0,0 +1,137 @@ +/* Type definitions for nondeterministic finite state machine for bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + + +/* These type definitions are used to represent a nondeterministic + finite state machine that parses the specified grammar. + This information is generated by the function generate_states + in the file LR0. + +Each state of the machine is described by a set of items -- +particular positions in particular rules -- that are the possible +places where parsing could continue when the machine is in this state. +These symbols at these items are the allowable inputs that can follow now. + +A core represents one state. States are numbered in the number field. +When generate_states is finished, the starting state is state 0 +and nstates is the number of states. (A transition to a state +whose state number is nstates indicates termination.) All the cores +are chained together and first_state points to the first one (state 0). + +For each state there is a particular symbol which must have been the +last thing accepted to reach that state. It is the accessing_symbol +of the core. + +Each core contains a vector of nitems items which are the indices +in the ritems vector of the items that are selected in this state. + +The link field is used for chaining buckets that hash states by +their itemsets. This is for recognizing equivalent states and +combining them when the states are generated. + +The two types of transitions are shifts (push the lookahead token +and read another) and reductions (combine the last n things on the +stack via a rule, replace them with the symbol that the rule derives, +and leave the lookahead token alone). When the states are generated, +these transitions are represented in two other lists. + +Each shifts structure describes the possible shift transitions out +of one state, the state whose number is in the number field. +The shifts structures are linked through next and first_shift points to them. +Each contains a vector of numbers of the states that shift transitions +can go to. 
The accessing_symbol fields of those states' cores say what kind +of input leads to them. + +A shift to state zero should be ignored. Conflict resolution +deletes shifts by changing them to zero. + +Each reductions structure describes the possible reductions at the state +whose number is in the number field. The data is a list of nreds rules, +represented by their rule numbers. first_reduction points to the list +of these structures. + +Conflict resolution can decide that certain tokens in certain +states should explicitly be errors (for implementing %nonassoc). +For each state, the tokens that are errors for this reason +are recorded in an errs structure, which has the state number +in its number field. The rest of the errs structure is full +of token numbers. + +There is at least one shift transition present in state zero. +It leads to a next-to-final state whose accessing_symbol is +the grammar's start symbol. The next-to-final state has one shift +to the final state, whose accessing_symbol is zero (end of input). +The final state has one shift, which goes to the termination state +(whose number is nstates-1). +The reason for the extra state at the end is to placate the parser's +strategy of making all decisions one token ahead of its actions. 
*/ + + +typedef + struct core + { + struct core *next; + struct core *link; + short number; + short accessing_symbol; + short nitems; + short items[1]; + } + core; + + + +typedef + struct shifts + { + struct shifts *next; + short number; + short nshifts; + short shifts[1]; + } + shifts; + + + +typedef + struct errs + { + short nerrs; + short errs[1]; + } + errs; + + + +typedef + struct reductions + { + struct reductions *next; + short number; + short nreds; + short rules[1]; + } + reductions; + + + +extern int nstates; +extern core *first_state; +extern shifts *first_shift; +extern reductions *first_reduction; diff --git a/contrib/bison/symtab.c b/contrib/bison/symtab.c new file mode 100644 index 000000000000..adfe39011cfc --- /dev/null +++ b/contrib/bison/symtab.c @@ -0,0 +1,150 @@ +/* Symbol table manager for Bison, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + + +#include +#include "system.h" +#include "new.h" +#include "symtab.h" +#include "gram.h" + + +bucket **symtab; +bucket *firstsymbol; +bucket *lastsymbol; + + + +int +hash(key) +char *key; +{ + register char *cp; + register int k; + + cp = key; + k = 0; + while (*cp) + k = ((k << 1) ^ (*cp++)) & 0x3fff; + + return (k % TABSIZE); +} + + + +char * +copys(s) +char *s; +{ + register int i; + register char *cp; + register char *result; + + i = 1; + for (cp = s; *cp; cp++) + i++; + + result = xmalloc((unsigned int)i); + strcpy(result, s); + return (result); +} + + +void +tabinit() +{ +/* register int i; JF unused */ + + symtab = NEW2(TABSIZE, bucket *); + + firstsymbol = NULL; + lastsymbol = NULL; +} + + +bucket * +getsym(key) +char *key; +{ + register int hashval; + register bucket *bp; + register int found; + + hashval = hash(key); + bp = symtab[hashval]; + + found = 0; + while (bp != NULL && found == 0) + { + if (strcmp(key, bp->tag) == 0) + found = 1; + else + bp = bp->link; + } + + if (found == 0) + { + nsyms++; + + bp = NEW(bucket); + bp->link = symtab[hashval]; + bp->next = NULL; + bp->tag = copys(key); + bp->class = SUNKNOWN; + + if (firstsymbol == NULL) + { + firstsymbol = bp; + lastsymbol = bp; + } + else + { + lastsymbol->next = bp; + lastsymbol = bp; + } + + symtab[hashval] = bp; + } + + return (bp); +} + + +void +free_symtab() +{ + register int i; + register bucket *bp,*bptmp;/* JF don't use ptr after free */ + + for (i = 0; i < TABSIZE; i++) + { + bp = symtab[i]; + while (bp) + { + bptmp = bp->link; +#if 0 /* This causes crashes because one string can appear more than once. 
*/ + if (bp->type_name) + FREE(bp->type_name); +#endif + FREE(bp); + bp = bptmp; + } + } + FREE(symtab); +} diff --git a/contrib/bison/symtab.h b/contrib/bison/symtab.h new file mode 100644 index 000000000000..f515721d6b54 --- /dev/null +++ b/contrib/bison/symtab.h @@ -0,0 +1,56 @@ +/* Definitions for symtab.c and callers, part of bison, + Copyright (C) 1984, 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + + +#define TABSIZE 1009 + + +/* symbol classes */ + +#define SUNKNOWN 0 +#define STOKEN 1 /* terminal symbol */ +#define SNTERM 2 /* non-terminal */ + +#define SALIAS -9991 /* for symbol generated with an alias */ + +typedef + struct bucket + { + struct bucket *link; + struct bucket *next; + char *tag; + char *type_name; + short value; + short prec; + short assoc; + short user_token_number; + /* special value SALIAS in the identifier + half of the identifier-symbol pair for an alias */ + struct bucket *alias; + /* points to the other in the identifier-symbol + pair for an alias */ + char class; + } + bucket; + + +extern bucket **symtab; +extern bucket *firstsymbol; + +extern bucket *getsym(); diff --git a/contrib/bison/system.h b/contrib/bison/system.h new file mode 100644 index 000000000000..8d3562c9b072 --- /dev/null +++ b/contrib/bison/system.h @@ -0,0 +1,25 @@ +#ifdef MSDOS +#include +#endif + +#if defined(HAVE_STDLIB_H) || defined(MSDOS) +#include +#endif + +#if (defined(VMS) || defined(MSDOS)) && !defined(HAVE_STRING_H) +#define HAVE_STRING_H 1 +#endif + +#if defined(STDC_HEADERS) || defined(HAVE_STRING_H) +#include +/* An ANSI string.h and pre-ANSI memory.h might conflict. */ +#if !defined(STDC_HEADERS) && defined(HAVE_MEMORY_H) +#include +#endif /* not STDC_HEADERS and HAVE_MEMORY_H */ +#ifndef bcopy +#define bcopy(src, dst, num) memcpy((dst), (src), (num)) +#endif +#else /* not STDC_HEADERS and not HAVE_STRING_H */ +#include +/* memory.h and strings.h conflict on some systems. */ +#endif /* not STDC_HEADERS and not HAVE_STRING_H */ diff --git a/contrib/bison/types.h b/contrib/bison/types.h new file mode 100644 index 000000000000..a4aa0a750125 --- /dev/null +++ b/contrib/bison/types.h @@ -0,0 +1,27 @@ +/* Define data type for representing bison's grammar input as it is parsed, + Copyright (C) 1984, 1989 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. 
+ +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. */ + + +typedef + struct shorts + { + struct shorts *next; + short value; + } + shorts; diff --git a/contrib/bison/version.c b/contrib/bison/version.c new file mode 100644 index 000000000000..2bc1122875ff --- /dev/null +++ b/contrib/bison/version.c @@ -0,0 +1 @@ +char *version_string = "GNU Bison version 1.25\n"; diff --git a/contrib/bison/vmsgetargs.c b/contrib/bison/vmsgetargs.c new file mode 100644 index 000000000000..83cb8bfcfade --- /dev/null +++ b/contrib/bison/vmsgetargs.c @@ -0,0 +1,180 @@ +/* VMS version of getargs; Uses DCL command parsing. + Copyright (C) 1989, 1992 Free Software Foundation, Inc. + +This file is part of Bison, the GNU Compiler Compiler. + +Bison is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 2, or (at your option) +any later version. + +Bison is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with Bison; see the file COPYING. If not, write to +the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
*/ + + +#include +#include +#include "files.h" + +/* + * VMS version of getargs: Uses DCL command parsing + * (argc and argv are ignored) + */ +int verboseflag; +int definesflag; +int debugflag; +int nolinesflag; +extern int noparserflag; +extern int toknumflag; +extern int rawtoknumflag; +extern int fixed_outfiles; +extern char * version_string; + +/* Allocate storage and initialize, since bison uses them elsewhere. */ +char *spec_name_prefix; +char *spec_file_prefix; + +getargs(argc,argv) + int argc; + char *argv[]; +{ + register char *cp; + static char Input_File[256]; + static char output_spec[256], name_prefix_spec[256], file_prefix_spec[256]; + extern char *infile; + + verboseflag = 0; + definesflag = 0; + debugflag = 0; + fixed_outfiles = 0; + nolinesflag = 0; + noparserflag = 0; + toknumflag = 0; + rawtoknumflag = 0; + /* + * Check for /VERBOSE qualifier + */ + if (cli_present("BISON$VERBOSE")) verboseflag = 1; + /* + * Check for /DEFINES qualifier + */ + if (cli_present("BISON$DEFINES")) definesflag = 1; + /* + * Check for /FIXED_OUTFILES qualifier + */ + if (cli_present("BISON$FIXED_OUTFILES")) fixed_outfiles = 1; + if (cli_present("BISON$YACC")) fixed_outfiles = 1; + /* + * Check for /VERSION qualifier + */ + if (cli_present("BISON$VERSION")) printf("%s",version_string); + /* + * Check for /NOLINES qualifier + */ + if (cli_present("BISON$NOLINES")) nolinesflag = 1; + /* + * Check for /NOPARSER qualifier + */ + if (cli_present("BISON$NOPARSER")) noparserflag = 1; + /* + * Check for /RAW qualifier + */ + if (cli_present("BISON$RAW")) rawtoknumflag = 1; + /* + * Check for /TOKEN_TABLE qualifier + */ + if (cli_present("BISON$TOKEN_TABLE")) toknumflag = 1; + /* + * Check for /DEBUG qualifier + */ + if (cli_present("BISON$DEBUG")) debugflag = 1; + /* + * Get the filename + */ + cli_get_value("BISON$INFILE", Input_File, sizeof(Input_File)); + /* + * Lowercase the input filename + */ + cp = Input_File; + while(*cp) + { + if (isupper(*cp)) *cp = tolower(*cp);
+      cp++;
+    }
+  infile = Input_File;
+  /*
+   *    Get the output file
+   */
+  if (cli_present("BISON$OUTPUT"))
+    {
+      cli_get_value("BISON$OUTPUT", output_spec, sizeof(output_spec));
+      for (cp = spec_outfile = output_spec; *cp; cp++)
+        if (isupper(*cp))
+          *cp = tolower(*cp);
+    }
+  /*
+   *    Get the file prefix
+   */
+  if (cli_present("BISON$FILE_PREFIX"))
+    {
+      cli_get_value("BISON$FILE_PREFIX", file_prefix_spec,
+                    sizeof(file_prefix_spec));
+      for (cp = spec_file_prefix = file_prefix_spec; *cp; cp++)
+        if (isupper(*cp))
+          *cp = tolower(*cp);
+    }
+  /*
+   *    Get the name prefix
+   */
+  if (cli_present("BISON$NAME_PREFIX"))
+    {
+      cli_get_value("BISON$NAME_PREFIX", name_prefix_spec,
+                    sizeof(name_prefix_spec));
+      for (cp = spec_name_prefix = name_prefix_spec; *cp; cp++)
+        if (isupper(*cp))
+          *cp = tolower(*cp);
+    }
+}
+
+/************ DCL PARSING ROUTINES **********/
+
+/*
+ *      See if "NAME" is present
+ */
+int
+cli_present(Name)
+    char *Name;
+{
+  struct {int Size; char *Ptr;} Descr;
+
+  Descr.Ptr = Name;
+  Descr.Size = strlen(Name);
+  return((cli$present(&Descr) & 1) ? 1 : 0);
+}
+
+/*
+ *      Get value of "NAME"
+ */
+int
+cli_get_value(Name,Buffer,Size)
+    char *Name;
+    char *Buffer;
+{
+  struct {int Size; char *Ptr;} Descr1,Descr2;
+
+  Descr1.Ptr = Name;
+  Descr1.Size = strlen(Name);
+  Descr2.Ptr = Buffer;
+  Descr2.Size = Size-1;
+  if (cli$get_value(&Descr1,&Descr2,&Descr2.Size) & 1) {
+    Buffer[Descr2.Size] = 0;
+    return(1);
+  }
+  return(0);
+}
diff --git a/contrib/bison/vmshlp.mar b/contrib/bison/vmshlp.mar
new file mode 100644
index 000000000000..637d170d584b
--- /dev/null
+++ b/contrib/bison/vmshlp.mar
@@ -0,0 +1,42 @@
+;/* Macro help routines for the BISON/VMS program
+;   Gabor Karsai, Vanderbilt University
+;
+;BISON is distributed in the hope that it will be useful, but WITHOUT ANY
+;WARRANTY.  No author or distributor accepts responsibility to anyone
+;for the consequences of using it or for whether it serves any
+;particular purpose or works at all, unless he says so in writing.
+;Refer to the BISON General Public License for full details.
+;
+;Everyone is granted permission to copy, modify and redistribute BISON,
+;but only under the conditions described in the BISON General Public
+;License.  A copy of this license is supposed to have been given to you
+;along with BISON so you can know your rights and responsibilities.  It
+;should be in a file named COPYING.  Among other things, the copyright
+;notice and this notice must be preserved on all copies.
+;
+; In other words, you are welcome to use, share and improve this program.
+; You are forbidden to forbid anyone else to use, share and improve
+; what you give them.   Help stamp out software-hoarding!  */
+;
+        .psect  vmshlp  pic,usr,rel,ovr,shr,long,exe,nowrt
+
+alloca::
+        .word   0
+        subl2   ^X4(ap),sp
+        movl    ^X10(fp),r1
+        movq    ^X8(fp),ap
+        bicl2   #03,sp
+        addl2   #^X1c,sp
+        movl    sp,r0
+        jmp     (r1)
+
+bcopy::
+        .word   ^X0e00
+        movl    ^X04(ap),r11
+        movl    ^X08(ap),r10
+        movl    ^X0c(ap),r9
+        brb     1$
+2$:     movb    (r10)+,(r11)+
+1$:     sobgeq  r9,2$
+        ret
+        .end
diff --git a/contrib/bison/warshall.c b/contrib/bison/warshall.c
new file mode 100644
index 000000000000..65487cbfb3b6
--- /dev/null
+++ b/contrib/bison/warshall.c
@@ -0,0 +1,119 @@
+/* Generate transitive closure of a matrix,
+   Copyright (C) 1984, 1989 Free Software Foundation, Inc.
+
+This file is part of Bison, the GNU Compiler Compiler.
+
+Bison is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2, or (at your option)
+any later version.
+
+Bison is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with Bison; see the file COPYING.  If not, write to
+the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */
+
+
+#include <stdio.h>
+#include "system.h"
+#include "machine.h"
+
+
+/* given n by n matrix of bits R, modify its contents
+   to be the transitive closure of what was given.  */
+
+void
+TC(R, n)
+unsigned *R;
+int n;
+{
+  register int rowsize;
+  register unsigned mask;
+  register unsigned *rowj;
+  register unsigned *rp;
+  register unsigned *rend;
+  register unsigned *ccol;
+
+  unsigned *relend;
+  unsigned *cword;
+  unsigned *rowi;
+
+  rowsize = WORDSIZE(n) * sizeof(unsigned);
+  relend = (unsigned *) ((char *) R + (n * rowsize));
+
+  cword = R;
+  mask = 1;
+  rowi = R;
+  while (rowi < relend)
+    {
+      ccol = cword;
+      rowj = R;
+
+      while (rowj < relend)
+        {
+          if (*ccol & mask)
+            {
+              rp = rowi;
+              rend = (unsigned *) ((char *) rowj + rowsize);
+
+              while (rowj < rend)
+                *rowj++ |= *rp++;
+            }
+          else
+            {
+              rowj = (unsigned *) ((char *) rowj + rowsize);
+            }
+
+          ccol = (unsigned *) ((char *) ccol + rowsize);
+        }
+
+      mask <<= 1;
+      if (mask == 0)
+        {
+          mask = 1;
+          cword++;
+        }
+
+      rowi = (unsigned *) ((char *) rowi + rowsize);
+    }
+}
+
+
+/* Reflexive Transitive Closure.  Same as TC
+   and then set all the bits on the diagonal of R.  */
+
+void
+RTC(R, n)
+unsigned *R;
+int n;
+{
+  register int rowsize;
+  register unsigned mask;
+  register unsigned *rp;
+  register unsigned *relend;
+
+  TC(R, n);
+
+  rowsize = WORDSIZE(n) * sizeof(unsigned);
+  relend = (unsigned *) ((char *) R + n*rowsize);
+
+  mask = 1;
+  rp = R;
+  while (rp < relend)
+    {
+      *rp |= mask;
+
+      mask <<= 1;
+      if (mask == 0)
+        {
+          mask = 1;
+          rp++;
+        }
+
+      rp = (unsigned *) ((char *) rp + rowsize);
+    }
+}