mirror of
https://git.savannah.gnu.org/git/emacs.git
synced 2024-12-03 08:30:09 +00:00
dd91aac508
* etc/DEBUG: Don’t imply Clang works as well as GCC when debugging Emacs. Mention a coverage problem with Clang.
1060 lines
44 KiB
Plaintext
1060 lines
44 KiB
Plaintext
Debugging GNU Emacs
|
||
|
||
Copyright (C) 1985, 2000-2022 Free Software Foundation, Inc.
|
||
See the end of the file for license conditions.
|
||
|
||
** Preliminaries
|
||
|
||
This section can be skipped if you are already familiar with building
|
||
Emacs with debug info, configuring and starting GDB, and simple GDB
|
||
debugging techniques.
|
||
|
||
*** Configuring Emacs for debugging
|
||
|
||
It is best to configure and build Emacs with special options that will
|
||
make the debugging easier. Here are the configure-time options we
|
||
recommend (they are in addition to any other options you might need,
|
||
such as --prefix):
|
||
|
||
./configure --enable-checking='yes,glyphs' --enable-check-lisp-object-type \
|
||
CFLAGS='-O0 -g3'
|
||
|
||
The -O0 flag is important, as debugging optimized code can be hard.
|
||
If the problem happens only with optimized code, you may need to
|
||
enable optimizations. If that happens, try using -Og first instead of
|
||
-O2, as -Og disables some optimizations that make debugging some code
|
||
exceptionally hard.
|
||
|
||
Older versions of GCC may need more than just the -g3 flag. For more,
|
||
search for "analyze failed assertions" below.
|
||
|
||
The 2 --enable-* switches are optional. They don't have any effect on
|
||
debugging with GDB, but will compile additional code that might catch
|
||
the problem you are debugging much earlier, in the form of assertion
|
||
violation. The --enable-checking option also enables additional
|
||
functionality useful for debugging display problems; see more about
|
||
this below under "Debugging Emacs redisplay problems".
|
||
|
||
Emacs needs not be installed to be debugged, you can debug the binary
|
||
created in the 'src' directory.
|
||
|
||
*** Configuring GDB
|
||
|
||
To start GDB to debug Emacs, you can simply type "gdb ./emacs RET" at
|
||
the shell prompt (assuming you do that from the directory of the Emacs
|
||
executable, usually the 'src' sub-directory of the Emacs tree).
|
||
However, we recommend starting GDB from Emacs, see below.
|
||
|
||
When you debug Emacs with GDB, you should start GDB in the directory
|
||
where the Emacs executable was made (the 'src' directory in the Emacs
|
||
source tree). That directory has a .gdbinit file that defines various
|
||
"user-defined" commands for debugging Emacs. (These commands are
|
||
described below under "Examining Lisp object values" and "Debugging
|
||
Emacs Redisplay problems".)
|
||
|
||
Starting the debugger from Emacs, via the "M-x gdb" command (described
|
||
below), when the current buffer visits one of the Emacs C source files
|
||
will automatically start GDB in the 'src' directory. If you invoke
|
||
"M-x gdb" from a buffer whose default directory is different, such as
|
||
from the "*scratch*" buffer, you can change the default directory with
|
||
the "M-x cd" command before starting the debugger.
|
||
|
||
Recent GDB versions by default do not automatically load .gdbinit
|
||
files in the directory where you invoke GDB. With those versions of
|
||
GDB, you will see a warning when GDB starts, like this:
|
||
|
||
warning: File ".../src/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
|
||
|
||
The simplest way to fix this is to add the following line to your
|
||
~/.gdbinit file (creating such a file if it doesn't already exist):
|
||
|
||
add-auto-load-safe-path /path/to/emacs/src/.gdbinit
|
||
|
||
There are other ways to overcome that difficulty, they are all
|
||
described in the node "Auto-loading safe path" in the GDB user manual.
|
||
If nothing else helps, type "source /path/to/.gdbinit RET" at the GDB
|
||
prompt, to unconditionally load the GDB init file.
|
||
|
||
Running GDB on macOS sometimes brings an error message like this:
|
||
|
||
Unable to find Mach task port for process-id NNN: (os/kern) failure (0x5).
|
||
|
||
To overcome this, search the Internet for the phrase "Unable to find
|
||
Mach task port for process-id", and you will find detailed
|
||
instructions to follow.
|
||
|
||
*** Use the Emacs GDB UI front-end
|
||
|
||
We recommend using the GUI front-end for GDB provided by Emacs. With
|
||
it, you can start GDB by typing "M-x gdb RET". This will suggest the
|
||
file name of the default binary to debug; if the suggested default is
|
||
not the Emacs binary you want to debug, change the file name as
|
||
needed. Alternatively, if you want to attach the debugger to an
|
||
already running Emacs process, change the GDB command shown in the
|
||
minibuffer to say this:
|
||
|
||
gdb -i=mi -p PID
|
||
|
||
where PID is the numerical process ID of the running Emacs process,
|
||
displayed by system utilities such as 'top' or 'ps' on Posix hosts and
|
||
Task Manager on MS-Windows.
|
||
|
||
Once the debugger starts, open the additional windows provided by the
|
||
GDB UI, by typing "M-x gdb-many-windows RET". (Alternatively, click
|
||
Gud->GDB-MI->Display Other Windows" from the menu bar.) At this
|
||
point, make your frame large enough (or full-screen) such that the
|
||
windows you just opened have enough space to show the content without
|
||
horizontal scrolling.
|
||
|
||
You can later restore your window configuration with the companion
|
||
command "M-x gdb-restore-windows RET", or by deselecting "Display
|
||
Other Windows" from the menu bar.
|
||
|
||
*** Setting initial breakpoints
|
||
|
||
Before you let Emacs run, you should now set breakpoints in the code
|
||
which you want to debug, so that Emacs stops there and lets GDB take
|
||
control. If the code which you want to debug is executed under some
|
||
rare conditions, or only when a certain Emacs command is manually
|
||
invoked, then just set your breakpoint there, let Emacs run, and
|
||
trigger the breakpoint by invoking that command or reproducing those
|
||
rare conditions.
|
||
|
||
If you are less lucky, and the code in question is run very
|
||
frequently, you will have to find some way of avoiding triggering your
|
||
breakpoint when the conditions for the buggy behavior did not yet
|
||
happen. There's no single recipe for this, you will have to be
|
||
creative and study the code to see what's appropriate. Some useful
|
||
tricks for that:
|
||
|
||
. Make your breakpoint conditional on certain buffer or string
|
||
position. For example:
|
||
|
||
(gdb) break foo.c:1234 if PT >= 9876
|
||
|
||
. Set a break point in some rarely called function, then create the
|
||
conditions for the bug, call that rare function, and when GDB gets
|
||
control, set the breakpoint in the buggy code, knowing that it
|
||
will now be called when the bug happens.
|
||
|
||
. If the bug manifests itself as an error message, set a breakpoint
|
||
in Fsignal, and when it breaks, look at the backtrace to see what
|
||
triggers the error.
|
||
|
||
Some additional techniques are described below under "Getting control
|
||
to the debugger".
|
||
|
||
You are now ready to start your debugging session.
|
||
|
||
*** Running Emacs from GDB
|
||
|
||
If you are starting a new Emacs session, type "run", followed by any
|
||
command-line arguments (e.g., "-Q") into the *gud-emacs* buffer and
|
||
press RET. If you ran GDB outside of Emacs, type "run" followed by
|
||
the command-line arguments at the GDB prompt instead.
|
||
|
||
If you attached the debugger to a running Emacs, type "continue" into
|
||
the *gud-emacs* buffer and press RET.
|
||
|
||
Many variables you will encounter while debugging are Lisp objects.
|
||
These are normally displayed as opaque pointers or integers that are
|
||
hard to interpret, especially if they represent long lists.
|
||
(They are instead displayed as structures containing these opaque
|
||
values, if --enable-check-lisp-object-type is in effect.) You can
|
||
use the 'pp' command to display them in their Lisp form. That command
|
||
displays its output on the standard error stream, which you
|
||
can redirect to a file using "M-x redirect-debugging-output".
|
||
This means that if you attach GDB to a running Emacs that was invoked
|
||
from a desktop icon, chances are you will not see the output at all,
|
||
or it will wind up in an obscure place (check the documentation of
|
||
your desktop environment).
|
||
|
||
Additional information about displaying Lisp objects can be found
|
||
under "Examining Lisp object values" below.
|
||
|
||
The rest of this document describes specific useful techniques for
|
||
debugging Emacs; we suggest reading it in its entirety the first time
|
||
you are about to debug Emacs, then look up your specific issues
|
||
whenever you need.
|
||
|
||
Good luck!
|
||
|
||
** When you are trying to analyze failed assertions or backtraces, it
|
||
is essential to compile Emacs with flags suitable for debugging.
|
||
Although CFLAGS="-O0 -g3" often suffices with modern compilers,
|
||
you may benefit further by using CFLAGS="-O0 -g3 -gdwarf-4", replacing
|
||
"4" by the highest version of DWARF that your compiler supports;
|
||
this is especially important for GCC versions older than 4.8.
|
||
With GCC and higher optimization levels such as -O2, the
|
||
-fno-omit-frame-pointer and -fno-crossjumping options are often
|
||
essential. The latter prevents GCC from using the same abort call for
|
||
all assertions in a given function, rendering the stack backtrace
|
||
useless for identifying the specific failed assertion.
|
||
|
||
** It is a good idea to run Emacs under GDB (or some other suitable
|
||
debugger) *all the time*. Then, when Emacs crashes, you will be able
|
||
to debug the live process, not just a core dump. (This is especially
|
||
important on systems which don't support core files, and instead print
|
||
just the registers and some stack addresses.)
|
||
|
||
** If Emacs hangs, or seems to be stuck in some infinite loop, typing
|
||
"kill -TSTP PID", where PID is the Emacs process ID, will cause GDB to
|
||
kick in, provided that you run under GDB.
|
||
|
||
** Getting control to the debugger
|
||
|
||
Setting a breakpoint in a strategic place, after loading Emacs into
|
||
the debugger, but before running it, is the most efficient way of
|
||
making sure control will be returned to the debugger when you need
|
||
that.
|
||
|
||
'Fsignal' is a very useful place to put a breakpoint in. All Lisp
|
||
errors go through there. If you are only interested in errors that
|
||
would fire the Lisp debugger, breaking at 'maybe_call_debugger' is
|
||
useful.
|
||
|
||
Another technique for getting control to the debugger is to put a
|
||
breakpoint in some rarely used function. One such convenient function
|
||
is Fredraw_display, which you can invoke at will interactively with
|
||
"M-x redraw-display RET".
|
||
|
||
It is also useful to have a guaranteed way to return to the debugger
|
||
at any arbitrary time. When using X, this is easy: type C-z at the
|
||
window where you are interacting with GDB, and it will stop Emacs just
|
||
as it would stop any ordinary program. (This doesn't work if GDB was
|
||
attached to a running Emacs process; in that case, you will need to
|
||
type C-z to the shell window from which Emacs was started, or use the
|
||
"kill -TSTP" method described below.)
|
||
|
||
When Emacs is displaying on a text terminal, things are not so easy,
|
||
so we describe the various alternatives below (however, those of them
|
||
that use signals only work on Posix systems).
|
||
|
||
The src/.gdbinit file in the Emacs distribution arranges for SIGINT
|
||
(C-g in Emacs on a text-mode frame) to be passed to Emacs and not give
|
||
control back to GDB. On modern systems, you can override that with
|
||
this command:
|
||
|
||
handle SIGINT stop nopass
|
||
|
||
After this 'handle' command, SIGINT will return control to GDB. If
|
||
you want the C-g to cause a quit within Emacs as well, omit the 'nopass'.
|
||
See the GDB manual for more details about signal handling and the
|
||
'handle' command.
|
||
|
||
A technique that can work when 'handle SIGINT' does not is to store
|
||
the code for some character into the variable stop_character. Thus,
|
||
|
||
set stop_character = 29
|
||
|
||
makes Control-] (decimal code 29) the stop character.
|
||
Typing Control-] will cause immediate stop. You cannot
|
||
use the set command until the inferior process has been started, so
|
||
start Emacs with the 'start' command, to get an opportunity to do the
|
||
above 'set' command.
|
||
|
||
On a Posix host, you can also send a signal using the 'kill' command
|
||
from a shell prompt, like this:
|
||
|
||
kill -TSTP Emacs-PID
|
||
|
||
where Emacs-PID is the process ID of Emacs being debugged. Other
|
||
useful signals to send are SIGUSR1 and SIGUSR2; see "Error Debugging"
|
||
in the ELisp manual for how to use those.
|
||
|
||
When Emacs is displaying on a text terminal, it is useful to have a
|
||
separate terminal for the debug session. This can be done by starting
|
||
Emacs as usual, then attaching to it from gdb with the 'attach'
|
||
command which is explained in the node "Attach" of the GDB manual.
|
||
|
||
On MS-Windows, you can alternatively start Emacs from its own separate
|
||
console by setting the new-console option before running Emacs under
|
||
GDB:
|
||
|
||
(gdb) set new-console 1
|
||
(gdb) run
|
||
|
||
If you do this, then typing C-c or C-BREAK into the console window
|
||
through which you interact with GDB will stop Emacs and return control
|
||
to the debugger, no matter if Emacs displays GUI or text-mode frames.
|
||
This is the only reliable alternative on MS-Windows to get control to
|
||
the debugger, besides setting breakpoints in advance.
|
||
|
||
** Examining Lisp object values.
|
||
|
||
When you have a live process to debug, and it has not encountered a
|
||
fatal error, you can use the GDB command 'pr'. First print the value
|
||
in the ordinary way, with the 'p' command. Then type 'pr' with no
|
||
arguments. This calls a subroutine which uses the Lisp printer.
|
||
|
||
You can also use 'pp value' to print the emacs value directly.
|
||
|
||
To see the current value of a Lisp Variable, use 'pv variable'.
|
||
|
||
These commands send their output to stderr; if that is closed or
|
||
redirected to some file you don't know, you won't see their output.
|
||
This is particularly so for Emacs invoked on MS-Windows from the
|
||
desktop shortcut. You can use the command 'redirect-debugging-output'
|
||
to redirect stderr to a file.
|
||
|
||
Note: It is not a good idea to try 'pr', 'pp', or 'pv' if you know that Emacs
|
||
is in deep trouble: its stack smashed (e.g., if it encountered SIGSEGV
|
||
due to stack overflow), or crucial data structures, such as 'obarray',
|
||
corrupted, etc. In such cases, the Emacs subroutine called by 'pr'
|
||
might make more damage, like overwrite some data that is important for
|
||
debugging the original problem.
|
||
|
||
Also, on some systems it is impossible to use 'pr' if you stopped
|
||
Emacs while it was inside 'select'. This is in fact what happens if
|
||
you stop Emacs while it is waiting. In such a situation, don't try to
|
||
use 'pr'. Instead, use 's' to step out of the system call. Then
|
||
Emacs will be between instructions and capable of handling 'pr'.
|
||
|
||
If you can't use 'pr' command, for whatever reason, you can use the
|
||
'xpr' command to print out the data type and value of the last data
|
||
value, For example:
|
||
|
||
p it->object
|
||
xpr
|
||
|
||
You may also analyze data values using lower-level commands. Use the
|
||
'xtype' command to print out the data type of the last data value.
|
||
Once you know the data type, use the command that corresponds to that
|
||
type. Here are these commands:
|
||
|
||
xint xptr xwindow xmarker xoverlay xmiscfree xintfwd xboolfwd xobjfwd
|
||
xbufobjfwd xkbobjfwd xbuflocal xbuffer xsymbol xstring xvector xframe
|
||
xwinconfig xcompiled xcons xcar xcdr xsubr xprocess xfloat xscrollbar
|
||
xchartable xsubchartable xboolvector xhashtable xlist xcoding
|
||
xcharset xfontset xfont
|
||
|
||
Each one of them applies to a certain type or class of types.
|
||
(Some of these types are not visible in Lisp, because they exist only
|
||
internally.)
|
||
|
||
Each x... command prints some information about the value, and
|
||
produces a GDB value (subsequently available in $) through which you
|
||
can get at the rest of the contents.
|
||
|
||
In general, most of the rest of the contents will be additional Lisp
|
||
objects which you can examine in turn with the x... commands.
|
||
|
||
Even with a live process, these x... commands are useful for
|
||
examining the fields in a buffer, window, process, frame or marker.
|
||
Here's an example using concepts explained in the node "Value History"
|
||
of the GDB manual to print values associated with the variable
|
||
called frame. First, use these commands:
|
||
|
||
cd src
|
||
gdb emacs
|
||
b set_frame_buffer_list
|
||
r -q
|
||
|
||
Then Emacs hits the breakpoint:
|
||
|
||
(gdb) p frame
|
||
$1 = 139854428
|
||
(gdb) xpr
|
||
Lisp_Vectorlike
|
||
PVEC_FRAME
|
||
$2 = (struct frame *) 0x8560258
|
||
"emacs@localhost"
|
||
(gdb) p *$
|
||
$3 = {
|
||
size = 1073742931,
|
||
next = 0x85dfe58,
|
||
name = 140615219,
|
||
[...]
|
||
}
|
||
|
||
Now we can use 'pp' to print the frame parameters:
|
||
|
||
(gdb) pp $->param_alist
|
||
((background-mode . light) (display-type . color) [...])
|
||
|
||
The Emacs C code heavily uses macros defined in lisp.h. So suppose
|
||
we want the address of the l-value expression near the bottom of
|
||
'add_command_key' from keyboard.c:
|
||
|
||
XVECTOR (this_command_keys)->contents[this_command_key_count++] = key;
|
||
|
||
XVECTOR is a macro, so GDB only knows about it if Emacs has been compiled with
|
||
preprocessor macro information. GCC provides this if you specify the options
|
||
'-gdwarf-N' (where N is 2 or higher) and '-g3'. In this case, GDB can
|
||
evaluate expressions like "p XVECTOR (this_command_keys)".
|
||
|
||
When this information isn't available, you can use the xvector command in GDB
|
||
to get the same result. Here is how:
|
||
|
||
(gdb) p this_command_keys
|
||
$1 = 1078005760
|
||
(gdb) xvector
|
||
$2 = (struct Lisp_Vector *) 0x411000
|
||
0
|
||
(gdb) p $->contents[this_command_key_count]
|
||
$3 = 1077872640
|
||
(gdb) p &$
|
||
$4 = (int *) 0x411008
|
||
|
||
Here's a related example of macros and the GDB 'define' command.
|
||
There are many Lisp vectors such as 'recent_keys', which contains the
|
||
last 300 keystrokes. We can print this Lisp vector
|
||
|
||
p recent_keys
|
||
pr
|
||
|
||
But this may be inconvenient, since 'recent_keys' is much more verbose
|
||
than 'C-h l'. We might want to print only the last 10 elements of
|
||
this vector. 'recent_keys' is updated in keyboard.c by the command
|
||
|
||
XVECTOR (recent_keys)->contents[recent_keys_index] = c;
|
||
|
||
So we define a GDB command 'xvector-elts', so the last 10 keystrokes
|
||
are printed by
|
||
|
||
xvector-elts recent_keys recent_keys_index 10
|
||
|
||
where you can define xvector-elts as follows:
|
||
|
||
define xvector-elts
|
||
set $i = 0
|
||
p $arg0
|
||
xvector
|
||
set $foo = $
|
||
while $i < $arg2
|
||
p $foo->contents[$arg1-($i++)]
|
||
pr
|
||
end
|
||
document xvector-elts
|
||
Prints a range of elements of a Lisp vector.
|
||
xvector-elts v n i
|
||
prints 'i' elements of the vector 'v' ending at the index 'n'.
|
||
end
|
||
|
||
** Getting Lisp-level backtrace information within GDB
|
||
|
||
The most convenient way is to use the 'xbacktrace' command. This
|
||
shows the names of the Lisp functions that are currently active.
|
||
|
||
If that doesn't work (e.g., because the 'backtrace_list' structure is
|
||
corrupted), type "bt" at the GDB prompt, to produce the C-level
|
||
backtrace, and look for stack frames that call Ffuncall. Select them
|
||
one by one in GDB, by typing "up N", where N is the appropriate number
|
||
of frames to go up, and in each frame that calls Ffuncall type this:
|
||
|
||
p *args
|
||
pr
|
||
|
||
This will print the name of the Lisp function called by that level
|
||
of function calling.
|
||
|
||
By printing the remaining elements of args, you can see the argument
|
||
values. Here's how to print the first argument:
|
||
|
||
p args[1]
|
||
pr
|
||
|
||
If you do not have a live process, you can use xtype and the other
|
||
x... commands such as xsymbol to get such information, albeit less
|
||
conveniently. For example:
|
||
|
||
p *args
|
||
xtype
|
||
|
||
and, assuming that "xtype" says that args[0] is a symbol:
|
||
|
||
xsymbol
|
||
|
||
** Debugging Emacs redisplay problems
|
||
|
||
If you configured Emacs with --enable-checking='glyphs', you can use redisplay
|
||
tracing facilities from a running Emacs session.
|
||
|
||
The command "M-x trace-redisplay RET" will produce a trace of what redisplay
|
||
does on the standard error stream. This is very useful for understanding the
|
||
code paths taken by the display engine under various conditions, especially if
|
||
some redisplay optimizations produce wrong results. (You know that redisplay
|
||
optimizations might be involved if "M-x redraw-display RET", or even just
|
||
typing "M-x", causes Emacs to correct the bad display.) Since the cursor
|
||
blinking feature triggers periodic redisplay cycles, we recommend disabling
|
||
'blink-cursor-mode' before invoking 'trace-redisplay', so that you have less
|
||
clutter in the trace. You can also have up to 30 last trace messages dumped to
|
||
standard error by invoking the 'dump-redisplay-history' command.
|
||
|
||
To find the code paths which were taken by the display engine, search xdisp.c
|
||
for the trace messages you see.
|
||
|
||
The command 'dump-glyph-matrix' is useful for producing on standard error
|
||
stream a full dump of the selected window's glyph matrix. See the function's
|
||
doc string for more details. If you are debugging redisplay issues in
|
||
text-mode frames, you may find the command 'dump-frame-glyph-matrix' useful.
|
||
|
||
Other commands useful for debugging redisplay are 'dump-glyph-row' and
|
||
'dump-tool-bar-row'.
|
||
|
||
If you run Emacs under GDB, you can print the contents of any glyph matrix by
|
||
just calling that function with the matrix as its argument. For example, the
|
||
following command will print the contents of the current matrix of the window
|
||
whose pointer is in 'w':
|
||
|
||
(gdb) p dump_glyph_matrix (w->current_matrix, 2)
|
||
|
||
(The second argument 2 tells dump_glyph_matrix to print the glyphs in
|
||
a long form.)
|
||
|
||
The Emacs display code includes special debugging code, but it is normally
|
||
disabled. Configuring Emacs with --enable-checking='yes,glyphs' enables it.
|
||
|
||
Building Emacs like that activates many assertions which scrutinize
|
||
display code operation more than Emacs does normally. (To see the
|
||
code which tests these assertions, look for calls to the 'eassert'
|
||
macros.) Any assertion that is reported to fail should be investigated.
|
||
|
||
When you debug display problems running emacs under X, you can use
|
||
the 'ff' command to flush all pending display updates to the screen.
|
||
|
||
The src/.gdbinit file defines many useful commands for dumping redisplay
|
||
related data structures in a terse and user-friendly format:
|
||
|
||
'ppt' prints value of PT, narrowing, and gap in current buffer.
|
||
'pit' dumps the current display iterator 'it'.
|
||
'pwin' dumps the current window 'win'.
|
||
'prow' dumps the current glyph_row 'row'.
|
||
'pg' dumps the current glyph 'glyph'.
|
||
'pgi' dumps the next glyph.
|
||
'pgrow' dumps all glyphs in current glyph_row 'row'.
|
||
'pcursor' dumps current output_cursor.
|
||
|
||
The above commands also exist in a version with an 'x' suffix which takes an
|
||
object of the relevant type as argument. For example, 'pgrowx' dumps all
|
||
glyphs in its argument, which must be of type 'struct glyph_row'.
|
||
|
||
Since redisplay is performed by Emacs very frequently, you need to place your
|
||
breakpoints cleverly to avoid hitting them all the time, when the issue you are
|
||
debugging did not (yet) happen. Here are some useful techniques for that:
|
||
|
||
. Put a breakpoint at 'Fredraw_display' before running Emacs. Then do
|
||
whatever is required to reproduce the bad display, and invoke "M-x
|
||
redraw-display". The debugger will kick in, and you can set or enable
|
||
breakpoints in strategic places, knowing that the bad display will be
|
||
redrawn from scratch.
|
||
|
||
. For debugging incorrect cursor position, a good place to put a breakpoint is
|
||
in 'set_cursor_from_row'. The first time this function is called as part of
|
||
'redraw-display', Emacs is redrawing the minibuffer window, which is usually
|
||
not what you want; type "continue" to get to the call you want. In general,
|
||
always make sure 'set_cursor_from_row' is called for the right window and
|
||
buffer by examining the value of w->contents: it should be the buffer whose
|
||
display you are debugging.
|
||
|
||
. 'set_cursor_from_row' is also a good place to look at the contents of a
|
||
screen line (a.k.a. "glyph row"), by means of the 'pgrow' GDB command. Of
|
||
course, you need first to make sure the cursor is on the screen line which
|
||
you want to investigate. If you have set a breakpoint in 'Fredraw_display',
|
||
as advised above, move cursor to that line before invoking 'redraw-display'.
|
||
|
||
. If the problem happens only at some specific buffer position or for some
|
||
specific rarely-used character, you can make your breakpoints conditional on
|
||
those values. The display engine maintains the buffer and string position
|
||
it is processing in the it->current member; for example, the buffer
|
||
character position is in it->current.pos.charpos. Most redisplay functions
|
||
accept a pointer to a 'struct it' object as their argument, so you can make
|
||
conditional breakpoints in those functions, like this:
|
||
|
||
(gdb) break x_produce_glyphs if it->current.pos.charpos == 1234
|
||
|
||
For conditioning on the character being displayed, use it->c or
|
||
it->char_to_display.
|
||
|
||
. You can also make the breakpoints conditional on what object is being used
|
||
for producing glyphs for display. The it->method member has the value
|
||
GET_FROM_BUFFER for displaying buffer contents, GET_FROM_STRING for
|
||
displaying a Lisp string (e.g., a 'display' property or an overlay string),
|
||
GET_FROM_IMAGE for displaying an image, etc. See 'enum it_method' in
|
||
dispextern.h for the full list of values.
|
||
|
||
** Debugging problems with native-compiled Lisp.
|
||
|
||
When you encounter problems specific to native-compilation of Lisp, we
|
||
recommend to follow the procedure below to try to identify the cause:
|
||
|
||
. Reduce the problematic .el file to the minimum by bisection, and
|
||
try identifying the function that causes the problem.
|
||
|
||
. Reduce the problematic function to the minimal code that still
|
||
reproduces the problem.
|
||
|
||
. Study the problem's artifacts, like Lisp or C backtraces, to try
|
||
identifying the cause of the problem.
|
||
|
||
If you cannot figure out the cause for the problem using the above,
|
||
native-compile the problematic file after setting the variable
|
||
'comp-libgccjit-reproducer' to a non-nil value. That should produce a
|
||
file named ELNFILENAME_libgccjit_repro.c, where ELNFILENAME is the
|
||
name of the problematic .eln file, either in the same directory where
|
||
the .eln file is produced, or under your ~/.emacs.d/eln-cache (which
|
||
one depends on how the native-compilation is invoked). It is also
|
||
possible that the reproducer file's name will be something like
|
||
subr--trampoline-XXXXXXX_FUNCTION_libgccjit_repro.c, where XXXXXXX is
|
||
a long string of hex digits and FUNCTION is some function from the
|
||
compiled .el file. Attach that reproducer C file to your bug report.
|
||
|
||
** Following longjmp call.
|
||
|
||
Recent versions of glibc (2.4+?) encrypt stored values for setjmp/longjmp which
|
||
prevents GDB from being able to follow a longjmp call using 'next'. To
|
||
disable this protection you need to set the environment variable
|
||
LD_POINTER_GUARD to 0.
|
||
|
||
** Using GDB in Emacs
|
||
|
||
Debugging with GDB in Emacs offers some advantages over the command line (See
|
||
the GDB Graphical Interface node of the Emacs manual). There are also some
|
||
features available just for debugging Emacs:
|
||
|
||
1) The command gud-print is available on the tool bar (the 'p' icon) and
|
||
allows the user to print the s-expression of the variable at point,
|
||
in the GUD buffer.
|
||
|
||
2) Pressing 'p' on a component of a watch expression that is a lisp object
|
||
in the speedbar prints its s-expression in the GUD buffer.
|
||
|
||
3) The STOP button on the tool bar and the Signals->STOP menu-bar menu
|
||
item are adjusted so that they send SIGTSTP instead of the usual
|
||
SIGINT.
|
||
|
||
4) The command gud-pv has the global binding 'C-x C-a C-v' and prints the
|
||
value of the lisp variable at point.
|
||
|
||
** Debugging what happens while preloading and dumping Emacs
|
||
|
||
Debugging 'temacs' is useful when you want to establish whether a
|
||
problem happens in an undumped Emacs. To run 'temacs' under a
|
||
debugger, type "gdb temacs", then start it with 'r -batch -l loadup'.
|
||
|
||
If you need to debug what happens during dumping, start it with 'r -batch -l
|
||
loadup dump' instead. For debugging the bootstrap dumping, use "loadup
|
||
bootstrap" instead of "loadup dump".
|
||
|
||
If temacs actually succeeds when running under GDB in this way, do not
|
||
try to run the dumped Emacs, because it was dumped with the GDB
|
||
breakpoints in it.
|
||
|
||
** If you encounter X protocol errors
|
||
|
||
The X server normally reports protocol errors asynchronously,
|
||
so you find out about them long after the primitive which caused
|
||
the error has returned.
|
||
|
||
To get clear information about the cause of an error, try evaluating
|
||
(x-synchronize t). That puts Emacs into synchronous mode, where each
|
||
Xlib call checks for errors before it returns. This mode is much
|
||
slower, but when you get an error, you will see exactly which call
|
||
really caused the error.
|
||
|
||
You can start Emacs in a synchronous mode by invoking it with the -xrm
|
||
option, like this:
|
||
|
||
emacs -xrm "emacs.synchronous: true"
|
||
|
||
Setting a breakpoint in the function 'x_error_quitter' and looking at
|
||
the backtrace when Emacs stops inside that function will show what
|
||
code causes the X protocol errors.
|
||
|
||
Note that the -xrm option may have no effect when you make an Emacs
|
||
process invoked with the -nw option a server and want to trace X
|
||
protocol errors from subsequent invocations of emacsclient in a GUI
|
||
frame. In that case calling the initial Emacs via
|
||
|
||
emacs -nw --eval '(setq x-command-line-resources "emacs.synchronous: true")'
|
||
|
||
should give more reliable results.
|
||
|
||
For X protocol errors related to displaying unusual characters or to
|
||
font-related customizations, try invoking Emacs like this:
|
||
|
||
XFT_DEBUG=16 emacs -xrm "emacs.synchronous: true"
|
||
|
||
This should produce information from the libXft library which could
|
||
give useful hints regarding font-related problems in that library.
|
||
|
||
Some bugs related to the X protocol disappear when Emacs runs in a
|
||
synchronous mode. To track down those bugs, we suggest the following
|
||
procedure:
|
||
|
||
- Run Emacs under a debugger and put a breakpoint inside the
|
||
primitive function which, when called from Lisp, triggers the X
|
||
protocol errors. For example, if the errors happen when you
|
||
delete a frame, put a breakpoint inside 'Fdelete_frame'.
|
||
|
||
- When the breakpoint breaks, step through the code, looking for
|
||
calls to X functions (the ones whose names begin with "X" or
|
||
"Xt" or "Xm").
|
||
|
||
- Insert calls to 'XSync' before and after each call to the X
|
||
functions, like this:
|
||
|
||
XSync (f->output_data.x->display_info->display, 0);
|
||
|
||
where 'f' is the pointer to the 'struct frame' of the selected
|
||
frame, normally available via XFRAME (selected_frame). (Most
|
||
functions which call X already have some variable that holds the
|
||
pointer to the frame, perhaps called 'f' or 'sf', so you shouldn't
|
||
need to compute it.)
|
||
|
||
If your debugger can call functions in the program being debugged,
|
||
you should be able to issue the calls to 'XSync' without recompiling
|
||
Emacs. For example, with GDB, just type:
|
||
|
||
call XSync (f->output_data.x->display_info->display, 0)
|
||
|
||
before and immediately after the suspect X calls. If your
|
||
debugger does not support this, you will need to add these pairs
|
||
of calls in the source and rebuild Emacs.
|
||
|
||
Either way, systematically step through the code and issue these
|
||
calls until you find the first X function called by Emacs after
|
||
which a call to 'XSync' winds up in the function
|
||
'x_error_quitter'. The first X function call for which this
|
||
happens is the one that generated the X protocol error.
|
||
|
||
- You should now look around this offending X call and try to figure
|
||
out what is wrong with it.
|
||
|
||
** If Emacs causes errors or memory leaks in your X server
|
||
|
||
You can trace the traffic between Emacs and your X server with a tool
|
||
like xmon.
|
||
|
||
Xmon can be used to see exactly what Emacs sends when X protocol errors
|
||
happen. If Emacs causes the X server memory usage to increase you can
|
||
use xmon to see what items Emacs creates in the server (windows,
|
||
graphical contexts, pixmaps) and what items Emacs delete. If there
|
||
are consistently more creations than deletions, the type of item
|
||
and the activity you do when the items get created can give a hint where
|
||
to start debugging.
|
||
|
||
** If the symptom of the bug is that Emacs fails to respond
|
||
|
||
Don't assume Emacs is 'hung'--it may instead be in an infinite loop.
|
||
To find out which, make the problem happen under GDB and stop Emacs
|
||
once it is not responding. (If Emacs is using X Windows directly, you
|
||
can stop Emacs by typing C-z at the GDB job. On MS-Windows, run Emacs
|
||
as usual, and then attach GDB to it -- that will usually interrupt
|
||
whatever Emacs is doing and let you perform the steps described
|
||
below.)
|
||
|
||
Then try stepping with 'step'. If Emacs is hung, the 'step' command
|
||
won't return. If it is looping, 'step' will return.
|
||
|
||
If this shows Emacs is hung in a system call, stop it again and
|
||
examine the arguments of the call. If you report the bug, it is very
|
||
important to state exactly where in the source the system call is, and
|
||
what the arguments are.
|
||
|
||
If Emacs is in an infinite loop, try to determine where the loop
|
||
starts and ends. The easiest way to do this is to use the GDB command
|
||
'finish'. Each time you use it, Emacs resumes execution until it
|
||
exits one stack frame. Keep typing 'finish' until it doesn't
|
||
return--that means the infinite loop is in the stack frame which you
|
||
just tried to finish.
|
||
|
||
Stop Emacs again, and use 'finish' repeatedly again until you get back
|
||
to that frame. Then use 'next' to step through that frame. By
|
||
stepping, you will see where the loop starts and ends. Also, examine
|
||
the data being used in the loop and try to determine why the loop does
|
||
not exit when it should.
|
||
|
||
On GNU and Unix systems, you can also try sending Emacs SIGUSR2,
|
||
which, if 'debug-on-event' has its default value, will cause Emacs to
|
||
attempt to break out of its current loop and enter the Lisp
|
||
debugger. (See the node "Debugging" in the ELisp manual for the
|
||
details about the Lisp debugger.) This feature is useful when a
|
||
C-level debugger is not conveniently available.
|
||
|
||
** If certain operations in Emacs are slower than they used to be, here
|
||
is some advice for how to find out why.
|
||
|
||
Stop Emacs repeatedly during the slow operation, and make a backtrace
|
||
each time. Compare the backtraces looking for a pattern--a specific
|
||
function that shows up more often than you'd expect.
|
||
|
||
If you don't see a pattern in the C backtraces, get some Lisp
|
||
backtrace information by typing "xbacktrace" or by looking at Ffuncall
|
||
frames (see above), and again look for a pattern.
|
||
|
||
When using X, you can stop Emacs at any time by typing C-z at GDB.
|
||
When not using X, you can do this with C-g. On non-Unix platforms,
|
||
such as MS-DOS, you might need to press C-BREAK instead.
|
||
|
||
** If GDB does not run and your debuggers can't load Emacs.
|
||
|
||
On some systems, no debugger can load Emacs with a symbol table,
|
||
perhaps because they all have fixed limits on the number of symbols
|
||
and Emacs exceeds the limits. Here is a method that can be used
|
||
in such an extremity. Do
|
||
|
||
nm -n temacs > nmout
|
||
strip temacs
|
||
adb temacs
|
||
0xd:i
|
||
0xe:i
|
||
14:i
|
||
17:i
|
||
:r -l loadup (or whatever)
|
||
|
||
It is necessary to refer to the file 'nmout' to convert
|
||
numeric addresses into symbols and vice versa.
|
||
|
||
It is useful to be running under a window system.
|
||
Then, if Emacs becomes hopelessly wedged, you can create another
|
||
window to do kill -9 in. kill -ILL is often useful too, since that
|
||
may make Emacs dump core or return to adb.
|
||
|
||
** Debugging incorrect screen updating on a text terminal.
|
||
|
||
To debug Emacs problems that update the screen wrong, it is useful
|
||
to have a record of what input you typed and what Emacs sent to the
|
||
screen. To make these records, do
|
||
|
||
(open-dribble-file "~/.dribble")
|
||
(open-termscript "~/.termscript")
|
||
|
||
The dribble file contains all characters read by Emacs from the
|
||
terminal, and the termscript file contains all characters it sent to
|
||
the terminal. The use of the directory '~/' prevents interference
|
||
with any other user.
|
||
|
||
If you have irreproducible display problems, put those two expressions
|
||
in your ~/.emacs file. When the problem happens, exit the Emacs that
|
||
you were running, kill it, and rename the two files. Then you can start
|
||
another Emacs without clobbering those files, and use it to examine them.
|
||
|
||
An easy way to see if too much text is being redrawn on a terminal is to
|
||
evaluate '(setq inverse-video t)' before you try the operation you think
|
||
will cause too much redrawing. This doesn't refresh the screen, so only
|
||
newly drawn text is in inverse video.
|
||
|
||
** Debugging LessTif
|
||
|
||
If you encounter bugs whereby Emacs built with LessTif grabs all mouse
|
||
and keyboard events, or LessTif menus behave weirdly, it might be
|
||
helpful to set the 'DEBUGSOURCES' and 'DEBUG_FILE' environment
|
||
variables, so that one can see what LessTif was doing at this point.
|
||
For instance
|
||
|
||
export DEBUGSOURCES="RowColumn.c:MenuShell.c:MenuUtil.c"
|
||
export DEBUG_FILE=/usr/tmp/LESSTIF_TRACE
|
||
emacs &
|
||
|
||
causes LessTif to print traces from the three named source files to a
|
||
file in '/usr/tmp' (that file can get pretty large). The above should
|
||
be typed at the shell prompt before invoking Emacs, as shown by the
|
||
last line above.
|
||
|
||
Running GDB from another terminal could also help with such problems.
|
||
You can arrange for GDB to run on one machine, with the Emacs display
|
||
appearing on another. Then, when the bug happens, you can go back to
|
||
the machine where you started GDB and use the debugger from there.
|
||
|
||
** Debugging problems which happen in GC
|
||
|
||
The array 'last_marked' (defined on alloc.c) can be used to display up
|
||
to the 512 most-recent objects marked by the garbage collection process.
|
||
Whenever the garbage collector marks a Lisp object, it records the
|
||
pointer to that object in the 'last_marked' array, which is maintained
|
||
as a circular buffer. The variable 'last_marked_index' holds the
|
||
index into the 'last_marked' array one place beyond where the pointer
|
||
to the very last marked object is stored.
|
||
|
||
The single most important goal in debugging GC problems is to find the
|
||
Lisp data structure that got corrupted. This is not easy since GC
|
||
changes the tag bits and relocates strings which make it hard to look
|
||
at Lisp objects with commands such as 'pr'. It is sometimes necessary
|
||
to convert Lisp_Object variables into pointers to C struct's manually.
|
||
|
||
Use the 'last_marked' array and the source to reconstruct the sequence
|
||
that objects were marked. In general, you need to correlate the
|
||
values recorded in the 'last_marked' array with the corresponding
|
||
stack frames in the backtrace, beginning with the innermost frame.
|
||
Some subroutines of 'mark_object' are invoked recursively, others loop
|
||
over portions of the data structure and mark them as they go. By
|
||
looking at the code of those routines and comparing the frames in the
|
||
backtrace with the values in 'last_marked', you will be able to find
|
||
connections between the values in 'last_marked'. E.g., when GC finds
|
||
a cons cell, it recursively marks its car and its cdr. Similar things
|
||
happen with properties of symbols, elements of vectors, etc. Use
|
||
these connections to reconstruct the data structure that was being
|
||
marked, paying special attention to the strings and names of symbols
|
||
that you encounter: these strings and symbol names can be used to grep
|
||
the sources to find out what high-level symbols and global variables
|
||
are involved in the crash.
|
||
|
||
Once you discover the corrupted Lisp object or data structure, grep
|
||
the sources for its uses and try to figure out what could cause the
|
||
corruption. If looking at the sources doesn't help, you could try
|
||
setting a watchpoint on the corrupted data, and see what code modifies
|
||
it in some invalid way. (Obviously, this technique is only useful for
|
||
data that is modified only very rarely.)
|
||
|
||
It is also useful to look at the corrupted object or data structure in
|
||
a fresh Emacs session and compare its contents with a session that you
|
||
are debugging.
|
||
|
||
** Debugging the TTY (non-windowed) version
|
||
|
||
The most convenient method of debugging the character-terminal display
|
||
is to do that on a window system such as X. Begin by starting an
|
||
xterm window, then type these commands inside that window:
|
||
|
||
$ tty
|
||
$ echo $TERM
|
||
|
||
Let's say these commands print "/dev/ttyp4" and "xterm", respectively.
|
||
|
||
Now start Emacs (the normal, windowed-display session, i.e. without
|
||
the '-nw' option), and invoke "M-x gdb RET emacs RET" from there. Now
|
||
type these commands at GDB's prompt:
|
||
|
||
(gdb) set args -nw -t /dev/ttyp4
|
||
(gdb) set environment TERM xterm
|
||
(gdb) run
|
||
|
||
The debugged Emacs should now start in no-window mode with its display
|
||
directed to the xterm window you opened above.
|
||
|
||
Similar arrangement is possible on a character terminal by using the
|
||
'screen' package.
|
||
|
||
On MS-Windows, you can start Emacs in its own separate console by
|
||
setting the new-console option before running Emacs under GDB:
|
||
|
||
(gdb) set new-console 1
|
||
(gdb) run
|
||
|
||
** Running Emacs with undefined-behavior sanitization
|
||
|
||
Building Emacs with undefined-behavior sanitization can help find
|
||
several kinds of low-level problems in C code, including:
|
||
|
||
* Out-of-bounds access of many (but not all) arrays.
|
||
* Signed integer overflow, e.g., (INT_MAX + 1).
|
||
* Integer shifts by a negative or wider-than-word value.
|
||
* Misaligned pointers and pointer overflow.
|
||
* Loading a bool or enum value that is out of range for its type.
|
||
* Passing NULL to or returning NULL from a function requiring nonnull.
|
||
* Passing a size larger than the corresponding array to memcmp etc.
|
||
* Passing invalid values to some builtin functions, e.g., __builtin_clz (0).
|
||
* Reaching __builtin_unreachable calls (in Emacs, 'eassume' failure).
|
||
|
||
To use GCC's UndefinedBehaviorSanitizer, append '-fsanitize=undefined'
|
||
to CFLAGS, either when running 'configure' or running 'make'.
|
||
When supported, you can also specify 'bound-strict' and
|
||
'float-cast-overflow'. For example:
|
||
|
||
./configure \
|
||
CFLAGS='-O0 -g3 -fsanitize=undefined,bounds-strict,float-cast-overflow'
|
||
|
||
You may need to append '-static-libubsan' to CFLAGS if your version of
|
||
GCC is installed in an unusual location.
|
||
|
||
Clang's UB sanitizer can also be used, but has coverage problems.
|
||
You'll need '-fsanitize=undefined -fno-sanitize=pointer-overflow' to
|
||
suppress misguided warnings about adding zero to a null pointer,
|
||
although this also suppresses any valid pointer overflow warnings.
|
||
|
||
When using GDB to debug an executable with undefined-behavior
|
||
sanitization, the GDB command:
|
||
|
||
(gdb) rbreak ^__ubsan_handle_
|
||
|
||
will let you gain control when an error is detected and before
|
||
UndefinedBehaviorSanitizer outputs to stderr or terminates the
|
||
program.
|
||
|
||
** Running Emacs with address sanitization
|
||
|
||
Building Emacs with address sanitization can help debug memory-use
|
||
problems, such as freeing the same object twice. To use
|
||
AddressSanitizer with GCC and similar compilers, append
|
||
'-fsanitize=address' to CFLAGS, either when running 'configure' or
|
||
running 'make'. Configure, build and run Emacs with
|
||
ASAN_OPTIONS='detect_leaks=0' in the environment to suppress
|
||
diagnostics of minor memory leaks in Emacs. For example:
|
||
|
||
export ASAN_OPTIONS='detect_leaks=0'
|
||
./configure CFLAGS='-O0 -g3 -fsanitize=address'
|
||
make
|
||
src/emacs
|
||
|
||
You may need to append '-static-libasan' to CFLAGS if your version of
|
||
GCC is installed in an unusual location.
|
||
|
||
When using GDB to debug an executable with address sanitization, the
|
||
GDB command:
|
||
|
||
(gdb) rbreak ^__asan_report_
|
||
|
||
will let you gain control when an error is detected and before
|
||
AddressSanitizer outputs to stderr or terminates the program.
|
||
|
||
Address sanitization is incompatible with undefined-behavior
|
||
sanitization, unfortunately. Address sanitization is also
|
||
incompatible with the --with-dumping=unexec option of 'configure'.
|
||
|
||
** Running Emacs under Valgrind
|
||
|
||
Valgrind <http://valgrind.org/> is free software that can be useful
|
||
when debugging low-level Emacs problems. Unlike GCC sanitizers,
|
||
Valgrind does not need you to compile Emacs with special debugging
|
||
flags, so it can be helpful in investigating problems that vanish when
|
||
Emacs is recompiled with debugging enabled. However, by default
|
||
Valgrind generates many false alarms with Emacs, and you will need to
|
||
maintain a suppressions file to suppress these false alarms and use
|
||
Valgrind effectively. For example, you might invoke Valgrind this
|
||
way:
|
||
|
||
valgrind --suppressions=valgrind.supp ./emacs
|
||
|
||
where valgrind.supp contains groups of lines like the following, which
|
||
suppresses some Valgrind false alarms during Emacs garbage collection:
|
||
|
||
{
|
||
Fgarbage_collect Cond - conservative garbage collection
|
||
Memcheck:Cond
|
||
...
|
||
fun:Fgarbage_collect
|
||
}
|
||
|
||
Unfortunately Valgrind suppression files tend to be system-dependent,
|
||
so you will need to keep one around that matches your system.
|
||
|
||
** How to recover buffer contents from an Emacs core dump file
|
||
|
||
The file etc/emacs-buffer.gdb defines a set of GDB commands for
|
||
recovering the contents of Emacs buffers from a core dump file. You
|
||
might also find those commands useful for displaying the list of
|
||
buffers in human-readable format from within the debugger.
|
||
|
||
|
||
This file is part of GNU Emacs.
|
||
|
||
GNU Emacs is free software: you can redistribute it and/or modify
|
||
it under the terms of the GNU General Public License as published by
|
||
the Free Software Foundation, either version 3 of the License, or
|
||
(at your option) any later version.
|
||
|
||
GNU Emacs is distributed in the hope that it will be useful,
|
||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||
GNU General Public License for more details.
|
||
|
||
You should have received a copy of the GNU General Public License
|
||
along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
|
||
|
||
|
||
Local variables:
|
||
mode: outline
|
||
paragraph-separate: "[ ]*$"
|
||
end:
|