1
0
mirror of https://git.savannah.gnu.org/git/emacs.git synced 2024-11-29 07:58:28 +00:00
emacs/doc/misc/eww.texi
Jim Porter 4b0f5cdb01 Add 'eww-readable-urls'
* lisp/net/eww.el (eww-readable-urls): New option.
(eww-default-readable-p): New function...
(eww-display-html): ... use it.

* test/lisp/net/eww-tests.el (eww-test/readable/default-readable): New
test.

* doc/misc/eww.texi (Basics): Document 'eww-readable-urls'.

* etc/NEWS: Announce this change (bug#68254).
2024-03-23 10:17:06 -07:00

522 lines
20 KiB
Plaintext

\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename ../../info/eww.info
@settitle Emacs Web Wowser
@include docstyle.texi
@c %**end of header
@copying
This file documents the GNU Emacs Web Wowser (EWW) package.
Copyright @copyright{} 2014--2024 Free Software Foundation, Inc.
@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License.''
(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying
@dircategory Emacs misc features
@direntry
* EWW: (eww). Emacs Web Wowser
@end direntry
@finalout
@titlepage
@title Emacs Web Wowser (EWW)
@subtitle A web browser for GNU Emacs.
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@contents
@ifnottex
@node Top
@top EWW
@insertcopying
@end ifnottex
@menu
* Overview::
* Basics::
* Advanced::
* Command Line::
Appendices
* History and Acknowledgments::
* GNU Free Documentation License:: The license for this documentation.
Indices
* Key Index::
* Variable Index::
* Lisp Function Index::
* Concept Index::
@end menu
@node Overview
@chapter Overview
@dfn{EWW}, the Emacs Web Wowser, is a web browser for GNU Emacs that
provides a simple, no-frills experience that focuses on readability.
It loads, parses, and displays web pages using @dfn{shr.el}. It can
display images inline, if Emacs was built with image support, but
there is no support for CSS or JavaScript.
To use EWW, you need to use an Emacs built with @code{libxml2}
support.
@node Basics
@chapter Basic Usage
@findex eww
@findex eww-open-file
@vindex eww-search-prefix
@cindex eww
@cindex Web Browsing
You can open a URL or search the web with the command @kbd{M-x eww}.
If the input doesn't look like a URL or domain name the web will be
searched via @code{eww-search-prefix}. The default search engine is
@url{https://duckduckgo.com, DuckDuckGo}. If you want to open a file
either prefix the file name with @code{file://} or use the command
@kbd{M-x eww-open-file}.
If you invoke @code{eww} or @code{eww-open-file} with a prefix
argument, as in @w{@kbd{C-u M-x eww}}, they will create a new EWW
buffer instead of reusing the default one, which is normally called
@file{*eww*}.
@findex eww-quit
@findex eww-reload
@findex eww-copy-page-url
@findex shr-maybe-probe-and-copy-url
@kindex q
@kindex w
@kindex g
If loading the URL was successful the buffer @file{*eww*} is opened
and the web page is rendered in it. You can leave EWW by pressing
@kbd{q} or exit the browser by calling @kbd{eww-quit}. To reload the
web page hit @kbd{g} (@code{eww-reload}).
Pressing @kbd{w} when point is on a link will call
@code{shr-maybe-probe-and-copy-url}, which copies this link's
@acronym{URL} to the kill ring. If point is not on a link, pressing
@kbd{w} calls @code{eww-copy-page-url}, which will copy the current
page's URL to the kill ring instead.
@findex eww-copy-alternate-url
@kindex A
The @kbd{A} command (@code{eww-copy-alternate-url}) copies the URL
of an alternate link on the current page into the kill ring. If the
page specifies multiple alternate links, this command prompts for one
of them in the minibuffer, with completion. Alternate links are
references that an @acronym{HTML} page may include to point to other
documents that act as its alternative representations. Notably,
@acronym{HTML} pages can use alternate links to point to their
translated versions and to @acronym{RSS} feeds. Alternate links
appear in the @samp{<head>} section of @acronym{HTML} pages as
@samp{<link>} elements with @samp{rel} attribute equal to
@samp{``alternate''}; they are part of the page's metadata and are not
visible in its rendered content.
@findex eww-open-in-new-buffer
@kindex M-RET
The @kbd{M-@key{RET}} command (@code{eww-open-in-new-buffer}) opens
the URL at point in a new EWW buffer, akin to opening a link in a new
``tab'' in other browsers. If invoked with prefix argument, the
command will not make the new buffer the current one. When
@code{global-tab-line-mode} is enabled, this buffer is displayed in
the tab on the window tab line. When @code{tab-bar-mode} is enabled,
a new tab is created on the frame tab bar.
@findex eww-readable
@kindex R
The @kbd{R} command (@code{eww-readable}) will attempt to determine
which part of the document contains the ``readable'' text, and will
only display this part. This usually gets rid of menus and the like.
When called interactively, this command toggles the display of the
readable parts. With a positive prefix argument, this command always
displays the readable parts, and with a zero or negative prefix, it
always displays the full page.
@vindex eww-readable-urls
If you want EWW to render a certain page in ``readable'' mode by
default, you can add a regular expression matching its URL to
@code{eww-readable-urls}. Each entry can either be a regular expression
in string form or a cons cell of the form
@w{@code{(@var{regexp} . @var{readability})}}. If @var{readability} is
non-@code{nil}, this behaves the same as the string form; otherwise,
URLs matching @var{regexp} will never be displayed in readable mode by
default. For example, you can use this to make all pages default to
readable mode, except for a few outliers:
@example
(setq eww-readable-urls '(("https://example\\.com/" . nil)
".*"))
@end example
@findex eww-toggle-fonts
@vindex shr-use-fonts
@kindex F
The @kbd{F} command (@code{eww-toggle-fonts}) toggles whether to use
variable-pitch fonts or not. This sets the @code{shr-use-fonts} variable.
@findex eww-toggle-colors
@vindex shr-use-colors
@kindex M-C
The @kbd{M-C} command (@code{eww-toggle-colors}) toggles whether to use
HTML-specified colors or not. This sets the @code{shr-use-colors} variable.
@findex eww-toggle-images
@vindex shr-inhibit-images
@kindex M-I
@cindex Image Display
The @kbd{M-I} command (@code{eww-toggle-images}, capital letter i)
toggles whether to display images or not. This also sets the
@code{shr-inhibit-images} variable.
@findex eww-download
@vindex eww-download-directory
@kindex d
@cindex Download
A URL can be downloaded with @kbd{d} (@code{eww-download}). This
will download the link under point if there is one, or else the URL of
the current page. The file will be written to the directory specified
by @code{eww-download-directory} (default: @file{~/Downloads/}, if it
exists; otherwise as specified by the @samp{DOWNLOAD} @acronym{XDG}
directory)).
@findex eww-back-url
@findex eww-forward-url
@findex eww-list-histories
@kindex r
@kindex l
@kindex H
@cindex History
EWW remembers the URLs you have visited to allow you to go back and
forth between them. By pressing @kbd{l} (@code{eww-back-url}) you go
to the previous URL@. You can go forward again with @kbd{r}
(@code{eww-forward-url}). If you want an overview of your browsing
history press @kbd{H} (@code{eww-list-histories}) to open the history
buffer @file{*eww history*}. The history is lost when EWW is quit.
If you want to remember websites you can use bookmarks.
@vindex eww-before-browse-history-function
By default, when browsing to a new page from a ``historical'' one
(i.e.@: a page loaded by navigating back via @code{eww-back-url}), EWW
will first delete any history entries newer than the current page. This
is the same behavior as most other web browsers. You can change this by
customizing @code{eww-before-browse-history-function} to another value.
For example, setting it to @code{ignore} will preserve the existing
history entries and simply prepend the new page to the history list.
@vindex eww-history-limit
Along with the URLs visited, EWW also remembers both the rendered
page (as it appears in the buffer) and its source. This can take a
considerable amount of memory, so EWW discards the history entries to
keep their number within a set limit, as specified by
@code{eww-history-limit}; the default being 50. This variable could
also be set to @code{nil} to allow for the history list to grow
indefinitely.
@cindex PDF
PDFs are viewed inline, by default, with @code{doc-view-mode}, but
this can be customized by using the mailcap (@pxref{mailcap,,,
emacs-mime, Emacs MIME Manual})
mechanism, in particular @code{mailcap-mime-data}.
@findex eww-add-bookmark
@findex eww-list-bookmarks
@kindex b
@kindex B
@cindex Bookmarks
EWW allows you to @dfn{bookmark} URLs. Simply hit @kbd{b}
(@code{eww-add-bookmark}) to store a bookmark for the current website.
You can view stored bookmarks with @kbd{B}
(@code{eww-list-bookmarks}). This will open the bookmark buffer
@file{*eww bookmarks*}.
@findex eww-switch-to-buffer
@findex eww-list-buffers
@kindex s
@kindex S
@cindex Multiple Buffers
To get summary of currently opened EWW buffers, press @kbd{S}
(@code{eww-list-buffers}). The @file{*eww buffers*} buffer allows you
to quickly kill, flip through and switch to specific EWW buffer. To
switch EWW buffers through a minibuffer prompt, press @kbd{s}
(@code{eww-switch-to-buffer}).
@findex eww-browse-with-external-browser
@vindex browse-url-secondary-browser-function
@vindex eww-use-external-browser-for-content-type
@kindex &
@cindex External Browser
Although EWW and shr.el do their best to render webpages in GNU
Emacs some websites use features which can not be properly represented
or are not implemented (e.g., JavaScript). If you have trouble
viewing a website with EWW then hit @kbd{&}
(@code{eww-browse-with-external-browser}) inside the EWW buffer to
open the website in the external browser specified by
@code{browse-url-secondary-browser-function}. Some content types,
such as video or audio content, do not make sense to display in GNU
Emacs at all. You can tell EWW to open specific content automatically
in an external browser by customizing
@code{eww-use-external-browser-for-content-type}.
@node Advanced
@chapter Advanced
@findex eww-retrieve-command
EWW normally uses @code{url-retrieve} to fetch the @acronym{HTML}
before rendering it, and @code{url-retrieve-synchronously} when
the value of @code{eww-retrieve-command} is @code{sync}. It can
sometimes be convenient to use an external program to do this, and
@code{eww-retrieve-command} should then be a list that specifies
a command and the parameters. For instance, to use the Chromium
browser, you could say something like this:
@lisp
(setq eww-retrieve-command
'("chromium" "--headless" "--dump-dom"))
@end lisp
The command should return the @acronym{HTML} on standard output, and
the data should use @acronym{UTF-8} as the charset.
@findex eww-view-source
@kindex v
@cindex Viewing Source
You can view the source of a website with @kbd{v}
(@code{eww-view-source}). This will open a new buffer
@file{*eww-source*} and insert the source. The buffer will be set to
@code{html-mode} if available.
@findex url-cookie-list
@kindex C
@cindex Cookies
EWW handles cookies through the @ref{Top, url package, ,url}
package. You can list existing cookies with @kbd{C}
(@code{url-cookie-list}). For details about the Cookie handling
@xref{Cookies,,,url}.
@vindex shr-cookie-policy
Many @acronym{HTML} pages have images embedded in them, and EWW will
download most of these by default. When fetching images, cookies can
be sent and received, and these can be used to track users. To
control when to send cookies when retrieving these images, the
@code{shr-cookie-policy} variable can be used. The default value,
@code{same-origin}, means that EWW will only send cookies when
fetching images that originate from the same source as the
@acronym{HTML} page. @code{nil} means ``never send cookies when
retrieving these images'' and @code{t} means ``always send cookies
when retrieving these images''.
@vindex eww-use-browse-url
When following links in EWW, @acronym{URL}s that match the
@code{eww-use-browse-url} regexp will be passed to @code{browse-url}
instead of EWW handling them itself. The action can be further
customized by altering @code{browse-url-handlers}.
@vindex eww-header-line-format
@cindex Header
The header line of the EWW buffer can be changed by customizing
@code{eww-header-line-format}. The format replaces @code{%t} with the
title of the website and @code{%u} with the URL.
@findex eww-toggle-paragraph-direction
@cindex paragraph direction
The @kbd{D} command (@code{eww-toggle-paragraph-direction}) toggles
the paragraphs direction between left-to-right and right-to-left
text. This can be useful on web pages that display right-to-left test
(like Arabic and Hebrew), but where the web pages don't explicitly
state the directionality.
@c @vindex shr-bullet
@c @vindex shr-hr-line
@c @vindex eww-form-checkbox-selected-symbol
@c @vindex eww-form-checkbox-symbol
@c EWW and the rendering engine shr.el use ASCII characters to
@c represent some graphical elements, such as bullet points
@c (@code{shr-bullet}), check boxes
@c (@code{eww-form-checkbox-selected-symbol} and
@c @code{eww-form-checkbox-symbol}), and horizontal rules
@c @code{shr-hr-line}). Depending on your fonts these characters can be
@c replaced by Unicode glyphs to achieve better looking results.
@vindex shr-max-image-proportion
@vindex shr-blocked-images
@vindex shr-allowed-images
@cindex Image Display
Loading random images from the web can be problematic due to their
size or content. By customizing @code{shr-max-image-proportion} you
can set the maximal image proportion in relation to the window they
are displayed in. E.g., 0.7 means an image is allowed to take up 70%
of the width and height. If Emacs supports image scaling, then larger
images are scaled down. You can block specific images completely by
customizing @code{shr-blocked-images}.
@vindex shr-inhibit-images
You can control image display by customizing
@code{shr-inhibit-images}. If this variable is @code{nil}, display
the ``ALT'' text of images instead.
@vindex shr-color-visible-distance-min
@vindex shr-color-visible-luminance-min
@cindex Contrast
EWW (or rather its HTML renderer @code{shr}) uses the colors declared
in the HTML page, but adjusts them if needed to keep a certain minimum
contrast. If that is still too low for you, you can customize the
variables @code{shr-color-visible-distance-min} and
@code{shr-color-visible-luminance-min} to get a better contrast.
@vindex shr-max-width
@vindex shr-width
By default, the max width used when rendering is 120 characters, but
this can be adjusted by changing the @code{shr-max-width} variable.
If a specified width is preferred no matter what the width of the
window is, @code{shr-width} can be set. If both variables are
@code{nil}, the window width will always be used.
@vindex shr-discard-aria-hidden
@cindex @code{aria-hidden}, HTML attribute
The HTML attribute @code{aria-hidden} is meant to tell screen
readers to ignore a tag's contents. You can customize the variable
@code{shr-discard-aria-hidden} to tell @code{shr} to ignore such tags.
This can be useful when using a screen reader on the output of
@code{shr} (e.g., on EWW buffer text). It can be useful even when not
using a screen reader, since web authors often put this attribute on
non-essential decorative elements.
@cindex Desktop Support
@cindex Saving Sessions
In addition to maintaining the history at run-time, EWW will also
save the partial state of its buffers (the URIs and the titles of the
pages visited) in the desktop file if one is used. @xref{Saving Emacs
Sessions,,, emacs, The GNU Emacs Manual}.
@vindex eww-desktop-remove-duplicates
EWW history may sensibly contain multiple entries for the same page
URI@. At run-time, these entries may still have different associated
point positions or the actual Web page contents.
The latter, however, tend to be overly large to preserve in the
desktop file, so they get omitted, thus rendering the respective
entries entirely equivalent. By default, such duplicate entries are
not saved. Setting @code{eww-desktop-remove-duplicates} to @code{nil}
will force EWW to save them anyway.
@vindex eww-restore-desktop
Restoring EWW buffers' contents may prove to take too long to
finish. When the @code{eww-restore-desktop} variable is set to
@code{nil} (the default), EWW will not try to reload the last visited
Web page when the buffer is restored from the desktop file, thus
allowing for faster Emacs start-up times. When set to @code{t},
restoring the buffers will also initiate the reloading of such pages.
@vindex eww-restore-reload-prompt
The EWW buffer restored from the desktop file but not yet reloaded
will contain a prompt, as specified by the
@code{eww-restore-reload-prompt} variable. The value of this variable
will be passed through @code{substitute-command-keys} upon each use,
thus allowing for the use of the usual substitutions, such as
@code{\[eww-reload]} for the current key binding of the
@code{eww-reload} command.
@vindex eww-auto-rename-buffer
If the @code{eww-auto-rename-buffer} user option is non-@code{nil},
EWW buffers will be renamed after rendering a document. If this is
@code{title}, rename based on the title of the document. If this is
@code{url}, rename based on the @acronym{URL} of the document. This
can also be a user-defined function, which is called with no
parameters in the EWW buffer, and should return a string.
@cindex utm
@vindex eww-url-transformers
EWW runs the URLs through @code{eww-url-transformers} before using
them. This user option is a list of functions, where each function is
called with the URL as the parameter, and should return the (possibly)
transformed URL. By default, this variable contains
@code{eww-remove-tracking}, which removes the common @samp{utm_}
trackers from links.
@cindex video
@vindex shr-use-xwidgets-for-media
If Emacs has been built with xwidget support, EWW can use that to
display @samp{<video>} elements. However, this support is still
experimental, and on some systems doesn't work (and even worse) may
crash your Emacs, so this feature is off by default. If you wish to
switch it on, set @code{shr-use-xwidgets-for-media} to a
non-@code{nil} value.
@node Command Line
@chapter Command Line Usage
It can be convenient to start eww directly from the command line. The
@code{eww-browse} function can be used for that:
@example
emacs -f eww-browse https://gnu.org
@end example
This also allows registering Emacs as a @acronym{MIME} handler for the
@samp{"text/x-uri"} media type. How to do that varies between
systems, but typically you'd register the handler to call @samp{"emacs
-f eww-browse %u"}.
@node History and Acknowledgments
@appendix History and Acknowledgments
EWW was originally written by Lars Ingebrigtsen, known for his work on
Gnus. He started writing an Emacs HTML rendering library,
@code{shr.el}, to read blogs in Gnus. He eventually added a web
browser front end and HTML form support. Which resulted in EWW, the
Emacs Web Wowser. EWW was announced on 16 June 2013:
@url{https://lars.ingebrigtsen.no/2013/06/16/eww/}.
EWW was then moved from the Gnus repository to GNU Emacs and several
developers started contributing to it as well.
@node GNU Free Documentation License
@chapter GNU Free Documentation License
@include doclicense.texi
@node Key Index
@unnumbered Key Index
@printindex ky
@node Variable Index
@unnumbered Variable Index
@vindex eww-after-render-hook
After eww has rendered the data in the buffer,
@code{eww-after-render-hook} is called. It can be used to alter the
contents, for instance.
@printindex vr
@node Lisp Function Index
@unnumbered Function Index
@printindex fn
@node Concept Index
@unnumbered Concept Index
@printindex cp
@bye