As of FreeBSD 6, devices can only be opened through devfs. These device
nodes don't have major and minor numbers anymore. The st_rdev field in
struct stat is simply a copy of st_ino.
Simply display device numbers as hexadecimal, using "%#jx". This is
allowed by POSIX, since it explicitly states things like the following
(example taken from ls(1)):
"If the file is a character special or block special file, the
size of the file may be replaced with implementation-defined
information associated with the device in question."
This makes the output of these commands more compact. For example, ls(1)
now uses approximately four fewer columns. While there, simplify the
column length calculation from ls(1) by calling snprintf() with a NULL
buffer.
Don't be afraid; if needed one can still obtain individual major/minor
numbers using stat(1).
Because sh executes commands in subshell environments without forking in
more and more cases (particularly from 8.0 on), it makes sense to describe
subshell environments more precisely using ideas from POSIX, together with
some FreeBSD-specific items.
In particular, the hash and times builtins may not behave as if their state
is copied for a subshell environment while leaving the parent shell
environment unchanged.
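As a minimal sketch of the POSIX rule this builds on: changes made in a subshell environment are not visible in the parent, whether or not the shell forks to implement it. The variable names are purely illustrative.

```sh
counter=1
( counter=2; echo "inside subshell: $counter" )
echo "after subshell: $counter"

# A command substitution is a subshell environment as well.
inner=$(counter=3; echo "$counter")
echo "substitution saw $inner, parent still has $counter"
```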
to re-establishment of 64bit arithmetic, but is committed separately, to
not obscure that conversion. This commit does not change the observed
behaviour of expr in any way. Style will be fixed in a follow-up commit.
again. This brings back the behaviour of expr in FreeBSD-4, which had been
reverted due to an assumed incompatibility with POSIX.1 for FreeBSD-5.
This issue has been discussed in the freebsd-standards list, and the
consensus was that POSIX.1 is in fact not violated by this extension,
since it affects only cases of POSIX undefined behaviour (overflow of
signed long).
Other operating systems upgraded their versions of expr to support the
64-bit range after it had initially been brought to FreeBSD. They have
since used it for a decade without problems.
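A small sketch of what the wider range means in practice, assuming an expr(1) that computes in 64 bits (the operands are arbitrary values just beyond the signed 32-bit limit):

```sh
# Both results would overflow a signed 32-bit long.
big_sum=$(expr 2147483647 + 1)
big_product=$(expr 4294967296 \* 4)
echo "$big_sum $big_product"
```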
The -e option is retained, but it will only select less strict checking
of numeric parameters (leading white space and a leading "+" are allowed
and skipped; an empty string is considered to represent 0 in numeric
context).
The call of check_utility_compat() as a means of establishing backwards
compatibility with FreeBSD-4 is considered obsolete, but preserved in
this commit. It is expected to be removed in a later revision of this
file.
Reviewed by: bde, das, jilles
MFC after: 2 month (those parts that do not violate POLA)
* Shell patterns are also used for ${var#pat} and the like.
* An '!' by itself will not trigger pathname generation so do not call it a
meta-character, even though it has a special meaning directly after an
'['.
* Character ranges are locale-dependent.
* A '^' will complement a character class like '!' but is non-standard.
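The points above can be sketched as follows. The file name and the is_number helper are hypothetical and only serve the illustration.

```sh
file=archive.tar.gz
echo "${file#*.}"     # shortest prefix matching '*.' removed
echo "${file##*.}"    # longest prefix matching '*.' removed
echo "${file%.*}"     # shortest suffix matching '.*' removed

# '!' complements a bracket expression only directly after '['.
is_number() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
is_number 2024 && echo "2024 is a number"
is_number 20x4 || echo "20x4 is not a number"
```

Using '!' rather than '^' keeps the pattern within what POSIX specifies.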
MFC after: 1 week
POSIX requires a -h option to sh and set, to locate and remember utilities
invoked by functions as they are defined. Given that this
locate-and-remember process is optional elsewhere, it seems safe enough to
make this option do nothing.
POSIX does not specify a long name for this option. Follow ksh in calling it
"trackall".
Replacing ;; with the new control operator ;& will cause the next list to be
executed as well without checking its pattern, continuing until a list ends
with ;; or until the end of the case statement. This is like omitting
"break" in a C "switch" statement.
The sequence ;& was formerly invalid.
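A minimal sketch of the fall-through, assuming a shell that accepts ;& (this one after the change, or ksh and recent bash). The enable_logging function and its level names are hypothetical.

```sh
# Each level also enables everything listed below it.
enable_logging() {
    enabled=
    case $1 in
        debug) enabled="$enabled debug" ;&
        info)  enabled="$enabled info" ;&
        error) enabled="$enabled error" ;;
        *)     enabled=" none" ;;
    esac
    echo "enabled:$enabled"
}
enable_logging info
enable_logging debug
```

With "info" the lists for info and error run; the ;; after the error list stops the fall-through before the default pattern.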
This feature is proposed for the next POSIX issue in Austin Group issue
#449.
The eval special builtin now runs the code with EV_EXIT if it was run
with EV_EXIT itself.
In particular, this eliminates one fork when a command substitution contains
an eval command that ends with an external program or a subshell.
This is similar to what r220978 did for functions.
Have mkbuiltins write the prototypes for the *cmd functions to builtins.h
instead of builtins.c and include builtins.h in more .c files instead of
duplicating prototypes for *cmd functions in other headers.
In optimized command substitution, save and restore any variables changed by
expansions (${var=value} and $((var=assigned))), instead of trying to
determine if an expansion may cause such changes.
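The visible effect should be the same as with a forked command substitution. A small sketch with illustrative variable names:

```sh
unset color
# ${color=blue} assigns a default, but only inside the substitution's
# subshell environment.
picked=$(echo "${color=blue}")
echo "picked: $picked"
echo "parent: ${color-still unset}"

n=1
doubled=$(echo "$((n = n * 2))")
echo "doubled: $doubled, parent n: $n"
```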
If $! is referenced in optimized command substitution, do not cause jobs to
be remembered longer.
This fixes $(jobs $!) again, simplifies the man page and shortens the code.
When I added UTF-8 support in r221646, the LC_COLLATE-based ordering broke
because of sign extension of char.
Because of libc restrictions, this does not work for UTF-8. For UTF-8
locales, ranges always use character code order.
In most cases, login shells are started from the home directory, but not in
all, such as xterm -ls.
This commit depends on r222957 for read_profile() performing parameter
expansion.
PR: bin/50569
The function name expandstr() and the general idea of doing this kind of
expansion by treating the text as a here document without end marker are
from dash.
All variants of parameter expansion and arithmetic expansion also work (the
latter is not required by POSIX but it does not take extra code and many
other shells also allow it).
Command substitution is prevented because I think it causes too much code to
be re-entered (for example creating an unbounded recursion of trace lines).
Unfortunately, our LINENO is somewhat crude, otherwise PS4='$LINENO+ ' would
be quite useful.
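A sketch of parameter expansion in PS4, assuming the sh found in PATH expands it as described here. The variable name "stage" and the trace_demo wrapper are hypothetical; the child shell keeps set -x away from the caller.

```sh
trace_demo() {
    sh -c '
        stage=build
        PS4="[\$stage] "
        set -x
        : compiling
        stage=test
        : running
    ' 2>&1
}
trace_demo
```

Each trace line is prefixed with the value "stage" had when the command was traced.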
The "exp" builtin is undocumented, non-standard and not very useful.
If exp's return value is not used, something like
VAR=$(exp EXPRESSION)
is equivalent to
VAR=$((EXPRESSION))
except that errors in the expression are fatal and quoting special
characters is not needed in the latter case.
If exp's return value is used, something like
if exp EXPRESSION >/dev/null
can be replaced by
if [ $((EXPRESSION)) -ne 0 ]
with similar differences.
The exp-run showed that "let" is close enough to bash's and ksh's builtin
that removing it would break a few ports. Therefore, "let" remains in 9.x.
PR: bin/104432
Exp-run done by: pav (with some other sh(1) changes)
CDPATH should be ignored not only for pathnames starting with '/' but also
for pathnames whose first component is '.' or '..'.
The man page already describes this behaviour.
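A sketch of the documented behaviour, using a scratch directory tree whose names are made up for the illustration:

```sh
base=$(mktemp -d)
mkdir -p "$base/work/sub" "$base/elsewhere/sub"
CDPATH=$base/elsewhere

# First component is '.', so CDPATH is not consulted.
cd "$base/work" && cd ./sub && dot_target=$PWD
# A plain name is looked up through CDPATH first.
cd "$base/work" && cd sub >/dev/null && plain_target=$PWD

cd / && rm -rf "$base"
unset CDPATH
echo "cd ./sub went to ${dot_target#"$base"/}"
echo "cd sub   went to ${plain_target#"$base"/}"
```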
If IFS is null, unquoted $@/$* should still expand to separate words.
This differs from quoted $@ (which does not depend on IFS) in that pathname
generation is performed and empty words are removed.
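A minimal sketch of the difference; count_args is a hypothetical helper that reports how many words it received.

```sh
count_args() { echo "$#"; }

set -- one "" "two words"
IFS=
unquoted=$(count_args $@)     # separate words, the empty one is removed
quoted=$(count_args "$@")     # every positional parameter is kept
unset IFS                     # back to default field splitting

echo "unquoted: $unquoted, quoted: $quoted"
```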
If the length of a directory in PATH together with the given filename
exceeded FILENAME_MAX (which may happen even for pathnames that work), a
static buffer was overflowed.
The static buffer is unnecessary; we can use the stalloc() stack.
Obtained from: NetBSD
MFC after: 1 week
This reflects failure to determine the pathname of the new directory in the
exit status (1). Normally, cd returns successfully if it did chdir() and the
call was successful.
In POSIX, -e only has meaning with -P; because our -L is not entirely
compliant and may fall back to -P mode, -e has some effect with -L as well.
This is sometimes used with eval or old-style command substitution, and most
shells other than ash derivatives allow it.
It can also be used with scripts that violate POSIX's requirement on the
application that they end in a newline (scripts must be text files except
that line length is unlimited).
Example:
v=`cat <<EOF
foo
EOF`
echo $v
This commit does not add support for the similar construct with new-style
command substitution, like
v=$(cat <<EOF
foo
EOF)
This continues to require a newline after the terminator.
Because we have no iconv in base, support for other charsets is not
possible.
Note that \u/\U are processed using the locale that was active when the
shell started. This is necessary to avoid behaviour that depends on the
parse/execute split (for example when placing braces around an entire
script). Therefore, UTF-8 encoding is implemented manually.
?, [...] patterns match codepoints instead of bytes. They do not match
invalid sequences. [...] patterns must not contain invalid sequences;
otherwise they will not match anything. This is so that ${var#?} removes the
first codepoint, not the first byte, without putting UTF-8 knowledge into
the ${var#pattern} code. However, * continues to match any string and an
invalid sequence matches an identical invalid sequence. (This differs from
fnmatch(3).)
This ensures that mbrtowc(3) can be used directly once it has been verified
that there is no CTL* byte. Dealing with a CTLESC byte within a multibyte
character would be complicated.
The new values do occur in iso-8859-* encodings. This decreases efficiency
slightly but should not affect correctness.
Caveat: Updating across this change and rebuilding without cleaning may
yield a subtly broken sh binary. By default, make buildworld will clean and
avoid problems.
A string between $' and ' may contain backslash escape sequences similar to
the ones in a C string constant (except that a single-quote must be escaped
and a double-quote need not be). Details are in the sh(1) man page.
This construct is useful to include unprintable characters, tabs and
newlines in strings; while this can be done with a command substitution
containing a printf command, that needs ugly workarounds if the result is to
end with a newline as command substitution removes all trailing newlines.
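For comparison, the workaround referred to above is usually written along these lines (a sketch; the sentinel character "x" is an arbitrary choice):

```sh
# Command substitution strips every trailing newline ...
stripped=$(printf 'line\n\n')

# ... so a sentinel character is appended and removed again afterwards.
kept=$(printf 'line\n\n'; echo x)
kept=${kept%x}

echo "stripped: ${#stripped} characters, kept: ${#kept} characters"
```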
The construct may also be useful in future to describe unprintable
characters without needing to write those characters themselves in 'set -x',
'export -p' and the like.
The implementation attempts to comply with the proposal for the next issue of
the POSIX specification. Because this construct is not in POSIX.1-2008,
using it in scripts intended to be portable is unwise.
Matching the minimal locale support in the rest of sh, the \u and \U
sequences are currently not useful.
Exp-run done by: pav (with some other sh(1) changes)
Note that this only applies to variables that are actually used.
Things like (0 && unsetvar) do not cause an error.
Exp-run done by: pav (with some other sh(1) changes)
In particular, this makes things like ${#foo[0]} and ${#foo[@]} errors
rather than silent equivalents of ${#foo}.
PR: bin/151720
Submitted by: Mark Johnston
Exp-run done by: pav (with some other sh(1) changes)
For backgrounded pipelines and subshells, the previous value of $? was being
preserved, which is incorrect.
For backgrounded simple commands containing a command substitution, the
status of the last command substitution was returned instead of 0.
If fork() fails, this is an error.
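A sketch of the intended result: starting a background job successfully sets $? to 0, whatever the previous command returned.

```sh
false
sleep 0 &
simple_status=$?

false
( exit 3 ) &
subshell_status=$?

false
true | false &
pipeline_status=$?

wait
echo "$simple_status $subshell_status $pipeline_status"
```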
If the -p option is turned off, privileges from a setuid or setgid binary
are dropped. Make sure to check if this succeeds. If it fails, this is an
error which will cause the shell to abort except in interactive mode or if
'command' was used to make 'set' or an outer 'eval' or '.' non-special.
Note that taking advantage of this feature and writing setuid shell scripts
seems unwise.
MFC after: 1 week
If EV_EXIT causes an exit, use the exception mechanism to unwind
redirections and local variables. This way, if the final command is a
redirected command, an EXIT trap now executes without the redirections.
Because of these changes, EV_EXIT can now be inherited by the body of a
function, so do so. This means that a function no longer prevents the fork
before an exec from being skipped, such as in
f() { head -1 /etc/passwd; }; echo $(f)
Wrapping a single builtin in a function may still cause an otherwise
unnecessary fork with command substitution, however.
An exit command or -e failure still invokes the EXIT trap with the
original redirections and local variables in place.
Note: this depends on SHELLPROC being gone. A SHELLPROC depended on
keeping the redirections and local variables and only cleaning up the
state to restore them.
This is only a problem if IFS contains digits, which is unusual but valid.
Because of an incorrect fix for PR bin/12137, "${#parameter}" was treated
as ${#parameter}. The underlying problem was that "${#parameter}"
erroneously added CTLESC bytes before determining the length. This
was properly fixed for PR bin/56147 but the incorrect fix was not backed
out.
Reported by: Seeker on forums.freebsd.org
MFC after: 2 weeks
POSIX does not require the shell to fork for a subshell environment, and we
use that possibility in various ways (command substitutions with a single
command and most subshells that are the final command of a shell process).
Therefore do not tie subshells to forking in the man page.
Command substitutions with expansions are a bit strange, causing a fork for
$(...$(($x))...) because $x might expand to y=2; they will probably be
changed later but this is how they work now.
These already worked: $# ${#} ${##} ${#-} ${#?}
These now work as well: ${#+word} ${#-word} ${##word} ${#%word}
There is an ambiguity in the standard with ${#?}: it could be the length of
$? or it could be $# giving an error in the (impossible) case that it is not
set. We continue to use the former interpretation as it seems more useful.
New features:
* proper lazy evaluation of || and &&
* ?: ternary operator
* executable is considerably smaller (8K on i386) because lex and yacc are
no longer used
Differences from dash:
* arith_t instead of intmax_t
* imaxdiv() not used
* unset or null variables default to 0
* let/exp builtin (undocumented, will probably be removed later)
Obtained from: dash
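The new arithmetic features can be sketched as follows; the variable names are illustrative only.

```sh
hits=0
: $(( 1 || (hits = hits + 1) ))   # right side skipped, left side is true
: $(( 0 && (hits = hits + 1) ))   # right side skipped, left side is false
echo "hits after short-circuit: $hits"

a=7 b=12
echo "max: $(( a > b ? a : b ))"

unset missing
echo "unset counts as: $(( missing + 0 ))"
```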
* In {(...) <redir1;} <redir2, do not drop redir1.
* Maintain the difference between (...) <redir and {(...)} <redir:
In (...) <redir, the redirection is performed in the child, while in
{(...)} <redir it should be performed in the parent (like {(...); :;}
<redir)
POSIX requires this and it is simpler than the previous code that remembered
command locations when appending directories to PATH.
In particular,
PATH=$PATH
is no longer a no-op but discards all cached command locations.
If execve() returns an [ENOEXEC] error, check if the file is binary before
trying to execute it using sh. A file is considered binary if at least one
of the first 256 bytes is '\0'.
In particular, trying to execute ELF binaries for the wrong architecture now
fails with an "Exec format error" message instead of syntax errors and
potentially strange results.
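The same heuristic can be approximated from a script. This is only a sketch of the idea, not the shell's internal code; looks_binary and the scratch files are hypothetical.

```sh
# Succeeds if the first 256 bytes of "$1" contain a NUL byte.
looks_binary() {
    total=$(dd if="$1" bs=256 count=1 2>/dev/null | wc -c)
    without_nul=$(dd if="$1" bs=256 count=1 2>/dev/null | tr -d '\000' | wc -c)
    [ "$((total))" -ne "$((without_nul))" ]
}

tmpdir=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$tmpdir/script"
printf 'ELF\000\001\002' > "$tmpdir/blob"

if looks_binary "$tmpdir/script"; then script_kind=binary; else script_kind=text; fi
if looks_binary "$tmpdir/blob"; then blob_kind=binary; else blob_kind=text; fi
rm -rf "$tmpdir"

echo "script: $script_kind, blob: $blob_kind"
```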
These are called "shell procedures" in the source.
If execve() failed with [ENOEXEC], the shell would reinitialize itself
and execute the program as a script. This requires a fair amount of code
which is not frequently used (most scripts have a #! magic number).
Therefore just execute a new instance of sh (_PATH_BSHELL) to run the
script.
This matches the constants from <signal.h> with 'SIG' removed, which POSIX
requires kill and trap to accept and 'kill -l' to write.
'kill -l', 'trap', 'trap -l' output is now upper case.
In Turkish locales, signal names with an upper case 'I' are now accepted,
while signal names with a lower case 'i' are no longer accepted, and the
output of 'killall -l' now contains proper capital 'I' without dot instead
of a dotted capital 'I'.
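A short sketch of the names in use, assuming signal 15 is SIGTERM as on FreeBSD:

```sh
# Output side: the name is written without the "SIG" prefix.
name=$(kill -l 15)
echo "signal 15 is $name"

# Input side: the same spelling is accepted by trap.
trap 'echo "terminating"' TERM
trap - TERM
```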
* There is no plan for an alternative to the command "set".
* Attempting to unset a readonly variable has not raised an error for quite
a while, so the order of unsetting a variable and a function with the same
name does not matter.
MFC after: 1 week
When a foreground job exits on a signal, a message is printed to stdout
about this. The buffer was not flushed after this, which could result in the
message being written to the wrong file if the next command was a builtin
and had stdout redirected.
Example:
sh -c 'kill -9 $$'; : > foo; echo FOO:; cat foo
Reported by: gcooper
MFC after: 1 week
This is useful so that it is easier to exit on a signal than to reset the
trap to default and resend the signal. It matches ksh93. POSIX says that
'exit' without args from a trap action uses the exit status from the last
command before the trap, which is different from 'exit $?' and matches this
if the previous command is assumed to have exited on the signal.
If the signal is SIGSTOP, SIGTSTP, SIGTTIN or SIGTTOU, or if the default
action for the signal is to ignore it, a normal _exit(2) is done with exit
status 128+signal_number.
* Make 'trap --' do the same as 'trap' instead of nothing.
* Make '--' stop option processing (note that '-' action is not an option).
Side effect: The error message for an unknown option is different.
All builtins are now always found before a PATH search.
Most ash derivatives have an undocumented feature where the presence of an
entry "%builtin" in $PATH will cause builtins to be checked at that point of
the PATH search, rather than before looking at any directories as documented
in the man page (very old versions do document this feature).
I am removing this feature from sh, as it complicates the code, may violate
expectations (for example, /usr/bin/alias is very close to a forkbomb with
PATH=/usr/bin:%builtin, only /usr/bin/builtin not being another link saves
it) and appears to be unused (all the %builtin google code search finds is
in some sort of ash source code).
Note that aliases and functions took and take precedence over builtins.
Because aliases work on a lexical level they can only ever be overridden on
a lexical level (quoting or preceding 'builtin' or 'command'). Allowing
override of functions via PATH does not really fit in the model of sh and it
would work differently from %builtin if implemented.
Note: POSIX says special builtins are found before functions. We comply with
this because we do not allow functions with the same name as a special
builtin.
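A sketch of the lookup order for a regular builtin. Redefining echo is done purely for illustration.

```sh
# A function named like a regular builtin takes precedence over it.
echo() { printf 'wrapped: %s\n' "$*"; }

via_function=$(echo hello)
via_command=$(command echo hello)   # 'command' skips the function lookup
unset -f echo

printf '%s\n%s\n' "$via_function" "$via_command"
```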
Silence from: freebsd-hackers@ (message sent 20101225)
Discussed with: dougb
It should use the original exit status, just like falling off the
end of the trap handler.
Outside an EXIT trap, 'exit' is still equivalent to 'exit $?'.
An error message is written, the builtin is not executed, nonzero exit
status is returned but the shell does not abort.
This was already checked for special builtins and external commands, with
the same consequences except that the shell aborts for special builtins.
Obtained from: NetBSD
Change the criterion for builtins to be safe to execute in the same process
in optimized command substitution from a blacklist of only cd, . and eval to
a whitelist.
This avoids clobbering the main shell environment such as by $(exit 4) and
$(set -x).
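Both cases can be checked with a sketch like this one (variable names are illustrative):

```sh
status_demo=$(exit 4)
substitution_status=$?
echo "main shell continues; substitution status was $substitution_status"

unused=$(set -x)
case $- in
    *x*) echo "tracing leaked into the main shell" ;;
    *)   echo "tracing stayed inside the substitution" ;;
esac
```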
The builtins jobid, jobs, times and trap can still show information not
available in a child process; this is deliberately permitted. (Changing
traps is not.)
For some builtins, whether they are safe depends on the arguments passed to
them. Some of these are always considered unsafe to keep things simple; this
only harms efficiency a little in the rare case they are used alone in a
command substitution.
If SIGINT arrived at exactly the right moment (unlikely), an exception
handler in a no longer active stack frame would be called.
Because the old handler was not used in the normal path, clang thought it
was a dead value and if an exception happened it would longjmp() to garbage.
This caused builtins/fc1.0 to fail if histedit.c was compiled with clang.
MFC after: 1 week
Before considering whether to execute a command substitution in the same
process, check if any of the expansions may have a side effect; if so,
execute it in a new process, just as happens if it is not a single simple
command.
Although the check happens at run time, it is a static check that does not
depend on current state. It is triggered by:
- expanding $! (which may cause the job to be remembered)
- ${var=value} default value assignment
- assignment operators in arithmetic
- parameter substitutions in arithmetic except ${#param}, $$, $# and $?
- command substitutions in arithmetic
This means that $((v+1)) does not prevent optimized command substitution,
whereas $(($v+1)) does, because $v might expand to something containing
assignment operators.
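A sketch of why the second form is treated as potentially having a side effect; the variable contents are contrived for the illustration.

```sh
total=0
v='total=5'

# $(($v+1)) becomes $((total=5+1)): an assignment hidden in the expansion.
result=$(echo "$(($v+1))")
echo "result: $result, total in parent: $total"

# $((v2+1)) reads the variable inside arithmetic and cannot assign.
v2=5
echo "plain reference: $((v2+1))"
```

The assignment must stay confined to the command substitution, so the parent's value of total is unchanged.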
Scripts should not depend on these exact details for correctness. It is also
imaginable to have the shell fork if and when a side effect is encountered
or to create a new temporary namespace for variables.
Due to the $! change, the construct $(jobs $!) no longer works. The value of
$! should be stored in a variable outside command substitution first.
Command substitutions consisting of a single simple command are executed in
the main shell process but this should be invisible apart from performance
and very few exceptions such as $(trap).
Maintain a pointer to the end of the stack string area instead of how much
space is left. This simplifies the macros in memalloc.h. The places where
the new variable must be updated are only where the memory area is created,
destroyed or resized.
This allows specifying a %job (which is equivalent to the corresponding
process group).
Additionally, it improves reliability of kill from sh in high-load
situations and ensures "kill" finds the correct utility regardless of PATH,
as required by POSIX (unless the undocumented %builtin mechanism is used).
Side effect: fatal errors (any error other than kill(2) failure) now return
exit status 2 instead of 1. (This is consistent with other sh builtins, but
not in NetBSD.)
Code size increases about 1K on i386.
Obtained from: NetBSD
The #define for warnx now behaves much like the libc function (except that
it uses sh command name and output).
Also, it now uses C99 __VA_ARGS__ so there is no need for three different
macros for 0, 1 or 2 parameters.
Constants in arithmetic starting with 0 should be octal only.
This avoids the following highly puzzling result:
$ echo $((018-017))
3
by making it an error instead.
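Scripts that receive zero-padded decimal text need to strip the padding before doing arithmetic. A possible sketch follows; to_decimal is a hypothetical helper, not something sh provides.

```sh
echo "$((017))"      # a valid octal constant

# Strip leading zeros so that the digits are read as decimal.
to_decimal() {
    digits=${1#"${1%%[!0]*}"}
    echo "$(( ${digits:-0} ))"
}
to_decimal 018
to_decimal 000
```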
c is assigned 0 and *loc is pointing to NULL, so c!=0 cannot be true,
and dereferencing loc would be a bad idea anyway.
Coverity Prevent: CID 5113
Reviewed by: jilles
The CTLESC byte to protect a special character was output before instead of
after a newline directly preceding the special character.
The special handling of newlines is because command substitutions discard
all trailing newlines.
* Prefer kill(-X) to killpg(X).
* Remove some dead code.
* No additional SIGINT is needed if int_pending() is already true.
No functional change is intended.
The herefd hack wrote out partial here documents while expanding them. It
seems an unnecessary complication given that other expansions just allocate
memory. It causes bugs because the stack is also used for intermediate
results such as arithmetic expressions. Such places should disable herefd
for the duration but not all of them do, and I prefer removing the need for
disabling herefd to disabling it everywhere needed.
Here documents larger than 1024 bytes will use a bit more CPU time and
memory.
Additionally this allows a later change to expand here documents in the
current shell environment. (This is faster for small here documents but also
changes behaviour.)
Obtained from: dash
The code to translate the internal representation to text did not know about
various additions to the internal representation since the original ash and
therefore wrote binary stuff to the terminal.
The code is used in the jobs command and similar output.
Note that the output is far from complete and mostly serves for recognition
purposes.
If describing the status of a pipeline, write all elements of the pipeline
and show the status of the last process (which would also end up in $?).
Only write one report per job, not one for every process that exits.
To keep some earlier behaviour, if any process started by the shell in a
foreground job terminates because of a signal, write a message about the
signal (at most one message per job, however).
Also, do not write messages about signals in the wait builtin in
non-interactive shells. Only true foreground jobs now write such messages
(for example, "Terminated").
In r208489, I added code to reap zombies when forking new processes, to
limit the number of zombies. However, this can lead to marking a job as done
or stopped if it consists of multiple processes and the first process ends
very quickly. Fix this by only checking for zombies before forking the first
process of a job and not marking any jobs without processes as done or
stopped.
The getpgid() call will fail if the first process in the job has already
terminated, resulting in output of "-1".
The pgid of a job is always the pid of the first process in the job and
other code already relies on this.
Make sure all built-in commands are in the subsection named such, except
exp, let and wordexp which are deliberately undocumented. The text said only
built-ins that really need to be a built-in were documented there but in
fact almost all of them were already documented.
* Prefer one CHECKSTRSPACE with multiple USTPUTC to multiple STPUTC.
* Add STPUTS macro (based on function) and use it instead of loops that add
nul-terminated strings to the stack string.
No functional change is intended, but code size is about 1K less on i386.
If getcwd fails, do not treat this as an error, but print a warning and
unset PWD. This is similar to the behaviour when starting the shell in a
directory whose name cannot be determined.
Since is_alpha/is_name/is_in_name were made ASCII-only, this can no longer
happen.
Additionally, the check was wrong because it did not include the new
CTLQUOTEEND.
This was removed in 2001 but I think it is appropriate to add it back:
* I do not want to encourage people to write fragile and non-portable echo
commands by making printf much slower than echo.
* Recent versions of Autoconf use it a lot.
* Almost no software still wants to support systems that do not have
printf(1) at all.
* In many other shells printf is already a builtin.
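As a sketch of the fragility mentioned in the first point: printf with a fixed format prints arbitrary data verbatim, whereas what echo does with such arguments varies between implementations. The say helper is hypothetical.

```sh
# Print any string verbatim, followed by a newline.
say() {
    printf '%s\n' "$1"
}

say '-n'              # echo might treat this as an option
say 'C:\new\table'    # echo might interpret \n and \t
```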
Side effect: printf is now always the builtin version (which behaves
identically to /usr/bin/printf) and cannot be overridden via PATH (except
via the undocumented %builtin mechanism).
Code size increases about 5K on i386. Embedded folks might want to replace
/usr/bin/printf with a hard link to /usr/bin/alias.