file and processes information retrieval from the running kernel via sysctl
in the form of new library, libprocstat. The library also supports KVM backend
for analyzing memory crash dumps. Both procstat(1) and fstat(1) utilities have
been modified to take advantage of the library (as the bonus point the fstat(1)
utility no longer need superuser privileges to operate), and the procstat(1)
utility is now able to display information from memory dumps as well.
The newly introduced fuser(1) utility also uses this library and able to operate
via sysctl and kvm backends.
The library is by no means complete (e.g. KVM backend is missing vnode name
resolution routines, and there're no manpages for the library itself) so I
plan to improve it further. I'm commiting it so it will get wider exposure
and review.
We won't be able to MFC this work as it relies on changes in HEAD, which
was introduced some time ago, that break kernel ABI. OTOH we may be able
to merge the library with KVM backend if we really need it there.
Discussed with: rwatson
prefixes (Ki, Mi, Gi...) for humanize_number(3).
Note that applications has to pass HN_IEC_PREFIXES to use this
feature for backward compatibility reasons.
Reviewed by: arundel
MFC after: 2 weeks
integer overflow when the input is very large (for example, 100 Pi would
become about 10 Ei which exceeded signed int64_t).
Solve this issue by splitting the division into two parts and avoid the
multiplication.
PR: bin/146205
Reviewed by: arundel
MFC after: 1 month
it possible for the kernel to track login class the process is assigned to,
which is required for RCTL. This change also make setusercontext(3) call
setloginclass(2) and makes it possible to retrieve current login class using
id(1).
Reviewed by: kib (as part of a larger patch)
user in question (usually but not necessarily because we were called
with LOGIN_SETUSER). This plugs a hole where users could raise their
resource limits and expand their CPU mask.
MFC after: 3 weeks
a very bad one, since the shift does not actually overflow. This is
a better example (assuming uint64_t = unsigned long long):
~0LLU >> 9 = 0x7fffffffffffffLLU
~0LLU >> 9 << 10 = 0xfffffffffffffc00LLU
~0LLU >> 9 << 10 >> 10 = 0x3fffffffffffffLLU
switch. Since expand_number() does not accept negative numbers, switch
from int64_t to uint64_t; this makes it easier to check for overflow.
MFC after: 3 weeks
Although groff_mdoc(7) gives another impression, this is the ordering
most widely used and also required by mdocml/mandoc.
Reviewed by: ru
Approved by: philip, ed (mentors)
I changed login_tty() to only work when the application is not a session
leader yet. This works fine for applications in the base system, but it
turns out various applications call this function after daemonizing,
which means they already use their own session.
If setsid() fails, just call tcsetsid() on the current session.
tcsetsid() will already perform proper security checks.
Reported by: Oliver Lehmann
MFC after: 1 week
These functions only apply to utmp(5). They cannot be kept intact when
moving towards utmpx. The login(3) function would break, because its
argument is an utmp structure. The logout(3) and logwtmp(3) functions
cannot be used, since they provide a functionality which partially
overlaps.
Increment SHLIB_MAJOR to 9 to indicate the removal.
Similar to libexec/, do the same with lib/. Make WARNS=6 the norm and
lower it when needed.
I'm setting WARNS?=0 for secure/. It seems secure/ includes the
Makefile.inc provided by lib/. I'm not going to touch that directory.
Most of the code there is contributed anyway.
There are several reasons why it didn't work:
- It was missing <sys/cdefs.h> for __BEGIN_DECLS.
- It uses various primitive types that were not declared.
preparation for 8.0-RELEASE. Add the previous version of those
libraries to ObsoleteFiles.inc and bump __FreeBSD_Version.
Reviewed by: kib
Approved by: re (rwatson)
- update for getrlimit(2) manpage;
- support for setting RLIMIT_SWAP in login class;
- addition to the limits(1) and sh and csh limit-setting builtins;
- tuning(7) documentation on the sysctls controlling overcommit.
In collaboration with: pho
Reviewed by: alc
Approved by: re (kensmith)
The problem with fcntl(2) locks is that they are not inherited by child
processes. This breaks pidfile(3), where the common idiom is to open
and lock the PID file before daemonizing.
The entire world seems to use the non-standard TIOCSCTTY ioctl to make a
TTY a controlling terminal of a session. Even though tcsetsid(3) is also
non-standard, I think it's a lot better to use in our own source code,
mainly because it's similar to tcsetpgrp(), tcgetpgrp() and tcgetsid().
I stole the idea from QNX. They do it the other way around; their
TIOCSCTTY is just a wrapper around tcsetsid(). tcsetsid() then calls
into an IPC framework.
quotactl(2) interface and inactive quotas by accessing the quota
files directly.
Update the edquota program to use this new interface as proof of
concept.
This changes struct kinfo_filedesc and kinfo_vmentry such that they are
same on both 32 and 64 bit platforms like i386/amd64 and won't require
sysctl wrapping.
Two new OIDs are assigned. The old ones are available under
COMPAT_FREEBSD7 - but it isn't that simple. The superceded interface
was never actually released on 7.x.
The other main change is to pack the data passed to userland via the
sysctl. kf_structsize and kve_structsize are reduced for the copyout.
If you have a process with 100,000+ sockets open, the unpacked records
require a 132MB+ copyout. With packing, it is "only" ~35MB. (Still
seriously unpleasant, but not quite as devastating). A similar problem
exists for the vmentry structure - have lots and lots of shared libraries
and small mmaps and its copyout gets expensive too.
My immediate problem is valgrind. It traditionally achieves this
functionality by parsing procfs output, in a packed format. Secondly, when
tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled
32 bit binary which ran directly into the differing data structures in 32
vs 64 bit mode. (valgrind uses this to track file descriptor operations
and this therefore affected every single 32 bit binary)
I've added two utility functions to libutil to unpack the structures into
a fixed record length and to make it a little more convenient to use.
parentheses.
Fixed alignment issue in gr_dup() in its assignment of gr_mem using a
struct to force alignment without performing alignment mathematics. This
was noticed recently with libutil was built with WARNS=6 on platform such
as sparc64.
Added checks to gr_dup(), gr_equal() and gr_make() to prevent segfaults
when examining struct group's with the struct members pointing to NULL's.
With fix of alignment issue, restore WARNS?=6.
Reviewed by: des
MFC after: 1 week
struct sockaddr * that it casts internally to the appropriate type based on
sa_family. However, struct sockaddr has very lax alignment requirements,
which causes the compiler to complain when you cast a struct sockaddr * to,
say, a struct sockaddr_in6 *.
I find it reasonable to assume that the pointer we received is in fact
correctly aligned. Therefore, we can work around the compiler warnings by
casting to void * before casting to the desired type. For readability's
sake, this is done with macros.
The same technique should prove useful in other parts of the tree that
deal with socket addresses.
MFC after: 3 weeks
As discussed on the commits list, there is no need to call revoke()
inside openpty(). On RELENG_6 and RELENG_7 unlockpt() will call
revoke(). On HEAD we create pseudo-terminals on demand, so there is no
need to revoke the slave device node.
This change should never be MFC'd, because the implementation we have in
RELENG_6 and RELENG_7 should work flawlessly with older versions of
libc.
Discussed with: jhb
MFC after: never
- Pass O_NOCTTY to posix_openpt(2). This makes the implementation work
consistently on implementations that make the PTY the controlling TTY
by default.
- Call unlockpt() before opening the slave device. POSIX mentions that
de slave device should only be opened after grantpt() and unlockpt()
have been called.
- Replace some redundant code by a label.
In theory we could remove a lot of code from openpty() on FreeBSD
-CURRENT, because grantpt(), unlockpt() and revoke() are not needed in
our implementation. We'd better keep them there. This makes the code
still work with older FreeBSD releases and even makes it work on other
non-BSD operating systems.
I've compiled openpty() on Linux. You only need to remove the revoke()
call, because revoke() on Linux always returns -1. Apart from that, it
seems to work like it should.
Reviewed by: jhb
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:
- Improved driver model:
The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.
If a PPP implementation would be built on top of the new TTY layer
(still needs a hooks layer, though), it would allow the PPP
implementation to directly hand the data to the TTY driver.
- Improved hotplugging:
With the old TTY layer, it isn't entirely safe to destroy TTY's from
the system. This implementation has a two-step destructing design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc).
The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTY's that are created on the fly.
- Improved performance:
One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is both
used on TTY device nodes and PTY masters.
Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.
Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan
after similar calls related to struct pwd in libutil/pw_util.c:
- gr_equal()
Perform a deep comparison of two struct grp's. It does a thorough, yet
unoptimized comparison of all the members regardless of order.
- gr_make()
Create a string (see group(5)) from a struct grp.
- gr_dup()
Duplicate a struct grp. Returns a value that is a single contiguous
block of memory.
- gr_scan()
Create a struct grp from a string (as produced by gr_make()).
MFC after: 3 weeks
Significant changes:
- rev. 1.11: Use PRId64 instead of a cast to long long and %lld to print
an int64_t.
- rev. 1.12: Fix a bug that humanize_number() produces "1000" where it
should be "1.0G" or "1.0M". The bug reported by Greg Troxel.
PR: 118461
PR: 102694
Approved by: rwatson (mentor)
Obtained from: NetBSD
MFC after: 1 month
Specifically, remove the BUGS section and note that openpty(3) now always
does the various security-related steps. Also, update the error return
value section. The PR below is for the original bug rather than the doc
updates.
MFC after: 1 week
PR: bin/9770
kick off any other users on the device line before using it since
openpty(3) is documented to do this. Note that grantpt(3) does not
call revoke(2), it only adjusts permissions and ownership.
MFC after: 3 days
success and zero pid from pidfile_read(). Return EAGAIN instead. Sleep
up to three times for 5 ms while waiting for pidfile to be written.
mount(8) does the kill(mountpid, SIGHUP). If mountd pidfile is truncated,
that would result in the SIGHUP delivered to the mount' process group
instead of the mountd.
Found and analyzed by: Peter Holm
Tested by: Peter Holm, kris
Reviewed by: pjd
MFC after: 1 week
Reported by: phk
- While here, check the unit before calculating the actually number.
This way we can return EINVAL for invalid unit instead of ERANGE.
Approved by: re (kensmith)
a number in human-readable form is converted to int64_t, for example:
123b -> 123
10k -> 10240
16G -> 17179869184
First version submitted by: Eric Anderson <anderson@freebsd.org>
Approved by: re (bmah)
to pidfile_write happen, the pidfile will have nul characters prepended
due to the cached file descriptor offset...
Reviewed by: scottl
MFC after: 3 days
from /etc/login.conf, or an unterminated string buffer could result.
Probably, login_times.c should reject excessively long time strings as
unparseable, rather than truncating, which might render an invalid
string valid.
Found with: Coverity Prevent (tm)
Reviewed by: csjp
MFC after: 3 days
rather than forcing the state to LOOK. If we are in the middle of parsing
a line when we have to do a FILL we would have lost any token we were in
the middle of parsing and would have treated the next character as being
at the start of a new line instead.
PR: kern/89181
Submitted by: Antony Mawer gnats at mawer dot org
MFC after: 1 week