Commit Graph

30685 Commits

Author SHA1 Message Date
Pádraig Brady
aba9800995 cksum: prefer -a sha2 -l ###, to -a sha###
To make the interface more concise and consistent,
while being backwards compatible.

* src/digest.c (main): Continue to support -a "sha###" but
also support -a "sha2" and treat it like "sha3", except in...
(output_file): ... maintain the legacy tags for better compatability.
* doc/coreutils.texi (cksum invocation): Document the -a sha2 option.
* tests/cksum/cksum-base64.pl: Adjust as per modified --help.
* tests/cksum/cksum-c.sh: Add new supported SHA2-### tagged variant.
* NEWS: Mention the new feature.
2025-09-04 14:49:07 +01:00
Collin Funk
403d82a0bf cksum: add support for SHA-3
* src/digest.c: Include sha3.h.
(BLAKE2B_MAX_LEN): Rename to
DIGEST_MAX_LEN since it is also used for SHA-3.
(sha3_sum_stream): New function.
(enum Algorithm, algorithm_args, algorithm_args, algorithm_types)
algorithm_tags, algorithm_bits, cksumfns, cksum_output_fns): Add entries
for SHA-3.
(usage): Mention that SHA-3 is supported. Mention requirements for
--length with SHA-3.
(split_3): Use DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN. Determine the
length of the digest for SHA-3. Make sure it is 224, 256, 384, or 512.
(digest_file): Set the digest length in bytes. Use DIGEST_MAX_LEN
instead of BLAKE2B_MAX_LEN. Always append the digest length to SHA3 in
the output.
(main): Allow the use of --length with 'cksum -a sha3'.  Use
DIGEST_MAX_LEN instead of BLAKE2B_MAX_LEN. Make sure it is 224, 256,
384, or 512.
* tests/cksum/cksum-base64.pl (@pairs): Add expected sha3 output.
(fmt): Modify the output to use SHA3-512 since that is the default.
(@Tests): Modify arguments for sha3 to use --length=512.
* tests/cksum/cksum-sha3.sh: New test, based on tests/cksum/b2sum.sh.
* tests/local.mk (all_tests): Add the test.
* bootstrap.conf: Add crypto/sha3.
* gnulib: Update to latest commit.
* NEWS: Mention the change.
* doc/coreutils.texi (cksum general options): Mention sha3 as a
supported argument to the -a option. Mention that 'cksum -a sha3'
supports the --length option. Mention that SHA-3 is considered secure.
2025-09-03 22:29:33 -07:00
Collin Funk
022673367b maint: avoid syntax-check failure from previous commit
* src/df.c: Don't include uchar.h.
* src/ls.c: Likewise.
* src/wc.c: Likewise.
2025-09-03 19:15:49 -07:00
Collin Funk
3a81d44d43 fold: check that characters are not non-breaking spaces when -s is used
NetBSD 10 and Solaris 11.4 treat non-breaking spaces as blank
characters unlike glibc.

* src/system.h: Include uchar.h.
(c32isnbspace): New function based on iswnbspace from src/wc.c.
* src/fold.c (fold_file): Use it.
* src/wc.c (iswnbspace): Remove function.
(maybe_c32isnbspace): New function.
(wc, main): Use it.
Fixes https://bugs.gnu.org/79300
2025-09-03 19:12:18 -07:00
Collin Funk
e09f2bc611 maint: prefer issymlink to readlink with a small buffer
* bootstrap.conf (gnulib_modules): Add issymlink and issymlinkat.
* src/copy.c: Include issymlink.h.
(copy_reg): Use issymlink instead of readlinkat.
* src/rmdir.c: Include issymlink.h.
(main): Use issymlink instead of readlink.
* src/tail.c: Include issymlink.h.
(recheck, any_symlinks): Use issymlink instead of readlink.
* src/test.c: Include issymlink.h.
(unary_operator): Use issymlink instead of readlink.
2025-09-03 18:31:08 -07:00
Paul Eggert
1b2f629d96 build: update gnulib submodule to latest 2025-09-03 18:08:53 -07:00
Bruno Haible
db04fc2a09 build: Update after gnulib changed
* gnulib-tests/Makefile.am: Move the AM_CFLAGS assignment before the
'include gnulib.mk'.
2025-09-03 18:08:53 -07:00
Collin Funk
8f6430666f maint: avoid syntax-check failure from previous commit
* tests/seq/seq-long-double.sh: Place comma after "I.e.".
2025-09-02 21:14:34 -07:00
Pádraig Brady
701416709d seq: be more accurate with large integer start values
* src/seq.c (main): Avoid possibly innacurate conversion
to long double, for all digit start values.
* tests/seq/seq-long-double.sh: Add a test case.
* NEWS: Mention the improvement.
Fixes https://bugs.gnu.org/79369
2025-09-02 20:51:55 +01:00
Paul Eggert
c6397d0872 df: pacify static analysis
Problem reported by Yubiao Hu <https://bugs.gnu.org/79336>.
* src/df.c (get_dev): Assume MOUNT_POINT is non-null.
2025-09-01 10:03:14 -07:00
Pádraig Brady
ebd670e7eb ls: fix alignment with locale formatted --size
Fix allocated size alignment in locales with multi-byte grouping chars.
Tested with: LC_ALL=sv_SE.utf8 ls --size --block-size=\'k

* src/ls.c (print_file_name_and_frills): Don't rely on
printf("%*s", width, string) to pad multi-byte strings appropriately.
Instead work out the padding required and use:
printf("%*s%s", padding, "", string) to pad multi-byte appropriately.
* tests/ls/block-size.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/79347
2025-08-31 19:25:38 +01:00
Pádraig Brady
735a4a27f3 b2sum: --length: fix upper bound check
* src/digest.c (main): Don't saturate -l to BLAKE2B_MAX_LEN,
so that the subsequent bounds check is performed.
* tests/cksum/b2sum.sh: Add a test case.
* NEWS: Mention the fix introduced in commit v9.5-71-gf2c84fe63
2025-08-30 12:44:29 +01:00
Collin Funk
89b9115da6 fold: fix handling of invalid multi-byte characters
* src/fold.c (fold_file): Continue the loop when we have buffered bytes
but nothing left to read from the file.
(adjust_column): Don't assume that the character is printable.
* tests/fold/fold-characters.sh: Add a new test case.
(bad_unicode): New function.
2025-08-28 18:57:13 -07:00
Pádraig Brady
4b35a3b920 tests: fold: add tests for multi-byte width
* tests/fold/fold.pl: The i18n patch didn't actually test folding
of multi-byte characters, so add tests for various multi-byte forms.
2025-08-27 16:41:07 +01:00
Pádraig Brady
0001bbc3e2 tests: fold: copy i18n patch tests
* tests/fold/fold.pl: Copy tests from Fedora,
removing copy & pasted logic that was
extraneous to either the i18n patch or upstream.
2025-08-27 14:36:16 +01:00
Pádraig Brady
cbf6a27810 tests: parameterize IO_BUFSIZE
* src/getlimits.c (main): Output IO_BUFSIZE, useful for
sizing data for tests.
* tests/fold/fold-characters.sh: Use it rather than hardcoding.
2025-08-27 12:03:46 +01:00
Pádraig Brady
aec4f85476 build: fold: fix build failure with C99
GCC 10.2 gave the following error:
"error: label at end of compound statement"

* src/fold.c (fold_file): Add a ";" to avoid C2X specific syntax.
2025-08-27 11:50:01 +01:00
Collin Funk
ae89cd646a fold: don't truncate multibyte characters at the end of the buffer
* src/fold.c (fold_file): Replace invalid characters with the original
byte read. Copy multibyte sequences that may not yet be read to the
start of the buffer before reading more bytes.
* tests/fold/fold-characters.sh: Add a test case.
2025-08-26 16:59:32 -07:00
Pádraig Brady
01646ccd64 tests: fold: consolidate all fold tests in tests/fold
* tests/misc/fold.pl: Move from here to ...
* tests/fold/fold.pl: ... here.
* tests/local.mk: Adjust accordingly.
2025-08-26 16:49:56 +01:00
Pádraig Brady
17be45b952 tests: fold: add a memory constraint test
Enforcing this interface behavior is worthwhile
irrespective of our current implementation,
to ensure future or other implementations conform.

* tests/fold/fold-characters.sh: Ensure the fold implementation
uses bounded memory.
2025-08-26 16:48:31 +01:00
Collin Funk
fb9016d505 fold: use fread instead of getline
* src/fold.c: Include ioblksize.h.
(fold_file): Use two IO_BUFSIZE-sized buffers. Use fread instead of
getline. Check for if we reached the end of file.
2025-08-24 13:42:40 -07:00
Pádraig Brady
4bfcf62f74 tests: nproc: fix false failure on some systems
* tests/nproc/nproc-quota.sh: Also simulate sched_getscheduler()
as this will not be called on older or non linux, or
may return ENOSYS on Alpine.
Fixes https://bugs.gnu.org/79299
2025-08-24 11:48:52 +01:00
Collin Funk
11a31296fa maint: prefer STRUCT_UTMP to struct gl_utmp
* cfg.mk (sc_prohibit-struct-gl_utmp): New rule for 'make syntax-check'.
* src/pinky.c (time_string, print_entry, scan_entries, short_pinky):
Use STRUCT_UTMP instead of struct gl_utmp.
* src/uptime.c (print_uptime, print_uptime, uptime): Likewise.
* src/users.c (list_entries_users, users): Likewise.
* src/who.c (time_string, print_user, print_boottime, print_deadprocs)
(print_login, print_initspawn, print_clockchange, print_runlevel)
list_entries_who, scan_entries, who): Likewise.
2025-08-23 18:10:07 -07:00
Pádraig Brady
6c668dc133 tests: cp: ensure copy offload is not disabled for sparse files
Related to commits v9.1-109-g879d2180d and v9.7-248-g306de6c26

* tests/cp/sparse-perf.sh: This edge case was missed a couple of times,
so add a test to ensure we attempt copy offload.
2025-08-23 18:55:46 +01:00
Collin Funk
7d0d9658f9 fold: add the --characters option
* src/fold.c: Include mcel.h.
(count_bytes): Remove variable.
(counting_mode, last_character_width): New variables.
(shortopts, long_options): Add the option.
(adjust_column): If --characters is in used account for number of
characters instead of their width.
(fold_file): Use getline and iterate over the result with mcel functions
to handle multibyte characters.
(main): Check for the option.
* src/local.mk (src_fold_LDADD): Add $(LIBC32CONV), $(LIBUNISTRING), and
$(MBRTOWC_LIB).
* tests/fold/fold-characters.sh: New file.
* tests/fold/fold-spaces.sh: New file.
* tests/fold/fold-nbsp.sh: New file.
* tests/local.mk (all_tests): Add the tests.
* NEWS: Mention the new option.
* doc/coreutils.texi (fold invocation): Likewise.
2025-08-22 22:09:50 -07:00
Paul Eggert
39f22fe687 cp: improve hole handling on squashfs
Better fix for problem reported by Jeremy Allison
<https://bugs.gnu.org/79267>.
* src/copy.c (struct scan_inference): New type, replacing
union scan_inference.  All uses changed.  This is so
infer_scantype can report the first hole's offset when known.
(lseek_copy): 5th arg is now struct scan_inference const *,
not just off_t.  All uses changed.
(infer_scantype): If SEEK_SET+SEEK_HOLE do not find a hole,
fall back on ZERO_SCANTYPE.
2025-08-22 17:40:30 -07:00
Paul Eggert
306de6c261 cp: go back to copy_file_range optimization
This reverts part of the previous change.
* src/copy.c (lseek_copy): When calling sparse_copy, do not
ask it to scan for zeros unless --sparse=always, so that it
can use copy_file_range which can be far more efficient.
2025-08-22 17:40:29 -07:00
Paul Eggert
b7fc76269b cp: always punch holes that we make
Problem reported by Jeremy Allison <https://bugs.gnu.org/79267>.
* src/copy.c (create_hole, sparse_copy): Omit arg PUNCH_HOLES,
as we always punch holes now.  All uses changed.
(lseek_copy): When calling sparse_copy, scan for holes when
sparse_mode == SPARSE_AUTO, as that means we are making holes.
(copy_reg): Always punch any hole made at end.
2025-08-21 15:58:21 -07:00
Pádraig Brady
61a7382c8f tests: nproc: add a test for cgroup quotas
* tests/nproc/nproc-quota.sh: New root only test.
* tests/local.mk: Reference the new test.
2025-08-21 00:04:03 +01:00
Collin Funk
238855dd4e maint: prefer https to http
* doc/sort-version.texi (Other version/natural sort implementations):
Use https in documentation link.
* tests/chmod/symlinks.sh: Use https in license text.
2025-08-19 11:41:17 -07:00
Pádraig Brady
c4df55be7f nproc: honor cgroup v2 CPU quotas
* NEWS: Mention the new feature.
* doc/coreutils.texi (nproc invocation): Mention that
cgroup CPU quotas can limit the reported number.
* gnulib: Update to new nproc gnulib implementation:
https://github.com/coreutils/gnulib/commit/9b07115f4a
2025-08-19 17:24:21 +01:00
Collin Funk
37bca77c4d logname: list David MacKenzie as the author
* src/logname.c (AUTHORS): List David as the author.
* AUTHORS: Likewise.
2025-08-18 20:42:48 -07:00
Paul Eggert
ae6fd2b4a1 realpath: improve doc and NEWS 2025-08-14 21:08:44 -07:00
Collin Funk
c9ae3b553d maint: avoid syntax-check failure from previous commit
* src/tsort.c: Don't include long-options.h since the previous commit
removed the call to parse_gnu_standard_options_only. This avoids a
sc_prohibit_long_options_without_use syntax-check failure.
2025-08-14 19:43:52 -07:00
Paul Eggert
39eec70f7b tsort: add do-nothing -w option
This is for conformance to POSIX.1-2024
* src/tsort.c: Include getopt.h.
(main): Accept and ignore -w.  Do not bother altering
the usage message, as the option is useless.
* tests/misc/tsort.pl (cycle-3): New test.
2025-08-14 17:35:23 -07:00
Pádraig Brady
2543c0052c maint: use short form bug URLs
* cfg.mk (sc_prohibit-long-form-bug-urls): Disallow long form in code.
* scripts/git-hooks/commit-msg: Disallow long form in commit messages.
* NEWS: Shorten long urls.
* bootstrap.conf: Likewise.
* configure.ac: Likewise.
* scripts/git-hooks/commit-msg: Likewise.
* src/csplit.c: Likewise.
* src/fmt.c: Likewise.
* src/make-prime-list.c: Likewise.
* src/nohup.c: Likewise.
* tests/od/od-float.sh: Likewise.
* tests/rm/r-root.sh: Likewise.
* tests/tail/inotify-race.sh: Likewise.
* tests/tail/inotify-race2.sh: Likewise.
2025-08-12 17:50:40 +01:00
Bruno Haible
f4d339c934 basenc: Don't trigger undefined behaviour in mini-gmp
* src/basenc.c (base58_encode): Avoid calling mpz_import on an empty
limb sequence.
2025-08-11 18:07:20 -07:00
Collin Funk
2a092e80e2 realpath: support the -E option required by POSIX
* src/realpath.c (longopts): Add the option.
(main): Likewise.
(usage): Add the option to the --help message.
* tests/misc/realpath.sh: Add a simple test.
* doc/coreutils.texi (realpath invocation): Mention the new option.
* NEWS: Likewise.
2025-08-10 11:41:56 -07:00
Pádraig Brady
192e09042f doc: --base58: add example usage to info
* doc/coreutils.texi (basenc invocation): Add an example
using --base58 to generate a unique ID.  This also demonstrates
compound usage of the basenc command, to convert to/from binary.
2025-08-10 16:21:42 +01:00
Pádraig Brady
d7774907db doc: rearrange NEWS to more standard ordering
* NEWS: Also move items to more appropriate sections.
Also use more consistent quoting.
2025-08-09 22:32:18 +01:00
Pádraig Brady
02c3d43e3a basenc: add base58 support
A 58 character encoding that:
 - avoids visually ambiguous 0OIl characters
 - uses only alphanumeric characters
Described at:
 - https://tools.ietf.org/html/draft-msporny-base58-03

This implementation uses GMP (or gnulib's gmp fallback).
Performance is good in comparison to other implementations.
For example when using libgmp on an i7-5600U system,
encoding is 530 times faster, and decoding 830 times faster
than the implementation using arbitrary precision ints in cpython 3.13.

Memory use is proportional to the size of input.

Encoding benchmarks:

  $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
  real    0m0.018s

  ./configure --quiet --without-libgmp && make -j $(nproc)
  $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
  real    0m3.431s

  # dnf install python3-base58
  $ time yes | head -c65535 | base58 >file.enc  # cpython 3.13
  real    0m9.700s

Decoding benchmarks:

  $ time src/basenc --base58 -d <file.enc >/dev/null
  real    0m0.010s

  $ ./configure --without-libgmp && make  # gnulib gmp
  $ time src/basenc --base58 -d <file.enc >/dev/null
  real    0m0.145s

  $ time base58 -d <file.enc >/dev/null  # cpython 3.13
  real    0m8.302s

* src/basenc.c (base_decode_ctx_finalize, base_encode_ctx_init,
base_encode_ctx, base_encode_ctx_finalize): New functions to
provide more general processing functionality.
(base58_{de,en}code_ctx{_init,,_finalize}): New functions to
accumulate all input before calling ...
(base58_{de,en}code): ... the GMP based encoding/decoding routines.
(do_encode, do_decode): Call the ctx variants if enabled.
* doc/coreutils.texi (basenc invocation): Describe the new option,
and indicate the main use case being interactive user use.
* src/local.mk: Link basenc with GMP.
* tests/basenc/basenc.pl: Add test cases.
* NEWS: Mention the new feature.
2025-08-09 22:10:27 +01:00
Pádraig Brady
c480428e37 basenc: fix stripping of '=' chars in some encodings
* src/basenc.c (do_decode): With -i ensure we strip '=' chars
if there is no padding for the chosen encoding.
* tests/basenc/basenc.pl: Add a test case.
* NEWS: Mention the bug fix.
2025-08-09 09:59:00 +01:00
Collin Funk
13202995ba maint: prefer attribute.h in .c files
* src/basenc.c (base16_encode, z85_encoding, do_decode): Use
ATTRIBUTE_NONSTRING instead of ATTRIBUTE_NONSTRING.
* src/basenc.c (sc_prohibit-_gl-attributes): New rule for
'make syntax-check'.
2025-08-08 20:49:29 -07:00
Paul Eggert
26bf557e3f cp: omit some needless lseek calls
The sparse code sometimes issued multiple lseeks against the
same file without doing anything in betwee.  Optimize them away
by keeping track of the last hole output, in a way that
crosses the sparse_copy function call boundary.
* src/copy.c (sparse_copy): New arg hole_size, replacing old args
scan_holes and last_write_made_hole.  All callers changed.
(sparse_copy, lseek_copy): Do not create hole at
end; let the caller deal with it.  All callers changed.
(lseek_copy): New args hole_size and total_n_read.  Caller changed.
(copy_reg): Create hole at end for both lseek_copy and sparse_copy.
2025-08-07 20:20:04 -07:00
Paul Eggert
2f2adb294e cp: --sparse=always was missing some holes
The sparse code assumed that st_blksize was the minimum hole size.
However, st_blksize is an optimum I/O buffer size, not the file
system fundamental block size.  Use ST_NBLOCKSIZE instead;
although it may underestimate the true block size that just slows
‘cp’ down a bit, without introducing bugs.
* src/copy.c (sparse_copy): Arg scan_holes replaces
the old hole_size arg.  All callers changed.
(lseek_copy): Remove hole_size arg; no longer needed.
Caller changed.
2025-08-07 20:20:04 -07:00
Collin Funk
c9a30d6781 maint: use consistent references to standard files in messages
* cfg.mk (sc_standard_outputs): Add a grep command for source files.
* src/du.c (main): Use standard input instead of stdin, standard output
instead of stdout, and standard error instead of stderr in messages.
* src/nohup.c (main): Likewise.
* src/sort.c (main): Likewise.
* src/split.c (main): Likewise.
* src/stdbuf.c (main): Likewise.
* src/wc.c (main): Likewise.
* tests/du/files0-from.pl (@Tests): Adjust test case to new messages.
* tests/sort/sort-files0-from.pl: Likewise.
* tests/wc/wc-files0-from.pl: Likewise.
2025-08-05 18:46:04 -07:00
Bernhard Voelker
257efe1ec4 maint: remove now-unused include of 'safe-read.h'
'make syntax-check' complains:
  src/tail.c
  maint.mk: the above files include safe-read.h but don't use it
  make: *** [maint.mk:737: sc_prohibit_safe_read_without_use] Error 1

The removal was missed for tail.c in recent commit d3c7072a09.

* src/tail.c (safe-read.h): Remove include.
2025-08-04 10:53:49 +01:00
Paul Eggert
5c822f4dac maint: prefer same-inode.h
This does not change behavior on POSIX platforms; it’s mostly to
make it clearer when we’re looking for file identity.
* src/cat.c (main):
* src/copy.c (struct dir_list, is_ancestor, copy_internal):
* src/tail.c (struct File_spec, record_open_fd, recheck)
(tail_forever_inotify, tail_file):
* src/test.c (binary_operator):
Use psame_inode, PSAME_INODE, or SAME_INODE instead of comparing
device and inode numbers by hand.
2025-08-03 19:48:06 -07:00
Paul Eggert
01e4d2983a tail: refactor ‘failable’
* src/tail.c (recheck, tail_file): Do not mark a file as tailable
merely because --retry is not in effect.  Simplify internal logic.
This should not change behavior; it’s just for clarity and to
make the code match the comments better.
2025-08-03 19:48:06 -07:00
Paul Eggert
e0100ac746 tail: fix race between read and fstat
* src/tail.c (get_file_status): Remove, since after the changes
described below it would be called in just one place and it’s a
bit clearer to inline by hand.
(tail_file): Don’t call fstat after reading the file, as that
misses changes arriving between read and fstat.  Instead, reuse
the fstat done before reading the file.
2025-08-03 19:48:06 -07:00