dd: fix nocache regions passed to posix_fadvise()

Previously with oflag=direct, the region passed to invalidate_cache()
was not forwarded to the kernel because it was smaller than a page,
and no subsequent call was made to invalidate the pending space.
Similarly with oflag=nocache the pending space at EOF was
not invalidated.  Even though this amounts to only a single page
in the page cache, it can be significant.  For example on
XFS before kernel patch v4.9-rc1-4-g0ee7a3f, O_DIRECT files
would have been read inefficiently if any pages were cached,
even if they were already synced to storage.

* src/dd.c (i_nocache_eof, o_nocache_eof): New bools used
to control when we want invalidate_cache(,0) to clear to EOF.
(cache_round): Use IO_BUFSIZE (currently 128KiB) to minimize
calls to the relatively expensive advise function, rather
than page_size.  This also makes it clear that while the
kernel function operates on pages, this size is chosen for
performance reasons.
(invalidate_cache): Refactor to share more code between
input and output paths. Use i_nocache_eof and o_nocache_eof
rather than proxying off max_records.  Ensure we
invalidate full pages when clearing to EOF, as the kernel
will ignore any incomplete pages.  Fix the offset used
for the output path.
(dd_copy): Invalidate the cache of the input after the
offset is updated, for consistency and so we don't try to
invalidate before the start of the file.  When we read
EOF on input, set flags so that we invalidate to EOF.
(main): Invalidate to EOF in more cases, by depending
on the i_nocache_eof and o_nocache_eof flags.
* doc/coreutils.texi (dd invocation): Clarify the alignment
and persisted caveats on the example applying "nocache"
to part of a file.
* tests/dd/nocache_eof.sh: A new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug fix.
Issue reported by Eric Bergen.
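
The batching described for cache_round and the final invalidation to EOF can be modelled with a small shell sketch. This is a hypothetical model, not dd's C code: the function name model_advise is invented for illustration, and the 4KiB page size and 128KiB IO_BUFSIZE are assumptions taken from the comments in the new test. It prints one "offset length" pair per expected advise call for a sequential copy that reaches EOF, with a length of 0 meaning "to EOF".

```shell
#!/bin/sh
# Hypothetical model of the batching described above; not dd's C code.
io_bufsize=131072   # assumed 128KiB IO_BUFSIZE
page_size=4096      # assumed 4KiB page size

# model_advise SIZE BS: print "offset length" for each expected advise call
# when SIZE bytes are copied sequentially in blocks of BS bytes.
model_advise() {
  size=$1 bs=$2
  offset=0 pending=0 done_bytes=0
  while [ "$done_bytes" -lt "$size" ]; do
    chunk=$((size - done_bytes))
    if [ "$chunk" -gt "$bs" ]; then chunk=$bs; fi
    done_bytes=$((done_bytes + chunk))
    # Advise only whole multiples of io_bufsize; carry the rest forward.
    total=$((pending + chunk))
    pending=$((total % io_bufsize))
    len=$((total - pending))
    if [ "$len" -gt 0 ]; then
      echo "$offset $len"
      offset=$((offset + len))
    fi
  done
  # At EOF, start from a page boundary; a length of 0 means "to EOF".
  echo "$((offset / page_size * page_size)) 0"
}

model_advise 1234567 1048576
```

For the 1234567 byte input used by the test with bs=1M, the model prints the pairs "0 1048576", "1048576 131072" and "1179648 0", which agree with the commented fadvise64 calls for the iflag=nocache bs=1M case.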
Pádraig Brady
2017-10-10 23:29:08 -07:00
parent 9fa178fccd
commit de15a497d1
6 changed files with 181 additions and 49 deletions

tests/dd/nocache_eof.sh Executable file

@@ -0,0 +1,98 @@
#!/bin/sh
# Ensure dd invalidates to EOF when appropriate
# Copyright (C) 2017 Free Software Foundation, Inc.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
print_ver_ dd
require_strace_ fadvise64
head -c1234567 /dev/zero > in.f || framework_failure_
# Check basic operation or skip.
# We could check for 'Operation not supported' error here,
# but that was seen to be brittle. HPUX returns ENOTTY for example.
# So assume that if this basic operation fails, it's due to lack
# of support by the system.
dd if=in.f iflag=nocache count=0 ||
  skip_ 'this file system lacks support for posix_fadvise()'

strace_dd() {
  strace -o dd.strace -e fadvise64 dd status=none "$@" || fail=1
}

advised_to_eof() {
  grep -F ' 0, POSIX_FADV_DONTNEED' dd.strace >/dev/null
}
# The commented fadvise64 calls are what are expected with
# a 4KiB page size and 128KiB IO_BUFSIZE.
strace_dd if=in.f of=out.f bs=1M oflag=direct
#The first call is redundant but inconsequential
#fadvise64(1, 1048576, 0, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 1048576, 0, POSIX_FADV_DONTNEED) = 0
advised_to_eof || fail=1
strace_dd if=in.f of=out.f bs=1M oflag=nocache,sync
#fadvise64(1, 0, 1048576, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 1048576, 131072, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 1179648, 0, POSIX_FADV_DONTNEED) = 0
advised_to_eof || fail=1
strace_dd if=in.f count=0 iflag=nocache
#fadvise64(0, 0, 0, POSIX_FADV_DONTNEED) = 0
advised_to_eof || fail=1
strace_dd if=in.f of=/dev/null iflag=nocache skip=10 count=300
#fadvise64(0, 5120, 131072, POSIX_FADV_DONTNEED) = 0
#fadvise64(0, 136192, 22528, POSIX_FADV_DONTNEED) = 0
returns_ 1 advised_to_eof || fail=1
strace_dd if=in.f of=/dev/null iflag=nocache bs=1M count=3000
#fadvise64(0, 0, 1048576, POSIX_FADV_DONTNEED) = 0
#fadvise64(0, 1048576, 131072, POSIX_FADV_DONTNEED) = 0
#fadvise64(0, 1179648, 0, POSIX_FADV_DONTNEED) = 0
advised_to_eof || fail=1
strace_dd if=in.f of=/dev/null bs=1M count=1 iflag=nocache
#fadvise64(0, 0, 1048576, POSIX_FADV_DONTNEED) = 0
returns_ 1 advised_to_eof || fail=1
strace_dd if=in.f of=out.f bs=1M iflag=nocache oflag=nocache,sync
#fadvise64(0, 0, 1048576, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 0, 1048576, POSIX_FADV_DONTNEED) = 0
#fadvise64(0, 1048576, 131072, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 1048576, 131072, POSIX_FADV_DONTNEED) = 0
#fadvise64(0, 1179648, 0, POSIX_FADV_DONTNEED) = 0
#fadvise64(1, 1179648, 0, POSIX_FADV_DONTNEED) = 0
advised_to_eof || fail=1
# Ensure sub page size offsets are handled.
# I.e., only page aligned offsets are sent to fadvise.
if ! strace -o dd.strace -e fadvise64 dd status=none \
 if=in.f of=out.f bs=1M oflag=direct seek=512 oflag=seek_bytes 2>err; then
  # older XFS had a page size alignment requirement
  echo "dd: error writing 'out.f': Invalid argument" > err_ok
  compare err_ok err || fail=1
else
  #The first call is redundant but inconsequential
  #fadvise64(1, 1048576, 0, POSIX_FADV_DONTNEED) = 0
  #fadvise64(1, 1048576, 0, POSIX_FADV_DONTNEED) = 0
  advised_to_eof || fail=1
fi
Exit $fail
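
The advised_to_eof check in the test depends on the leading space in its fixed-string pattern, so that only a length argument of exactly 0 matches. A minimal sketch of that matching, using hand-written sample lines rather than real strace output, might look like the following. The helper name advised_to_eof_in and the two sample file names are hypothetical, introduced here only so the check can take a file argument.

```shell
#!/bin/sh
# Hypothetical variant of the test's helper that takes the trace file as $1.
advised_to_eof_in() {
  grep -F ' 0, POSIX_FADV_DONTNEED' "$1" >/dev/null
}

# Hand-written sample lines in the format shown in the test's comments.
printf '%s\n' 'fadvise64(0, 0, 1048576, POSIX_FADV_DONTNEED) = 0' \
  > partial.strace
printf '%s\n' 'fadvise64(0, 1179648, 0, POSIX_FADV_DONTNEED) = 0' \
  > eof.strace

# A length of 0 matches; an offset of 0 with a non-zero length does not.
advised_to_eof_in eof.strace && echo 'eof.strace: advised to EOF'
advised_to_eof_in partial.strace || echo 'partial.strace: not advised to EOF'
```

The offset of 0 in the first sample line is followed by a non-zero length rather than by POSIX_FADV_DONTNEED, so it does not satisfy the pattern; only a call whose length is 0 does.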