Compare commits

36 Commits

Author SHA1 Message Date
Jim Meyering
25e704ebf2 tests: cp/fiemap: exercise previously-failing parts
* tests/cp/fiemap-2: New test.
* tests/Makefile.am (TESTS): Add it.
2011-01-29 17:05:28 +01:00
Jim Meyering
cea933f993 copy: make extent_copy use sparse_copy, rather than its own code
* src/copy.c (extent_copy): Before this change, extent_copy would fail
to create holes, thus breaking --sparse=auto and --sparse=always.
I.e., copying a large enough file of all zeros, cp --sparse=always
should introduce a hole, but with extent_copy, it would not.
2011-01-29 17:05:28 +01:00
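
Note on the hole-making behaviour this commit restores: it comes down to seeking over a zero region instead of writing it, then calling ftruncate when the file ends in a hole so that the length is recorded. A minimal sketch, assuming a POSIX system. Both extend_with_hole and demo_trailing_hole are hypothetical helpers written for illustration; neither exists in copy.c.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: advance FD's offset by HOLE_LEN bytes without
   writing anything, then set the file length to the resulting offset
   so that a trailing hole is recorded.  Return the new length, or -1
   on error (errno is set by the failing call).  */
static off_t
extend_with_hole (int fd, off_t hole_len)
{
  assert (0 <= hole_len);
  off_t end = lseek (fd, hole_len, SEEK_CUR);
  if (end < 0 || ftruncate (fd, end) < 0)
    return -1;
  return end;
}

/* Demo: write one byte, append a 4096-byte hole, and return the file
   size reported by fstat, or -1 on any failure.  */
static off_t
demo_trailing_hole (void)
{
  FILE *fp = tmpfile ();
  if (!fp)
    return -1;
  int fd = fileno (fp);
  struct stat st;
  off_t size = -1;
  if (write (fd, "x", 1) == 1
      && extend_with_hole (fd, 4096) == 4097
      && fstat (fd, &st) == 0)
    size = st.st_size;
  fclose (fp);
  return size;
}
```

Without the ftruncate call, the file would end at the last byte actually written, because lseek alone does not change the file length.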
Jim Meyering
cb542c1882 copy: remove obsolete comment
* src/copy.c (sparse_copy): Remove now-obsolete comment about
how we used to work around lack of ftruncate.  Combine nested
if conditions into one.
2011-01-29 17:05:28 +01:00
Jim Meyering
6fa3d25887 copy: factor sparse-copying code into its own function, because
we're going to have to use it from within extent_copy, too.
* src/copy.c (sparse_copy): New function, factored out of...
(copy_reg): ...here.
Remove now-unused locals.
2011-01-29 17:05:28 +01:00
Jim Meyering
789d7ddee1 fiemap copy: avoid leak-on-error
* src/copy.c (extent_copy): Don't leak an extent_scan buffer on
failed lseek, read, or write.
2011-01-29 17:05:28 +01:00
Jim Meyering
4d64bef1d6 fiemap copy: avoid a performance hit due to very small buffer
* src/copy.c (extent_copy): Don't let what should have been a
temporary reduction of buf_size (to handle a short ext_len) become
permanent and thus impact the performance of all further iterations.
2011-01-29 17:05:28 +01:00
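
Note on the buffer-size fix: the point is that the size used for a short extent should be a per-iteration value, not a permanent change to buf_size. The sketch below counts how many read calls a sequence of extents needs when the chunk size is recomputed each time. reads_needed is a hypothetical function used only to show the arithmetic; it is not part of copy.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: return the number of reads needed to
   copy N_EXT extents of the given lengths with a BUF_SIZE-byte buffer.  */
static uint64_t
reads_needed (const uint64_t *ext_len, size_t n_ext, size_t buf_size)
{
  assert (0 < buf_size);
  uint64_t n_reads = 0;
  for (size_t i = 0; i < n_ext; i++)
    {
      uint64_t remaining = ext_len[i];
      while (remaining)
        {
          /* Per-iteration chunk size; buf_size itself is never modified,
             so a short extent cannot slow down later ones.  */
          uint64_t chunk = remaining < buf_size ? remaining : buf_size;
          remaining -= chunk;
          n_reads++;
        }
    }
  return n_reads;
}
```

With extents of 10 and 100000 bytes and a 32768-byte buffer this needs 5 reads. Had the buffer size stayed at 10 after the first extent, the second extent alone would have taken 10000 reads.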
Jim Meyering
9869073ddb fiemap copy: simplify post-loop logic; improve comments
* src/copy.c (extent_copy): Avoid duplication in post-loop
extend-to-desired-length code.
2011-01-29 17:05:28 +01:00
Jim Meyering
ca0a487261 fiemap copy: rename some locals
(extent_copy): Rename locals: s/*ext_logical/*ext_start/
2011-01-29 17:05:28 +01:00
Jim Meyering
3f63faa233 tests: ensure that FIEMAP-enabled cp copies a sparse file efficiently
* tests/cp/fiemap-perf: New file.
* tests/Makefile.am (TESTS): Add it.
2011-01-29 17:05:23 +01:00
Jim Meyering
700bcaf54e copy: don't allocate a separate buffer just for extent-based copy
* src/copy.c (copy_reg): Move use of extent_scan to just *after*
we allocate the main copying buffer, so we can...
(extent_scan): Take a new parameter, BUF, and use that rather
than allocating a private buffer.  Update caller.
2011-01-28 23:28:38 +01:00
Jim Meyering
395011ce8f copy: tweak variable name; improve a comment
* src/copy.c (copy_reg): Rename a variable to make more sense from
caller's perspective: s/require_normal_copy/normal_copy_required/.
This is an output-only variable, and the original name could make
it look like an input (or i&o) variable.
2011-01-28 23:28:38 +01:00
Jim Meyering
6c4fc520b6 copy: call extent_copy also when make_holes is false, ...
so that we benefit from using extents also when reading a sparse
input file with --sparse=never.
* src/copy.c (copy_reg): Remove erroneous test of "make_holes"
so that we call extent_copy also when make_holes is false.
Otherwise, what's the point of that parameter?
2011-01-28 23:28:38 +01:00
Jim Meyering
85034c48b4 * src/copy.c (copy_reg): Remove useless else-after-goto. 2011-01-28 23:28:38 +01:00
Jim Meyering
65cdc84bd4 copy.c: shorten a comment to fit in 80 columns 2011-01-28 23:28:38 +01:00
Jim Meyering
9c4d227e1d extent-scan.c: don't include error.h or quote.h
* src/extent-scan.c: Don't include error.h or quote.h.  Neither is used.
2011-01-28 23:28:38 +01:00
Jim Meyering
fb3e015d82 formatting 2011-01-28 23:28:38 +01:00
Jim Meyering
4829a64437 distribute extent-scan.h, too
* src/Makefile.am (copy_sources): Also distribute extent-scan.h.
2011-01-28 23:28:38 +01:00
Jim Meyering
0508ea6850 rename extent-scan functions to start with extent_scan_ 2011-01-28 23:28:38 +01:00
Jim Meyering
db7bd09c70 rename extent_scan member
* extent-scan.h [struct extent_scan]: Rename member:
s/hit_last_extent/hit_final_extent/.  "final" is clearer,
since "last" can be interpreted as "preceding".
2011-01-28 23:28:38 +01:00
Jim Meyering
1d2026db51 fiemap copy: don't let write failure go unreported; adjust style, etc.
* src/copy.c (write_zeros): Add comments.
(extent_copy): Move decls of "ok" and "i" down to scope where used.
Adjust comments.
Rename local: s/holes_len/hole_size/
Print a diagnostic upon failure to write zeros.
2011-01-28 23:28:38 +01:00
jeff.liu
b8929e761d bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Jim Meyering wrote:
> jeff.liu wrote:
>> Sorry for the delay.
>>
>> This is the new patch to isolate the extent-reading code in a new module and teach
>> cp(1) to make use of it.
>
> Jeff,
>
> I applied your patch to my rebased fiemap-copy branch.
> My first step was to run the usual
>
>   ./bootstrap && ./configure && make && make check
>
> "make check" failed due to a double free in your new code:
> (x86_64, Fedora 13, ext4 working directory)
>
> To get details, I made this temporary modification:
Hi Jim,

I am sorry for the fault; it is fixed in the patch below.
Would you please review it at your convenience?

Changes:
========
1. Fix write_zeros() per Jim's comments; thanks for pointing this out.
2. Remove char const *fname from struct extent_scan.
3. Change the signature of open_extent_scan() from "void open_extent_scan(struct extent_scan
**scan)" to "void open_extent_scan(struct extent_scan *scan)". This saves one memory
allocation for the extent_scan variable by keeping it on the stack instead.
4. Move close_extent_scan() from a function defined in extent-scan.c to a macro defined in
extent-scan.h. It does nothing for now, since the extent scan state now lives on the stack.
5. Add a macro "free_extents_info()" in extent-scan.h to release the memory allocated for
extent info; it should be called in combination with get_extents_info(). It is just one line, so IMHO
defining it as a macro should be OK.

I have run a memory check with `valgrind`; no issues were found.
The cp/sparse-fiemap test failed at the extent-compare stage, but the contents of the two files
"j1/j2" are identical when compared manually.
Would it make sense to verify them with diff(1), since the test file is small?
Otherwise we have to merge the contiguous extents in the output of `filefrag'. I admit I have not dug into
filefrag-extent-compare yet; I need to recall the Perl syntax first. :-P

>From 50a3338db06442fa2d789fd65175172d140cc96e Mon Sep 17 00:00:00 2001
From: Jie Liu <jeff.liu@oracle.com>
Date: Wed, 29 Sep 2010 15:35:43 +0800
Subject: [PATCH 1/1] cp: add a new module for scanning extents

* src/extent-scan.c: Source code for scanning extents.
  Call open_extent_scan() to initialize extent scan.
  Call get_extents_info() to get a number of extents for each iteration.
* src/extent-scan.h: Header file of extent-scan.c.
  Wrap free_extent_info() as macro define to release the space allocated extent_info per extent scan.
  Wrap close_extent_scan() as macro define but do nothing at the moment.
* src/Makefile.am: Reference it and link it to copy_source.
* src/copy.c: Make use of the new module, replace fiemap_copy() with extent_copy().

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
2011-01-28 23:28:38 +01:00
Jim Meyering
c7ab2b8ae8 build: distribute new test script, filefrag-extent-compare
* tests/Makefile.am (EXTRA_DIST): Add filefrag-extent-compare.
2011-01-28 23:28:38 +01:00
Jim Meyering
01cbe2f978 build: distribute new file, fiemap.h
* src/Makefile.am (noinst_HEADERS): Add fiemap.h.
2011-01-28 23:28:38 +01:00
Jie Liu
a782b73226 copy.c: add FIEMAP_FLAG_SYNC to fiemap ioctl
* src/copy.c (fiemap_copy): Force kernel to sync the source
file before mapping.
2011-01-28 23:28:38 +01:00
Jim Meyering
881e357dff tests: accommodate varying filefrag -v "flags" output
* tests/cp/sparse-fiemap: Accommodate values other than "eof"
in the "flags" column of filefrag -v output
2011-01-28 23:28:38 +01:00
Jim Meyering
5be9d09189 fiemap.h: include <stdint.h>, not <linux/types.h>
* src/fiemap.h: Include stdint.h, not linux/types.h,
now that this file uses only portable type names.
2011-01-28 23:28:38 +01:00
Paul Eggert
690767b2e9 copy.c: ensure proper alignment of fiemap buffer
* src/copy.c (fiemap_copy): Ensure that our fiemap buffer
is large enough and well-aligned.
Replace "0LL" with equivalent "0" as 3rd argument to lseek.
2011-01-28 23:28:37 +01:00
Jim Meyering
5d154393cd copy.c: adjust comments, tweak semantics
* src/copy.c (fiemap_copy): Rename from fiemap_copy_ok.
Add/improve comments.
Remove local, "fail".
(fiemap_copy): Do not require caller to set
"normal_copy_required" before calling fiemap_copy.
Report ioctl failure if it's the 2nd or subsequent call.
2011-01-28 23:28:37 +01:00
Jim Meyering
539e125c82 tests: improve fiemap test to work with 4 FS types; fall back on ext4
* tests/cp/sparse-fiemap: Improve.
* tests/filefrag-extent-compare: New file.
2011-01-28 23:28:37 +01:00
Jim Meyering
fb563d0e4a tests: relax the root-tests cross-check
* cfg.mk (sc_root_tests): Allow spaces before "require_root_",
now that tests/cp/sparse-fiemap has a conditional use.
2011-01-28 23:28:37 +01:00
Jim Meyering
b9173f3a8a tests: test fiemap-enabled cp more thoroughly
* tests/cp/sparse-fiemap: More tests.
2011-01-28 23:28:37 +01:00
Jim Meyering
b301d1a035 tests: require root only if current partition is neither btrfs nor xfs
* tests/cp/sparse-fiemap: Don't require root access if current
partition is btrfs or xfs.
Use init.sh, not test-lib.sh.
2011-01-28 23:28:37 +01:00
Jim Meyering
1816c57f38 tests: exercise more of the new FIEMAP copying code
* tests/cp/sparse-fiemap: Ensure that a file with many extents (more
than fit in copy.c's internal 4KiB buffer) is copied properly.
2011-01-28 23:28:37 +01:00
Jim Meyering
530df8cbbf tests: sparse-fiemap: factor out some set-up
* tests/cp/sparse-fiemap: Cd into test directory sooner.
2011-01-28 23:28:37 +01:00
Jie Liu
33f74dfe29 tests: add a new test for FIEMAP-copy
* tests/cp/sparse-fiemap: Add a new test for FIEMAP-copy against a
loopbacked ext4 partition.
* tests/Makefile.am (sparse-fiemap): Reference the new test.
2011-01-28 23:28:37 +01:00
Jie Liu
b66ee767ef cp: Add FIEMAP support for efficient sparse file copy
* src/fiemap.h: Add fiemap.h for fiemap ioctl(2) support.
Copied from linux's include/linux/fiemap.h, with minor formatting changes.
* src/copy.c (copy_reg): Now, when `cp' is invoked with the --sparse=[WHEN] option, we
try a FIEMAP-copy if the underlying file system supports it, and fall back
to a normal copy if it fails.
2011-01-28 23:28:37 +01:00
11 changed files with 857 additions and 97 deletions

2
cfg.mk

@@ -80,7 +80,7 @@ sc_root_tests:
@if test -d tests \
&& grep check-root tests/Makefile.am>/dev/null 2>&1; then \
t1=sc-root.expected; t2=sc-root.actual; \
grep -nl '^require_root_$$' \
grep -nl '^ *require_root_$$' \
$$($(VC_LIST) tests) |sed s,tests/,, |sort > $$t1; \
sed -n '/^root_tests =[ ]*\\$$/,/[^\]$$/p' \
$(srcdir)/tests/Makefile.am \

View File

@@ -145,6 +145,7 @@ noinst_HEADERS = \
copy.h \
cp-hash.h \
dircolors.h \
fiemap.h \
find-mount-point.h \
fs.h \
group-list.h \
@@ -449,7 +450,7 @@ uninstall-local:
fi; \
fi
copy_sources = copy.c cp-hash.c
copy_sources = copy.c cp-hash.c extent-scan.c extent-scan.h
# Use `ginstall' in the definition of PROGRAMS and in dependencies to avoid
# confusion with the `install' target. The install rule transforms `ginstall'

View File

@@ -36,6 +36,7 @@
#include "buffer-lcm.h"
#include "copy.h"
#include "cp-hash.h"
#include "extent-scan.h"
#include "error.h"
#include "fcntl--.h"
#include "file-set.h"
@@ -62,6 +63,10 @@
# include "verror.h"
#endif
#ifndef HAVE_FIEMAP
# include "fiemap.h"
#endif
#ifndef HAVE_FCHOWN
# define HAVE_FCHOWN false
# define fchown(fd, uid, gid) (-1)
@@ -129,6 +134,122 @@ utimens_symlink (char const *file, struct timespec const *timespec)
return err;
}
/* Copy the regular file open on SRC_FD/SRC_NAME to DST_FD/DST_NAME,
honoring the MAKE_HOLES setting and using the BUF_SIZE-byte buffer
BUF for temporary storage. Copy no more than MAX_N_READ bytes.
Return true upon successful completion;
print a diagnostic and return false upon error.
Note that for best results, BUF should be "well"-aligned.
BUF must have sizeof(uintptr_t)-1 bytes of additional space
beyond BUF[BUF_SIZE-1].
Set *LAST_WRITE_MADE_HOLE to true if the final operation on
DEST_FD introduced a hole. */
static bool
sparse_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
bool make_holes,
char const *src_name, char const *dst_name,
uintmax_t max_n_read, bool *last_write_made_hole)
{
typedef uintptr_t word;
*last_write_made_hole = false;
while (max_n_read)
{
word *wp = NULL;
ssize_t n_read = read (src_fd, buf, MIN (max_n_read, buf_size));
if (n_read < 0)
{
#ifdef EINTR
if (errno == EINTR)
continue;
#endif
error (0, errno, _("reading %s"), quote (src_name));
return false;
}
if (n_read == 0)
break;
max_n_read -= n_read;
if (make_holes)
{
char *cp;
/* Sentinel to stop loop. */
buf[n_read] = '\1';
#ifdef lint
/* Usually, buf[n_read] is not the byte just before a "word"
(aka uintptr_t) boundary. In that case, the word-oriented
test below (*wp++ == 0) would read some uninitialized bytes
after the sentinel. To avoid false-positive reports about
this condition (e.g., from a tool like valgrind), set the
remaining bytes -- to any value. */
memset (buf + n_read + 1, 0, sizeof (word) - 1);
#endif
/* Find first nonzero *word*, or the word with the sentinel. */
wp = (word *) buf;
while (*wp++ == 0)
continue;
/* Find the first nonzero *byte*, or the sentinel. */
cp = (char *) (wp - 1);
while (*cp++ == 0)
continue;
if (cp <= buf + n_read)
/* Clear to indicate that a normal write is needed. */
wp = NULL;
else
{
/* We found the sentinel, so the whole input block was zero.
Make a hole. */
if (lseek (dest_fd, n_read, SEEK_CUR) < 0)
{
error (0, errno, _("cannot lseek %s"), quote (dst_name));
return false;
}
*last_write_made_hole = true;
}
}
if (!wp)
{
size_t n = n_read;
if (full_write (dest_fd, buf, n) != n)
{
error (0, errno, _("writing %s"), quote (dst_name));
return false;
}
*last_write_made_hole = false;
/* It is tempting to return early here upon a short read from a
regular file. That would save the final read syscall for each
file. Unfortunately that doesn't work for certain files in
/proc with linux kernels from at least 2.6.9 .. 2.6.29. */
}
}
return true;
}
/* If the file ends with a `hole' (i.e., if sparse_copy set wrote_hole_at_eof),
call this function to record the length of the output file. */
static bool
sparse_copy_finalize (int dest_fd, char const *dst_name)
{
off_t len = lseek (dest_fd, 0, SEEK_CUR);
if (0 <= len && ftruncate (dest_fd, len) < 0)
{
error (0, errno, _("truncating %s"), quote (dst_name));
return false;
}
return true;
}
/* Perform the O(1) btrfs clone operation, if possible.
Upon success, return 0. Otherwise, return -1 and set errno. */
static inline int
@@ -148,6 +269,154 @@ clone_file (int dest_fd, int src_fd)
#endif
}
/* Write N_BYTES zero bytes to file descriptor FD. Return true if successful.
Upon write failure, set errno and return false. */
static bool
write_zeros (int fd, uint64_t n_bytes)
{
static char *zeros;
static size_t nz = IO_BUFSIZE;
/* Attempt to use a relatively large calloc'd source buffer for
efficiency, but if that allocation fails, resort to a smaller
statically allocated one. */
if (zeros == NULL)
{
static char fallback[1024];
zeros = calloc (nz, 1);
if (zeros == NULL)
{
zeros = fallback;
nz = sizeof fallback;
}
}
while (n_bytes)
{
uint64_t n = MIN (nz, n_bytes);
if ((full_write (fd, zeros, n)) != n)
return false;
n_bytes -= n;
}
return true;
}
/* Perform an efficient extent copy, if possible. This avoids
the overhead of detecting holes in hole-introducing/preserving
copy, and thus makes copying sparse files much more efficient.
Upon a successful copy, return true. If the initial extent scan
fails, set *NORMAL_COPY_REQUIRED to true and return false.
Upon any other failure, set *NORMAL_COPY_REQUIRED to false and
return false. */
static bool
extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
off_t src_total_size, bool make_holes,
char const *src_name, char const *dst_name,
bool *require_normal_copy)
{
struct extent_scan scan;
off_t last_ext_start = 0;
uint64_t last_ext_len = 0;
extent_scan_init (src_fd, &scan);
bool wrote_hole_at_eof = true;
do
{
bool ok = extent_scan_read (&scan);
if (! ok)
{
if (scan.hit_final_extent)
break;
if (scan.initial_scan_failed)
{
*require_normal_copy = true;
return false;
}
error (0, errno, _("%s: failed to get extents info"),
quote (src_name));
return false;
}
unsigned int i;
for (i = 0; i < scan.ei_count; i++)
{
off_t ext_start = scan.ext_info[i].ext_logical;
uint64_t ext_len = scan.ext_info[i].ext_length;
if (lseek (src_fd, ext_start, SEEK_SET) < 0)
{
error (0, errno, _("cannot lseek %s"), quote (src_name));
fail:
extent_scan_free (&scan);
return false;
}
if (make_holes)
{
if (lseek (dest_fd, ext_start, SEEK_SET) < 0)
{
error (0, errno, _("cannot lseek %s"), quote (dst_name));
goto fail;
}
}
else
{
/* When not inducing holes and when there is a hole between
the end of the previous extent and the beginning of the
current one, write zeros to the destination file. */
if (last_ext_start + last_ext_len < ext_start)
{
uint64_t hole_size = (ext_start
- last_ext_start
- last_ext_len);
if (! write_zeros (dest_fd, hole_size))
{
error (0, errno, _("%s: write failed"), quote (dst_name));
goto fail;
}
}
}
last_ext_start = ext_start;
last_ext_len = ext_len;
if ( ! sparse_copy (src_fd, dest_fd, buf, buf_size,
make_holes, src_name, dst_name, ext_len,
&wrote_hole_at_eof))
goto fail;
}
/* Release the space allocated to scan->ext_info. */
extent_scan_free (&scan);
}
while (! scan.hit_final_extent);
/* When the source file ends with a hole, we have to do a little more work,
since the above copied only up to and including the final extent.
In order to complete the copy, we may have to insert a hole or write
zeros in the destination corresponding to the source file's hole-at-EOF.
In addition, if the final extent was a block of zeros at EOF and we've
just converted them to a hole in the destination, we must call ftruncate
here in order to record the proper length in the destination. */
off_t dest_len = lseek (dest_fd, 0, SEEK_CUR);
if ((dest_len < src_total_size || wrote_hole_at_eof)
&& (make_holes
? ftruncate (dest_fd, src_total_size)
: ! write_zeros (dest_fd, src_total_size - dest_len)))
{
error (0, errno, _("failed to extend %s"), quote (dst_name));
return false;
}
return true;
}
/* FIXME: describe */
/* FIXME: rewrite this to use a hash table so we avoid the quadratic
performance hit that's probably noticeable only on trees deeper
@@ -647,7 +916,6 @@ copy_reg (char const *src_name, char const *dst_name,
if (data_copy_required)
{
typedef uintptr_t word;
off_t n_read_total = 0;
/* Choose a suitable buffer size; it may be adjusted later. */
size_t buf_alignment = lcm (getpagesize (), sizeof (word));
@@ -655,7 +923,6 @@ copy_reg (char const *src_name, char const *dst_name,
size_t buf_size = io_blksize (sb);
/* Deal with sparse files. */
bool last_write_made_hole = false;
bool make_holes = false;
if (S_ISREG (sb.st_mode))
@@ -704,106 +971,35 @@ copy_reg (char const *src_name, char const *dst_name,
buf_alloc = xmalloc (buf_size + buf_alignment_slop);
buf = ptr_align (buf_alloc, buf_alignment);
while (true)
bool normal_copy_required;
/* Perform an efficient extent-based copy, falling back to the
standard copy only if the initial extent scan fails. If the
'--sparse=never' option is specified, write all data but use
any extents to read more efficiently. */
if (extent_copy (source_desc, dest_desc, buf, buf_size,
src_open_sb.st_size, make_holes,
src_name, dst_name, &normal_copy_required))
goto preserve_metadata;
if (! normal_copy_required)
{
word *wp = NULL;
ssize_t n_read = read (source_desc, buf, buf_size);
if (n_read < 0)
{
#ifdef EINTR
if (errno == EINTR)
continue;
#endif
error (0, errno, _("reading %s"), quote (src_name));
return_val = false;
goto close_src_and_dst_desc;
}
if (n_read == 0)
break;
n_read_total += n_read;
if (make_holes)
{
char *cp;
/* Sentinel to stop loop. */
buf[n_read] = '\1';
#ifdef lint
/* Usually, buf[n_read] is not the byte just before a "word"
(aka uintptr_t) boundary. In that case, the word-oriented
test below (*wp++ == 0) would read some uninitialized bytes
after the sentinel. To avoid false-positive reports about
this condition (e.g., from a tool like valgrind), set the
remaining bytes -- to any value. */
memset (buf + n_read + 1, 0, sizeof (word) - 1);
#endif
/* Find first nonzero *word*, or the word with the sentinel. */
wp = (word *) buf;
while (*wp++ == 0)
continue;
/* Find the first nonzero *byte*, or the sentinel. */
cp = (char *) (wp - 1);
while (*cp++ == 0)
continue;
if (cp <= buf + n_read)
/* Clear to indicate that a normal write is needed. */
wp = NULL;
else
{
/* We found the sentinel, so the whole input block was zero.
Make a hole. */
if (lseek (dest_desc, n_read, SEEK_CUR) < 0)
{
error (0, errno, _("cannot lseek %s"), quote (dst_name));
return_val = false;
goto close_src_and_dst_desc;
}
last_write_made_hole = true;
}
}
if (!wp)
{
size_t n = n_read;
if (full_write (dest_desc, buf, n) != n)
{
error (0, errno, _("writing %s"), quote (dst_name));
return_val = false;
goto close_src_and_dst_desc;
}
last_write_made_hole = false;
/* It is tempting to return early here upon a short read from a
regular file. That would save the final read syscall for each
file. Unfortunately that doesn't work for certain files in
/proc with linux kernels from at least 2.6.9 .. 2.6.29. */
}
return_val = false;
goto close_src_and_dst_desc;
}
/* If the file ends with a `hole', we need to do something to record
the length of the file. On modern systems, calling ftruncate does
the job. On systems without native ftruncate support, we have to
write a byte at the ending position. Otherwise the kernel would
truncate the file at the end of the last write operation. */
if (last_write_made_hole)
bool wrote_hole_at_eof;
if ( ! sparse_copy (source_desc, dest_desc, buf, buf_size,
make_holes, src_name, dst_name, UINTMAX_MAX,
&wrote_hole_at_eof)
|| (wrote_hole_at_eof &&
! sparse_copy_finalize (dest_desc, dst_name)))
{
if (ftruncate (dest_desc, n_read_total) < 0)
{
error (0, errno, _("truncating %s"), quote (dst_name));
return_val = false;
goto close_src_and_dst_desc;
}
return_val = false;
goto close_src_and_dst_desc;
}
}
preserve_metadata:
if (x->preserve_timestamps)
{
struct timespec timespec[2];
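
Note on the zero-block test in sparse_copy above: it stores a nonzero sentinel just past the data so that the scanning loops need no bounds check, and it skips whole words before looking at bytes. The sketch below shows the same idea in a standalone form. It is a simplified variant, not the code from copy.c: block_is_zero is a hypothetical name, and it loads words with memcpy to sidestep alignment and aliasing concerns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if the first N bytes of BUF are all zero.
   BUF must have room for N + sizeof (uintptr_t) bytes, because a
   sentinel is stored at BUF[N] and whole words are loaded.  */
static bool
block_is_zero (char *buf, size_t n)
{
  uintptr_t w;
  size_t i = 0;

  buf[n] = 1;                             /* Sentinel to stop both loops.  */
  memset (buf + n + 1, 0, sizeof w - 1);  /* Keep the last word load defined.  */

  /* Skip whole words that are zero; the word holding the sentinel
     is nonzero, so this loop always terminates.  */
  for (;;)
    {
      memcpy (&w, buf + i, sizeof w);
      if (w != 0)
        break;
      i += sizeof w;
    }

  /* Find the first nonzero byte, which may be the sentinel.  */
  while (buf[i] == 0)
    i++;

  /* Reaching the sentinel means every data byte was zero.  */
  return i == n;
}
```

When the function returns true, a caller in the style of sparse_copy would seek forward by N bytes in the destination instead of writing the block.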

116
src/extent-scan.c Normal file

@@ -0,0 +1,116 @@
/* extent-scan.c -- core functions for scanning extents
Copyright (C) 2010 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Written by Jie Liu (jeff.liu@oracle.com). */
#include <config.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <assert.h>
#include "system.h"
#include "extent-scan.h"
#ifndef HAVE_FIEMAP
# include "fiemap.h"
#endif
/* Allocate space for struct extent_scan, initialize the entries if
necessary and return it as the input argument of extent_scan_read(). */
extern void
extent_scan_init (int src_fd, struct extent_scan *scan)
{
scan->fd = src_fd;
scan->ei_count = 0;
scan->scan_start = 0;
scan->initial_scan_failed = false;
scan->hit_final_extent = false;
}
#ifdef __linux__
# ifndef FS_IOC_FIEMAP
# define FS_IOC_FIEMAP _IOWR ('f', 11, struct fiemap)
# endif
/* Call ioctl(2) with FS_IOC_FIEMAP (available in linux 2.6.27) to
obtain a map of file extents excluding holes. */
extern bool
extent_scan_read (struct extent_scan *scan)
{
union { struct fiemap f; char c[4096]; } fiemap_buf;
struct fiemap *fiemap = &fiemap_buf.f;
struct fiemap_extent *fm_extents = &fiemap->fm_extents[0];
enum { count = (sizeof fiemap_buf - sizeof *fiemap) / sizeof *fm_extents };
verify (count != 0);
/* This is required at least to initialize fiemap->fm_start,
but also serves (in mid 2010) to appease valgrind, which
appears not to know the semantics of the FIEMAP ioctl. */
memset (&fiemap_buf, 0, sizeof fiemap_buf);
fiemap->fm_start = scan->scan_start;
fiemap->fm_flags = FIEMAP_FLAG_SYNC;
fiemap->fm_extent_count = count;
fiemap->fm_length = FIEMAP_MAX_OFFSET - scan->scan_start;
/* Fall back to the standard copy if the first ioctl(2) call
fails. */
if (ioctl (scan->fd, FS_IOC_FIEMAP, fiemap) < 0)
{
if (scan->scan_start == 0)
scan->initial_scan_failed = true;
return false;
}
/* If 0 extents are returned, no further extent scans are needed. */
if (fiemap->fm_mapped_extents == 0)
{
scan->hit_final_extent = true;
return false;
}
scan->ei_count = fiemap->fm_mapped_extents;
scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
unsigned int i;
for (i = 0; i < scan->ei_count; i++)
{
assert (fm_extents[i].fe_logical <= OFF_T_MAX);
scan->ext_info[i].ext_logical = fm_extents[i].fe_logical;
scan->ext_info[i].ext_length = fm_extents[i].fe_length;
scan->ext_info[i].ext_flags = fm_extents[i].fe_flags;
}
i--;
if (scan->ext_info[i].ext_flags & FIEMAP_EXTENT_LAST)
{
scan->hit_final_extent = true;
return true;
}
scan->scan_start = fm_extents[i].fe_logical + fm_extents[i].fe_length;
return true;
}
#else
extern bool
extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
{
errno = ENOTSUP;
return false;
}
#endif
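
Note on the FIEMAP call used by extent_scan_read above: a smaller experiment with the same ioctl is to ask only for the number of extents. The sketch below assumes Linux with the kernel's own <linux/fiemap.h> and <linux/fs.h> headers available, and relies on the documented behaviour that a request with fm_extent_count set to 0 fills in fm_mapped_extents without returning any extent records. count_extents is a hypothetical helper, not part of extent-scan.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#ifdef __linux__
# include <sys/ioctl.h>
# include <linux/fs.h>      /* FS_IOC_FIEMAP */
# include <linux/fiemap.h>  /* struct fiemap, FIEMAP_* constants */
#endif

/* Hypothetical helper: return the number of extents the kernel reports
   for the file open on FD, or -1 (with errno set) if FIEMAP is not
   available or the ioctl fails.  */
static long
count_extents (int fd)
{
#ifdef __linux__
  struct fiemap fm;
  memset (&fm, 0, sizeof fm);
  fm.fm_start = 0;
  fm.fm_length = FIEMAP_MAX_OFFSET;
  fm.fm_flags = FIEMAP_FLAG_SYNC;   /* Sync the file before mapping.  */
  fm.fm_extent_count = 0;           /* Count only; return no records.  */
  if (ioctl (fd, FS_IOC_FIEMAP, &fm) < 0)
    return -1;
  return fm.fm_mapped_extents;
#else
  (void) fd;
  errno = ENOTSUP;
  return -1;
#endif
}
```

A return value of -1 corresponds to the case in which extent_scan_read sets initial_scan_failed and the caller falls back to a normal copy.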

68
src/extent-scan.h Normal file

@@ -0,0 +1,68 @@
/* core functions for efficiently reading sparse files
Copyright (C) 2010 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Written by Jie Liu (jeff.liu@oracle.com). */
#ifndef EXTENT_SCAN_H
# define EXTENT_SCAN_H
/* Structure used to store information of each extent. */
struct extent_info
{
/* Logical offset of an extent. */
off_t ext_logical;
/* Extent length. */
uint64_t ext_length;
/* Extent flags, use it for FIEMAP only, or set it to zero. */
uint32_t ext_flags;
};
/* Structure used to reserve extent scan information per file. */
struct extent_scan
{
/* File descriptor the extent scan runs against. */
int fd;
/* Next scan start offset. */
off_t scan_start;
/* Number of extent_info entries returned by a scan. */
uint32_t ei_count;
/* If true, fall back to a normal copy, either set by the
failure of ioctl(2) for FIEMAP or lseek(2) with SEEK_DATA. */
bool initial_scan_failed;
/* If true, the total extent scan per file has been finished. */
bool hit_final_extent;
/* Extent information: a malloc'd array of ei_count structs. */
struct extent_info *ext_info;
};
void extent_scan_init (int src_fd, struct extent_scan *scan);
bool extent_scan_read (struct extent_scan *scan);
static inline void
extent_scan_free (struct extent_scan *scan)
{
free (scan->ext_info);
}
#endif /* EXTENT_SCAN_H */
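
Note on how the extent records are consumed: each record carries a logical start and a length, and the gap between the end of one extent and the start of the next is the hole that extent_copy either seeks over or fills with zeros. The sketch below adds up those gaps, including a hole at end of file. struct extent_sketch and hole_bytes are hypothetical stand-ins for illustration, and the sketch assumes extents are sorted and do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extent_info: start and length only.  */
struct extent_sketch
{
  uint64_t start;
  uint64_t len;
};

/* Return the total number of bytes in FILE_SIZE that are not covered
   by any of the N extents in EXT.  */
static uint64_t
hole_bytes (const struct extent_sketch *ext, size_t n, uint64_t file_size)
{
  uint64_t pos = 0;    /* End of the previous extent.  */
  uint64_t holes = 0;
  for (size_t i = 0; i < n; i++)
    {
      assert (pos <= ext[i].start);   /* Sorted, non-overlapping.  */
      holes += ext[i].start - pos;    /* Hole before this extent.  */
      pos = ext[i].start + ext[i].len;
    }
  if (pos < file_size)
    holes += file_size - pos;         /* Hole at end of file.  */
  return holes;
}
```

For two 4096-byte extents at offsets 0 and 8192 in a 16384-byte file, the result is 8192: one 4096-byte hole between the extents and one at the end.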

102
src/fiemap.h Normal file

@@ -0,0 +1,102 @@
/* FS_IOC_FIEMAP ioctl infrastructure.
Some portions copyright (C) 2007 Cluster File Systems, Inc
Authors: Mark Fasheh <mfasheh@suse.com>
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>. */
/* Copy from kernel, modified to respect GNU code style by Jie Liu. */
#ifndef _LINUX_FIEMAP_H
# define _LINUX_FIEMAP_H
# include <stdint.h>
struct fiemap_extent
{
/* Logical offset in bytes for the start of the extent
from the beginning of the file. */
uint64_t fe_logical;
/* Physical offset in bytes for the start of the extent
from the beginning of the disk. */
uint64_t fe_physical;
/* Length in bytes for this extent. */
uint64_t fe_length;
uint64_t fe_reserved64[2];
/* FIEMAP_EXTENT_* flags for this extent. */
uint32_t fe_flags;
uint32_t fe_reserved[3];
};
struct fiemap
{
/* Logical offset(inclusive) at which to start mapping(in). */
uint64_t fm_start;
/* Logical length of mapping which userspace wants(in). */
uint64_t fm_length;
/* FIEMAP_FLAG_* flags for request(in/out). */
uint32_t fm_flags;
/* Number of extents that were mapped(out). */
uint32_t fm_mapped_extents;
/* Size of fm_extents array(in). */
uint32_t fm_extent_count;
uint32_t fm_reserved;
/* Array of mapped extents(out). */
struct fiemap_extent fm_extents[0];
};
/* The maximum offset that can be mapped for a file. */
# define FIEMAP_MAX_OFFSET (~0ULL)
/* Sync file data before map. */
# define FIEMAP_FLAG_SYNC 0x00000001
/* Map extended attribute tree. */
# define FIEMAP_FLAG_XATTR 0x00000002
# define FIEMAP_FLAGS_COMPAT (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
/* Last extent in file. */
# define FIEMAP_EXTENT_LAST 0x00000001
/* Data location unknown. */
# define FIEMAP_EXTENT_UNKNOWN 0x00000002
/* Location still pending, Sets EXTENT_UNKNOWN. */
# define FIEMAP_EXTENT_DELALLOC 0x00000004
/* Data can not be read while fs is unmounted. */
# define FIEMAP_EXTENT_ENCODED 0x00000008
/* Data is encrypted by fs. Sets EXTENT_NO_BYPASS. */
# define FIEMAP_EXTENT_DATA_ENCRYPTED 0x00000080
/* Extent offsets may not be block aligned. */
# define FIEMAP_EXTENT_NOT_ALIGNED 0x00000100
/* Data mixed with metadata. Sets EXTENT_NOT_ALIGNED. */
# define FIEMAP_EXTENT_DATA_INLINE 0x00000200
/* Multiple files in block. Set EXTENT_NOT_ALIGNED. */
# define FIEMAP_EXTENT_DATA_TAIL 0x00000400
/* Space allocated, but not data (i.e. zero). */
# define FIEMAP_EXTENT_UNWRITTEN 0x00000800
/* File does not natively support extents. Result merged for efficiency. */
# define FIEMAP_EXTENT_MERGED 0x00001000
/* Space shared with other files. */
# define FIEMAP_EXTENT_SHARED 0x00002000
#endif
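
Note on buffer sizing: extent_scan_read places a struct fiemap at the start of a 4096-byte buffer and uses the rest for extent records, so the number of extents returned per ioctl call follows from the two struct sizes. The sketch below repeats that calculation with local copies of the layouts shown above. The _sketch type names and extents_per_buffer are hypothetical and exist only so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the layouts in fiemap.h, renamed to avoid clashing
   with the real kernel header.  */
struct fiemap_extent_sketch
{
  uint64_t fe_logical;
  uint64_t fe_physical;
  uint64_t fe_length;
  uint64_t fe_reserved64[2];
  uint32_t fe_flags;
  uint32_t fe_reserved[3];
};

struct fiemap_sketch
{
  uint64_t fm_start;
  uint64_t fm_length;
  uint32_t fm_flags;
  uint32_t fm_mapped_extents;
  uint32_t fm_extent_count;
  uint32_t fm_reserved;
  struct fiemap_extent_sketch fm_extents[];
};

/* Return how many extent records fit after the header in a buffer of
   BUF_BYTES bytes.  */
static size_t
extents_per_buffer (size_t buf_bytes)
{
  assert (sizeof (struct fiemap_sketch) <= buf_bytes);
  return ((buf_bytes - sizeof (struct fiemap_sketch))
          / sizeof (struct fiemap_extent_sketch));
}
```

With a 32-byte header and 56-byte records, a 4096-byte buffer holds 72 extents, which is why a file with more extents than that needs more than one scan.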

View File

@@ -10,6 +10,7 @@ EXTRA_DIST = \
CuTmpdir.pm \
check.mk \
envvar-check \
filefrag-extent-compare \
init.cfg \
init.sh \
lang-default \
@@ -25,6 +26,7 @@ root_tests = \
cp/special-bits \
cp/cp-mv-enotsup-xattr \
cp/capability \
cp/sparse-fiemap \
dd/skip-seek-past-dev \
install/install-C-root \
ls/capability \
@@ -318,6 +320,8 @@ TESTS = \
cp/dir-vs-file \
cp/existing-perm-race \
cp/fail-perm \
cp/fiemap-perf \
cp/fiemap-2 \
cp/file-perm-race \
cp/into-self \
cp/link \

54
tests/cp/fiemap-2 Executable file

@@ -0,0 +1,54 @@
#!/bin/sh
# Exercise a few more corners of the fiemap-copying code.
# Copyright (C) 2011 Free Software Foundation, Inc.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
. "${srcdir=.}/init.sh"; path_prepend_ ../src
print_ver_ cp
# Require a fiemap-enabled FS.
df -T -t btrfs -t xfs -t ext4 -t ocfs2 . \
|| skip_ "this file system lacks FIEMAP support"
# Exercise the code that handles a file ending in a hole.
printf x > k || framework_failure_
dd bs=1k seek=128 of=k < /dev/null || framework_failure_
# The first time through the outer loop, the input file, K, ends with a hole.
# The second time through, we append a byte so that it does not.
for append in no yes; do
  test $append = yes && printf y >> k
  for i in always never; do
    cp --sparse=$i k k2 || fail=1
    cmp k k2 || fail=1
  done
done
# Ensure that --sparse=always can restore holes.
rm -f k
# Create a file starting with an "x", followed by 256K-1 0 bytes.
printf x > k || framework_failure_
dd bs=1k seek=1 of=k count=255 < /dev/zero || framework_failure_
# cp should detect the all-zero blocks and convert some of them to holes.
# How many it detects/converts currently depends on io_blksize.
# Currently, on my F14/ext4 desktop, this K starts off with size 256KiB,
# (note that the K in the preceding test starts off with size 4KiB).
# cp from coreutils-8.9 with --sparse=always reduces the size to 32KiB.
cp --sparse=always k k2 || fail=1
test $(stat -c %b k2) -lt $(stat -c %b k) || fail=1
Exit $fail
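
The first part of the test above relies on `dd` extending a file without writing any data. The following stand-alone sketch shows that step in isolation. It assumes GNU `dd`, `head -c`, and `mktemp -d` are available; the scratch directory is illustrative only.

```sh
#!/bin/sh
# Work in a scratch directory (mktemp -d is assumed to be available).
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1

# Write one data byte, then extend the file to 128KiB without writing
# anything: dd reads no input, yet still sets the output length to
# seek * bs, leaving the existing first byte in place.
printf x > k
dd bs=1k seek=128 of=k < /dev/null 2> /dev/null

size=$(($(wc -c < k)))   # apparent size in bytes
first=$(head -c 1 k)     # the data byte at offset 0
echo "size=$size first=$first"
```

Whether the extended region is stored as a hole depends on the file system; the apparent size is the same either way.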

tests/cp/fiemap-perf Executable file

@@ -0,0 +1,32 @@
#!/bin/sh
# ensure that a sparse file is copied efficiently, by default
# Copyright (C) 2011 Free Software Foundation, Inc.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
. "${srcdir=.}/init.sh"; path_prepend_ ../src
print_ver_ cp
# Require a fiemap-enabled FS.
df -T -t btrfs -t xfs -t ext4 -t ocfs2 . \
|| skip_ "this file system lacks FIEMAP support"
# Create a large-but-sparse file.
timeout 10 truncate -s1T f || framework_failure_
# Nothing can read (much less write) that many bytes in so little time.
timeout 10 cp f f2 || fail=1
Exit $fail
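
As a smaller-scale illustration of what this test checks, the sketch below creates a data-free file with `truncate`, copies it, and compares apparent sizes. It is not part of the change: the 64MiB size and scratch directory are arbitrary choices, and it uses whichever `cp` is first in `PATH`.

```sh
#!/bin/sh
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1

# A 64MiB file containing no data.  On a file system that supports
# holes it should occupy few or no blocks.
truncate -s 64M f
cp f f2

src_size=$(($(wc -c < f)))
dst_size=$(($(wc -c < f2)))
echo "apparent size: $src_size -> $dst_size"

# Allocated space in KiB; the values depend on the file system.
du -k f f2
```

The real test uses a 1TiB file precisely so that a copy which reads every byte cannot finish within the 10-second timeout.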

tests/cp/sparse-fiemap Executable file

@@ -0,0 +1,119 @@
#!/bin/sh
# Test cp --sparse=always through fiemap copy
# Copyright (C) 2010 Free Software Foundation, Inc.
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
if test "$VERBOSE" = yes; then
  set -x
  cp --version
fi
. "${srcdir=.}/init.sh"; path_prepend_ ../src
if df -T -t btrfs -t xfs -t ext4 -t ocfs2 . ; then
  : # Current dir is on a partition with working extents. Good!
else
  # It's not; we need to create one, hence we need root access.
  require_root_
  cwd=$PWD
  cleanup_() { cd /; umount "$cwd/mnt"; }
  skip=0
  # Create an ext4 loopback file system
  dd if=/dev/zero of=blob bs=32k count=1000 || skip=1
  mkdir mnt
  mkfs -t ext4 -F blob ||
    skip_test_ "failed to create ext4 file system"
  mount -oloop blob mnt || skip=1
  cd mnt || skip=1
  echo test > f || skip=1
  test -s f || skip=1
  test $skip = 1 &&
    skip_test_ "insufficient mount/ext4 support"
fi
# Create a 1TiB sparse file
dd if=/dev/zero of=sparse bs=1k count=1 seek=1G || framework_failure_
# It takes many minutes to copy this sparse file using the old method.
# By contrast, it takes far less than 1 second using FIEMAP-copy.
timeout 10 cp --sparse=always sparse fiemap || fail=1
# Ensure that the sparse file copied through fiemap has the same size
# in bytes as the original.
test $(stat --printf %s sparse) = $(stat --printf %s fiemap) || fail=1
# =================================================
# Ensure that we exercise the FIEMAP-copying code enough
# to provoke at least two iterations of the do...while loop
# in which it calls ioctl (fd, FS_IOC_FIEMAP,...
# This also verifies that non-trivial extents are preserved.
$PERL -e 1 || skip_test_ 'skipping part of this test; you lack perl'
# Extract logical block number and length pairs from filefrag -v output.
# The initial sed is to remove the "eof" from the normally-empty "flags" field.
# Similarly, remove flags values like "unknown,delalloc,eof".
# That is required when that final extent has no number in the "expected" field.
f()
{
  sed 's/ [a-z,][a-z,]*$//' "$@" \
    | awk '/^ *[0-9]/ {printf "%d %d ", $2 ,NF < 5 ? $NF : $5 } END {print ""}'
}
for i in $(seq 1 2 21); do
  for j in 1 2 31 100; do
    $PERL -e 'BEGIN { $n = '$i' * 1024; *F = *STDOUT }' \
          -e 'for (1..'$j') { sysseek (*F, $n, 1)' \
          -e '&& syswrite (*F, chr($_)x$n) or die "$!"}' > j1 || fail=1
    # sync
    cp --sparse=always j1 j2 || fail=1
    # sync
    # Technically we may need the 'sync' uses above, but
    # uncommenting them makes this test take much longer.
    cmp j1 j2 || fail=1
    filefrag -v j1 | grep extent \
      || skip_test_ 'skipping part of this test; you lack filefrag'
    # Here is sample filefrag output:
    #   $ perl -e 'BEGIN{$n=16*1024; *F=*STDOUT}' \
    #          -e 'for (1..5) { sysseek(*F,$n,1)' \
    #          -e '&& syswrite *F,"."x$n or die "$!"}' > j
    #   $ filefrag -v j
    #   File system type is: ef53
    #   File size of j is 163840 (40 blocks, blocksize 4096)
    #    ext logical physical expected length flags
    #      0       4  6258884               4
    #      1      12  6258892  6258887      4
    #      2      20  6258900  6258895      4
    #      3      28  6258908  6258903      4
    #      4      36  6258916  6258911      4 eof
    #   j: 6 extents found
    # exclude the physical block numbers; they always differ
    filefrag -v j1 > ff1 || fail=1
    filefrag -v j2 > ff2 || fail=1
    { f ff1; f ff2; } \
      | $PERL $abs_top_srcdir/tests/filefrag-extent-compare \
        || { fail=1; break; }
  done
  test $fail = 1 && break
done
Exit $fail
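
To make the extraction step concrete, the sketch below runs a copy of the `f` helper over an abridged version of the sample `filefrag -v` output quoted in the test's comments. The awk ternary is parenthesized here purely for readability, and the input is illustrative; real output varies by file system and e2fsprogs version.

```sh
#!/bin/sh
# Copy of the test's helper: strip trailing flag words, then print the
# logical block number and length of every extent line.
f()
{
  sed 's/ [a-z,][a-z,]*$//' "$@" \
    | awk '/^ *[0-9]/ {printf "%d %d ", $2, (NF < 5 ? $NF : $5)} END {print ""}'
}

# Abridged sample input, fed on stdin.
pairs=$(f <<'EOF'
File size of j is 163840 (40 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       4  6258884               4
   1      12  6258892  6258887      4
   4      36  6258916  6258911      4 eof
j: 6 extents found
EOF
)

# Normalize whitespace, then show the "logical length" pairs.
set -- $pairs
result="$*"
echo "$result"
```

The first extent has no "expected" column, so it has only four fields and its length is the last field. That is why the helper falls back to `$NF` when `NF < 5`.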


@@ -0,0 +1,68 @@
eval '(exit $?0)' && eval 'exec perl -wS "$0" ${1+"$@"}'
& eval 'exec perl -wS "$0" $argv:q'
if 0;
# Determine whether two files have the same extents by comparing
# the logical block numbers and lengths from filefrag -v for each.
# Invoke like this, using the helper function f to extract the
# logical block number and length of each extent:
# f() { awk '/^ *[0-9]/ {printf "%d %d ",$2,NF<5?$NF:$5} END {print ""}'; }
# { filefrag -v j1 | f; filefrag -v j2 | f; } | ./filefrag-extent-compare
use warnings;
use strict;
(my $ME = $0) =~ s|.*/||;
my @line = <>;
my $n_lines = @line;
$n_lines == 2
or die "$ME: expected exactly two input lines; got $n_lines\n";
my @A = split ' ', $line[0];
my @B = split ' ', $line[1];
@A % 2 || @B % 2
and die "$ME: unexpected input: odd number of numbers; expected even\n";
my @a;
my @b;
foreach my $i (0..@A/2-1) { $a[$i] = { L_BLK => $A[2*$i], LEN => $A[2*$i+1] } };
foreach my $i (0..@B/2-1) { $b[$i] = { L_BLK => $B[2*$i], LEN => $B[2*$i+1] } };
my $i = 0;
my $j = 0;
while (1)
  {
    # Both lists exhausted at the same time: the extents match.
    !defined $a[$i] && !defined $b[$j]
      and exit 0;
    defined $a[$i] && defined $b[$j]
      or die "\@a and \@b have different lengths, even after adjustment\n";
    ($a[$i]->{L_BLK} == $b[$j]->{L_BLK}
     && $a[$i]->{LEN} == $b[$j]->{LEN})
      and next;
    # Two adjacent extents in @a may together match one extent in @b.
    ($a[$i]->{LEN} < $b[$j]->{LEN}
     && exists $a[$i+1] && $a[$i]->{LEN} + $a[$i+1]->{LEN} == $b[$j]->{LEN})
      and ++$i, next;
    # Likewise, two adjacent extents in @b may match one extent in @a.
    exists $b[$j+1] && $a[$i]->{LEN} == $b[$j]->{LEN} + $b[$j+1]->{LEN}
      and ++$j, next;
    die "differing extent:\n"
      . " [$i]=$a[$i]->{L_BLK} $a[$i]->{LEN}\n"
      . " [$j]=$b[$j]->{L_BLK} $b[$j]->{LEN}\n"
  }
continue
  {
    ++$i;
    ++$j;
  }
### Setup "GNU" style for perl-mode and cperl-mode.
## Local Variables:
## mode: perl
## perl-indent-level: 2
## perl-continued-statement-offset: 2
## perl-continued-brace-offset: 0
## perl-brace-offset: 0
## perl-brace-imaginary-offset: 0
## perl-label-offset: -2
## perl-extra-newline-before-brace: t
## perl-merge-trailing-else: nil
## End:
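
The comparison script tolerates one file reporting a single extent where the other reports two. As a rough illustration of that idea, here is a shell sketch of a related but different technique: first merge logically contiguous extents in each list, then compare the results as strings. The helper `merge_extents` is hypothetical and not part of this change. Unlike the Perl script, it checks that extents are contiguous rather than only that their lengths sum.

```sh
#!/bin/sh
# merge_extents is a hypothetical helper, not part of this change.
# It takes "start length" pairs in $1 and joins an extent to the
# previous one when it begins exactly where the previous one ends.
merge_extents()
{
  echo "$1" | awk '{
    n = 0
    for (k = 1; k < NF; k += 2) {
      if (n > 0 && start[n] + size[n] == $k)
        size[n] += $(k + 1)
      else {
        n++; start[n] = $k; size[n] = $(k + 1)
      }
    }
    for (k = 1; k <= n; k++)
      printf "%s%d %d", (k > 1 ? " " : ""), start[k], size[k]
    print ""
  }'
}

# Two 4-block extents at 0 and 4 are equivalent to one 8-block extent at 0.
a=$(merge_extents "0 4 4 4 12 4")
b=$(merge_extents "0 8 12 4")
test "$a" = "$b" && echo same || echo differ
```

Normalizing both sides before comparing avoids the index bookkeeping of a pairwise walk, at the cost of being stricter about contiguity.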