Manually transferring Drupal files with rsync

When we are migrating a site, rsync is our friend. Let's get intimate and hands-on with file synchronization.

rsync lets you recursively synchronize directories, transferring only the files and directories that differ between the two sides. If you are working on a site with a lot of files, downloading an archive every so often can be very tedious. rsync operations run over SSH and are quite fast, usually much faster than SFTP.

The most important thing you NEED to remember about rsync early on is that operations happen FROM a source TO a target destination.

Read the rsync manual online, or run man rsync in your terminal.

Using Drush's built-in rsync

Note that drush rsync can be funky, especially with older versions of Drush. Native rsync is generally recommended.

Basic Usage

Drush aliases make the source and destination of an operation clear.

Here is some basic usage. We pull images down from dev to our local. Drush will find the path to each instance's %files directory. Magic.

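Note that the trailing slash appears on the source, but not on the target destination. A trailing slash on the source tells rsync to copy the contents of that directory rather than the directory itself.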

$ drush rsync @mysite.dev:%files/images/ @mysite.local:%files/images
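
The trailing slash matters just as much with native rsync. Here is a minimal sketch that shows the difference on throwaway local directories. The directory and file names under the mktemp directory are made up for illustration, and nothing outside that directory is touched.

```shell
#!/usr/bin/env bash
# Demo: effect of a trailing slash on the rsync source (local temp dirs only).
# All paths below are hypothetical and live in a fresh temp directory.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

demo=$(mktemp -d)
mkdir -p "$demo/images"
echo "logo" > "$demo/images/logo.png"

# Trailing slash on the source: copy the CONTENTS of images/ into with_slash/.
rsync -a "$demo/images/" "$demo/with_slash"

# No trailing slash: copy the directory ITSELF, creating no_slash/images/.
rsync -a "$demo/images" "$demo/no_slash"

# List what landed where.
find "$demo/with_slash" "$demo/no_slash" -type f
```

The first copy ends up as with_slash/logo.png, the second as no_slash/images/logo.png.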

Here, we pull down all files from live to local, but exclude Drupal's private files:

$ drush rsync --exclude=private @mysite.live:%files/ @mysite.local:%files

Note that you cannot use Drush to sync two remote instances, such as live to test.

Safety first: passing flags

Drush supports only some rsync flags. For more complicated rsync operations, we will leave Drush. But let's see what it can do for us.

Drush has some option flags of its own we can send to test operations before committing. In this case we are going from our local to staging:

$ drush --simulate --debug rsync --mode=razoglp @sitename.local:%files/images/ @sitename.staging:%files/images


Excluding Paths

Drush provides the ability to leverage parts of rsync, but will not easily exclude files based on patterns. We can exclude paths, however:

$ drush rsync -y --mode=razog --exclude-paths=%files/private:%files/css:%files/js:%files/styles:%files/ctools:%files/backup_migrate @mysite.dev:%files/ @mysite.local:%files/ 
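
The colon-separated value gets long quickly. As a small sketch, it can be assembled from a plain list of directory names. The array name dirs and the variable exclude_paths are illustrative only and are not part of Drush.

```shell
#!/usr/bin/env bash
# Sketch: build the colon-separated --exclude-paths value from a list.
# "dirs" and "exclude_paths" are hypothetical names chosen for this example.
dirs=(private css js styles ctools backup_migrate)

exclude_paths=""
for d in "${dirs[@]}"; do
  # Add a ":" separator before every entry except the first.
  exclude_paths+="${exclude_paths:+:}%files/$d"
done

# Print the finished option so it can be pasted into a drush rsync command.
echo "--exclude-paths=$exclude_paths"
```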


Using native rsync for full control of file transfers


Here is a verbose exclude list using straight rsync:

$ rsync -razv --delete --exclude=css --exclude='*_cache' --exclude=js --exclude=googleanalytics --exclude=xmlsitemap \
  --exclude=backup_migrate --exclude=styles user@domain.com:/path/to/remote/files/ \
  /path/to/local/files/

We exclude a number of directories here and also delete anything on our local that does not exist on the remote.
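
Because --delete removes files on the target, it is worth previewing the run first. The sketch below uses two throwaway local directories (the names remote and local are stand-ins, not real hosts) to show rsync's own --dry-run and --itemize-changes flags.

```shell
#!/usr/bin/env bash
# Demo: preview what --delete would remove before running it for real.
# "remote" and "local" are hypothetical local directories, not real hosts.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

demo=$(mktemp -d)
mkdir -p "$demo/remote" "$demo/local"
echo "keep"  > "$demo/remote/keep.txt"
echo "stale" > "$demo/local/stale.txt"   # exists only on the target

# -n (--dry-run) reports the changes without touching the target;
# -i (--itemize-changes) prints one line per change, including deletions.
preview=$(rsync -ain --delete "$demo/remote/" "$demo/local/")
echo "$preview"

# The dry run changed nothing, so stale.txt is still there.
[ -f "$demo/local/stale.txt" ] && echo "stale.txt still present after dry run"

# Run again without -n to apply the changes.
rsync -a --delete "$demo/remote/" "$demo/local/"
ls "$demo/local"
```

The same -n and -i flags can be added to the remote command above before committing to it.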


Setting hosts

With native rsync, we don't use Drush aliases. However, we can set up shortcuts to hosts in ~/.ssh/config.

# mysite
Host mysite.dev
  Hostname mysite.com
  port 2222
  Compression yes
  User username
Host mysite.test
  Hostname test.mysite.com
  port 2222
  Compression yes
  User username


These greatly simplify logging in via SSH. Now, instead of

$ ssh -p 2222 username@test.mysite.com

you can simply do:

$ ssh mysite.test
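
rsync runs over ssh, so the same shortcut works as the host part of an rsync source or target. The sketch below only assembles and prints the command. The remote and local paths are placeholders, and --dry-run is included so that a first run changes nothing.

```shell
#!/usr/bin/env bash
# Sketch: reuse the mysite.test shortcut from ~/.ssh/config with rsync.
# The remote and local paths are placeholders, not real locations.
remote="mysite.test:/path/to/remote/files/"
local_dir="/path/to/local/files/"

# With the shortcut in place, rsync needs no user, port, or -e option.
cmd=(rsync -az --dry-run "$remote" "$local_dir")

# Print the command for review; run "${cmd[@]}" once it looks right.
printf '%s ' "${cmd[@]}"; echo
```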


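If you have not set up a host shortcut, you can pass the SSH options straight to rsync with the -e flag. Here ssh connects on port 1022 (-p) with agent forwarding (-a) and X11 forwarding (-x) disabled:

  -e 'ssh -axp 1022'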


Without -l (or -a, which implies -l), rsync skips symbolic links instead of transferring them.
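
A quick way to see the symlink behaviour is to try it on a throwaway directory. In this sketch the file names are made up, and the two target directories (plain and links) exist only for comparison.

```shell
#!/usr/bin/env bash
# Demo: symlinks are skipped with -r alone and copied as symlinks with -l.
# All names below are hypothetical and live in a fresh temp directory.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

demo=$(mktemp -d)
mkdir -p "$demo/src"
echo "data" > "$demo/src/real.txt"
ln -s real.txt "$demo/src/link.txt"

# -r alone: rsync reports that it is skipping the non-regular file.
rsync -r "$demo/src/" "$demo/plain"

# -l (also implied by -a): the symlink is recreated as a symlink.
rsync -rl "$demo/src/" "$demo/links"

ls -l "$demo/plain" "$demo/links"
```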


We can replace multiple exclude patterns in the rsync command with a reference to an external file, using the --exclude-from flag, like so: --exclude-from '/var/www/backup/rsync-exclude.txt'

Here are files and directories to exclude from Drupal's files directory.

css
civicrm
ctools
js
googleanalytics
xmlsitemap
backup_migrate
tmp
styles
.DS_Store
.DS_Store*
.sass-cache
ehthumbs.db
Icon?
Thumbs.db
._*
*.bak
*_cache
*.tgz
*.un~
.buildpath
.project
.settings
*.sublime-project
*.sublime-workspace
*.sublime-projectcompletions
nbproject
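
An exclude file can be tried out safely before pointing it at a real site. This sketch writes a short subset of the list to a temporary file and syncs a made-up files directory locally. Every path is hypothetical and lives under a mktemp directory.

```shell
#!/usr/bin/env bash
# Demo: --exclude-from with a pattern file, using local temp dirs only.
# The directory layout is invented for illustration.
command -v rsync >/dev/null 2>&1 || { echo "rsync not installed" >&2; exit 0; }

demo=$(mktemp -d)
mkdir -p "$demo/files/css" "$demo/files/images" "$demo/files/views_cache"
echo "a" > "$demo/files/css/agg.css"
echo "b" > "$demo/files/images/photo.jpg"
echo "c" > "$demo/files/views_cache/item"
echo "d" > "$demo/files/.DS_Store"

# One pattern per line; a short subset of the full list.
cat > "$demo/rsync-exclude.txt" <<'EOF'
css
.DS_Store
*_cache
EOF

rsync -a --exclude-from="$demo/rsync-exclude.txt" "$demo/files/" "$demo/copy/"

# Only images/photo.jpg should have been copied.
find "$demo/copy" -type f
```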


rsync to Pantheon

Grab the Pantheon credentials from the destination account and modify to fit.

Log in to the client server to run this operation, though you can sync a local directory to Pantheon in much the same way.

The machine performing the rsync operation needs its public SSH key added to your Pantheon control panel, if it is not already there.

# rsync the files to Pantheon dev
rsync -trazog --verbose --exclude-from='/var/www/backup/rsync-files-exclude.txt' \
  --ipv4 -e 'ssh -p 2222' \
  /var/www/mysite/htdocs/sites/default/files/ \
  dev.b37d48af-d7ab-4d1e-b6cf-XXXXXXXXXXX@appserver.dev.b37d48af-d7ab-4d1e-b6cf-XXXXXXXXXXX.drush.in:files
# Enter your Pantheon dashboard password when prompted
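
The long target string repeats the environment name and site ID twice, which is easy to mistype. One way to keep it readable is to assemble it from variables and print the command for review first. In this sketch pantheon_env and site are illustrative variable names, the site ID is the same placeholder as in the example, and whether environments other than dev accept the same pattern is an assumption to check against your own Pantheon credentials.

```shell
#!/usr/bin/env bash
# Sketch: assemble the Pantheon rsync command with --dry-run for a preview.
# pantheon_env and site are hypothetical variable names; the UUID is a placeholder.
pantheon_env="dev"
site="b37d48af-d7ab-4d1e-b6cf-XXXXXXXXXXX"
target="$pantheon_env.$site@appserver.$pantheon_env.$site.drush.in:files"

cmd=(rsync -trazog --verbose --dry-run --ipv4 -e 'ssh -p 2222'
     --exclude-from='/var/www/backup/rsync-files-exclude.txt'
     /var/www/mysite/htdocs/sites/default/files/
     "$target")

# Print with shell quoting so the line can be pasted back into a terminal.
# Remove --dry-run from the array to perform the actual transfer.
printf '%q ' "${cmd[@]}"; echo
```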



Resources



rsync options
 -v, --verbose               increase verbosity
 -q, --quiet                 suppress non-error messages
     --no-motd               suppress daemon-mode MOTD (see caveat)
 -c, --checksum              skip based on checksum, not mod-time & size
 -a, --archive               archive mode; equals -rlptgoD (no -H,-A,-X)
     --no-OPTION             turn off an implied OPTION (e.g. --no-D)
 -r, --recursive             recurse into directories
 -R, --relative              use relative path names
     --no-implied-dirs       don't send implied dirs with --relative
 -b, --backup                make backups (see --suffix & --backup-dir)
     --backup-dir=DIR        make backups into hierarchy based in DIR
     --suffix=SUFFIX         backup suffix (default ~ w/o --backup-dir)
 -u, --update                skip files that are newer on the receiver
     --inplace               update destination files in-place
     --append                append data onto shorter files
     --append-verify         --append w/old data in file checksum
 -d, --dirs                  transfer directories without recursing
 -l, --links                 copy symlinks as symlinks
 -L, --copy-links            transform symlink into referent file/dir
     --copy-unsafe-links     only "unsafe" symlinks are transformed
     --safe-links            ignore symlinks that point outside the tree
 -k, --copy-dirlinks         transform symlink to dir into referent dir
 -K, --keep-dirlinks         treat symlinked dir on receiver as dir
 -H, --hard-links            preserve hard links
 -p, --perms                 preserve permissions
 -E, --executability         preserve executability
     --chmod=CHMOD           affect file and/or directory permissions
 -A, --acls                  preserve ACLs (implies -p)
 -X, --xattrs                preserve extended attributes
 -o, --owner                 preserve owner (super-user only)
 -g, --group                 preserve group
     --devices               preserve device files (super-user only)
     --specials              preserve special files
 -D                          same as --devices --specials
 -t, --times                 preserve modification times
 -O, --omit-dir-times        omit directories from --times
     --super                 receiver attempts super-user activities
     --fake-super            store/recover privileged attrs using xattrs
 -S, --sparse                handle sparse files efficiently
 -n, --dry-run               perform a trial run with no changes made
 -W, --whole-file            copy files whole (w/o delta-xfer algorithm)
 -x, --one-file-system       don't cross filesystem boundaries
 -B, --block-size=SIZE       force a fixed checksum block-size
 -e, --rsh=COMMAND           specify the remote shell to use
     --rsync-path=PROGRAM    specify the rsync to run on remote machine
     --existing              skip creating new files on receiver
     --ignore-existing       skip updating files that exist on receiver
     --remove-source-files   sender removes synchronized files (non-dir)
     --del                   an alias for --delete-during
     --delete                delete extraneous files from dest dirs
     --delete-before         receiver deletes before transfer (default)
     --delete-during         receiver deletes during xfer, not before
     --delete-delay          find deletions during, delete after
     --delete-after          receiver deletes after transfer, not before
     --delete-excluded       also delete excluded files from dest dirs
     --ignore-errors         delete even if there are I/O errors
     --force                 force deletion of dirs even if not empty
     --max-delete=NUM        don't delete more than NUM files
     --max-size=SIZE         don't transfer any file larger than SIZE
     --min-size=SIZE         don't transfer any file smaller than SIZE
     --partial               keep partially transferred files
     --partial-dir=DIR       put a partially transferred file into DIR
     --delay-updates         put all updated files into place at end
 -m, --prune-empty-dirs      prune empty directory chains from file-list
     --numeric-ids           don't map uid/gid values by user/group name
     --timeout=SECONDS       set I/O timeout in seconds
     --contimeout=SECONDS    set daemon connection timeout in seconds
 -I, --ignore-times          don't skip files that match size and time
     --size-only             skip files that match in size
     --modify-window=NUM     compare mod-times with reduced accuracy
 -T, --temp-dir=DIR          create temporary files in directory DIR
 -y, --fuzzy                 find similar file for basis if no dest file
     --compare-dest=DIR      also compare received files relative to DIR
     --copy-dest=DIR         ... and include copies of unchanged files
     --link-dest=DIR         hardlink to files in DIR when unchanged
 -z, --compress              compress file data during the transfer
     --compress-level=NUM    explicitly set compression level
     --skip-compress=LIST    skip compressing files with suffix in LIST
 -C, --cvs-exclude           auto-ignore files in the same way CVS does
 -f, --filter=RULE           add a file-filtering RULE
 -F                          same as --filter='dir-merge /.rsync-filter'
                             repeated: --filter='- .rsync-filter'
     --exclude=PATTERN       exclude files matching PATTERN
     --exclude-from=FILE     read exclude patterns from FILE
     --include=PATTERN       don't exclude files matching PATTERN
     --include-from=FILE     read include patterns from FILE
     --files-from=FILE       read list of source-file names from FILE
 -0, --from0                 all *from/filter files are delimited by 0s
 -s, --protect-args          no space-splitting; wildcard chars only
     --address=ADDRESS       bind address for outgoing socket to daemon
     --port=PORT             specify double-colon alternate port number
     --sockopts=OPTIONS      specify custom TCP options
     --blocking-io           use blocking I/O for the remote shell
     --stats                 give some file-transfer stats
 -8, --8-bit-output          leave high-bit chars unescaped in output
 -h, --human-readable        output numbers in a human-readable format
     --progress              show progress during transfer
 -P                          same as --partial --progress
 -i, --itemize-changes       output a change-summary for all updates
     --out-format=FORMAT     output updates using the specified FORMAT
     --log-file=FILE         log what we're doing to the specified FILE
     --log-file-format=FMT   log updates using the specified FMT
     --password-file=FILE    read daemon-access password from FILE
     --list-only             list the files instead of copying them
     --bwlimit=KBPS          limit I/O bandwidth; KBytes per second
     --write-batch=FILE      write a batched update to FILE
     --only-write-batch=FILE like --write-batch but w/o updating dest
     --read-batch=FILE       read a batched update from FILE
     --protocol=NUM          force an older protocol version to be used
     --iconv=CONVERT_SPEC    request charset conversion of filenames
     --checksum-seed=NUM     set block/file checksum seed (advanced)
 -4, --ipv4                  prefer IPv4
 -6, --ipv6                  prefer IPv6
     --version               print version number
(-h) --help                  show this help (see below for -h comment)