Tuesday, October 21, 2014

How to use RSYNC to compare 2 directories / folders and then create a 3rd directory with only the files that are new / different?

It took me a long while to come up with an rsync command that does exactly what I needed. It can be used to manage server backup folders or just to keep two folders synchronized.

In my case, I have 2 very similar folders. folderA is my "live" real site. folderB is my backup copy of folderA.

To keep my folderB backup copy synchronized with the real folderA website, I need rsync to pick up only the files that differ in any way, whether new or modified. For example, if I updated a PHP file on the real site, I want it copied to the backup site, and if someone uploaded a picture on the real site, I want that new file copied to the backup folder as well.

Please note that the command below uses the "--dry-run" parameter. Keep it at first: it puts rsync in "test mode", so nothing is actually changed or written until you delete --dry-run from the command.

Because I don't want the files written into my backup folder automatically, I am creating a 3rd folder called "difference", and that's where ALL updated files from folderA (real site) will be created. This means that afterwards I need to copy the contents of the "difference" folder into my backup folderB myself (overwriting any existing files in folderB with the ones from the difference folder).
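The copy-back step described above can also be done with rsync. Here is a minimal sketch, not something from my original setup: the function name apply_difference is made up for illustration, and the paths in the usage comments are assumed to match the ones used in this post. I pass --checksum so that files are compared by content rather than by size and timestamp.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of the "difference" folder
# over the backup folder.
#   $1 = difference folder, $2 = backup folder,
#   any remaining arguments are passed to rsync (e.g. --dry-run).
apply_difference() {
    src=$1
    dst=$2
    shift 2
    # Trailing slashes mean "the contents of src go into dst".
    rsync -rv --checksum "$@" "$src/" "$dst/"
}

# Preview first (nothing is written):
#   apply_difference /var/www/html/difference /var/www/html/folderB --dry-run
# Then run it for real:
#   apply_difference /var/www/html/difference /var/www/html/folderB
```

Reviewing the --dry-run output before the real run keeps the "nothing gets overwritten without me looking at it first" idea intact.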

Confused yet? I think I am. Summary is: compare real site folderA to backup folderB and create the updated files inside a brand new /difference/ folder.

The command: 

rsync -rv --checksum --exclude '.htaccess' --exclude 'wp-config.php' --progress --stats --dry-run --compare-dest=/var/www/html/folderB/ /var/www/html/folderA/ /var/www/html/difference

  • note that I am excluding 2 files from being synchronized; modify or delete the --exclude parameters to fit your needs
  • folderA = your real site / your source of all up to date changes
  • folderB = this is your backup folder
  • difference = this is the brand new folder that receives the files from folderA that do not exist or are not up to date in folderB
Don't forget to remove the --dry-run parameter to actually write the files into the /difference/ directory!
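If you want to see how --compare-dest behaves before pointing it at a real site, here is a small throwaway test. It is only a sketch: the scratch directory and the file names (same.txt, changed.php, new.jpg) are invented for the demo, and nothing under /var/www is touched.

```shell
#!/bin/sh
# Build a scratch copy of the folderA / folderB layout in a temp directory.
work=$(mktemp -d)
mkdir -p "$work/folderA" "$work/folderB"

# Identical in both folders: should NOT end up in "difference".
echo "same" > "$work/folderA/same.txt"
echo "same" > "$work/folderB/same.txt"

# Modified on the "live" side: should be copied.
echo "new version" > "$work/folderA/changed.php"
echo "old version" > "$work/folderB/changed.php"

# Exists only on the "live" side: should be copied.
echo "picture" > "$work/folderA/new.jpg"

# Same idea as the command above, without --dry-run so files are written.
rsync -r --checksum --compare-dest="$work/folderB/" \
    "$work/folderA/" "$work/difference"

ls "$work/difference"
```

If everything works as I expect, the difference folder should contain only changed.php and new.jpg, while same.txt is skipped because its content matches the copy in folderB.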