Revised: 2024-09-09
Even though Robocopy is a multi-threaded application, you can speed up copy times significantly by running multiple Robocopy jobs in parallel. This can cut your backup time by a factor of 6 or more. To do this, you need to come up with a strategy that maximizes the number of independent folders Robocopy can process concurrently.
For example, creating a backup job for each user home directory typically distributes the work fairly evenly, letting Robocopy back up each directory concurrently.
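As a minimal sketch of that per-user approach, the snippet below dispatches one Robocopy job per user with Start-Job. The share path, destination, and user names are hypothetical placeholders; the full, throttled script follows below.

# Minimal sketch: one Robocopy job per user directory, all running in parallel.
# \\server\share\home and D:\backup are hypothetical paths - adjust for your environment.
$jobs = "alice", "bob" | ForEach-Object {
    Start-Job -ScriptBlock {
        param($name)
        robocopy "\\server\share\home\$name" "D:\backup\$name" /mir /mt:8
    } -ArgumentList $_
}
# Wait for every job to finish and collect its output
$jobs | Wait-Job | Receive-Job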
The script below lists the user home mount and dispatches one job per user directory. It iterates through all of the user folders and throttles the number of jobs running concurrently on the backup server. You simply need to change $src, $dest, and $log to match your VPSA and server configuration. The default $max_jobs of 8 is a good starting point, and you probably don't want to exceed 40 jobs; beyond that, performance may degrade.
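Before launching the full run, you can preview the folder list the script will iterate over. This one-liner uses the sample $src share from the script below:

# Preview the directories that will each become a backup job
Get-ChildItem "\\10.10.1.163\nas-encrypt\home\" | Select-Object -ExpandProperty Name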
Clone this repo for the latest updates:
https://github.com/howardyoung/parallel-robocopy
###
#
# This script runs Robocopy jobs in parallel by increasing the number of outstanding I/Os to the VPSA. Even though you can
# change the number of threads using the "/mt:#" parameter, your backups will run faster by adding two or more jobs to your
# original set.
#
# To do this, you need to subdivide the work into directories; each job recurses its directory until it completes.
# The ideal case is to have hundreds of directories at the root of the backup. Simply change $src to get
# the list of folders to back up; that list feeds $ScriptBlock.
#
# For maximum SMB throughput, do not exceed 8 concurrent Robocopy jobs with 20 threads each. Any more will degrade
# performance by causing disk thrashing while looking up directory entries. Lower the thread count to 8 if one
# or more of your volumes are encrypted.
#
# Parameters:
# $src        Change this to a directory that has lots of subdirectories which can be processed in parallel
# $dest       Change this to the location you want to back up your files to
# $max_jobs   Change this to the number of parallel jobs to run ( <= 8 )
# $log        Change this to the directory where you want to store the output of each Robocopy job.
#
####
#
# This script will throttle the number of concurrent jobs based on $max_jobs
#
$max_jobs = 8
$tstart = Get-Date
#
# Set $src to a directory with lots of sub-directories
#
$src = "\\10.10.1.163\nas-encrypt\home\"
#
# Set $dest to a local folder or share you want to back up the data to
#
$dest = "c:\users\administrator\tmp\backup\"
#
# Set $log to a local folder to store log files
#
$log = "c:\users\administrator\Logs\"
$files = Get-ChildItem $src
$files | ForEach-Object {
    $ScriptBlock = {
        param($name, $src, $dest, $log)
        # One log file per job, stamped with the start time (hour included to avoid collisions)
        $log += "$name-$(Get-Date -f yyyy-MM-dd-HH-mm-ss).log"
        robocopy $src$name $dest$name /mir /nfl /np /mt:8 /ndl > $log
        Write-Host $src$name " completed"
    }
    # Throttle: wait for a free job slot before dispatching the next folder
    $j = Get-Job -State "Running"
    while ($j.count -ge $max_jobs)
    {
        Start-Sleep -Milliseconds 500
        $j = Get-Job -State "Running"
    }
    # Drain and remove finished jobs, then start a job for this folder
    Get-Job -State "Completed" | Receive-Job
    Remove-Job -State "Completed"
    Start-Job $ScriptBlock -ArgumentList $_, $src, $dest, $log
}
#
# No more jobs to process. Wait for all of them to complete
#
While (Get-Job -State "Running") { Start-Sleep 2 }
# Collect the output of the final batch of jobs before cleaning up
Get-Job -State "Completed" | Receive-Job
Remove-Job -State "Completed"
Get-Job | Write-Host
# Report the total elapsed time for the run
$tend = Get-Date
New-TimeSpan -Start $tstart -End $tend
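When the run completes, a quick way to spot problems is to scan the per-job logs for lines Robocopy flagged as errors. This is a sketch, assuming the $log directory used above:

# Scan every robocopy log for error lines and report the file and matching text
Get-ChildItem "c:\users\administrator\Logs\" -Filter *.log |
    Select-String -Pattern "ERROR" |
    Select-Object Filename, Line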