Anyone who manages file storage has to keep track of file sizes to ensure there is always enough free space. Documents, photos, backups, and other files can quickly fill up shared storage, especially if there are many duplicates. Duplicate files are often the result of user mistakes, such as copying the same files twice or moving folders to the wrong location. To avoid wasting space and driving up storage costs, you need to analyze your file structure and then find and delete duplicate files, which you can do using PowerShell. Here, "duplicate files" means files that have the same content but different names.
Left unchecked, duplicates eat up disk space until we run out and have to hunt manually for unnecessary files to free up storage.
One of the biggest challenges during such a clean-up is getting rid of duplicate files. A simple Windows PowerShell script can help you complete this tedious task faster. There are many approaches to handling this scenario; we will discuss a few examples here.
Find Duplicate Files Using Get-FileHash
Do you need to compare two files or make sure a file has not changed? The PowerShell cmdlet Get-FileHash computes a hash value for a file or a stream of data. A hash function converts input of any size into a short, fixed-length value that acts as a fingerprint (checksum) of the content. By default, Get-FileHash uses the SHA256 algorithm. Two files with identical content always produce the same hash, while in practice the hash will be different if even a single character in the input is changed.
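To make this concrete, here is a minimal sketch that hashes three small files. The scratch folder and the file names (hash-demo, a.txt, b.txt, c.txt) are made up for this illustration and are not part of the demo that follows.

```powershell
# Hypothetical scratch folder and file names, used only for this illustration.
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) 'hash-demo'
New-Item -ItemType Directory -Path $demoDir -Force | Out-Null

Set-Content -Path (Join-Path $demoDir 'a.txt') -Value 'Hello World'
Set-Content -Path (Join-Path $demoDir 'b.txt') -Value 'Hello World'    # same content, different name
Set-Content -Path (Join-Path $demoDir 'c.txt') -Value 'Hello World!'   # one extra character

# Get-FileHash uses SHA256 unless -Algorithm says otherwise.
$hashA = (Get-FileHash -Path (Join-Path $demoDir 'a.txt')).Hash
$hashB = (Get-FileHash -Path (Join-Path $demoDir 'b.txt')).Hash
$hashC = (Get-FileHash -Path (Join-Path $demoDir 'c.txt')).Hash

$sameContentMatches    = ($hashA -eq $hashB)   # identical content: hashes match
$changedContentMatches = ($hashA -eq $hashC)   # content differs: hashes differ

"a.txt vs b.txt: $sameContentMatches"
"a.txt vs c.txt: $changedContentMatches"
```

The file name plays no part in the hash, which is why files with the same content but different names can be detected this way.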
In this demo, I have four text files: three have the same content under different file names, and the remaining one has unique content, as shown in the image below.
STEP 1: Open the PowerShell window
Open PowerShell: Click on the Start Menu and type “PowerShell” in the search bar. Then, select “Windows PowerShell” from the results.
STEP 2: Set the directory where you want to search for duplicate files:
$filePath = 'C:\Thiyagu Disk\backupFiles\'
STEP 3: Get all the child items inside the file path to check for duplicates.
Use the Get-ChildItem cmdlet to list all files in the directory. The -Recurse option tells PowerShell to search all subdirectories as well, and the -File option limits the results to files, so that folders are not passed on to the hashing step.
Get-ChildItem -Path $filePath -Recurse -File
STEP 4: Find duplicate files using the Get-FileHash cmdlet.
Get-FileHash computes a hash value for each file. Grouping the results by hash value and keeping only the groups with more than one member leaves just the duplicate files, as shown below.
Get-ChildItem -Path $filePath -Recurse -File | Get-FileHash | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 } | ForEach-Object { $_.Group | Select-Object Path, Hash }
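Full Code: Find the duplicate files
# Folder to scan for duplicates
$filePath = 'C:\backupFiles\'
# Hash every file, group by hash, and keep only hashes shared by more than one file
$group_by_unique_files = Get-ChildItem -Path $filePath -Recurse -File | Get-FileHash | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 }
# Flatten the groups into a list of file paths with their hashes
$duplicatefile_details = $group_by_unique_files | ForEach-Object { $_.Group | Select-Object Path, Hash }
$duplicatefile_details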
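Hashing reads every byte of every file, which can be slow on large folders. Since two files of different sizes cannot have the same content, one possible variation (not part of the script above) is to group by file size first and hash only the files that share a size with at least one other file. The function name Get-DuplicateFileGroup is hypothetical, chosen for this sketch.

```powershell
# Hypothetical helper: returns one group per set of files with identical content.
function Get-DuplicateFileGroup {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    Get-ChildItem -Path $Path -Recurse -File |
        Group-Object -Property Length |        # files of different sizes cannot be duplicates
        Where-Object { $_.Count -gt 1 } |      # keep only sizes shared by several files
        ForEach-Object { $_.Group } |          # unwrap back to the file objects
        Get-FileHash |                         # hash only the remaining candidates
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 }        # hashes shared by more than one file
}

# Example call, using the same folder as the script above:
# Get-DuplicateFileGroup -Path 'C:\backupFiles\' | ForEach-Object { $_.Group | Select-Object Path, Hash }
```

The result has the same shape as the groups produced by the full code above, so the rest of the pipeline can stay unchanged.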
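Full Code: Find and delete duplicate files using PowerShell
# Folder to scan for duplicates
$filePath = 'C:\backupFiles\'
# Hash every file, group by hash, and keep only hashes shared by more than one file
$group_by_files = Get-ChildItem -Path $filePath -Recurse -File | Get-FileHash | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 }
$group_by_files
# Flatten the groups into a list of file paths with their hashes
$duplicatefile_details = $group_by_files | ForEach-Object { $_.Group | Select-Object Path, Hash }
# Show the list in a grid; only the rows selected before clicking OK are deleted
$duplicatefile_details | Out-GridView -OutputMode Multiple | Remove-Item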
After finding the duplicate files, you can move or delete them based on your requirements. If you want to delete them through a UI, pipe the results to Out-GridView -OutputMode Multiple, as in the last line of the code above. Select the files to be deleted in the grid (to select multiple files, press and hold CTRL) and click OK. Only the selected rows are passed on to Remove-Item.
Note: Please be careful while using the Remove-Item cmdlet as it can permanently delete files from your computer. It’s recommended to test this command on a test folder before using it on your actual data.
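One way to follow this advice is a dry run with the -WhatIf switch, which makes Remove-Item print what it would delete without deleting anything. The sketch below keeps the first file of each duplicate group (sorted by path) and lists the others as removal candidates. The function name Get-RedundantCopy and the "keep the first path" rule are assumptions made for this example, not part of the scripts above.

```powershell
# Hypothetical helper: returns the paths of all but one file in each duplicate group.
function Get-RedundantCopy {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    Get-ChildItem -Path $Path -Recurse -File |
        Get-FileHash |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object {
            # Keep the first path in sort order; emit the remaining paths as strings.
            $_.Group |
                Sort-Object -Property Path |
                Select-Object -Skip 1 -ExpandProperty Path
        }
}

# Dry run: nothing is deleted while -WhatIf is present.
# Get-RedundantCopy -Path 'C:\backupFiles\' | ForEach-Object { Remove-Item -LiteralPath $_ -WhatIf }
```

Once the dry-run output looks right, removing -WhatIf performs the actual deletion. -LiteralPath is used so that characters such as square brackets in file names are not treated as wildcards.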