When is this required (Real-time Scenario)?
Our blog has already covered zipping and unzipping files using PowerShell. In this post, we discuss how to extract only specific files from a zip archive. I have several zip files that contain multiple file types, and my team had a requirement to extract only specific files (for example .txt/.wav/.xls, or files matching a name filter) from the zips on a weekly basis, ignoring all other file types. Checking every zip file manually each week would be time-consuming, so we decided to automate it.
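Sample folder structure used for this example: the source archive is C:\dotnet-helpers\Source\ExtractMe.zip and the extracted files are written to C:\dotnet-helpers\Destination\.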
How to achieve?
In this example, we use the .NET classes in the System.IO.Compression namespace. To pull specific files out of a zip, we take the "entry" object from the opened .zip archive and pass it to the ExtractToFile() method. The four main steps are:
- Fetch the ZIP file and open it for reading.
- Identify all files inside the ZIP file.
- Fetch the files based on the given extension/any matched filters & copy them to the destination folder.
- Finally, close the zip.
STEP #1: Fetch the ZIP file and open it for reading.
First, we open the zip archive for reading at the specified path using the .NET ZipFile.OpenRead(String) method (namespace: System.IO.Compression). The archive must be opened before its entries can be inspected.
Add-Type -Assembly System.IO.Compression.FileSystem
$zipFile = [IO.Compression.ZipFile]::OpenRead($sourcePath)
STEP #2: Identify all files inside the ZIP file
Next, we need to get all the entries and start identifying/filtering the files. In this example, we are going to fetch all the txt files from the zip.
$zipFile.Entries | Where-Object { $_.Name -like '*.txt' }
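The scenario described earlier mentions several file types (.txt/.wav/.xls), not just one. As a minimal sketch of filtering on more than one extension, the helper below lists the matching entry paths. The function name `Get-ZipEntryName` and its parameters are hypothetical and not part of the original script.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: lists the full names of entries whose extension
# is in $Extension. -contains compares strings case-insensitively, and
# folder entries (empty Name) never match.
function Get-ZipEntryName {
    param(
        [Parameter(Mandatory)] [string]   $ZipPath,     # pass a full path
        [Parameter(Mandatory)] [string[]] $Extension    # e.g. '.txt', '.wav'
    )
    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        $zip.Entries |
            Where-Object { $Extension -contains [System.IO.Path]::GetExtension($_.Name) } |
            ForEach-Object { $_.FullName }
    }
    finally {
        # Release the file handle even if the filter throws.
        $zip.Dispose()
    }
}
```

It could be called like this: Get-ZipEntryName -ZipPath 'C:\dotnet-helpers\Source\ExtractMe.zip' -Extension '.txt', '.wav', '.xls'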
STEP #3: Fetch all the files based on the filters & copy them to the destination folder.
Now you can extract the selected items from the ZIP archive and copy them to the output folder.
$zipFile.Entries | Where-Object { $_.Name -like '*.txt' } | ForEach-Object {
    $FileName = $_.Name
    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, "$destPath\$FileName", $true)
}
Note: Extracting only a single file from a .zip file is a little trickier. We need to pick the "entry" object (for example $zipFile.Entries[0]) from the opened archive and pass it to the ExtractToFile() method.
[System.IO.Compression.ZipFileExtensions]::ExtractToFile($zipFile.Entries[0], "$destPath\ExtractMe1.txt", $true)
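Picking an entry by position (Entries[0]) depends on the order in which files were added to the archive. As an alternative sketch, the ZipArchive.GetEntry() method looks an entry up by its path inside the archive and returns $null when nothing matches. The function name `Export-ZipEntry` and its parameters are hypothetical.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: extracts one entry, identified by its path inside
# the archive (for example 'ExtractMe1.txt' or 'docs/readme.txt').
function Export-ZipEntry {
    param(
        [Parameter(Mandatory)] [string] $ZipPath,          # full path to the .zip
        [Parameter(Mandatory)] [string] $EntryName,        # path inside the archive
        [Parameter(Mandatory)] [string] $DestinationFile   # full path of the output file
    )
    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        $entry = $zip.GetEntry($EntryName)   # $null if there is no such entry
        if ($null -eq $entry) {
            throw "Entry '$EntryName' not found in '$ZipPath'."
        }
        # $true = overwrite the destination file if it already exists
        [System.IO.Compression.ZipFileExtensions]::ExtractToFile($entry, $DestinationFile, $true)
    }
    finally {
        $zip.Dispose()
    }
}
```

Failing with a clear message when the entry is missing is a design choice of this sketch; returning silently would also be reasonable for a weekly job that tolerates absent files.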
To extract multiple files, the destination path must be built separately for each entry.
$zipFile.Entries | Where-Object Name -like '*.txt' | ForEach-Object { [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, "$destPath\$($_.Name)", $true) }
STEP #4: Close/Dispose the Zip
$zipFile.Dispose()
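If an error stops the script partway (for example when $ErrorActionPreference is set to 'Stop'), the Dispose() call may never run and the zip file can stay locked. A common way to guard against this is try/finally. The sketch below wraps that pattern in a small helper; the name `Use-ZipArchive` and its parameters are hypothetical.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: opens the archive, hands it to the caller's script
# block, and always disposes it afterwards, even if the block throws.
function Use-ZipArchive {
    param(
        [Parameter(Mandatory)] [string]      $ZipPath,   # full path to the .zip
        [Parameter(Mandatory)] [scriptblock] $Action     # receives the archive as its first argument
    )
    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        & $Action $zip
    }
    finally {
        $zip.Dispose()
    }
}
```

A call might look like: Use-ZipArchive -ZipPath $sourcePath -Action { param($zip) $zip.Entries | Where-Object { $_.Name -like '*.txt' } | ForEach-Object { $_.FullName } }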
Final Code:
############################################################################################
#Project: How to Extract Specific Files from ZIP Archive in PowerShell
#Developer: Thiyagu S (dotnet-helpers.com)
#Tools : PowerShell 5.1.15063.1155
#E-Mail: mail2thiyaguji@gmail.com
############################################################################################

$destPath = "C:\dotnet-helpers\Destination\"
$sourcePath = 'C:\dotnet-helpers\Source\ExtractMe.zip'

# load ZIP methods
Add-Type -Assembly System.IO.Compression.FileSystem

# open ZIP archive for reading
$zipFile = [IO.Compression.ZipFile]::OpenRead($sourcePath)

#Find all files in ZIP that match the filter (i.e. file extension)
#Use the Entries property to retrieve the entire collection of entries
$zipFile.Entries | where {$_.Name -like '*.txt'} | foreach {
    $FileName = $_.Name
    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, "$destPath\$FileName", $true)
}

# close ZIP file
$zipFile.Dispose()
Good morning, and thanks for this article. How can I extend this script to process more than one file in the source folder? I mean: if the source folder contains ExtractMe1.zip, ExtractMe2.zip, ExtractMeN.zip, etc., how can I process them all, one after another? Thanks in advance.
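One possible way to handle several archives, offered only as a sketch: loop over every .zip file in the source folder with Get-ChildItem and apply the same filter-and-extract steps to each. The function name `Expand-ZipFolderByFilter`, its parameters, and the choice to prefix output files with the archive name are all assumptions made for illustration.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: extracts entries matching $Filter from every .zip
# file directly inside $SourceFolder into $Destination (use full paths).
function Expand-ZipFolderByFilter {
    param(
        [Parameter(Mandatory)] [string] $SourceFolder,
        [Parameter(Mandatory)] [string] $Destination,
        [string] $Filter = '*.txt'
    )
    # ExtractToFile does not create missing folders, so create it up front.
    $null = New-Item -ItemType Directory -Path $Destination -Force

    foreach ($zipItem in Get-ChildItem -Path $SourceFolder -Filter '*.zip' -File) {
        $zip = [System.IO.Compression.ZipFile]::OpenRead($zipItem.FullName)
        try {
            foreach ($entry in $zip.Entries) {
                if ($entry.Name -like $Filter) {
                    # Prefix with the archive name so equally named files
                    # from different archives do not overwrite each other.
                    $target = Join-Path $Destination ($zipItem.BaseName + '_' + $entry.Name)
                    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($entry, $target, $true)
                }
            }
        }
        finally {
            # Close each archive before moving on to the next one.
            $zip.Dispose()
        }
    }
}
```

With the paths from the post, a call could be: Expand-ZipFolderByFilter -SourceFolder 'C:\dotnet-helpers\Source' -Destination 'C:\dotnet-helpers\Destination' -Filter '*.txt'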