When is this required (Real-time Scenario)?

In an earlier post on this blog, we discussed zipping and unzipping files with PowerShell. In this post, we look at how to extract only specific files from a zip. I have several zip files that contain multiple file types, and my team was asked to extract only specific files (for example .txt, .wav or .xls files, or files matching a name filter) from those zips every week and to ignore all other file types. Checking every zip file by hand each week would be time-consuming, so we decided to automate the task.

Sample Folder structure used for this example:

How to achieve?

In this example, we use the .NET classes in the System.IO.Compression namespace. To pull specific files out of a zip, we pass each matching 'entry' object from the opened zip archive to the ExtractToFile() method. Below are the 4 main steps for extracting specific files from the zip.

  • Fetch the ZIP file and open it for reading.
  • Identify all files inside the ZIP file.
  • Fetch the files based on the given extension/any matched filters & copy them to the destination folder.
  • Finally, close the zip.

STEP #1: Fetch the ZIP file and open it for reading.

First, we open the zip archive at the specified path for reading, using the .NET ZipFile.OpenRead(String) method (namespace: System.IO.Compression). In Windows PowerShell, the System.IO.Compression.FileSystem assembly has to be loaded with Add-Type before the ZipFile class can be used.

Add-Type -Assembly System.IO.Compression.FileSystem
$zipFile = [IO.Compression.ZipFile]::OpenRead($sourceFilePath)
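
One detail worth a short sketch: .NET methods such as OpenRead resolve relative paths against the process working directory, which is not always the same as the current PowerShell location. The helper below (Open-ZipForRead is a hypothetical name, not part of the original script) converts the path to a full path first and then opens the archive read-only.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: resolves a (possibly relative) path against the current
# PowerShell location, then opens the archive read-only.
function Open-ZipForRead {
    param([Parameter(Mandatory)] [string] $Path)

    # Convert-Path returns the full file system path; with -ErrorAction Stop
    # it throws a terminating error if the file does not exist.
    $fullPath = Convert-Path -LiteralPath $Path -ErrorAction Stop

    [System.IO.Compression.ZipFile]::OpenRead($fullPath)
}
```

It can be used in place of the direct call, for example $zipFile = Open-ZipForRead -Path $sourceFilePath.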

STEP #2: Identify all files inside the ZIP file

Next, we get all the entries in the archive and filter them. In this example, we fetch all the .txt files from the zip.

$zipFile.Entries | Where-Object { $_.Name -like '*.txt' }
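
The scenario at the top mentions several file types (.txt, .wav, .xls), so here is a minimal sketch of filtering on a list of extensions instead of a single wildcard. Select-ZipEntry and its -Extension parameter are hypothetical names introduced for illustration.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the file entries of an already opened archive
# whose extension is in the given list. The -in operator is case-insensitive,
# so '.WAV' matches '.wav'.
function Select-ZipEntry {
    param(
        [Parameter(Mandatory)] $Archive,                    # object returned by ZipFile.OpenRead
        [string[]] $Extension = @('.txt', '.wav', '.xls')   # assumed default list
    )

    $Archive.Entries | Where-Object {
        # Folder entries have an empty Name, so they are skipped here.
        $_.Name -and ([System.IO.Path]::GetExtension($_.Name) -in $Extension)
    }
}
```

For example, Select-ZipEntry -Archive $zipFile | Select-Object FullName, Length lists the matching entries without extracting anything.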

STEP #3: Fetch all the files based on the filters & copy them to the destination folder.

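Now you can extract the selected items from the ZIP archive and copy them to the output folder by piping the filtered entries from Step 2 into ForEach-Object. The output folder ($destPath) must already exist, because ExtractToFile() does not create it.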

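$zipFile.Entries | Where-Object { $_.Name -like '*.txt' } | ForEach-Object {
    # Build the destination path from the entry's file name
    $FileName = $_.Name
    # $true overwrites a file that already exists at the destination
    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, "$destPath\$FileName", $true)
}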
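
Because the loop above names each output file after $_.Name only, two entries with the same file name in different folders of the zip would overwrite each other. The sketch below is one possible way around that, not the original script's approach: it keeps the entry's folder structure (FullName), creates missing folders, and refuses entries whose path would resolve outside the destination. Export-ZipEntry is a hypothetical name.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: extracts one entry below $DestPath, preserving the
# folder part of the entry so same-named files do not clash.
function Export-ZipEntry {
    param(
        [Parameter(Mandatory)] $Entry,             # a ZipArchiveEntry from $zipFile.Entries
        [Parameter(Mandatory)] [string] $DestPath  # output root; should be a full path
    )

    $sep    = [System.IO.Path]::DirectorySeparatorChar
    $root   = [System.IO.Path]::GetFullPath($DestPath).TrimEnd($sep)
    $target = [System.IO.Path]::GetFullPath([System.IO.Path]::Combine($root, $Entry.FullName))

    # Reject entries such as '../evil.txt' that would land outside the root.
    # The comparison ignores case, which suits Windows file systems.
    if (-not $target.StartsWith($root + $sep, [System.StringComparison]::OrdinalIgnoreCase)) {
        throw "Entry '$($Entry.FullName)' resolves outside '$root'."
    }

    # ExtractToFile does not create missing folders, so create them first.
    $folder = Split-Path -Path $target -Parent
    if (-not (Test-Path -LiteralPath $folder)) {
        New-Item -ItemType Directory -Path $folder -Force | Out-Null
    }

    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($Entry, $target, $true)
    $target   # return the path that was written
}
```

Inside the pipeline, the body of ForEach-Object would then simply be Export-ZipEntry -Entry $_ -DestPath $destPath.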

Note: To extract just a single file from a .zip file, pass that one 'entry' object (for example $zipFile.Entries[0], the first entry in the archive) to the ExtractToFile() method and supply the destination file name yourself.

[System.IO.Compression.ZipFileExtensions]::ExtractToFile($zipFile.Entries[0], "$destPath\ExtractMe1.txt", $true)

To extract multiple files, the destination path has to be built separately for each entry.

$zipFile.Entries | Where-Object Name -like '*.txt' | ForEach-Object { [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, "$destPath\$($_.Name)", $true) }

STEP #4: Close/Dispose the Zip

$zipFile.Dispose()
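
Until Dispose() is called, the zip file stays open, so it is worth making sure the call runs even when an extraction step throws. A minimal sketch using try/finally follows; Use-ZipArchive and its parameters are hypothetical names.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical wrapper: opens the archive, passes it to a script block, and
# always disposes it afterwards, even if the script block throws.
function Use-ZipArchive {
    param(
        [Parameter(Mandatory)] [string] $Path,         # full path to the .zip file
        [Parameter(Mandatory)] [scriptblock] $Action   # receives the open archive as its first argument
    )

    $zipFile = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        & $Action $zipFile
    }
    finally {
        $zipFile.Dispose()
    }
}
```

For example, Use-ZipArchive -Path $sourceFilePath -Action { param($zip) $zip.Entries.Count } returns the number of entries and leaves the file closed afterwards.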

Final Code:
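
Putting the four steps together, one possible version of the script looks like the sketch below. The function name Expand-ZipByFilter and its parameters are hypothetical, and the loop over every .zip in a source folder is an assumption based on the weekly scenario described at the top.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical function: extracts entries matching $Filter from every .zip
# file in $SourceFolder into $DestPath and returns the written paths.
function Expand-ZipByFilter {
    param(
        [Parameter(Mandatory)] [string] $SourceFolder,   # folder holding the .zip files
        [Parameter(Mandatory)] [string] $DestPath,       # output folder
        [string] $Filter = '*.txt'                       # wildcard matched against entry names
    )

    # ExtractToFile does not create the output folder, so make sure it exists.
    if (-not (Test-Path -LiteralPath $DestPath)) {
        New-Item -ItemType Directory -Path $DestPath -Force | Out-Null
    }
    $DestPath = Convert-Path -LiteralPath $DestPath

    foreach ($zip in Get-ChildItem -LiteralPath $SourceFolder -Filter '*.zip' -File) {
        # STEP 1: open the archive for reading
        $zipFile = [System.IO.Compression.ZipFile]::OpenRead($zip.FullName)
        try {
            # STEP 2 and 3: filter the entries and extract each match
            $zipFile.Entries |
                Where-Object { $_.Name -like $Filter } |
                ForEach-Object {
                    $target = Join-Path $DestPath $_.Name
                    [System.IO.Compression.ZipFileExtensions]::ExtractToFile($_, $target, $true)
                    $target
                }
        }
        finally {
            # STEP 4: close the archive
            $zipFile.Dispose()
        }
    }
}
```

A weekly run could then be a single call such as Expand-ZipByFilter -SourceFolder $sourceFolder -DestPath $destPath -Filter '*.txt', where both variables are placeholders for your own folders.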

Output