I recently had the requirement to fetch out all of the hyperlinks from a Large excel worksheet to prepare another document and its repetitive monthly task. While I could have gone the manual approach by looking through all of the cells in the worksheet and fetching/copying all of the hyperlinks through manually, that would have been a long grueling thing to perform. So we thought to create an Automated script to perform this action instead of Manual work,
For this scenario, we have the PowerShell COM object creation to hook into Excel. Application COM object and then open up an existing Excel document and locate all of the hyperlinks which are in the spreadsheet. Automation will always be better than manual if allowed to use it. Instead of giving a deep conversation, let us start implementation quickly.
STEP #1: Create Excel.Application COM Object
The first thing that you need to do is to create Excel.Application COM object.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
STEP #2: Open the excel using the Open method
Now you can open up the existing Excel document using the Open method under the Workbooks property on the Excel object. Note that you need to supply the full path of the document else it will throw the error. Here I take a sample sheet with a dummy value (shown in the below snapshot)
$workbook = $excel.Workbooks.Open(‘C:\dotnet-helpers\MyExcel.xlsx’)
STEP #3: Fetch the Hyperlinks using a worksheet object which has a property called Hyperlinks
In this article, based on my requirement here I am using many worksheets so I will Loop the WorkSheet to validate the sheet one by one. If you have only a single worksheet, you can use the WorkSheet property and specify โ1โ which says to use the first worksheet in the workbook. If you wanted the second worksheet, you would have used a โ2โ as an increment way.
The worksheet object has a property called Hyperlinks which lists all of the cells which have hyperlinks. As per the below code, not only do you get the hyperlinks, but also the display text, cell Ranges, etc.,
#Looping the Sheets to check all the sheets in the Excel
for ($loop_index = 1; $loop_index -le $workbook.Sheets.Count; $loop_index++)
{
ย ย ย ย ย #Assigining the hyperlinks to the Variable from each sheet
ย ย ย ย ย $Sheet = $workbook.Worksheets($loop_index).Hyperlinks
ย ย ย ย #looping the Sheet one by one and store it in collection
ย ย ย ย $Sheet | ForEach-Object
ย ย ย ย ย ย ย ย ย {
ย ย ย ย ย ย ย ย ย $Range = $_.Range
ย ย ย ย ย ย ย ย ย [pscustomobject]@{
ย ย ย ย ย ย ย ย ย SheetName = $workbook.Worksheets($loop_index).name
ย ย ย ย ย ย ย ย LinkText = $_.TextToDisplay
ย ย ย ย ย ย ย ย Url = $_.Address
ย ย ย ย ย ย ย ย Cell = $Range.Row , $Range.Column
ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย ย }
ย ย ย ย ย ย ย }
}
STEP #3: Finally kill the Excel process
The below cmdlet will kill all the open excel in the specific system. In my server, we will not open any excel so I am confident to use this in my automation. So you ensure before using this below line of code.
Stop-Process -name excel
Final Code
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
#################################################### #Project : Fetch all the Hyperlinks from the excel #Developer : Thiyagu S (dotnet-helpers.com) #Tools : PowerShell 5.1.15063.1155 #E-Mail : mail2thiyaguji@gmail.com #################################################### #Load Excel $excel = New-Object -ComObject excel.application $excel.visible = $False $workbook = $excel.Workbooks.Open('C:\dotnet-helpers\MyExcel.xlsx') #Looping the Workbook sheet to check all the sheets in the Excel for ($loop_index = 1; $loop_index -le $workbook.Sheets.Count; $loop_index++) { #Assigining the hyperlinks to the Variable from each sheet $Sheet = $workbook.Worksheets($loop_index).Hyperlinks #looping the Sheet one by one and store it in collection $Sheet | ForEach-Object { $Range = $_.Range [pscustomobject]@{ SheetName = $workbook.Worksheets($loop_index).name LinkText = $_.TextToDisplay Url = $_.Address Cell = $Range.Row , $Range.Column } } } #Kill the excel Stop-Process -name excel |
Hi Venkat,
Yes, you can do it with below few lines.
In the below script, test1 is the file name that needs to be close.
$excel = [System.Runtime.InteropServices.Marshal]::GetActiveObject(“Excel.Application”)
$excel.DisplayAlerts = $false1
$excel.Workbooks | %{if($_.Name -imatch “test1”){$_.Close()}}
Thank you for nice artical.. Is any other way to closing the excel files using excel file names. My requirements is to close the specific excel files. In my machine other excel files should not be closed.
Hi Eztam Red,
Thanks, I had not tested with build-by formulas.
Yes, it will effectively work for huge excel which contains large data with 15 sheets (during my working scenario I tested this).
Hello and at first , thanks for this nice article.
Will this solution also collect hyperlinks they are build by formulars?
Will it absolutley nesseccary to use the huge and heavy excel com library, will it Not be easyer to use libs like EPLus?
Best Regards
Eztam Red