I recently had the requirement to fetch out all of the hyperlinks from a Large excel worksheet to prepare another document and its repetitive monthly task. While I could have gone the manual approach by looking through all of the cells in the worksheet and fetching/copying all of the hyperlinks through manually, that would have been a long grueling thing to perform. So we thought to create an Automated script to perform this action instead of Manual work,
For this scenario, we have the PowerShell COM object creation to hook into Excel. Application COM object and then open up an existing Excel document and locate all of the hyperlinks which are in the spreadsheet. Automation will always be better than manual if allowed to use it. Instead of giving a deep conversation, let us start implementation quickly.
STEP #1: Create Excel.Application COM Object
The first thing that you need to do is to create Excel.Application COM object.
$excel = New-Object -ComObject excel.application
$excel.visible = $False
STEP #2: Open the excel using the Open method
Now you can open up the existing Excel document using the Open method under the Workbooks property on the Excel object. Note that you need to supply the full path of the document else it will throw the error. Here I take a sample sheet with a dummy value (shown in the below snapshot)
$workbook = $excel.Workbooks.Open(‘C:\dotnet-helpers\MyExcel.xlsx’)
STEP #3: Fetch the Hyperlinks using a worksheet object which has a property called Hyperlinks
In this article, based on my requirement here I am using many worksheets so I will Loop the WorkSheet to validate the sheet one by one. If you have only a single worksheet, you can use the WorkSheet property and specify ‘1’ which says to use the first worksheet in the workbook. If you wanted the second worksheet, you would have used a ‘2’ as an increment way.
The worksheet object has a property called Hyperlinks which lists all of the cells which have hyperlinks. As per the below code, not only do you get the hyperlinks, but also the display text, cell Ranges, etc.,
#Looping the Sheets to check all the sheets in the Excel
for ($loop_index = 1; $loop_index -le $workbook.Sheets.Count; $loop_index++)
{
#Assigining the hyperlinks to the Variable from each sheet
$Sheet = $workbook.Worksheets($loop_index).Hyperlinks
#looping the Sheet one by one and store it in collection
$Sheet | ForEach-Object
{
$Range = $_.Range
[pscustomobject]@{
SheetName = $workbook.Worksheets($loop_index).name
LinkText = $_.TextToDisplay
Url = $_.Address
Cell = $Range.Row , $Range.Column
}
}
}
STEP #3: Finally kill the Excel process
The below cmdlet will kill all the open excel in the specific system. In my server, we will not open any excel so I am confident to use this in my automation. So you ensure before using this below line of code.
Stop-Process -name excel
Final Code
#################################################### #Project : Fetch all the Hyperlinks from the excel #Developer : Thiyagu S (dotnet-helpers.com) #Tools : PowerShell 5.1.15063.1155 #E-Mail : mail2thiyaguji@gmail.com #################################################### #Load Excel $excel = New-Object -ComObject excel.application $excel.visible = $False $workbook = $excel.Workbooks.Open('C:\dotnet-helpers\MyExcel.xlsx') #Looping the Workbook sheet to check all the sheets in the Excel for ($loop_index = 1; $loop_index -le $workbook.Sheets.Count; $loop_index++) { #Assigining the hyperlinks to the Variable from each sheet $Sheet = $workbook.Worksheets($loop_index).Hyperlinks #looping the Sheet one by one and store it in collection $Sheet | ForEach-Object { $Range = $_.Range [pscustomobject]@{ SheetName = $workbook.Worksheets($loop_index).name LinkText = $_.TextToDisplay Url = $_.Address Cell = $Range.Row , $Range.Column } } } #Kill the excel Stop-Process -name excel