Web Scraping in Excel VBA Using Internet Explorer: Complete Guide

Learn web scraping in Excel using VBA and Internet Explorer, including how it works, basic code examples, and practical automation concepts.

Internet Explorer Automation (Old Method)

Kaise kaam karta hai: VBA Internet Explorer ko open karke webpage load karta hai.

1) Opening or closing browser

Object library required:-
(Tools → References → Microsoft Internet Controls)

InternetExplorer → class from Microsoft Internet Controls

Excel VBA Code
 
Sub Test()

    Dim ie As InternetExplorer

    Set ie = New InternetExplorer
    ie.Visible = True
    ie.Navigate "https://www.google.com/"   ' navigate any url

    ' ie.Quit   ' internet explorer will be close

End Sub

You should know that the following two lines of code do the same job and are commonly used by VBA developers:

Set ie = New InternetExplorer
Set ie = CreateObject("InternetExplorer.Application")

Early Binding
Dim ie As InternetExplorer
Set ie = New InternetExplorer

You are creating an Internet Explorer object using Early Binding
VBA already knows what InternetExplorer is
You MUST enable: Microsoft Internet Controls (Reference)

Late Binding
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")

You are creating IE using Late Binding
VBA does NOT know object type in advance
No reference needed

2) Control Existing/already opened browser (Internet explorer)

Excel VBA Code
 
Sub Test()

    Dim sh As SHDocVw.ShellWindows
    Dim ie As SHDocVw.InternetExplorer

    Set sh = New SHDocVw.ShellWindows

    For Each ie In sh

        ' If ie.Name = "Internet Explorer" Then
            Debug.Print ie.Name   ' You can replace ie.Name with ie.LocationURL
        ' End If

    Next ie

End Sub

2

Excel VBA Code
 
Sub test()

Dim sh As SHDocVw.ShellWindows
Dim ie As SHDocVw.InternetExplorer

Set sh = New SHDocVw.ShellWindows

For Each ie In sh
    If ie.Name = "internet Explorer" Then
        If InStr(ie.LocationURL, "Youtube") > 0 Then
            Debug.Print ie.LocationURL
        Else
            Debug.Print "Not Found"
        End If
        
    End If
Next


End Sub

3

Excel VBA Code
 
Sub test()

‘ Run for 5 second and Quit internet explorer
Dim ie As InternetExplorer
Set ie = New InternetExplorer

ie.Visible = True
ie.Navigate "https://www.google.com"


Application.Wait (Now + TimeValue("00:00:05"))


ie.Quit

End Sub

Method 1

Application.Wait (Now + TimeValue("00:00:05")) 

Loading of this url can take some time. This is hardcore method to wait. It is not good idea method to wait. We should use browser ready state or busy state.

Method 2

Do Until ie.ReadyState = READYSTATE_COMPLETE And ie.Busy = False
    DoEvents
Loop

Method 3

Do While ie.Busy Or ie.ReadyState <> 4
    DoEvents
    If Timer - t > 15 Then Exit Do   ' 15 seconds timeout
Loop


3) Object identification techniques

Object library required:-
(Tools → References → Microsoft HTML object library)

Dim Doc As HTMLDocument
Set Doc = ie.Document
Doc.getElementById("searchInput").Value = "Hello"

Excel VBA Code
 
Sub test()

Dim ie As InternetExplorer
Set ie = New InternetExplorer

ie.Visible = True
ie.Navigate "https://www.wikipedia.org/"


Application.Wait (Now + TimeValue("00:00:05"))  ' This will wait 5 sec to load url


Dim Doc As HTMLDocument
Set Doc = ie.Document

'Doc.getElementById("searchInput").Value = "Hello"
Doc.getElementsByName("search")(0).Value = "Hello"

Application.Wait (Now + TimeValue("00:00:05"))
ie.Quit

End Sub


For input box

Doc.getElementById("searchInput").Value = "Hello"
Doc.getElementsByName("search")(0).Value = "Hello"

 For combobox

Doc.getElementById("idname").selectedIndex = 3

 For Radio Button & Checkbox

Doc.getElementById("idname").click

For more than one element having same tagname eg input, div, br etc
We create new dim as object

Dim elems As Object
Dim elem As Object

4) Web Scraping


Excel VBA Code
 
Sub Web_Scraping()

Dim ie As SHDocVw.InternetExplorer

Set ie = New InternetExplorer
ie.Visible = True
ie.Navigate "https://en.wikipedia.org/"

Do Until ie.ReadyState = READYSTATE_COMPLETE And ie.Busy = False
    DoEvents
Loop

Dim Doc As MSHTML.HTMLDocument
Set Doc = ie.Document

' searchinput box ko select karna or usme jo search karna hai usko fill karna
Dim HTMLinput As MSHTML.IHTMLElement
Set HTMLinput = Doc.getElementById("searchInput")
HTMLinput.Select
HTMLinput.Value = "List of state and union territory capitals in India"

' Search box me fill karane ke baad searh button par click karana hai
Dim HTMLButtons As MSHTML.IHTMLElementCollection
Set HTMLButtons = Doc.getElementsByTagName("button")

Dim HTMLButton As MSHTML.IHTMLElement
    For Each HTMLButton In HTMLButtons
        'Debug.Print "Class: " & HTMLButton.className, "TagName: " & HTMLButton.tagName, "ID: " & HTMLButton.ID, "InnerText: " & HTMLButton.innerText
        
        If HTMLButton.className = "cdx-button cdx-search-input__end-button" Then
            HTMLButton.Click
            Exit For
        End If
    Next HTMLButton


' Ab search ke baad jo page khulega usme ek table hai uska data ko extract karna hai.
' -------- Wait for Search Result Page ----------
    Do Until ie.ReadyState = READYSTATE_COMPLETE And ie.Busy = False
        DoEvents
    Loop

    ' IMPORTANT: Reset document
    Set Doc = ie.Document

    ' -------- Extract Table ----------

Dim HTMLTables As MSHTML.IHTMLElementCollection
Set HTMLTables = Doc.getElementsByTagName("table")

Dim HTMLTable As MSHTML.IHTMLElement
    For Each HTMLTable In HTMLTables
        'Debug.Print "Class: " & HTMLTable.className, "TagName: " & HTMLTable.tagName, "ID: " & HTMLTable.ID
          
        Dim HTMLRaw As MSHTML.IHTMLElement
        
        For Each HTMLRaw In HTMLTable.getElementsByTagName("tr")
            Debug.Print HTMLRaw.innerText
        Next HTMLRaw
    
    
    Next HTMLTable



Application.Wait (Now + TimeValue("0:00:05"))
ie.Quit

End Sub


End

Post a Comment