How to retrieve data from web?

tanilov

Hi all,
I need to write a macro that retrieves data from the web and populates a sheet.

After some searching, I found that a web query could be the answer.
So, from Data->Import External Data->New Query, I browse to the page
with the data I'm interested in, select the table I need and import the
data. It looks so easy... but I have a new problem: to access the page,
I first have to log in on a different page :-(
So... I cannot use the web query (can I?) :'(

Googling, I found some posts with a similar problem, where people suggest
doing something like the following:

Set ie = CreateObject("InternetExplorer.Application")

With ie
    .Visible = True
    ' Go to DDTS page
    .Navigate "http://www.address.com/login"

    ' Loop until the page is fully loaded
    Do Until .ReadyState = 4
        DoEvents
    Loop

    ' Fill in the login fields on the web page and submit the form
    Set ipf = .document.all.Item("username")
    ipf.Value = "user"
    Set ipf = .document.all.Item("password")
    ipf.Value = "pwd"

    ' Not used: ipf is reassigned on the next line
    Set ipf = .document.all.Item(".save")

    Set ipf = .document.all.Item("login_form")
    ipf.Submit

    ' Loop until the page is fully loaded
    Do Until .ReadyState = 4
        DoEvents
    Loop

    ' Go to the page that holds the data
    .Navigate "http://www.address.com/member/page"
    Do Until .ReadyState = 4
        DoEvents
    Loop
End With


With this code I can reach the page I need, but now I don't know
how to copy the data from it.

The page has 2 frames, and I need to copy the data from a table in
one of them.

Could you please tell me what to do next?

Thanks a lot,
tanilo
 
Chris Marlow

Tanilo,

I'd take a slightly different tack.

Look up these libraries:
Microsoft WinHTTP Services
Microsoft HTML Object Library

I've posted some code in the last couple of days that utilises these. I
believe it is possible to maintain security permissions across GET/POST
requests to the server, although I've only done that once, and the login
was posting a simple (unencrypted) header.

Regards,

Chris.
 
tanilov

Chris Marlow wrote:
Look up the libraries;
Microsoft WinHTTP Services
Microsoft HTML Object Library

Hi Chris,
thanks for your hints. This is the code I wrote:

Dim WinHttpReq As WinHttp.WinHttpRequest

HOST = "http://www.address.com"
LOGIN = "/login"
DEST = "/member/page"

' Create an instance of the WinHTTPRequest ActiveX object.
Set WinHttpReq = New WinHttpRequest


' Login to DDTS
PostData = "username=user"
PostData = PostData & "&password=pwd"
' Assemble an HTTP Request.
WinHttpReq.Open "POST", HOST & LOGIN

WinHttpReq.SetRequestHeader "Content-Type", _
    "application/x-www-form-urlencoded"

' Send the HTTP Request.
WinHttpReq.Send PostData

'Start Query
PostData = PostData & "&ACTION=Download+CSV"
PostData = PostData & "&personalQuery=webQuery"

WinHttpReq.Open "GET", HOST & DEST & "?" & PostData

WinHttpReq.Send


' Keep the response body (the query result) in a variable
csv_file = WinHttpReq.ResponseText

With this code, csv_file holds the data I need... but I don't know
how to populate the sheet now :( In all the posts I found, people work
with a CSV file, not with a variable containing the data. In my case,
the query can return a lot of data... could this cause an 'Out of
Memory' error?
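
If the response really is plain CSV text, one way to get it onto a sheet
straight from the variable is sketched below. It is only a sketch under
assumptions that are not in the thread: the helper name CsvTextToSheet is
made up, the output goes to the active sheet starting at A1, and fields are
split naively on commas, so quoted fields that contain commas or line breaks
are not handled.

```vba
' Hypothetical helper: writes CSV text held in a string to the active sheet.
' Naive parsing: quoted fields containing commas or line breaks are NOT handled.
Sub CsvTextToSheet(ByVal csvText As String)
    Dim lines() As String
    Dim fields() As String
    Dim r As Long, c As Long

    ' Normalise line endings to vbLf, then split the text into lines
    csvText = Replace(csvText, vbCrLf, vbLf)
    csvText = Replace(csvText, vbCr, vbLf)
    lines = Split(csvText, vbLf)

    For r = LBound(lines) To UBound(lines)
        ' Skip empty lines (for example after a trailing line break)
        If Len(lines(r)) > 0 Then
            fields = Split(lines(r), ",")
            For c = LBound(fields) To UBound(fields)
                ' Split returns zero-based arrays; sheet cells are one-based
                ActiveSheet.Cells(r + 1, c + 1).Value = fields(c)
            Next c
        End If
    Next r
End Sub
```

It would be called as CsvTextToSheet csv_file after the Send.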

If I send 'ACTION=HTML' I receive a web page with two frames, and in one
of these frames there is a table with the data I need (same as the CSV).

Thanks again,
Tanilo
 
Chris Marlow

Tanilo,

No worries. I think your 'csv_file' is actually HTML.

Add the following Dims:

Dim objHTMLDoc As Object
Dim objEleTables As MSHTML.IHTMLElementCollection
Dim objEleRows As MSHTML.IHTMLElementCollection
Dim objHTMLRow As MSHTML.HTMLTableRow
Dim objEleCells As MSHTML.IHTMLElementCollection
Dim objHTMLCell As MSHTML.HTMLTableCell
Dim objHTMLTable As MSHTML.HTMLTable

Then;

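Set objHTMLDoc = New MSHTML.HTMLDocument
' objWinHTTP is the WinHttpRequest object (WinHttpReq in your code)
objHTMLDoc.write objWinHTTP.ResponseText

This populates objHTMLDoc (which has to be declared As Object, i.e. late
bound, for some reason that escapes me).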

To get the tables in the HTML you would then use;

'Get the collection of tables in the HTML
Set objEleTables = objHTMLDoc.body.getElementsByTagName("table")

To get a table from this;

'Get the first table in the output
Set objHTMLTable = objEleTables(0)

'Get the rows of the table
Set objEleRows = objHTMLTable.getElementsByTagName("tr")

.... and so on down to Cells. You will need to know a little HTML to go
further, and you need to be sure the page structure is not going to change
(or that your code is flexible enough to cope with it).
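
The remaining steps down to the cells could look roughly like the sketch
below. It rests on assumptions that are not in the thread: the helper name
TableToSheet is made up, it takes the table object obtained above, and it
writes to the active sheet starting at A1. It also walks the table's own
rows and cells collections instead of calling getElementsByTagName("tr"),
so that rows of nested tables are not picked up and header (th) cells are
included.

```vba
' Hypothetical helper: copies one HTML table onto the active sheet.
' Requires a reference to the Microsoft HTML Object Library.
Sub TableToSheet(ByVal objHTMLTable As MSHTML.HTMLTable)
    Dim objHTMLRow As MSHTML.HTMLTableRow
    Dim objHTMLCell As Object
    Dim r As Long, c As Long

    ' rows holds only the rows that belong to this table
    For r = 0 To objHTMLTable.Rows.Length - 1
        Set objHTMLRow = objHTMLTable.Rows.Item(r)

        ' cells holds the td and th elements of this row
        For c = 0 To objHTMLRow.Cells.Length - 1
            Set objHTMLCell = objHTMLRow.Cells.Item(c)
            ' innerText is the visible text with the markup stripped;
            ' HTML collections are zero-based, sheet cells are one-based
            ActiveSheet.Cells(r + 1, c + 1).Value = objHTMLCell.innerText
        Next c
    Next r
End Sub
```

With the variables above it would be called as TableToSheet objEleTables(0).
Merged cells (rowspan/colspan) are not accounted for, so a table that uses
them would land in the wrong columns.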

Regards,

Chris.

Chris Marlow
MCSD.NET, Microsoft Office XP Master
 
tanilov

Chris Marlow wrote:

[snip]

Thanks a lot Chris
Everything works great now :-D

Ciao,
Tanilo
 