Fetching META tag data from non-local URL

Chris Kinata

Hi all,

Is there a way to load and/or fetch meta-tag and
document data from a non-local webpage in a relatively
safe way? I have to say first that I'm not interested in mining
pages for email or other data. Basically very frustrated with the
bookmark management apps out there, and wanting to snag
snippets such as meta description, keywords, or even
selections in documents, and add them to standard internet
shortcut info through a database.

In a general sense, some ideas:

--Use FPVB?

--Develop an HTA that wraps an IFRAME around the
non-local page, and references the page's DOM.
Wondering whether the implied release from the
cross-frame security restraints would be unwise.

--Do essentially the same thing with a webpage in IE,
but use DHTML.

Regards,
Chris
 
Steve Easton

In line

Chris Kinata said:
> Hi all,
>
> Is there a way to load and/or fetch meta-tag and
> document data from a non-local webpage in a relatively
> safe way? I have to say first that I'm not interested in mining
> pages for email or other data. Basically very frustrated with the
> bookmark management apps out there, and wanting to snag
> snippets such as meta description, keywords, or even
> selections in documents, and add them to standard internet
> shortcut info through a database.
>
> In a general sense, some ideas:
>
> --Use FPVB?

Yes and no, more likely no. You're going to need something a little more robust than
what's available via FP and VBA, because you need to alter the way IE creates shortcuts
/ favorites. As it stands, it uses the title tag from the page (and, obviously, the URL
of the page).
You're going to need to intercept / alter this behavior, and that's going to take some
programming using API calls to manipulate the browser "shell."

> --Develop an HTA that wraps an IFRAME around the
> non-local page, and references the page's DOM.
> Wondering whether the implied release from the
> cross-frame security restraints would be unwise.

HTAs are neat, but remember: they run within the browser "domain" and you're going to
run into "cross domain scripting" issues, and the normal IE behavior when these arise is
to shut down and request to send an error message to MSFT.

> --Do essentially the same thing with a webpage in IE,
> but use DHTML.

Same issues as mentioned above.

Regards

--
Steve Easton
Microsoft MVP FrontPage
95isalive
This site is best viewed............
........................with a computer
 
Chris Kinata

Hi Steve, thanks...response to one of your comments:

> HTAs are neat, but remember: they run within the browser "domain" and you're going to
> run into "cross domain scripting" issues, and the normal IE behavior when these arise is
> to shut down and request to send an error message to MSFT.

In the MSDN KB article Q241754--HOWTO: Create Cross-Frame Scripting-Capable Web Pages with
HTML Applications (HTAs), it says (and I quote 8)

"When a user accesses the HTA, it asks whether he or she wants to "execute" the file. If
the user says yes, the HTA opens in its own window. From that point on, documents can
script freely across frames whose documents come from different domains. This is
considered secure because it uses trust-based security...[etc.]"

The article was dated 2001, and applies to IE 5 and 5.5.

I'm thinking that unless this has been superseded and nullified in some way, I could
create an HTA (have developed a few for solely local-file purposes) that displays the
remote site in an IFRAME and then access it in various ways. My question (in this
scenario) is whether a script in the remote site would be able to access the HTA upward
through the parent frame.

Regards,
Chris Kinata
 
Steve Easton

Inline


Chris Kinata said:
> Hi Steve, thanks...response to one of your comments:
>
>> HTAs are neat, but remember: they run within the browser "domain" and you're going to
>> run into "cross domain scripting" issues, and the normal IE behavior when these arise is
>> to shut down and request to send an error message to MSFT.
>
> In the MSDN KB article Q241754--HOWTO: Create Cross-Frame Scripting-Capable Web Pages with
> HTML Applications (HTAs), it says (and I quote 8)
>
> "When a user accesses the HTA, it asks whether he or she wants to "execute" the file. If
> the user says yes, the HTA opens in its own window. From that point on, documents can
> script freely across frames whose documents come from different domains. This is
> considered secure because it uses trust-based security...[etc.]"

Correct. When a visitor opens an HTA from a server, they will be prompted as to whether
they want to open it.

Try one of the Clock examples here: http://95isalive.com/java/clocks.htm

They are .hta files that run JavaScript. (I had actually forgotten I had them up there.)

> The article was dated 2001, and applies to IE 5 and 5.5.
>
> I'm thinking that unless this has been superseded and nullified in some way, I could
> create an HTA (have developed a few for solely local-file purposes) that displays the
> remote site in an IFRAME and then access it in various ways. My question (in this
> scenario) is whether a script in the remote site would be able to access the HTA upward
> through the parent frame.

The key words here are "within a frame or IFRAME."

When I said cross domain scripting, the term domain refers to the browser and system as
"domains."

You won't be able to access a downloaded and opened .hta file with a server-based script.
It will need to be client-based.

--
Steve Easton
 
Chris Kinata

Hi Steve,

Thanks for your attention in this. Just one more question, and then
I'll leave you alone 8).

Just to make really sure: to the best of your knowledge, if I created a
_desktop_ HTA on my machine that frames a _remote_ site,

1) I can access the framed document through its DOM (meta tags would be
nice, too). I should probably just do the experiment...8)

2) that site wouldn't be able to access my machine upward through
the frame's parent.

> Try one of the Clock examples here: http://95isalive.com/java/clocks.htm
>
> They are .hta files that run JavaScript. (I had actually forgotten I had them up there.)

Cool. The original scripts at Kurt's site
http://www.btinternet.com/~kurt.grigg/javascript/
are also very cool.

> The key words here are "within a frame or IFRAME."
>
> When I said cross domain scripting, the term domain refers to the browser and system as
> "domains."
>
> You won't be able to access a downloaded and opened .hta file with a server-based
> script. It will need to be client-based.


I absolutely have no desire to access a downloaded HTA through server script or
any other method. Actually, too nervous to run a remote HTA, except in the
single instance of your example (really, what I did was to download it, examine it
in FrontPage, and then run it locally). Just want in this case to set up my own
mini-browser with enhanced functionality for recording link information from browsed
pages.

Regards,
Chris Kinata
 
Steve Easton

In line again.

Chris Kinata said:
> Hi Steve,
>
> Thanks for your attention in this. Just one more question, and then
> I'll leave you alone 8).
>
> Just to make really sure: to the best of your knowledge, if I created a
> _desktop_ HTA on my machine that frames a _remote_ site,
>
> 1) I can access the framed document through its DOM (meta tags would be
> nice, too). I should probably just do the experiment...8)

Correct, but the scripts will have to run from within the .hta file.

> 2) that site wouldn't be able to access my machine upward through
> the frame's parent.

Correct.
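
As a rough sketch of point 1 (not tested inside an HTA): once the framed page has loaded, a script in the .hta file could walk the frame document's meta tags along these lines. The function name collectMeta is made up for illustration, and the code assumes the framed document exposes getElementsByTagName and that each tag exposes getAttribute.

```javascript
// Hypothetical helper: gather name/content pairs from the <meta> tags of a
// document-like object (assumed to be the framed page's document).
function collectMeta(doc) {
  var result = {};
  var tags = doc.getElementsByTagName("meta");
  for (var i = 0; i < tags.length; i++) {
    var name = tags[i].getAttribute("name");
    var content = tags[i].getAttribute("content");
    // Skip tags without a name (for example http-equiv entries).
    if (name && content !== null) {
      result[String(name).toLowerCase()] = content;
    }
  }
  return result;
}
```

Inside the HTA, the document to pass in would presumably be the IFRAME's own document, reached through the frame object; how that is exposed depends on the IE version, so treat that part as an assumption. The result is a plain object such as one with "description" and "keywords" entries, which could then be stored alongside the shortcut.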

> Cool. The original scripts at Kurt's site
> http://www.btinternet.com/~kurt.grigg/javascript/
> are also very cool.

Yep, I emailed Kurt and asked permission to use them in the .hta files.

> I absolutely have no desire to access a downloaded HTA through server script or
> any other method. Actually, too nervous to run a remote HTA, except in the
> single instance of your example (really, what I did was to download it, examine it
> in FrontPage, and then run it locally). Just want in this case to set up my own
> mini-browser with enhanced functionality for recording link information from browsed
> pages.

This is actually quite simple.

Copy and paste the following into a blank Notepad file and save it with the .hta
extension. Double-click it to open it, enter a website address, and then click the Go
button.

Start copy below:

<html>
<head>
<TITLE>My Browser</TITLE>
<!-- The HTA:APPLICATION attributes control the application's window chrome -->
<HTA:APPLICATION ID="oHTA"
APPLICATIONNAME="MyBrowser"
SCROLL="no"
BORDER="thin"
BORDERSTYLE="normal"
WINDOWSTATE="maximize">
</head>
<body>
<span id="AddressBar" style="overflow: hidden">
<span id="AddText">Address</span>
<input type="text" value="" id="TheAddress" size="20" style="width: expression(document.body.clientWidth - AddText.offsetWidth - AddGo.offsetWidth - 45)">
<input type="button" value="Go" id="AddGo" onclick="navigate()">
</span><br>
<iframe frameborder="0" id="TheFrame" style="width: 100%; height: 100%;"></iframe>
<script language="JScript">
// Load the typed address into the frame.
function navigate() {
  document.all.TheFrame.src = TheAddress.value;
}
// Treat the Enter key in the address box like a click on Go.
function clickShortcut() {
  if (window.event.keyCode == 13) {
    navigate();
  }
}
TheAddress.onkeypress = clickShortcut;
</script>
</body>
</html>


Stop copy
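
One practical wrinkle with the sample above: navigate() hands whatever was typed straight to the IFRAME, so an address typed without a scheme may be treated as a path relative to the .hta file. A small helper could add "http://" when no scheme is present. This is only a sketch; normalizeAddress is a hypothetical name and not part of the original sample, and it recognizes only schemes written with "://".

```javascript
// Hypothetical helper: make sure a typed address carries a scheme.
function normalizeAddress(text) {
  // Trim leading and trailing whitespace (written for older script engines).
  var trimmed = String(text).replace(/^\s+|\s+$/g, "");
  if (trimmed === "") {
    return "";
  }
  // Leave the address alone if it already starts with something like "http://".
  if (/^[a-z][a-z0-9+.\-]*:\/\//i.test(trimmed)) {
    return trimmed;
  }
  // Assumption: plain host names are meant as http addresses.
  return "http://" + trimmed;
}
```

In the sample, navigate() could then assign normalizeAddress(TheAddress.value) to the frame's src instead of the raw text.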

Regards

--
Steve Easton
 
Chris Kinata

Hey Steve,

Thanks for your comments and the cool HTA code sample--mainly wanted
the blessing of a guru to assuage some anxiety on the cross-frame security
issue before I got too involved with it.

Best regards,
Chris
