Kremnari

Patent images download automation help

6 posts in this topic

Greetings everyone.

I must confess, this is my first time using this program myself, but I have seen it in action and I am impressed with the performance.

Now it is my turn to use the program, and I'm having trouble. I have an idea of the overall flow of the script, but I need help with the specifics of the programming. I have gone through the tutorials and the FAQ and will keep working on it, but since time is a factor, I thought I'd go ahead and post this to get some help a little faster.

Some background

For those of you who've never been around the USPTO (United States Patent and Trademark Office) website, it's arranged something like this. Each patent is available in full text on one web page, and the scanned full text and images are available on a separate web page, one scanned page at a time. Normally the images wouldn't be such a problem to do without, but patents are written like legal documents, which are not much fun without some visual guidance.

So here's the deal. I need to start from one web page, go to the site with the scanned pages, and save each page individually (it only loads one at a time) to a folder. Go to the next page, download, yadda yadda. After that patent is done, the script needs to go back to the starting page, find the first referenced patent (these are already hyperlinked), go to it, and start the saving process again.

If I can get help this far, my programming experience should be able to get me to cover the rest.

To help illustrate what I have to do, here's the starting web page: http://patft.uspto.gov/netacgi/nph-Parser?...p;RS=PN/6118724

The first image's webpage is at: http://patimg1.uspto.gov/.piw?Docid=061187...View+first+page

Everything I need to have happen should be doable with mouse clicks; it's just a matter of where and when. If you don't have a *.tiff viewer, it doesn't matter. The one I use has a "save as..." button, so saving the image wouldn't be a problem.

Thanks for your help.

Kremnari

P.S.

How do I ensure that a running script does not activate any windows, so that all the windows it uses stay in the background?
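
One possible approach to the background question, as a rough sketch only: the IE.au3 library's _IECreate accepts a visibility parameter, so the browser can be created hidden and read from without ever being activated. The URL below is a placeholder and not one of the patent pages, and whether a viewer plugin behaves inside a hidden window is not something this sketch verifies.

```autoit
#include <IE.au3>

; Placeholder URL (assumption); replace with the page to automate
Local $sUrl = "about:blank"

; Second parameter 0 = do not attach to an existing window
; Third parameter 0 = create the browser window hidden
Local $oIE = _IECreate($sUrl, 0, 0)
If @error Then Exit 1

; Read the page HTML without showing or activating the window
Local $sHtml = _IEBodyReadHTML($oIE)
ConsoleWrite(StringLen($sHtml) & " characters read" & @CRLF)

; Close the hidden browser instance when done
_IEQuit($oIE)
```

For native controls such as toolbar buttons, the Control* functions (ControlClick, ControlSend) are designed to work on windows that are not active, so they are the usual choice when the script should not steal focus.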




Thank you, these sites have been very helpful.

I now need to get the number of pages to use in a For loop. I can search for the text that prefixes the number, but how can I get the number itself?

Thanks

Kremnari


Well, I'm not too good with Auto-IE myself... :">

But for me this works at the first image-page:

#include <Inet.au3>

$ie = 'http://patimg1.uspto.gov/.piw?Docid=06118724&homeurl=http%3A%2F%2Fpatft.uspto.gov%2Fnetacgi%2Fnph-Parser%3FSect1%3DPTO1%2526Sect2%3DHITOFF%2526d%3DPALL%2526p%3D1%2526u%3D%25252Fnetahtml%25252FPTO%25252Fsrchnum.htm%2526r%3D1%2526f%3DG%2526l%3D50%2526s1%3D6118724.PN.%2526OS%3DPN%2F6118724%2526RS%3DPN%2F6118724&PageNum=&Rtype=&SectionNum=&idkey=NONE&Input=View+first+page'
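; Download the HTML source of the page
$test = _INetGetSource($ie)
; Split on the whole "NumPages=" marker (flag 1 = treat the delimiter as one string)
$split = StringSplit($test, "NumPages=", 1)
If $split[0] < 2 Then Exit MsgBox(16, "test", "NumPages marker not found")
; The page count runs up to the closing " -->" of the HTML comment
$var = StringSplit($split[2], " -->", 1)
MsgBox(0, "test", $var[1])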

Don't know if it's useful for you ;)

Neo
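
A regular-expression variant of the same idea, offered only as a sketch: it assumes, as the StringSplit version above does, that the page source contains a marker of the form "NumPages=<number> -->". The function name _GetNumPages and the sample string are made up for illustration.

```autoit
; Sketch: pull the page count out of the page source with a regular expression.
; _GetNumPages is a hypothetical helper name, not part of any UDF.
Func _GetNumPages($sSource)
    ; Flag 1 returns an array holding the captured group(s) of the first match
    Local $aMatch = StringRegExp($sSource, "NumPages=(\d+)", 1)
    If @error Then Return SetError(1, 0, 0) ; marker not found
    Return Number($aMatch[0])
EndFunc

; Sample input (assumption); in practice pass the result of _INetGetSource()
Local $iPages = _GetNumPages("<!-- NumPages=12 -->")
If @error Then
    MsgBox(16, "Error", "Page count not found")
Else
    ; The numeric result can drive the For loop over the scanned pages
    For $i = 1 To $iPages
        ConsoleWrite("Would fetch page " & $i & @CRLF)
    Next
EndIf
```

Returning a number rather than a string means the result can be used directly as the upper bound of the For loop, and the @error check avoids indexing into an array that was never filled.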




Thank you, that is useful. This would probably be easier if I could read source, or just maybe look to begin with!

Kremmy


Alright, next issue. I have the tiff viewer, but it doesn't show up in the source. It's loaded when the filetype is detected. I can go directly to the image, to skip out on the frames, but both the source and save functions are disabled. Here's the link of where I'm at: http://patimg1.uspto.gov/.DImg?Docid=US006...ey=B48EA3D4F425

There is a toolbar from AlternaTiff (it's free if you want to download it and follow along). It has a "save as" button, and the AutoIt Window Info tool reports its class name as Button2. I've tried to use the function _IEFormElementGetObjByName, but when I call .click on the stored variable, I get an error saying that the variable must be of type Object.

Are there any ideas as to how I would go about referencing that button to click?

I think this is the last piece of my puzzle. The rest is simply for loops.

Kremmy
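
A possible direction, sketched under assumptions: the _IE* functions work on elements of the HTML document, while a plugin toolbar button is a native window control, which would explain why no object comes back. Native controls are normally driven with ControlClick. The window title below is an assumption (the standard Internet Explorer frame class); "Button2" is the identifier reported by the Window Info tool above.

```autoit
; Assumed window specifier; adjust to the title the Window Info tool shows
Local $sTitle = "[CLASS:IEFrame]"

If WinExists($sTitle) Then
    ; "Button2" is the control identifier (ClassnameNN) from the Window Info tool
    ; ControlClick returns 1 on success and 0 on failure
    If Not ControlClick($sTitle, "", "Button2") Then
        MsgBox(16, "Error", "Could not click the control")
    EndIf
Else
    MsgBox(16, "Error", "Browser window not found")
EndIf
```

ControlClick sends the click to the control directly rather than moving the mouse, so the browser window does not need to be active for it to work.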
