Guest Darksoul71

Help needed: Automated Webpage2PDF generation -> how to get everything on the PDF ?

3 posts in this topic


Hi all,

First of all, I must admit that this is not an AutoIt-related problem, but AutoIt is my favourite scripting solution, so here you go!

In my daily job as an SQAE (= Software Quality Assurance Engineer) I work with a lot of web-based systems, which often include electronic approvals / electronic documents. We often need to "file" the webpages for reference within non-web-based databases (don't ask! It's just the way we work *ggg*). This involves a lot of manual steps (e.g. launching a print job via FreePDF, storing the generated PDF with the correct naming scheme in a temporary directory, attaching it to the right document within a Notes database, etc.).

If those steps don't sound like much work, just imagine you have something like 30-40 webpage printouts to store each day besides your daily work.

In addition to this, we have stored a lot of webpages as HTML files on our local share, which hold information on SQA-relevant topics. We plan to convert them to PDF as well and store them within a Notes DB as some sort of information pool.

While automating the printout (= conversion to PDF) and the correct naming is not a real problem, I've run into a few stupid limitations (mostly of IE, as I found out during my latest research on the web):

* Complex webforms which use some sort of "MemoBoxes" to store lots of text are simply truncated during printout. IE seems to limit the number of lines which are printed out. I found this out by coincidence when someone copied a complete AS400 report into such a "MemoBox".

* IE also seems unable to "scale" the printout when the screen resolution is "wider" than the paper format you've chosen. This also occurs when someone enters a very long text without CR & LF at the end. The "MemoBoxes" get very wide, and even setting the paper format to A4 Landscape sometimes isn't enough to capture all the information.

I've played around with AutoIt on my home PC and managed fairly quickly to do a "semi-automated" conversion of webpages to PDF using FreePDF. I'm also running Firefox at home, which provides the option to scale any webpage to fit onto the printout, but this functionality is, ...well..., let's say "less than optimal"!

After printing out a few critical example webforms I brought with me from work, I found out that Firefox either screws up the formatting of the webpage itself (if one can speak of "format" :sorcerer:) or simply cuts off the object which is wider than the printout (this happened with a pretty large diagram).

I know that there are fancy HTML2PDF generators out there, but my boss tells me "no money" and IT tells me "not on our certified application list". So anything I want to implement has to be "cheap" and has to work without much installation.

Nice requirements, huh ? :o

I've never touched Perl or any other scripting language besides VBA, AutoIt and a little bit of Automate. Maybe there is some open source module out there that is able to solve my problem? I could also imagine developing a compiled binary in Delphi / VC++ / VB that opens and prints the webform, and calling that tool from AutoIt.

I'm aware that the simplest (but also the most stupid) approach would be printing out every webpage in A0 Landscape, but this makes normally formatted webpages a pain in the neck to read. In particular, I dislike having to zoom every time I read a PDF.

:geek:

BTW: I've also tried to use M$ Word to open / copy the content of the webpage, but with limited success. My hope was that M$ Word would be able to "reformat" the webpage, including its images, without screwing up too much of the layout. ;)

Any ideas so far ?

TIA,

D$

>>Edit:

Maybe I should add that I already have a working "select X HTML files, convert them to PDF and store them in a specified output folder" script.

Everything works apart from the "leave something out during printing" problem.

A possible solution I have in mind:

1) Find out if the browser window has a horizontal scroll bar

2) If Yes (= webpage wider than browser window) then get "webpage width"

3) Calculate "true width" of webpage in cm

4) Calculate shrink ratio for A4 page

5) Print the webpage with the calculated shrink ratio.

/Edit<<
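Steps 3 and 4 of the idea above are plain arithmetic, so here is a minimal sketch of them in Python (not AutoIt, and not something I have wired into a browser). Everything in it is an assumption for illustration: the function names are made up, the 96 DPI screen resolution and the 2 cm margins are guesses, and the page width in pixels would still have to come from the browser somehow (steps 1 and 2).

```python
import math

A4_WIDTH_CM = 21.0    # A4 portrait width
A4_HEIGHT_CM = 29.7   # A4 portrait height (= landscape width)
CM_PER_INCH = 2.54


def page_width_cm(width_px: int, dpi: float = 96.0) -> float:
    """Step 3: convert a rendered page width in pixels to centimetres.

    The default of 96 DPI is an assumption, not a measured value.
    """
    return width_px / dpi * CM_PER_INCH


def shrink_percent(width_px: int, dpi: float = 96.0,
                   margin_cm: float = 2.0, landscape: bool = False) -> int:
    """Step 4: print scale in percent that fits the page width onto A4.

    Pages that already fit are kept at 100 % (never enlarged).
    margin_cm is a hypothetical left/right margin applied on both sides.
    """
    paper_cm = A4_HEIGHT_CM if landscape else A4_WIDTH_CM
    printable_cm = paper_cm - 2 * margin_cm
    if printable_cm <= 0:
        raise ValueError("margins leave no printable width")
    ratio = printable_cm / page_width_cm(width_px, dpi)
    # Round down so the scaled page never exceeds the printable width.
    return min(100, math.floor(ratio * 100))


if __name__ == "__main__":
    print(shrink_percent(1600))                  # 40
    print(shrink_percent(1600, landscape=True))  # 60
    print(shrink_percent(600))                   # 100
```

With these assumptions a 1600 pixel wide page would need roughly 40 % scaling on A4 portrait and 60 % on A4 landscape. Whether the browser's print dialog accepts such a value for step 5 is a separate question.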

Edited by Darksoul71




Hi ChrisL,

Thanks for the hint, but I'm afraid Cute PDF will not help, since FreePDF (which I currently use both at home and at work) is essentially the same. Also, my problem is related to the HTML2Print engine of both IE and Firefox rather than to the PDF conversion itself.

I guess what I really need is a free alternative to the billions of HTML2PDF converters out there :o

Best regards,

D$

Have you looked at Cute PDF with the Ghostscript engine?

It works as a Windows printer, so you just send the page to print.

It may help with what you need to do.
