bkonst August 26, 2008 at 11:14

Giving it away to good hands ...

... an LGPL project:

Developers

The sole developer is yours truly.
The project is sponsored by Darren Gates, owner of tufat.com.

Why give it away?

Lack of time for my own projects. Alas, over the last year I have carved 1-2 hours a week out of my work time for html2ps/pdf, and that went mainly to answering questions on the forum, fixing bugs as they were discovered, and (very rarely) adding small new features.
Naturally, it would be a pity to let the project die, if only because some of its features are unique among LGPL tools. I would like to find a successor.

What does "give away" mean?

admin rights to the project;
consultation on why the internals were done this way and not otherwise;
if necessary, mediation with Darren.

A brief history

The project was born in the fall of 2004 as a clone of the widely known html2ps, written in Perl. More precisely, it was a clone for about five minutes, until I looked into the html2ps code ... and decided that I would save time by reinventing the wheel from scratch. By the beginning of 2005, code had emerged that could digest simple HTML without tables, parse simple CSS, and generate the corresponding PostScript. (Some may be shocked to learn that 90% of the calculations for placing text on the page took place inside the PostScript file itself.) In January 2005, it was decided to change the license to the LGPL (instead of the originally planned "everything for $5" model).

$pta=defined $p{"text-align"}?$p{"text-align"}:$body{"text-align"};
$pal=0;
$pal=1 if($pta=~/^c/i);
$pal=2 if($pta=~/^r/i);
$pal=3 if($pta=~/^j/i);
$bgcol=&col2rgb($body{"background"});
if(!$bgcol) {$bgcol="[16#FF 16#FF 16#FF]"};
if(!$p{"color"}) {$p{"color"}="black"};
$tcol=&col2rgb($p{"color"});
$lcol=&col2rgb($a__link{"color"});
if($lcol) {
 $Lc="/Lc t D\n/Dl $lcol D\n";
 $Lc.=$tcol ne $lcol?"/LX t D":"/LX f D";
} else {
 $Lc="/Lc f D\n/LX f D";
}
$pcol=&col2rgb($pre{"color"});
if(!$pcol) {$pcol="[0 0 0]"};
$deftbg=&col2rgb($table{"background"});

Over the course of the year, the script evolved, put on weight, gained features, and lost bugs. By the end of the year it became clear that the chosen approach, “write everything in PostScript and let the printer figure it out”, was far from ideal: the amount of computation grew with the complexity of the pages being processed, and even conversion with ps2pdf on a “big” computer began to take several seconds. It was decided to stop torturing printers with calculations and, at last, to move the element placement algorithms out of PostScript. This change was marked by the release of html2ps/pdf 1.0.

In its new form, the script lasted almost until the end of 2006, when it became entirely clear that things could not go on like that: users were asking for new features in volumes unexpected for such a small community, there was no convenient API for embedding html2ps/pdf in third-party projects, and the algorithms ported from PostScript had landed in PHP in a rather "strange" form that did nothing to make changes easier. The time had come for the second (and last) rewrite of the core and the arrival of version 2.0.

The first two-thirds of 2007 were perhaps the project's best time: a proper API appeared, along with optimization (no, the script did not become truly fast; still, its passionate love for resources was somewhat curbed) and many new features.

From the summer of 2007, my personal life and another project began to take up more and more time. As a result, html2ps/pdf got a miserable 1-2 hours a week, spent mainly on supporting the script on the forum. In May 2008, I decided there was no point in continuing in the same vein, talked with Darren, and stopped trying to find time for further development.

Pros and cons

At the moment, this is the only LGPL tool I know of that supports floats, position: absolute and position: fixed. Beyond the "basic" HTML/CSS, it also supports such nice little things as:

Internal and external links;
Footnotes;
Interactive forms;
Table of contents generation;
"Watermark" insertion;
Complex tables with rowspan/colspan;
A pretty good pagination algorithm (aware of the CSS properties page-break-*, orphans and widows);
Unicode (successfully used for documents mixing Korean and French).
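To make the pagination item above a little more concrete, here is a minimal sketch of how a paginator might honor the CSS orphans and widows properties when a paragraph does not fit in the space left on a page. It is written in Python for brevity and is not taken from html2ps/pdf; the function name lines_to_keep and its parameters are hypothetical. The defaults of 2 match the initial values of both properties in CSS 2.1.

```python
def lines_to_keep(total_lines, room, orphans=2, widows=2):
    """Decide how many lines of a paragraph stay on the current page.

    total_lines: number of lines in the paragraph
    room:        number of lines that still fit on the current page
    orphans:     minimum lines that must remain at the bottom of a page
    widows:      minimum lines that must be carried to the top of the next page

    Returns 0 when the whole paragraph should move to the next page.
    (Hypothetical helper, not html2ps/pdf code.)
    """
    if total_lines <= room:
        # The paragraph fits entirely, so no break is needed.
        return total_lines

    # Leave at least `widows` lines for the next page.
    keep = min(room, total_lines - widows)

    # If fewer than `orphans` lines would remain here, do not split at all.
    if keep < orphans:
        return 0
    return keep
```

For example, a 10-line paragraph with room for 9 lines keeps only 8 on the current page so that 2 lines (not 1) start the next one, while a 5-line paragraph with room for 1 line is moved to the next page as a whole.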

There is a well-documented API with examples.

The script tries (quite successfully, from my point of view) to correct such trifles in the page source as unclosed tags, missing quotes around attribute values, <, > and & characters in the wrong places, and so on.
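As a rough illustration of this kind of cleanup (not the project's actual code, which is PHP), here is a small Python sketch of two of the repairs: escaping ampersands that do not start a character reference, and adding quotes around bare attribute values inside a single tag. The function names are hypothetical, and the regular expressions are deliberately naive; for instance, they do not handle self-closing tags or attribute-like text inside quoted values.

```python
import re

# "&" that is NOT followed by a named, decimal, or hexadecimal reference.
_STRAY_AMP = re.compile(r"&(?!(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)")

# attribute=value where the value is not already quoted.
_BARE_ATTR = re.compile(r"(\s[\w-]+)=([^\s\"'>]+)")


def escape_stray_ampersands(text):
    """Replace stray '&' characters with '&amp;' and keep valid references."""
    return _STRAY_AMP.sub("&amp;", text)


def quote_bare_attributes(tag):
    """Wrap unquoted attribute values of one tag in double quotes."""
    return _BARE_ATTR.sub(r'\1="\2"', tag)


print(escape_stray_ampersands("AT&T &amp; co &#169; &x"))
print(quote_bare_attributes("<a href=foo.html class=x>"))
```

A real converter would do this inside a tolerant tokenizer rather than with standalone regular expressions, but the sketch shows the spirit of the repairs.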

Now for the bad and the scary. First, a substantial part of the code was written by me four years ago (with, accordingly, four years less experience in design and programming); second, the code has survived one change of coding style and two rewrites of the core. The natural result is that in some places you will find code I am ashamed of, and in others code I am very ashamed of. (I am also ashamed of the miserable 5% test coverage.)

Another problem is the sheer amount of code (~1.5 megabytes in ~350 files) and a rather complex subject area (you cannot imagine how much room for interpretation the standards actually leave).

If you are interested, and the bad and the scary have not put you off, write to me. I will be glad to answer.
