RE: [xsl] FOP conversion font problems

Subject: RE: [xsl] FOP conversion font problems
From: "Jack Cane" <jwcane@xxxxxxxxxxx>
Date: Thu, 24 Apr 2003 10:40:58 -0400
As an aside, I note that the fop command line, fop -c
conf\userconfig.xml -fo file.fo -pdf file.pdf, contains only six arguments.
When I look inside the fop.bat file (which I assume you are using), I find
8 command line arguments. Can I assume that %7 and %8 are not needed?

tks,

jwc

-----Original Message-----
From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
[mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of Mike Ferrando
Sent: Thursday, April 24, 2003 10:16 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] FOP conversion font problems


--- "J.Pietschmann" <j3322ptm@xxxxxxxx> wrote:
> Mike Ferrando wrote:
> > Thanks for the links. I have read them. I will just have to transform
> > my XML into FO catching every character not in the base font set. I
> > will then put the character in <fo:inline> and indicate the font
> > family found in my userconfig.xml file. The bottom line is that the
> > characters not in the base font set are a wash.
>
> The base fonts are sufficient for most westerners, in particular they
> are a slight superset of Latin-1.
> If you have that many non-Latin-1 characters, why don't you use a
> full Unicode font, like ArialUni.ttf? You'll have to declare the font
> family only once, e.g. on fo:root.

J.,
I have done this already. Using only the font-family "Arial"
(arial.xml), all characters are displayed correctly when the -c
userconfig.xml is given as an option in the command line.

  fop -c conf\userconfig.xml -fo file.fo -pdf file.pdf

>
> > At least I can
> > display them correctly even if they cannot be searched.
>
> Do you mean you can't find such characters in the resulting PDF?
> I don't have this problem.

When the document is open in Acrobat 5, I try to search words that
appear in the Arial font. I get no results. Nothing is found by the
Acrobat search tool. However, if I transform all text in the Base
font (Times), and only the one character (&#299;) in the "Arial"
font, I can find the whole word up to that character.
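
For illustration only, the markup described above (base font for
everything, with an fo:inline in the "Arial" family around just the
out-of-range characters) could be produced along these lines. This is a
minimal Python sketch, not the stylesheet used in this thread. The
function name wrap_non_latin1 is hypothetical, and the cutoff at U+00FF
is an assumption based on the remark that the base fonts are roughly
Latin-1.

```python
from xml.sax.saxutils import escape


def wrap_non_latin1(text, font_family="Arial"):
    """Wrap runs of characters above U+00FF in an fo:inline element.

    Assumption: anything up to U+00FF is covered by the base font, so
    only code points above that need the alternate font family.
    """
    out = []   # finished pieces of output markup
    run = []   # current run of characters needing the alternate font

    def flush():
        # Emit the pending run as numeric character references.
        if run:
            refs = "".join("&#%d;" % ord(c) for c in run)
            out.append('<fo:inline font-family="%s">%s</fo:inline>'
                       % (font_family, refs))
            run.clear()

    for ch in text:
        if ord(ch) > 0xFF:
            run.append(ch)
        else:
            flush()
            out.append(escape(ch))  # escape &, < and > for XML output
    flush()
    return "".join(out)
```

Consecutive out-of-range characters are grouped into a single
fo:inline, which keeps the FO output smaller than wrapping each
character separately.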

I have even looked through the cid-fonts.fo file as well. Basically
all characters that are not in the Base 14 font sets are
unsearchable. The Apache site even confirms this being the result of
using metric fonts for these characters. (quote)

When embedding TrueType fonts, a new font, containing only the glyphs
used, is created from the original font and embedded in the pdf.
Currently, this embedded font contains only the minimum data needed
to be embedded in a pdf document, and does not contain any codepage
information. The PDF document contains indexes to the glyphs in the
font instead of to encoded characters. While the document will be
displayed correctly, the net effect of this is that searching,
indexing, and cut-and-paste will not work properly.
http://xml.apache.org/fop/fonts.html#embedding

I thought that this paragraph was directed to ttf files only. So I
tried the ttc files. But the results were still the same. Any
characters that are not part of the Base 14 fonts (transformed from
embedded fonts) will not be able to be searched in the pdf output
file.
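
To make the quoted explanation concrete, here is a toy Python model of
why text stored as glyph indexes cannot be searched. It is only a
sketch of the idea. It is not FOP's embedding code or Acrobat's search
logic, and the names build_subset, encode and naive_search are
hypothetical.

```python
def build_subset(text):
    """Assign glyph indexes in order of first use (toy subset font)."""
    glyph_index = {}
    for ch in text:
        if ch not in glyph_index:
            # Index 0 is left free, by analogy with a "missing glyph" slot.
            glyph_index[ch] = len(glyph_index) + 1
    return glyph_index


def encode(text, glyph_index):
    """Store the text as glyph indexes instead of character codes."""
    return [glyph_index[ch] for ch in text]


def naive_search(stored_codes, query):
    """Search by comparing stored codes with the query's character codes.

    With no mapping from glyph index back to character, this is all a
    viewer could do, and it fails for glyph-indexed text.
    """
    needle = [ord(c) for c in query]
    n = len(needle)
    return any(stored_codes[i:i + n] == needle
               for i in range(len(stored_codes) - n + 1))


subset = build_subset("font")
codes = encode("font", subset)                 # glyph indexes 1..4
found_in_subset = naive_search(codes, "font")  # no match on indexes
found_in_plain = naive_search([ord(c) for c in "font"], "font")
```

In this model the glyph-indexed text still draws correctly, because the
indexes select the right glyphs, but the search has nothing to match
against.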

If you are having success, don't hesitate to write to me again. I am
facing a significant learning curve and I am certainly aware that I
may not have all the information or experience others do.

Mike F.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



