[KDE Dot News]

Re: What I would like to see
by Vajsravana on Monday 23/Jun/2003, @06:49
> To import it as text, all that is needed is not to render it but to use the
> same information to import it into the WordProcessor's representation of
> text.

This is not "importing". This is OCR.
You see... in many PS files there is no text at all! There is just a description of where some glyphs are to be displayed on the sheet. Some of the glyphs can be text (in the WP sense), some bitmap, some pure vectors.
Extracting the "WordProcessor's representation of text" from this is nothing more or less than OCR, not much different from extracting text from a bitmap or, a better example, a CorelDRAW file.
The Fine Print: The following comments are owned by whomever posted them.
Re: What I would like to see
by James Richard Tyrer on Monday 23/Jun/2003, @11:57
To try to interpret your nonsense:

There are two possibilities:

1. The text is represented as glyphs directly or by a name or number that references a font (either embedded or not embedded).

2. The text is a graphical representation.

In all cases that I know of, the default output of wordprocessors is #1 although some do offer #2 as an option.

Obviously, if it is #1 then the information can be imported as text, and if it is #2 then it can only be imported as a graphic.
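A rough way to tell the two cases apart is to look for PostScript's text-painting operators (`show` and its relatives) in the file. The following is only a minimal sketch under stated assumptions: the function name `uses_text_operators` is hypothetical, the scan is a naive token match rather than a real PostScript interpreter, and it will miss files whose prolog is compressed or encoded, or that contain unusual strings and comments.

```python
import re

# PostScript operators that paint text through a font (Level 1 and Level 2).
TEXT_OPERATORS = {
    "show", "ashow", "widthshow", "awidthshow", "kshow",
    "xshow", "yshow", "xyshow", "cshow", "glyphshow",
}

def uses_text_operators(ps_source: str) -> bool:
    """Naive check: does any token in the program name a text-painting operator?

    Hypothetical helper for illustration only. It strips simple (non-nested)
    string literals and % comments, then splits on whitespace and delimiters.
    """
    # Remove simple string literals so the word "show" inside text is ignored.
    no_strings = re.sub(r"\((?:\\.|[^()\\])*\)", " ", ps_source)
    # Remove comments, which run from % to the end of the line.
    no_comments = re.sub(r"%.*", "", no_strings)
    tokens = re.split(r"[\s\[\]{}()<>/]+", no_comments)
    return any(tok in TEXT_OPERATORS for tok in tokens)

# Case #1: text painted through a font.
as_text = "/Helvetica findfont 12 scalefont setfont\n72 720 moveto (Hello) show\n"
# Case #2: the same page area drawn as a filled outline, with no font involved.
as_outline = "newpath 72 720 moveto 72 732 lineto 80 732 lineto closepath fill\n"

print(uses_text_operators(as_text))
print(uses_text_operators(as_outline))
```

A file that passes this check may still defeat a simple extractor, for instance when the prolog wraps `show` in its own short procedure names and re-encodes the font.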

Since you are a PostScript expert, I don't need to tell you that even if the text can not be extracted with: "ps2ascii", the PS file may still be #1.

If you had left an e-mail address, I would have sent you a sample. But, you can do it yourself. Make a document with KWord, print it to a PS file, convert it to PDF with: "ps2pdf" and import the PDF back into KWord (you need the import filter if you don't have 1.3 Beta). This might not work perfectly, but you will get text when you import it.

NOW, try: "ps2ascii" on the PS file. NO text. Open the PS file with KEdit. NO text.
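The comparison described above can be scripted. This is a sketch only: it assumes the Ghostscript wrapper scripts `ps2pdf` and `ps2ascii` are on the PATH, the helper names and the file name `letter.ps` are hypothetical, and the KWord import step itself is still done by hand.

```python
import shutil
import subprocess
from pathlib import Path

def roundtrip_commands(ps_file: str) -> dict:
    """Build the two command lines from the post (hypothetical helper)."""
    src = Path(ps_file)
    return {
        # ps2pdf <input.ps> <output.pdf>: the PDF is what gets imported into KWord.
        "to_pdf": ["ps2pdf", str(src), str(src.with_suffix(".pdf"))],
        # ps2ascii <input.ps> <output.txt>: the direct text extraction attempt.
        "to_text": ["ps2ascii", str(src), str(src.with_suffix(".txt"))],
    }

def run_if_available(cmd: list) -> bool:
    """Run cmd only if its executable is installed; True means it exited with 0."""
    if shutil.which(cmd[0]) is None:
        return False
    return subprocess.run(cmd, check=False).returncode == 0

commands = roundtrip_commands("letter.ps")
print(commands["to_pdf"])
print(commands["to_text"])
```

Calling `run_if_available(commands["to_pdf"])` and then `run_if_available(commands["to_text"])` on a real file lets you compare what the PDF import recovers with what `ps2ascii` writes to the `.txt` file.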

Is it magic?? Or, perhaps you don't know what you are talking about. 8-D

--
JRT
  • Re: What I would like to see
    by Vajsravana on Tuesday 24/Jun/2003, @06:53
    > There are two possibilities:
    >
    > 1. The text is represented as glyphs directly or by a name or number that references a font (either embedded or not embedded).
    >
    > 2. The text is a graphical representation.

    What I meant to say is that, although not common, there are programs which generate postscript as pure vector graphics, keeping the glyphs as vector shapes and discarding the character and the font map that originates the glyphs (you can call it "graphical representation", but this is often used for bitmaps).
    You can see good examples of such files if you use HylaFAX via WHFC and analyse the PS output of various Win32 programs... the difference between similar documents printed in slightly different ways is sometimes amusing.

    Of course, even if the result looks similar when printed or viewed, there is no way to reimport this kind of (otherwise perfectly legitimate) file as text; it can only come in as a useless vector image.
    This discards the idea of using PS as a standard archiving format, at least unless you limit yourself to programs that generate it "correctly".

    > Obviously, if it is #1 then the information can be imported as text, and if it is #2 then it can only be imported as a graphic.

    As you know, "as graphic" or "not at all" has exactly the same meaning in this context.

    > Since you are a PostScript expert, I don't need to tell you that even if the text can not be extracted with: "ps2ascii", the PS file may still be #1.
    >[...]

    I usually don't use ps2ascii at all, I know very well how many problems it has, and I too skip directly to PDF instead, when I can.

    > Is it magic?? Or, perhaps you don't know what you are talking about. 8-D

    Or perhaps you did not even try to understand what I was talking about. :)
    • Re: What I would like to see
      by James Richard Tyrer on Tuesday 24/Jun/2003, @10:24
      > ... although not common, there are programs which generate postscript as pure vector graphics

      So, if it is: "not common" it isn't really relevant to the question, is it?

      > This discards the idea of using PS as a standard archiving format.

      I didn't say that it should be used as a "standard archiving format". I have suggested, and still suggest, that PDF be used as a standard format. However, this does not mean that it would not be useful to be able to import PS files directly into your WordProcessor without having to convert them to EPS or PDF first.

      > As you know, "as graphic" or "not at all" has exactly the same meaning in this context.

      Well, importing them "as graphic" isn't going to get you the text, but this feature -- which is already somewhat available on WordPerfect (it requires EPS) -- still has its uses.

      > Or perhaps you did not even try to understand what I was talking about. :)

      Yes I did, but I could not tell if you were only talking about PS files that are totally graphic images or the PS files that appear not to contain text when they actually do.

      Now that I fully understand, I can state that your reasoning is flawed. Your assertion appears to be that in some (not common) instances you will find a PS file that represents text as a graphic image and that, therefore, the ability to import PS (as text) into a WordProcessor is not a useful feature.

      This is to say that because it won't always work, there is no point in having it. This is illogical -- backwards reasoning.

      --
      JRT