Re: ps to pdf and then to text editor

On Sun, 9 Apr 2006, Paul Smith wrote:

> I print to a file, file.ps, a web-page with text. Then, I apply ps2pdf
> and I get file.pdf. However, I cannot copy (from file.pdf) the text to
> a text editor. Can one get a pdf file with copyable text?

Does this work with a really trivial web page?

What does "pdffonts file.pdf" show?

If the pdf file stores the text as strings (rather than as images), you
stand a better chance of being able to cut and paste from a pdf viewer
into the editor, but you may run into encoding issues that turn the
pasted text into gibberish.

I get:

$ cat t.html
abc

Printing to ps from Firefox, converting to pdf, loading the result in
Adobe Reader, and cutting and pasting gives: "^Y^Z^[", so the encoding
is a problem.  Xpdf would not let me copy the text.  The t.html.ps file has:

8 dict begin
/FontName /Nimbus_Roman_No9_L.Regular.0.0.Set0 def
/FontType 1 def
/FontMatrix [ 0.001 0 0 0.001 0 0 ]readonly def
/PaintType 0 def
/FontBBox [-168 -281 1031 1098]readonly def
/Encoding [
/.notdef
/uni0066/uni0069/uni006C/uni0065/uni003A/uni002F/uni0068/uni006F
/uni006D/uni0067/uni0077/uni0074/uni0057/uni0073/uni002E/uni0031
/uni0020/uni0030/uni0034/uni0039/uni0032/uni0036/uni0041/uni004D
/uni0061/uni0062/uni0063/

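This is the 'abc' --> '^Y^Z^[' encoding: the glyphs for a, b, and c sit
at positions 25, 26, and 27 of the Encoding array, and those byte values
are the control characters ^Y, ^Z, and ^[.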
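
The mapping can be checked with a short Python sketch. It assumes the
portion of the /Encoding array quoted above is complete up to /uni0063
(the listing is cut off after that), and the helper names are made up
for illustration only.

```python
# Glyph names copied from the /Encoding excerpt; index 0 is /.notdef.
ENCODING = """
/.notdef
/uni0066/uni0069/uni006C/uni0065/uni003A/uni002F/uni0068/uni006F
/uni006D/uni0067/uni0077/uni0074/uni0057/uni0073/uni002E/uni0031
/uni0020/uni0030/uni0034/uni0039/uni0032/uni0036/uni0041/uni004D
/uni0061/uni0062/uni0063
"""

def glyph_to_char(name):
    """Map a 'uniXXXX' glyph name to its character; anything else to None."""
    if name.startswith("uni") and len(name) == 7:
        return chr(int(name[3:], 16))
    return None

def build_tables(encoding_text):
    """Return (code -> char, char -> code) for the glyph names in the array."""
    names = [n.strip() for n in encoding_text.split("/") if n.strip()]
    code_to_char = {}
    for code, name in enumerate(names):
        ch = glyph_to_char(name)
        if ch is not None:
            code_to_char[code] = ch
    char_to_code = {ch: code for code, ch in code_to_char.items()}
    return code_to_char, char_to_code

def caret(code):
    """Caret notation for control codes 0-31, e.g. 25 -> '^Y'."""
    return "^" + chr(code + 64) if code < 32 else chr(code)

code_to_char, char_to_code = build_tables(ENCODING)

# What a viewer pastes when it copies the raw character codes for "abc".
pasted = "".join(caret(char_to_code[c]) for c in "abc")
print(pasted)

# Going the other way: the raw codes 25, 26, 27 map back to the text.
recovered = "".join(code_to_char[b] for b in (25, 26, 27))
print(recovered)
```

A viewer that had this table could translate the codes back to text,
which is the job a ToUnicode map does in a pdf file.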

$ pdffonts t.html.pdf
name                         type         emb sub uni object ID
---------------------------- ------------ --- --- --- ---------
YNAHAD+Nimbus_Roman_No9_L.Regular.0.0.Set0
                             Type 1C      yes yes no  9 0
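
The "uni" column reports whether the font carries a ToUnicode map; "no"
here means a viewer has only the custom codes to go on.  As a rough
sketch, the pdffonts output could be scanned for such fonts in Python.
The function name and the handling of wrapped font names are my own
assumptions, based only on the output shown above.

```python
import re

# Trailing columns of a pdffonts row: emb, sub, uni, then the object ID.
FLAGS = re.compile(r"\s(yes|no)\s+(yes|no)\s+(yes|no)\s+(\d+)\s+(\d+)\s*$")

SAMPLE = """\
name                         type         emb sub uni object ID
---------------------------- ------------ --- --- --- ---------
YNAHAD+Nimbus_Roman_No9_L.Regular.0.0.Set0
                             Type 1C      yes yes no  9 0
"""

def fonts_without_unicode_map(pdffonts_output):
    """Return names of fonts whose 'uni' column is 'no'."""
    missing = []
    pending_name = None
    for line in pdffonts_output.splitlines():
        match = FLAGS.search(line)
        if not match:
            # Header, separator, or a long font name wrapped onto its own line.
            pending_name = line.strip() or None
            continue
        if line[:1].isspace() and pending_name:
            name = pending_name          # name was on the previous line
        else:
            name = line.split()[0]       # name and flags share one line
        pending_name = None
        if match.group(3) == "no":
            missing.append(name)
    return missing

result = fonts_without_unicode_map(SAMPLE)
print(result)
```

Any font this reports is a candidate for the gibberish-on-paste problem.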

If the pdf file contains only images of the text, you need to use an OCR
tool to get the text.  I have seen cases where printing docs to PS on
Win32 results in the text being rasterized in the driver, so the PS file
has images.  This may happen with screen fonts and/or certain effects
(transparency, text outlines filled with colored patterns).

--
George N. White III  <aa056@xxxxxxxxxxxxxx>

