pdftohtml encoding question

Good evening,

I am trying to convert a PDF file to HTML using the pdftohtml provided by Fedora 8.

I get an HTML file with "nice" characters like "â€™" instead of an apostrophe,
or "Ã©" instead of "é"...

So I think there is some encoding problem.
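
A minimal sketch of how characters like these can arise, assuming (this was not verified against the actual file) that the HTML output is UTF-8 but is being displayed as a single-byte encoding such as Windows-1252. The helper name `misread` is hypothetical and only serves the illustration:

```python
# Sketch only: shows what UTF-8 bytes look like when decoded with the
# wrong single-byte encoding. Assumes the viewer used Windows-1252.

def misread(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo misread(): recover the original bytes and decode them as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    print(misread("\u2019"))          # typographic apostrophe becomes three characters
    print(misread("\u00e9"))          # e-acute becomes two characters
    print(repair(misread("\u00e9")))  # round trip gives back the e-acute
```

If this is what is happening, the file itself may already be valid UTF-8, and the problem would lie in how the browser or editor interprets it rather than in the conversion.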

Using man pdftohtml, I got this info:
-enc <string>
       output text encoding name


However, I cannot work out the syntax needed to get correct UTF-8 output.
The only response I get is:

Error: Couldn't find unicodeMap file for the 'utf8' encoding

when I try:

pdftohtml -enc utf8 myfile.pdf


I also tried utf-8, latin1, latin-1, ISO_8859-1, ... without any success.


If anybody knows the answer, many thanks in advance.


--
François Patte
UFR de mathématiques et informatique
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 44 55 35 61
http://www.math-info.univ-paris5.fr/~patte

