Excerpted from the web:

 

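According to the PDF Reference, PDF bookmarks (outlines) can only be encoded in PDFDocEncoding or UTF-16BE. When a PDF is generated with LaTeX's hyperref package, hyperref writes the bookmarks in the encoding of the source document, so bookmarks containing non-ASCII text such as Chinese are not displayed correctly in PDF readers.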

 

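There are two solutions. One is to use the \texorpdfstring{}{} command to supply two differently encoded versions of the text, one for the bookmark and one for the document body; this is tedious to use. The other is to let hyperref generate the bookmarks in the document encoding and then convert them to UTF-16BE. The bookmark file that LaTeX generates has the .out extension, so only the .out file needs to be converted.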
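
To illustrate what that conversion produces, here is a minimal sketch in shell. It assumes iconv and od are installed, and the function name to_pdf_octal is hypothetical, not part of the original scripts. It prints a UTF-8 string as its UTF-16BE bytes in octal escapes, prefixed with the byte order mark \376\377, which is the form a converted bookmark title takes in the .out file.

```sh
#!/bin/sh
# Hypothetical helper (not part of the original scripts): print a UTF-8
# string as UTF-16BE octal escapes, prefixed with the \376\377 byte order
# mark. Assumes iconv and od are installed.
to_pdf_octal() {
    # Byte order mark U+FEFF, written as two octal escapes
    printf '\\376\\377'
    # Convert to UTF-16BE, dump each byte as three octal digits,
    # then put a backslash in front of every byte
    printf '%s' "$1" |
        iconv -f UTF-8 -t UTF-16BE |
        od -An -v -t o1 |
        sed 's/ *\([0-7][0-7][0-7]\)/\\\1/g' |
        tr -d ' \n'
    printf '\n'
}

to_pdf_octal 'A'     # prints \376\377\000\101
to_pdf_octal '中'    # U+4E2D, prints \376\377\116\055
```

The Perl programs later in this post do the same job for every title in the .out file, but leave pure ASCII titles untouched.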

 

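The encoding conversion programs are written in Perl, with shell script wrappers to simplify the bookmark generation process: gbkbm and utf8bm convert GBK-encoded and UTF-8-encoded bookmarks, respectively, to UTF-16BE. Usage is as follows: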

 

$ latex gbkbkmark

$ gbkbm gbkbkmark

$ latex gbkbkmark

$ dvipdfm gbkbkmark

 

To produce the PDF directly with pdflatex, change hyperref's PDF driver in gbkbkmark.tex from dvipdfm to pdftex and compile as follows:

$ pdflatex gbkbkmark

$ gbkbm gbkbkmark

$ pdflatex gbkbkmark

 

In the runs above, the gbkbm gbkbkmark command converts the encoding of the bookmarks; the other steps are the standard LaTeX compilation sequence.

 

utf8bm converts UTF-8-encoded bookmarks to UTF-16BE; it is used in the same way as gbkbm.

 

The source code of the scripts is as follows:

 

======= gbkbm =======

 

#!/bin/sh
#
# $Id: gbkbm, v 0.90 2009/10/25 23:03:15 $
#
# Convert Hyperref Bookmark Encoding from GBK to UTF-16BE
#
# According to the PDF Reference, outlines are text strings and can therefore only be encoded
# in either PDFDocEncoding or Unicode character encoding (UTF-16BE). This script assumes that
# the bookmark file (.out) is encoded in GBK/GB2312, then converts it to UTF-16BE.
#
# Author: Chunhua Li, 2009/10/25
#
# Usage: gbkbm example[.out]
#
# Examples of use for latex/pdftex:
#
#    1. Assume that the current encoding is GBK, and 'dvipdfm' is to be used as the PDF driver.
#        latex example
#        latex example
#        gbkbm example
#        latex example
#        dvipdfm example
#
#    2. Assume that the current encoding is GBK, and 'pdftex' is to be used as the PDF driver.
#        pdflatex example
#        pdflatex example
#        gbkbm example
#        pdflatex example
#

GBK2UNI=/usr/local/bin/gbk2utf16be.pl

# Extract the filename and ignore the extension (.out)
FILE=${1%.out}
TMPFILE=$(mktemp /tmp/example.XXXXXX) || exit 1

# Call gbk2utf16be.pl to do the actual conversion: it reads the bookmark file
# on standard input and writes the converted bookmarks to the temporary file
"$GBK2UNI" < "${FILE}.out" > "$TMPFILE" || exit 1

# Replace the bookmark output file with the converted version
mv -f "$TMPFILE" "${FILE}.out"

 

 

======= gbk2utf16be.pl =======

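#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# Return one byte as a three-digit octal escape, e.g. "\116"
sub octal {
    my ($t) = @_;
    return sprintf "\\%03o", ord($t);
}

# Convert a title containing non-ASCII bytes from GBK to UTF-16BE,
# written as octal escapes after the \376\377 byte order mark
sub convert {
    my ($t) = @_;
    if ($t =~ /[\x80-\xFF]/) {
        Encode::from_to($t, "GBK", "UTF-16BE");
        # /s lets "." match a 0x0A byte, which can occur in UTF-16BE data
        $t =~ s/(.)/octal($1)/gse;
        $t = "\\376\\377" . $t;
    }
    return $t;
}

# Rewrite the title, i.e. the second brace group on each line of the .out file
while (<>) {
    s/([^}]*\}\{)([^}]*)/$1 . convert($2)/e;
    print;
}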

 

======= utf8bm =========

#!/bin/sh
#
# $Id: utf8bm, v 0.90 2009/10/25 23:03:15 $
#
# Convert Hyperref Bookmark Encoding from UTF8 to UTF-16BE
#
# According to the PDF Reference, outlines are text strings and can therefore only be encoded
# in either PDFDocEncoding or Unicode character encoding (UTF-16BE). This script assumes that
# the bookmark file (.out) is encoded in UTF8, then converts it to UTF-16BE.
#
# Author: Chunhua Li, 2009/10/25
#
# Usage: utf8bm example[.out]
#
# Examples of use for latex/pdftex:
#
#   1. Assume that the current encoding is UTF8, and 'dvipdfm' is to be used as the PDF driver.
#        latex example
#        latex example
#        utf8bm example
#        latex example
#        dvipdfm example
#
#   2. Assume that the current encoding is UTF8, and 'pdftex' is to be used as the PDF driver.
#        pdflatex example
#        pdflatex example
#        utf8bm example
#        pdflatex example
#

UTF2UNI=/usr/local/bin/utf8to16be.pl

# Extract the filename and ignore the extension (.out)
FILE=${1%.out}
TMPFILE=$(mktemp /tmp/example.XXXXXX) || exit 1

# Call utf8to16be.pl to do the actual conversion: it reads the bookmark file
# on standard input and writes the converted bookmarks to the temporary file
"$UTF2UNI" < "${FILE}.out" > "$TMPFILE" || exit 1

# Replace the bookmark output file with the converted version
mv -f "$TMPFILE" "${FILE}.out"

 

========= utf8to16be.pl ==========

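#!/usr/bin/perl
use strict;
use warnings;
use Encode;

# Return one byte as a three-digit octal escape, e.g. "\116"
sub octal {
    my ($t) = @_;
    return sprintf "\\%03o", ord($t);
}

# Convert a title containing non-ASCII bytes from UTF8 to UTF-16BE,
# written as octal escapes after the \376\377 byte order mark
sub convert {
    my ($t) = @_;
    if ($t =~ /[\x80-\xFF]/) {
        Encode::from_to($t, "UTF8", "UTF-16BE");
        # /s lets "." match a 0x0A byte, which can occur in UTF-16BE data
        $t =~ s/(.)/octal($1)/gse;
        $t = "\\376\\377" . $t;
    }
    return $t;
}

# Rewrite the title, i.e. the second brace group on each line of the .out file
while (<>) {
    s/([^}]*\}\{)([^}]*)/$1 . convert($2)/e;
    print;
}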

 
