Tuesday, June 26, 2007

Bounding Boxes and EPS to PDF Conversion (in LaTeX)

Another development: More bounding-box-related issues are discussed in this CTT thread. It turns out that dvips -E only estimates the bounding box (it looks at the marks made by characters and rules, not at included graphics), and so its answers can be wrong or inconsistent. GhostScript (gs) has a bbox device that finds the smallest rectangle enclosing everything your EPS draws and reports that rectangle as the bounding box. The epstool command can use this GhostScript calculation to update your EPS. So you can imagine doing things like...
latex file.tex
dvips -E file.dvi -o tmp.eps
epstool --bbox --copy --output file.eps tmp.eps
epstopdf file.eps
The epspdf script has similar functionality (when you are converting from EPS to PDF) and will be included in TeXLive 2008.
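
Since these tools can disagree, it helps to look at the bounding box each one actually wrote. Here is a minimal Python sketch (my own helper names, not part of any of the tools above) that reads the %%BoundingBox comment from EPS source text. It assumes integer coordinates and handles the "(atend)" form, where the header defers the real values to the trailer.

```python
import re

# Matches "%%BoundingBox: llx lly urx ury" with integer coordinates.
_BBOX = re.compile(r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s*$")


def read_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) from EPS source text, or None if absent.

    If the first BoundingBox comment is "(atend)", the values live in the
    trailer, so the last numeric match is used; otherwise the first one is.
    """
    first = last = None
    atend = False
    for line in eps_text.splitlines():
        line = line.strip()
        if line.startswith("%%BoundingBox:") and "(atend)" in line:
            if first is None:
                atend = True  # header defers the box to the trailer
            continue
        match = _BBOX.match(line)
        if match:
            box = tuple(int(g) for g in match.groups())
            if first is None:
                first = box
            last = box
    return last if atend else first


def size_in_points(bbox):
    """Width and height of a bounding box in PostScript points (1/72 inch)."""
    llx, lly, urx, ury = bbox
    return urx - llx, ury - lly
```

To use it on a real file, read the file as text first, for example with open("file.eps", encoding="latin-1"), and compare the result for tmp.eps and file.eps.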

Related post: LaTeX generated figures: Using preview instead of pst-eps

Follow-up: See another interesting option (pst-pdf) in this follow-up.

Update: Another interesting option is purifyeps, which requires pstoedit and Perl. See below.

None of the following is too special. This is all well-known stuff. However, it's not the easiest to find with a Google search, so I'm going to post it here.

Pretty often, I have to generate an EPS file with MATLAB. That EPS figure will go into a LaTeX document. To generate a PDF document from my LaTeX source, I will probably use PDFLaTeX/PDFTeX. However, that means I need to convert that MATLAB EPS figure to PDF. Most EPS-to-PDF distillers I use will mess up the bounding box information and the result of the conversion will be a FULL PAGE PDF rather than the nice tiny EPS figure.

Keep this scenario in mind and consider these notes:
(*) ps2pdf command: To convert EPS to PDF and maintain the proper bounding box, try including the "EPSCrop" GhostScript (GS) option:
ps2pdf -dEPSCrop blah.eps blah.pdf
Without the -dEPSCrop option, I get the full-page PDF from a MATLAB EPS. However, with the -dEPSCrop option, things work fine.
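
If you have a whole directory of MATLAB figures, the same call can be looped. This is only a sketch: the function names are mine, it assumes ps2pdf is on your PATH, and on Windows (where ps2pdf is a batch file) the call may need adjusting.

```python
import subprocess
from pathlib import Path


def ps2pdf_command(eps_path):
    """Build the ps2pdf call from above; -dEPSCrop keeps the EPS bounding box."""
    eps_path = Path(eps_path)
    pdf_path = eps_path.with_suffix(".pdf")
    return ["ps2pdf", "-dEPSCrop", str(eps_path), str(pdf_path)]


def convert_all(directory, run=subprocess.run):
    """Convert every .eps file in `directory`; return the PDF paths requested.

    `run` is injectable so the loop can be tried without GhostScript installed.
    """
    outputs = []
    for eps in sorted(Path(directory).glob("*.eps")):
        cmd = ps2pdf_command(eps)
        run(cmd, check=True)  # subprocess.run raises CalledProcessError on failure
        outputs.append(Path(cmd[-1]))
    return outputs
```

For example, convert_all("figures") would run ps2pdf -dEPSCrop once per EPS file in a (hypothetical) figures directory.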

(*) epstopdf command: ALTERNATIVELY, try this line instead:
epstopdf blah.eps
This works for me. If it doesn't, try this (on Windows with MiKTeX 2.6):
epstopdf --gsopt=-dEPSCrop blah.eps
That also works for me.

(*) epstopdf LaTeX package: You may be interested in the epstopdf package which comes with the oberdiek bundle. It should be included in your LaTeX distribution. If not, install the oberdiek bundle. You can find information (and download) about these here:
From that last link, you'll find this epstopdf usage (this will work for the graphics package as well):
\usepackage[pdftex]{graphicx}
\usepackage{epstopdf}
Then you can include graphics two different ways:
% Way 1: Includes blah.eps. Will ALWAYS generate 
% blah.pdf regardless of whether it already exists.
\includegraphics{blah.eps}

% Way 2: Includes blah.eps. If blah.pdf DOES NOT EXIST,
% it will automatically be generated.
\includegraphics{blah}
There are configuration options too. Consider the following:
% The default eps to pdf rule
\DeclareGraphicsRule{.eps}{pdf}{.pdf}{`epstopdf #1}

% Alternative eps to pdf rule
\DeclareGraphicsRule{.eps}{pdf}{.pdf}
{`ps2pdf -dEPSCrop #1}

% A rule for converting gif to png using ImageMagick
% NOTE: The placement of the % signs IS important
\DeclareGraphicsRule{.gif}{png}{.png}{%
`convert #1 `basename #1 .gif`.png%
}

% The same gif-to-png rule for Windows
% (i.e., without basename support)
\makeatletter
\DeclareGraphicsRule{.gif}{png}{.png}{%
`convert #1 \noexpand\Gin@base.png%
}
\makeatother
FINALLY, if you want to add .gif to the list of extensions that the graphicx (or graphics) package searches when no file extension is given in \includegraphics, you can either use the command \DeclareGraphicsExtensions (which replaces the whole list) OR do something like:
\makeatletter
\g@addto@macro\Gin@extensions{,.gif}
\makeatother
Leaving the file extension off of the \includegraphics macro makes a lot of sense; however, remember that epstopdf will only be run the first time latex or pdflatex gets run. If you want it to convert all of your graphics every run, be sure to leave the extensions on.

(*) purifyeps command: There is also purifyeps, which requires pstoedit and Perl. Taken from purifyeps's CTAN page:
While pdfLaTeX has a number of nice features, its primary shortcoming relative to standard LaTeX+dvips is that it is unable to read ordinary Encapsulated PostScript (EPS) files, the most common graphics format in the LaTeX world. purifyeps converts EPS files into a "purified" form that can be read by *both* LaTeX+dvips and pdfLaTeX. The trick is that the standard LaTeX2e graphics packages can parse MetaPost-produced EPS directly. Hence, purifyeps need only convert an arbitrary EPS file into the same stylized format that MetaPost outputs.
I haven't actually played with this at all. I recommend reading purifyeps.pdf for more information about why you want a "purified" EPS rather than some other format. I assume that the bounding box problem shouldn't be an issue here, but I have no idea.

(*) pstoedit command:
MPS files can be used DIRECTLY by BOTH latex and pdflatex (pdflatex does MPS-to-PDF conversion on-the-fly). You can easily convert EPS files to MPS files yourself as long as you have pstoedit (download and install it from the pstoedit page). Take a look at section 5.4 (MetaPost) of epslatex.pdf for information on that. From the instructions there:
pstoedit -f mpost graphic.eps graphic.mp
mpost graphic.mp
rename graphic.1 graphic.mps
That is, run pstoedit to convert to MP and then use mpost to create the MPS from the MP. Simple, huh? Now, does it fix the bounding box problem? As with the last bullet, I have no idea. Maybe someday I'll try this.
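
The rename step in those instructions is Windows-specific, so here is a rough, untested-against-real-figures Python sketch of the same three steps that should behave the same on any platform. The function names are mine, and it assumes pstoedit and mpost are on your PATH and that mpost writes graphic.1 into the directory it is run from.

```python
import os
import subprocess
from pathlib import Path


def mps_steps(eps_name):
    """Return (commands, rename) for the EPS -> MP -> MPS route described above."""
    name = Path(eps_name).name
    stem = Path(eps_name).stem
    commands = [
        ["pstoedit", "-f", "mpost", name, f"{stem}.mp"],  # EPS -> MetaPost source
        ["mpost", f"{stem}.mp"],                          # MetaPost -> <stem>.1
    ]
    rename = (f"{stem}.1", f"{stem}.mps")
    return commands, rename


def eps_to_mps(eps_path):
    """Run the steps in the EPS file's own directory and return the MPS path."""
    workdir = Path(eps_path).resolve().parent
    commands, (src, dst) = mps_steps(eps_path)
    for cmd in commands:
        subprocess.run(cmd, check=True, cwd=workdir)
    os.replace(workdir / src, workdir / dst)  # portable stand-in for `rename`
    return workdir / dst
```

Calling eps_to_mps("graphic.eps") should leave graphic.mps next to the original file, ready for \includegraphics.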

Hopefully some of that will be useful to someone; it will at least be a good reference for me. :)

9 comments:

Faraz said...

Thanks a lot for this post.

I have a couple of comments:

>>Using the package epstopdf in latex:
-shell-escape switch is required. For those people who use WinEdt, they can set it at Options>Execution Modes>PDFLatex>Switches

if using PDFTexify, PDFLatex needs to be called at least once to convert the eps figures to pdf.

>>To have the fonts embedded and subset (IEEE requirement), use this rule:

\DeclareGraphicsRule{.eps}{pdf}{.pdf}{`epstopdf --gsopt=-dPDFSETTINGS=/prepress #1}

Basically the /prepress setting must be passed to gs. This is very useful and convenient for IEEE conferences.

Theo said...

Thanks for the font embedding note!

For font embedding, also be sure that updmap.cfg is updated to download fonts. See Getting pdflatex to embed all fonts (or pdf pdflatex embedded fonts) for more details.

NOTE: It appears like font embedding is the default using gwTeX for OS X.

Theo said...

I haven't tried this, but I hear from my officemate that this line will work well (where #1 is your EPS file):

ps2pdf -dPDFSETTINGS=/prepress -dEPSCrop #1

This line should embed fonts AND maintain bounding box AND should be supported across platforms (whereas the epstopdf command is very different on Windows).

Also consider using these font embedding options (perhaps INSTEAD of the prepress option):

-dPDFX=true

or

-dPDFSETTINGS=/printer

See the ps2pdf documentation for more details.

Nick said...

One addition to Faraz' comment about the switch -shell-escape. You can run PdfTexify directly in WinEdt (without having to call PDFLatex), if in

Options>Execution Modes>PdfTexify

you add the switch

--tex-option=-shell-escape

Anonymous said...

To crop EPS, I use pdfcrop which is just a perl script like epstopdf.

In texlive, it's part of the texlive-extra-utils package.

-Ben

Faraz said...

The gsopt switch of epstopdf only works with MikTex epstopdf (written by Christian Schenk in C++). It doesn't work with TexLive since that is using a completely different epstopdf written by Sebastian Rahtz et al. in perl.

But if you are using TexLive, it's very easy to modify the epstopdf script. First, make a backup copy of the epstopdf perl script. Then open the perl script in an editor such as vim and look for the comment "### open output file". Right below it, add -dPDFSETTINGS=/prepress or -dPDFSETTINGS=/printer to the arguments of gs, like
my $pipe = "$GS -q -dPDFSETTINGS=/prepress (leave other arguments intact)

and you are good to go.

you can check if the fonts are really embedded and subset by:

pdffonts myfile.pdf

kiroosha said...

Thanks for the post. Here's a simple hack to fix bounding boxes. I found that including latex in the Matlab figures simply places the BoundingBox declaration in an incorrect spot. The following perl script fixes the problem (and converts the files to pdf, using epstopdf), without generating any extra files.

---------------------
#!/usr/bin/perl
# Converts a list of input eps files to pdf files, using epstopdf.
# It also fixes eps files of MATLAB figures containing latex text.
# Including LaTeX strings in figure labels causes some versions of MATLAB
# to misplace the bounding box instructions in the eps files.
use strict;
use warnings;

for my $d (@ARGV) {                       # for every file on the command line
    next unless -s $d and $d =~ /eps$/;   # skip empty files and non-eps names
    print "$d\n";                         # output the file name

    # read the whole file (no need to shell out to cat, so not *nix-only)
    open my $in, '<', $d or die "Cannot read $d: $!";
    my @lines = <$in>;
    close $in;

    # find the bounding box lines
    my @bbox = grep { /BoundingBox/ } @lines;

    # insert the bounding box where it belongs: right after the first line
    splice @lines, 1, 0, $bbox[0] if @bbox;

    # write the corrected version of the eps file
    open my $out, '>', $d or die "Cannot write $d: $!";
    print $out @lines;
    close $out;

    # create the pdf version
    system 'epstopdf', $d;
}
-----------------------

Anand said...

Thanks for this. I have been looking for this for a while.

Sarah said...

Thanks for all the details. 3 yrs after the original post and it's still very helpful!