Sunday, February 2, 2014

GUIDE: Convert (Book-style) SWFs to PDFs with GFX2GFX

0. Preamble

So you've downloaded (legally?) digital copies of a textbook, but they're individual pages in SWF format. (Okay, it needn't be this specific.) What now?

Just to be clear, before we continue: the kind of SWF files I'm talking about are static, one-frame SWF files consisting of bitmap images and text. While swfrender from SWFTools could be used to convert these into PNGs/PDFs, the text wouldn't be scalable. This guide will take these static SWF files and turn them into PDF files with scalable text.

1. GFX2GFX

This guide uses the undocumented tool gfx2gfx from SWFTools. As far as I can tell it is not built by default, so it must be compiled from source.

You might be able to compile this tool by following this guide on GitHub; however, I had issues, so I will be compiling (on Arch Linux) by modifying the swftools-git package from the AUR. Note that this method applies only to Arch Linux.
  1. Install the dependency pdflib-lite from the Arch Linux repositories.
  2. Open a terminal window and cd to a directory of your choice.
  3. Run the commands to download and extract the swftools-git tarball from the AUR:
    1. wget https://aur.archlinux.org/packages/sw/swftools-git/swftools-git.tar.gz
    2. tar xzf swftools-git.tar.gz
    3. cd swftools-git
  4. Open the PKGBUILD file in an editor of your choice.
  5. After the lines ./configure and make, insert the following two lines:
    1. cd src
    2. make gfx2gfx
  6. Run makepkg -s.
    1. If the build stops with ‘No rule to make target `xpdf/TextOutputDev.o'’, open the src/swftools/lib/pdf/Makefile file in an editor of your choice.
    2. After the lines DummyOutputDev.$(O): DummyOutputDev.cc DummyOutputDev.h InfoOutputDev.h and $(CC) -I ./ $(xpdf_include) DummyOutputDev.cc -o $@, insert the following two lines:
      1. TextOutputDev.$(O): TextOutputDev.cc TextOutputDev.h InfoOutputDev.h
      2. $(CC) -I ./ $(xpdf_include) TextOutputDev.cc -o $@
    3. Then run makepkg -s again.
  7. If all went well, there should be a gfx2gfx executable file inside the src/swftools/src directory. Copy it to the folder where your SWF files are.
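
If you'd rather not hand-edit the PKGBUILD in step 5, the edit can be scripted. This is only a rough sketch: the function name patch_pkgbuild is made up for illustration, and it assumes the PKGBUILD contains a line consisting of just make (possibly indented). Check the patched file before running makepkg -s.

```shell
#!/bin/sh
# Hypothetical helper (not part of SWFTools or the AUR package).
# Inserts "cd src" and "make gfx2gfx" after the first line of the given
# PKGBUILD that consists only of "make" (surrounding blanks are ignored).
# Assumption: such a line exists; if it does not, the file is left unchanged.
patch_pkgbuild() {
    pkgbuild=$1
    awk '
        { print }
        !patched && /^[ \t]*make[ \t]*$/ {
            # Reuse the indentation of the matched line for the new lines.
            indent = $0
            sub(/make[ \t]*$/, "", indent)
            print indent "cd src"
            print indent "make gfx2gfx"
            patched = 1
        }
    ' "$pkgbuild" > "$pkgbuild.new" && mv "$pkgbuild.new" "$pkgbuild"
}
```

From inside the extracted swftools-git directory, it would be called as patch_pkgbuild PKGBUILD.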

1b. What is this DPI restriction?

The default options for gfx2gfx put a limit of 72dpi on images in SWFs. If you find this ridiculous, go poke around for the undocumented functions to raise that limit! Have fun!

Just kidding, I did that already. To raise the limit, open the src/swftools/src/gfx2gfx.c file and, after the line gfxdevice_pdf_init(out);, insert the line out->setparameter(out, "maxdpi", "320");, where 320 is the maximum DPI. To disable the limit entirely, set it to 0.

Then, recompile gfx2gfx by cd'ing to the src/swftools/src directory and running make gfx2gfx.
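
The same source edit can be done from the shell. Again, this is just a sketch: set_maxdpi is a name I made up, and it assumes gfxdevice_pdf_init(out); appears on a line of its own in gfx2gfx.c, as described above. It adds the setparameter line right after the first match, copying that line's indentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of SWFTools).
# Usage: set_maxdpi FILE DPI
# Adds the maxdpi setparameter call after the first line of FILE that
# contains gfxdevice_pdf_init(out); and leaves everything else untouched.
set_maxdpi() {
    src=$1
    dpi=$2
    awk -v dpi="$dpi" '
        { print }
        !patched && index($0, "gfxdevice_pdf_init(out);") {
            indent = $0
            sub(/[^ \t].*$/, "", indent)   # keep only the leading whitespace
            print indent "out->setparameter(out, \"maxdpi\", \"" dpi "\");"
            patched = 1
        }
    ' "$src" > "$src.new" && mv "$src.new" "$src"
}
```

For example, set_maxdpi src/swftools/src/gfx2gfx.c 320 would cap images at 320 DPI, and passing 0 would disable the cap. You still need to run make gfx2gfx afterwards.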

2. Convert Time!

The complex part is over!

./gfx2gfx page_1.swf -o page_1.pdf

It's that easy! Or, if for loops are your thing,

for i in {1..350}; do ./gfx2gfx "chapter0/page_$i.swf" -o "pdf/page_$i.pdf"; done
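
If some page numbers are missing or the output folder doesn't exist yet, the one-liner will stumble. Here is a slightly more careful sketch of the same loop. The function name convert_pages and the GFX2GFX variable are my own inventions; the only gfx2gfx usage it relies on is the input.swf -o output.pdf form shown above, and it assumes the pages are named page_N.swf.

```shell
#!/bin/sh
# Hypothetical wrapper around the loop above.
# Usage: convert_pages IN_DIR OUT_DIR FIRST LAST
# GFX2GFX may point at the compiled binary; it defaults to ./gfx2gfx.
convert_pages() {
    in_dir=$1
    out_dir=$2
    first=$3
    last=$4
    failed=0
    mkdir -p "$out_dir" || return 1
    i=$first
    while [ "$i" -le "$last" ]; do
        swf="$in_dir/page_$i.swf"
        if [ ! -f "$swf" ]; then
            # Missing pages are reported but do not count as failures.
            echo "skipping missing $swf" >&2
        elif ! "${GFX2GFX:-./gfx2gfx}" "$swf" -o "$out_dir/page_$i.pdf"; then
            echo "conversion failed for $swf" >&2
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    # Succeed only if every page that exists was converted.
    [ "$failed" -eq 0 ]
}
```

The equivalent of the one-liner would then be convert_pages chapter0 pdf 1 350.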