Sunday, February 2, 2014

GUIDE: Convert (Book-style) SWFs to PDFs with GFX2GFX

0. Preamble

So you've downloaded (legally?) digital copies of a textbook, but they're individual pages in SWF format. (Okay, it needn't be this specific.) What now?

Just to be clear before we continue: the kind of SWF files I'm talking about are static, single-frame SWF files consisting of bitmap images and text. While swfrender from SWFTools could be used to convert these into PNGs/PDFs, the text wouldn't be scalable. This guide will take these static SWF files and turn them into PDF files with scalable text.

1. GFX2GFX

This guide uses gfx2gfx, an undocumented tool from SWFTools. As the tool is undocumented and needs its own make target (see the steps below), it must be compiled from source.

You might be able to compile this tool by following this guide on GitHub; however, I had issues, so I will instead compile it (on Arch Linux) by modifying the swftools-git package from the AUR. Note that this method applies only to Arch Linux.
  1. Install the dependency pdflib-lite from the Arch Linux repositories.
  2. Open a terminal window and cd to a directory of your choice.
  3. Run the commands to download and extract the swftools-git tarball from the AUR:
    1. wget https://aur.archlinux.org/packages/sw/swftools-git/swftools-git.tar.gz
    2. tar xzf swftools-git.tar.gz
    3. cd swftools-git
  4. Open the PKGBUILD file in an editor of your choice.
  5. After the lines ./configure and make, insert the following two lines:
    1. cd src
    2. make gfx2gfx
  6. Run makepkg -s.
    1. If make complains about ‘No rule to make target `xpdf/TextOutputDev.o'’, open the src/swftools/lib/pdf/Makefile file in an editor of your choice.
    2. After the lines DummyOutputDev.$(O): DummyOutputDev.cc DummyOutputDev.h InfoOutputDev.h and $(CC) -I ./ $(xpdf_include) DummyOutputDev.cc -o $@, insert the following two lines (the second is a recipe line, so it must start with a tab character):
      1. TextOutputDev.$(O): TextOutputDev.cc TextOutputDev.h InfoOutputDev.h
      2. $(CC) -I ./ $(xpdf_include) TextOutputDev.cc -o $@
    3. Then run makepkg -s again.
  7. If all went well, there should be a gfx2gfx executable file inside the src/swftools/src directory. Copy it to the folder where your SWF files are.
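
Steps 4 and 5 can also be scripted instead of done in an editor. The following is only a minimal sketch: patch_pkgbuild is a hypothetical helper name, and it assumes the PKGBUILD contains a line consisting solely of make, which may not hold for every version of the package. Review the patched file before running makepkg -s.

```shell
# patch_pkgbuild FILE: insert "cd src" and "make gfx2gfx" after the first
# line of FILE that consists only of "make" (surrounding whitespace ignored).
# Hypothetical helper; the PKGBUILD layout is an assumption.
patch_pkgbuild() {
    local pkgbuild="$1"
    awk '
        { print }
        !inserted && /^[ \t]*make[ \t]*$/ {
            # Reuse the indentation of the matched line.
            indent = $0
            sub(/make.*$/, "", indent)
            print indent "cd src"
            print indent "make gfx2gfx"
            inserted = 1
        }
        # Fail if no bare "make" line was found, so the file is left alone.
        END { exit !inserted }
    ' "$pkgbuild" > "$pkgbuild.new" || { rm -f "$pkgbuild.new"; return 1; }
    mv "$pkgbuild.new" "$pkgbuild"
}
```

From inside the swftools-git directory, this would be called as patch_pkgbuild PKGBUILD. If it returns a non-zero status, nothing was changed and the edit has to be made by hand.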

1b. What are these DPI restrictions?

The default options for gfx2gfx put a limit of 72dpi on images in SWFs. If you find this ridiculous, go poke around for the undocumented functions to raise that limit! Have fun!

Just kidding, I did that already. To raise the limit, open the src/swftools/src/gfx2gfx.c file and, after the line gfxdevice_pdf_init(out);, insert the line out->setparameter(out, "maxdpi", "320");, where 320 is the maximum DPI. To disable the limit entirely, set it to 0.

Then, recompile gfx2gfx by cd'ing to the src/swftools/src directory and running make gfx2gfx.
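
If you'd rather not edit the C file by hand, the insertion can be sketched in shell. set_maxdpi is a hypothetical helper name, and the sketch assumes gfxdevice_pdf_init(out); sits on a line of its own inside a braced block in gfx2gfx.c. Check the result before recompiling.

```shell
# set_maxdpi FILE DPI: add a maxdpi setparameter call after the first line
# of FILE containing gfxdevice_pdf_init(out); unless one is already present.
# Hypothetical helper; the layout of gfx2gfx.c is an assumption.
set_maxdpi() {
    local src="$1" dpi="$2"
    if grep -q '"maxdpi"' "$src"; then
        echo "maxdpi is already set in $src, leaving it alone" >&2
        return 0
    fi
    awk -v dpi="$dpi" '
        { print }
        !inserted && index($0, "gfxdevice_pdf_init(out);") {
            # Reuse the indentation of the matched line.
            indent = $0
            sub(/[^ \t].*$/, "", indent)
            print indent "out->setparameter(out, \"maxdpi\", \"" dpi "\");"
            inserted = 1
        }
        # Fail if the anchor line was not found, so the file is left alone.
        END { exit !inserted }
    ' "$src" > "$src.new" || { rm -f "$src.new"; return 1; }
    mv "$src.new" "$src"
}
```

For example, set_maxdpi src/swftools/src/gfx2gfx.c 320 would add the line with a 320 DPI cap, and passing 0 instead would disable the limit.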

2. Convert Time!

The complex part is over!

./gfx2gfx page_1.swf -o page_1.pdf

It's that easy! Or, if for loops are your thing,

mkdir -p pdf; for i in {1..350}; do ./gfx2gfx "chapter0/page_$i.swf" -o "pdf/page_$i.pdf"; done
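
For a whole book, a slightly more defensive version of that loop may be handy. This is a sketch rather than anything gfx2gfx requires: the convert_pages name, the GFX2GFX variable and the choice to skip missing pages are all illustrative assumptions. The only thing taken from above is the invocation itself (input file, then -o and the output file).

```shell
# convert_pages SRC_DIR OUT_DIR LAST_PAGE: convert page_1.swf .. page_N.swf
# in SRC_DIR to PDFs in OUT_DIR. Set GFX2GFX to use a converter other than
# ./gfx2gfx. All names here are illustrative, not part of SWFTools.
convert_pages() {
    local src_dir="$1" out_dir="$2" last_page="$3"
    local converter="${GFX2GFX:-./gfx2gfx}"
    local i=1 failed=0 swf pdf

    # Make sure the output directory exists before writing into it.
    mkdir -p "$out_dir" || return 1

    while [ "$i" -le "$last_page" ]; do
        swf="$src_dir/page_$i.swf"
        pdf="$out_dir/page_$i.pdf"
        if [ ! -f "$swf" ]; then
            echo "skipping missing $swf" >&2
        elif ! "$converter" "$swf" -o "$pdf"; then
            # Remember the failure but keep converting the other pages.
            echo "conversion failed for $swf" >&2
            failed=1
        fi
        i=$((i + 1))
    done
    return "$failed"
}
```

The equivalent of the one-liner above would be convert_pages chapter0 pdf 350. The function returns a non-zero status if any page failed to convert, so it can be chained with && in a larger script.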

8 Comments:

At March 13, 2014 at 8:09 PM , Blogger Danny said...

Thanks for the information. I get the following warning when modifying for DPI:

gfx2gfx.c:263:13: warning: implicit declaration of function 'pdf_setparameter' is invalid in C99 [-Wimplicit-function-declaration]
pdf_setparameter(out, "maxdpi", "320");

 
At March 29, 2014 at 3:54 PM , Blogger RunasSudo said...

That's just a warning, and it doesn't affect the compile, but if you're worried about it, you can change the line to:

out->setparameter(out, "maxdpi", "320");

I'll go change it in the post now.

 
At May 17, 2014 at 11:00 PM , Blogger Unknown said...

Hi,

What do you mean by:
1. Install the dependency pdflib-lite from the Arch Linux repositories.

Can you explain in detail how to install pdflib-lite?
Thanks!

 
At May 19, 2014 at 4:12 AM , Blogger Unknown said...

Never mind... I used the wrong SWF (it had more than one frame).
thanks for your hard work.

 
At July 14, 2014 at 1:12 AM , Blogger Tony said...

Hi many thanks for this excellent guide. Is the scalable text also selectable and searchable?

 
At August 18, 2014 at 12:02 AM , Blogger RunasSudo said...

You betcha!

 
At February 20, 2015 at 1:50 PM , Blogger Unknown said...

Dunno if my other comment got lost but:

Text is not selectable/searchable, as the characters were originally stored as individual shapes. Any way around this?

Thin black line around images (see http://www50.zippyshare.com/v/AV9orjSw/file.html)

Thanks for the great post :)

 
At March 14, 2015 at 10:23 PM , Blogger RunasSudo said...

Unfortunately, text converted using this method is not searchable or selectable, as it converts the text to vector shapes.

The text is zoomable, however, which is an improvement upon other techniques.

 
