Convert LaTeX to Word with ‘Pandoc’

Hi,

How’s it going?

This is the first post on this blog, which I created to share the tools and tips I have picked up throughout my academic and working life. Most of the time I learned “alone”, searching the internet for solutions to my problems. I put “alone” in quotation marks because I relied on the knowledge of many people I have never met in person but who, together with my persistence in searching, helped me solve problems and build things with very useful resources. I believe that gathering what I have learned in one place will add value to that journey.

The first tool, and the one I use most often at the moment, is a way to convert a .tex document to .docx for free, without breaking accented letters and other special characters. That breakage is common with the usual tools, which convert the .pdf produced by LaTeX into .docx.

Before settling on this method, I tested a few others. If a better method has appeared since, or if you, the reader, know of another way that you consider better for this purpose, leave a comment on the page. Questions are also welcome.

Well, let’s get to the point. The first step is to install Pandoc, a program that converts between many file formats. It can be downloaded from the official Pandoc project page, which also has installation instructions for different operating systems. Installation is straightforward: just follow the installer’s steps.

Once Pandoc is installed on your machine, gather the following files in one folder for the conversion to work:

  • main.tex
  • images (they may sit in a subfolder, as long as the path matches the one used in the code)
  • bibliography.bib
  • optionally, a .csl file, which controls how citations are formatted

The images are, of course, the ones referenced in your main .tex file. However, I found that Pandoc does not work with images in .pdf format.
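Since .pdf images are the ones that cause trouble, it can help to find them before converting. Below is a minimal sketch in Python (my own helper, not something Pandoc provides) that scans the LaTeX source for \includegraphics commands and lists the ones pointing to .pdf files. The function name find_pdf_images is hypothetical, and the script assumes the main file is called main.tex, as in the list above.

```python
import re
from pathlib import Path

# Matches \includegraphics[options]{path}; the [options] part is optional.
INCLUDEGRAPHICS = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")


def find_pdf_images(tex_source: str) -> list[str]:
    """Return the image paths in the LaTeX source that end in .pdf."""
    paths = INCLUDEGRAPHICS.findall(tex_source)
    return [p.strip() for p in paths if p.strip().lower().endswith(".pdf")]


if __name__ == "__main__":
    tex_file = Path("main.tex")  # assumed file name, taken from the list above
    if tex_file.exists():
        for image in find_pdf_images(tex_file.read_text(encoding="utf-8")):
            print("Convert to .png or .jpg before running Pandoc:", image)
```

Note that this only catches images written with an explicit .pdf extension. If your code calls images without an extension, you will need to check the image folder by hand.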

I rarely use a .bib file, as I prefer to write my bibliographies by hand, so I don’t usually use a .csl file either. As I said, though, the conversion works with both.

Another point: although Pandoc makes life much easier, some errors may appear, especially with very large files such as theses. In those cases I like to convert chapter by chapter. Either way, the tool is great.
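The chapter-by-chapter approach can be automated. The sketch below is only an illustration, under a few assumptions: each chapter lives in its own .tex file inside a folder called chapters (a hypothetical name), Python is installed, and the script is run from the project root so that image paths still resolve. It builds one pandoc command per chapter and reports which chapters failed.

```python
import subprocess
from pathlib import Path


def chapter_commands(chapter_dir: str) -> list[list[str]]:
    """Build one pandoc command per .tex file found in chapter_dir."""
    commands = []
    for tex in sorted(Path(chapter_dir).glob("*.tex")):
        output = tex.with_suffix(".docx")  # chapter1.tex -> chapter1.docx
        commands.append(
            ["pandoc", str(tex), "-f", "latex", "-t", "docx", "-o", str(output)]
        )
    return commands


if __name__ == "__main__":
    # "chapters" is a hypothetical folder name; change it to match your project.
    for cmd in chapter_commands("chapters"):
        print("Running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("Pandoc reported a problem with", cmd[1])
```

Keep in mind that chapter files converted on their own do not see the preamble of main.tex, so custom macros defined there may not be recognized.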

With your LaTeX project gathered in a folder, open a Command Prompt (CMD) window in that same directory. Then just run the command:

pandoc main.tex -o mainconverted.docx

The mainconverted.docx output file does not need to have the same name as the input .tex file.

If you want to include your .bib file, you can use the following command:

pandoc --citeproc --bibliography=bibliography.bib .\main.tex -f latex -t docx -o mainconverted.docx

And if you also want to apply a .csl file to format the citations (here, ieee.csl), just run:

pandoc --citeproc --bibliography=bibliography.bib --csl=ieee.csl .\main.tex -f latex -t docx -o mainconverted.docx

It is also possible to take the styles from a reference .docx file, for example a template provided by a journal. Just add that file to your command, as follows:

pandoc main.tex --citeproc --bibliography=bibliography.bib --reference-doc=IEEEtemplate.docx -o mainconverted.docx
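If you switch often between these variations, a small script can assemble the command for you. This is just a sketch, assuming Python is installed; build_pandoc_command is a hypothetical helper of mine, while the options it adds (--citeproc, --bibliography, --csl, --reference-doc) are the Pandoc options used in the commands above.

```python
import subprocess
from typing import Optional


def build_pandoc_command(
    source: str,
    output: str,
    bibliography: Optional[str] = None,
    csl: Optional[str] = None,
    reference_doc: Optional[str] = None,
) -> list[str]:
    """Assemble a pandoc command line from the optional pieces described above."""
    cmd = ["pandoc", source, "-f", "latex", "-t", "docx"]
    if bibliography:
        # --citeproc is what makes Pandoc process the citations from the .bib file
        cmd += ["--citeproc", f"--bibliography={bibliography}"]
    if csl:
        cmd.append(f"--csl={csl}")
    if reference_doc:
        cmd.append(f"--reference-doc={reference_doc}")
    cmd += ["-o", output]
    return cmd


if __name__ == "__main__":
    cmd = build_pandoc_command(
        "main.tex", "mainconverted.docx",
        bibliography="bibliography.bib", csl="ieee.csl",
    )
    print(" ".join(cmd))
    # To actually run the conversion, uncomment the next line:
    # subprocess.run(cmd, check=True)
```

Building the command as a list, instead of one long string, avoids quoting problems when file names contain spaces.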

The Pandoc site has documentation where you can discover more of what the program can do.
