My brain hurts! we really think what you know

03 March 2016

Creating native dictionaries for the OSX Dictionary.app

OSX has an integrated dictionary app which can be accessed by selecting some text in any application, right-clicking and then picking Look up from the context menu (three-finger-tap on a selected word should work too).

The OSX Dictionary comes pre-installed with several dictionaries but creating your own dictionaries or converting existing ones is surprisingly complex.

This post is an attempt to make this process easier for everyone.

Most third-party dictionaries that I use are in DSL (ABBYY Lingvo) and BGL (Babylon) formats. I have collected them over the years and have been using them both on the desktop, via the open-source app GoldenDict, and on my smartphone, thanks to GoldenDict for Android. The OSX version of GoldenDict is still buggy and apparently not properly maintained, so I wanted to find a way to migrate my existing dictionaries to the native OSX dictionary app.

After a bit of research I finally figured out the toolchain for converting my existing dictionaries in Babylon and Lingvo DSL formats to the AppleDict format which the OSX Dictionary.app expects.

Prerequisites

Auxiliary Tools for Xcode

Download the DMG for Auxiliary Tools for Xcode from developer.apple.com/download/more. You'll need to sign in with your Apple ID in order to download. If you already have an iCloud account then you can use it. Once you sign in you can use the direct link to Auxiliary Tools for Xcode 7 or use the latest version by searching for it on the Apple downloads page.

Mount the DMG file by double-clicking it in Finder. Next, as the root user, create a folder at /Developer/Extras and copy the Dictionary Development Kit folder from the Auxiliary Tools volume into it:
sudo mkdir -p /Developer/Extras
sudo cp -r "/Volumes/Auxiliary Tools/Dictionary Development Kit" /Developer/Extras
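
A quick check after copying can save a confusing failure at the make step later on. The snippet below is only a sketch: the function name check_ddk is made up for illustration, and it assumes the kit keeps its build script at bin/build_dict.sh, which you should verify against your own copy.

```shell
# Hypothetical helper: confirm the Dictionary Development Kit was copied.
# Assumption: the kit contains bin/build_dict.sh; adjust if your copy differs.
check_ddk() {
    local ddk_dir="${1:-/Developer/Extras/Dictionary Development Kit}"
    if [ -f "$ddk_dir/bin/build_dict.sh" ]; then
        echo "OK: found build_dict.sh in $ddk_dir"
    else
        echo "MISSING: $ddk_dir/bin/build_dict.sh" >&2
        return 1
    fi
}
```

Calling check_ddk without arguments checks the default location used in this post; pass a different folder as the first argument if you copied the kit elsewhere.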

Python 3

Install Python 3 via Homebrew (OSX comes with Python 2.x preinstalled):
brew install python3

Python 3 dependencies

Install lxml and BeautifulSoup, the parsers that pyglossary depends on:
## for current user only
pip3 install lxml beautifulsoup4 --user

## system-wide, for all users
sudo pip3 install lxml beautifulsoup4

PyGlossary

Clone the pyGlossary project to a folder on your drive:
mkdir -p ~/projects
cd ~/projects
git clone --depth 1 https://github.com/ilius/pyglossary.git
Next, locate the path to pyglossary.pyw — that's the script you'll need for converting the DSL file in UTF-8 encoding to AppleDict XML.

DSL to AppleDict conversion

The DSL to AppleDict sequence:
  1. DSL files come in Unicode Little Endian encoding, so the first step would be converting your DSL file from UTF-16 to UTF-8:
     iconv -f UTF-16 -t UTF-8 webster-original-utf16.dsl > webster.dsl

  2. Now do the actual conversion to AppleDict source format. Make sure you include the correct path to pyglossary.pyw. When the command completes (which might take several minutes if the source file is big), a new file webster.xml will be created along with some make files.
     ~/projects/pyglossary/pyglossary.pyw --read-format=ABBYYLingvoDSL --write-format=AppleDict webster.dsl webster.xml

  3. The generated AppleDict source file will be created in a subfolder with the same name as the source file, i.e. webster. Next, we move into that folder and do the compilation of AppleDict source to the OSX binary dictionary format:
     cd webster
     make
     make install
    
Running make install will copy the compiled dictionary to ~/Library/Dictionaries. Restart the Dictionary.app, open preferences (⌘ + ,), and you should see the new dictionary available in the list. Enable the corresponding checkbox to make the dictionary active.
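
The three steps above can be chained into a single shell function. What follows is a minimal sketch under a few assumptions, not a tested tool: the function name dsl_to_appledict and the DRY_RUN switch are invented here, PYGLOSSARY_HOME is assumed to point at your pyglossary checkout, and the generated subfolder is assumed to be named after the second argument, as described in step 3.

```shell
# Sketch only: chains the three conversion steps for one dictionary.
# DRY_RUN and dsl_to_appledict are illustrative names, not pyglossary features.
PYGLOSSARY_HOME="${PYGLOSSARY_HOME:-$HOME/projects/pyglossary}"

dsl_to_appledict() {
    local src="${1:-}"     # original UTF-16 DSL file
    local name="${2:-}"    # base name for the converted files, e.g. webster

    if [ -z "$src" ] || [ -z "$name" ]; then
        echo "usage: dsl_to_appledict <utf16-file.dsl> <name>" >&2
        return 2
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the commands; no files are touched.
        echo "iconv -f UTF-16 -t UTF-8 $src > $name.dsl"
        echo "$PYGLOSSARY_HOME/pyglossary.pyw --read-format=ABBYYLingvoDSL --write-format=AppleDict $name.dsl $name.xml"
        echo "cd $name && make && make install"
        return 0
    fi

    # Step 1: re-encode the source file as UTF-8.
    iconv -f UTF-16 -t UTF-8 "$src" > "$name.dsl" || return 1
    # Step 2: generate the AppleDict source and make files.
    "$PYGLOSSARY_HOME/pyglossary.pyw" --read-format=ABBYYLingvoDSL \
        --write-format=AppleDict "$name.dsl" "$name.xml" || return 1
    # Step 3: compile and install from the generated subfolder.
    ( cd "$name" && make && make install )
}
```

Running it once with DRY_RUN=1 set, for example DRY_RUN=1 dsl_to_appledict webster-original-utf16.dsl webster, shows the exact commands before anything is written, which is handy for checking the paths.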

I have created a bash script that automates the process of DSL to AppleDict conversion: https://gist.github.com/elFua/8540294. Run the bash script without arguments to see usage notes.


IMPORTANT: You will need to edit the script and set PYGLOSSARY_HOME to the folder where you cloned pyglossary. The default is ~/projects/pyglossary.
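
If you would rather not edit a script every time the checkout moves, one option is to read PYGLOSSARY_HOME from the environment and fail early when pyglossary.pyw is not there. This is a sketch of that pattern, not what the gist does; the function name find_pyglossary is hypothetical.

```shell
# Hypothetical helper: resolve the path to pyglossary.pyw, or fail early.
# Uses PYGLOSSARY_HOME from the environment, else the default from this post.
find_pyglossary() {
    local home_dir="${PYGLOSSARY_HOME:-$HOME/projects/pyglossary}"
    local script="$home_dir/pyglossary.pyw"
    if [ ! -f "$script" ]; then
        echo "pyglossary.pyw not found in $home_dir; set PYGLOSSARY_HOME" >&2
        return 1
    fi
    echo "$script"
}
```

A script can then start with something like pyglossary="$(find_pyglossary)" || exit 1 and use "$pyglossary" wherever the full path is needed.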

Babylon BGL to AppleDict conversion

The conversion steps for BGL dictionaries:



A bash script for BGL to AppleDict is available at https://gist.github.com/elFua/8541228. To get this toolchain to work I recommend reading the readme files which are included both with pyglossary and the Apple Dictionary Development Kit. I have included the most important points in the help section of the bash script for DSL.
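
As a rough guide, the BGL route should look like the DSL one minus the iconv step, since BGL is a binary format rather than a text file. The sketch below picks the pyglossary read format from the file extension. ABBYYLingvoDSL is the value used earlier in this post; BabylonBgl is my assumption for the BGL reader name, so check the pyglossary readme or its help output before relying on it. The function name read_format_for is made up.

```shell
# Map a dictionary file extension to a pyglossary --read-format value.
# ABBYYLingvoDSL is taken from the DSL section; BabylonBgl is an assumption.
read_format_for() {
    case "${1:-}" in
        *.dsl|*.DSL) echo "ABBYYLingvoDSL" ;;
        *.bgl|*.BGL) echo "BabylonBgl" ;;
        *) echo "unsupported file type: ${1:-}" >&2; return 1 ;;
    esac
}

# Example use (not executed here; mydict.bgl is a placeholder file name):
#   fmt="$(read_format_for mydict.bgl)"
#   ~/projects/pyglossary/pyglossary.pyw --read-format="$fmt" \
#       --write-format=AppleDict mydict.bgl mydict.xml
#   cd mydict && make && make install
```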

Where to get the dictionaries?

There are hundreds, maybe thousands, of dictionaries in DSL, StarDict, and Babylon BGL formats. A search engine query with the format and language pair will most likely return some links where you'll be able to download the files. For example, try stardict+english+spanish. The GoldenDict dictionaries download page also has some useful links.

There are also a couple of command line utilities for converting between free dictionary formats: dictconv, makedict.

3 comments :

  1. Hi! Is it possible to make these native OS X dictionaries?
    http://www.mobileread.mobi/forums/showthread.php?t=256360
    Any chance you could help me out?

  2. After command "git clone --depth 1 git@github.com:ilius/pyglossary.git" I get this:

    Cloning into 'pyglossary'...
    Permission denied (publickey).
    fatal: Could not read from remote repository.
    Please make sure you have the correct access rights
    and the repository exists.

    Any solutions? Thanks

    Replies
    1. Problem solved: instead of "git clone --depth 1 git@github.com:ilius/pyglossary.git" run "git clone --depth 1 https://github.com/ilius/pyglossary.git ~/projects/pyglossary"

