My brain hurts!
we really think what you know

13 July 2016

Simple webserver with twisted

The Python one-liner web server python -m SimpleHTTPServer (for Python 2.x) or python -m http.server (for Python 3.x) is useful in many scenarios, but sometimes it does not cut it: it is meant for development only, is not secure, and is single-threaded.

A better one-line web server comes with Twisted, the networking library that the original BitTorrent reference implementation was built on.
Installing twisted for the current user only:

pip install twisted --user

Installing twisted system-wide:

sudo pip install twisted

Once installed, you can run the default Twisted web server with a command like twistd -n web --path . where the dot represents the current folder. Replace it with an absolute path if needed. The server listens on port 8080 by default.

The version below additionally displays the internal and external IP addresses, so that you can launch the server and instantly connect to it by entering the displayed IP on your mobile devices or other computers in the local network.

echo 'IP Address:' $(ifconfig | sed -En 's/127.0.0.1//;s/.*inet (addr:)?(([0-9]*\.){3}[0-9]*).*/\2/p'):8080
echo 'External IP:' $(wget http://ipinfo.io/ip -qO -):8080
twistd -n web --path .

In order to be able to connect via the external IP you'll need to set up port forwarding on your router as described at portforward.com.
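
As a sketch of how the IP lookup above could be made reusable, the two helpers below pull non-loopback IPv4 addresses out of ifconfig-style text and build the URL to open on another device. The function names extract_ipv4 and server_url are made up for this illustration, and the sed expression is a slightly stricter variant of the one above (escaped dots, at least one digit per octet).

```shell
# Print non-loopback IPv4 addresses found in ifconfig-style text on stdin.
# Handles both "inet 192.168.1.5" and the older "inet addr:192.168.1.5" form.
extract_ipv4() {
  sed -En 's/127\.0\.0\.1//;s/.*inet (addr:)?(([0-9]+\.){3}[0-9]+).*/\2/p'
}

# Build the URL to type on another device.
# $1 = IP address, $2 = port (assumed 8080, as in the echo lines above).
server_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-8080}"
}

# Possible usage (not run here):
#   for ip in $(ifconfig | extract_ipv4); do server_url "$ip"; done
#   twistd -n web --path .
```

Keeping the parsing in a function makes it easy to try the expression against saved ifconfig output before relying on it.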

03 March 2016

Creating native dictionaries for the OSX Dictionary.app

OSX has an integrated dictionary app which can be accessed by selecting some text in any application, right-clicking and then picking Look up from the context menu (three-finger-tap on a selected word should work too).

The OSX Dictionary comes pre-installed with several dictionaries but creating your own dictionaries or converting existing ones is surprisingly complex.

This post is an attempt to make this process easier for everyone.

Most third-party dictionaries that I use are in DSL (ABBYY Lingvo) and BGL (Babylon) formats. I have collected them over the years, and have been using them both on the desktop via the open-source app GoldenDict, and on my smartphone thanks to GoldenDict for Android. The OSX version of GoldenDict is still buggy, and apparently not properly maintained, so I wanted to find a way to migrate my existing dictionaries to the native OSX dictionary app.

After a bit of research I finally figured out the toolchain for converting my existing dictionaries in Babylon and Lingvo DSL formats to the AppleDict format which the OSX Dictionary.app expects.

Prerequisites

Auxiliary Tools for Xcode

Download the DMG for Auxiliary Tools for Xcode from developer.apple.com/download/more. You'll need to sign in with your Apple ID in order to download. If you already have an iCloud account then you can use it. Once you sign in you can use the direct link to Auxiliary Tools for Xcode 7 or use the latest version by searching for it on the Apple downloads page.

Mount the DMG file by double-clicking it in Finder. Next, as the root user, create a folder at /Developer/Extras and copy the Dictionary Development Kit folder from the Auxiliary Tools into that folder.
sudo mkdir -p /Developer/Extras
sudo cp -r "/Volumes/Auxiliary Tools/Dictionary Development Kit" /Developer/Extras
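
A quick way to confirm the copy worked is to test for the folder before going further. This is only a minimal sketch: ddk_installed is a hypothetical helper name, and its default path simply mirrors the cp command above.

```shell
# Succeed if the Dictionary Development Kit folder exists.
# $1 = folder to check (defaults to the location used by the cp command above).
ddk_installed() {
  [ -d "${1:-/Developer/Extras/Dictionary Development Kit}" ]
}

# Possible usage (not run here):
#   ddk_installed && echo "kit found" || echo "kit missing"
```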

Python 3

Install Python 3 via homebrew (OSX comes with Python 2.x preinstalled):
brew install python3

Python 3 dependencies

Install lxml and BeautifulSoup, the parsers that pyglossary depends on:
## for current user only
pip3 install lxml beautifulsoup4 --user

## system-wide, for all users
sudo pip3 install lxml beautifulsoup4

PyGlossary

Clone the pyGlossary project to a folder on your drive:
mkdir -p ~/projects
cd ~/projects
git clone --depth 1 https://github.com/ilius/pyglossary.git
Next, locate the path to pyglossary.pyw — that's the script you'll need for converting the DSL file in UTF-8 encoding to AppleDict XML.
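
Before starting a conversion it can help to check that the command-line tools used in this post are on the PATH. The helper below is a sketch, and missing_tools is a hypothetical name; it prints each tool it cannot find and prints nothing when everything is available.

```shell
# Print the name of every tool from the argument list that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Possible usage (not run here), with the tools this post relies on:
#   missing_tools python3 pip3 git iconv make
```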

DSL to AppleDict conversion

The DSL to AppleDict sequence:
  1. DSL files come in Unicode Little Endian encoding, so the first step would be converting your DSL file from UTF-16 to UTF-8:
     iconv -f UTF-16 -t UTF-8 webster-original-utf16.dsl > webster.dsl

  2. Now do the actual conversion to AppleDict source format. Make sure you include the correct path to pyglossary.pyw. When the command completes (which might take several minutes if the source file is big), a new file webster.xml will be created along with some make files.
     ~/projects/pyglossary/pyglossary.pyw --read-format=ABBYYLingvoDSL --write-format=AppleDict webster.dsl webster.xml

  3. The generated AppleDict source file will be created in a subfolder with the same name as the source file, i.e. webster. Next, we move into that folder and do the compilation of AppleDict source to the OSX binary dictionary format:
     cd webster
     make
     make install

Running make install will copy the compiled dictionary to ~/Library/Dictionaries. Restart the Dictionary.app, open preferences (⌘ + ,), and you should see the new dictionary available in the list. Enable the corresponding checkbox to make the dictionary active.
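
The three steps above can be previewed for any dictionary with a small helper that prints the commands instead of running them. This is a sketch and not the gist linked in this post: plan_dsl_conversion is a hypothetical function name, PYGLOSSARY_HOME is assumed to point at your pyglossary clone, and file names containing double quotes or dollar signs would need extra escaping.

```shell
# Assumed location of the pyglossary clone; override it in the environment.
PYGLOSSARY_HOME="${PYGLOSSARY_HOME:-$HOME/projects/pyglossary}"

# Print the commands that convert a UTF-16 DSL file to an installed dictionary.
# $1 = original UTF-16 DSL file, $2 = short dictionary name (e.g. webster).
plan_dsl_conversion() {
  # Step 1: re-encode the source file as UTF-8.
  printf 'iconv -f UTF-16 -t UTF-8 "%s" > "%s.dsl"\n' "$1" "$2"
  # Step 2: generate the AppleDict source with pyglossary.
  printf '"%s/pyglossary.pyw" --read-format=ABBYYLingvoDSL --write-format=AppleDict "%s.dsl" "%s.xml"\n' \
    "$PYGLOSSARY_HOME" "$2" "$2"
  # Step 3: compile and install from the generated subfolder.
  printf 'cd "%s" && make && make install\n' "$2"
}

# Possible usage (not run here): review the plan, then pipe it to sh.
#   plan_dsl_conversion webster-original-utf16.dsl webster
#   plan_dsl_conversion webster-original-utf16.dsl webster | sh
```

Printing the plan first makes it easy to spot a wrong path before a long conversion starts.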

I have created a bash script that automates the process of DSL to AppleDict conversion: https://gist.github.com/elFua/8540294. Run the bash script without arguments to see usage notes.


IMPORTANT: You will need to edit the script and set the correct value of PYGLOSSARY_HOME. By default it is ~/projects/pyglossary.

Babylon BGL to AppleDict conversion

The conversion steps for BGL dictionaries:



A bash script for BGL to AppleDict is available at https://gist.github.com/elFua/8541228. To get this toolchain to work I recommend reading the readme files which are included both with pyglossary and the Apple Dictionary Development Kit. I have included the most important points in the help section of the bash script for DSL.

Where to get the dictionaries?

There are hundreds, maybe thousands, of dictionaries in DSL, StarDict, and Babylon BGL formats. A search engine query with the format and language pair will most likely return some links where you'll be able to download the files. For example, try stardict+english+spanish. The GoldenDict dictionaries download page also has some useful links.

There are also a couple of command line utilities for converting between free dictionary formats: dictconv, makedict.