Extending Jython with pypi modules

Thad Guidry edited this page Nov 15, 2022 · 7 revisions

Tutorial - How to extend OpenRefine with pypi modules

OpenRefine comes with Jython/Python expression support built-in. Great!

But sometimes you want to use pip to download Python modules and libraries for your OpenRefine project without a lot of headache.

Quick Answer: Download and install Jython 2.7+ (choose the 'standard' install) so you can manage and run pip outside of OpenRefine, then add Jython's site-packages folder to sys.path inside your expressions.

  • Download Jython 2.7x and install it using the 'standard' option (setuptools and pip will be installed).
  • Add the jython2.7x\bin folder to your path. On GNU/Linux, add the line export PATH=$PATH:$HOME/jython2.7x/bin to your ~/.bashrc file.
  • Because of a current bug in the Jython installer (http://bugs.jython.org/issue2521), on Windows you might have to rename TEMP\pip_build_blah to TEMP\XXX_pip_build_blah.
  • Open a Windows Command Prompt (cmd.exe) and install the modules that you want to be able to use in OpenRefine:
  pip install address-formatter
  • NOTE: if you already have a Python version installed, pip install <module> will install the module into that Python distribution (the first one found on your path), not into Jython. To work around the problem, run pip through Jython instead:
  jython -m pip install address-formatter
  • Start OpenRefine
  • In the Expression dialog, append the Jython 2.7 site-packages folder to sys.path:
  import sys
  sys.path.append('E:\\jython2.7.0\\Lib\\site-packages')
  • or, if you don't want to escape each backslash in the folder path, use a raw string:
  sys.path.append(r'E:\jython2.7.0\Lib\site-packages')
  • On gnu/Linux, use sys.path.append('/home/user/jython2.7x/Lib/site-packages')

  • Then import the package and/or .py files you want to work with
  from address import AddressParser, Address

  • You can also set the Jython site-packages path permanently in your refine.ini file. Add the following line to your refine.ini: JAVA_OPTIONS=-Dpython.home=/home/user/jython2.7x

  • Putting it all together with a use case
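import sys
# Make the pip-installed modules visible to OpenRefine's Jython interpreter
sys.path.append('E:\\jython2.7.0\\Lib\\site-packages')
from address import AddressParser, Address

ap = AddressParser()

# value is the current cell; the expression must end with a return
return ap.parse_address(value)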
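
Because OpenRefine evaluates the expression for each cell, a plain sys.path.append can end up adding the same folder repeatedly, and a mistyped folder raises no error until the import fails. The sketch below is one way to guard against both. add_site_packages is a hypothetical helper name (it is not part of OpenRefine or Jython), and the path is just the example path used on this page.

```python
import os
import sys

# Example location from this page; adjust it to your own Jython install.
JYTHON_SITE_PACKAGES = r'E:\jython2.7.0\Lib\site-packages'


def add_site_packages(path):
    """Append path to sys.path once; return False if the folder is missing."""
    if not os.path.isdir(path):
        return False
    if path not in sys.path:
        sys.path.append(path)
    return True


# True when the folder exists (and is now on sys.path), False otherwise.
found = add_site_packages(JYTHON_SITE_PACKAGES)
```

Inside an OpenRefine expression you could check found before the import and return a short message such as 'site-packages folder not found' when it is False, which makes a wrong path visible in the preview instead of surfacing as an import error.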

How to replace diacritic characters

import sys
sys.path.append(r'E:\jython2.7.1rc1\Lib\site-packages')
from unidecode import unidecode
return unidecode(value)

# Boris Villazón -> Boris Villazon
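
If installing unidecode is not an option, a rough approximation is possible with the unicodedata module from the Python 2.7 standard library, which Jython 2.7 should also provide. Note that this is a different technique from unidecode: it only drops combining marks after Unicode decomposition, so letters that have no decomposition (for example ø or ß) pass through unchanged. strip_diacritics is a hypothetical helper name chosen for this sketch.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining accents from a unicode string (not a full transliteration)."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    # Keep everything except the combining marks.
    return u''.join(ch for ch in decomposed if not unicodedata.combining(ch))


# u'Boris Villaz\u00f3n' is 'Boris Villazón' written with an escape sequence.
example = strip_diacritics(u'Boris Villaz\u00f3n')  # u'Boris Villazon'
```

In an OpenRefine expression the last line would instead be return strip_diacritics(value), assuming the cell value is a text string.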