Open Travel Request Parser (TREP)

Overview

OpenTREP aims at providing a clean API, and the corresponding C++ implementation, for parsing travel-/transport-focused requests. It powers the https://transport-search.org Web site (as well as its newer version, https://www2.transport-search.org).

OpenTREP uses Xapian (https://www.xapian.org) for the Information Retrieval part, on freely available transport-/travel-related data (e.g., country names and codes, city names and codes, airline names and codes, etc.), found mainly in the OpenTravelData (OPTD) project: http://github.com/opentraveldata/opentraveldata/tree/master/opentraveldata

OpenTREP exposes a simple, clean, object-oriented API. For instance, the OPENTREP::interpretTravelRequest() method:

As an example, the travel request Washington DC Beijing Monday a/r +AA -UA 1 week 2 adults 1 dog would yield the following list:

The output may then be used by other systems, for instance to book the corresponding travel, or to visualize it on a map and calendar and to share it with others.

Note that the current version of OpenTREP recognizes only geographical POR (points of reference), whatever their number in the request. For instance, the request lviv rivne jitomir kbp kharkiv dnk ods lwo yields the following list of POR: LWO, RWN, ZTR, KBP, HRK, DNK, ODS and LWO again. See that request in action on the transport-search.org site or through the API (enable JSONView or similar for more comfortable reading).
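The input/output shape of that request can be illustrated with a toy sketch. It is not how OpenTREP works internally (OpenTREP relies on Xapian for the matching); the lookup table and the `recognize_por` function are hypothetical and only cover the tokens of the example request above.

```python
from typing import List

# Toy illustration only: a hard-coded lookup table standing in for the
# Xapian index. Its content comes from the example request above; the
# names used here are assumptions, not part of the OpenTREP API.
TOY_INDEX = {
    "lviv": "LWO",
    "rivne": "RWN",
    "jitomir": "ZTR",
    "kbp": "KBP",
    "kharkiv": "HRK",
    "dnk": "DNK",
    "ods": "ODS",
    "lwo": "LWO",
}


def recognize_por(request: str) -> List[str]:
    """Map each whitespace-separated token to a POR code, skipping unknown tokens."""
    codes = []
    for token in request.lower().split():
        code = TOY_INDEX.get(token)
        if code is not None:
            codes.append(code)
    return codes


print(recognize_por("lviv rivne jitomir kbp kharkiv dnk ods lwo"))
```

The order of the output follows the order of the tokens in the request, which is why LWO appears both first and last.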

OpenTREP also deals with transport-related requests. For instance, cnshg deham nlrtm uslbg brssz cnshg corresponds to a world tour of famous ports.
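The tokens of that request follow the UN/LOCODE layout: a two-letter ISO 3166-1 country code followed by a three-character location code. The sketch below only splits such codes; the `split_unlocode` helper is hypothetical and does not reflect how OpenTREP parses them.

```python
from typing import Tuple


def split_unlocode(code: str) -> Tuple[str, str]:
    """Split a 5-character UN/LOCODE into its (country, location) parts."""
    normalized = code.strip().upper()
    if len(normalized) != 5:
        raise ValueError(f"expected 5 characters, got {code!r}")
    return normalized[:2], normalized[2:]


request = "cnshg deham nlrtm uslbg brssz cnshg"
for token in request.split():
    country, location = split_unlocode(token)
    print(f"{token}: country={country}, location={location}")
```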

The underlying data for the POR is the OpenTravelData optd_por_public_all.csv file. A good complementary tool is GeoBase, a Python-based software able to access any travel-related data source.

OpenTREP makes extensive use of existing open-source libraries for increased functionality, speed and accuracy. In particular, the Boost (C++ Standard Extensions), Xapian and SOCI libraries are used.

Docker images

OpenTREP Docker images

Docker images provide ready-to-use environments, and are available on Docker Cloud and Quay.io:

$ docker pull opentrep/search-travel:legacy # for Docker.io
$ docker pull quay.io/trep/opentrep # for Quay.io
$ docker run --rm -it opentrep/search-travel:legacy bash

See https://github.com/trep/opentrep/tree/master/gui/legacy for more details.

General purpose C++/Python Docker images

General purpose Docker images for C++/Python development are also available from Docker Cloud. Those Docker images allow development on the major Linux distributions, i.e., CentOS, Debian and Ubuntu.

CentOS 8

$ docker pull cpppythondevelopment/base:centos8
$ docker run -t cpppythondevelopment/base:centos8 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake3 -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

CentOS 7

$ docker pull cpppythondevelopment/base:centos7
$ docker run -t cpppythondevelopment/base:centos7 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake3 -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

Ubuntu 20.04 LTS Focal Fossa

$ docker pull cpppythondevelopment/base:ubuntu2004
$ docker run -t cpppythondevelopment/base:ubuntu2004 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

Ubuntu 18.04 LTS Bionic Beaver

$ docker pull cpppythondevelopment/base:ubuntu1804
$ docker run -t cpppythondevelopment/base:ubuntu1804 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

Debian 10 Buster

$ docker pull cpppythondevelopment/base:debian10
$ docker run -t cpppythondevelopment/base:debian10 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

Debian 9 Stretch

$ docker pull cpppythondevelopment/base:debian9
$ docker run -t cpppythondevelopment/base:debian9 bash
[build@2..c ~]$ mkdir -p ~/dev/geo && cd ~/dev/geo
[build@2..c geo]$ git clone https://github.com/trep/opentrep.git
[build@2..c geo]$ cd opentrep && mkdir build && cd build
[build@2..c build (master)]$ cmake -DCMAKE_INSTALL_PREFIX=${HOME}/dev/deliveries/opentrep-99.99.99 -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON -DRUN_GCOV:BOOL=OFF -DLIB_SUFFIX= ..

Common to all the above-mentioned Linux distributions

[build@2..c build (master)]$ make install
[build@2..c build (master)]$ ./opentrep/opentrep-indexer
[build@2..c build (master)]$ ./opentrep/opentrep-searcher -q "nice san francisco"

Native installation (without Docker)

RPM-based distributions (e.g., Fedora/CentOS/RedHat)

Since OpenTREP has been approved as an official package of Fedora/CentOS/RedHat (see the review request on Bugzilla for further details), just use DNF (or Yum for the older distributions):

$ dnf -y install opentrep opentrep-doc

Installation from the sources

Configure the environment

Clone the Git repository

The GitHub repository may be cloned as follows:

$ mkdir -p ~/dev/geo && cd ~/dev/geo
$ git clone https://github.com/trep/opentrep.git
$ cd opentrep
$ git checkout master

Alternatively, download and extract the tar-ball

GitHub generates tar-balls on the fly for every tagged release. For instance:

$ wget https://github.com/trep/opentrep/archive/opentrep-0.07.15.tar.gz
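The archive URL above follows a simple pattern. As a minimal sketch, and assuming other tagged releases use the same `opentrep-<version>` naming (an assumption, not something stated here), the URL could be derived from a version string; `release_tarball_url` is a hypothetical helper.

```python
def release_tarball_url(version: str) -> str:
    """Build the GitHub archive URL for a tagged release.

    Assumes the tag is named 'opentrep-<version>', as in the example above.
    """
    return f"https://github.com/trep/opentrep/archive/opentrep-{version}.tar.gz"


print(release_tarball_url("0.07.15"))
```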

Note that SourceForge also stores some older archived tar-balls.

Installation of the dependencies

On Linux

The following packages may be needed (Fedora/RedHat/CentOS names on the left hand side, Debian/Ubuntu names on the right hand side; names for other Linux distributions may vary):

For instance, the following subsections show the installation commands for a few popular Linux distributions.

Fedora
CentOS
Debian/Ubuntu

On MacOS

ICU

Boost

Follow the instructions in the Boost helper documentation on GitHub to install Python and Boost on some platforms, including MacOS.

CentOS

SOCI

General Unix/Linux

$ mkdir -p /opt/soci/socigit/build/head
$ pushd /opt/soci/socigit/build/head
$ cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release -DSOCI_CXX11=ON -DSOCI_TESTS=OFF ../..
$ make
$ sudo make install
$ popd

Debian

$ wget https://github.com/trep/opentrep/raw/master/ci-scripts/soci-debian-cmake.patch -O /opt/soci/soci-debian-cmake.patch
$ pushd /opt/soci/socigit
$ patch -p1 < ../soci-debian-cmake.patch
$ popd
$ mkdir -p /opt/soci/socigit/build/head
$ pushd /opt/soci/socigit/build/head
$ cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=Release -DSOCI_CXX11=ON -DSOCI_TESTS=OFF ../..
$ make
$ sudo make install
$ popd

MacOS

Building the library and test binary

To customize OpenTREP to your environment, you can alter the installation directory:

export INSTALL_BASEDIR="${HOME}/dev/deliveries"
export TREP_VER="0.07.15"
if [ -d /usr/lib64 ]; then LIBSUFFIX="64"; else LIBSUFFIX=""; fi
export LIBSUFFIX_4_CMAKE="-DLIB_SUFFIX=$LIBSUFFIX"

Then, as usual:

cmake -DCMAKE_INSTALL_PREFIX=${INSTALL_BASEDIR}/opentrep-${TREP_VER} \
  -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=OFF \
  -DRUN_GCOV:BOOL=OFF ${LIBSUFFIX_4_CMAKE} ..
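For readers who script the build, the same logic can be sketched in Python: pick the library suffix depending on whether `/usr/lib64` exists, then assemble the `cmake` arguments. This is only an illustration; the `cmake_args` helper is hypothetical, and the install prefix is assumed to be `${INSTALL_BASEDIR}/opentrep-${TREP_VER}`, following the layout of the Docker examples.

```python
import os
from typing import List


def cmake_args(install_basedir: str, trep_ver: str, has_lib64: bool) -> List[str]:
    """Assemble the cmake arguments for a Debug build (hypothetical helper)."""
    # Mirrors the shell test on /usr/lib64: suffix is "64" or empty.
    lib_suffix = "64" if has_lib64 else ""
    return [
        "cmake",
        f"-DCMAKE_INSTALL_PREFIX={install_basedir}/opentrep-{trep_ver}",
        "-DCMAKE_BUILD_TYPE:STRING=Debug",
        "-DINSTALL_DOC:BOOL=OFF",
        "-DRUN_GCOV:BOOL=OFF",
        f"-DLIB_SUFFIX={lib_suffix}",
        "..",
    ]


args = cmake_args(
    install_basedir=os.path.expanduser("~/dev/deliveries"),
    trep_ver="0.07.15",
    has_lib64=os.path.isdir("/usr/lib64"),
)
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` from within the build directory.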


* On MacOS, some software (_e.g._, ICU and Readline) is not installed in
the standard location, so the `cmake` command becomes:
```bash
$ export CMAKE_CXX_FLAGS="-Wno-mismatched-new-delete"; \
  cmake -DCMAKE_INSTALL_PREFIX=${INSTALL_BASEDIR}/opentrep-$TREP_VER \
   -DREADLINE_ROOT=/usr/local/opt/portable-readline \
   -DREADLINE_INCLUDE_DIR=/usr/local/opt/portable-readline/include \
   -DREADLINE_LIBRARY=/usr/local/opt/libedit/lib/libedit.dylib \
   -DICU_ROOT=/usr/local/opt/icu4c \
   -DCMAKE_BUILD_TYPE:STRING=Debug -DINSTALL_DOC:BOOL=ON \
   -DRUN_GCOV:BOOL=OFF ${LIBSUFFIX_4_CMAKE} ..
```

Underlying (relational) database, SQLite or MySQL/MariaDB, if any

OpenTREP may use, if so configured, a relational database. For now, two database products are supported: SQLite3 and MySQL/MariaDB. The database accelerates the lookup of POR by code (IATA, ICAO, FAA) and by Geonames ID. When OpenTREP is configured to run without a database, those codes and Geonames IDs are full-text searched directly with Xapian. Note that the database can be managed directly, i.e., without the OpenTREP search interface on top of it, thanks to the opentrep-dbmgr utility, which is detailed below.
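To illustrate why a relational database helps for code lookups, here is a minimal sketch using Python's standard `sqlite3` module. The table name, column names and index are hypothetical and are not OpenTREP's actual schema; the point is only that an indexed code column turns the lookup into a direct key search rather than a full-text query.

```python
import sqlite3

# Hypothetical schema for illustration; OpenTREP's actual tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE por (iata_code TEXT, icao_code TEXT, name TEXT)")
# The index on the code column is what makes the lookup fast.
conn.execute("CREATE INDEX idx_por_iata ON por (iata_code)")
conn.executemany(
    "INSERT INTO por VALUES (?, ?, ?)",
    [("KBP", "UKBB", "Kyiv Boryspil"), ("LWO", "UKLL", "Lviv")],
)


def lookup_by_iata(code: str):
    """Return the (iata, icao, name) rows for an IATA code, after upper-casing it."""
    cursor = conn.execute(
        "SELECT iata_code, icao_code, name FROM por WHERE iata_code = ?",
        (code.upper(),),
    )
    return cursor.fetchall()


print(lookup_by_iata("kbp"))
```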

Use cases

Indexing the POR data

Filling the (relational) database, SQLite or MySQL/MariaDB, if any

Here, for clarity, we use the local version. It is easy (see above) to derive the same commands with the installed version.

Xapian indexing with standard installation

By default, the Xapian indexer runs without filling any relational database, as that step can be performed independently by opentrep-dbmgr, as seen above.

Xapian indexing for an ad hoc deployed Web application

Searching

Deployment stages

The idea is to have at least two pieces of infrastructure (SQL database, Xapian index) in parallel:

Once the new version has been validated, the two pieces of infrastructure can then be swapped, i.e., the new version becomes production, and the older version ends up in staging.

It means that all programs have to choose which version they want to work on. That version may even be toggled live.

Deploying to production through a staging process is all the more necessary because indexing a new POR data file takes up to 30 minutes in the worst case. We cannot afford 30-45 minutes of downtime every time a new POR data file is released (potentially every day).

With that staging process, it is even possible to fully automate the re-indexing after a new POR data file release: once the new release has been cleared by QA (Quality Assurance) on staging, it becomes production.

The corresponding command-line option for the various programs (opentrep-dbmgr, opentrep-indexer, opentrep-searcher) is -m.
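The swap between staging and production can be sketched as two numbered slots whose roles are exchanged after validation. The slot numbering (0 and 1) and the `Deployments` class are assumptions made for the illustration; the text above only states that the `-m` option selects the version to work on.

```python
from dataclasses import dataclass


@dataclass
class Deployments:
    """Two parallel slots, each with its own SQL database and Xapian index."""

    production: int = 0  # slot currently serving requests (assumed numbering)
    staging: int = 1     # slot being re-indexed and validated

    def promote(self) -> None:
        """Swap the roles once the staging slot has been validated."""
        self.production, self.staging = self.staging, self.production


slots = Deployments()
# ... re-index the new POR data file into slots.staging, then run QA ...
slots.promote()
print(f"production={slots.production}, staging={slots.staging}")
```

Because only the roles are exchanged, the previous production slot stays available and becomes the target of the next re-indexing.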

Index, or not, non-IATA POR

There is also a command-line option, namely -n, to state whether or not the non-IATA-referenced POR should be included/parsed and indexed.

By default, and historically, only the POR referenced by IATA (i.e., which have a specific IATA code) are indexed (and may be searched for) in OpenTREP.

POR are also referenced by other international organizations, such as ICAO or UN/LOCODE, and may not be referenced by IATA (in which case their IATA code is left empty).

As of October 2018, there are around 110,000 POR in OpenTravelData (OPTD), the reference data source for OpenTREP:

Once indexed, all those POR become searchable. That flag is therefore only used at indexing time (i.e., by the opentrep-dbmgr and opentrep-indexer programs).
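The effect of that flag at indexing time can be sketched as a simple filter on the IATA code field. The `iata_code` field name, the `select_por` helper and the sample records are assumptions for illustration; they do not reproduce the OpenTREP indexer.

```python
from typing import Dict, Iterable, List


def select_por(records: Iterable[Dict[str, str]], include_non_iata: bool) -> List[Dict[str, str]]:
    """Keep every record, or only those with a non-empty IATA code."""
    if include_non_iata:
        return list(records)
    return [rec for rec in records if rec.get("iata_code", "").strip()]


# Placeholder records: the second one stands for a POR without an IATA code.
records = [
    {"iata_code": "KBP", "name": "Kyiv Boryspil"},
    {"iata_code": "", "name": "placeholder non-IATA POR"},
]
print(len(select_por(records, include_non_iata=False)))
print(len(select_por(records, include_non_iata=True)))
```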

Installing a Python virtual environment

All the details are explained in a dedicated procedure, which works for the major Linux distributions and on MacOS.

The procedure first installs a specific version of Python (as of January 2022, 3.9.9) thanks to PyEnv, then installs pipenv thanks to the pip utility provided with that specific Python version.

Checking that the Python module works

Trouble-shooting Python issues on MacOS

Interceptors not installed / late

OpenTREP as a Python extension

References

Build and package OpenTREP as a Python extension

Install Python modules/dependencies

Install the OpenTREP Python extension with system-based pip

Build the OpenTREP Python extension locally with system-based Scikit-build

_skbuild/*-x86_64-3.9/:
-rw-r--r--   1 user  staff     0B Jan 10 19:10 _skbuild_MANIFEST
drwxr-xr-x  24 user  staff   768B Jan 10 19:10 cmake-build/
drwxr-xr-x   6 user  staff   192B Jan 10 19:10 cmake-install/
drwxr-xr-x   3 user  staff    96B Jan 10 19:10 setuptools/


* Set the `LD_LIBRARY_PATH` and `PYTHONPATH` environment variables:
```bash
$ INST_DIR=${PWD}/_skbuild/macosx-13.0-x86_64-3.9/cmake-install
  TREPBINDIR=${INST_DIR}/bin
  OPTDPOR=${INST_DIR}/share/opentrep/data/por/test_optd_por_public.csv
  export LD_LIBRARY_PATH=${INST_DIR}/lib
  export PYTHONPATH=${INST_DIR}/lib:${INST_DIR}/lib/python3.9/site-packages/pyopentrep
```

View at: https://test.pypi.org/project/opentrep/0.7.14/


* Upload/release the Python packages onto the
  [PyPI repository](https://pypi.org):
```bash
user@laptop$ PYPIURL="https://upload.pypi.org"
user@laptop$ keyring set ${PYPIURL}/ __token__
Password for '__token__' in '${PYPIURL}/':
user@laptop$ twine upload -u __token__ --repository-url ${PYPIURL}/legacy/ dist/*
Uploading distributions to https://upload.pypi.org/legacy/
Uploading opentrep-0.7.14.post2-cp39-cp39-macosx_13_0_x86_64.whl
100%|█████████████████████████████████████████████████████████████████████| 9.86M/9.86M [01:00<00:00, 172kB/s]
Uploading opentrep-0.7.14.post2.tar.gz
100%|█████████████████████████████████████████████████████████████████████| 1.65M/1.65M [00:12<00:00, 139kB/s]

View at:
https://pypi.org/project/opentrep/0.7.14.post2/
```

Test the OpenTREP Python extension

test_trep_e2e_simple.py [100%]

================== 1 passed in 1.38s ==============


Use the OpenTREP Python extension

Download the latest OpenTravelData (OPTD) POR data file

* If not already done, install a few more Python modules
  + On Linux:
```bash
$ python -mpip install -U opentrepwrapper opentraveldata
```

Xapian index initialization

Search with the OpenTREP Python extension

Search with the OpenTrepWrapper package

(Optional) Running the Django-based application server (needs update)