Skip to content

An analysis and speedup of the FALDOI method for Optical Flow estimation

Notifications You must be signed in to change notification settings

fperezgamonal/faldoi-ipol

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

45 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Python 2.7 Python 3.5 License

FALDOI-IPOL

Stems from the basic FALDOI: A New Minimization Strategy for Large Displacement Variational Optical Flow by Roberto P. Palomares, Enric Meinhardt-Lopis, Coloma Ballester and Gloria Haro algorithm and provides a sped-up optimizationwhich is published on the IPOL journal with an interactive demo, source code and supplementary materials and tools available openly. In the future, we intend to add occlusion estimation to several energy functionals, aside from the one available (TVL1).

Paper(s) and citation

If you use FALDOI, please cite any of the following papers:

Springer original paper (November 2016):

@Article{Palomares2017,
author="Palomares, Roberto P.
and Meinhardt-Llopis, Enric
and Ballester, Coloma
and Haro, Gloria",
title="FALDOI: A New Minimization Strategy for Large Displacement Variational Optical Flow",
journal="Journal of Mathematical Imaging and Vision",
year="2017",
month="May",
day="01",
volume="58",
number="1",
pages="27--46",
issn="1573-7683",
doi="10.1007/s10851-016-0688-y",
url="https://doi.org/10.1007/s10851-016-0688-y"
}

the arXiv paper (September 2016):

@article{DBLP:journals/corr/PalomaresHBM16,
  author    = {Roberto P. Palomares and
               Gloria Haro and
               Coloma Ballester and
               Enric Meinhardt{-}Llopis},
  title     = {{FALDOI:} Large Displacement Optical Flow by Astute Initialization},
  journal   = {CoRR},
  volume    = {abs/1602.08960},
  year      = {2016},
  url       = {http://arxiv.org/abs/1602.08960},
  archivePrefix = {arXiv},
  eprint    = {1602.08960},
  timestamp = {Wed, 07 Jun 2017 14:42:11 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/PalomaresHBM16},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

You can reference this implementation with: IPOL article, demo and supplementary material (March 2019):

@article{ipol.2019.238,
    title   = {{An Analysis and Speedup of the FALDOI Method for Optical Flow Estimation}},
    author  = {Gamonal, Ferran P. and Ballester, Coloma and Haro, Gloria and Meinhardt-Llopis, Enric and Palomares, Roberto P.},
    journal = {{Image Processing On Line}},
    volume  = {9},
    pages   = {94--123},
    year    = {2019},
    doi     = {10.5201/ipol.2019.238},
}
% if your bibliography style doesn't support doi fields:
    note    = {\url{https://doi.org/10.5201/ipol.2019.238}}

Table of contents


Getting Started

The code is written in C/C++ and includes Python scripts to unify and simplify all the tasks the algorithms performs.

Pre-requisites

The software needs the following programs to be installed in order to function:

  • Python 3.5 (also works with Python 2.7)

  • Pillow for Python 3 (if you do not want to use Pillow, in the python scripts there is a commented section above the PIL import that uses imagemagick's identify instead). Nevertheless, notice that Pillow will be usually faster (since it is built in python).

  • lipng, libjpeg, libtiff: by default, we use your system's default version (although we provide the original version of libpng which the code was created with in the 'src/lib' directory). Default system versions of Ubuntu 16.04 LTS have been checked as well.

  • OpenMP (Optional but highly recommended(*)). Should be included with your compiler if you are using a relatively new version (e.g.: gcc supports it since version 4.2). For a gentle introduction to OpenMP, please read this post or visit the official website.

(*) The speed-up factor is detailed in the IPOL article but it ranges between 1.5x times faster (only 2 CPUs) to 4x faster (with 16 CPUs). This fact is more critical if you used the original code, which is about 20 times slower than the current multi-threaded implementation.

Compilation

The code needs to be compiled in order to generate the needed executables. Nevertheless, if you do not want to compile the code before testing it once, we added already-compiled executables that are already linked (see Execution). These executables have been compiled in a machine with 4 cores running Ubuntu 16.04 LTS.

To compile the code, we have added a bash script that runs all the needed commands to remove the old build directory, create a new one and re-compile the whole project, integrating all the changes done to any C/C++ file. To run the script, just do the following (from the project's root directory):

source recompile_changes.sh

or

. recompile_changes.sh

'source' or '.' is a bash shell built-in command that executes the content of the file passed as argument, in the current shell.

If you run into any problems with the binaries for the matchers located in 'ext_bin', you may try to download their sources and compile it under your machine. To do so, you can follow the steps explained in the 'Compute matches' section below.

Algorithm's steps

In order to obtain the final optical flow estimation, the algorithm follows these steps:

  1. Reads a text file (.txt) containing the paths to the input frames to be processed (see Python scripts/NOTE for details).
  2. Computes the descriptors and matches between frames I0 and I1 (with SIFT or Deepmatching).
  3. Computes a sparse flow from the matches.
  4. Computes a dense flow from the initial sparse flow (local step).
  5. Computes global minimization taking as input the flow of the previous step (global step).

From algorithms to actual code

The first time you download the code, you will see that the 'src' folder contains several source files. Trying to understand what each of those files does at once will be too time-consuming and hard. For those reasons, we strongly encourage you to read one of the articles linked at the top of this file. More concretely, if you want a thorough explanation of the theoretical background of the algorithm, read the original Springer paper; if you want a summarised version containing more implementations details, read the IPOL article instead.

In the IPOL article, we have broken down the algorithm into 8 different pseudo-codes which are implemented in code inside the 'src' folder. Note that we will refer to the algorithms with the same notation used in the IPOL article. The algorithms are the following:

  • Algorithm 1 (main algorithm (end-to-end)): it is implemented in the python scripts 'faldoi_sift.py' and 'faldoi_deep.py', depending on the chosen matching method.
  • Algorithm 2 (compute-matches): it is implemented in lines #201 to #240 of 'faldoi_sift.py' and lines #223 to #273 of 'faldoi_deep.py'.
  • Algorithm 3 (saliency-matches): this algorithm is only run if the selected matcher is DeepMatching. It is implemented in the file 'rescore_prunning.py' (compute confidence scores) and 'auxiliar_faldoi_functions.py', lines #47 to #64 (delete outliers) and lines #32 to #44 (cut matches list, removing scores).
  • Algorithm 4 (sparse-flow): it is implemented in the file 'sparse_flow.cpp'.
  • Algorithm 5 (dense-flow): it is implemented in the file 'local_faldoi.cpp', in the function 'match_growing_variational' (lines #1174 to #1530).
  • Algorithm 6 (basic-faldoi-growing): it is implemented in the file 'local_faldoi.cpp', in the function 'local_growing' (lines #890 to #1039).
  • Algorithm 7 (forwardBackward-pruning): it is implemented in the file 'local_faldoi.cpp', in the function 'fb_consistency_check' (lines #166 to #189).
  • Algorithm 8 (flow-refinement): this algorithm has two different implementations following the pseudo-code written in the article (one for the local step, over a patch; one for the global step, over the whole image domain).

    As a consequence, it is implemented in 'global_faldoi.cpp', inside the main() function, using different implementations for each type of functional that FALDOI supports. For example, the TVL2-L1 functional, the global step is performed in function 'tvl2OF' (lines #555 to #843).
    For the local step, the implementation resides in functional-specific functions defined alongside the energy model in its own source + header (cpp + h) files. For instance, for the TVL2-L1 functional, the pseudo-code is implemented in the file 'tvl2_model.cpp' (lines #249 to #425).

For more information about all these algorithms, please, refer to the papers (section 'Proposed minimization strategy'). Additionally, we provide a list of the relation between the naming convention for the energy functionals used in the IPOL paper and in the code.

Not in the paper, but combination of different data term (CSAD, L2) and regularization term (L1, NLTV):

  • "NLTV1" combines NLTV as regularizer + L1 as data term. Implemented in 'nltv_model.cpp'

Execution

Once you have successfully compiled the program, you can execute the algorithm in two ways: using the C++ executables or running the Python scripts (recommended). The advantage of choosing the latter is that all the algorithm's steps are run at once sequentially, while with C++ executables, you will have to manually call each block of the algorithm, including the necessary input files and parameters (this may be needed if you want to combine your own matching algorithm, with steps 3, 4 and 5 of FALDOI). Moreover, in both cases you can decide to run only some of the algorithm's steps (the python scripts set boolean variables for each step). This can be useful to avoid recomputing matches several times if you plan to run the minimization with different parameters.

C++ executables

Given a text file with the input images paths (e.g.: 'sintel_one_frame_easy.txt' in example_data) you can obtain the output flow by following the Algorithm's steps and calling the executables as follows:

Compute matches

  • With SIFT:

Computing descriptors (once per image: i0 + i1)

./sift_cli im_name0.png -nspo 15 > im_name0_descriptors.txt

Computing matches (once forward i0 => i1, once backward i0 <= i1)

./match_cli im_name0_descriptors.txt im_name1_descriptors.txt > im_name0_matches.txt
  • With DeepMatching:

Computing matches (once forward i0 => i1, once backward i0 <= i1)

./deepmatching im_name0.png im_name1.png -nt 4 -downscale 2

To see a more in detail usage of the sift_cli , match_cli (for SIFT) and deepmatching (DeepMatching) executables, visit the SIFT anatomy and DeepMatching pages or/and check their source code README.md files. In order to generate the needed binaries, you MUST download the source code from the links above and compile them as follows:

  • SIFT: download and unzip the compressed file in the desired directory. Change to the directory where the Makefile is located in a terminal. Simply run the command 'make' which should yield the sift_cli , match_cli binaries.
  • DeepMatching: download and unzip the compressed file in the desired directory. Open a terminal and change to the directory where the Makefile is located. In this case, we needed to change some lines of the original Makefile from the source code in order to successfully compile it under Linux. Just comment out the lines inside the 'ifeq ($(OS_NAME),Linux)' statement for 'LAPACKLDFLAGS=-latlas -lblas'. If this does not work, you may try the original lines as it may work in your case.

Please note that in both cases, you may need to install the required libraries (see README files from both sources for details). Also in both cases, you will need to copy the binaries to the 'build' directory (created after recompiling the project with './recompile_changes.sh' in the root folder).
To avoid having to repeat this step every time you change the code, you could just add a line to the bash script above of the type: 'cp ../precompiled_binaries/ .'* to copy them to the build directory after each new compilation.

For a simplified usage of the algorithm, you can take a look at any of the Python scripts on the following section to see some usage examples with FALDOI (only some default parameters of the matching algorithms are tweaked and most of these are fixed in the code).

NOTE: 'nspo' means the number of scales per octave; 'nt' means the number of threads, 'downscale' is the downscaling factor to apply to the original image.

Sparse flow

./sparse_flow list_matches.txt img_width img_height out_sparse_n.flo

Local faldoi

  • Without occlusions:
./local_faldoi file_paths_to_images.txt out_sparse_1.flo out_sparse_2.flo out_local.flo sim_map.tiff [options...]

or (if you have input saliency files for both images)

./local_faldoi file_paths_to_images.txt out_sparse_1.flo out_sparse_2.flo out_local.flo sim_map.tiff sal0.tiff sal1.tiff [options...]
  • With occlusions:
./local_faldoi file_paths_to_images.txt out_sparse_1.flo out_sparse_2.flo out_local.flo sim_map.tiff occ_loc.png [options...]

or (if you have input saliency files for both images)

./local_faldoi file_paths_to_images.txt out_sparse_1.flo out_sparse_2.flo out_local.flo sim_map.tiff occ_loc.png sal0.tiff sal1.tiff [options...]

options (python scripts have equivalent ones with similar names and longer explanation):

  • -m (0)    chooses the functional out of the following:
               M_TVL1        0
               M_TVL1_W       1
               M_NLTVL1        2
               M_NLTVL1_W        3
               M_TVCSAD      4
               M_TVCSAD_W      5
               M_NLTVCSAD       6
               M_NLTVCSAD_W    7
               M_TVL1_OCC    8
  • -wr (5)        radius value wr (5) - patch 2*wr + 1 x 2*wr +1 (11 x 11).
  • -p (None)       file of parameters of the optical flow energy (see function init_params in utils_preprocess.cpp for more details).
  • -loc_it (3)      number of iterations for the local minimization.
  • -max_pch_it (3)     number of iterations per patch (for each 'loc_it')
  • -split_img (0)      whether to split image into parts to boost speed.
  • -h_parts (3)      number of horizontal parts.
  • -v_parts (2)      number of vertical parts.
  • -fb_thresh (*)     forward-backward consistency check threshold(*) Def. 0.45 for SIFT, 13 for DeepMatching.
  • -partial_res (0)    whether to save all intermediate flows or not.

Global faldoi

  • Without occlusions:
./global_faldoi file_paths_to_images.txt out_local.flo out_final.flo [options...]
  • With occlusions:
./global_faldoi file_paths_to_images.txt out_local.flo out_final.flo occ_loc.png occ_final.png [options...]

options:

  • -m (0)        changes the functional (check aux_energy_model.h).
  • -w (5)        number of warpings.
  • -p (None)        file of parameters of the optical flow energy (see function init_params in utils_preprocess.cpp for more details).
  • -glb_iters (400)      number of iterations for the global minimization.
Change default optical flow energy parameters (input flag -p)

By default, the algorithm uses a set of parameters for the optical flow energy computation which were selected as those giving the best overall results in the tests done for the first publication of FALDOI (i.e.: Springer article above).
Nevertheless, one can input different values by giving the path the txt file with parameters after the flag -p. The format of such file is as follows (in parenthesis we include a definition of each parameter but take a look at the paper for further information):

lambda_value (weight of one part of the data term in prop. 3 in the IPOL article)
theta_value (coupling parameter, see IPOL article (appendix A) for more information)
tau_value (time step, see IPOL article (appendix A) proposition 1 for more details)
beta_value (smoothness/regularity weight term)
alpha_value (weights the occlusions term related to the flow magnitude (u,v))
tau_u_value (Chambolle's algorithm variable, see prop. 1 of tvl1 + occlusions*)
tau_eta_value (Primal-dual algorithm variable see prop. 3 of tvl1 + occlusions*)
tau_chi_value (Primal-dual algorithm variable, see prop. 3 of tvl1 + occlusions*)
mu_value (see proposition 2 of tvl1 with occlusions, cited below*)

* A TV-L1 Optical Flow Method with Occlusion Detection. C. Ballester, L. Garrido, V. Lazcano, V.Caselles.
You must define each value in a new line and if you want to ommit some (i.e.: use the default value) you can introduce a value <=0 (e.g.: -1). For instance, if we only want to change lambda=30 and theta=0.2. The file will look like:

30
0.2
-1
-1
-1
-1
-1
-1
-1

Take into account that the last five parameters are only used by TVL2-occ (TVL1-L2 with occlusion modeling). You can get a sense for the range of each value by taking a look at the default values in parameters.h. The default values correspond to the default energy functional, TVL1-L2. More precisely, the default values for all the implemented energy functionals are:
TVL1-L2
lambda=40, theta=0.3, tau=0.125, beta=0.025 (the rest are not used, set it to -1).
NLTVL1
lambda=2, theta=0.3, tau=0.1, beta=0.025 (the rest are not used, set it to -1).
TVCSAD
lambda=0.85, theta=0.3, tau=0.1, beta=0.025 (the rest are not used, set it to -1).
NLTV-CSAD
lambda=0.85, theta=0.3, tau=0.1, beta=0.025 (the rest are not used, set it to -1).
TVL1-L2 with occlusion estimation
lambda=23.455, theta=0.283, tau=0.1, beta=0.702, alpha=0.071, tau_u=0.074, tau_eta=0.084, tau_chi=0.134, mu=1.406 (values rounded up to 3 decimals for simplicity, see related line in parameters.h for full-precision).
Finally, the weighted versions with the suffix '_w', share the same parameters with the "standard" versions but lambda is divided by the window size squared.

Python scripts - Usage

As you saw above, calling each binary with the correct parameters and keeping track of all output files to pass them to the following step, etc. can be very convoluted. For that reason, we suggest that you try using the python scripts to simplify the process. In the directory scripts_python, you will find three main scripts that execute the whole algorithm following all the steps detailed in the Algorithm's steps section.

faldoi_sift.py

Given a text file containing the paths to the input frames, computes the optical flow estimation based on the FALDOI algorithm's energy minimisation. Matches are extracted with SIFT. Usage:

./faldoi_sift.py path2imgs.txt [options]

options:

  • -vm (0)         changes the functional (check aux_energy_model.h).
  • -wr (5)         windows radius or patch size (2*wr + 1 x 2*wr + 1). E.g.: wr=5 means a 11x11 patch size.
  • -local_iter      number of iterations for the local minimization.
  • -patch_iter      number of iterations per patch (for each 'local_iter')
  • -split_img (0)     whether to split image into parts to boost speed.
  • -h_parts (3)       number of horizontal parts.
  • -v_parts (2)      number of vertical parts.
  • -fb_thresh (*)     forward-backward consistency check threshold(*) Def. 0.45 for SIFT, 13 for DeepMatching.
  • -partial_res (0)    whether to save all intermediate flows or not.
  • -warps (5)      numbers of warps at the finest scale (global minimisation).
  • -glob_iter (400)    number of iterations for the global minimization.
  • -nsp (15)      number of scales per octave to be computed by the SIFT algorithm.
  • -res_path        path where the output files will be saved (partial and final flow, descriptors and matches). If "None", the results are stored in the Results folder.
  • energy_params (" ")   path to a txt file containing modified energy parameters (see subsection above). This parameter is equivalent to -p when directly working with the binaries.

faldoi_deep.py

Does the same as the above script but the matches are extracted with the DeepMatching algorithm instead of SIFT. Usage:

./faldoi_deep.py path2imgs.txt [options]

options:

  • -vm (0)        changes the functional (check aux_energy_model.h).
  • -wr (5)        windows radius or patch size (2*wr + 1 x 2*wr + 1). E.g.: wr=5 means a 11x11 patch size.
  • -local_iter       number of iterations for the local minimization.
  • -patch_iter       number of iterations per patch (for each 'local_iter')
  • -split_img (0)      whether to split image into parts to boost speed.
  • -h_parts (3)       number of horizontal parts.
  • -v_parts (2)      number of vertical parts.
  • -fb_thresh (*)     forward-backward consistency check threshold(*) Def. 0.45 for SIFT, 13 for DeepMatching.
  • -partial_res (0)     whether to save all intermediate flows or not.
  • -warps (5)         numbers of warps at the finest scale (global minimisation).
  • -glob_iter (400)     number of iterations for the global minimization.
  • -warps (5)       numbers of warps at the finest scale (global minimisation).
  • -th (0.45)       threshold to discard outliers from DeepMatching.
  • -nt (4)        number of CPUs used to compute DeepMatching.
  • -downscale (2)     related to the scale at which DM will work (by default, half original resolution).
  • -max_scale (sqrt(2))   maximum scale of DeepMatching.
  • -rot_plus (45)     positive rotation angle for DeepMatching.
  • -rot_minus (45)     negative rotation angle for DeepMatching.
  • -res_path         path where the output files will be saved (partial and final flow, descriptors and matches). If "None", the results are stored in the Results folder.
  • energy_params (" ")    path to a txt file containing modified energy parameters (see subsection above). This parameter is equivalent to -p when directly working with the binaries.

NOTE

faldoi_deep_occ.py

Includes the optional parameters to model occlusions (only available with the TVL1 energy functional right now). Matches are computed with Deep Matching. Usage

./fast_faldoi_occ.py file_paths_to_images.txt [options]

options: same as fast_faldoi.py (see above).

REMARKS

1. Out of the list of available functionals listed above (see C++ Executables/Local faldoi), in the code published in the IPOL journal all of them are available with the exception of the TVL1+occlusions. If you wish to test that particular function, please refer to the Github repository linked below. For more details about how the occlusions are model for FALDOI, please read the base article where the occlustion estimation model was presented.

2. The ability of working with partitions is currently only compatible with the TVL2-L1 functional (the one that benefits more from the speed-up as commented in the IPOL paper).

3. The format of the path2imgs.txt file, is the following:

path/to/frame_0001.png
path/to/frame_0002.png

For now, the paths should be absolute or relative to the current path (using the 'tilde' character does not work at the moment).

Notice that the order in the text file is important. The paths should be entered (one per line) as follows: first line: I0, second line: I1. Chronologically, they are ordered as: I0 (t) --> I1 (t+1). Since we are not modeling occlusions (see GitHub link below for the complete version), we can pass only 2 paths to frames I0 and I1. Nevertheless, specifying the paths to the 4 consecutive frames as if you were using occlusions will enable you to use the same input files if you desire to start modeling occlusions at some point.

Example of usage

With the source code, you can already run the algorithm with some sample frames stored in folders inside the example_data folder. Most of these frames are part of a bigger dataset known as MPI Sintel. In the folder, two directories have been created, each containing the first 4 frames of the same sequence ('alley_1').

The 'clean' directory contains the synthetic frames without any defussion or distortion applied to them. The second directory, 'final', applies some transformations to the original frames which makes the optical flow estimation more challenging.

If you want to run the algorithm with SIFT matches and specify your own results path, you just need to navigate to the scripts_python folder and execute the following line in your terminal:

./faldoi_sift.py path2imgs.txt -res_path ../tmp_faldoi/Experiment1/Results/

Remember to add the final slash '/' so the files are created inside the child folder (in the example, 'Results') and not in its parent directory. Finally, you may run the 'faldoi_deep.py' script in a similar fashion.

The other folders contain special cases tested throughout the algorithm's development and optimization. All the sequences with known ground truth include a subfolder called 'gt' with extra subfolders for occlusions and invalid pixel masks (so one can compute error metrics). We follow the MPI-Sintel naming nomenclature for the folders (you can find more information in Sintel's downloads' website). More images can be found by downloading the supplementary material submitted to the IPOL article (see link at the top of this file).

Bugs, issues or questions

If you encounter any bugs, issues or have any questions about this source code or the implemented algorithm, please, do not hesitate to open an issue in this repository. Alternatively, you may want to send an e-mail to the last active developer. The complete code can be found on GitHub including some extra files left out of the IPOL publication for the sake of simplicity and coherence with the accompanying article.

Developers

  • Roberto Palomares, 2014 (main dev)
  • Onofre Martorell, 2017 (MsC Thesis: migrated most of the code to C++, added occlusions estimation for the TVL2-L1 functional and fixed bugs)
  • Ferran Pérez, 2018-2019 (speed-up optimisation for IPOL, fixed some bugs with some functionals). All the work is summarised in the following article

License and copyright

This software is licensed under the BSD 3-Clause license. For details, see LICENSE.md
Copyright © 2014, Roberto P.Palomares   r.perezpalomares@gmail.com
Copyright © 2017, Onofre Martorell      onofremartorelln@gmail.com
Copyright © 2018-2019, Ferran Pérez    fperez.gamonal@gmail.com
All rights reserved.

About

An analysis and speedup of the FALDOI method for Optical Flow estimation

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published