Hyper-parameter tuner (for computer vision and reinforcement learning)

akbinod.Tuner

Binod Purushothaman : binod@gatech.edu/ak.binod@gmail.com
Georgia Tech CS-6476: Spring 2022

ThetaUI

Introduction

If you're studying Computer Vision, or Reinforcement Learning, parameter tuning is probably causing you some angst. The goal is to give you a python execution harness for parameter tuning that's easy to use and minimally disruptive to your CV code. You get to skip learning about openCV's HighGUI and Trackbars APIs, and focus instead on the joys of particle filtering. Here's a 5 minute introduction to the essentials (the code for this example is in `example.py`).

Getting started...

Importing this component and copying 3 lines into your code will get you the tuner ThetaUI.


This function and its invocation...

def find_circle(image, radius):

    # your implementation

    return results

if __name__ == "__main__":
    find_circle(image, 42)

... hooked up to ThetaUI, become:

#new import
from TunedFunction import TunedFunction

#new decorator, and a 'tuner' param
@TunedFunction()
def find_circle(image, radius, tuner=None):

    # your implementation

    #new line of code to display an updated image in ThetaUI
    if tuner is not None: tuner.image = updated_image

    return results

Your (unchanged) invocation from 'main' now shows ThetaUI: a python tkinter GUI with a spinbox called 'radius' that ranges from 0 to 42.

  • the window title shows the name of your tuned function,
  • various parts of the status bar tell you:
    • the image title (when you pass in a file name),
    • the frame number (when you pass in a carousel of images),
    • code timing in h, m, s and process time,
    • whether the image displayed was sampled/interpolated for this display,
    • whether there were exceptions during execution (click when red to view exceptions).
  • the menus let you traverse the carousel, start a grid search, save results and images, etc.
  • on the left, the tree shows you json representing your invocation: args and results.
  • the picture is your image once you are done processing it
    • typically this shows the last couple of images you've specified; that number is configurable

Each time you change a parameter, ThetaUI calls your code find_circle() with a new value for 'radius'.


And that, folks, is pretty much it. Here's a good stopping point; try this out on your CV code.

There's more to ThetaUI, like:

  • it runs a systematic grid search over the space of your args (exhausts the search space),
  • tagging args (note when theta is cold/warm/on-the-money),
  • json serialization of invocation trees


So... read on...

@TunedFunction() Decorator

Although you do give up some flexibility, compared to explicitly instantiating and configuring Tuner, just decorating your function is the quickest way of getting started.

Usage

  1. Decorate the function you want to tune (referred to as target) with @TunedFunction(), and add a 'tuner' param to its signature. (Note: there should be no other decorator on target.)
  2. Begin tuning by calling target. @TunedFunction creates an instance of ThetaUI (passed to target via the tuner param). You are now in the tuning loop:
    • Switch to the Tuner GUI and adjust the trackbars.
    • Tuner will invoke your function on each change made to a trackbar.
    • Set tuner.image to the processed image from within target. This refreshes the display in Tuner's GUI.
  3. End your tuning session by pressing Esc (or any non-function key).

To restore normal operation of your function, comment out or delete the @TunedFunction() decorator.

Tracked/Pinned Parameters, or What is tuned?

Positional and keyword parameters (not varargs or varkwargs) in your function signature are candidates for tuning. If your launch call passes an int, boolean, list or dict to any of these, then that parameter is tuned; the others are passed through to your function unchanged (i.e., they are automatically "pinned"). Images, for example, can't be tuned, so np.ndarray arguments are passed through to your function unchanged. Tuples of 3 ints also work, and are interpreted in a special way.

If you want to skip tuning (aka "pin") some parameters in your target's signature, you have the following options; choose what works best for your workflow:

  • Set a default value for the parameter you want to pin and drop the argument from your launch call. A param is not tuned if an arg is not passed to it from your launch call.
  • When the argument passed to the target function is of a type that Tuner does not handle, the value is passed to target unchanged (pinned).

  • It's the type of the argument passed in your launch call that drives Tuner behavior, not the annotation on the parameters.
    #image is passed through, radius is tuned - min 0, max 50
    find_circle(image, radius=50)
    #same as above
    find_circle(image, 50)
    #radius is tuned with values ranging between 20 and 50
    find_circle(image, (50, 20))
    #radius is tuned and the slider selects among [10, 50, 90]
    find_circle(image, [10, 50, 90])
    #radius is tuned and target receives one of [10, 50, 90]
    #The difference is that Tuner GUI displays "small", "med", "large"
    j = {"small": 10, "med": 50, "large": 90}
    find_circle(image, radius=j)
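The first pinning option above (set a default and drop the argument) isn't shown in those examples, so here is a small sketch of it; the function and values are illustrative, and the comments describe what Tuner would do rather than code that Tuner executes.

```python
# Hypothetical sketch of the pinning rule: a param with a default that
# receives no argument from the launch call is pinned, not tuned.
def find_circle(image, radius=21, tuner=None):
    # your implementation would go here
    return radius

image = None  # stand-in for an np.ndarray, which Tuner always passes through

# Launch call omits 'radius': Tuner would leave it pinned at 21.
pinned = find_circle(image)          # -> 21
# Launch call passes an int: Tuner would tune 'radius' over 0..50.
tuned = find_circle(image, 50)       # -> 50 on this plain call
```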
    
    

    When an argument in the launch call is:
    • int: the trackbar's max is set to the int value passed in.
    • tuple: the trackbar's (max, min, default) are taken from the tuple.
    • boolean: the trackbar will have two settings, 0 and 1, which correspond to False and True. The default value is whatever you have passed in. Tuner will call target with one of False or True depending on the trackbar selection.
    • list: this is a good way to specify non-int values of some manageable length. Strings, floats and tuples all go in lists.
      • The trackbar will have as many ticks as there are items in the list.
      • Changing the trackbar selects the corresponding item from the list.
      • The argument passed to target is the list item. E.g., when your launch call passes ['dog','cat','donut'] to the radius parameter, Tuner will:
        • create a trackbar with 3 positions.
        • call target passing one of ['dog','cat','donut'] to radius - whichever you've selected with the trackbar.

      Trivially, [(3,3), (5,5), (7,7)] is a list you might use for tuning the ksize parameter of cv2.GaussianBlur()

    • dict or json object: very similar to list above. obj[key] is passed in the arg to target.

      Consider the definitions below (Python rather than strict json, since they contain tuples). Passing that dict to a parameter would create a trackbar that switches amongst "gs", "gs_blur" and "gs_blur_edge". When target is invoked in a tuning call, the argument passed in is the dict corresponding to the selected key.

      	preprocessing_defs = {
            "gs": {
                "img_mode": "grayscale"
                , "blur": {"apply": False}
                , "edge": {"detect": False}
            }
            , "gs_blur": {
                "img_mode": "grayscale"
                , "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2}
                , "edge": {"detect": False}
            }
            , "gs_blur_edge": {
                "img_mode": "grayscale"
                , "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2}
                , "edge": {"detect": True, "threshold1": 150, "threshold2": 100, "apertureSize": 5}
            }
        }
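Purely for illustration, here is how a target might consume the dict that the trackbar selects; `preprocess` and its step list are hypothetical, and only the shape of the argument mirrors the definitions above.

```python
# Hypothetical target consuming the dict selected by the trackbar.
# The key names mirror the definitions above; the Tuner plumbing is
# omitted - this only shows the shape of the argument passed in.
preprocessing_defs = {
    "gs": {
        "img_mode": "grayscale",
        "blur": {"apply": False},
        "edge": {"detect": False},
    },
    "gs_blur": {
        "img_mode": "grayscale",
        "blur": {"apply": True, "ksize": (5, 5), "sigmaX": 2},
        "edge": {"detect": False},
    },
}

def preprocess(image, cfg, tuner=None):
    # build a list of the steps this configuration asks for
    steps = ["to_" + cfg["img_mode"]]
    if cfg["blur"]["apply"]:
        steps.append("blur%s" % (cfg["blur"]["ksize"],))
    if cfg["edge"]["detect"]:
        steps.append("edges")
    return steps

# When the trackbar sits on "gs_blur", Tuner passes preprocessing_defs["gs_blur"]:
print(preprocess(None, preprocessing_defs["gs_blur"]))  # -> ['to_grayscale', 'blur(5, 5)']
```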

    ThetaUI Menu

    • F1 : runs a grid search
    • F2 : saves the image
    • F3 : saves your Invocation Tree
    • F8 - F10 : tags and saves your Invocation Tree (see below).

    Saving Invocation Trees

    The basic idea behind Tuner is:
    1. ...hook up Tuner and invoke your function to tune it
    2. ...save your observations (tags) along with theta
    3. ...and finally, come back and analyse the Invocation Tree saved to your output file to narrow in on your ideal theta

    Saving behavior is determined principally by a couple of statics in TunerConfig.

    TunerConfig.output_dir: by default this is set to `./wip`. Change this before you use the other functions of Tuner.

    TunerConfig.save_style: This should be set to some valid combination of the flags found in `constants.SaveStyles`. The default is to overwrite the contents of the output file on each run, and to only save when explicitly asked to.

    The following are always tracked, although only serialized to file under certain circumstances:

    • args: The set of args to an invocation.
    • results: This could be explicitly set by your code, like so: tuner.results = .... If you do not set this value, tuner captures the values returned by target and saves them, as long as they are json serializable.
    • errored: Whether an error took place during target invocation.
    • error: These are execution errors encountered during target invocation. BTW, the most recent call is first in this formatted list, not last as you would expect from typical python output.
    • [insert your tag here]: A complete list of all the custom tags, with each value set to False unless you explicitly tag the invocation, in which case the particular tag(s) are set to True.
    An invocation is serialized to the output file when:
    • You explicitly save - F3.
    • You tag an invocation.
    • An exception was encountered during the invocation.
    The name of the output file begins with the name of the function being tuned; within the file, this is approximately the tree structure:
    • The title of the image from your carousel (see explicit instantiation below), defaulting to 'frame'
      • The invocation key (what you see is the md5 hash of theta)
        • args (contains each element of theta)
        • results (contains the saved or captured results of target)
        • the custom tags that you set up in Tuner GUI, defaulting to False
        • errored
        • errors (contains your execution exceptions)
    The purpose of tuning is to find args that work for the task at hand. It might be a somewhat lengthy process, and this feature lets you tag some theta with a word that you can search for in the output file. I like using 'avoid', 'exact' and 'close', the defaults you see in the UI. You could customize this. Modify constants.py and update the `Tags` enum. Code comments there will explain your options. Pick a scheme that works for you, and stick with it. I'd recommend something like glom or jsonpath-ng to search the saved invocation tree.
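As a sketch of that search step, the snippet below walks an invocation tree shaped roughly like the structure described above and collects every theta carrying a given tag; the sample data and the helper are hypothetical, and the real file layout may differ in detail (glom or jsonpath-ng would do this more robustly).

```python
# Hedged sketch: search a saved invocation tree (frame -> invocation key
# -> args/results/tags) for thetas tagged with a given word.
saved = {
    "frame": {
        "a1b2": {"args": {"radius": 40}, "close": True,  "avoid": False},
        "c3d4": {"args": {"radius": 90}, "close": False, "avoid": True},
    }
}

def thetas_with_tag(tree, tag):
    hits = []
    for frame in tree.values():
        for invocation in frame.values():
            if invocation.get(tag):
                hits.append(invocation["args"])
    return hits

print(thetas_with_tag(saved, "close"))  # -> [{'radius': 40}]
```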

    Grid Search

    If you are not a "parameter whisperer", you're going to turn to brute force tuning at some point; I did. So, with 3 params, each of which could take 5 values, you're likely to be annoyed by the process, and more likely to make a costly mistake. The worst of tuning, for me, is the prospect of missing the "right set of args", thanks to NOT clicking through the various settings methodically. Fortunately, there's code for that.


    This feature runs through a cartesian product of the parameter values you have set up. target is invoked with each theta, and Tuner waits indefinitely for your input before it proceeds to the next theta.
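Conceptually, the search space is just `itertools.product` over the tracked values; this sketch only enumerates the thetas - Tuner itself drives the actual invocations through the GUI, pausing after each one.

```python
from itertools import product

# Each combination of tracked parameter values is one theta; grid search
# invokes target once per theta. Values here are illustrative.
radius_values = [10, 50, 90]
blur_values = [True, False]

thetas = list(product(radius_values, blur_values))
print(len(thetas))  # 3 * 2 = 6 invocations of target
```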

    Here's my workflow:

    1. I start with a small range of inputs, and let Tuner search through that space.
    2. When Tuner waits for input, I tag the current set of args (e.g., 'avoid' or 'close'); or just 'press any key'. I can also hit Esc to cancel the grid search.
    3. After I've run through the cart (cartesian product of all arguments), I query the (json) output file to find my theta, or something close.
    With explicit instantiation (i.e., using ThetaUI rather than @TunedFunction), I can set how long Tuner waits, etc. I typically first run through the search space with a 40ms delay to determine if I'm "in the ball-park". If it looks like the answer or something close to it is in there, I then run through it again with a full second delay, and tag what I find interesting. If I don't find anything close in my first attempt, I open up the search space some (expand the range of values for the args).

    This is about as much code as I can give you without running afoul of the GA Tech Honor Code. We can spitball some ideas to help you get more value out of the data that's captured if you follow the "Search-Inspect-Tag" workflow I've outlined above.

    1. If you find a number of 'close' thetas, build a histogram of the various args to EACH param, using only thetas that are 'close'. That should highlight a useful arg to that param :)
    2. Implement a Kalman Filter to help you narrow the grid search.
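Suggestion 1 above takes only a few lines of stdlib Python; the 'close' thetas here are made up - you would pull yours from the output file.

```python
from collections import Counter

# Build a per-parameter histogram over the thetas tagged 'close'.
close_thetas = [
    {"radius": 40, "blur": True},
    {"radius": 40, "blur": False},
    {"radius": 45, "blur": True},
]

histograms = {}
for theta in close_thetas:
    for param, value in theta.items():
        histograms.setdefault(param, Counter())[value] += 1

# The most common value for each param is a good candidate arg.
print(histograms["radius"].most_common(1))  # -> [(40, 2)]
```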

    Here's another good stopping point. Read on for more fine grained control.

    ThetaUI Class/Explicit Instantiation

    With explicit instantiation, you give up the convenience of automatic trackbar GUI configuration, but gain more control over features. If you like the UX of @TunedFunction, see the benefits section below to determine if it's worth it to wade through the rest of this.

    Instead of TunedFunction, you import ThetaUI and TunerConfig. ThetaUI is the facade you work with. You could ignore TunerConfig if the default settings (e.g. when and where to save) work for you.

    Workflow:

    1. import ThetaUI
    2. Instantiate tuner, choosing between one and two functions to watch : main and downstream.
      • Each func must accept a tuner param with the default value of None...
    3. Make calls to tuner.track(), track_boolean(), track_list() or track_dict() to define tracked/tuned parameters to main
    4. Make a call to tuner.begin(), or to tuner.grid_search(). Each of these calls accepts a carousel. You do not use a launch call, as you did with TunedFunction().
      • This launches tuner, and then, as usual, each change to a trackbar results in a tuning call to target.
      • Tuner passes args to formal parameters which match by name to a tracked parameter.
      • All tracked parameters are also accessible off tuner. E.g., tuner.radius. This enables you to tune variables that are not part of the formal arguments to your function. Wondering if you should set reshape=True in a call to cv2.resize()? Just add a tracked parameter for that (without adding a parameter to your function), and access its value off tuner. The idea is to keep your function signature the same as what the auto-grader would expect - minimizing those 1:00am exceptions that fill one with such bonhomie. These args are also accessible as a dict via tuner.args.
    5. set tuner.image to the processed image before you return...
    6. optionally - set tuner.results to something that is json serializable before you return.

    You cannot mix Tuner with partials and decorators (things blow up unpredictably) - just the func please.

    Watching Downstream Functions

    You could have two distinct functions called by Tuner - main (called first) and downstream (called after main).
    • There's only one set of trackbars - these appear on main's window.
    • Args (other than tuner) are not curried into downstream, so set defaults.
    • When downstream accesses tuner.image, it gets a fresh copy of the current image being processed. To get the image processed by main, access tuner.main_image.
    • tuner.image and tuner.results set from main are displayed in the main window (the one with the trackbars).
    • tuner.image and tuner.results set in downstream are displayed in the downstream window which does not have trackbars. Usually, the downstream image obscures the main one on first show; you'll need to move it out of the way.
    • Tuner will save images separately on F2, but will combine the results of both, along with args (tuned parameters) and writes it to one json file when you press F3. Remember to keep your results compatible with json serialization.
    Carousels

    A carousel is a group of images that you want tuner to deal with as a set. You typically do this to find parameters that work well across all images in the set. You set up a list of images (full path names) and specify the parameters they need to be passed to.
    • Use the helper call tuner.carousel_from_images() to set up a carousel. This takes 2 lists.
      • The first is the list of names of parameters in target that take images. target might work with multiple images, and this list is where you specify the name of each parameter that expects an image.
      • The second is a list of image files (full path name). Each element of this list should be a tuple of file names.
        • If target works with 2 images, then each element of this second list must be a tuple of two image paths.
        • If it works with three images, then each element must be a tuple of three image paths, et cetera.
    • When Tuner is aware of image files, it uses the file name in ThetaUI's window title (instead of just 'frame').
    • You can specify openCV imread codes to be used when reading files.
    • A video file can be used as a frame generator [untested as of April 2021]
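A minimal sketch of the two lists described above; the paths and parameter names are invented, and the commented-out calls assume the described behavior rather than confirming exact signatures (see the docstrings for those).

```python
# Two target params receive images, so each carousel entry is a 2-tuple
# of file paths. Paths here are purely illustrative.
image_params = ["image_a", "image_b"]
carousel = [
    ("./data/pair1_left.png", "./data/pair1_right.png"),
    ("./data/pair2_left.png", "./data/pair2_right.png"),
]

# Each tuple must carry one path per image parameter:
assert all(len(t) == len(image_params) for t in carousel)

# tuner.carousel_from_images(image_params, carousel)  # then tuner.begin(...)
```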

    Why bother with explicit instantiation

    • Being able to tune hyper-parameters, or other control variables, without having them be parameters to your function. This keeps your signature what your auto-grader expects. Once ascertained, you should remove these from Tuner.
    • Process a carousel of images, remembering settings between images.
    • Insert a thumbnail into the main image (set tuner.thumbnail before you set tuner.image). This is useful, e.g., when you are matching templates. You could do this with @TunedFunction() as well.
    • View the results of two processes in side by side windows. A few use cases for side-by-side comparison of images:
      • Show your pre-processing output in main; and traffic sign identification output in downstream.
      • match_template() output in one vs. harris_corners() output in the other.
      • What your noble code found, vs. what the built in CV functions found (I find this view particularly revealing, also, character building).
    • Controlling aspects of tuner.grid_search(). Please see the docstrings for more information.
    • You get to control whether the GUI returns list items vs list indices; keys vs dict objects etc.
    • You get to create tuners by spec'ing them in json.
    • Finally, as anyone who has written a Decorator knows, things can get squirrelly when exceptions take place within a partial... you could avoid that whole mess with explicit instantiation of ThetaUI.

    Apart from the few differences above, ThetaUI and TunedFunction() will give you pretty much the same UX. If none of the above are dealbreakers for you, stick with the decorator.

    OpenCV GUI

    Your experience of this GUI is going to be determined by the version of various components - OpenCV, and the Qt backend. Tuner does take advantage of a couple of the features of the Qt backend, but those are guarded in try blocks, so you shouldn't bomb. If you're in CS-6476, you've installed opencv-contrib-python. If not, might I suggest...

    If you don't see the status bar in Tuner GUI, you are missing opencv-contrib-python. If you don't see the overlay menu, you are missing the Qt backend.

    Important Safety Tips

    The accompanying example.py illustrates some uses. Refer to the docstrings for ThetaUI's interface for details. Play around, and let me know if you think of ways to improve this.

    I've debugged this thing extensively, but I haven't had the time to bullet proof it. It will behave if your arguments are well behaved; but caveat emptor...

    Arguments curried into your functions follow usual call semantics, so modifying those args will have the usual side effects. Accessing tuner.image always gives you a fresh copy of the image - but this is the exception.
    (tldr: work on a copy of the image parameter - not directly on it, or else side effects will accumulate...)
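To make the tldr concrete, here's a hedged sketch with numpy; `find_circle` and the mutation are illustrative, the point is only the .copy() on the first line.

```python
import numpy as np

# Per the warning above: mutate a copy, not the curried-in argument,
# so side effects don't leak across tuning invocations.
def find_circle(image, radius, tuner=None):
    work = image.copy()          # side effects stay local to this call
    work[work > radius] = 0      # illustrative mutation
    return work

original = np.array([10, 60, 20])
result = find_circle(original, 50)
print(original.tolist())  # -> [10, 60, 20]  (unchanged between calls)
print(result.tolist())    # -> [10, 0, 20]
```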

    Don't forget to remove the @TunedFunction() decorator; the auto-grader won't much care for it :)

    Licensing

    It's only licensed the way it is to prevent commercial trolling. For all other purposes...

    Fork it, make something beautiful.
