OpenCV is great for all kinds of computer vision tasks. Many of these can run in a fully automated fashion, where parameters for the CV algorithms are provided by the user before the program begins or can be determined algorithmically at run time. Some, however, cannot.
For example, for my current project I am trying to find the optimal settings for OpenCV’s implementation of the stereo block matching algorithm. This requires computing disparity pictures, examining them visually, and deciding whether the parameters let the block matcher perform well or not. This is fairly subjective work, and it’s really annoying if you have to restart your program in order to see results with other settings or, if you’re using e.g. the Python interpreter, retype your arguments and display the window again. Of course, you could run a loop over all possible parameter combinations, but that makes it hard to experiment.
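To get a feeling for why the exhaustive loop becomes tedious, here's a minimal sketch of such a sweep. The parameter names and ranges are invented for illustration; they stand in for whatever your CV algorithm actually takes:

```python
import itertools

# Hypothetical parameter grid; the names and ranges are placeholders for
# whatever your CV algorithm actually accepts.
search_ranges = range(0, 161, 16)   # 0, 16, ..., 160 -> 11 values
window_sizes = range(5, 22, 2)      # 5, 7, ..., 21   -> 9 values

combinations = list(itertools.product(search_ranges, window_sizes))
print(len(combinations))  # 99 runs to sit through for just two parameters
```

With only two modest parameters you already have 99 disparity images to inspect, and each new parameter multiplies that number.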
In this example, I will show how to implement a GUI in Python that lets you tune settings in your program and recompute values on the fly depending on user inputs in a graphical window.
GUI background: Working with callbacks
If you’ve programmed a standalone program before, you’ve most likely written some kind of user interface. In Python, a great tool for creating command line UIs is argparse. I use it for just about everything because it takes all the work of putting together usage messages and parsing user inputs off your hands.
A command line UI is relatively easy to write even in simple scripts because you can halt the program’s progress until you’ve received the input you need. Graphical interfaces are normally a bit more complex because they don’t halt the program’s progress. Also, receiving the inputs from the program is sometimes more complicated. The principle behind most points of interaction is simple, though: You register a GUI element and pass it a callback so that it knows what to do with the input the user gives it.
A callback is very simple: it's just a function that you pass around without calling it. A very simple example could look like this:
```python
def foo():
    print("You called foo.")

def bar():
    print("You called bar.")

for callback in foo, bar:
    callback()
```
Here I have two functions, foo and bar. I make a tuple out of them containing only the function objects, and then iterate over both items in that tuple, calling each item with no arguments. Since both functions require no arguments, this works, so the output is:
```
You called foo.
You called bar.
```
It’s really that simple.
Of course, the functions could be a little more complicated, and they could also require arguments. It's therefore important to register a callback that is compatible with the element you attach it to – if the element passes an argument, your callback needs to accept an argument, and so on.
Many GUI elements ask for a callback when they’re initialized. They use that somewhere in their own internals. You should know from the documentation how many arguments they pass to the callback and what kind they are. You use this to control what happens when the user interacts with the GUI.
Of course, most of the time you write a callback, you’ll want it to do something more complex than a Hello World. Your function will probably need more information than the GUI element passes to it. There are many ways to provide the missing data: The most common, and probably the most logical, way of doing this is by using object orientation, but other possibilities include using partially applied functions and global variables. In my opinion, global variables are okay, but only if there’s a good reason for having them and they’re used read-only. As much as I think partially applied functions are cool, I have never been in a situation where they were able to solve a problem more neatly than object orientation. Perhaps it’s a question of taste.
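As a sketch of those two approaches (the class and function names here are invented for illustration, not taken from the real code): a bound method carries its object's state into the callback, and functools.partial bakes extra arguments into a plain function so it fits a one-argument calling convention.

```python
import functools

class DisparityViewer(object):
    """Toy stand-in for an object that owns GUI state; not the real tuner."""
    def __init__(self, image_name):
        self.image_name = image_name
        self.value = None

    def on_slider(self, value):
        # Bound method: ``self`` carries in everything beyond the single
        # argument the GUI element passes.
        self.value = value

viewer = DisparityViewer("left.png")
callback = viewer.on_slider      # passed without calling, as above
callback(42)
print(viewer.value)              # 42

# Partial application: bake extra arguments into a plain function so it
# matches the GUI's one-argument calling convention.
results = []

def record_slider(image_name, value):
    results.append((image_name, value))

partial_callback = functools.partial(record_slider, "left.png")
partial_callback(42)
print(results)                   # [('left.png', 42)]
```

Both callbacks can be handed to a GUI element that calls them with a single value, yet both know about the image they belong to.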
This code is from a project to develop a self-calibrating stereo camera. The code I've written so far lets you calibrate a stereo camera in an automated fashion, and you can build one from any two webcams you like. Once the stereo pair has been calibrated, the goal is to use the cameras in conjunction to produce 3D pictures. This will happen in real time, so I'm using the stereo block matching algorithm in OpenCV. It's possible to get good results with this algorithm, but only if you tune its parameters well.
Since originally writing this post, the code has changed a lot and grown more flexible, but I’m leaving the content as is. If you’d like to see the new code, check out the now-full-fledged StereoVision package, more specifically its UI utilities.
What I need is a window that shows the disparity image computed from two input images taken with the calibrated stereo camera, with the possibility of adjusting the algorithm’s three parameters: camera type, disparity search range and block size. I want the image to update automatically when a new parameter is set.
OpenCV provides all the tools needed to do this. I can create a window, show an image in it, and add sliders to it. Also, I can associate a callback with each slider that is called every time the slider is moved. The callback is called with a single argument: the slider’s new value. It looks like this, assuming you’ve already defined the variables I pass in these examples:
```python
# Create window
cv2.namedWindow(window_name)

# Show an image in the window
cv2.imshow(window_name, image)

# Add a slider
cv2.createTrackbar(slider_name, window_name, start_value, max_value, callback)
```
In my application, the result of all this looks like this:
How to make it happen
I'm going to take a step back and explain what my code does in a bit more detail here, so feel free to skip to whatever's interesting for you. If you want to do this yourself, you'll need a stereo camera that you've already calibrated.
Rectifying the stereo image pair
In order to make a 3D image, the two images you take with your stereo camera have to be rectified. This means lining up both images so that the same point in the scene can be found in both pictures by searching along a single line. If your cameras are aligned horizontally, which is the normal case, this will be a horizontal line; otherwise it will be a vertical line.
You can do this using my StereoCalibration class. It works like this:
```python
# This assumes you've already calibrated your camera and have saved the
# calibration files to disk. You can also initialize an empty calibration and
# calculate the calibration, or you can clone another calibration from one in
# memory
calibration = StereoCalibration(input_folder=my_folder)

# Now rectify two images taken with your stereo camera. The function expects
# a tuple of OpenCV Mats, which in Python are numpy arrays
rectified_pair = calibration.rectify((left_image, right_image))
```
Computing the disparity image
Now that you have a rectified image pair, you can compute the disparity between both pictures. A fast algorithm that’s implemented in OpenCV is stereo block matching. You can use it like this:
```python
# Initialize a stereo block matcher. See documentation for possible arguments
block_matcher = cv2.StereoBM()

# Compute disparity image from the left and right images
disparity = block_matcher.compute(rectified_pair[0], rectified_pair[1])

# Show normalized version of image so you can see the values
cv2.imshow(window_name, disparity / 255.)
```
Tuning the block matcher
You’ll notice pretty fast that using the algorithm in its default state will probably give you bad results. By changing the parameters you used to initialize the block matcher and trying out new combinations you can find the optimal parameters for your camera pair.
Although the goal here is to use a GUI to tune the block matcher manually, every GUI needs a good backend that works well on its own – otherwise your code ends up messy. That's why I implemented the GUI and the calibrated pair that you tune separately, keeping the interface apart from the backend's design. You can find the code for both on GitHub.
Here’s the calibrated camera pair:
```python
class CalibratedPair(webcams.StereoPair):
    """
    A stereo pair of calibrated cameras.

    Should be initialized with a context manager to ensure that the camera
    connections are closed properly.
    """
    def __init__(self, devices, calibration,
                 stereo_bm_preset=cv2.STEREO_BM_BASIC_PRESET,
                 search_range=0, window_size=5):
        """
        Initialize cameras.

        ``devices`` is an iterable of the device numbers. If you want to use
        the ``CalibratedPair`` in offline mode, pass None.
        ``calibration`` is a StereoCalibration object.
        ``stereo_bm_preset``, ``search_range`` and ``window_size`` are
        parameters for the ``block_matcher``.
        """
        if devices:
            super(CalibratedPair, self).__init__(devices)
        #: ``StereoCalibration`` object holding the camera pair's calibration.
        self.calibration = calibration
        self._bm_preset = cv2.STEREO_BM_BASIC_PRESET
        self._search_range = 0
        self._window_size = 5
        #: OpenCV camera type for ``block_matcher``
        self.stereo_bm_preset = stereo_bm_preset
        #: Number of disparities for ``block_matcher``
        self.search_range = search_range
        #: Search window size for ``block_matcher``
        self.window_size = window_size
        #: ``cv2.StereoBM`` object for block matching.
        self.block_matcher = cv2.StereoBM(self.stereo_bm_preset,
                                          self.search_range,
                                          self.window_size)

    def get_frames(self):
        """Rectify and return current frames from cameras."""
        frames = super(CalibratedPair, self).get_frames()
        return self.calibration.rectify(frames)

    def compute_disparity(self, pair):
        """
        Compute disparity from image pair (left, right).

        First, convert images to grayscale if needed. Then pass to the
        ``CalibratedPair``'s ``block_matcher`` for stereo matching.

        If you wish to visualize the image, remember to normalize it to 0-255.
        """
        gray = []
        if pair[0].ndim == 3:
            for side in pair:
                gray.append(cv2.cvtColor(side, cv2.COLOR_BGR2GRAY))
        else:
            gray = pair
        return self.block_matcher.compute(gray[0], gray[1])

    @property
    def search_range(self):
        """Number of disparities for ``block_matcher``."""
        return self._search_range

    @search_range.setter
    def search_range(self, value):
        """Set ``search_range`` to multiple of 16, replace ``block_matcher``."""
        if value == 0 or not value % 16:
            self._search_range = value
            self.replace_block_matcher()
        else:
            raise InvalidSearchRange("Search range must be a multiple of 16.")

    @property
    def window_size(self):
        """Search window size."""
        return self._window_size

    @window_size.setter
    def window_size(self, value):
        """Set search window size and update ``block_matcher``."""
        if value > 4 and value < 22 and value % 2:
            self._window_size = value
            self.replace_block_matcher()
        else:
            raise InvalidWindowSize("Window size must be an odd number between "
                                    "5 and 21 (inclusive).")

    @property
    def stereo_bm_preset(self):
        """Stereo BM preset used by ``block_matcher``."""
        return self._bm_preset

    @stereo_bm_preset.setter
    def stereo_bm_preset(self, value):
        """Set stereo BM preset and update ``block_matcher``."""
        if value in (cv2.STEREO_BM_BASIC_PRESET,
                     cv2.STEREO_BM_FISH_EYE_PRESET,
                     cv2.STEREO_BM_NARROW_PRESET):
            self._bm_preset = value
            self.replace_block_matcher()
        else:
            raise InvalidBMPreset("Stereo BM preset must be defined as "
                                  "cv2.STEREO_BM_*_PRESET.")

    def replace_block_matcher(self):
        """Replace ``block_matcher`` with current values."""
        self.block_matcher = cv2.StereoBM(preset=self._bm_preset,
                                          ndisparities=self._search_range,
                                          SADWindowSize=self._window_size)
```
The class inherits from webcams.StereoPair, which is just a class that handles dealing with two webcams as a stereo pair simultaneously. It also does some nice things like letting you take a picture with both cameras simultaneously or show their video streams live. If you use them in a with clause they will clean up after themselves and close the camera connection when they’re done.
All of this functionality is inherited from that class, with one modification that you can see in the method get_frames. This method calls the superclass’ get_frames method, collects the frames it returns, and rectifies them using the StereoCalibration object stored on the CalibratedPair. Because the inherited methods show_frames and show_videos call this method in order to get the frames they show, when you use a CalibratedPair rather than just a normal StereoPair to view camera outputs, you’ll always see the rectified images.
Non-inherited methods include replace_block_matcher, which replaces the block matcher stored on the CalibratedPair object. You’ll also notice several decorators being used in the class to do something that you don’t find in Python code too often: They act as getters and setters. They control access to variables that the class is meant to protect from the user.
The reason for this is that I don’t want the user to pass bad values to the block matcher. If the user passes bad values, OpenCV silently instantiates a new StereoBM object without throwing an error. That would lead to bad results. The decorators allow the object’s fields search_range, window_size and stereo_bm_preset to be accessed as normal variables – all they do is return their counterparts that are named with a leading underscore. Setting them is also done normally, but because I’ve used setter decorators, doing the following:
calibrated_pair.window_size = 5
actually calls the window_size setter method behind the scenes. The setter checks the input value and throws a meaningful error if the value is inappropriate.
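Stripped of the OpenCV specifics, the validate-then-store pattern can be sketched like this (a toy class with invented names, not the real CalibratedPair):

```python
class InvalidWindowSize(Exception):
    pass

class Matcher(object):
    """Minimal sketch of the validate-then-store property pattern."""
    def __init__(self):
        self._window_size = 5

    @property
    def window_size(self):
        # Reading the attribute just returns the underscored counterpart.
        return self._window_size

    @window_size.setter
    def window_size(self, value):
        # Reject values the backend would otherwise accept silently.
        if 5 <= value <= 21 and value % 2:
            self._window_size = value
        else:
            raise InvalidWindowSize("Window size must be an odd number "
                                    "between 5 and 21 (inclusive).")

matcher = Matcher()
matcher.window_size = 7       # ordinary attribute syntax, runs the setter
try:
    matcher.window_size = 6   # even value: raises InvalidWindowSize
except InvalidWindowSize:
    pass
print(matcher.window_size)    # 7 - the bad value was never stored
```

From the caller's point of view it's just attribute assignment; the validation is invisible until a bad value arrives.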
Storing the parameters for the block matcher means that they persist as part of the object, so when I change one variable, I retain the settings of the other variables without a lot of complicated bookkeeping code.
The core of the class, compute_disparity, checks the passed images to see if they are grayscale. If not, they’re converted to grayscale. Then they’re passed to the block matcher.
So if you use this class, you can just pass new parameters for the block matcher again and again, replacing it when you need to, and check the results. If you have an algorithm that can check the quality of disparity images, you don’t need a GUI – the class will work just fine for automated tuning.
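If you did have such a quality metric, automated tuning would just be a loop over the parameter grid. Here's a sketch of that idea with stand-in functions; none of these names come from the real code, and a real scorer would inspect the disparity image itself:

```python
import itertools

def tune_automatically(compute_disparity, score, search_ranges, window_sizes):
    """Try every parameter combination and return the best-scoring one."""
    best_params, best_score = None, float("-inf")
    for search_range, window_size in itertools.product(search_ranges,
                                                       window_sizes):
        disparity = compute_disparity(search_range, window_size)
        quality = score(disparity)
        if quality > best_score:
            best_params, best_score = (search_range, window_size), quality
    return best_params

# Stand-ins for demonstration: a real compute_disparity would come from a
# CalibratedPair-like object, and a real scorer would analyze the image.
def fake_disparity(search_range, window_size):
    return (search_range, window_size)

def fake_score(disparity):
    # Pretend the best parameters are search range 48 and window size 9.
    return -abs(disparity[0] - 48) - abs(disparity[1] - 9)

best = tune_automatically(fake_disparity, fake_score,
                          range(0, 161, 16), range(5, 22, 2))
print(best)  # (48, 9)
```

The GUI described next simply replaces the score function with your eyes.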
Implementing the frontend
Of course, you may not have an algorithm that can judge the quality of disparity pictures. I don't, which is why I'm stuck with my brain. To ease the parameter selection, I implemented a StereoBMTuner class, which hides the details of dealing with cv2's high-level GUI functions.
```python
class StereoBMTuner(object):
    """
    A class for tuning Stereo BM settings.

    Display a normalized disparity picture from two pictures captured with a
    ``CalibratedPair`` and allow the user to manually tune the settings for
    the stereo block matcher.
    """
    #: Window to show results in
    window_name = "Stereo BM Tuner"

    def __init__(self, calibrated_pair, image_pair):
        """Initialize tuner with a ``CalibratedPair`` and tune given pair."""
        #: Calibrated stereo pair to find Stereo BM settings for
        self.calibrated_pair = calibrated_pair
        cv2.namedWindow(self.window_name)
        cv2.createTrackbar("cam_preset", self.window_name,
                           self.calibrated_pair.stereo_bm_preset, 3,
                           self.set_bm_preset)
        cv2.createTrackbar("ndis", self.window_name,
                           self.calibrated_pair.search_range, 160,
                           self.set_search_range)
        cv2.createTrackbar("winsize", self.window_name,
                           self.calibrated_pair.window_size, 21,
                           self.set_window_size)
        #: (left, right) image pair to find disparity between
        self.pair = image_pair
        self.tune_pair(image_pair)

    def set_bm_preset(self, preset):
        """Set ``stereo_bm_preset`` and update disparity image."""
        try:
            self.calibrated_pair.stereo_bm_preset = preset
        except InvalidBMPreset:
            return
        self.update_disparity_map()

    def set_search_range(self, search_range):
        """Set ``search_range`` and update disparity image."""
        try:
            self.calibrated_pair.search_range = search_range
        except InvalidSearchRange:
            return
        self.update_disparity_map()

    def set_window_size(self, window_size):
        """Set ``window_size`` and update disparity image."""
        try:
            self.calibrated_pair.window_size = window_size
        except InvalidWindowSize:
            return
        self.update_disparity_map()

    def update_disparity_map(self):
        """Update disparity map in GUI."""
        disparity = self.calibrated_pair.compute_disparity(self.pair)
        cv2.imshow(self.window_name, disparity / 255.)
        cv2.waitKey()

    def tune_pair(self, pair):
        """Tune a pair of images."""
        self.pair = pair
        self.update_disparity_map()
```
When you instantiate an object of this class, it sets up a named window, creates three trackbars and calls the tune_pair method, which shows the disparity map and the other GUI elements. It’s important to note that the class itself stores the CalibratedPair, as well as the image pair, that it’s currently working on. This is important for the callbacks.
The callbacks registered with the trackbars receive only a single argument: the trackbar's new value. They are called every time the trackbar is moved.
Each callback is pretty simple. It receives the value from the trackbar and tries to use it to set the appropriate field on the StereoBMTuner’s CalibratedPair. If this is successful, the CalibratedPair swaps out its StereoBM object. If the passed value was inappropriate, the error is caught and nothing is done. If another error occurs, the method stops. If no error occurs, the StereoBMTuner calls its method update_disparity_map, which passes the image pair stored on the StereoBMTuner to the CalibratedPair and asks for a new disparity image. The CalibratedPair computes the disparity image with the new parameters and returns it to the StereoBMTuner, which normalizes the image and shows it in the GUI. This continues until the user presses a key on the keyboard, allowing the user to find the optimal parameters for the block matching algorithm for the image pair in question.
Putting it all together
This is all well and good – we have all the building blocks, so now we can use them to make a program. I've implemented the program so that it takes a series of pictures stored in a folder, as well as a calibration folder, and iterates through all of the images, showing their disparity maps and letting the user tune the block matcher. After the user has gone through all the images, they receive a report of which settings they chose and how often. The code looks like this:
```python
def main():
    """Let user tune all images in the input folder and report chosen values."""
    parser = argparse.ArgumentParser(description="Read images taken from a "
                                     "calibrated stereo pair, compute "
                                     "disparity maps from them and show them "
                                     "interactively to the user, allowing the "
                                     "user to tune the stereo block matcher "
                                     "settings in the GUI.")
    parser.add_argument("calibration_folder",
                        help="Directory where calibration files for the "
                             "stereo pair are stored.")
    parser.add_argument("image_folder",
                        help="Directory where input images are stored.")
    args = parser.parse_args()
    calibration = calibrate_stereo.StereoCalibration(
                                        input_folder=args.calibration_folder)
    input_files = find_files(args.image_folder)
    calibrated_pair = CalibratedPair(None, calibration)
    image_pair = [cv2.imread(image) for image in input_files[:2]]
    rectified_pair = calibration.rectify(image_pair)
    tuner = StereoBMTuner(calibrated_pair, rectified_pair)
    chosen_arguments = []
    while input_files:
        image_pair = [cv2.imread(image) for image in input_files[:2]]
        rectified_pair = calibration.rectify(image_pair)
        tuner.tune_pair(rectified_pair)
        chosen_arguments.append((calibrated_pair.stereo_bm_preset,
                                 calibrated_pair.search_range,
                                 calibrated_pair.window_size))
        input_files = input_files[2:]
    stereo_bm_presets, search_ranges, window_sizes = [], [], []
    for preset, search_range, size in chosen_arguments:
        stereo_bm_presets.append(preset)
        search_ranges.append(search_range)
        window_sizes.append(size)
    for name, values in (("Stereo BM presets", stereo_bm_presets),
                         ("Search ranges", search_ranges),
                         ("Window sizes", window_sizes)):
        report_variable(name, values)
        print()
```
As you can see, setting up the GUI is easy because all of the work is encapsulated in the classes explained above. All the main function really does is load the appropriate images, keep track of the parameters the user sets and report them at the end.
It's important to note that the complexity in all this work is pushed down to the most reusable level possible. It would have required a bit less thought to implement all of this as a single script, but it would have been at least as much work and the result would have been horrific. With a little bit of thought, each part of the program can be written fairly abstractly, making it reusable, so in the end, all that has to be implemented is exactly what this use case needs. All of the one-time logic is contained in the main function, and the repeated part of the logic is kept in a local function to keep the code clean.
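The report_variable helper that main calls isn't shown above. A minimal sketch of what it might do – counting how often each value was chosen and printing a summary – could look like this (the real implementation may differ):

```python
from collections import Counter

def report_variable(name, values):
    """Count how often each value was chosen and print a summary.

    Hypothetical sketch of the helper used in ``main``; the real code
    may format its report differently.
    """
    counts = Counter(values)
    print("{}:".format(name))
    for value, count in counts.most_common():
        print("  {} chosen {} time(s)".format(value, count))
    return counts

counts = report_variable("Window sizes", [5, 9, 9, 11, 9])
```

Putting the counting and printing in one helper is what lets main call it three times, once per parameter, without repeating itself.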
So there you have it – a GUI with only OpenCV as a dependency with simple user interaction that sets off complex action deeper within the program.