Unit Testing

File Structure

The Azure IoT Python libraries we maintain are covered by unittests. These unittests are maintained in a package called tests, located directly inside each top-level package directory in the Azure IoT Python mono-repository. The test structure should mostly mirror the structure of the packages under test (in this example, azure.iot.device).

e.g.

azure-iot-device
+-- azure
|   +-- iot
|       +-- device
|           +-- iothub
|           |   +-- sync_clients.py
|           |   +-- sync_inbox.py
|           |   +-- ...
|           +-- provisioning
|               +-- registration_client.py
|               +-- registration_client_factory.py
|               +-- ...
+-- tests
    +-- iothub
    |   +-- test_sync_clients.py
    |   +-- test_sync_inbox.py
    |   +-- ...
    +-- provisioning
        +-- test_registration_client.py
        +-- test_registration_client_factory.py
        +-- ...

Tooling

We use pytest as our testing framework, along with the extensions pytest-mock, pytest-asyncio, and pytest-testdox.
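For local setup, the tools can be installed and invoked roughly as follows (a sketch; the exact dev requirements and pinned versions live in the repository itself):

# Install the test framework and its extensions
pip install pytest pytest-mock pytest-asyncio pytest-testdox

# Run a package's test suite with testdox-style output
# (the --testdox flag is provided by pytest-testdox)
pytest --testdox tests/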

Test Syntax and Semantics

We use features of pytest-testdox to allow us to tailor the testing output to our liking with the use of the decorators pytest.mark.describe and pytest.mark.it.

pytest.mark.describe

This decorator is used on a test class to "describe" the unit that is under test. The description will be a noun, written with a specific syntax depending on the kind of unit under test:

| Unit under test | Syntax |
| --- | --- |
| Function | .<function name>() |
| Function called under a certain condition | .<function name>() -- <condition> |
| Class | <class name> |
| Specific aspect of a class | <class name> - <aspect> |
| Class response to an external event | <class name> - OCCURRENCE: <event name> |
| Method in a class | <class name> - .<method name>() |
| Method in a class called under a certain condition | <class name> - .<method name>() -- <condition> |
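Applied in code, the describe strings for a few of these rows might look like the following (FooClient and the conditions are illustrative names):

import pytest

@pytest.mark.describe(".connect()")                        # a function
class TestConnect(object):
    pass

@pytest.mark.describe("FooClient - Instantiation")         # a specific aspect of a class
class TestFooClientInstantiation(object):
    pass

@pytest.mark.describe("FooClient - OCCURRENCE: Connection Lost")  # response to an external event
class TestFooClientConnectionLost(object):
    pass

@pytest.mark.describe("FooClient - .connect() -- connection retry enabled")  # method + condition
class TestFooClientConnectWithRetry(object):
    pass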

pytest.mark.it

This decorator is used for a test function within a test class to provide the requirement that is being tested for the unit. This will be a sentence fragment, in indicative-mood third-person present tense, where the subject is implied as the noun defined with pytest.mark.describe.

Or, more plainly: write it as if the sentence begins with "it", and describe what "it" does, where "it" refers to the unit under test, i.e. what you wrote in pytest.mark.describe.

@pytest.mark.describe(".foo()")
class TestFoo(object):
    
    @pytest.mark.it("Returns the string 'foo'")
    def test_returns_foo(self):
        assert foo() == "foo"

While pytest.mark.it should spell out the requirement under test, the test name itself should describe what the test does. The two are generally (but not always) quite similar.

Test Structure - What is a unit?

The goal of unittests is not only to verify the functionality of our codebase, but also to serve as a "requirements document" outlining the exact expected functionality. As a result, the idea is to structure tests around clear and verifiable "requirements" per unit under test.

In order to effectively do this, significant consideration must be given to how units under test are defined. Consider the following simple module:

def foo():
    return "foo"

def bar():
    return "bar"

There are two units here - foo and bar are totally independent functions with independent requirements. Thus there should be two test classes, one for each unit. And since each function does only one thing, each has only a single requirement.

@pytest.mark.describe(".foo()")
class TestFoo(object):
    @pytest.mark.it("Returns the string 'foo'")
    def returns_foo(self):
        assert foo() == "foo"

@pytest.mark.describe(".bar()")
class TestBar(object):
    @pytest.mark.it("Returns the string 'bar'")
    def returns_bar(self):
        assert bar() == "bar"

Now consider the following class definition:

class FooClient(object):

    def __init__(self):
        self.bar = "buzz"
        self.io = IOLibrary()
        self.io.tls_type = ssl.PROTOCOL_TLSv1_2
        self.io.verify_mode = ssl.CERT_REQUIRED
        self.io.check_hostname = True

    def connect(self):
        self.io.connect()

    def send_message(self, message, qos):
        self.io.send(message, qos)

While it may be tempting to say that the entire class is the unit under test, that would not be a good design here, since the class does many different independent things. After all, when you test the ability to connect, are you really testing the FooClient instance itself, or are you actually testing the .connect() method?

@pytest.mark.describe("FooClient - Instantiation")
class TestFooInstantiation(object):
    @pytest.mark.it("Sets the .bar instance attribute to 'buzz'")
    def test_sets_bar(self):
        foo = FooClient()
        assert foo.bar == "buzz"

    @pytest.mark.it("Configures the IO library")
    def test_configures_io(self):
        foo = FooClient()
        assert isinstance(foo.io, IOLibrary)
        assert foo.io.tls_type == ssl.PROTOCOL_TLSv1_2
        assert foo.io.verify_mode == ssl.CERT_REQUIRED
        assert foo.io.check_hostname is True

@pytest.mark.describe("Foo - .connect()")
class TestFooConnect(object):
    @pytest.mark.it("Connects via the IO library")
    def test_connect_via_io(self, mocker):
        foo = FooClient()
        io_mock = mocker.patch(foo, "io")
        foo.connect()

        assert io_mock.connect.call_count == 1

@pytest.mark.describe("Foo - .send_message()")
class TestFooSendMessage(object):
    @pytest.mark.it("Sends a message via the IO library")
    @pytest.mark.parametrize(
        "qos",
        [pytest.param(0, id="QoS 0"), pytest.param(1, id="QoS 1"), pytest.param(2, id="QoS 2")],
    )
    def test_sends_via_io(self, mocker, qos):
        foo = FooClient()
        io_mock = mocker.patch(foo, "io")
        message = "my_message"

        foo.send_message(message, qos)
        assert io_mock.send_message.call_count == 1
        assert io_mock.send_message.call_args == mocker.call(message)

Note that the instantiation is broken up into discrete logical requirements rather than testing everything in a single test. This creates the most informative output.

FooClient - Instantiation
 [x] Sets the .bar instance attribute to 'buzz'
 [x] Configures the IO library

Note also that the .send_message() method is parametrized to cover a variety of QoS inputs.

FooClient - .send_message()
 [x] Sends a message via the IO library[QoS 0]
 [x] Sends a message via the IO library[QoS 1]
 [x] Sends a message via the IO library[QoS 2]

There should be a test for every defined behavior of the class or module under test so that our unittest output can serve as a complete requirements document which spells out the exact functionality.

Make sure to test:

  • Optional arguments
  • Bad inputs that raise errors
  • Different scenarios/configurations
  • Handling errors thrown by dependencies
  • Callbacks are resolved
  • etc.

Remember, every requirement is a test!
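For instance, a bad-input requirement from the list above might look like this sketch (it assumes a hypothetical version of FooClient that validates QoS and raises a ValueError):

import pytest

@pytest.mark.describe("FooClient - .send_message()")
class TestFooSendMessageBadInput(object):
    @pytest.mark.it("Raises a ValueError if the QoS is invalid")
    @pytest.mark.parametrize(
        "qos", [pytest.param(-1, id="QoS < 0"), pytest.param(3, id="QoS > 2")]
    )
    def test_raises_on_bad_qos(self, qos):
        foo = FooClient()
        # Assumes send_message() validates QoS (hypothetical behavior)
        with pytest.raises(ValueError):
            foo.send_message("my_message", qos)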

Depending on the code being tested, your approach to dividing into units may be different! Check out some further examples of different units you can use, and when to use them.

If you start noticing it's hard to phrase the pytest.mark.it requirements, you may need to revisit the way you have divided your units up.

What needs to be tested?

Generally, test everything in a module except convention-private helper methods/functions (i.e. those that begin with an underscore). These will be implicitly tested via the functions/methods that rely upon them. If a helper method is sufficiently complex that it seems like it should be tested separately, this likely suggests a deficiency in the design.
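As a quick illustration (the module and names are hypothetical), a convention-private helper is exercised only through its public caller:

import pytest

def _format_payload(message):
    # Convention-private helper: no test class of its own
    return "payload: " + message

def send(message):
    return _format_payload(message)

@pytest.mark.describe(".send()")
class TestSend(object):
    @pytest.mark.it("Formats the message as a payload")
    def test_formats_message(self):
        # _format_payload is covered implicitly via .send()
        assert send("hi") == "payload: hi"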

We do not consider the idea of a "helper class" to exist in the same way. If something is sufficiently complex to be its own class within the module, it should probably be tested separately.

Note also that callbacks and handlers must ALWAYS be covered, no matter how they are defined (convention-private definition, inner-function definition, etc.). Depending on the code, there may be different approaches to handling this.
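One straightforward approach is pytest-mock's mocker.stub(), which can stand in for a user-supplied callback so the test can assert it was invoked. The sketch below assumes a hypothetical .connect() that accepts and invokes a callback:

@pytest.mark.describe("FooClient - .connect()")
class TestFooConnectCallback(object):
    @pytest.mark.it("Invokes the provided callback upon connection completion")
    def test_calls_callback(self, mocker):
        foo = FooClient()
        mocker.patch.object(foo, "io")            # keep the I/O layer from actually running
        callback = mocker.stub(name="connect_callback")

        foo.connect(callback=callback)            # hypothetical callback-accepting signature
        assert callback.call_count == 1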

Testing for bad inputs is much more important at the user-facing API surface, and should be more robust there, whereas deeper in the stack, adherence to the docstring type contract can generally be assumed. Special care should, however, be paid to bad inputs and error handling between architectural layers.

Use of mocks

As previously mentioned, one of the extensions we use for our testing is pytest-mock, which provides its functionality via the mocker fixture, usable in a test like any other fixture. We do NOT use the unittest.mock package directly. This is because mocker provides access to the contents of unittest.mock in a way that is more agreeable to pytest syntax - using unittest.mock as well would mean having two different references to the same package, potentially at different versions, leading to unexpected behavior.

Since, in Python, unit testing is something of a unit/integration testing hybrid, usage of mocks should be limited to places where it is strictly necessary or desirable from a design standpoint - for instance, mocking I/O, or a lower level of the product architecture. Generally, do not mock objects, methods, or functions within the same level of architecture without a positive reason. Instead, use mocker.spy or mocker.stub wherever you can.

class FooClient(object):

    def __init__(self):
        self.httpclient = HttpClient("http://myurl")
        self.valuemanager = ValueManager()

    def boo(self, val):
        self.valuemanager.set(val)

    def bar(self):
        return self.valuemanager.get()

    def buzz(self):
        self.httpclient.send("buzz")
@pytest.mark.describe("FooClient - .bar()")
class TestFooClientBar(object):
    @pytest.mark.it("Returns stored value upon call to .bar method")
    def test_returns_next_primate(self, mocker):
        foo = FooClient()
        stored_value = 3
        foo.boo(stored_value)
        # No need to mock the prime_generator, let it operate as normal
        mocker.spy(foo.valuemanager, "get")

        res = foo.bar()
        assert res == stored_value
        assert foo.valuemanager.get.call_count == 1

@pytest.mark.describe("FooClient - .buzz()")
class TestFooClientBuzz()
    @pytest.mark.it("Sends an HTTP request upon call to .buzz method")
    def test_makes_http_request(self, mocker):
        foo = FooClient()
        # Mock the I/O layer of the stack so it does not actually run
        foo.httpclient = mocker.MagicMock()

        foo.bar()
        assert foo.httpclient.send.call_count == 1

Note that testing with mocks on asynchronous code is tricky.
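As a brief sketch (assuming Python 3.8+ and a hypothetical async variant of FooClient), coroutine methods can be mocked with an AsyncMock so that awaiting them works, with pytest-asyncio driving the event loop:

import pytest

@pytest.mark.describe("FooClient - .connect()")
class TestAsyncFooConnect(object):
    @pytest.mark.asyncio   # provided by pytest-asyncio
    @pytest.mark.it("Connects via the IO library")
    async def test_connect_via_io(self, mocker):
        foo = FooClient()
        # A plain MagicMock returns a non-awaitable; AsyncMock returns an awaitable
        foo.io = mocker.AsyncMock()

        await foo.connect()
        assert foo.io.connect.await_count == 1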

Organizing Tests: More art than science

As mentioned above, the goal of our testing is not just to have a suite of tests that gives us confidence in our product, but also to produce testing output that reads as a legible requirements document. Because of this constraint, there is sometimes less leeway to get creative with test organization and structure. Readability of testing output will always take precedence over the beauty of the test code!

Nonetheless, we still want to optimize test code to avoid repetition wherever possible, and to keep the overall testing code as slim as possible. Depending on the complexity of the objects and methods under test, there are various approaches, of varying complexity, that can be taken. Some are outlined in other articles within this wiki.

The idea here is that, in order to achieve efficient test code that is also highly readable, one must approach testing as more of an "art" than a "science". There is no one-size-fits-all solution, and different approaches will have to be taken to optimally test different parts of the stack.

Advanced Patterns

Over time, certain patterns for more complex scenarios have emerged, and have been documented in this wiki: