Continuous Integration Testing of AI Models on Modelplace.AI

In our last post, “Using OpenCV AI Kit with Modelplace.ai To Create Real-time Reaction Videos”, we wrote about how simple it is to build a real product using a model from Modelplace.AI. Today we’re continuing the series of blog posts about Modelplace by OpenCV.AI with a look behind the scenes. One of the core features of Modelplace.AI is its custom-built testing infrastructure, which helps ensure that models will work on the user’s end device.

Why Testing Models Is Important

Testing is an essential part of any development cycle. As a marketplace, we treat AI models as the main component of our product. This means we must carefully verify the functionality and performance of every model we list, or users will lose confidence in the service. Automated testing also frees the team to focus on other aspects of quality assurance, making our internal development cycle smoother overall.


There are several reasons why tests make development a lot easier:

  • They check that code works as expected
  • They catch incompatibilities between software updates
  • They provide a strict contract that the code must satisfy
  • They serve as another form of documentation
  • They save the developer time and help focus attention on the intended functionality

What Kind of Tests Are There?

For our purposes we will be discussing two main types of tests:


Unit tests - a type of software testing where individual units or components of the software are tested independently from the other parts. Usually, a unit test contains three phases: arrange, act, and assert (more on these below). The goal of unit testing is to isolate different parts of the code and show that the individual parts function as intended. If the observed behavior meets the expectations, the unit test passes; otherwise it fails, indicating that there is a bug in the code.


Integration tests - a type of test which demonstrates how different parts of a system work together. Integration tests evaluate complex scenarios to ensure that the individual components function as a group.

What Tools Does Modelplace.AI use For Testing?

There are a lot of different tools for testing, but in this post we are going to discuss Pytest - in our opinion one of the best choices due to its flexibility, its ease of use, and its ability to handle increasingly complex testing needs.


Pytest Advantages


1 — Easy to install:
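
Pytest is distributed on PyPI, so installation is a single command:

    pip install pytest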

2 — Easy to use (compared to unittest)

Compared with Python’s built-in unittest module, you don’t need to import anything or follow a default class structure, and you don’t have to remember all of the self.assert* unittest methods - a plain assert statement is enough.
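
For illustration, here is the same trivial check written both ways; either version can be run by pytest:

    import unittest

    # unittest style: a test class plus dedicated assertion methods
    class TestAddition(unittest.TestCase):
        def test_add_unittest(self):
            self.assertEqual(1 + 1, 2)

    # pytest style: a plain function with a plain assert
    def test_add_pytest():
        assert 1 + 1 == 2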

3 — Simple structure

Pytest expects test files to be named following the test_*.py or *_test.py pattern, and the test functions themselves to be named with a test_ prefix. This keeps naming conventions simple.


Usage Example

Here is a simple pytest usage example that tests a function which counts odd numbers.
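
A minimal version might look like the following (the count_odd function and the sample inputs are illustrative):

    # test_count_odd.py
    def count_odd(numbers):
        """Return how many of the given numbers are odd."""
        return sum(1 for n in numbers if n % 2 != 0)

    def test_count_odd():
        assert count_odd([1, 2, 3, 4, 5]) == 3
        assert count_odd([2, 4, 6]) == 0
        assert count_odd([]) == 0

Running pytest in the same directory discovers the file by its name and reports a pass, or a detailed failure message, for each assert.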


What Tests Are Used on Modelplace.AI?


At Modelplace.AI we use both unit and integration tests. They are written for each model on the marketplace to make sure every model works as intended and as described on the site.


Modelplace.AI Unit Tests

Here is an example of a unit test template for a model:
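
The real template lives in our internal repository, so the sketch below is only illustrative - the Model class, its methods, and the file names are assumptions:

    # test_model.py - illustrative sketch, not the actual Modelplace template
    import json

    import cv2

    from model import Model  # hypothetical wrapper shipped with each model

    def test_model():
        # Arrange: initialize the model and load its weights
        model = Model(weights_path="checkpoint.pth")

        # Act: run inference on a test image
        image = cv2.imread("test_image.jpg")
        predictions = model.process_sample(image)

        # Assert: compare the prediction with the ground-truth results
        with open("ground_truth.json") as f:
            expected = json.load(f)
        assert predictions == expected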


This test template is written according to the “Arrange-Act-Assert” (also AAA) testing pattern. Let’s go through the test phase by phase:


The first phase is “Arrange”, where we initialize the model and load its weights.
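
In the sketch above, that is the construction line:

    model = Model(weights_path="checkpoint.pth")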


The next phase is “Act”, where we run inference on a test image.
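
That corresponds to the inference call:

    image = cv2.imread("test_image.jpg")
    predictions = model.process_sample(image)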

And the last one is the “Assert” phase, where we compare the prediction with the ground-truth results.
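
That is the final comparison:

    with open("ground_truth.json") as f:
        expected = json.load(f)
    assert predictions == expected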

We also need to be sure that the development of a new model doesn’t break the existing models. For this purpose, on every commit we run a set of tests that build the models and run the unit tests for each of them.

Modelplace.AI Integration Tests

In order to detect problems in a timely manner, we also have a scheduled pipeline that automatically runs the tests every day.

As was mentioned earlier, integration tests check complex scenarios. Here we check that running inference on each model through Modelplace’s API returns both predictions and a visualization. We send test images, along with the specific ID of each model, to the Modelplace endpoint and check whether the predicted results match the ground truth. For example, a classification model must predict the “cat” class for a test image with a cat.
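
A stripped-down version of such a check might look like this; the endpoint URL, payload fields, and model ID below are placeholders rather than the real Modelplace API schema:

    import requests

    API_URL = "https://api.modelplace.ai/process"  # placeholder endpoint
    MODEL_ID = 42  # placeholder ID of a classification model

    def test_classifier_predicts_cat():
        with open("cat.jpg", "rb") as f:
            response = requests.post(
                API_URL,
                params={"model_id": MODEL_ID},
                files={"upload_data": f},
            )
        response.raise_for_status()
        result = response.json()
        # The top prediction for a cat image must be the "cat" class
        assert result["predictions"][0]["class_name"] == "cat"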

Notes About Benchmarking

Before publishing an AI model to Modelplace.AI, we need to be sure that the model generalizes well - that is, we want to check the model’s ability to adapt properly to new, previously unseen data. That is why every model not only needs its inference tested, but is also evaluated on a validation subset of a public benchmark from the model’s domain.


Example: evaluating a MobileNet-based Single Shot Detector model on the PASCAL VOC dataset.
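
We don’t reproduce the exact metrics table here, but at the core of any detection benchmark is matching predicted boxes against ground truth by intersection over union (IoU); a minimal, self-contained sketch of that building block:

    def iou(box_a, box_b):
        """Intersection over union of two [x1, y1, x2, y2] boxes."""
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        intersection = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return intersection / (area_a + area_b - intersection)

    # A detection typically counts as correct when IoU >= 0.5,
    # the threshold used by the PASCAL VOC metric.
    assert iou([10, 10, 50, 50], [12, 10, 48, 52]) > 0.5

Averaging precision over recall levels and over classes on top of such matches yields the mAP score commonly reported for detectors.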

These metrics allow us to rank models by score and by inference time on the benchmarks. We will add this data to the public model pages in the future.

Testing on Real Hardware with OpenCV AI Kit 

Many AI projects use the kinds of testing we’ve shown, but one platform is close to our heart. On Modelplace.AI we have a list of models that specifically support OpenCV AI Kit (OAK), which includes both the OAK-1 and OAK-D models.

What is OAK? According to the hugely successful Kickstarter Campaign:

“OAK is a modular, open-source ecosystem composed of MIT-licensed hardware, software, and AI training - that allows you to embed the super-power of spatial AI plus accelerated computer vision functions into your product.  OAK provides in a single, cohesive solution what would otherwise require cobbling together disparate hardware and software components.”


For OAK we have gone the extra mile with regards to compatibility and performance testing: each OAK-compatible AI model is tested on a real OAK device! This required the creation of a custom testing setup.

This setup consists of:

  • A Raspberry Pi 4
  • An OAK-1 device connected to it


We have slightly adapted the tests to run them directly on OAK. The test methods are the same as described above, but they run against the real-life scenario of a Pi 4 and OAK-1. This provides honest, accurate data for both ourselves and Modelplace.AI’s customers.
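
As an illustration of how hardware-dependent tests can be gated, here is a sketch that skips OAK tests when no device is attached, using the depthai library’s device discovery (our internal setup differs in its details):

    import pytest

    try:
        import depthai as dai
        OAK_AVAILABLE = len(dai.Device.getAllAvailableDevices()) > 0
    except ImportError:
        OAK_AVAILABLE = False

    requires_oak = pytest.mark.skipif(
        not OAK_AVAILABLE, reason="no OAK device attached"
    )

    @requires_oak
    def test_model_on_oak():
        # Placeholder body - the real tests run model inference on the device
        ...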


OAK tests are run with every commit, and nightly tests run once per day. 

Conclusion

It is difficult to overestimate the importance of model testing in the Modelplace.AI product. Unit tests let us make sure that a model works correctly as a stand-alone unit, while integration tests verify that the model works within the entire product pipeline. We also evaluate the quality of each model’s predictions on public benchmarks in order to rank models and make sure that they generalize well. We have even taken testing one step further, creating a testing environment that continuously runs on real hardware.


Thank you for reading a bit about our infrastructure. We’ll be posting more about the ins-and-outs of Modelplace’s backend systems in the future.

June 7, 2021