Mask Detector w/ FastAI and Streamlit Sharing

Jack Harding
Published in Analytics Vidhya
3 min read · Dec 11, 2020

Introduction

I want to use this post to highlight how easy it was to deploy an accurate deep learning web application. After starting the FastAI course recently, I’ve learned the basics of image processing and managed to build an app with one of the models. FastAI recommends using Binder, but I found this to be very slow compared to Streamlit.

I trained the model in a Colab notebook, where I could use a GPU, and then exported it for use in the Streamlit app. All the code is available in this repository, complete with example notebooks. The Streamlit dashboard is here.

Bing API

Automating the data collection process was part of one of the lessons in FastAI. The Bing Image Search API requires creating an application on Azure Cognitive Services (no small feat); once that is done, you’re ready to go. The function used to search for images on Bing is below.

import requests
from fastcore.all import L

def search_images_bing(key, term, max_images: int = 150, **kwargs):
    # The Azure subscription key is sent in a request header.
    params = {'q': term, 'count': max_images}
    headers = {"Ocp-Apim-Subscription-Key": key}
    search_url = "https://api.bing.microsoft.com/v7.0/images/search"
    response = requests.get(search_url, headers=headers, params=params)
    response.raise_for_status()  # fail loudly on HTTP errors
    search_results = response.json()
    return L(search_results['value'])
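
To make the shape of the results more concrete, here is a minimal sketch of pulling the image URLs out of a response. The payload is a made-up, trimmed-down example (real responses carry many more fields), and extract_content_urls is a hypothetical helper, not part of FastAI or the Bing SDK. It plays a similar role to the attrgot('contentUrl') call used later, except that it skips entries with no URL.

```python
# Hypothetical, trimmed-down payload for illustration only.
sample_response = {
    "value": [
        {"name": "person wearing a mask", "contentUrl": "https://example.com/mask_1.jpg"},
        {"name": "entry without a URL"},
        {"name": "another mask", "contentUrl": "https://example.com/mask_2.jpg"},
    ]
}


def extract_content_urls(search_results):
    """Return the contentUrl of every result that has one."""
    return [
        item["contentUrl"]
        for item in search_results.get("value", [])
        if "contentUrl" in item
    ]


urls = extract_content_urls(sample_response)
print(urls)
```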

The data organisation stage places each image class in its own folder, and the folder names are used as the data labels. The function below takes a search term, a Python path and a data label as inputs.

def make_category(cat, path, label):
    # key is the Azure subscription key, expected to exist as a global variable.
    if not path.exists():
        path.mkdir()
    dest = path/label
    dest.mkdir(exist_ok=True)
    results = search_images_bing(key, cat)
    download_images(dest, urls=results.attrgot('contentUrl'))

Using the “%ls” magic command in Jupyter reveals the folder structure.
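
A quick way to go one step further than listing folders is to count the images per label, which also flags unbalanced classes early. This is a minimal sketch using only the standard library; count_images_per_label and the throwaway folders are hypothetical and exist only so the example runs without downloading anything.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_label(root):
    """Map each label folder under root to the number of image files in it."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1 for f in folder.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Build a throwaway dataset with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for label, n_images in [("mask", 3), ("no_mask", 2)]:
        (root / label).mkdir()
        for i in range(n_images):
            (root / label / f"img_{i}.jpeg").touch()
    counts = count_images_per_label(root)

print(counts)  # {'mask': 3, 'no_mask': 2}
```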

Model Training

FastAI uses the DataBlock API, which curates the data so I can spend more time on the modelling and less on organising where the data should be. The recommended folder structure is below.

mask_detector
├── train
│   ├── mask
│   │   ├── img_1.jpeg
│   │   └── img_2.jpeg
│   └── no_mask
│       ├── img_3.jpeg
│       └── img_4.jpeg
└── valid
    ├── mask
    │   ├── img_5.jpeg
    │   └── img_6.jpeg
    └── no_mask
        ├── img_7.jpeg
        └── img_8.jpeg

The arguments used are listed below:

  • blocks: specifies the independent and dependent variables. ImageBlock is the image data in each folder and CategoryBlock is the label taken from the folder name.
  • get_items: takes a function that returns the location of the image data.
  • splitter: splits the data into training and validation sets.
  • get_y: the function used to label the data.
  • item_tfms: resizes each image to a common size.

from fastai.vision.all import *

masks = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,  # pass the function itself; it is called with the path later
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
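
To show roughly what get_y and splitter do with these settings, here is a plain-Python sketch. It is a stand-in for illustration, not FastAI's implementation: label_from_parent and random_split are hypothetical helpers that mimic the idea behind parent_label and RandomSplitter, and the file paths are invented.

```python
import random
from pathlib import Path


def label_from_parent(file_path):
    """Use the name of the containing folder as the label."""
    return Path(file_path).parent.name


def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out valid_pct of them."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_valid = int(valid_pct * len(items))
    return indices[n_valid:], indices[:n_valid]


# Ten invented file paths, five per label.
files = [
    Path(f"mask_detector/{label}/img_{i}.jpeg")
    for label in ("mask", "no_mask")
    for i in range(5)
]

train_idx, valid_idx = random_split(files)
print(label_from_parent(files[0]))     # mask
print(len(train_idx), len(valid_idx))  # 8 2
```

Fixing the seed means the same images land in the validation set on every run, which keeps results comparable between training sessions.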

Streamlit Hosting

Streamlit takes away the need for web development skills when sharing your data science projects. There is no need to mess around with HTML or JS frameworks; all you need is Python. Check out their demo. You can run Streamlit locally or on a PaaS like Heroku, but the easiest and fastest option is their beta sharing service.

Streamlit Sharing App Deployment

After signing up for the beta, enter your GitHub repo, branch and Streamlit file, and you’re ready to deploy.

Beard Skew

After deploying my app, I was excited to see how well it performed; every image I uploaded was classified correctly. To be sure, I sent it to a friend who happens to have a beard, and the model didn’t like this. I tried more beards and realised there was a problem.

Photo by Frank Marino on Unsplash

The two categories I already had were mask and no_mask, so I decided to add another label to the data: beard. The code is below.

make_category('peoples faces with covid mask', path, 'mask')
make_category('clean shaven face', path, 'no_mask')
make_category('bearded face', path, 'beard')

After solving the beard issue, I realised the same problem could crop up with different ethnicities, as most of the image search results were white men. I did not consider this when starting the project, and it is something I should account for in future computer vision projects. I’d recommend FastAI’s lesson on data ethics for more on this topic.

Feel free to correct me on any mistakes I might have made.

Resources
