Using Python, OpenCV and AWS Lambda to gather crime statistics – Part 2

This is part two of my quest to determine the long-term crime statistics in the city where I intend to buy a house. Click here for part one, where I collect the images weekly from the police’s site. In this step, I’ll pick the images out of the S3 bucket and attempt to find all the pins.
The pins are relatively easy to find, thanks to OpenCV + python-cv. I separated out a red pin and a yellow pin and saved these as samples of what to look for. The code takes an image and finds all instances of the red and yellow pins in it.

This first attempt seems to work: it draws a red box around every red pin, a yellow box around every yellow pin, and a small blue dot at the base of every pin, where it is actually centered:

There are, however, a few issues which require some fine-tuning. One pin is correctly identified as red in the image; however, in the log it is actually identified as yellow first, then re-found as red afterwards:

Placing yellow point at 344,746
Placing red point at 344,746

There is also a double break-in at the same location (probably two apartments in the same block).
The first issue can be tuned by changing the threshold. I also tried changing the algorithms, but it made no difference which one was used: they all gave exactly the same result. Moving the threshold up to 76% helped, as did re-creating the pin from scratch (rather than from a screenshot). I’m now getting acceptable recognition. There are still a few duplicates or false positives, but those can be removed based on location (if two are within a pixel of each other). Also, my goal is to get rough statistics, so this is now ‘good enough’, at least until I can try TensorFlow.
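The location-based clean-up could look something like the sketch below. The function name remove_duplicates, the (x, y, colour, score) tuple layout, and the rule of keeping the higher-scoring detection are all assumptions made for this example, and the scores in the sample data are invented.

```python
def remove_duplicates(detections, max_distance=1):
    """Drop any detection within max_distance pixels of a better-scoring one.

    detections: iterable of (x, y, colour, score) tuples.
    """
    kept = []
    # Visit the strongest matches first so that they win any conflict.
    for x, y, colour, score in sorted(detections, key=lambda d: d[3], reverse=True):
        is_duplicate = any(
            abs(x - kept_x) <= max_distance and abs(y - kept_y) <= max_distance
            for kept_x, kept_y, _, _ in kept
        )
        if not is_duplicate:
            kept.append((x, y, colour, score))
    return kept


# Illustrative data only: the pin at 344,746 matched as both yellow and red.
sample = [
    (344, 746, "yellow", 0.78),
    (344, 746, "red", 0.93),
    (345, 746, "red", 0.80),
    (100, 200, "yellow", 0.90),
]
print(remove_duplicates(sample))
```

One caveat with this approach: a genuine double break-in placed at the same pixel would be merged into a single pin as well, which is acceptable for rough statistics but worth keeping in mind.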
Here is the result after some tuning:
 
In part 3, I will try to apply geo coordinates to the identified pins, and in part 4, I will load the data into Kibana.