Comparative analysis of the models

NOTE: This analysis is pending an update to include model 4, and also to evaluate the quality of the clusters, which was not originally taken into account.

From a previous analysis by Branko Kokanovic we know that there is a rough linearity between the area of the images and the memory consumption of the CNN models, so we take the image area as the reference.

Procedure followed here:

  1. Have an instance of Nextcloud with this application working. Install Netdata to obtain statistics of your installation, jq to parse the JSON answers, and GNU bc to perform some basic calculations (see the install sketch after this list).
  2. Create a new user on the Nextcloud instance. For this example, simply 'user'.
  3. Upload some images with faces to analyze. For this analysis, 274 images of The Big Bang Theory. 😉
  4. Turn off all services that consume PHP (Apache, nginx, php-fpm); sorry, but that makes a difference of 500 MB. 😅 E.g.: sudo systemctl stop httpd php-fpm
  5. Install the model that you want to analyze: sudo -u apache php occ face:setup --model 2
  6. Run the background task to analyze the photos, also measuring the time consumed: time sudo -u apache php occ face:background_job -u user
  7. Meanwhile, observe the memory consumption in Netdata. Don't worry, we automate all of this below. 😅
  8. Get the statistics of the background task: sudo -u apache php occ face:stats -u user
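
The helper tools from step 1 are usually available from the distribution repositories. A minimal sketch, assuming a Fedora-like system (matching the httpd and apache names used above):

sudo dnf install netdata jq bc
sudo systemctl enable --now netdata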

Finally:

  • The number of faces found (this should be the most important data) is obtained from the statistics of the background task.
  • The maximum memory consumption was obtained by eye from Netdata.
  • The average time consumed per image is obtained by dividing the total time consumption by the number of images (see the sketch below).
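
For example, the per-image average can be computed with bc. A minimal sketch, using hypothetical numbers (a total of 798 seconds for the 274 images):

echo "scale=3; 798/274" | bc -l
# 2.912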

Well, since this can take a long time, and I admit that it is not so easy, we do everything automatically. 😅

Automated analysis

[matias@nube nextcloud]$ cat test-facerecognition.sh 
#!/usr/bin/env bash
NETDATAHOST='http://localhost:19999'
NCUSER='user'

MINMODEL='1'
MAXMODEL='4'

MINSIDE='800'
MAXSIDE='2600'
DELTASIDE='200'

for ((i=${MINMODEL} ; i <= ${MAXMODEL} ; i++)); do

	php occ face:setup --model ${i}

	OUTPUTFILE="model-${i}.csv"
	echo "AREA,AVGTIME,MAXMEMORY,FACES,PERSONS" > ${OUTPUTFILE}

	for ((j=${MINSIDE} ; j <= ${MAXSIDE} ; j+=${DELTASIDE})); do
		# The analysis is parameterized by the image area (side squared).
		area=$((j ** 2))

		php occ face:reset --all -u ${NCUSER}

		# Record the analysis time window. The variable names intentionally
		# match Netdata's API, where 'after' is the start of the window and
		# 'before' is the end.
		AFTER=$(date +%s)
		php occ face:background_job -u ${NCUSER} --max_image_area ${area}
		BEFORE=$(date +%s)

		# Query Netdata for the peak PHP memory (MB) within the analysis window.
		MAXMEMORY=$(curl -s "${NETDATAHOST}/api/v1/data?chart=apps.mem&before=${BEFORE}&after=${AFTER}&dimensions=php" | jq '.data[][1]' | sort -Vr | head -n1)

		# Run the background job once more so that any pending work
		# (e.g. clustering) finishes before the statistics are collected.
		php occ face:background_job -u ${NCUSER}

		ALLTIME=$((${BEFORE}-${AFTER}))
		# Collect the statistics of this run and derive the per-image average.
		IMAGES=$(php occ face:stats --json -u ${NCUSER} | jq '.[].images')
		FACES=$(php occ face:stats --json -u ${NCUSER} | jq '.[].faces')
		PERSONS=$(php occ face:stats --json -u ${NCUSER} | jq '.[].persons')
		AVGTIME=$(echo "scale=3; ${ALLTIME}/${IMAGES}" | bc -l)

		echo "${area},${AVGTIME},${MAXMEMORY},${FACES},${PERSONS}" >> ${OUTPUTFILE}

		sleep 1
	done
done

Run as:

sudo -u apache bash test-facerecognition.sh # Replace 'apache' with your service user, e.g. www-data

Result:

It results in one CSV file per model with the main statistics. We share three of them here (the model 4 results are pending; see the note above):

[matias@nube ~]$ cat model-1.csv 
AREA,AVGTIME,MAXMEMORY,FACES,PERSONS
160000,2.905,793.3242,599,338
250000,4.412,1135.3633,779,445
360000,6.135,1536.0508,889,454
490000,8.226,2037.832,955,492
640000,10.992,2622.656,965,497
810000,13.832,3275.984,975,484
1000000,17.018,3979.73,975,473
1210000,20.427,4752.957,988,498
1440000,24.284,5628.828,988,502
1690000,28.518,6522.82,988,503
[matias@nube ~]$ cat model-2.csv 
AREA,AVGTIME,MAXMEMORY,FACES,PERSONS
160000,2.937,805.0469,599,338
250000,4.514,1172.1367,779,445
360000,6.306,1548.707,889,454
490000,8.463,2053.75,955,492
640000,11.040,null,965,497
810000,13.846,3278.324,975,484
1000000,16.948,3986.047,975,473
1210000,20.135,4764.246,988,498
1440000,24.262,5632.656,988,502
1690000,28.167,6575.504,988,499
[matias@nube ~]$ cat model-3.csv 
AREA,AVGTIME,MAXMEMORY,FACES,PERSONS
160000,3.021,112.94141,144,84
250000,4.091,122.96875,273,117
360000,5.543,101.48047,384,166
490000,7.175,130.17188,458,192
640000,9.124,116.25781,537,210
810000,11.288,119.52344,598,235
1000000,13.594,118.92969,647,266
1210000,16.394,116.90625,684,272
1440000,18.799,120.18359,696,286
1690000,16.901,116.23828,710,290
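
As a quick sanity check of the linearity mentioned at the beginning, the incremental slope of memory versus area can be computed from these files. A minimal sketch with awk over model-1.csv:

awk -F, 'NR>2 {printf "%d -> %d: %.4f MB/pixel\n", pa, $1, ($3-pm)/($1-pa)} {pa=$1; pm=$3}' model-1.csv

The slope stays roughly constant at about 0.0037 MB per pixel, which is consistent with the linear trend.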

Contextualizing the results

The idea is to compare the models, so we have to compare the same columns.
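
A comparison like the tables below can be assembled directly from the CSV files. A minimal sketch with cut and paste, here for the faces column (assuming the model-N.csv files from above):

paste -d, <(cut -d, -f1,4 model-1.csv) <(cut -d, -f4 model-2.csv) <(cut -d, -f4 model-3.csv)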

Number of faces detected

Area        Model 1   Model 2   Model 3
640.000     601       599       537
1.000.000   785       779       647
1.440.000   897       889       696
1.960.000   957       955       727
2.560.000   967       965       738
3.240.000   976       975       735
4.000.000   980       975       730
4.840.000   987       988       735
5.760.000   993       988       737
6.760.000   994       988       735

[Chart: Faces Detected]

Maximum memory consumption (MB)

Area        Model 1    Model 2    Model 3
640.000     748,92     805,05     120,95
1.000.000   1.076,11   1.172,14   102,18
1.440.000   1.474,38   1.548,71   113,62
1.960.000   1.972,52   2.053,75   128,54
2.560.000   2.584,01   2.584,92   134,02
3.240.000   3.230,22   3.278,32   142,68
4.000.000   3.943,64   3.986,05   146,13
4.840.000   4.707,66   4.764,25   150,22
5.760.000   5.580,12   5.632,66   162,84
6.760.000   6.480,95   6.575,50   183,54

[Chart: Max. Memory]

Processing time (seconds per image)

Area        Model 1   Model 2   Model 3
640.000     2,91      2,94      9,18
1.000.000   4,43      4,51      13,59
1.440.000   6,33      6,31      16,40
1.960.000   8,34      8,46      25,37
2.560.000   11,20     11,04     32,65
3.240.000   13,73     13,85     34,58
4.000.000   17,09     16,95     50,51
4.840.000   20,86     20,14     60,07
5.760.000   24,76     24,26     54,68
6.760.000   28,96     28,17     63,98

[Chart: Processing time]

First conclusions

  • Model 1 and model 2 give practically identical results. Therefore, between the two, we recommended model 2, which offers more information with the same requirements. EDIT: Some clustering errors of model 2 that were not taken into account in this analysis were recently discovered, and it is therefore now discouraged. Model 1 is recommended instead.

  • It is true that model 3 consumes practically no memory, but with the same image it is much slower and finds only 73% of the faces.

However, the most interesting thing, IMHO, is to compare the point where HOG ties the worst CNN result:

Area      Width   Height   Faces (Model 2)   Memory (Model 2)   Time (Model 2)   Faces (Model 3)   Memory (Model 3)   Time (Model 3)
160.000   462     346      599               805,05             2,94             144               112,94             3,02
810.000   1.039   779      975               3.278,32           13,85            598               119,52             11,29

  • With an image of 462x346, CNN finds practically the same number of faces (599) as HOG does with 1039x779 (598).
  • With an image of 462x346, CNN has an average time of 2.94 seconds, against 11.29 for HOG with 1039x779; HOG is almost 4 times slower to get the same results.
  • With an image of 462x346, CNN uses 805,05 MB of RAM, against roughly 117 MB for HOG at any size.
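
The "almost 4 times" claim can be checked quickly with bc:

echo "scale=2; 11.29/2.94" | bc
# 3.84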

So, if speed matters, I would recommend CNN, which with less area offers similar results in much less time. The memory of the CNN model grows with the size of the image, but it is controllable.
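
In practice the area (and therefore the memory) is capped with the same option used by the script above, for example:

sudo -u apache php occ face:background_job -u user --max_image_area 640000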

We can also do the same analysis comparing the best HOG result against the CNN configuration that offers the same result:

Area        Width   Height   Faces (Model 2)   Memory (Model 2)   Time (Model 2)   Faces (Model 3)   Memory (Model 3)   Time (Model 3)
250.000     577     433      779               1.172,14           4,51             273               122,97             4,09
1.690.000   1.501   1.126    988               6.575,50           28,17            710               116,24             16,90

  • With an image of only 577x433, CNN offers even better results (779 faces) than HOG with 1501x1126 (710 faces).
  • With an image of 577x433, CNN has an average time of 4.51 seconds, against 16.90 for HOG with 1501x1126; again, HOG is almost 4 times slower for comparable results.
  • With an image of 577x433, CNN uses 1.172,14 MB of RAM, against roughly 117 MB for HOG at any size. HOG still has a ridiculously low memory consumption, but the 1.172,14 MB of CNN is more than acceptable. So in terms of memory HOG is obviously much better, but CNN is fully usable.

Finally, in general, I would recommend using CNN68 with small images, and HOG with larger images only if it is absolutely necessary.