Is there a reason not to reuse stats for multiple detection runs? #308
My thought was if I pass But it looks like caching and loading is not happening at all. So I looked for some notes about this and found b89238c.
Yet I'm not fully convinced why recalculating the stats on every run is better than caching previously calculated stats and reusing them for the next runs. What am I missing here?
Replies: 1 comment 3 replies
This feature was removed for a variety of reasons. Regarding performance, the benefit isn't as great as it once was now that v0.6 does some things in parallel to use multiple cores. One could also just process the output of a statsfile to see which frames exceed a particular threshold, similar to how the detection loop works.

Really though, using the statsfile as a cache was not well thought out. The CSV file format is inefficient for that purpose, and a significant number of factors affect the stats calculations, including the downscale factor and even some detector parameters (e.g. for edge detection). Reusing a statsfile when these things change would be incorrect, and leads to a big rabbit hole of choices (e.g. do we make a new column for each changed dimension?). Now things are much simpler for people actually consuming the statsfile for its primary purpose: statistical analysis of the video itself.

In the long term, I'm not opposed to having some kind of data cache for speeding up repeated calculations when reprocessing videos. However, I don't think the statsfile itself is a good choice for that.

Sorry if this rationale wasn't made clear enough, but I'm happy to talk through specific points further if you wish. Thanks for the question.
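For anyone who wants to try the "process the output of a statsfile" route mentioned above, here is a minimal sketch of what that could look like. It is not code from the project: the helper name `frames_over_threshold` is hypothetical, and the column names `Frame Number` and `content_val` are assumptions about the statsfile header, so check them against the header row of your own file. It also only applies a raw threshold and ignores things the real detection loop handles, such as minimum scene length.

```python
import csv
import io


def frames_over_threshold(stats_csv, threshold,
                          metric="content_val", frame_col="Frame Number"):
    """Return the frame numbers whose metric value is at or above threshold.

    stats_csv is any open text file (or file-like object) holding the
    statsfile CSV. The default column names are assumptions; adjust them
    to match the header row of your statsfile.
    """
    hits = []
    for row in csv.DictReader(stats_csv):
        value = row.get(metric)
        if not value:
            # Skip rows where the metric is missing or empty.
            continue
        if float(value) >= threshold:
            hits.append(int(row[frame_col]))
    return hits


# Tiny made-up sample standing in for a real statsfile.
sample = io.StringIO(
    "Frame Number,Timecode,content_val\n"
    "1,00:00:00.000,0.0\n"
    "2,00:00:00.042,31.5\n"
    "3,00:00:00.083,4.2\n"
    "4,00:00:00.125,28.0\n"
)
print(frames_over_threshold(sample, threshold=27.0))  # prints [2, 4]
```

With a real file you would pass `open("video.stats.csv", newline="")` instead of the in-memory sample (the filename here is just a placeholder). Keep in mind the caveat from the reply: the values are only meaningful for the downscale factor and detector parameters that were in effect when the statsfile was written.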