`exceptions`-based exceptions #141

ocramz · 2018-03-28T15:28:39Z

Rather than returning Nothing/0/NaN etc. (the first option being far better than the others), it would be great to generalize code that may throw to the MonadThrow class from the exceptions package.

This way, functions using throwM e (where e has an Exception instance) would have the signature MonadThrow m => ... -> m ( ... ), where m may become Maybe, or Either SomeException, or even IO, according to the calling context.
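To make the proposal concrete, here is a minimal sketch of one MonadThrow-polymorphic function used at three different monads. It assumes the exceptions package is available; StatsError, safeMean, asMaybe, asEither and asIO are hypothetical names chosen for illustration and are not part of statistics.

```haskell
import Control.Exception (Exception, SomeException)
import Control.Monad.Catch (MonadThrow, throwM)

-- Hypothetical error type, for illustration only.
data StatsError = EmptySample
  deriving (Show, Eq)

instance Exception StatsError

-- Hypothetical helper: arithmetic mean of a list, failing on empty input.
-- The caller picks the monad, and with it the failure representation.
safeMean :: MonadThrow m => [Double] -> m Double
safeMean [] = throwM EmptySample
safeMean xs = pure (sum xs / fromIntegral (length xs))

-- In Maybe, a failure collapses to Nothing.
asMaybe :: [Double] -> Maybe Double
asMaybe = safeMean

-- In Either, the exception is kept, wrapped in SomeException.
asEither :: [Double] -> Either SomeException Double
asEither = safeMean

-- In IO, the exception is thrown as a regular runtime exception.
asIO :: [Double] -> IO Double
asIO = safeMean
```

Note that the Either instance provided by exceptions fixes the error type to SomeException, so recovering the concrete StatsError on the caller side requires fromException.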


ocramz · 2018-03-28T15:29:54Z

Related: #128, #100, #111, #118 ...

Shimuuar · 2018-03-29T08:03:37Z

That's an excellent suggestion!

ocramz · 2018-07-19T22:36:06Z

I've started addressing this here: https://github.com/DataHaskell/statistics/tree/exceptions-not-error

Shimuuar · 2018-07-20T04:45:04Z

I'm actually halfway through implementing it. The thing is, once you touch S.Sample you need to adjust basically everything.

ocramz · 2018-07-20T08:45:25Z

Yes, I noticed, error is used pretty much throughout. We could skip refactoring the input validation parts for now (i.e. zero input size or negative parameters etc.) and focus on the important ones, e.g. the NaN correlations etc. For example, I've replaced Sample.correlation with this:

-- | Correlation coefficient for sample of pairs. Also known as
--   Pearson's correlation. For empty sample it's set to zero.
correlation :: (G.Vector v (Double,Double), G.Vector v Double, MonadThrow m)
           => v (Double,Double)
           -> m Double
correlation xy
  | n == 0    = pure 0
  | nearZero varX = throwM $ NaNE "Variance of X == 0"
  | nearZero varY = throwM $ NaNE "Variance of Y == 0"
  | otherwise = pure corr
  where
    corr = cov / sqrt (varX * varY)
    n       = G.length xy
    (xs,ys) = G.unzip xy
    (muX,varX) = meanVariance xs
    (muY,varY) = meanVariance ys
    cov = mean $ G.zipWith (*)
            (G.map (\x -> x - muX) xs)
            (G.map (\y -> y - muY) ys)
{-# SPECIALIZE correlation :: U.Vector (Double,Double) -> Maybe Double #-}
{-# SPECIALIZE correlation :: V.Vector (Double,Double) -> Maybe Double #-}
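The snippet above relies on NaNE and nearZero, whose definitions are not shown in this thread. The following is only a guess at what they might look like, together with a caller that recovers the concrete exception; StatsException, the 1e-12 tolerance, safeRatio and explain are all assumptions made for illustration, and the definitions in the actual branch may differ.

```haskell
import Control.Exception (Exception, SomeException, fromException)
import Control.Monad.Catch (MonadThrow, throwM)

-- Assumed shape of the exception type carrying the NaNE constructor.
newtype StatsException = NaNE String
  deriving (Show)

instance Exception StatsException

-- Assumed tolerance check; the threshold is an arbitrary choice.
nearZero :: Double -> Bool
nearZero x = abs x < 1e-12

-- Hypothetical function written in the same style as correlation:
-- throw instead of silently returning NaN or Infinity.
safeRatio :: MonadThrow m => Double -> Double -> m Double
safeRatio num den
  | nearZero den = throwM $ NaNE "Denominator == 0"
  | otherwise    = pure (num / den)

-- Caller side: run in Either and recover the concrete exception.
explain :: Either SomeException Double -> String
explain (Right r) = "result: " ++ show r
explain (Left e)  = case fromException e of
  Just (NaNE msg) -> "undefined result: " ++ msg
  Nothing         -> "unexpected exception: " ++ show e
```

A caller that does not care about the reason can use the same function at Maybe Double, as the SPECIALIZE pragmas above do for correlation.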

ocramz · 2018-07-20T11:22:19Z

@Shimuuar would you like to join forces on this? I don't have an efficient implementation in mind for Matrix.generateSym, though.

Shimuuar · 2018-07-20T11:50:53Z

> @Shimuuar would you like to join forces on this?

Sure, although I won't be able to do anything till Monday.


ocramz · 2018-07-24T12:00:42Z

Hi @Shimuuar :) as discussed, if you point me to your working branch for this we can figure out how to collaborate :)

Shimuuar · 2018-07-24T19:01:52Z

I just pushed branch exception2 (exception was a complete failure). It's mostly complete except for:

- Statistics.Sample: some functions are commented out, and I'm thinking about using type classes from monoid-statistics for things like calculating mean and variance in a single call (saving one evaluation of the mean). Having dedicated functions is not terribly good, since in that case we get a combinatorial explosion.
- Resampling: again, I'm thinking about jackknife, which is clearly monoidal (although that's obscured by the API).
- Bootstrap: I didn't even touch it.
- Regression: depends on resampling.
- KruskalWallis test.
- A few other things I certainly forgot about.

monoid-statistics is in a rather poor state currently. I got lost in figuring out the numeric precision and performance of different algorithms for variance.

ocramz · 2018-07-25T09:07:04Z

@Shimuuar Re. monoid-statistics: did you know of foldl-statistics? https://hackage.haskell.org/package/foldl-statistics

Shimuuar · 2018-07-25T09:33:56Z

Yes. The main difference is that monoid-statistics exposes accumulator types and allows merging estimates from several data sets without refolding them.
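As an illustration of what a mergeable accumulator can look like, here is a minimal sketch of a mean/variance accumulator with a Monoid instance, using the standard pairwise-merge update for the sum of squared deviations. This is not the monoid-statistics API; MeanVar, singleton, fromList and meanAndVariance are hypothetical names, and the sketch assumes a base version where Semigroup is in the Prelude.

```haskell
import Data.List (foldl')

-- Hypothetical accumulator: sample count, running mean, and the sum of
-- squared deviations from the mean.
data MeanVar = MeanVar !Int !Double !Double
  deriving (Show)

instance Semigroup MeanVar where
  MeanVar 0 _ _ <> b = b
  a <> MeanVar 0 _ _ = a
  MeanVar na ma sa <> MeanVar nb mb sb = MeanVar n m s
    where
      n     = na + nb
      fa    = fromIntegral na
      fb    = fromIntegral nb
      fn    = fromIntegral n
      delta = mb - ma
      -- Combined mean and combined sum of squared deviations.
      m     = ma + delta * fb / fn
      s     = sa + sb + delta * delta * fa * fb / fn

instance Monoid MeanVar where
  mempty = MeanVar 0 0 0

-- Accumulator for a single observation.
singleton :: Double -> MeanVar
singleton x = MeanVar 1 x 0

-- Fold a data set into an accumulator in one pass.
fromList :: [Double] -> MeanVar
fromList = foldl' (\acc x -> acc <> singleton x) mempty

-- Mean and biased (population) variance; Nothing for an empty sample.
meanAndVariance :: MeanVar -> Maybe (Double, Double)
meanAndVariance (MeanVar 0 _ _) = Nothing
meanAndVariance (MeanVar n m s) = Just (m, s / fromIntegral n)
```

With this shape, fromList xs <> fromList ys gives the estimate for the combined data without traversing xs or ys again, and both mean and variance come out of a single pass.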

ocramz · 2018-07-25T09:50:58Z

Aha! That's a clever thing to have. However, what do you think of setting up speed benchmarks before looking into adding streaming capabilities? I would like to start adding basic summary functionality to `criterion-measurement` soon, to make it self-contained.


Shimuuar · 2018-07-25T09:53:30Z

Why, of course! Without benchmarks all performance statements are just hopes and prayers.
