Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: add Gaussian sampler #169

Closed
wants to merge 3 commits into from

Conversation

hyiltiz
Copy link
Contributor

@hyiltiz hyiltiz commented Jan 11, 2024

  • Add a Gaussian sampler
  • Add a convenience function sample-n that converts a sampler to a generator

Copy link
Member

@pepe pepe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a test for the new functionality.

spork/randgen.janet Outdated Show resolved Hide resolved
@sogaiu
Copy link
Contributor

sogaiu commented Jan 11, 2024

This line doesn't look right.

I don't think the code is valid:

  (math/rng-uniform ))

is problematic in at least 2 ways.

First it's missing an argument:

    (math/rng-uniform rng)

    Extract a random number in the range [0, 1) from the RNG.

and second there is an extra closing parentheses.

I'm guessing it's just stray code that would be better removed?

@sogaiu
Copy link
Contributor

sogaiu commented Jan 11, 2024

Not sure what's better, but may be forever could be used like this:

(defn rand-gaussian
  "Get a random sample from the standard Gaussian distribution.
  Optionall specify the mean m and the standard deviation sd. E.g.:
  (rand-gaussian) # => 0.1324...
  (rand-gaussian 5 0.1) # => 5.3397...
  "
  [&opt m sd]
  (default [m sd] [0 1])
  (defn scale [x] (+ m (* sd x)))

  (forever
    (def p (rand-uniform))
    (def q (rand-uniform))

    # We use the Box-Muller transform
    (let [rho (math/sqrt (* -2 (math/log q)))
          theta (* 2 math/pi p)
          _box    (* rho (math/cos theta))
          _muller (* rho (math/sin theta))
          box    (scale _box)
          muller (scale _muller)]

      (yield box)
      (yield muller))))

@sogaiu
Copy link
Contributor

sogaiu commented Jan 11, 2024

Regarding the testing comment above, the examples in the docstrings could be broken out as tests, though checking the return values might require some care.

Possibly there are better ways, but here is what I did in a PR for adding tests for randgen. The approach involves using set-seed before evaluating so that results are reproducible.

@hyiltiz
Copy link
Contributor Author

hyiltiz commented Jan 11, 2024

@hyiltiz
Copy link
Contributor Author

hyiltiz commented Jan 12, 2024

Currently, we have the following problems:

  • PA: evaluating it with same rng multiple times for testing and reproduce;
  • PB: re-use muller values for efficiency;
  • PC: can be called alone and with sample-n

@sogaiu proposed an alternative implementation using a "closure"-like mechanism.

My original implementation via fibers (using yield) couldn't satisfy PC. I think this implementation satisfies them all.

@sogaiu
Copy link
Contributor

sogaiu commented Jan 15, 2024

Below is another attempt:

(def rand-gaussian
  ``
  Get a random sample from the standard Gaussian distribution. 
  Optionally specify the mean `m` and the standard deviation `sd`.
  ``
  (do
    (defn make-rg
      [&opt m sd]
      (def default-m 0)
      (default m default-m)
      (def default-sd 1)
      (default sd default-sd)
      (def state @{:m m :sd sd})
      (defn scale
        [x]
        (+ (get state :m)
           (* (get state :sd) x)))

      (defn box-muller
        []
        (let [p (math/rng-uniform (get-rng))
              q (math/rng-uniform (get-rng))
              rho (math/sqrt (* -2 (math/log q)))
              theta (* 2 math/pi p)
              box (scale (* rho (math/cos theta)))
              muller (scale (* rho (math/sin theta)))]
          [box muller]))

      (def [box muller] (box-muller))
      (-> state (put :box box) (put :muller muller))

      (fn rg [&opt m sd]
        (cond
          # "reset" if arguments do not match cached values
          (not (and (= m (get state :m default-m))
                    (= sd (get state :sd default-sd))))
          (do
            (def m (or m default-m))
            (def sd (or sd default-sd))
            (-> state (put :m m) (put :sd sd))
            (def [box muller] (box-muller))
            (-> state (put :box box) (put :muller muller))
            (rg m sd))
          #
          (def box (get state :box))
          (do
            (put state :box nil)
            box)
          #
          (def muller (get state :muller))
          (do
            (def [next-box next-muller] (box-muller))
            (-> state (put :box next-box) (put :muller next-muller))
            muller)
          #
          (errorf "unexpected state"))))
    #
    (make-rg)))

(comment

  (set-seed 1)

  (rand-gaussian)
  # =>
  0.0917440069385442

  (sample-n |(rand-gaussian 5 0.1) 3)
  # =>
  @[4.97222830484943 5.10069692026214 5.13979794451211]

  (rand-gaussian)
  # =>
  -0.856296446506397

 )

@sogaiu
Copy link
Contributor

sogaiu commented Jan 18, 2024

@hyiltiz and I are comparing the implementation above with his simpler one:

(defn rand-gaussian [&opt m sd]
  (default m 0)
  (default sd 1)
  (defn scale [x] (+ m (* sd x)))

  (def p (math/rng-uniform (get-rng)))
  (def q (math/rng-uniform (get-rng)))

  # We use the Box-Muller transform
  (let [rho (math/sqrt (* -2 (math/log q)))
        theta (* 2 math/pi p)
        _box (* rho (math/cos theta))
        box (scale _box)]
    box))

This latest one is faster and simpler:

(set-seed 1)

(timeit-loop [:timeout 1] "simpler" (rand-gaussian-simpler))

(set-seed 1)

(timeit-loop [:timeout 1] "complex" (rand-gaussian-complex))

# simpler 1.000s, 0.485µs/body
# complex 1.000s, 0.8271µs/body

@hyiltiz hyiltiz closed this Jan 19, 2024
@hyiltiz hyiltiz deleted the random-sampler/gaussian branch January 19, 2024 03:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants