Ondřej Moravčík edited this page Mar 25, 2015 · 10 revisions

Serializing methods and passing them to another process is difficult in Ruby. For example, you cannot do something like this:

proc = Proc.new do |x|
  x * 2
end

Marshal.dump(proc) # => TypeError (no _dump_data is defined for class Proc)
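The failure can be reproduced with the standard library alone. The following is a minimal sketch, not part of the library: the variable names are illustrative, and the source string is written by hand here rather than generated by Sourcify. It shows that a Proc cannot be dumped while its source text, being an ordinary String, survives a Marshal round trip.

```ruby
# A Proc carries a binding and compiled code, so Marshal refuses it.
doubler = Proc.new { |x| x * 2 }

begin
  Marshal.dump(doubler)
  dump_error = nil
rescue TypeError => e
  dump_error = e
end

puts dump_error.message

# The same logic written as source text is a plain String,
# which Marshal handles without any trouble.
source   = "Proc.new { |x| x * 2 }"
restored = Marshal.load(Marshal.dump(source))
puts restored
```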

Procs and Methods are serialized as strings using the Sourcify library. The problem is that the library does not support closures: variables captured from the surrounding scope are not included in the generated source. There is a small workaround using the .bind method.

number = 3
proc = lambda{|x| x * number}

proc.to_source # => "proc { |x| (x * number) }"

bind is an RDD method that lets you attach objects for the workers. All objects must be serializable by Marshal.

# rdd.bind(METHOD_ON_WORKER => OBJECT)

number = 3
proc = lambda{|x| x * number}

rdd.map(proc).bind(number: number)
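To make the idea concrete, here is a rough sketch of how bound objects could be made visible to a function that arrives as source text. It is an assumption for illustration only and not Ruby-Spark's actual implementation: the payload layout, the `context` object and the use of `instance_eval` are all hypothetical.

```ruby
# Driver side: ship the source text together with the bound objects.
source  = "lambda{|x| x * number}"
payload = Marshal.dump(source: source, bound: { number: 3 })

# Worker side (hypothetical): restore the payload and expose each bound
# object as a method, so that `number` inside the source resolves to it.
data    = Marshal.load(payload)
context = Object.new
data[:bound].each do |name, value|
  context.define_singleton_method(name) { value }
end

# Evaluating the source with `context` as self turns the string into a lambda.
func   = context.instance_eval(data[:source])
result = [1, 2, 3].map(&func)
puts result.inspect
```

This also suggests why the key passed to bind is described as METHOD_ON_WORKER: in the sketch, the free name in the source is resolved as a method call rather than as a local variable.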

You can send a function to a worker in four ways.

As String

The string must represent a Proc.

rdd.map('lambda{|x| x * 2}')
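A string is the easiest form to ship because it is already serializable. As a minimal sketch, assuming the receiving side simply evaluates the text (the library's real mechanism may differ), the round trip looks like this:

```ruby
# The function travels as plain text.
source = 'lambda{|x| x * 2}'
wire   = Marshal.dump(source)

# Receiving side (assumed): evaluate the text to get a callable back.
func    = eval(Marshal.load(wire))
doubled = [1, 2, 3].map(&func)
puts doubled.inspect
```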

As Proc

Not every Proc can be transferred. Problems arise with:

  • more than one Proc on the same line
  • the format: ->(x){x * 3}
  • the format: lambda(&:to_s) (see As Symbol below)
  • Procs that are dynamically allocated

rdd.map(lambda{|x| x * 2})

# This won't work: two Procs are defined on one line
rdd.map(lambda{|x| x * 2}).map(lambda{|x| x * 2})
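One plausible reason for the one-Proc-per-line restriction, assuming the source is located through Proc#source_location, is that Ruby reports the same file and line for every Proc defined on that line. The sketch below uses only core Ruby; the variable names are illustrative.

```ruby
# Two lambdas deliberately defined on the same line.
double = lambda { |x| x * 2 }; increment = lambda { |x| x + 1 }

# Compare only the file and line parts of the reported location.
same_place = double.source_location.first(2) == increment.source_location.first(2)
puts same_place

# A Proc built from a Symbol is not defined in Ruby source at all,
# so there is no location to read its text from.
symbol_proc = :to_s.to_proc
puts symbol_proc.source_location.inspect
```

Splitting the chain so that each lambda sits on its own line avoids the ambiguity.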

As Symbol

Instead of using .map(&:to_s) or lambda(&:to_s), you can pass the Symbol directly.

rdd.map(:to_s)
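A Symbol works well here because Marshal can serialize it directly and Symbol#to_proc rebuilds a callable on the other side. A minimal sketch with core Ruby only (how the worker actually applies the Symbol is an assumption):

```ruby
# A Symbol survives a Marshal round trip unchanged.
wire = Marshal.dump(:to_s)
name = Marshal.load(wire)

# Receiving side (assumed): turn the Symbol into a callable with to_proc.
strings = [1, 2, 3].map(&name)
puts strings.inspect
```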

As Method

A Method has the same limitations as a Proc, and .bind works with it too.

def multiple(x)
  x * 2
end

rdd.map(method(:multiple))
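Like a Proc, a Method object cannot be dumped by Marshal, which is why it has to travel as source text as well. The sketch below checks this with core Ruby only; the variable names are illustrative, and reading the definition through Method#source_location is an assumption about how the source is found.

```ruby
def multiple(x)
  x * 2
end

meth = method(:multiple)

# Marshal rejects Method objects just as it rejects Procs.
begin
  Marshal.dump(meth)
  method_dumped = true
rescue TypeError
  method_dumped = false
end
puts method_dumped

# The definition site is still known, so the text of the method can be located.
file, line = meth.source_location
puts "#{file}:#{line}"
```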