-
With a recurrent network, a sequence is processed one element at a time (in order), because each step depends on the output of the previous step. But you would use a batch to process multiple sequences in parallel.
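To make this concrete, here is a minimal NumPy sketch of a vanilla RNN (not the actual RecurrentBlock implementation; the weight names and sizes are made up for illustration). Note that the loop runs only over the Time axis, while all sequences in the Batch axis are updated in parallel at each step via matrix multiplication. The final hidden state is what the quoted passage calls stateNew.

```python
import numpy as np

rng = np.random.default_rng(0)
B, T, C, H = 4, 5, 3, 8            # Batch, Time, Channel, hidden size (arbitrary)
x = rng.normal(size=(B, T, C))     # input with (Batch, Time, Channel) shape

# Hypothetical RNN parameters
W_xh = rng.normal(size=(C, H)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(H, H)) * 0.1   # hidden-to-hidden weights
b_h = np.zeros(H)

h = np.zeros((B, H))               # one hidden state per sequence in the batch
for t in range(T):                 # sequential over Time ...
    # ... but parallel over Batch: x[:, t, :] holds step t of ALL sequences
    h = np.tanh(x[:, t, :] @ W_xh + h @ W_hh + b_h)

state_new = h                      # hidden state after the last time step;
                                   # under sequential partitioning it can seed
                                   # the next minibatch's initial state
print(state_new.shape)             # (4, 8)
```

So the Batch axis indexes independent sequences, and the Time axis indexes positions within each sequence; that is why both are needed.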
-
Hi, I'm trying to learn recurrent networks, and a RecurrentBlock expects input with a (Batch, Time, Channel) shape.
My question is about how data for different time steps is organized. Dive into Deep Learning, chapter 8.6, states:
Besides, the updated hidden state (stateNew) returned by rnnLayer refers to the hidden state at the last time step of the minibatch. It can be used to initialize the hidden state for the next minibatch within an epoch in sequential partitioning.
According to this, I would think that each index of the Batch axis represents a different time step, that the highest index is the latest point in time of the batch, and that its hidden state is fed into the subsequent batch at index 0. So what does the second axis, "Time", encode? Isn't (Batch, Channel) enough?
A side question: items of a batch are usually processed in parallel in neural nets, but with the above interpretation of the shape of a RecurrentBlock, are they processed in sequence too, since one sample needs the output of the previous sample as input?