This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
As I understand it, the Embedding symbol lets me extract rows of a table (a weight matrix) using a set of indices, and I should also be able to enable the sparse-gradient option even though the indices and the table themselves are dense.
I made the simplest relevant test I could (I'm using the symbol API because I work in C++). Basically, we make a table symbol and an indices symbol. To make it more like an actual use case, I run an optimiser to drive the table values to 0, using one batch of random indices each iteration.
This works fine when we don't use sparse_grad on the Embedding, but fails if we do. The error message is: "Cannot use sparse_grad = 1, while stype of gradients w.r.t embedding weight is default".
I'm using MXNet 1.9.1.
```cpp
#include <iostream>
using std::cout;
using std::endl;

#include "mxnet-cpp/MxNetCpp.h"
namespace mx = mxnet::cpp;

int main( int argc, char *argv[] )
{
    auto ctx = mx::Context::gpu(0);

    // settings
    int numInds = pow(2,4);
    int tableRows = pow(2,10);
    int tableCols = 4;

    // stupidly simple graph
    auto indices = mx::BlockGrad( mx::Symbol::Variable("inds") );
    auto table = mx::Symbol::Variable("table");
    // last argument is sparse_grad - set it to true to reproduce the error
    auto sample = mx::Embedding( indices, table, tableRows, tableCols, mx::EmbeddingDtype::kFloat32, false );

    // and a simple loss - solve table values to 0
    auto loss = mx::square( sample );

    // execution graph is...
    auto graph = mx::Symbol::Group(
        {
            mx::MakeLoss(loss),
            mx::BlockGrad(mx::mean(loss))
        }
    );

    // state size of inds.
    std::map< std::string, mx::NDArray > args;
    args[ "inds" ] = mx::NDArray( mx::Shape( numInds ), ctx );

    // infer other sizes
    graph.InferArgsMap(ctx, &args, args);

    // bind...
    auto exec = graph.SimpleBind(ctx, args);

    // initialise table to some random values.
    auto ui = mx::Uniform( 100 );
    ui( "table", &args["table"] );

    // adam optimiser, which is meant to support sparse grads.
    mx::Optimizer *opt = mx::OptimizerRegistry::Find("adam");
    opt->SetParam("lr", 0.1 );

    for( unsigned c = 0; c < 1000000; ++c )
    {
        // make a batch of random indices....
        std::vector< float > inds(numInds);
        for( unsigned i = 0; i < numInds; ++i )
        {
            inds[i] = rand() % (tableRows - 1);
        }
        auto indsND = mx::NDArray( inds, mx::Shape( numInds ), mx::Context::cpu() );
        indsND.CopyTo( &args["inds"] );

        exec->Forward(true);
        exec->Backward( {exec->outputs[0]} );
        for( unsigned ac = 0; ac < exec->arg_arrays.size(); ++ac )
        {
            opt->Update(ac, exec->arg_arrays[ac], exec->grad_arrays[ac]);
        }

        mx::NDArray errND = exec->outputs[1].Copy( mx::Context::cpu() );
        mx::NDArray::WaitAll();
        cout << "err: " << *errND.GetData() << endl;
    }
}
```