Ability to make network requests during the init phase, from a SharedArray's callback #2962

Open
michael-sw-id opened this issue Mar 8, 2023 · 5 comments
Labels: evaluation needed (proposal needs to be validated or tested before fully implementing it in k6), feature

Comments

@michael-sw-id

michael-sw-id commented Mar 8, 2023

Feature Description

I am trying to load test users from Redis (created for the load test by another k6 script) into a SharedArray, but I am getting the following error:

ERRO[0002] could not initialize 'scenario-create-order-from_json.js': could not load JS test '/k6/scenario-create-order-from_json.js': Uncaught (in promise) connecting to a redis server in the init context is not supported

It should be possible to connect to redis or mongo from init context so that we can load test users.

S3 would also be a great source for users

When running load tests deployed in cluster/cloud, we would not want to mess around with json/csv files.

Having mongo or persistent redis as data store for test users fits in with test data management practices.

Source

const redisClient = new redis.Client({
  // 'host:port' entries separated by commas; falls back to localhost when redis_addrs is unset
  addrs: (redis_addrs || 'localhost:6379').split(','),
  password: redis_password,
});

const getCreatedCustomersAsync = async () => {
  let parsedCustomers = [];

  sleep(1) // for some reason need to wait for redis 
  let redisCustomers = await redisClient.smembers('created_customers')

  console.log(redisCustomers.length + " customers found in redis..")
  redisCustomers.forEach(c => {
    parsedCustomers.push(JSON.parse(c))
  });

  //redisClient.del("created_customers")
  return parsedCustomers;
}

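const customers = new SharedArray('customers', function () {
  // return JSON.parse(open('./data/customer-pool.json'));
  let result = []

  // Note: the promise settles only after this callback has returned,
  // so `result` is still empty at the `return` below.
  getCreatedCustomersAsync().then(
    (customers) => result = customers
  );

  return result;
});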
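
Separately from the init-context restriction, the callback above returns before the promise settles. The following is a minimal plain-JavaScript sketch of that timing, runnable outside k6. `fetchCustomersAsync` and `loadCustomers` are hypothetical stand-ins for the Redis call and the SharedArray callback, and it assumes the callback's return value is used as-is.

```javascript
// Hypothetical stand-in for an async data source such as a Redis client.
function fetchCustomersAsync() {
  return Promise.resolve(['{"id":1}', '{"id":2}']);
}

// Mirrors the shape of the callback above: it has to return synchronously.
function loadCustomers() {
  let result = [];
  fetchCustomersAsync().then((rows) => {
    // Runs only after loadCustomers has already returned.
    result = rows.map((row) => JSON.parse(row));
  });
  return result; // still the original empty array
}

const customers = loadCustomers();
console.log(customers.length); // prints 0
```

Because the `.then` handler reassigns the local variable rather than mutating the returned array, the caller never sees the parsed rows, even later.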

Suggested Solution (optional)

No response

Already existing or connected issues / PRs (optional)

Someone seems to have suggested that it is possible to use redis extension here:
https://community.k6.io/t/getting-data-from-database-instead-of-csv-file/5493/3

na-- assigned and unassigned na-- on Mar 8, 2023
na-- added the "evaluation needed" label (proposal needs to be validated or tested before fully implementing it in k6) on Mar 8, 2023

@na--
Member

na-- commented Mar 8, 2023

Please take a look at the discussion in #2911 and #2719 (comment) for an explanation of why we don't allow networking code in the VU init context, i.e. in the global JavaScript scope that gets executed every time a new VU is initialized. That said, your example of putting the network requests in the SharedArray callback and some previous thoughts on the topic prompted me to write this comment in the "SharedArray improvements" issue: #2043 (comment)

The TLDR version is that I think something like this might be possible. But someone needs to try and make a proof of concept implementation, since there are some pretty significant obstacles and problems that I see, and probably more that I am missing. And, even if we can implement it with reasonable trade-offs, it won't help with non-data use cases like #2719, but it should be very helpful for use cases where users want to get a lot of data from a remote system before running the test.

So, I'll leave this issue open, but I'll change the title a little bit to make it more generic. Please, anyone who has a similar use case, share it here and upvote (:+1:) the main post. Again, no promises, since I am not sure if it's even possible or practical to implement this feature, but it would be helpful to know how many people would like to have it and why. In the worst case, we can use that information to give higher priority to some of the other potential solutions to the same problems.

na-- changed the title from "Ability to connect to redis or mongo from init context" to "Ability to make network requests during the init phase, from a SharedArray's callback" on Mar 8, 2023

@michael-sw-id
Author

This seems to be a similar requirement to ours (distributed test data):
#2043 (comment)

@na--
Member

na-- commented Mar 9, 2023

Yes, all of these solutions can roughly solve the same set of problems:

  1. Allow SharedArray values to be created during setup() and returned from it.
  2. Create a new SharedObject type that is similar to SharedArray (i.e. has one immutable copy of the data in memory) and use it for the setup() result, instead of copying its value to every VU, as it currently happens. Somewhat similar to ⬆️, but more generic, and it will allow large data volumes to be returned from setup().
  3. Allow SharedArray values to be created in "VU contexts", i.e. in normal test scenarios or even setup() (without necessarily being able to return them from it). This would allow users to have a "getData" scenario that is executed before the main test, which can load the required data in a SharedArray. Though, given how inflexible chaining scenarios is, this will be most useful with test suites (Test suites / execute multiple scripts with k6 #1342).
  4. Allow network requests from the SharedArray callback in the init phase (this issue).
  5. A bit different from the rest (and also solving other problems besides data management), but being able to have a per-VU setup function in every scenario (Per-VU init lifecycle function #785, setup() per scenario #1638) might also help with data partitioning.

These all have different tradeoffs and implementation issues that need to be overcome. Personally, I think 2. and 3. are the most promising, with the fewest sharp and weird edge cases. But we have never made a proof of concept implementation for any of them, so there are probably plenty of unknown unknowns remaining...

@michael-sw-id
Author

Probably I don't understand the lifecycle, but if SharedArray is intended as a feeder of data for VUs, it seems counter-intuitive that it is initialised for every VU.

At least, for the VU data feeder use-case of having a data record per VU, you'd want to load the data into the shared array once and assign a record to each VU.
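
One way the "record per VU" idea could look, as a plain-JavaScript sketch rather than anything k6 provides: `recordForVU` is a hypothetical helper, and the sample records are made up. In a real k6 script the id would presumably come from the `__VU` global, which this sketch assumes to be 1-based.

```javascript
// Hypothetical helper: map a 1-based VU id to one record,
// wrapping around when there are more VUs than records.
function recordForVU(records, vuId) {
  if (records.length === 0) {
    throw new Error('no records loaded');
  }
  return records[(vuId - 1) % records.length];
}

// Made-up sample data standing in for the loaded customer pool.
const customers = [{ name: 'a' }, { name: 'b' }, { name: 'c' }];

console.log(recordForVU(customers, 1).name); // prints "a"
console.log(recordForVU(customers, 3).name); // prints "c"
console.log(recordForVU(customers, 4).name); // prints "a" (wraps around)
```

Wrapping with the modulo means records get reused once the VU count exceeds the pool size; a test that needs strictly unique records would have to fail instead.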

@na--
Member

na-- commented Mar 9, 2023

it seems counter-intuitive that it is initialised for every VU.

But it isn't 😕 See the comments in this example:

// Every VU will execute this code 

const data = new SharedArray('some id', function () {
   // Only one VU will execute this function, other VUs will wait for the result:
   // ... 
   // ... some code that retrieves or generates data ...
   // ...
   return aBigChunkOfData;
});

// Every VU now has a local variable `data` with a reference to the same single copy of the shared immutable array data

Since every VU is an independent JavaScript runtime, they each need a local reference to the shared data. However, regardless of whether we initialize the SharedArray during VU initialization or test execution (i.e. the somewhat wrongly-named "init" and "VU" contexts), one VU still has to run the function that generates the SharedArray's contents.
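
The "one VU runs the function, every VU gets a reference" behaviour described above can be modelled in plain JavaScript. This is only a toy sketch, not how k6 implements SharedArray: `sharedArrayModel` and `registry` are hypothetical names, the "VUs" are simulated by repeated calls in one runtime, and the freeze is shallow.

```javascript
// Toy model only: a name-keyed registry standing in for the shared store.
const registry = new Map();

function sharedArrayModel(name, producer) {
  if (!registry.has(name)) {
    // The first caller for this name runs the producer exactly once.
    registry.set(name, Object.freeze(producer()));
  }
  // Every caller gets a reference to the same (shallowly frozen) array.
  return registry.get(name);
}

let producerCalls = 0;
const producer = () => {
  producerCalls += 1;
  return [{ id: 1 }, { id: 2 }, { id: 3 }];
};

// Three simulated "VUs" execute the same top-level code.
const vuViews = [1, 2, 3].map(() => sharedArrayModel('some id', producer));

console.log(producerCalls); // prints 1
console.log(vuViews.every((view) => view === vuViews[0])); // prints true
```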
