We observed that when new Faker\Generator instances are re-created between test cases, a "random" PHP garbage collector run during object allocation triggers Faker\Generator::__destruct, which changes the mt_rand seed and breaks further fixture generation in a non-deterministic way.
Context
I am not sure if I would consider this an actual bug in Faker. It's more that the specific use case makes it break in some (rare) conditions. We do not use Faker itself directly, but through the Symfony integration of Alice.
We use it to generate test data for PHPUnit tests. Alice populates our test database, using Faker to fill in fields that need data but not predefined specific values. Works great for snapshot tests and the like.
The Problem
With one test, we ran into an issue where the test would sometimes fail on some systems (mainly CI) because Faker generated different data for that run. Not always the "same different" data, not always the exact same test case, and not on every run.
Setup
The setup looks like this:
For each test case, we "boot" a fresh Symfony stack, recreating all services. In the test configuration, Faker\Generator is a service in that stack, used by Alice to populate the test data.
So on each test, we create all these instances, including a Faker\Generator, which also gets a predefined seed,
use Alice to read our fixture definitions and create objects from it, using Faker to populate properties,
start a fresh DB transaction, insert that data into our test DB, run tests against that DB, and compare the result with snapshots.
Then we roll back the whole transaction, tear down the Symfony stack, and start the next test with a fresh stack (which calls seed again) and a fresh, clean DB.
Probable root cause
After some debugging we found that sometimes Faker\Generator::__destruct is called while Alice creates the test data, on random object creations. I think that this is the PHP GC kicking in and reclaiming the Faker\Generator instance used in the previous test run.
This would also fit the behaviour seen in #272.
It also explains why we only see these problems in some cases, only on some hosts, and never when running a single test alone.
With a minimal test case I could reproduce the problem, and commenting out the seed() call in __destruct also "fixed" the issue.
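The timing difference behind this can be sketched in plain PHP, without Faker. The `Node` class below is a hypothetical stand-in, not Faker code. It shows that an object that is part of a reference cycle is not destructed when the last variable pointing to it is unset, but only when the cycle collector runs. The sketch assumes the old generator is kept alive by such a cycle, for example through providers that hold a reference back to it.

```php
<?php

declare(strict_types=1);

gc_enable(); // make sure the cycle collector is active

// Hypothetical class for illustration; it records when its destructor runs.
final class Node
{
    /** @var list<string> names in the order their destructors ran */
    public static array $destructed = [];

    public ?Node $peer = null;

    public function __construct(private string $name)
    {
    }

    public function __destruct()
    {
        self::$destructed[] = $this->name;
    }
}

// No cycle: the destructor runs as soon as the last reference is gone.
$plain = new Node('plain');
unset($plain);
$afterPlainUnset = Node::$destructed;

// Cycle: both objects still reference each other after unset().
$a = new Node('a');
$b = new Node('b');
$a->peer = $b;
$b->peer = $a;
unset($a, $b);
$afterCycleUnset = Node::$destructed;

// Only a collector run (explicit here, automatic in real code) destructs them.
gc_collect_cycles();
$afterCollect = Node::$destructed;

printf("after unset (no cycle): %s\n", implode(', ', $afterPlainUnset));
printf("after unset (cycle):    %s\n", implode(', ', $afterCycleUnset));
printf("after gc run:           %s\n", implode(', ', $afterCollect));
```

In a real test run the collector is triggered by the engine once enough candidate objects have accumulated, which would explain why the destructor appears to fire on arbitrary allocations.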
Conclusion
Our current workaround is to call gc_collect_cycles in the PHPUnit setUp method, before starting the new Symfony stack. As expected, this triggers Faker\Generator::__destruct, and thereby seed, before the new instance is created and sets the expected seed.
From our observation, that seems to have fixed our breaking tests.
But I am not sure if that is really what we want to do here, and in all tests that use Alice/Faker.
Why these exact test cases failed while all others using the same mechanism did not, I can only guess.
Most likely it is just the amount of data. We usually try to limit test data to a minimum, so we do not create tens or hundreds of entities with it.
For the new one we required more data, and therefore I figure we just pushed it "over the edge", where this test case has a considerably higher chance of triggering a GC run while building fixtures than any other test.
So for me, there are a few different ways this can be resolved:
either add gc_collect_cycles to every test, to be run before building the new stack, or even add it to the kernel itself.
or use Faker differently, preventing multiple instances of it in the first place. Not trivial with the way Symfony works in the test cases, and not sure if I like all implications of that.
or do not seed on __destruct. Breaking change, not sure about all implications of that, seems like a rat's nest of problems.
?? maybe we are holding something very wrong and all of this is just me being stupid? Always an option!
I file the issue here, and not in one of the upstream projects, because I have to start somewhere and Faker seemed like the correct place.
I use gc_collect_cycles to urge PHP to collect the dangling Generator instance here. In reality, this happens "somewhere" down the line as more and more new objects are allocated.
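The effect, and the setUp workaround, can be sketched without Faker installed. `StandInGenerator` and `StandInProvider` are hypothetical classes that model only two assumed behaviours of Faker\Generator: seed() forwards to mt_srand, and __destruct calls seed() without an argument. The seed value 1234 and the function name `drawAfterSeed` are made up for illustration.

```php
<?php

declare(strict_types=1);

gc_enable(); // make sure the cycle collector is active

// Hypothetical stand-in, NOT the real Faker\Generator.
final class StandInGenerator
{
    /** @var list<StandInProvider> providers point back here, forming a cycle */
    private array $providers = [];

    public function __construct()
    {
        $this->providers[] = new StandInProvider($this);
    }

    public function seed(?int $seed = null): void
    {
        if ($seed === null) {
            mt_srand(); // no argument: re-seed with a random value
        } else {
            mt_srand($seed);
        }
    }

    public function __destruct()
    {
        $this->seed(); // assumed behaviour under discussion
    }
}

final class StandInProvider
{
    public function __construct(public StandInGenerator $generator)
    {
    }
}

/** @return array{int, int} two draws taken after seeding the "new" generator */
function drawAfterSeed(bool $collectBeforeSeeding): array
{
    // "Previous test": the generator is dropped but kept alive by its cycle.
    $old = new StandInGenerator();
    $old->seed(1234);
    unset($old);

    if ($collectBeforeSeeding) {
        gc_collect_cycles(); // the workaround: destruct before seeding
    }

    // "Next test": fresh generator with the predefined seed.
    $new = new StandInGenerator();
    $new->seed(1234);
    $first = mt_rand();

    gc_collect_cycles(); // stands in for an automatic GC run mid-fixture
    $second = mt_rand();

    return [$first, $second];
}

// Reference sequence for seed 1234 with no generator involved.
mt_srand(1234);
$baseline = [mt_rand(), mt_rand()];

$withWorkaround = drawAfterSeed(true);
gc_collect_cycles(); // clear leftovers so the two runs do not interfere
$withoutWorkaround = drawAfterSeed(false);

printf("baseline:           %d, %d\n", ...$baseline);
printf("with workaround:    %d, %d\n", ...$withWorkaround);
printf("without workaround: %d, %d\n", ...$withoutWorkaround);
```

With the early collection the two draws should match the baseline. Without it, the first draw still matches, but the second one is taken after the old instance's destructor has re-seeded mt_rand, so it should differ from run to run.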
Expected output (with PHP 8.3 and thus MT_RAND_MT19937)
I have the exact same problem when using zenstruck/foundry! When creating around 50+ fixtures using a createMany() call, the seeding started to break. It took me a while to find out it was the __destruct call from some previous Generator instances, which get created either by booting the kernel more than once in a single test, or because Foundry creates some Generator instances on the fly (I didn't investigate why). Thanks for the detailed bug report and I hope this will get fixed soon :)