Admin pages frequently come up blank white for custom hubs cloud clients. #4208

MegaMotion · 2021-04-29T18:06:31Z

Bug can be erratic and does not affect everyone, but those affected see nothing but a blank white screen instead of admin pages, and very few to no messages in the browser logs.

To reproduce: deploy a custom client, and you may or may not have this problem.

When active, the bug affects all browsers equally.

We experienced the issue from the point we installed our current stack several weeks ago all the way up until a couple of days ago, not immediately but a couple of days after installing the new hubs cloud client. At one point it just miraculously fixed itself, and worked for a day or two, but today it has gone back to a blank white screen, and several other users are reporting the problem now as well.

blairmacintyre · 2021-04-29T20:14:52Z

Ditto. We just installed a custom client and are seeing the same thing. We had been running a custom client based on the previous client code, and ported our small changes to this one. Admin pages work fine without the custom client, and everything else in the client seems to work (spoke, rooms, etc). Just the admin pages are affected. Signing out and back in doesn't help.

mattrossman · 2021-04-29T20:45:43Z

In our case, the admin page is served an empty JavaScript bundle even though the expected 3 MB build file appears in dist/ locally. Perhaps something is going wrong during the deploy script's upload process. After an npm run undeploy, the admin page becomes visible; it's only npm run deploy that doesn't work.

blairmacintyre · 2021-04-29T21:04:31Z

FWIW, /hab/svc/hubs/var/dist/pages/admin.html (Is this where it's served from?) is small.

app root@optimistic-troll:/hab/svc/hubs# ls -la var/dist/pages/
total 128
drwxr-xr-x 2 hab hab  4096 Apr 29 16:07 .
drwxr-xr-x 6 hab hab  4096 Apr 29 16:07 ..
-rw-r--r-- 1 hab hab  2274 Apr 29 16:07 admin.html
-rw-r--r-- 1 hab hab  2678 Apr 29 16:07 avatar.html
-rw-r--r-- 1 hab hab  2987 Apr 29 16:07 cloud.html
-rw-r--r-- 1 hab hab  3030 Apr 29 16:07 discord.html
-rw-r--r-- 1 hab hab 61479 Apr 29 16:07 hub.html
-rw-r--r-- 1 hab hab  1578 Apr 29 16:07 hub.service.js
-rw-r--r-- 1 hab hab  3040 Apr 29 16:07 index.html
-rw-r--r-- 1 hab hab  3149 Apr 29 16:07 link.html
-rw-r--r-- 1 hab hab  3388 Apr 29 16:07 scene.html
-rw-r--r-- 1 hab hab 11038 Apr 29 16:07 schema.toml
-rw-r--r-- 1 hab hab  2149 Apr 29 16:07 signin.html
-rw-r--r-- 1 hab hab  2154 Apr 29 16:07 verify.html
-rw-r--r-- 1 hab hab  2309 Apr 29 16:07 whats-new.html

At the bottom of admin.html it's trying to load a script at <script type="text/javascript" src="https://gt-ael-aq-assets.aelatgt-internal.net/hubs/assets/js/admin-f0d3e2008bc2c665d632.js"></script></body>

That script exists, and is fairly large

app root@optimistic-troll:/hab/svc/hubs# ls -l var/dist/assets/js/admin-f0d3e2008bc2c665d632.js
-rw-r--r-- 1 hab hab 3240496 Apr 29 16:07 var/dist/assets/js/admin-f0d3e2008bc2c665d632.js

So it's being uploaded, it's just not "being gotten".

blairmacintyre · 2021-04-29T21:10:54Z

And the response from the server for that file is 200, happy, but it gets back content-length 0.
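One way to confirm this symptom from outside the server is to request the asset and count the bytes that actually arrive. The following is a minimal sketch using only Node's built-in http and https modules; the function names fetchBodySize and isEmptyAsset are hypothetical and are not part of Hubs or its deploy script.

```javascript
const http = require("http");
const https = require("https");

// Resolve with the status code and the number of body bytes actually received.
function fetchBodySize(url) {
  const client = url.startsWith("https:") ? https : http;
  return new Promise((resolve, reject) => {
    client
      .get(url, res => {
        let bytes = 0;
        res.on("data", chunk => {
          bytes += chunk.length;
        });
        res.on("end", () => resolve({ status: res.statusCode, bytes }));
      })
      .on("error", reject);
  });
}

// A 200 response with an empty body is the symptom described above.
async function isEmptyAsset(url) {
  const { status, bytes } = await fetchBodySize(url);
  return status === 200 && bytes === 0;
}
```

Calling isEmptyAsset with the script URL found in admin.html would distinguish "the file is served but empty" from a 404 or a network error.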

MegaMotion · 2021-04-30T14:25:10Z

Same thing here: if I view source, I am getting an html page, but the body contains nothing but a ui-root div and a call to that admin js file, which does not load.

blairmacintyre · 2021-04-30T15:25:28Z

Given that I see the JS file on my server at the right place, but if I try to access that URL directly in the browser it returns nothing, is it possibly a server issue or some sort of cache/CDN problem?

mattrossman · 2021-05-02T17:33:54Z

@blairmacintyre There are two different .js files at play here. The file you showed that exists on the server is accessible by direct URL here: admin-f0d3e2008bc2c665d632.js. However, the admin page that I see is requesting a different script which produces the 0-byte response: admin-8ad4ac3d365a78faa8cc.js.

The latter file is the "correct" one which was built locally and supposedly uploaded, but it appears the server is still holding an old script file. Maybe if you try manually cleaning the files on the server and then re-deploying, it will accept the new script?

> At the bottom of admin.html it's trying to load a script at <script type="text/javascript" src="https://gt-ael-aq-assets.aelatgt-internal.net/hubs/assets/js/admin-f0d3e2008bc2c665d632.js"></script>

This is the only part that confuses me, I'm seeing a different script requested by the current admin page. Could you double check this?

MegaMotion · 2021-05-02T18:47:32Z

Hm, that is an interesting clue. I can see in my own example that the script being called from my admin.html is zero bytes in size, in my S3 bucket. I can also see that I had a 3M version of this file on January 26th, the last time the admin pages worked, and I have many other versions that are zero bytes, from all the times it has not worked.

So clearly it is the process of creating this file that is dropping the ball.

blairmacintyre · 2021-05-02T19:16:29Z

@mattrossman hmmm. I'm confused now too. The dates on these files are correct in that directory on the server, based on when we deployed the modified client, but the admin.html file we are getting (that you see, Matt, which I also see, now that I look) is different from the admin.html file on the server.

Am I not looking in the right place? Is there some sort of cache on the server that's stuck?

MegaMotion · 2021-05-11T18:10:47Z

Well, my only current theory is the New Moon tonight, but... my admin pages are working again for the moment. :-) Anybody else see anything different here today?

blairmacintyre · 2021-05-11T18:18:10Z

Mine started working; I rebuilt and re-uploaded. No idea why it failed the first time.

MegaMotion · 2021-05-14T05:06:41Z

Whoops, back to white pages again. :-\

johnshaughnessy · 2021-10-12T17:08:10Z

I've seen this happen as a result of having "invalid" data in local storage. We can probably handle this case more gracefully.

MegaMotion · 2021-12-14T22:55:07Z

Well, for anyone else cursed with this affliction, I just discovered a workaround! Deploying a clean, fresh client does not do anything for me, however "npm run undeploy" DOES restore my admin pages! Obviously, this is not a fix, since it abandons all my custom client work, but it enables making changes to admin and then redeploying the custom client afterwards.

rawnsley · 2022-01-06T10:22:49Z

I'm also experiencing this "white admin page" issue. The pathology for me is that admin.js is truncated and stops at some random point within the file. This point is different each time.

The root cause for me is here in the deploy script. The contents of admin/dist are copied to the top-level dist folder, but the ncp function invokes its callback multiple times and the first time is often before the copy is complete. These incomplete files are then bundled up into the tar file for distribution. Here is the function with some console logging:

...
  console.log("NCP BEFORE");
  await new Promise(res => {
    ncp("./admin/dist", "./dist", err => {
      console.log("NCP CALLBACK");
      if (err) {
        console.error(err);
        process.exit(1);
      }

      res();
    });
  });
  step.text = "Preparing Deploy.";
  console.log("NCP AFTER");
...

The output is typically something like this:

NCP BEFORE
NCP CALLBACK
NCP AFTER
NCP CALLBACK
NCP CALLBACK

Multiple callbacks are NOT part of the advertised functionality and I can confirm (using fs.stat) that the file copy is not always complete after the first callback.
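The log order follows from how Promises work: a Promise settles on the first call to its resolve function, and later calls are ignored. The sketch below illustrates this with flakyCopy, a hypothetical stand-in that imitates the behaviour described above (an early callback, then a later one when the work is really done). It is not ncp itself, and the timings are arbitrary.

```javascript
// Hypothetical stand-in for a copy routine that invokes its callback more
// than once, the first time before the work has finished.
function flakyCopy(onDone) {
  const state = { finished: false };
  setTimeout(() => onDone(null), 10); // premature callback
  setTimeout(() => {
    state.finished = true; // the "copy" really completes here
    onDone(null);
  }, 50);
  return state;
}

// Wrap the callback in a Promise the same way the deploy snippet does.
async function copyAndCheck() {
  let state;
  let callbacks = 0;
  await new Promise((res, rej) => {
    state = flakyCopy(err => {
      callbacks += 1;
      if (err) return rej(err);
      res(); // only the first call has any effect on the Promise
    });
  });
  // Execution resumes after the FIRST callback, so the copy may be incomplete.
  return { finishedAtResume: state.finished, callbacksAtResume: callbacks };
}
```

In this sketch the awaiting code resumes after one callback while the simulated copy is still unfinished, which mirrors "NCP AFTER" appearing before the later "NCP CALLBACK" lines.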

This fits with the pathology:

- The problem only impacts admin pages because the client pages are already there.
- admin.js is the largest file in that folder and so most likely to be truncated.
- The problem has got worse as admin.js has grown in size; in fact I triggered it months ago with an accidental include that ballooned the size of the file.
- Perhaps the problem is also getting more common because CPUs are getting faster and outpacing the I/O? This last one is a guess based on the fact that the problem happens 100% of the time on my MacBook Pro and never happened on my previous computer.

Inserting a pause after the copy command makes the problem go away, but obviously that's a band-aid solution. ncp hasn't been touched in a long time and should be replaced with something more modern and less flaky.

daCking15 · 2022-01-25T19:36:31Z

@rawnsley By "inserting a pause after the copy command", do you mean something like the following:

daCking15 · 2022-01-25T20:07:57Z

I also tried this, but no luck so far:

daCking15 · 2022-01-25T20:19:31Z

I can also confirm that this wasn't an issue on my older computer (i3), but has been on newer ones (i5, i7, M1)

rawnsley · 2022-01-25T20:24:55Z

@daCking15

Something like this to give the ncp command the time it needs to actually finish, which is probably a fraction of a second:

  ...
  await new Promise(res => {
    ncp("./admin/dist", "./dist", err => {
      if (err) {
        console.error(err);
        process.exit(1);
      }

      res();
    });
  });
  step.text = "Preparing Deploy.";

  // HACK TO WORK AROUND NCP BEHAVIOUR
  await new Promise(res => setTimeout(res, 5000));

  step.text = "Packaging Build.";
  tar.c({ sync: true, gzip: true, C: path.join(__dirname, "..", "dist"), file: "_build.tar.gz" }, ["."]);
  step.text = `Uploading Build ${buildEnv.BUILD_VERSION}.`;
  ...

I haven't had the problem since making this change and the one time I did have a problem was because I had accidentally reverted it.
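A fixed pause cannot tell whether the copy actually finished. As a complement, the deploy could compare file sizes between the source and destination before packaging, in the spirit of the fs.stat check mentioned earlier in the thread. The following is a minimal sketch; listFiles and findIncompleteCopies are hypothetical helper names, and a matching size does not prove the contents are identical.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively list the files under dir as paths relative to base.
function listFiles(dir, base = dir) {
  return fs.readdirSync(dir, { withFileTypes: true }).flatMap(entry => {
    const full = path.join(dir, entry.name);
    return entry.isDirectory() ? listFiles(full, base) : [path.relative(base, full)];
  });
}

// Return the relative paths whose copy in destDir is missing or differs in
// size from the original in srcDir. An empty array means all sizes match.
function findIncompleteCopies(srcDir, destDir) {
  return listFiles(srcDir).filter(rel => {
    const destPath = path.join(destDir, rel);
    if (!fs.existsSync(destPath)) return true;
    return fs.statSync(destPath).size !== fs.statSync(path.join(srcDir, rel)).size;
  });
}
```

Run against "./admin/dist" and "./dist" just before the tar step, a non-empty result would be a reason to abort the deploy rather than upload a truncated admin bundle. This detects the problem; it does not fix the underlying copy.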

markusTraber · 2022-04-06T15:18:21Z

I've had the same issue since the April Hubs Cloud update. @rawnsley's hack got it working for me now, thank you! :)

msalafia · 2022-04-08T14:47:55Z

@rawnsley you saved my day

mattrossman · 2022-04-12T14:34:56Z

Thanks for identifying that, I was having this issue again with the April release and your workaround fixed it.

Looks like a known issue with ncp:
AvianFlu/ncp#143

> ncp hasn't been touched in a long time and should be replaced with something more modern and less flaky.

Here is another package with similar functionality and popularity but is more recently updated, perhaps it would be a more reliable option. It does have several dependencies though.
https://www.npmjs.com/package/cpy

takahirox · 2022-04-13T16:51:36Z

Honestly, I'm not really familiar with the deploy script yet, but it sounds like @rawnsley's workaround should go into core: we often get reports about this problem, the workaround seems to actually resolve it, and it is very simple.

We may be able to either seek a more proper way to detect the copy completion, try to fix ncp, or try other libs later if possible and needed.

What do you think? @netpro2k @brianpeiris

rawnsley · 2022-04-14T08:21:41Z

I've submitted a trivial PR in case you want to include it temporarily.

jbshin-gemiso · 2022-04-15T00:53:43Z

@rawnsley

This was also very helpful for me.
Have a nice day !

brianpeiris · 2022-04-16T11:53:01Z

Thanks for the detailed investigation and workaround @rawnsley, and for the alternative @mattrossman. I'll see if I can fix this more permanently.

brianpeiris · 2022-04-16T16:29:21Z

Alright, I think #5365 is a good fix, though I wasn't able to reproduce the issue myself, so that might require verification from the community. But, I think it's solid enough to ship.

Sorry it took so long to get to this. Thanks to @takahirox for bringing attention to it.
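Per its title in this issue's timeline, #5365 replaces ncp with fs-extra's copy. The idea is a copy that returns a Promise which settles only once the whole tree has been written. The sketch below shows that idea using Node's built-in fs.promises.cp (available from Node 16.7) purely to stay dependency-free. It is a swapped-in illustration, not the code from the PR, and copyDir is a hypothetical name.

```javascript
const fs = require("fs");

// Copy srcDir into destDir and resolve only when every file has been written.
// fs.promises.cp is Node's built-in recursive copy; it is used here as a
// stand-in and is not the function the project adopted.
async function copyDir(srcDir, destDir) {
  await fs.promises.cp(srcDir, destDir, { recursive: true });
}
```

In a deploy script shaped like the one quoted earlier, this would take the place of the Promise-wrapped ncp call, for example: await copyDir("./admin/dist", "./dist");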

brianpeiris · 2022-05-09T15:53:43Z

I'm going to mark this as closed, since #5365 has been merged into our master branch, though it will take a while to be released to Hubs Cloud proper. Keep an eye on the changelog to watch for that. In the meantime, if you are running into this issue, you should be able to cherry-pick the changes from #5365 into your Hubs Cloud fork directly.

Dayk0 · 2022-05-12T11:10:27Z

Yes. The white page still occurs in the "master" branch on GitHub. You have to upload the files directly to the 'hubs-cloud' branch to avoid this problem.

wswoodruff · 2022-05-12T15:45:43Z

This may or may not be related, but I've experienced some erratic behavior with whitescreens before when making rapid enough requests to reticulum.

Ret has a rate limiter available as a plug and it looks like lots of files pull it in. I'm pretty sure the rate limiter is what was causing this type of error for me.

Permalink: https://github.com/mozilla/reticulum/blob/239742e27c019f35ac50b939b307144b308fd3a7/lib/ret_web/rate_limit.ex

MegaMotion added the bug and needs triage labels Apr 29, 2021

johnshaughnessy added the jira-hubs label Aug 9, 2021

rawnsley mentioned this issue Apr 13, 2022

Deploy custom client problem #5357

Closed

rawnsley added a commit to LearnHub/hubs that referenced this issue Apr 14, 2022

Hack for unexpected ncp behaviour as discussed in Hubs-Foundation#4208

a7a16fd

rawnsley mentioned this issue Apr 14, 2022

Hack for unexpected ncp behaviour #5363

Closed

brianpeiris self-assigned this Apr 16, 2022

brianpeiris mentioned this issue Apr 16, 2022

Fix Hubs Cloud admin panel deploy by replacing ncp with fs-extra copy #5365

Merged

brianpeiris closed this as completed May 9, 2022
