
Discussion: Integrate our fork #11

Open
jonashaag opened this issue Aug 4, 2016 · 28 comments

Comments

@jonashaag (Contributor)

Intro and motivation

As already noted in vmprof/vmprof-python#90 (comment), we have implemented our own vmprof server, for the following reasons:

  • At that time, vmprof-server was very slow on large profiles (multiple hours of runtime) as it stored the profiles in the SQL database. (I'm not sure how the current implementation compares to ours.)
  • We wanted to have a good memory profile viewer in the server
  • We don't really need user accounts etc.

Features of our implementation

I can't share the source code of our server just yet, for bureaucratic reasons, but I can share some information and a few screenshots here.

Properties of, and differences from, vmprof-server:

  • About 1000 LOC
  • Uses vmprof-server CPU viewer (no jitlog integration yet)
  • Much improved memory viewer based on Plotly:
    • Shows time and date on X axis
    • Data points aren't simply subsampled from the complete data, but binned (mean), so that you don't miss spikes because of a too-coarse sampling interval
    • Shows memory usage mean, max, or both
    • Nice interaction with the graph due to Plotly
    • Shows absolute or relative runtime on X axis
  • Allows searching for projects and functions/callables
  • Stores profiles as gzipped msgpack files, no decoding done in the server whatsoever: Data is encoded to .msgpack.gz in the client once, and delivered to the browser UI as-is. The only exception to this is for memory profile resampling.
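The binning scheme described above can be sketched roughly as follows (a hypothetical `bin_samples` helper written for illustration, not the actual server code): memory samples are grouped into fixed-width time bins, and each bin keeps its mean and max, so a single-sample spike still shows up in its bin's max even after downsampling.

```python
def bin_samples(samples, n_bins):
    """samples: list of (timestamp, mem_bytes), sorted by timestamp.
    Returns a list of (bin_start_ts, mean_mem, max_mem), one entry per
    non-empty bin. Illustrative sketch, not the server's implementation."""
    if not samples:
        return []
    t0, t1 = samples[0][0], samples[-1][0]
    width = (t1 - t0) / n_bins or 1  # avoid zero-width bins for a single sample
    bins = {}
    for ts, mem in samples:
        i = min(int((ts - t0) / width), n_bins - 1)  # clamp last sample into last bin
        bins.setdefault(i, []).append(mem)
    return [
        (t0 + i * width, sum(v) / len(v), max(v))
        for i, v in sorted(bins.items())
    ]
```

A spike contained in a single sample still appears in that bin's max, whereas naive every-Nth subsampling could drop it entirely.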

We have also implemented a new client:

  • About 100 LOC

  • The interface isn't a script runner like python -m vmprof yourscript.py but a decorator that is applied to the callables to be profiled, like

    @profile
    def somefunc():
        ...
  • Allows tagging your submissions with a project name

  • Automatically tags your submissions with the top-level function/callable name (somefunc)

  • Client can upload normal vmprof profile files

  • Client protocol not compatible with the vmprof-python protocol (but very similar)
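Since the client itself isn't public, here is an illustrative sketch of what such a decorator-based interface looks like. cProfile stands in for vmprof so the example is self-contained, and the tagging/upload behavior in the comments is an assumption based on the description above, not the actual client API.

```python
import cProfile
import functools
import io
import pstats

def profile(func):
    """Illustrative stand-in for the decorator-based client described above.
    The real client would use vmprof and upload the result; cProfile is
    used here only so the sketch runs anywhere."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        prof = cProfile.Profile()
        prof.enable()
        try:
            return func(*args, **kwargs)
        finally:
            prof.disable()
            # The real client would upload the profile here, tagged with a
            # project name and the top-level callable name (func.__name__).
            stats = pstats.Stats(prof, stream=io.StringIO())
            print("profiled:", func.__name__, "total calls:", stats.total_calls)
    return wrapper

@profile
def somefunc():
    return sum(range(1000))
```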

Screenshots

[Screenshot from 2016-08-04 12:40:08]
Landing page with project names and top-level function names (in red). Search filters can be shared using the arrow on the right.

[Screenshot from 2016-08-04 12:41:36]
Integration of vmprof-server CPU viewer

[Screenshot from 2016-08-04 12:41:04]
The memory viewer, showing max memory usage for each bin (no hidden spikes!). On the right: the upper stack trace shows the part common to all stack traces of the bin; the lower stack trace extends the upper one with the bin's most common stack trace (28% of the stack traces in the bin were equal to the "concatenation" of the two stack trace parts).

[Screenshot from 2016-08-04 12:41:19]
Memory viewer showing mean + max of each bin.
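The stack-trace summary described in the captions above (the part shared by all stack traces in a bin, plus the most common continuation and its share) can be sketched with two hypothetical helpers; this is written for illustration and is not the server's actual code.

```python
from collections import Counter
from os.path import commonprefix  # works on any sequence of sequences, not just paths

def common_stack(traces):
    """Longest call-stack prefix shared by every trace in the bin."""
    return list(commonprefix(traces))

def most_common_extension(traces, prefix):
    """Most frequent full trace in the bin minus the shared prefix,
    plus the fraction of traces that match it."""
    trace, count = Counter(map(tuple, traces)).most_common(1)[0]
    return list(trace[len(prefix):]), count / len(traces)
```

For a bin whose traces all start with main -> run, `common_stack` yields that shared part, and `most_common_extension` yields the continuation shown in the lower stack trace together with its percentage.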

Future of our server, integration into vmprof-server

I think our server has some nice properties, mainly the memory viewer and the storage system (although I'm not sure how it compares to the current vmprof-server JSON/gzip storage system in terms of performance). We'd love to contribute most of it back to vmprof-server proper.

Possibility A: Integrate vmprof-server into our server

  • Integrate jitlog into our server
  • Maybe integrate user accounts into our server
  • Make our server the new official server

Possibility B: Integrate our memory viewer into vmprof-server

  • Add memory view to vmprof-server
  • Change protocol accordingly

What do you guys think?

@jonashaag (Contributor, Author)

cc @StephanErb

@planrich (Contributor) commented Aug 4, 2016

Hi, this looks like a solid enhancement to vmprof-server. As I see things now, I would opt for possibility B. We already maintain and run the service, and that will continue.

How did you solve the SQL storage issue? Do you store the gzipped profiles on the file system? If you ask me, we should not continue to store JSON in the relational database. For jitviewer the gzipped file is stored on the file system, which is good enough as far as I can tell.
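For illustration, the file-system blob storage under discussion might look like the following minimal sketch. The real server stores .msgpack.gz files and never decodes them; gzip+JSON stands in here to keep the example stdlib-only, and all names (`STORE`, `save_profile`, `load_profile`) are hypothetical.

```python
import gzip
import json
import os
import tempfile

STORE = tempfile.mkdtemp()  # stand-in for the server's profile directory

def save_profile(profile_id, payload):
    """Compress and write the profile blob; the server treats it as opaque."""
    path = os.path.join(STORE, profile_id + ".json.gz")
    with gzip.open(path, "wt") as f:
        json.dump(payload, f)
    return path

def load_profile(profile_id):
    """Read the blob back as-is, e.g. to deliver it to the browser UI."""
    with gzip.open(os.path.join(STORE, profile_id + ".json.gz"), "rt") as f:
        return json.load(f)
```

The point of the design is that the database (if any) only needs to hold metadata; the potentially large profile payload lives on disk as a compressed file.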

@planrich (Contributor) commented Aug 4, 2016

In any case, is the source available on the web? I would like to look at it for a bit to estimate how much work it would be to merge. Or would you just open a pull request?

@jonashaag (Contributor, Author)

> How did you solve the SQL storage issue? Store the gzipped profiles on the file system?

Yes.

> In any case, is the source available on the web? I would like to look at it for a bit to estimate how much work it would be to merge. Or would you just open a pull request?

As I said, I can't make it available at the moment, but I'll send you a private, confidential copy in a few seconds.

@planrich (Contributor)

Any news? I'm planning to release a bug-fix version today, so we could aim for the next major release.

@jonashaag (Contributor, Author)

Have you had a look at the code? I haven't put in the effort of making our server open source yet; I expected your assessment of the integration complexity first. But we can also make it open source first.

@planrich (Contributor) commented Sep 7, 2016

Yes, I did; nothing that requires a tremendous amount of re-engineering. If you can make it open source, we will integrate it into the public service. After all, this is a feature many people want to have!

@jonashaag (Contributor, Author)

Cool, we are through with the internal open source process, I'll release the source by the end of the week.

@jonashaag (Contributor, Author)

@planrich (Contributor)

Great! I assume that the license of both is compatible with MIT? I'll probably find some time soon to pull in your fork!

@jonashaag (Contributor, Author)

Awesome! I'm happy to assist. The license is MIT, yes. The features most important for us are memory profiles with good stack traces and higher-performance profile storage (not in the SQL database).

@jonashaag (Contributor, Author)

What's the current state of this? Is there anything I can do to help get the merge done?

@planrich (Contributor)

There are a few commits I have made; the migrations are ready, but I need to apply some changes here and there (model names have changed, ...). We also need to copy and modify the client in the vmprof package. That would be a small project on its own.

@jonashaag (Contributor, Author) commented Oct 26, 2016

OK, let me know if/when we should help out.

@planrich (Contributor)

I'm planning to work on vmprof this Friday (Munich, PyCon.DE) and maybe I'll find time to push this forward. It would be great if you could have a look at integrating the client-side code into github.com/vmprof/vmprof-python.

@rongekuta

@jonashaag, yesterday I set up vmprof-viewer-server following https://github.com/blue-yonder/vmprof-viewer-server

but ran into an error:
(env) [root@localhost vmprof-viewer-server]# vmprof_viewer/manage.py runserver
Performing system checks...

System check identified no issues (0 silenced).
November 22, 2016 - 03:03:11
Django version 1.10.3, using settings 'vmprof_viewer.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Unhandled exception in thread started by <function wrapper at 0x31c92a8>
Traceback (most recent call last):
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/utils/autoreload.py", line 226, in wrapper
fn(*args, **kwargs)
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/core/management/commands/runserver.py", line 142, in inner_run
handler = self.get_handler(*args, **options)
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/contrib/staticfiles/management/commands/runserver.py", line 27, in get_handler
handler = super(Command, self).get_handler(*args, **options)
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/core/management/commands/runserver.py", line 64, in get_handler
return get_internal_wsgi_application()
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/core/servers/basehttp.py", line 59, in get_internal_wsgi_application
sys.exc_info()[2])
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/core/servers/basehttp.py", line 49, in get_internal_wsgi_application
return import_string(app_path)
File "/root/vmprof-viewer-server/env/lib/python2.7/site-packages/django/utils/module_loading.py", line 20, in import_string
module = import_module(module_path)
File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
django.core.exceptions.ImproperlyConfigured: WSGI application 'vmprof_viewer.wsgi.application' could not be loaded; Error importing module: 'No module named wsgi'

What can I do to fix it?

@jonashaag (Contributor, Author) commented Nov 22, 2016

Create a file vmprof_viewer/wsgi.py with the following contents:

    """
    WSGI config for myproject project.
    It exposes the WSGI callable as a module-level variable named ``application``.
    For more information on this file, see
    https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/
    """

    import os
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "vmprof_viewer.settings")

    from django.core.wsgi import get_wsgi_application
    application = get_wsgi_application()

@planrich (Contributor)

I have been working on the server to display the memory graph. Since we agreed not to have separate files, I wonder what the file format of mem.msgpack.gz and addr_name_map.msgpack.gz is.

What I'm currently trying to do is reconstruct mem and addr_name_map on the server from the vmprof profile.

@jonashaag (Contributor, Author)

@planrich (Contributor) commented Dec 22, 2016

The resampling is now done on the server again (as briefly discussed in the commit comment); I think I got most of it working. To summarize, this is what is still missing on master:

  • peak memory is not displayed (top right)
  • duration is not displayed (top right)
  • relative and absolute time is not yet working
  • profiles cannot be grouped to projects
  • function profiling

For the last point (function profiling) I think there is some special decorator; can you maybe point me to an API or an example of how it should be used? I'm unsure how the top_level_function name is sent to the server.

@jonashaag (Contributor, Author)

Cool! We can leave out the function profiling stuff and grouping for now. Function profiling is something that can also live in a separate project; it's a mere convenience wrapper.

@criemen (Contributor) commented Apr 28, 2017

Hi,
as Jonas is no longer with us, I was assigned to this project.
I think #11 (comment) is still an accurate summary as of now.
What can we do to get the missing features into the server? Maybe without function profiling; I guess we could provide that externally and, if you're interested, integrate it later.

@planrich (Contributor) commented May 3, 2017

Sorry for the late response! I think as a first step it would be good to test the current setup as it is. One problem was, and still is, that, as with the flame graph visualisation, there is no documentation about what it means. E.g. what is the difference between absolute and relative in the memory view (what is the origin of 'relative')?

Here are some issues I remember:

  • We use PyPy to generate the output sent to the browser (which contains the memory view data); it sometimes happens that a numpy array is resized, which is not supported on PyPy
  • The Absolute/Relative buttons are not working

I have been thinking of extending vmprof.com with a short tutorial shown as an overlay on the flame graph that explains the essential details (as is done in the jitlog). I think that would be a good addition to the memory view as well.

@jonashaag (Contributor, Author)

Hi guys, if you have any questions, I'm happy to help.

Flamegraph: Not sure what you mean. The memory graph should be pretty obvious.

@Corni most of the differences between our internal vmprof frontend and the vmprof.com frontend should be easy to add, except for the grouping/project structure. Not sure if you guys @planrich are interested in that at all?

@planrich (Contributor) commented May 4, 2017

'Pretty obvious' is a stretchable term; at least I have experienced that some people have no clue what the flame graph means. Usually you get no feedback at all, and from time to time you find out that they are guessing how it "should" be (profiling is not about how it should be, but how it is, IMHO). So my idea is to make the docs easily accessible (preferably in the same view as the profiling visualisation).

@jonashaag (Contributor, Author)

Flame graph = the blue line? I'm confused by the term "flame graph" here... it's a completely different kind of visualisation than the CPU flame graph.

If flame graph = blue line, do you recall what confuses people about it?

@planrich (Contributor) commented May 4, 2017

No, I was talking about the CPU visualisation (=> CPU flame graph view). What confused me about the memory view is relative/absolute. Does relative mean "relative" to the minimum heap size (= subtract the minimum heap size)?

@jonashaag (Contributor, Author)

Um, it was relative/absolute TIME. The reasoning here was that some people know "this must have been sometime around 11:30 yesterday" while other people look for "about 20 minutes into the program run".
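The two x-axis modes can be illustrated with a tiny sketch (the helper names are hypothetical): the same sample timestamps are rendered either as wall-clock datetimes or as offsets from program start.

```python
from datetime import datetime, timezone

def absolute_axis(timestamps):
    """Unix timestamps -> wall-clock datetimes ("around 11:30 yesterday")."""
    return [datetime.fromtimestamp(t, tz=timezone.utc) for t in timestamps]

def relative_axis(timestamps):
    """Unix timestamps -> seconds since program start ("20 minutes in")."""
    t0 = timestamps[0]
    return [t - t0 for t in timestamps]
```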
