pyProbe is a collection of data gathering and analysis tools for Freenet network probes. These probes report limited sets of information at once and apply random noise in order to reduce how identifiable information is while keeping it useful for network-wide statistics. Requires Freenet build 1409 or greater.
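To illustrate why noisy reports remain useful in aggregate, here is a minimal sketch that perturbs each simulated report and then averages many of them. The noise model (a Gaussian scaling factor with a 5% standard deviation), the function name add_noise, and the sample value are assumptions made for illustration; they are not taken from Freenet's probe implementation.

```python
import random

def add_noise(value, sigma=0.05, rng=random):
    """Scale value by a Gaussian factor centred on 1 (hypothetical noise model)."""
    return value * rng.gauss(1.0, sigma)

rng = random.Random(1409)  # fixed seed so the sketch is reproducible
true_value = 40.0          # made-up quantity, e.g. a peer count
reports = [add_noise(true_value, rng=rng) for _ in range(10000)]
mean_report = sum(reports) / len(reports)

# A single report can be off by several percent, but the mean over many
# reports stays close to the true value.
print("first report: %.2f" % reports[0])
print("mean of %d reports: %.2f" % (len(reports), mean_report))
```

Any single report says little about the node that sent it, while the average over the whole sample converges on the underlying value.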
- Python 2.6 or higher
- argparse (if using Python earlier than 2.7)
- Freenet
- gnuplot (for extra analyze.py plots)
- gnuplot-py
- [rrdtool](http://oss.oetiker.ch/rrdtool/download.en.html) (rrdpython)
- Twisted
- twistedfcp
- Markdown
- enum34
- postgresql
- psycopg
- numpy
Freenet, Python, gnuplot, rrdtool, Twisted, and Markdown all have installation instructions on their respective sites.
# pip install markdown enum34 gnuplot-py psycopg2 numpy
- argparse was added to Python in version 2.7. It is available for older versions on the package index:
# pip install argparse
- Clone twistedfcp:
$ git clone https://github.com/AnIrishDuck/twistedfcp.git
$ cd twistedfcp
# python setup.py install
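After installing, a quick way to see which Python dependencies are still missing is to try importing each one. The sketch below does that; the module names in ASSUMED_MODULES are assumptions about how each package is imported (for example, gnuplot-py as Gnuplot and rrdpython as rrdtool) and may need adjusting.

```python
import importlib

def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# Assumed import names for the dependencies listed above.
ASSUMED_MODULES = ["argparse", "twisted", "twistedfcp", "markdown", "enum",
                   "psycopg2", "numpy", "Gnuplot", "rrdtool"]

missing = missing_modules(ASSUMED_MODULES)
print("Missing modules: %s" % (", ".join(missing) or "none"))
```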
After installing PostgreSQL, create roles. This guide was written using PostgreSQL 9.2 and assumes Debian-ish tendencies. An example package name is postgresql-9.2.
pyProbe uses the database in three capacities:
- Table creation, updating, and alteration for database initialization and upgrades.
- Inserting for data gathering.
- Reading for data analysis.
Please note: I don't know if I'm setting this up in a sane way. If not, please yell at me about it.
If appropriate roles for these don't already exist, create them. Then create the database and grant sufficient privileges. There are many ways to authenticate; this guide will use peer authentication, which maps operating system user names to PostgreSQL users.
# su postgres
$ createuser pyprobe-maint
$ createuser pyprobe-add
$ createuser pyprobe-read
$ createdb probe-results
$ psql -c 'GRANT CREATE ON DATABASE "probe-results" TO "pyprobe-maint"'
The tables do not exist yet, so privileges cannot be assigned for them. They will be assigned by the maintenance user after creating the tables. Note that pyProbe will modify permissions for all tables in the public schema of the database. This means it does not coexist nicely with other applications in the same database. (This was to avoid maintaining a separate hardcoded list of what tables exist; see db.Database initialization.)
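As a rough picture of what assigning privileges per table could look like, the sketch below builds GRANT statements for a list of table names. It is not pyProbe's code: the helper names are hypothetical, and the choice of INSERT for the adding role and SELECT for the reading role is an assumption. The privileges pyProbe actually grants may differ.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def grant_statements(tables, add_role="pyprobe-add", read_role="pyprobe-read"):
    """Build GRANT statements for every table name given.

    Assumption: the inserting role needs INSERT and the reading role needs
    SELECT. This is an illustration, not pyProbe's actual privilege set.
    """
    statements = []
    for table in tables:
        target = quote_ident(table)
        statements.append("GRANT INSERT ON TABLE %s TO %s"
                          % (target, quote_ident(add_role)))
        statements.append("GRANT SELECT ON TABLE %s TO %s"
                          % (target, quote_ident(read_role)))
    return statements

# Table names that appear in this document; a real run would list the
# tables in the public schema rather than hardcoding them.
for statement in grant_statements(["peer_count", "link_lengths"]):
    print(statement + ";")
```

Quoting matters here because role names such as pyprobe-add contain a hyphen.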
Copy database.config_sample to database.config and set the usernames and database name. (Passwords need not be specified if they are not used.) Set the mapping between system users and PostgreSQL users; this may involve /etc/postgresql/9.2/main/pg_ident.conf and /etc/postgresql/9.2/main/pg_hba.conf. For example, in pg_ident.conf:
# MAPNAME SYSTEM-USERNAME PG-USERNAME
pyprobe pyprobe pyprobe-maint
pyprobe pyprobe pyprobe-read
pyprobe pyprobe pyprobe-add
And in pg_hba.conf:
# TYPE DATABASE USER ADDRESS METHOD
local probe-results pyprobe-maint peer map=pyprobe
local probe-results pyprobe-add peer map=pyprobe
local probe-results pyprobe-read peer map=pyprobe
Then reload the PostgreSQL configuration. If migrating from the sqlite version of pyProbe, run python fnprobe/migrate_from_sqlite.py. If importing the database dumps, run python fnprobe/copy_from.py. Create probe.config from the sample to do probe collection, and upload.config from the sample to upload a site from the analysis scripts. Now probe collection and analysis can begin!
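To reason about the peer authentication mapping configured above, the following sketch parses pg_ident.conf-style lines and checks whether a system user may connect as a given PostgreSQL user. It is a simplification for illustration only: the function names are hypothetical, regular-expression system names are not handled, and PostgreSQL's own matching has more rules than this.

```python
def parse_ident_map(text):
    """Parse pg_ident.conf-style lines into {map: {system_user: set(pg_users)}}.

    Simplification: comments and blank lines are skipped, and system names
    written as regular expressions (starting with "/") are not handled.
    """
    maps = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        map_name, system_user, pg_user = line.split()
        maps.setdefault(map_name, {}).setdefault(system_user, set()).add(pg_user)
    return maps

def may_connect(maps, map_name, system_user, pg_user):
    """True if the map lets system_user connect as pg_user."""
    return pg_user in maps.get(map_name, {}).get(system_user, set())

PG_IDENT = """
# MAPNAME SYSTEM-USERNAME PG-USERNAME
pyprobe pyprobe pyprobe-maint
pyprobe pyprobe pyprobe-read
pyprobe pyprobe pyprobe-add
"""

maps = parse_ident_map(PG_IDENT)
print(may_connect(maps, "pyprobe", "pyprobe", "pyprobe-add"))
print(may_connect(maps, "pyprobe", "postgres", "pyprobe-add"))
```

With the example map, the single system user pyprobe can act as any of the three PostgreSQL roles, while other system users cannot.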
The tools are:
- probe.py: connects to a Freenet node, makes probe requests, and stores the results.
- analyze.py: analyzes stored probe results, and generates plots of the data.
probe.py can be run directly with python, with twistd, or with the bash script run, which supports these operations:
- start: Starts the probe if it is not already running.
- stop: Stops the probe if it is running.
- restart: Stops the probe if it is running, then starts it again.
- console: Restarts the probe and follows the log, and stops the probe on interrupt.
- log: Follows the log.

It is configured with the self-documenting probe.config.
analyze.py can perform analysis of gathered probe data:
- Network size estimate
- Store size estimate
- Location distribution
- Peer count distribution
- Link length distribution
- Uptime distribution (using that included with identifier)
- Bulk reject percentage distribution
The time spans used are manually specified (with defaults) but might be better as a function of probe rate and type distribution for more consistent results at the cost of more unusual time spans.
For documentation on using it, run analyze with --help.
There are separate tables for each result type, errors, and refusals. The database is versioned, and previous versions will be upgraded. All table names but error, refused, and peer_count match the name of the result type with which they are updated. With the exception of link_lengths, all tables have the following columns:
- time: when the result was committed.
- htl: hops to live of the request.
- duration: time elapsed between sending the probe and receiving the response.

All tables have an id primary key. Additional columns vary by table:
- kib: Outgoing bandwidth limit in floating point KiB/s.

- build: Freenet build number.

- identifier: Randomly assigned (by default; can be set or randomized again at will) integer identifier.
- percent: Very low-precision integer uptime percentage over the last 7 days.
Each individual reported length has its own entry.
- length: Floating point difference between the responding node's location and one of its peers' locations.
- count_id: Matches the id of the associated peer_count row.
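Freenet locations lie on a circle in the range [0, 1), so the difference between two locations wraps around at the ends. A minimal sketch of such a link length calculation follows. The function name is hypothetical, and whether a node reports exactly this quantity (before any noise is applied) is an assumption.

```python
def link_length(location_a, location_b):
    """Shortest distance between two locations on the [0, 1) circle."""
    difference = abs(location_a - location_b)
    return min(difference, 1.0 - difference)

# Two locations on the same side of the circle.
print(link_length(0.25, 0.5))
# Two locations on either side of the wrap-around point.
print(link_length(0.95, 0.05))
```

Under this definition a link length never exceeds 0.5.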
Set from LINK_LENGTHS probes like link_lengths.
- peers: Number of peers.
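The relationship between link_lengths and peer_count described above can be sketched with an in-memory database. This example uses Python's sqlite3 module only so that it is self-contained; pyProbe itself uses PostgreSQL. The column types, the unit of duration, and the sample values are assumptions.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
CREATE TABLE peer_count (
    id INTEGER PRIMARY KEY,
    time TEXT,      -- when the result was committed
    htl INTEGER,    -- hops to live of the request
    duration REAL,  -- time between request and response (seconds assumed)
    peers INTEGER
);
CREATE TABLE link_lengths (
    id INTEGER PRIMARY KEY,
    length REAL,
    count_id INTEGER REFERENCES peer_count(id)
);
""")

# One LINK_LENGTHS result that reports three peers (made-up values).
cursor = connection.execute(
    "INSERT INTO peer_count (time, htl, duration, peers) VALUES (?, ?, ?, ?)",
    ("2013-01-01T00:00:00", 18, 4.2, 3))
count_id = cursor.lastrowid

# Each reported length gets its own row pointing back at the peer_count row.
connection.executemany(
    "INSERT INTO link_lengths (length, count_id) VALUES (?, ?)",
    [(length, count_id) for length in (0.01, 0.2, 0.45)])

rows = connection.execute("""
    SELECT peer_count.peers, link_lengths.length
    FROM link_lengths
    JOIN peer_count ON link_lengths.count_id = peer_count.id
    ORDER BY link_lengths.length
""").fetchall()
print(rows)
```

Joining on count_id recovers the time, htl, and duration that link_lengths rows do not store themselves.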
- location: Floating point network location.

- GiB: Datastore (cache and store) size in floating point GiB.

- bulk_request_chk: Percent bulk CHK requests rejected.
- bulk_request_ssk: Percent bulk SSK requests rejected.
- bulk_insert_chk: Percent bulk CHK inserts rejected.
- bulk_insert_ssk: Percent bulk SSK inserts rejected.

- percent: Floating point uptime percentage over the last 48 hours.

- percent: Floating point uptime percentage over the last 7 days.
For probe and error type code mappings see db.probeTypes and db.errorTypes or, in Fred, src/freenet/node/probe/Type.java and src/freenet/node/probe/Error.java.
- probe_type: Code for the requested probe result.
- error_type: Code for the error.
- code: If specified, the local node did not recognize this error code. In this case, the error_type will be the code for UNKNOWN.
- local: If true, the error occurred locally and was not prompted by an error relayed from a remote node. If false, the error was relayed from a remote node.
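The rule tying code to error_type can be sketched as a small consistency check when building a row for the error table. Everything here is hypothetical: the numeric codes in PROBE_TYPES and ERROR_TYPES are invented for the example (the real mappings are in db.probeTypes and db.errorTypes), and error_row is not a pyProbe function.

```python
# Hypothetical code mappings for illustration only.
PROBE_TYPES = {"LINK_LENGTHS": 1}
ERROR_TYPES = {"TIMEOUT": 1, "UNKNOWN": 2}

def error_row(probe_name, error_name, local, code=None):
    """Build a dict of error-table columns.

    `code` is only meaningful when the local node did not recognize the
    error, in which case the error type must be UNKNOWN.
    """
    if code is not None and error_name != "UNKNOWN":
        raise ValueError("code is only set when the error type is UNKNOWN")
    return {
        "probe_type": PROBE_TYPES[probe_name],
        "error_type": ERROR_TYPES[error_name],
        "code": code,
        "local": local,
    }

recognized = error_row("LINK_LENGTHS", "TIMEOUT", local=False)
unrecognized = error_row("LINK_LENGTHS", "UNKNOWN", local=False, code=42)
print(recognized)
print(unrecognized)
```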
- probe_type: The probe result which was requested.

- schema_version: Internal version number to handle upgrades.
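One common way to use a stored schema version is to apply upgrade steps one at a time until no further step exists. The sketch below shows that pattern in isolation. The upgrade function and the placeholder steps are hypothetical and do not reflect how db.Database performs its upgrades.

```python
def upgrade(current_version, upgrades):
    """Apply upgrade steps in order until none exists for the current version.

    `upgrades` maps a version number to a callable that performs the change
    from that version to the next one.
    """
    applied = []
    while current_version in upgrades:
        upgrades[current_version]()
        applied.append(current_version)
        current_version += 1
    return current_version, applied

log = []
steps = {
    0: lambda: log.append("0 -> 1"),  # placeholder for real schema changes
    1: lambda: log.append("1 -> 2"),
}

latest, applied = upgrade(0, steps)
print("now at version %d after steps from %s" % (latest, applied))
```

A database that is already at the latest version passes through the loop without any step being run.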