wspr-ncsu/BrowserFingerprintingAD

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 

Repository files navigation

source_code

Workflow:

Crawling and Post-processing

We followed the instructions in the VisibleV8 public GitHub repo to install VV8. Combining VV8 with a crawler, we collected crawling data and stored it in PostgreSQL. Here is the crawling data (including new crawling results). Post-processing is done during crawling, and the code is inside the vv8-post-processor directory. Note: the vv8-post-processor we provide is different from the one in the public post-processor GitHub repo. The public vv8-post-processor does not preserve the sequence of API calls when processing the crawling data; that is the main reason we built a new vv8-post-processor, which keeps the call sequence intact as our locality analysis requires. We provide our vv8-post-processor executable and the crawling data table (Lines 96-103) used in PostgreSQL.
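
For illustration, here is a minimal sketch of how the post-processed traces could be read back from PostgreSQL while keeping the API-call order intact. The connection parameters and the table/column names (crawl_apis, visit_id, call_index, api) are assumptions made for this example, not the actual schema referenced above (Lines 96-103).

```python
import psycopg2

# Connect to the PostgreSQL instance that holds the crawling data.
# Connection parameters and the schema below are assumptions for this sketch.
conn = psycopg2.connect(dbname="crawling_data", user="postgres", host="localhost")

with conn, conn.cursor() as cur:
    # Order by a per-visit call index so the original sequence of API calls,
    # which the custom vv8-post-processor preserves, is read back unchanged.
    cur.execute(
        """
        SELECT visit_id, api
        FROM crawl_apis
        ORDER BY visit_id, call_index
        """
    )
    for visit_id, api in cur.fetchall():
        print(visit_id, api)
```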

Locality Algorithm

local_cal.py contains the main logic of the locality algorithm. FPanalysis.py, which contains the main logic of the dynamic analysis, connects to the PostgreSQL database that stores the crawling data and runs the locality algorithm on the APIs in the "APIs" table column (Line 102). It then inserts the dynamic analysis results into another database (provided here, with the format described in this table (Lines 105-110)). The command we used to run the dynamic analysis is python3 FPanalysis.py FPLOI 1.
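
As a rough illustration of the locality idea (the authoritative logic lives in local_cal.py), the sketch below groups API calls whose positions in the recorded trace are close to each other. The (position, api_name) input format, the window size, and the grouping rule are all assumptions made for this example.

```python
from typing import List, Tuple

def group_by_locality(calls: List[Tuple[int, str]], window: int = 5) -> List[List[str]]:
    """Illustrative stand-in for the locality grouping in local_cal.py:
    calls whose positions lie within `window` of the previous call are
    placed in the same group; a larger gap starts a new group."""
    groups: List[List[str]] = []
    current: List[str] = []
    last_pos = None
    for pos, api in sorted(calls):
        if last_pos is not None and pos - last_pos > window:
            groups.append(current)
            current = []
        current.append(api)
        last_pos = pos
    if current:
        groups.append(current)
    return groups

# Example: three closely spaced calls form one group,
# while the distant call lands in its own group.
print(group_by_locality([(1, "CanvasRenderingContext2D.fillText"),
                         (2, "HTMLCanvasElement.toDataURL"),
                         (3, "Navigator.userAgent"),
                         (50, "Document.createElement")]))
```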

Static Analysis

The static analysis takes the results of the locality algorithm as input and stores the corresponding source code (Line 107) in a JS file. It then builds a data flow graph (DFG) from the JS file and searches for data flows in the DFG. We only used the functionality in JStap/pdg_generation from JStap. These files are in analysis-source-code. FPstatic.py and statictest.py contain the main logic of the static analysis. The command we used to run the static analysis is python3 statictest.py.
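
To illustrate the kind of search performed over the DFG (the actual graph is built with JStap's pdg_generation, whose API is not shown here), the sketch below represents a data flow graph as an adjacency list and checks whether data can flow from a source node to a sink node. The node names are hypothetical.

```python
from collections import deque
from typing import Dict, List

def has_data_flow(dfg: Dict[str, List[str]], source: str, sink: str) -> bool:
    """Breadth-first search over a data flow graph (adjacency list) to
    check whether a value produced at `source` can reach `sink`."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for nxt in dfg.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical flow: a canvas read feeding into a network send.
dfg = {
    "toDataURL": ["fp_string"],
    "fp_string": ["XMLHttpRequest.send"],
}
print(has_data_flow(dfg, "toDataURL", "XMLHttpRequest.send"))  # True
```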

Example

The above data was obtained by running pg_dump DATABASE | gzip > DATABASE.gz and is the result of crawling the Alexa Top 10K websites. Because the full dump is not easy to read or comprehend, we also provide, for convenience, corresponding smaller artifacts produced by crawling only ebay: easy crawling data, easy locality_results, and the VV8 raw log.
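
If you want to load the full dump locally, a restore along the following lines should work; this is a sketch rather than a script shipped with the repo, and DATABASE is a placeholder matching the dump command above.

```python
import subprocess

# Decompress the dump and feed the plain-SQL output to psql.
# "DATABASE" is a placeholder; create the target database first
# (e.g. with createdb) and substitute its real name.
subprocess.run("gunzip -c DATABASE.gz | psql DATABASE", shell=True, check=True)
```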

Links to Public GitHub Repos

JStap: https://github.com/Aurore54F/JStap
VisibleV8: https://github.com/wspr-ncsu/visiblev8