Julia package for handling Lens.org patent data.
All packages in the JuliaPatents family are registered in the JuliaPatents registry. To add the registry, open the Julia REPL and run:
using Pkg
pkg"registry add https://github.com/JuliaPatents/Registry"
This only needs to be done once.
After adding the registry, the package can be added to any Julia environment:
using Pkg
pkg"add PatentsLens"
The package can now be loaded:
using PatentsBase, PatentsLens
PatentsLens.jl implements the analysis interface defined by PatentsLandscapes.jl. To use it, import the PatentsLandscapes package (included in this package's dependencies):
using PatentsLandscapes
The main purpose of this package is to import patent metadata as exported from Lens.org in the jsonlines (.jsonl) format.
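For readers unfamiliar with jsonlines, the following is a minimal sketch in plain Julia (no PatentsLens calls) of how such a file is laid out: one self-contained JSON document per line. The field names `id` and `title` and the file name `sample.jsonl` are hypothetical and do not reflect the actual Lens.org export schema.

```julia
# Hypothetical records: these field names are illustrative only and are
# NOT the actual Lens.org export schema.
sample = """
{"id": "A1", "title": "First record"}
{"id": "B2", "title": "Second record"}
"""

# Write the sample to a temporary file so the sketch is self-contained.
path = joinpath(mktempdir(), "sample.jsonl")
write(path, sample)

# Each non-empty line of a jsonlines file is one complete JSON document.
records = filter(!isempty, strip.(readlines(path)))
n_records = length(records)

# Rough sanity check: every record starts and ends with a brace.
looks_like_objects = all(r -> startswith(r, "{") && endswith(r, "}"), records)

println("records: ", n_records)
```

Counting lines like this can be a quick way to estimate the size of an export before importing it.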
The package supports two data models:
- An in-memory object model similar to the original JSON, using Julia structs
- An SQLite-based relational model that offers fast, indexed property-based and full-text search, aggregation, and more
The PatentsLandscapes.jl API is currently only implemented for the SQLite model.
Loading data from a file test.jsonl into memory looks like this:
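# Bind to `apps`, not `applications`, so the exported `applications` function used later stays callable
apps = PatentsLens.read_jsonl("test.jsonl")
The LensApplication struct implements the interface defined in PatentsBase.jl.
The dataset can easily be aggregated to the simple family level:
# Likewise `fams`, not `families`, to avoid shadowing the `families` function
fams = PatentsLens.aggregate_families(apps)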
To begin using the SQLite model, create a new database:
db = LensDB("database.db")
This will create a new SQLite database at the specified path and initialize it with the PatentsLens schema.
Data can then be loaded into the new database like this:
PatentsLens.load_jsonl!(db, "test.jsonl")
Because the data needs to be transformed into a relational form, this step may take a while.
Data can be retrieved from the database and converted back into the object model:
all_apps = applications(db)
all_fams = families(db)
Subsets of the data can be created and accessed using PatentsBase's Filter API:
# Retrieve all patent families from the database that mention polylactic acid in their abstract
pla_fams = families(db, ContentFilter("polylactic OR PLA", AbstractSearch()))