Remove Ignite, implements #96 #100

Merged 1 commit on Sep 18, 2024
7 changes: 2 additions & 5 deletions service/README.md
@@ -3,12 +3,9 @@
[![Docker Image Version (latest semver)](https://img.shields.io/docker/v/crawlercommons/url-frontier)](https://hub.docker.com/r/crawlercommons/url-frontier)
[![Docker Pulls](https://img.shields.io/docker/pulls/crawlercommons/url-frontier)](https://hub.docker.com/r/crawlercommons/url-frontier)

-Implementations of the URL Frontier Service. There are currently 3 implementations available:
+Implementations of the URL Frontier Service. There are currently 2 implementations available:
 - a simple memory-based which was used primarily for testing
 - the default one which is scalable, persistent and is based on [RocksDB](https://rocksdb.org/)
-- a persistent and distributed one based on [Ignite](https://ignite.apache.org/)
-
-The Ignite implementation is still in beta mode.

Web crawlers can connect to it using the gRPC code generated from the API. There is also a simple client available
which can do basic interactions with a Frontier.
@@ -37,7 +34,7 @@ the call above can have the following equivalent without the config file:

If no path is set explicitly for RocksDB, the default value _./rocksdb_ will be used.
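
The fallback described in the context line above can be sketched roughly as follows. This is only an illustration and not the service's actual code: the class name `RocksDbPathDefault`, the method `resolvePath` and the key name `rocksdb.path` are assumptions, since the diff does not show how the path is read.

```java
import java.util.Map;

public class RocksDbPathDefault {
    // Assumed key name; the diff only shows other rocksdb.* settings.
    static final String PATH_KEY = "rocksdb.path";
    // Default stated in the README text.
    static final String DEFAULT_PATH = "./rocksdb";

    // Returns the configured path, or the default when the key is absent.
    static String resolvePath(Map<String, String> config) {
        return config.getOrDefault(PATH_KEY, DEFAULT_PATH);
    }

    public static void main(String[] args) {
        // No path configured: falls back to the default.
        System.out.println(resolvePath(Map.of()));
        // Explicit path configured: used as-is.
        System.out.println(resolvePath(Map.of(PATH_KEY, "/data/crawl/rocksdb")));
    }
}
```
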

-For implementation supporting a cluster mode, like the Ignite one, it is required to use the parameter `-h xxx.xxx.xxx.xxx` with the private IP or hostname
+For implementation supporting a cluster mode, it is required to use the parameter `-h xxx.xxx.xxx.xxx` with the private IP or hostname
on which it is running so that it can report its location with the heartbeat.

## Logging configuration
18 changes: 0 additions & 18 deletions service/config.ini
@@ -21,23 +21,5 @@ rocksdb.max_background_jobs = 4
rocksdb.max_subcompactions = 2
rocksdb.max_bytes_for_level_base = 536870912

-#implementation = crawlercommons.urlfrontier.service.ignite.IgniteService
-
-# Needed for the Ignite based frontiers to form a cluster
-# ignite.seed.address = xxx.xxx.xxx.xxx
-
-ignite.path = /data/crawl/ignite
-ignite.workdir = /data/crawl/ignite
-ignite.index = /data/crawl/lucene
-# ignite.purge = true
-
-ignite.backups = 3
-# frequency in sec of when the frontiers should send a heartbeat
-ignite.frontiers.heartbeat = 60
-# ttl of hearbeats in sec
-ignite.frontiers.ttl = 120
-
-
-


14 changes: 0 additions & 14 deletions service/pom.xml
@@ -16,8 +16,6 @@

<properties>
<prometheus.version>0.16.0</prometheus.version>
-<ignite.version>2.14.0</ignite.version>
-<lucene.version>9.4.0</lucene.version>
<rocksdb.version>7.6.0</rocksdb.version>
</properties>

@@ -94,18 +92,6 @@
<version>${rocksdb.version}</version>
</dependency>

-<dependency>
-<groupId>org.apache.ignite</groupId>
-<artifactId>ignite-core</artifactId>
-<version>${ignite.version}</version>
-</dependency>
-
-<dependency>
-<groupId>org.apache.lucene</groupId>
-<artifactId>lucene-core</artifactId>
-<version>${lucene.version}</version>
-</dependency>
-
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>

This file was deleted.