Use context cache for imported contexts #304

skodapetr · 2023-12-29T13:52:24Z

Should resolve issue #292, as now the cache is used for imported contexts as well.

For this to work, the user must set JsonLdOptions.documentCache; it is null by default, which means no context caching is applied at all.

filip26

Thank you for contributing. HTTP caching MUST be part of the underlying HTTP client implementation (preferred) or a DocumentLoader implementation. Mixing caching with a processor implementation violates separation of concerns (as has been mentioned in #292 (comment)). FYI: ContextCache is going to be deprecated in a future version. Please let's keep the code as clean as possible.

> Should resolve issue #292, as now the cache is used for imported contexts as well.

Obviously a test case is missing. SHOULD is not an acceptable resolution.

skodapetr · 2024-01-02T13:54:35Z

Thank you for your kind comment pointing out opportunities for improving the pull request.

It seems like I've misunderstood our conversation in #292. Based on the conversation, I thought that you preferred to avoid caching HTTP requests in the codebase of titanium-json-ld. At the same time, you seemed to be open to the idea of caching higher-level resources like documents and contexts, as this is already happening in the codebase.

That is why I've opted for re-using the already existing cache for "imported" contexts as well. The idea was to introduce caching where it was missing in a way compatible with the existing code. In addition, caching on this level prevents unnecessary parsing of JSON documents, leading to better performance. As for performance, I agree that caching on a higher level would add additional performance benefits, yet it would not be aligned with the existing code.

Reading your comment, I've concluded that my understanding so far was wrong and that you prefer caching on the level of HTTP requests. Perhaps I was too quick to reject this option, as the drawbacks of caching in multiple places and the performance penalty looked to me like good enough reasons for dismissal.

Anyway, back to the pull request. The objective is to cache HTTP requests, where the caching should be implemented in an underlying HttpClient. I'm not sure whether you mean com.apicatalog.jsonld.http.DefaultHttpClient or another (custom) implementation of com.apicatalog.jsonld.http.HttpClient. In the second case, it can be implemented outside the titanium-json-ld codebase, as users can set a custom com.apicatalog.jsonld.http.HttpClient for com.apicatalog.jsonld.loader.HttpLoader.

Can you please comment on the preferred solution?

hmottestad · 2024-01-03T21:27:25Z

If you are using the built-in HttpClient in Java 11 then I don't believe that it supports caching. The JEP says that it's only meant to cover 80-90% of application needs and is meant to replace the outdated HttpURLConnection API.

If you are instead using the far more common Apache HttpClient then there is this builder here that should enable caching: https://hc.apache.org/httpcomponents-client-4.5.x/current/httpclient-cache/apidocs/org/apache/http/impl/client/cache/CachingHttpClientBuilder.html

filip26 · 2024-01-08T11:45:21Z

@skodapetr What about having a specialized document loader, a wrapper that acts as a cache? e.g.

JsonLd.expand(...).loader(new LRUDocumentCache(loader, cacheparams))...

This solution has several benefits. It keeps processor code clean of caching, end-users can decide if/how to use it, the cache can be shared between calls, and anyone can implement their own alternative cache. What do you think? Also, check StaticContextLoader for inspiration.
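A minimal sketch of the wrapper idea proposed in this comment. It assumes a simplified `Loader` interface standing in for Titanium's DocumentLoader (the real interface also takes loader options and returns a document object). `Loader`, `LruCachingLoader`, `CachingLoaderDemo`, and the `capacity` parameter are hypothetical names used only for illustration; the LRU policy relies on `java.util.LinkedHashMap` in access order.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a document loader; not Titanium's actual interface.
interface Loader {
    String load(URI url);
}

// Decorator that caches results of the wrapped loader with LRU eviction.
final class LruCachingLoader implements Loader {

    private final Loader delegate;
    private final Map<URI, String> cache;

    LruCachingLoader(Loader delegate, int capacity) {
        this.delegate = delegate;
        // accessOrder = true moves an entry to the end on every access,
        // so the eldest entry is always the least recently used one.
        this.cache = new LinkedHashMap<URI, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<URI, String> eldest) {
                return size() > capacity;
            }
        };
    }

    @Override
    public synchronized String load(URI url) {
        String cached = cache.get(url);
        if (cached != null) {
            return cached; // cache hit: the wrapped loader is not called
        }
        String loaded = delegate.load(url);
        cache.put(url, loaded);
        return loaded;
    }
}

final class CachingLoaderDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        // Counts how often the "remote" loader is actually invoked.
        Loader remote = url -> {
            calls[0]++;
            return "document at " + url;
        };
        Loader cached = new LruCachingLoader(remote, 2);

        URI context = URI.create("https://example.org/context.jsonld");
        cached.load(context);
        cached.load(context);
        System.out.println("delegate calls: " + calls[0]); // prints "delegate calls: 1"
    }
}
```

Because the cache lives in the wrapper rather than in the processor, the same instance can be shared between calls, and a different eviction policy only requires another implementation of the same interface.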

@hmottestad Titanium does not provide direct support for Apache HTTP Client but uses Java 11 HttpClient and OkHttp (Java 8). Both implementations support caching, it just has to be enabled.

hmottestad · 2024-01-08T11:51:17Z

@filip26 I didn't know that Java 11 HttpClient supports caching. Any chance you could point me to some documentation, might be useful in the future?

skodapetr · 2024-01-08T13:29:39Z

@filip26 It is not the most performant solution, but it will help keep the code clean and may actually ease removal of caching from the main algorithm. Should I create a new PR or update this one?

EDIT: You mention the ability to use a custom implementation; should the LRUDocumentCache be part of the PR?

filip26 · 2024-01-08T14:18:03Z

@hmottestad sorry, I was looking into Apache HttpClient documentation by mistake. You are right about Java 11 HttpClient.

filip26 · 2024-01-08T14:21:41Z

@skodapetr please, can you elaborate on why you don't think it's a performant solution?

hmottestad · 2024-01-08T14:27:34Z

> @hmottestad sorry, I was looking into Apache HttpClient documentation by mistake. You are right about Java 11 HttpClient.

That's what I initially stumbled upon too. Would have been nice if they could have named it something a bit more unique to avoid confusion with the Apache HttpClient 😔

skodapetr · 2024-01-09T21:38:33Z

In 5.6.4, 5.2.4, and 5.2.5 we can cache either the input documents (the solution we plan to implement) or the results produced by processing the documents, mostly contexts.

A good example is 5.2.4. If there is a cache hit, it returns from the algorithm in the "if" part. Using only document-level caching, we save the request in loadDocument, yet the rest of the algorithm is executed every time.

I may need to clarify that this is not an issue of the currently implemented approach, but rather of the intention to remove ContextCache. At the same time I understand the decision to prefer readability and maintainability over performance for some specific cases.
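The trade-off described in this comment can be illustrated with a small sketch. All names here (`CacheLevelsDemo`, `fetch`, `process`) are hypothetical and this is not the processor's actual code: `fetch` stands in for the remote request and `process` for the remaining algorithm steps. With document-level caching only, the request is saved but processing repeats; caching the processed result skips both.

```java
import java.util.HashMap;
import java.util.Map;

final class CacheLevelsDemo {

    // Counters recording how often each stage actually runs.
    int fetches = 0;
    int processings = 0;

    private final Map<String, String> documentCache = new HashMap<>();
    private final Map<String, String> resultCache = new HashMap<>();

    // Stand-in for loading a remote document (the HTTP request).
    private String fetch(String url) {
        fetches++;
        return "{\"@context\": \"" + url + "\"}";
    }

    // Stand-in for parsing and processing the loaded context.
    private String process(String document) {
        processings++;
        return "processed:" + document;
    }

    // Document-level caching: the request is saved, processing is repeated.
    String withDocumentCache(String url) {
        String document = documentCache.computeIfAbsent(url, this::fetch);
        return process(document);
    }

    // Result-level caching: a hit skips both the request and the processing.
    String withResultCache(String url) {
        return resultCache.computeIfAbsent(url, u -> process(fetch(u)));
    }

    public static void main(String[] args) {
        String url = "https://example.org/context.jsonld";

        CacheLevelsDemo documentLevel = new CacheLevelsDemo();
        for (int i = 0; i < 3; i++) {
            documentLevel.withDocumentCache(url);
        }
        // prints "document level: 1 fetches, 3 processings"
        System.out.println("document level: " + documentLevel.fetches
                + " fetches, " + documentLevel.processings + " processings");

        CacheLevelsDemo resultLevel = new CacheLevelsDemo();
        for (int i = 0; i < 3; i++) {
            resultLevel.withResultCache(url);
        }
        // prints "result level: 1 fetches, 1 processings"
        System.out.println("result level: " + resultLevel.fetches
                + " fetches, " + resultLevel.processings + " processings");
    }
}
```

Both variants return the same value for the same URL; they differ only in how much work is repeated on a cache hit, which is the readability-versus-performance choice discussed in the thread.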

filip26 · 2024-01-09T22:08:51Z

@skodapetr If we both agree that using a separate cache based on DocumentLoader is the way to go, then let's add:

a cache implementation - MUST
a simple unit test verifying only that it does what is expected - MUST
a "how to use a cache" paragraph in the README or even a separate document - NICE

filip26

Looks good, just minor stuff. Thank you!

src/main/java/com/apicatalog/jsonld/loader/LRUDocumentCache.java

Use context cache for imported contexts

e9b3ee1

filip26 self-requested a review December 30, 2023 15:03

filip26 requested changes Dec 30, 2023

skodapetr added 6 commits January 15, 2024 09:19

Revert "Use context cache for imported contexts"

d5418e6

This reverts commit e9b3ee1.

Add LRUDocumentCache

67b0489

Remove use of var

7f0ab48

Remove unused field

0312ab7

Update test

0ffe652

Merge branch 'main' into feature/292

8d6e12a

skodapetr requested a review from filip26 January 16, 2024 17:06

filip26 reviewed Jan 16, 2024

skodapetr and others added 7 commits January 17, 2024 17:14

Remove LRUDocumentCache constrcutor

2779574

Merge branch 'feature/292' of github.com:skodapetr/titanium-json-ld i…

ce58005

…nto feature/292

Merge branch 'main' into feature/292

2624cac

Update cache to use equals and hashCode

5022bbd

Merge branch 'feature/292' of github.com:skodapetr/titanium-json-ld i…

0cc431e

…nto feature/292

Remove use of new syntax

0756e9f

Resolve code-style issues

f69f925

filip26 approved these changes Jan 22, 2024

filip26 merged commit f4b905b into filip26:main Jan 22, 2024
5 checks passed

skodapetr deleted the feature/292 branch January 22, 2024 09:24

filip26 linked an issue Jan 24, 2024 that may be closed by this pull request

Large numbers of HTTP requests for the same JSON-LD context due to lack of caching #292

Closed
