Obtaining licence / attribution for "depiction" URLs #285

Open
kshepherd opened this issue Feb 16, 2021 · 10 comments

Comments

@kshepherd

When re-using images from depiction data, which are typically (always?) from Wikimedia, I need to display the licence and attribution for the image.
A LOBID API search result or record returns id, url and thumbnail for a depiction. The URL points to the main Wikimedia page for this image, and it is this page that contains the licence data.

My questions are:

  • Is there some way that the licence can be included in the depiction node of an API result to avoid any further HTTP calls needed to obtain the licence information?
  • LOBID record pages do display the licence and attributions... is this done with a second call to a Wikimedia API? Or is the information stored locally somewhere?
@acka47
Contributor

acka47 commented Feb 17, 2021

Thanks, @kshepherd, for these good questions.

Is there some way that the licence can be included in the depiction node of an API result to avoid any further HTTP calls needed to obtain the licence information?

This would indeed be a good feature for API users. Currently, we don't add any data ourselves but use the depiction information from EntityFacts. So it would be easiest for us if EntityFacts included this information. I will open an issue for this at the DNB as this would be a nice feature for all services that use EntityFacts or lobid-gnd.

LOBID record pages do display the licence and attributions... is this done with a second call to a Wikimedia API? Or is the information stored locally somewhere?

Yes, this is done by calling the Wikimedia Commons API; see the following code:

// Ask the Wikimedia Commons API for the extended metadata (extmetadata) of one image.
private static CompletionStage<WSResponse> requestInfo(WSClient client, String imageName)
        throws UnsupportedEncodingException {
    String imageId = "File:" + URLDecoder.decode(imageName, StandardCharsets.UTF_8.name());
    return client.url("https://commons.wikimedia.org/w/api.php")//
            .addQueryParameter("action", "query")//
            .addQueryParameter("format", "json")//
            .addQueryParameter("prop", "imageinfo")//
            .addQueryParameter("iiprop", "extmetadata")//
            .addQueryParameter("titles", imageId).get();
}

// Build the HTML attribution: artist (if any) | link to the file page | link to the licence.
// If no licence URL is given, the licence link points to the file page instead.
private static String createAttribution(String fileName, JsonNode info) {
    String artist = findText(info, "Artist");
    String licenseText = findText(info, "LicenseShortName");
    String licenseUrl = findText(info, "LicenseUrl");
    String fileSourceUrl = "https://commons.wikimedia.org/wiki/File:" + fileName;
    return String.format(
            (artist.isEmpty() ? "%s" : "%s | ") + "<a href='%s'>Wikimedia Commons</a> | <a href='%s'>%s</a>",
            artist, fileSourceUrl, licenseUrl.isEmpty() ? fileSourceUrl : licenseUrl, licenseText);
}

I just discussed this with @fsteeg: It would be relatively straightforward for us to add an API call that returns the HTML string we use for attribution, with something like .imgattribution appended to the respective GND entry, e.g. http://lobid.org/gnd/172494532.imgattribution (similar to the preview, see e.g. http://lobid.org/gnd/172494532.preview).

For example, a GET on https://lobid.org/gnd/172494532.imgattribution would return the following HTML:

 国土交通省 | <a href='https://commons.wikimedia.org/wiki/File:Kazuhisa%20Ogawa%20cropped%201%20Kazuhisa%20Ogawa%2020041102.jpg'>Wikimedia Commons</a> | <a href='https://creativecommons.org/licenses/by/4.0'>CC BY 4.0</a>

We would then store all information already requested instead of sending out a new call each time. Are you interested in such a feature?
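
The idea of storing attribution strings that were already requested could be sketched roughly as below. This is only an illustration under assumptions: the class name `AttributionCache` and the loader function are hypothetical and not part of lobid-gnd; the loader stands in for whatever performs the actual Wikimedia request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: remembers the attribution HTML per GND ID so that
// the (assumed) remote lookup runs at most once per ID.
class AttributionCache {

    private final Map<String, String> htmlByGndId = new ConcurrentHashMap<>();
    private final Function<String, String> loader;

    // loader: assumed function that fetches the attribution HTML for a GND ID
    AttributionCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // Returns the stored HTML, calling the loader only on the first request for an ID.
    String get(String gndId) {
        return htmlByGndId.computeIfAbsent(gndId, loader);
    }
}
```

With such a cache, a second request for the same `.imgattribution` resource would be answered from memory instead of triggering a new call.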

@kshepherd
Author

Thanks, @acka47. The extra information from DNB would be great!
We also like your suggestion of an .imgattribution format that supplies an HTML snippet.

In the meantime, I will implement an additional call to the Wikimedia API at the time of retrieving and indexing the JSON record from the LOBID API. Thanks for the code example; it gives me a good direction to go in.

Thanks for the helpful reply!
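
For a client that makes the extra Wikimedia call itself, the request could be derived from the `url` field of the depiction along these lines. This is a minimal sketch, not the lobid implementation: the class and method names are hypothetical, and the query parameters are the ones used in the lobid-gnd code quoted above. It only builds the request URI; sending it and parsing the JSON response are left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: turns a Commons file page URL into an API request URI.
class CommonsRequest {

    static final String API = "https://commons.wikimedia.org/w/api.php";

    // Extracts the decoded page title ("File:...") from a Commons page URL.
    static String titleFromPageUrl(String pageUrl) {
        String path = URI.create(pageUrl).getPath(); // decoded path, without the query string
        String prefix = "/wiki/";
        if (path == null || !path.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a Commons page URL: " + pageUrl);
        }
        return path.substring(prefix.length());
    }

    // Builds the imageinfo/extmetadata query for the file behind the page URL.
    static URI extmetadataRequest(String pageUrl) {
        String title = URLEncoder.encode(titleFromPageUrl(pageUrl), StandardCharsets.UTF_8);
        return URI.create(API
                + "?action=query&format=json&prop=imageinfo&iiprop=extmetadata&titles=" + title);
    }

    public static void main(String[] args) {
        System.out.println(extmetadataRequest(
                "https://commons.wikimedia.org/wiki/File:Gunzenhausen%20001-.jpg?uselang=de"));
    }
}
```

The resulting URI can be passed to any HTTP client; the fields used for attribution in the code above (Artist, LicenseShortName, LicenseUrl) are then read from the extmetadata part of the response.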

@acka47
Contributor

acka47 commented Feb 22, 2021

So it would be easiest for us if EntityFacts included this information. I will open an issue for this at the DNB as this would be a nice feature for all services that use EntityFacts or lobid-gnd.

I opened an issue at https://jira.dnb.de/browse/GND-160.

@acka47
Contributor

acka47 commented Sep 10, 2021

DNB is thinking about adding attribution information using Wikimedia's Lizenzhinweisgenerator (attribution generator), which totally makes sense. Here is an example of how the result could look.

Public Domain:

{
  "depiction":{
    "@id":"http://commons.wikimedia.org/wiki/Special:FilePath/Wappen_Aha_(Gunzenhausen).png",
    "thumbnail":{
      "@id":"https://commons.wikimedia.org/wiki/Special:FilePath/Wappen_Aha_(Gunzenhausen).png?width=270"
    },
    "url":"https://commons.wikimedia.org/wiki/File:Wappen_Aha_(Gunzenhausen).png?uselang=de",
    "license":{
      "@id":"https://commons.wikimedia.org/wiki/Template:PD-Coa-Germany",
      "label":"PD-Coa-Germany"
    },
    "attributionText":"anonym, <a href=\"https://upload.wikimedia.org/wikipedia/commons/0/03/Stadtwappen_Gunzenhausen.svg\">Stadtwappen_Gunzenhausen</a>, als gemeinfrei gekennzeichnet, Details auf <a href=\"https://commons.wikimedia.org/wiki/Template:PD-Coa-Germany\" rel=\"license\">Wikimedia Commons</a>"
  }
}

CC-BY:

{
  "depiction":{
    "@id":"http://commons.wikimedia.org/wiki/Special:File:Lady%20Gaga%20interview%202016.jpg",
    "thumbnail":{
      "@id":"https://commons.wikimedia.org/wiki/Special:File:Lady%20Gaga%20interview%202016.jpg?width=270"
    },
    "url":"https://commons.wikimedia.org/wiki/File:Lady%20Gaga%20interview%202016.jpg?uselang=de",
    "license":{
      "@id":"https://creativecommons.org/licenses/by/3.0/legalcode",
      "label":"cc-by-3.0"
    },
    "attributionText":"<a href=\"https://vimeo.com/smpentertainment/about\">SMP Entertainment</a>, <a href=\"https://upload.wikimedia.org/wikipedia/commons/2/2c/Lady_Gaga_interview_2016.jpg\">Lady_Gaga_interview_2016</a>, <a href=\"https://creativecommons.org/licenses/by/3.0/legalcode\" rel=\"license\">CC BY 3.0</a>"
  }
}

I think this looks good. Would this work for you as well, @kshepherd?

@kshepherd
Author

@acka47 - This looks great! Thanks for the update

@acka47
Contributor

acka47 commented Jun 9, 2022

DNB has now added license and attribution information to EntityFacts data, see e.g. https://hub.culturegraph.org/entityfacts/4022556-2. We will include this in lobid-gnd when the next EntityFacts full dump is published (probably next week).

@acka47
Contributor

acka47 commented Jul 18, 2022

@fsteeg (who is now on holiday) recently loaded the new EntityFacts dump, but the licence/attribution information in the depiction object is still missing, see https://lobid.org/gnd/4022556-2.json. Apparently, we will have to adjust some configuration to include it in lobid. Maybe this has something to do with it: two new properties (publisher, creditText) are now used in the RDF, and we must include them in the JSON-LD context.

Assigning @fsteeg to take a look at this in August.

@fsteeg
Member

fsteeg commented Oct 7, 2022

Currently, we don't add any data ourselves but use the depiction information from EntityFacts.

We don't add anything, but we actually create our own, simple depiction object from the EntityFacts data, e.g. thumbnail has a nested @id in EntityFacts, unlike in lobid. So the question is how exactly to add this to lobid. Given the original use case, and the .imgattribution discussion above, how about we add a single field, with the HTML attribution string created from the new EntityFacts data, as creditText?

@fsteeg fsteeg assigned acka47 and unassigned fsteeg Oct 7, 2022
@acka47
Contributor

acka47 commented Oct 10, 2022

how about we add a single field, with the HTML attribution string created from the new EntityFacts data, as creditText

This would definitely be more work than just adding the additional fields, wouldn't it? And I am not sure whether we might run into some smaller problems there. I think we should, as we did in the past, stick as close as possible to the EntityFacts data but make it simpler where possible by removing objects and aliasing @id and @type. In this case, this would mean just adding the EntityFacts JSON and using id instead of @id.

Suggested future status in lobid-gnd:

{
   "depiction":[
      {
         "id":"https://commons.wikimedia.org/wiki/Special:FilePath/Gunzenhausen%20001-.jpg",
         "publisher":"Wikimedia Commons",
         "copyrighted":"true",
         "creator":[
            "Wolkenkratzer"
         ],
         "creditText":[
            "Own work"
         ],
         "license":[
            {
               "id":"https://creativecommons.org/licenses/by-sa/4.0",
               "abbr":"CC BY-SA 4.0",
               "name":"Creative Commons Attribution-Share Alike 4.0",
               "attributionRequired":"true",
               "restrictions":""
            }
         ],
         "url":"https://commons.wikimedia.org/wiki/File:Gunzenhausen%20001-.jpg?uselang=de",
         "thumbnail":"https://commons.wikimedia.org/wiki/Special:FilePath/Gunzenhausen%20001-.jpg?width=270"
      }
   ]
}

@acka47 acka47 assigned fsteeg and unassigned acka47 Oct 10, 2022
@fsteeg
Member

fsteeg commented Oct 10, 2022

This would definitely be more work than just adding the additional fields, wouldn't it?

No, I don't think so. We already create that HTML. We could remove the Wikimedia API call, use the EntityFacts data instead, and add that HTML to the response. If we add and simplify the EntityFacts data, we'd have to do that additionally, and in the end, our API would not actually serve the use case very well (clients would still have to put the attribution HTML together themselves). Will move to backlog for now, since the concrete user request is no longer urgent and this is not as minor as I hoped.
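
To illustrate what putting the attribution HTML together on the client side would involve, here is a rough sketch based on the fields of the depiction object suggested above (creator, url, and the licence id and abbr). The class and method names are hypothetical, the output format simply mirrors the existing createAttribution code quoted earlier in this thread, and the HTML escaping is an added assumption.

```java
import java.util.List;

// Hypothetical client-side helper, not part of lobid-gnd.
class DepictionAttribution {

    // Escapes characters that are significant in HTML text and in single-quoted attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&#39;");
    }

    // creators: values of "creator"; fileUrl: "url"; licenseUrl: licence "id"; licenseLabel: licence "abbr"
    static String attributionHtml(List<String> creators, String fileUrl,
            String licenseUrl, String licenseLabel) {
        String creator = String.join(", ", creators);
        String prefix = creator.isEmpty() ? "" : escape(creator) + " | ";
        // Fall back to the file page when no licence URL is available.
        String licenseTarget = licenseUrl.isEmpty() ? fileUrl : licenseUrl;
        return String.format("%s<a href='%s'>Wikimedia Commons</a> | <a href='%s'>%s</a>",
                prefix, escape(fileUrl), escape(licenseTarget), escape(licenseLabel));
    }
}
```

Calling `attributionHtml` with the values from the Gunzenhausen example yields a string of the same shape as the `.imgattribution` output shown earlier: creator, a link to the file page, and a link to the licence.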
