Images for tooltips from wikidata and wikipedia #1
Ignoring Wikipedia completely at the moment, these are four possible ways to express the image in the SPARQL query from above:
?x wdt:P18 ?image .
?x wdt:P18|wdt:P109|wdt:P14|wdt:P1442|wdt:P154|wdt:P1543|wdt:P158|wdt:P1766|wdt:P1801|wdt:P2096|wdt:P2713|wdt:P2716|wdt:P2910|wdt:P3311|wdt:P3383|wdt:P3451|wdt:P367|wdt:P41|wdt:P4291|wdt:P4640|wdt:P5252|wdt:P5775|wdt:P7407|wdt:P7415|wdt:P94|wdt:P996 ?image .
OPTIONAL { ?x wdt:P18 ?image . }
OPTIONAL { ?x wdt:P18|wdt:P109|wdt:P14|wdt:P1442|wdt:P154|wdt:P1543|wdt:P158|wdt:P1766|wdt:P1801|wdt:P2096|wdt:P2713|wdt:P2716|wdt:P2910|wdt:P3311|wdt:P3383|wdt:P3451|wdt:P367|wdt:P41|wdt:P4291|wdt:P4640|wdt:P5252|wdt:P5775|wdt:P7407|wdt:P7415|wdt:P94|wdt:P996 ?image . }
One still needs to deal with duplicates because of multiple images for one entity. Some kind of preference would be good which would be possible with the
PS: The list of predicates was generated with this query:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?image ?label WHERE {
wd:P18 wdt:P1659 ?image .
?image rdfs:label ?label .
FILTER langMatches(lang(?label), "en") .
}
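The long alternation in the second and fourth variants could also be assembled from a list of property IDs instead of being pasted by hand. A minimal Python sketch, assuming the IDs are already available as plain strings such as "P18"; the function names and the shortened ID list are illustrative and not part of the repo:

```python
# Shortened, illustrative subset of the image property IDs listed above.
IMAGE_PROPERTIES = ["P18", "P41", "P94", "P154"]


def image_property_path(property_ids):
    """Join property IDs into a SPARQL property-path alternation."""
    return "|".join(f"wdt:{pid}" for pid in property_ids)


def image_triple(property_ids, optional=False):
    """Build the image triple pattern, optionally wrapped in OPTIONAL { ... }."""
    triple = f"?x {image_property_path(property_ids)} ?image ."
    if optional:
        return f"OPTIONAL {{ {triple} }}"
    return triple


if __name__ == "__main__":
    print(image_triple(IMAGE_PROPERTIES, optional=True))
```

With a single ID the helper reproduces the first variant, and with `optional=True` it reproduces the OPTIONAL variants.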
Before, only information of entities which had a link to an image on Wikidata via the P18 property was included in the wiki info mapping. Now, information about all entities that can be mapped to one of (title, abstract, image) is included. Images are now either retrieved via the Wikipedia API or via a Wikidata image property (P18, P109, P15, ...). This commit adds the necessary scripts to create the mapping from scratch and adds documentation about the process. Relates to #1.
Thanks for the detailed analysis!
How did you deal with the license question for Wikipedia images?
Not at all. I skillfully overlooked that part. So right now, all images are included in the mapping, i.e. the From what I understood, Wikipedia can use this non-free content under the fair use policy, which exists in the US but not in the EU (which is probably why the English Wikipedia contains theatrical release posters for films and the German Wikipedia does not). Too bad...
The image in the wikipedia infobox is not always from wikidata. See https://www.wikidata.org/wiki/Q16742294 and https://www.wikidata.org/wiki/Q16742291, which might be helpful in determining differences.
As explained here, the Wikipedia image can be queried in the following way:
https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=Jaguar&pithumbsize=500&format=json&formatversion=2
By default, it returns only images with a free license. For Lord of the Rings, the image is not free, so it is not returned. However, it is possible to return any (including non-free) image with the additional argument pilicense=any, as in
https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=The_Lord_of_the_Rings:_The_Fellowship_of_the_Ring&pithumbsize=500&format=json&formatversion=2&pilicense=any
I don't know what the licensing means for aqqu tooltips, but there is more info on that here.
It is possible to query multiple images with one query: https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=The_Lord_of_the_Rings:_The_Fellowship_of_the_Ring|Sun|Jaguar&pithumbsize=500&format=json&formatversion=2&pilicense=any.
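A batched request like the one above could be built and parsed roughly as follows. This is a sketch only: the parameters are the ones that appear in the URLs in this issue, the helper names are made up, and the response layout (a "pages" list whose entries may carry a "thumbnail" object with a "source" URL) is assumed from the formatversion=2 output and should be checked against a real response. No network call is made here.

```python
from urllib.parse import urlencode, urlparse, parse_qs

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def pageimages_url(titles, thumb_size=500, any_license=False):
    """Build a pageimages query URL for one or more page titles."""
    params = {
        "action": "query",
        "prop": "pageimages",
        "titles": "|".join(titles),  # several titles are separated by "|"
        "pithumbsize": thumb_size,
        "format": "json",
        "formatversion": 2,
    }
    if any_license:
        # Without this argument only freely licensed images are returned.
        params["pilicense"] = "any"
    return f"{API_ENDPOINT}?{urlencode(params)}"


def thumbnails_by_title(response):
    """Map page title to thumbnail URL, skipping pages without an image.

    Assumes the response layout described in the lead-in.
    """
    pages = response.get("query", {}).get("pages", [])
    return {
        page["title"]: page["thumbnail"]["source"]
        for page in pages
        if "thumbnail" in page
    }


if __name__ == "__main__":
    url = pageimages_url(["Sun", "Jaguar"], any_license=True)
    print(url)
    print(parse_qs(urlparse(url).query)["titles"])
```

Note that `urlencode` percent-encodes the "|" separator, which the API accepts, and that the titles in the response may be normalized (for example underscores turned into spaces), so lookups by the exact input title may need an extra mapping step.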
Maybe one option would be to use the following query:
and then loop over the results without an image and use the wikipedia image only for those.
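The fallback step described here (keep the Wikidata image where one exists, use the Wikipedia image only for the rest) might look like the following sketch. It assumes both sources have already been collected into dictionaries keyed by QID, with None marking a missing image; the function name and data layout are hypothetical.

```python
def fill_missing_images(wikidata_images, wikipedia_images):
    """Return a QID-to-image mapping that prefers Wikidata and falls back to Wikipedia.

    Both arguments map QID to an image URL; a value of None means "no image".
    """
    merged = {}
    for qid, image in wikidata_images.items():
        if image is None:
            # Only entities without a Wikidata image use the Wikipedia image.
            image = wikipedia_images.get(qid)
        merged[qid] = image
    return merged


if __name__ == "__main__":
    wikidata = {"Q1": "wikidata_a.jpg", "Q2": None, "Q3": None}
    wikipedia = {"Q1": "wikipedia_a.jpg", "Q2": "wikipedia_b.jpg"}
    print(fill_missing_images(wikidata, wikipedia))
```

Entities with no image in either source stay at None, so they remain easy to spot afterwards.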
On the other hand, maybe one should prefer the wikipedia image over the wikidata image. For the example of mexico, wdt:P18 yields a bunch of images, but an image of the flag (P41) would probably be more useful. Wikipedia uses the flag in this case.
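If one stays with Wikidata, picking a single image per entity could be done with an explicit property preference. The sketch below assumes the query also returns which property each image came from; the preference order (flag before generic image) is only an illustration of the Mexico example, not a decision made in this issue.

```python
# Hypothetical preference order: flag image (P41) before generic image (P18).
PREFERENCE = ["P41", "P18"]


def pick_image(candidates, preference=PREFERENCE):
    """Pick one image from (property_id, image_url) pairs.

    Properties earlier in `preference` win; properties not listed rank last.
    Among equally ranked candidates the first one in the input is kept.
    """
    if not candidates:
        return None
    rank = {pid: position for position, pid in enumerate(preference)}
    best = min(candidates, key=lambda pair: rank.get(pair[0], len(rank)))
    return best[1]


if __name__ == "__main__":
    mexico = [("P18", "photo_1.jpg"), ("P41", "flag.svg"), ("P18", "photo_2.jpg")]
    print(pick_image(mexico))
```

This also removes the duplicates caused by multiple images per entity, since exactly one image survives for each QID.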
In either case, the script or command to produce the file qid_to_wikipedia.tsv should be included in the repo for better reproducibility, especially regarding entities with more than one image.