The online service Wikipedia in Marble retrieves Wikipedia articles by location from geonames.org and displays them on the map. For some time now, the service has been limited to a certain number of daily queries. This task is about researching a way to use wikilocation.org as an additional source for Wikipedia articles. The options are:
- query articles from both geonames and wikilocation in parallel,
- replace geonames with wikilocation, or
- run wikilocation when we detect that the limit has been used up.
To decide on the best approach, this task involves some research into the quality of both services:
- Do the two differ in the articles they provide? Compare a couple of sample locations.
- What kind of additional information do they provide that is useful for Marble?
- Are there any speed differences between them?
See http://wikilocation.org/documentation/#api-articles and http://www.geonames.org/export/wikipedia-webservice.html#wikipediaBoundingBox
I want to work on this task!
So is this task supposed to be only research-based, or is there some coding present in it as well?
I don't expect any coding here (or none that would end up in Marble directly; you might want to write a script or similar, though, to visualize the API call results and compare them for similar regions).
Okay, actually I was wondering whether it would be possible to assign this task to someone else and assign me to the next task instead. I am sorry, I should not have gone for this task in such a hurry and messed it up midway like this.
No worries ;-)
Thank you :)
I want to work on this task.
I did some research and came to these conclusions:
- The two services do not differ much in the articles they provide. One difference that may be interesting is that geonames.org provides a short summary of each entry. However, both provide Wikipedia URLs, so reading more is only one click away.
- As said above, both offer almost the same information to Marble. There are, though, some small differences: geonames additionally provides the elevation, while wikilocation returns the distance from the given point. The two services work similarly, but the queries differ: in geonames you give a bounding box and it returns the Wikipedia entries within that box, while in wikilocation you give a point by its lat and lng, optionally with a radius to search within and a limit for the number of results to return.
- As far as time is concerned, I spotted some differences. In my tests, geonames took "real 0m5.679s" to provide 20 entries, while wikilocation returned 9 articles in "real 0m6.602s". So wikilocation was almost 1 second slower and returned fewer than half as many entries as geonames.
In conclusion, taking the time differences into consideration, I think that replacing geonames with wikilocation would be the worst solution. I also think that the last solution, "run wikilocation when detecting that we used up the limit", is not very good, because once the limit is reached the difference in speed would be obvious. So my opinion is that querying articles from both geonames and wikilocation in parallel would be the best solution.
PS: I couldn't run more tests because geonames said I had reached the number of credits for the day. I tried changing the username, but even then it didn't let me check more than one or two XML files.
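To compare the two services for the same region, a bounding-box query has to be translated into a point-plus-radius query. The following is only a sketch of one way to do that: it takes the centre of the box and the distance to its farthest corner as the radius. The endpoint URLs and parameter names (`north`, `south`, `east`, `west`, `username`, `lat`, `lng`, `radius`, `limit`) are assumptions based on the two documentation pages linked in the task description and have not been verified here; the function names are made up for illustration.

```python
import math
from urllib.parse import urlencode

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two lat/lng points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bbox_to_point_radius(north, south, east, west):
    """Centre of the box plus the distance to its farthest corner.

    Assumes the box does not cross the antimeridian.
    """
    lat = (north + south) / 2
    lng = (east + west) / 2
    radius = max(haversine_m(lat, lng, north, east),
                 haversine_m(lat, lng, south, east))
    return lat, lng, radius


def geonames_url(north, south, east, west, username):
    # Assumed endpoint and parameter names, see the geonames page linked above.
    query = urlencode({"north": north, "south": south, "east": east,
                       "west": west, "username": username})
    return "http://api.geonames.org/wikipediaBoundingBoxJSON?" + query


def wikilocation_url(lat, lng, radius_m, limit=20):
    # Assumed endpoint and parameter names, see the wikilocation page linked above.
    query = urlencode({"lat": lat, "lng": lng,
                       "radius": int(round(radius_m)), "limit": limit})
    return "http://api.wikilocation.org/articles?" + query
```

Note that the circle covers more area than the box, so wikilocation results outside the box would have to be filtered out before comparing the two result sets, and the service may cap the radius it accepts.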
The timing results here (Germany, Sunday morning local time) are:
- about 2 seconds for wikilocation (high variance)
- 0.1 seconds for geonames (low variance)
A clear win for geonames, but that doesn't rule out wikilocation. For geonames I used our marble username without any problems. Which username did you use? The demo one is often problematic, as it is shared by all users trying their API. I agree that replacing geonames with wikilocation is not an option right now, but I feel that detecting problems with geonames and only then working with wikilocation might be the better approach. It avoids unnecessary network traffic and duplicated results shown on top of each other on the map.
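Since the variance matters as much as the average here, a single `time` run per service is not very telling. A minimal sketch of a repeatable measurement could look like the following; `time_calls` is a hypothetical helper, and the `fetch` argument stands for whatever callable performs one request against the service under test.

```python
import statistics
import time


def time_calls(fetch, runs=5):
    """Call fetch() several times and report wall-clock statistics in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        samples.append(time.perf_counter() - start)
    return {
        "samples": samples,
        "mean": statistics.mean(samples),
        # stdev needs at least two samples
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }
```

For a real measurement, `fetch` could be something like `lambda: urllib.request.urlopen(url, timeout=10).read()`, run against both services from the same machine at roughly the same time.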
Yes, I only used the demo user and some random usernames. I didn't know that I could use my marble username. I think your conclusion of detecting problems with geonames and then using wikilocation is close to what I thought, since I had also specifically excluded the option of replacing geonames with wikilocation. I wasn't sure which of the other two to choose, and I used the difference in speed as the criterion. However, on second thought, I think you are right: it is more important to avoid unnecessary network traffic and duplicated results, so running wikilocation when we detect that the limit has been used up would be the best option.
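The agreed approach (use geonames, and switch to wikilocation only when geonames reports a problem) can be sketched as below. This is an illustration under assumptions, not Marble code: it assumes the JSON variant of the geonames service, which returns articles under a `geonames` key and reports errors as a `status` object with a numeric `value`, where 18, 19 and 20 are understood to mean that the daily, hourly or weekly credit limit was exceeded. The two fetch functions are hypothetical and are passed in by the caller.

```python
# Assumed geonames status codes for exceeded daily/hourly/weekly credit limits.
QUOTA_CODES = {18, 19, 20}


def quota_exhausted(reply):
    """True if a parsed geonames JSON reply signals an exceeded credit limit."""
    status = reply.get("status") if isinstance(reply, dict) else None
    return bool(status) and status.get("value") in QUOTA_CODES


def fetch_articles(bbox, fetch_geonames, fetch_wikilocation):
    """Return (source, articles), querying wikilocation only as a fallback."""
    try:
        reply = fetch_geonames(bbox)
    except OSError:
        # Network-level failure (urllib's URLError is a subclass of OSError).
        return "wikilocation", fetch_wikilocation(bbox)
    if quota_exhausted(reply):
        return "wikilocation", fetch_wikilocation(bbox)
    return "geonames", reply.get("geonames", [])
```

Because only one service is queried per request, this avoids the extra traffic and the duplicated placemarks that parallel queries would cause.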