Sunday, October 15, 2017

#Wikidata - motivation; thank you #Magnus

I added Baratunde A. Cola to Wikidata because he won the Alan T. Waterman Award. This month a Wikipedia article was written and I wanted to add some data to the item.

I did not because functionality that is key to me was broken. A new property was added and all the work that I had done on categories no longer showed in Reasonator. There was no willingness to consider the consequential loss of functionality and the result was a dip in my motivation.

Wikidata is important to me and I asked Magnus if he would help out and change Reasonator. He did.

Now I have added information to Mr Cola based on his categories. It matters that a category like this one reflects all the people known to have played for the Vanderbilt Commodores football team.

The issue is that at Wikidata, we have lost sight of these collaborative aspects. Everybody does his own thing and we hardly consider why. It is why user stories are so important; they tell you why something is done and what the benefit is.  In the end without a benefit there is no reason to do it.
Thanks,
      GerardM

Thursday, October 12, 2017

#Wikisource - the proof of the pudding

A user story for Wikisource could be: As Wikisourcerers we transcribe and format books so that our public may read these books electronically.

The proof of the pudding is therefore in the people who actually read the finished books. To future-proof the effort of the Wikisourcerers, it is vital to know all the books that are ready for reading. It is vital to know this for books in any and all languages supported.

There are two issues:
  • The status of the books is not sufficiently maintained in all the Wikisources
  • There is no tool that advertises finished books
To come to a solution, existing information could be maintained in Wikidata for all Wikisources in a similar way as is done for badges. With the information in Wikidata, queries can be formulated that show the books in whatever language, by whatever author.

Currently there are Wikisources that do not register this information at all. This does not prevent us from taking the necessary steps towards a queryable solution. After all, adding missing badges at a later date only adds to the size of the pudding, not to the proof of the pudding.
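A minimal sketch of the kind of query this would enable, assuming the status of each book is recorded per Wikisource much like a badge. The record layout, the status names and the sample entries are hypothetical placeholders; they are not the actual Wikidata data model.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical status values, loosely modelled on badges.
FINISHED = {"proofread", "validated"}


@dataclass(frozen=True)
class BookEdition:
    title: str
    author: str
    language: str           # language code of the Wikisource holding the text
    status: Optional[str]   # None when a Wikisource registers no status


def finished_books(editions, language=None, author=None):
    """Return the editions that are ready for reading, optionally filtered."""
    return [
        e for e in editions
        if e.status in FINISHED
        and (language is None or e.language == language)
        and (author is None or e.author == author)
    ]


# Placeholder sample data, for illustration only.
EDITIONS = [
    BookEdition("Book A", "Author X", "en", "validated"),
    BookEdition("Book A", "Author X", "nl", None),
    BookEdition("Book B", "Author X", "en", "in progress"),
    BookEdition("Book C", "Author Y", "nl", "proofread"),
]
```

An edition without a registered status is simply left out of the result; once the status is added later, the same query picks it up without any change.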
Thanks,
     GerardM

Tuesday, October 10, 2017

#Wikipedia discovers #OpenLibrary

On Facebook, Dumisani Ndubane posted his discovery of Open Library:
I just discovered that The Internet Archive has a book loan system, which gives me access up to 5 books for 14 days. So I have a library on my laptop!!! This is awesomest!!!
And it is. Anybody can borrow books from the Open Library (it is part of the Internet Archive). What Dumisani did not know at the time is that there are books in other languages to be found as well.

Dumisani found out by accident; he googled for an ebook called "Heart of Darkness" by Joseph Conrad. His next challenge: find the books in Xitsonga, and tell his fellow Wikipedians about it.
Thanks,
      GerardM

Wednesday, October 04, 2017

#Wikimedia - A user story for libraries

The primary user story for libraries is something like: As a library we maintain a collection of publications so that the public may read them in the library or at home.

Whatever else is done, it is to serve this primary purpose. In the English Wikipedia you will find at the bottom for many authors a reference to WorldCat. WorldCat is to entice people to come to their library.

It does not work for me.

My library is in Almere. I have stated in my WorldCat profile that I live in Almere, and I have indicated that my local library is my favourite. WorldCat indicates that the Peace Palace Library is nearby. It isn't.

When it does not work for me, it does not work for other people reading Wikipedia articles and consequently it needs to be fixed. So what does it take to fix WorldCat for the Netherlands, for me? WorldCat is used by a worldwide public and all the libraries of the world may benefit when WorldCat gets some TLC.
Thanks,
     GerardM

Monday, October 02, 2017

#Wikipedia - A user story for WikipediaXL: an end to the Cebuano issue

The user story for #Wikimedia is something like: As a Wikimedia community we share the sum of all knowledge so that all people have this available to them. 

As an achievable objective it sucks. The sum of all knowledge is not available to us either. To reflect this, the following is more realistic: As a Wikimedia community we share the sum of all knowledge available to us so that all people have this available to them.

When all people are to be served with the sum of all knowledge that is available to us, it is obvious that what we do serve depends very much on the language people are seeking knowledge in. What we offer is whatever a Wikipedia holds and this is often not nearly enough.

To counter the lack of information, bots add articles on subjects like "all the lakes in Finland". This information is not really helpful for people living in the Philippines but it does add to the sum of available information in Cebuano.

The process is as follows: an external database is selected. A script is created to build text and an infobox for each item in the database. This text is saved as an article in the Wikipedia. Information is then harvested from the article and included in Wikidata. One issue is that when the data is not "good enough", subsequent changes in Wikidata are not reflected in the Wikipedia article.

Turning the process around makes a key difference. An external database is selected. Selected data is merged into Wikidata. This data is used to generate only new article texts that are cached in all languages that have an applicable script. As the quality of the data in Wikidata improves, the cached articles improve.
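The reversed process could look something like the sketch below. It assumes that item data is available as a simple dictionary and that each language has its own text-building script; the class name, the scripts, the item identifier and the sample data are all hypothetical and only illustrate the idea of cached texts that follow the data.

```python
# Hypothetical per-language scripts: each turns item data into article text.
SCRIPTS = {
    "en": lambda d: f"{d['label']} is a lake in {d['country']}.",
    "nl": lambda d: f"{d['label']} is een meer in {d['country']}.",
}


class ArticleCache:
    """Caches generated texts and regenerates them when the data changes."""

    def __init__(self, scripts):
        self.scripts = scripts
        self._cache = {}  # (item_id, language) -> (data snapshot, text)

    def article(self, item_id, language, data):
        key = (item_id, language)
        snapshot = tuple(sorted(data.items()))
        cached = self._cache.get(key)
        if cached is not None and cached[0] == snapshot:
            return cached[1]  # data unchanged: serve the cached text
        text = self.scripts[language](data)  # data is new or changed
        self._cache[key] = (snapshot, text)
        return text


cache = ArticleCache(SCRIPTS)
item = {"label": "Example Lake", "country": "Finland"}  # placeholder data
first = cache.article("Q_EXAMPLE", "en", item)

# A later correction to the data shows up in the text on the next request.
item_fixed = {"label": "Example Lake", "country": "Sweden"}
second = cache.article("Q_EXAMPLE", "en", item_fixed)
```

Because the text is derived from the data on request, there is no saved article that can drift away from what Wikidata holds, and any language with a script gets the same item for free.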

With Wikipedia extended in this way, WikipediaXL, we become more adept at sharing the sum of our available knowledge. With caching enabled in this way, any language may benefit from all the data in Wikidata. It is important to consider the quality of new data. Data may come from a reputable source or from a source we collaborate with on the maintenance of the data. What is to be preferred is for another blogpost.

Saturday, September 30, 2017

#Wikipedia - #Wikidata user stories

User stories are important. They indicate why a certain functionality exists or the purpose of a project. A "user story" has a fixed format:
As a <insert a role> I would like to <insert an activity> so that I <insert a purpose>.
One user story is: As a Wikipedia editor, I can link an article to articles in other language(s) so that a Wikipedia reader can find an article in a language he or she can read.

Another user story:  As a Wikidata editor, I can maintain statements on Wikidata items so that Wikipedia readers always have the latest information available to them.

The first user story has been a resounding success. It is why Wikidata was relevant from the start. The second is very much a work in progress and it depends very much on how the current state of affairs is evaluated. There are dependencies for the efforts of so many to have an effect:
  • Readers of a Wikipedia can only see the result when the information has been included in Wikidata
  • Wikipedia readers will only see the result when the editors of their Wikipedia allow them to see it
The first dependency is with Wikidata editors but the second dependency is outside of the influence of Wikidata editors. For this reason it makes sense to formulate a different user story: As a Wikidata editor I can maintain statements on Wikidata items so that Wikipedia editors can take the responsibility to inform their public.

To help these Wikipedia gatekeepers there is a need for tools that make them aware of the information they do not provide.
Thanks,
      GerardM

Sunday, September 17, 2017

#Wikimedia and its #BLP approach


There is a huge controversy about the "Biographies of Living People" (BLP) policies. Central in all this is that there is no such policy at Wikidata. As a consequence, many seasoned Wikipedians are of the opinion that using data from Wikidata in Wikipedia is a violation of its BLP policy. At the same time there are seasoned Wikidatans who oppose a BLP policy similar to the one at Wikipedia. The problem is that Wikidata does need a BLP policy but it needs to be different for various reasons.

  • An item in Wikidata can be really rudimentary; Marian Latour, a Dutch author, was created because she won an award. This is allowed in Wikidata but the limited information is probably a violation of the English BLP policy. This information came from the Dutch Wikipedia.
  • The initial data of Wikidata were the interwiki links. This was a huge improvement for the Wikipedias and there are still many items that have no statements. This is used as an argument not to accept information from Wikidata.
  • Wikidata data is retrieved from a Wikipedia, information like "who won an award". Given the BLP policy of that Wikipedia it should be faultless but it often is not due to disambiguation issues.
The first issue refers to a red link on the Dutch Wikipedia. When the red link is associated with the Wikidata item, there will not be a new disambiguation issue when a different Marian Latour is introduced. Currently there is only one Marian Latour known to Wikidata.
The second issue is one where Wikidata statistics indicate that statements are slowly but surely being added. They also prove that there is still so much to do...
The third issue is the main one. When an article is linked to Wikidata, articles in other languages should link to the same item or to a red link. Solving these issues requires coexistence and preferably collaboration. 

What we need in a Wikipedia is the ability to link a blue or red link to a Wikidata item. Changing links is either blatantly obvious, as for Manuel Echeverria, or it requires a source. Technically the necessary change in the MediaWiki software may be "opt in" so that only people who care about this approach to quality make use of it.

As far as I am concerned, when some Wikipedians find fault elsewhere and do not reflect on this proposal and the improvements it brings them, that is fine. What is relevant is that this approach allows for the best Wikidata practices and at the same time improves the BLP quality in all Wikimedia projects.
Thanks,
       GerardM