GLAM/Case studies/Javanese Wikisource Competition
Javanese Wikisource (Wikisumber) was launched in 2021, after having been incubated in the Multilingual Wikisource since 2004.
In 2023, GLAM Indonesia, under Wikimedia Indonesia, together with the Yogyakarta Wikimedia Community, organized the first Javanese Wikisource Competition.
Part of this page was originally written for GLAM Indonesia and then expanded, following the example of another case study on Outreach Wiki.
The Competition
- The Wikisource 2023 competition was held on 6-20 March 2023.
- The landing page was created on Javanese Wikisource (https://jv.wikisource.org/wiki/Wikisumber:Kompetisi_Wikisumber_2023), with pages for the rules, the list of books, the results, and questions and answers. Another page listed the templates most commonly used during the competition: https://jv.wikisource.org/wiki/Pitulung:Cithakan
- Books prepared: 30 Javanese books in Latin letters (10 books on March 6, plus 10 books on March 10, and 10 more books on March 14) and 3 manuscripts in Javanese script (available from the beginning of the competition)
- Questions, answers, and announcements were communicated to participants via a WhatsApp group.
- All participants were added to the WhatsApp group, except for two who did not have WhatsApp or did not provide telephone numbers.
- The assessment used two WSContest pages because there were two kinds of points, one for books with Latin script and the other for manuscripts with Javanese script:
- https://wscontest.toolforge.org/c/60 - for scoring books with Latin letters (3 for proofreading, 1 for validating)
- https://wscontest.toolforge.org/c/93 - for evaluating Javanese script books (6 for proofreading, 2 for validating)
Where did the participants come from?
The competition was open to all Indonesians, not only those from an ethnically or linguistically Javanese background but also other ethnic and language groups, because proofreading the Latin-script Javanese books required no knowledge of the language. The exception was the 3 manuscripts, which required knowledge of Javanese script.
There were 49 people registered from all around Indonesia (not only from Java), and 19 of them said that they had never edited any Wiki project before (one of them managed to take second place). Unfortunately, we did not think to ask how many had edited other (non-Wikisource) Wikimedia projects but never Wikisource.
Because the registered participants came from different language backgrounds, we mostly used Indonesian for communication in the chat group and on the competition page (https://jv.wikisource.org/wiki/Wikisumber:Kompetisi_Wikisumber_2023).
There were 32 participants who contributed to the project during that time, and 27 of them were considered active participants because they met the minimum criterion of 12 proofread pages.
What technical steps were needed to prepare for the competition?
Indonesian Wikisource has held an annual competition since 2020, although the very first such competition was held in 2012 (https://id.wikimedia.org/wiki/Beasiswa_Perjalanan_Jakarta).
Each iteration tries to improve on the previous one. We mostly reuse the basic structure of the landing page from the previous competition. For 2022-23, Wikimedia Indonesia planned to hold 3 Wikisource competitions: one Balinese, one Javanese, and one Indonesian. A fourth competition, for Sundanese Wikisource, was also held because its community became active this year (it was held on the Multilingual Wikisource).
Just before the Javanese Wikisource competition, we had to expand and revamp many rules [1] to deal with the complexities of proofreading in 2 different writing scripts.
We also had video tutorials on how to type Javanese script using the Universal Language Selector. These had been made the previous year. We also provided other resources, such as a PowerPoint presentation, screenshots, and an extensive Q&A that was constantly updated based on participants' feedback in the chat group (https://jv.wikisource.org/wiki/Wikisumber:Kompetisi_Wikisumber_2023/Bantuan).
The list of most-used templates was also created and expanded [2], so that new contributors who were not familiar with wikitext would not be left behind by the more experienced editors. Indeed, the first-place winner had never contributed to Wikisource before, and the second-place winner was a newcomer to Wikimedia projects.
Other than that, we needed to create two different WSContest pages, one for Latin books and the other for Javanese script manuscripts, because the latter earn double points.
- https://wscontest.toolforge.org/c/60 - for scoring books with Latin letters (3 for proofreading, 1 for validating)
- https://wscontest.toolforge.org/c/93 - for evaluating Javanese script books (6 for proofreading, 2 for validating)
Also, GLAM appointed two community members as volunteer competition committee members. They were responsible for answering participants' questions in the chat group, preparing the landing page and the list of books, overseeing the progress of the competition (spotting any irregularities or participants who unknowingly broke the rules), announcing the results, and preparing the final report.
- T. S. Setyawati
- Febri M. N.
Creation of a campaign page
As mentioned above, we used the landing page of the previous Indonesian Wikisource Competition as a template:
- https://id.wikimedia.org/wiki/Kompetisi_Wikisource_2022, which in turn was from
- https://id.wikimedia.org/wiki/Kompetisi_Wikisource_2021, which in turn was from
- https://id.wikimedia.org/wiki/Kompetisi_Wikisource_2020
What did preparing the book list look like? What steps did they have to go through? How did they choose them?
The Yogyakarta Wikimedia Community prepared the books beforehand and applied to Wikimedia Indonesia (WMID) for a grant to digitize them. The books come from personal collections and from Wikimedia Yogyakarta's collection. 4 volunteers scanned the books using mobile apps and uploaded 40 books to Commons: https://commons.wikimedia.org/wiki/Category:Hibah_Dana_Wiki_Komunitas_Yogyakarta_2022 Of these, 30 were used in the competition.
The 3 special manuscripts came from the British Library. I downloaded them and compiled them into PDFs before uploading them to Commons. GLAM Indonesia had already been in touch with the British Library's Dr. Gallop and had informed them of the plan to use some of the Javanese manuscripts in their collection for the competition. The British Library then told us that they had collaborated with Yayasan Sastra Lestari from Solo to transliterate 3 special manuscripts (3 of the most beautiful manuscripts in their collection). We decided to use those 3 manuscripts, because having a transliterated text would greatly help the transcription process in the absence of a Javanese script OCR.
How did you promote it?
We did not use a site notice, the village pump, or mailing lists. Rather, we used the Yogyakarta Wikimedia Community's monthly gatherings to announce the competition several months in advance.
Wikimedia Indonesia promoted the event through their social media accounts: Telegram, Twitter, and Instagram. We didn't use Facebook.
- https://twitter.com/wikimediaid/status/1631625782845132804
- https://twitter.com/wikimediaid/status/1639504176467288066
From our previous experience, we estimated that those channels were enough to gather the necessary number of proofreaders. We wanted neither too many nor too few participants. One source of participants was recurring participants: those who had taken part in previous competitions, whether Wikisource or other GLAM / Wikimedia competitions, were very likely to join again. The rules only barred past winners from the first and second prizes; they could still receive various merchandise prizes.
This is the benefit of holding multiple competitions each year, and of annual events that people look forward to joining. Our roster of participants was a fairly even mix of veterans and newcomers.
What kinds of educational material did they need to produce before developing the contest?
As mentioned above:
- Video tutorials and PowerPoint slides are available at: https://commons.wikimedia.org/wiki/Category:Javanese_Wikisource_competition_2023
- List of templates: https://jv.wikisource.org/wiki/Pitulung:Cithakan
- Wikimedia Indonesia's podcast about Wikisource: https://www.youtube.com/watch?v=y6D2vmnQ0GI and https://www.youtube.com/watch?v=YLuJ6av4cZI, plus several user-produced videos
- etc.
How did the organizers develop rules for the contest?
https://jv.wikisource.org/wiki/Wikisumber:Kompetisi_Wikisumber_2023/Peserta,_aturan,_dan_hadiah
In a nutshell:
- Only edit during 6-20 March
- Have an account and be logged in while editing
- Be registered through Google Forms
- Start proofreading/validating the selected books
- Follow the rules; contributions are counted automatically via WSContest
Then the specific rules regarding what counts as proofreading/validating:
- Red - not proofread, no points
- Yellow - proofread, 3 points for Latin, 6 points for Javanese script, and various rules of what is considered "proofread"
- Green - validated, 1 point for Latin, 2 points for Javanese script (in the end, however, no one validated any Javanese script pages. Perhaps the points were not high enough, or proofreading Javanese script was beyond the skill of the proofreaders).
- Grey - no text, no points
- Purple - problematic, or image only, no points
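The point scheme above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how WSContest itself is implemented; the names `POINTS` and `contest_score` and the sample counts are hypothetical.

```python
# Point values taken from the rules above.
# Red, grey, and purple pages earn no points, so they are not listed.
POINTS = {
    "latin": {"proofread": 3, "validate": 1},
    "javanese": {"proofread": 6, "validate": 2},
}


def contest_score(counts):
    """Return the total points for a participant.

    `counts` maps (script, action) pairs to a number of pages,
    e.g. {("latin", "proofread"): 10}.
    """
    return sum(
        POINTS[script][action] * pages
        for (script, action), pages in counts.items()
    )


# Hypothetical participant: 10 Latin pages proofread, 5 Latin pages
# validated, and 2 Javanese script pages proofread.
example = {
    ("latin", "proofread"): 10,
    ("latin", "validate"): 5,
    ("javanese", "proofread"): 2,
}
print(contest_score(example))  # 10*3 + 5*1 + 2*6 = 47
```

Because the competition used two separate WSContest pages, the real scores were tracked per contest page; the sketch simply adds both kinds of points together.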
General rules
- Can only save from red to yellow, or from yellow to green. Cannot lower the status of any page.
- Only allowed to work on one page at a time (red status). Can work on another page only after turning the current one yellow. Cannot edit other participants' "red" status pages.
- Can only edit your own red/green status pages. Of other participants' pages, can only edit those with "yellow" status.
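The status rules above can be expressed as two small checks. This is a minimal sketch under our own reading of the rules: the function names are hypothetical, and saves that leave the status unchanged are not modelled.

```python
# Status changes permitted by the general rules: only one step up at a time.
ALLOWED_STATUS_CHANGES = {("red", "yellow"), ("yellow", "green")}


def is_allowed_status_change(old_status, new_status):
    """True if a save may move a page from old_status to new_status."""
    return (old_status, new_status) in ALLOWED_STATUS_CHANGES


def may_start_new_page(own_red_pages):
    """A participant may open a new page only when none of their
    own pages is still in red (in-progress) status."""
    return own_red_pages == 0


print(is_allowed_status_change("red", "yellow"))   # True
print(is_allowed_status_change("green", "yellow"))  # False: lowers the status
print(is_allowed_status_change("red", "green"))     # False: skips yellow
print(may_start_new_page(1))                        # False
```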
Javanese script rules
- Must use Unicode Javanese script
- Must finish transcribing the whole page for yellow status. Some mistakes are permitted at this level.
- Must match the source for green status. Only very minor mistakes are allowed.
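As an illustration of the first rule, the share of a transcription that is actually typed in Unicode Javanese script can be estimated by checking code points against the Javanese block (U+A980 to U+A9DF). This is a hypothetical helper, not a tool the committee used; the function name and the idea of measuring a share are our own assumptions.

```python
# The Unicode "Javanese" block spans U+A980 through U+A9DF.
JAVANESE_BLOCK = range(0xA980, 0xA9DF + 1)


def javanese_share(text):
    """Return the fraction of non-whitespace characters in `text`
    that belong to the Unicode Javanese block (0.0 for empty text)."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    javanese = sum(1 for ch in chars if ord(ch) in JAVANESE_BLOCK)
    return javanese / len(chars)


# Two code points from the Javanese block, written as escapes so the
# example does not depend on font support.
print(javanese_share("\ua997\ua9ae"))  # 1.0
print(javanese_share("Jawa"))          # 0.0
```

A low share on a manuscript page would suggest that the text was typed as a Latin transliteration rather than in Javanese script, which a reviewer could then check by hand.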
What tools were needed?
WSContest tool
Outcomes and Follow Up
https://jv.wikisource.org/wiki/Wikisumber:Kompetisi_Wikisumber_2023/Hasil
User:WanaraLima won the contest by proofreading 447 pages and validating 346 others (all Latin-script pages). He received a new Asus laptop, a community shirt, and GLAM Indonesia merchandise. Although ethnically Javanese, he could not speak Javanese.
User:Kriita came a close second by proofreading 291 Latin pages and 111 Javanese script pages, and validating 115 Latin pages. He received a new Huawei tablet and, together with the 3rd to 10th top contributors, the same goodies as the first-place winner.
The rest of the participants who proofread a minimum of 12 pages received the GLAM merchandise without the community shirt.
Outcome Statistics
- 49 registered participants (19 participants answered that they had never contributed to a Wikimedia project before)
- Contributing participants: 32 (27 participants actively worked on at least 12 pages)
- Duration of the competition: 14 days
- Number of books: 33
- Total number of pages: 3,912
- Number of pages not proofread (red): 124
- Number of proofread pages (yellow): 1,064
- Number of validated pages (green): 1,580
- Total number of red, yellow, and green pages: 2,768
- Total number of edits: 9,390
- Edits per page made: 3-4 edits/page
- Edits per active participant: 348 edits/participant
- Edits per day: 671 edits/day
- Pages per day: 198 pages/day
- Pages per active participant: 103 pages/participant
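The derived figures above can be reproduced from the raw counts. The sketch below assumes, based on the numbers themselves rather than on any statement by the committee, that the per-day and per-participant figures divide by the 2,768 red, yellow, and green pages, the 14 days, and the 27 active participants.

```python
# Raw counts from the statistics above.
red, yellow, green = 124, 1064, 1580
total_edits = 9390
days = 14
active_participants = 27

pages_worked = red + yellow + green
print(pages_worked)                                # 2768

# Derived figures, rounded to the nearest whole number as in the list.
print(round(total_edits / pages_worked, 1))        # 3.4 edits/page
print(round(total_edits / active_participants))    # 348 edits/participant
print(round(total_edits / days))                   # 671 edits/day
print(round(pages_worked / days))                  # 198 pages/day
print(round(pages_worked / active_participants))   # 103 pages/participant
```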
Evaluations
- Questions frequently asked by participants:
- How should formatting and diacritics be handled? When a word is broken across two lines, should it be written with a hyphen as in the original text? How should pages containing images be handled?
- What are the formulas/templates for numbering or details, dialogues, two columns, and words in a box?
- What are the requirements for changing a page's status between red, yellow, and green?
- If a page is being worked on and has been given red status, but another participant then continues it and changes the status to yellow, what should I do?
- Competition process:
- The books in the competition varied in length (thick and thin), so they were not too burdensome for the participants.
- The Latin OCR tool really helped speed up the transcription process, so more pages were completed, but for text quality and neatness careful manual proofreading was still needed.
- The WSContest tool was quite helpful for scoring, as it records every edit made by the participants.
- Committee:
- Need to provide a more complete video tutorial on how to create a page with a desktop view (not enabling the desktop display) and how to set red, yellow, or green status.
- Need to provide a list of HTML formulas/writing templates, markers for missing images, and instructions on how to add images.
- It is necessary to regularly remind participants about the rules of the competition, so that each participant maintains sportsmanship and responsibility when creating or proofreading pages and does not harm other participants.
- It is necessary to regularly remind participants that quality takes precedence over quantity so that they are more thorough.
- It is necessary to add the competition page link and the group invite link to the description of the participants' WhatsApp group.
See also
- The original report in Indonesian - https://id.wikimedia.org/wiki/GLAM_2022/Kompetisi_Wikisumber_2023 by T. S. Setyawati; parts of it were translated here using machine translation
- The blog post on Diff - https://diff.wikimedia.org/2023/03/13/javanese-wikisource-competition/ (in Javanese, Indonesian, and English)
- The blog on Wikimedia Indonesia - https://wikimedia.or.id/2023/03/21/kompetisi-wikisumber-2023/ and on the digitization of the books - http://wikimedia.or.id/2023/04/10/melestarikan-bahasa-jawa-melalui-digitalisasi-buku/
- GLAM/Case studies/Bengali Wikisource 10th Anniversary Proofreading Contest
- In social media: Start of the competition - End of competition