Sorting Cairo Geniza fragments began on August 8, 2017 with approximately 30,000 fragments. Each subject (images of the front and back of a fragment) had to be classified (meaning a series of questions had to be answered) 5 times, each time by a different sorter, before it was retired (complete). As of May 19, 2018, our #genizascribes have completed sorting these fragments, 1/10 of the entire Cairo Geniza! We are amazed, humbled, and energized by the participation.
Now that we have reached this milestone, we want to offer an update on where we are with this project.
- New images
You’ll notice that some of the statistics on the home page have dropped. This is because we recently added more images to the project from the University of Cambridge Libraries and the John Rylands Library at the University of Manchester. By the beginning of August, we’ll have uploaded all of the images we currently have, and we will be working on forming partnerships with more collections to gather another batch of images.
- Data from Sorting
The research team is just getting started with processing the great data from the first batch of sorted fragments. What we can say is that it looks really good! Whether you had experience or not, you were able to correctly identify script type. We still have a ways to go with this data. This will take time.
In a few weeks, we will invite you not only to sort Cairo Geniza fragments, but to transcribe them! The data #genizascribes generated through sorting fragments, together with the feedback we’ve received, has taught us what works and what doesn’t work with this special and highly complex corpus. We are working with the talented team at Zooniverse to build a custom interface for transcribing these fragments after they’ve been sorted. We’ve worked to design an interface that allows someone with no experience with or expertise in the Geniza to transcribe a fragment from the Cairo Geniza.
Overall, this project has moved astonishingly quickly, and we are so pleased not only with the speed, but also with the interactions on the talk boards and the immense amount of learning that seems to be taking place. The transcription interface and additional components are complicated, both because of the complexities of the Geniza corpus itself and because the interface is trilingual (Modern Arabic, Hebrew, and English). Even so, we are looking forward to seeing how the #genizascribes run with the new tools in the transcription tasks.
How Can You Help?
Sort Images! — We have a new batch of images to sort, as we explained above.
In a few weeks— Transcribe! We need all hands on deck. Tell everyone you know about this project. Join us in the talk boards, and tell us what you’re learning. No experience necessary!
By Judaica DH at the Penn Libraries.
Exported from Medium on April 14, 2020.