About the project
MiDRASH is an international project that aims to reconstruct medieval Jewish book culture by making all medieval Hebrew texts accessible and searchable.
The project seeks to provide annotated automatic transcriptions of 30,000 medieval manuscripts in Hebrew, Aramaic, and Judeo-Arabic, as well as cutting-edge tools for manuscript research.
These tools will be based on digital paleography, NLP, and deep learning technologies for the enrichment, analysis, and re-use detection of medieval Jewish texts. The results of this project, both data and algorithms, will be made available to the public.
Team Synergy
Digital Atlas
Moshe LAVEE
The team, led by Dr. Moshe Lavee at the University of Haifa, will incorporate data and results from the project into a digital atlas for spatial and chronological exploration. With the atlas, we will be able to map and visualize diverse types of knowledge, including the chronology of manuscripts, writing styles, and the spread of works and traditions over time and space. In addition, the team will draw on the results of other teams in the project to study the reception of midrashic works and traditions as reflected in medieval midrashic anthologies. With the extensive transcriptions of both published and unpublished texts, we will be able to examine the rabbinic sources used by the various anthologies and the intellectual connections between various Jewish geo-cultural milieus in the formative period of midrashic literature.
Natural Language Processing
Avi SHMIDMAN
The team, led by Dr. Avi Shmidman at Bar-Ilan University, is developing a range of Natural Language Processing (NLP) tools. These tools have a dual purpose: to correct and enhance the automatic transcriptions produced by eScriptorium, and to support scholars in analyzing, processing, and searching textual data. The tools cover correction of handwritten text recognition (HTR) output, handling of code-switching, parallel text alignment, semantic analysis, morphological analysis, and syntactic analysis. Together, they aim to make the textual data more accurate, easier to interpret, and more usable in scholarly contexts.
Deep Learning
Nachum DERSHOWITZ
Professor Nachum Dershowitz’s team at Tel Aviv University develops and applies advanced computational techniques to solve complex problems in digital Hebrew paleography and to analyze medieval manuscripts in Hebrew characters. Deep learning is used to build state-of-the-art algorithms for automatically dating undated manuscripts, identifying the region of copying, and recognizing script types and modes. The research relies on continuous feedback between human paleographers and computer scientists to create new tools for Hebrew paleography, including algorithms for clustering and sub-clustering medieval manuscripts within known script type-modes, extraction of paleographical features, and new tools for page segmentation and layout recognition. The machine classifications are translated into human-understandable terms to ensure meaningful feedback.
Sustainability of MiDRASH’s products
Tsafra SIEW
The team, led by Dr. Tsafra Siew at the National Library of Israel (NLI), is exporting various national and international databases of images, metadata, and texts to support the MiDRASH project.
In addition to depositing the project’s products in Zenodo, the team must integrate them into NLI’s systems. This integration, in a format that aligns with the library’s infrastructure, aims to ensure optimal functionality for conservation, inclusion in national and international databases, and long-term maintenance and accessibility. The team is conducting a thorough analysis of the relationships between MiDRASH’s input data, resulting products, and NLI’s knowledge structures to facilitate effective product ingestion and management in the long run.
Automatic Transcription and Computational Philology
Daniel STOEKL BEN EZRA
Professor Daniel Stökl Ben Ezra’s team at the École Pratique des Hautes Études (EPHE, PSL) deals with the automatic analysis of layout, reading order, and transcription of books and manuscripts. At its heart stands eScriptorium, our open-source virtual research environment for automatic manuscript analysis. It provides ergonomic access to the AI core kraken, created by Benjamin Kiessling and used by many other teams and institutions on many other scripts and languages. In addition, Stökl’s team is responsible for three case studies in computational philology (textual fluidity, prehistory of texts and traditions, tracing the reader). All of this work will be carried out in close interaction with the other teams.
Hebrew paleography
Judith OLSZOWY-SCHLANGER
The team led by Professor Judith Olszowy-Schlanger at the École Pratique des Hautes Études, PSL, Paris, works towards a new methodology and new tools for the study of the medieval Hebrew script of all the geo-cultural areas. It is creating an online paleography album (HebrewPal), a fully searchable large library of annotated digital images of manuscripts, documents, and inscriptions. To create HebrewPal, the MiDRASH team works closely with the project Jewish Book Culture in the Islamicate World (Professor Ronny Vollandt, University of Munich); Digital Scholarship Oxford; and Hebrew Manuscripts in the Digital Ages (EPHE, PSL).