Code for SemDH 2024 Workshop Paper "A Corpus of Biblical Names in the Greek New Testament to Study the Additions, Omissions, and Variations across Different Manuscripts"

chr-werner/SemDH2024-GreekNewTestamentNames

SemDH2024-GreekNewTestamentNames

This repo contains the notebooks used to source the data for the paper "A Corpus of Biblical Names in the Greek New Testament to Study the Additions, Omissions, and Variations across Different Manuscripts", which was submitted to SemDH 2024: First International Workshop of Semantic Digital Humanities.

Structure of this Repository

data/                             General directory for downloaded and generated data
  |-- publish/                    Directory of cleaned up lists (will be generated by 05_pub_prep.ipynb)
  |-- tables/                     Directory containing manually curated lists
  |   `-- names.csv               List of manually curated names
  |-- transcriptions/             Directory of transcripts (will be created during download)
  |-- manuscripts/                Directory of manuscript metadata (will be created during download)
  |-- manuscripts.csv             Processed list of manuscripts (will be generated by 03_*.ipynb)
  |-- names.csv                   Processed list of names (will be generated by 02_get_words.ipynb)
  |-- occurrences.csv             Processed list of occurrences of names (will be generated by 04_search.ipynb)
  `-- verses.csv                  Processed list of verses in manuscripts (will be generated by 03_*.ipynb)
notebooks/                        Directory of notebooks used
  |-- 01_download.ipynb           Download files from the IGNTP and NTVMR (TEI files and JSON files)
  |-- 02_get_words.ipynb          Preprocess manually curated list of names for later search
  |-- 03_1_teiparse.ipynb         Parsing TEI files for manuscript metadata and verses
  |-- 03_2_jsonparse.ipynb        Parsing JSON files for manuscript metadata
  |-- 03_3_sparql.ipynb           Enriching manuscript metadata with data from dbpedia
  |-- 04_search.ipynb             Search for occurrences and omissions of names in verses
  |-- 05_pub_prep.ipynb           Clean up processed lists
  |-- constants.py                Constants
  |-- convertes.py                Converter functions
  |-- TEIFile.py                  Class file for TEIFile
  |-- utils.py                    Helper functions
  `-- tests.py                    Testing functions
.python-version                   Python version indicator
README                            This README
requirements.txt                  Requirements for Python environment
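
The listing above ties each generated file to the notebook that produces it. As a minimal sketch (the helper name `missing_outputs` and the mapping `GENERATED_BY` are illustrative and not part of the repo), the following reports which generated outputs are still absent and which notebook the listing names for each:

```python
from pathlib import Path

# Generated outputs and the notebook that, according to the directory
# listing, produces them. The names are copied from the listing; the
# mapping itself is an illustration, not a file in the repo.
GENERATED_BY = {
    "data/names.csv": "02_get_words.ipynb",
    "data/manuscripts.csv": "03_*.ipynb",
    "data/verses.csv": "03_*.ipynb",
    "data/occurrences.csv": "04_search.ipynb",
    "data/publish": "05_pub_prep.ipynb",
}


def missing_outputs(repo_root):
    """Return {relative_path: notebook} for every generated output not yet present."""
    root = Path(repo_root)
    return {
        rel: notebook
        for rel, notebook in GENERATED_BY.items()
        if not (root / rel).exists()
    }
```

Calling `missing_outputs(".")` from the project's root directory returns an empty dictionary once all notebooks have been run.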

Install and Use

The recommended Python version for this repo is 3.9.18 (see .python-version). Docker images with Python preinstalled can be found on Docker Hub. Alternatively, you can set up and run a virtual Python environment.
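
To confirm that an environment matches the recommended version, a small check along these lines may help. It is a sketch that assumes .python-version holds a plain version string such as 3.9.18; the function names are illustrative and not part of the repo.

```python
import sys
from pathlib import Path


def parse_version(text):
    """Turn a version string such as '3.9.18' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def same_minor_version(pinned, running):
    """True when major and minor numbers match; the patch level is ignored."""
    return pinned[:2] == running[:2]


def check_python_version(repo_root="."):
    """Compare the running interpreter with the version in .python-version.

    Assumption: the file contains only a dotted version number.
    """
    pinned = parse_version(Path(repo_root, ".python-version").read_text())
    running = tuple(sys.version_info[:3])
    return same_minor_version(pinned, running)
```

Comparing only the major and minor numbers is a design choice of this sketch; use a stricter comparison if the exact patch release matters.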

In your Python environment, run pip install -r requirements.txt from the project's root directory to install Jupyter. This will enable you to run the notebooks.

At runtime, each notebook automatically downloads and installs the packages and modules it requires in its own kernel.
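
The notebooks can also be executed without opening them, for example with `jupyter nbconvert --execute`. The sketch below rests on several assumptions: the `jupyter` command is on the PATH, the notebooks are meant to run in the order of their numeric prefixes, and they expect `notebooks/` as the working directory. The helper names are illustrative.

```python
import subprocess
from pathlib import Path

# Notebook order taken from the numeric prefixes in the directory listing.
# Assumption: this is also the intended execution order.
NOTEBOOKS = [
    "01_download.ipynb",
    "02_get_words.ipynb",
    "03_1_teiparse.ipynb",
    "03_2_jsonparse.ipynb",
    "03_3_sparql.ipynb",
    "04_search.ipynb",
    "05_pub_prep.ipynb",
]


def nbconvert_command(notebook):
    """Build the command that executes one notebook and saves the result in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(notebook_dir="notebooks"):
    """Execute every notebook in order, stopping at the first failure."""
    for name in NOTEBOOKS:
        subprocess.run(nbconvert_command(name), cwd=Path(notebook_dir), check=True)
```

Calling `run_all()` from the project's root directory runs the whole pipeline, including the downloads performed by the first notebook.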

SPARQL Queries

We used a SPARQL query to retrieve an initial list of biblical names in the New Testament.

Endpoint: https://database.factgrid.de/query

SELECT ?Person ?PersonLabel ?noted ?notedLabel ?GenderLabel ?link ?book
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  
  ?Person wdt:P2 wd:Q8811.
  ?Person wdt:P143 ?noted.
  ?noted wdt:P8 ?book.

  FILTER (?book IN (wd:Q74942, wd:Q74943, wd:Q74944, wd:Q74945, wd:Q74946, wd:Q74947, wd:Q74948, wd:Q74949, wd:Q74950, wd:Q74951, wd:Q74952, wd:Q74953, wd:Q74954, wd:Q74955, wd:Q74956, wd:Q74957, wd:Q74958, wd:Q74959, wd:Q74960,  wd:Q74961, wd:Q74962, wd:Q74963, wd:Q74964, wd:Q74965, wd:Q74966, wd:Q74967, wd:Q74968)) 
  
  OPTIONAL { ?Person wdt:P154 ?Gender. }
  OPTIONAL { ?link schema:about ?Person ; schema:isPartOf <https://www.wikidata.org/> . }
}
ORDER BY (?PersonLabel)
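
The query above can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes that the machine-readable endpoint is `https://database.factgrid.de/sparql` (the URL given above is the query interface) and that the service returns the standard SPARQL JSON results format; all function names are illustrative. The `book_filter` helper rebuilds the list in the FILTER clause, which consists of 27 consecutive item IDs.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the SPARQL service sits next to the query interface named above.
SPARQL_ENDPOINT = "https://database.factgrid.de/sparql"


def book_filter(first=74942, last=74968):
    """Build the comma-separated wd:Q... list used in the FILTER clause."""
    return ", ".join(f"wd:Q{n}" for n in range(first, last + 1))


def request_url(query, endpoint=SPARQL_ENDPOINT):
    """Encode the query as a GET parameter, as the SPARQL protocol allows."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def flatten_bindings(payload):
    """Reduce SPARQL JSON results to one {variable: value} dict per row."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in payload["results"]["bindings"]
    ]


def fetch_rows(query, endpoint=SPARQL_ENDPOINT):
    """Send the query and return the flattened rows (performs a network call)."""
    request = urllib.request.Request(
        request_url(query, endpoint),
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "sparql-example/0.1",  # placeholder identifier
        },
    )
    with urllib.request.urlopen(request) as response:
        return flatten_bindings(json.load(response))
```

Passing the full query text to `fetch_rows` yields one dictionary per result row, keyed by variable names such as `PersonLabel`; variables bound only inside an OPTIONAL block are absent from rows where they have no value.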

Updates and Refinements

This repo is updated over time. See the release tags for previous versions.

How to Cite

If you use this code or data in your research, please cite:

@inproceedings{Werner2024,
  title = {A Corpus of Biblical Names in the Greek New Testament to Study the Additions, Omissions, and Variations across Different Manuscripts},
  author = {Christoph Werner and Zacharias Shoukry and Soham Al-Suadi and Frank Krüger},
  url = {https://ceur-ws.org/Vol-3724/paper6.pdf},
  crossref = {SemDH2024},
  year = {2024},
  abstract = {The analysis of textual variants of verses in the New Testament across different manuscripts has mainly been done by close reading with manual effort. With the increasing number of transcriptions of the different manuscripts, quantitative analyses (so-called distant reading) can be used to search for patterns of omission, addition, or other variations, to formulate novel hypotheses to be investigated by close reading. In this work, we present a corpus of biblical names including spelling variation and inflections and their mentions in the transcriptions of the New Testament. By integrating and semantically enriching the data collected from different sources, we established a corpus that can be used for the quantitative study of omission, addition, and variation of such biblical names. To illustrate the corpus, we implement some use cases and show that well-known cases can be quantitatively reproduced. The corpus and all code are published under open licenses to enable reproduction, update, and maintenance.},
  keywords = {New Testament,Biblical Names,Textual Variation Units},
}

@proceedings{SemDH2024,
  booktitle = {Semantic Digital Humanities 2024},
  year = {2024},
  editor = {Oleksandra Bruns and Andrea Poltronieri and Lise Stork and Tabea Tietz},
  series = {CEUR Workshop Proceedings},
  address = {Aachen},
  issn = {1613-0073},
  url = {https://ceur-ws.org/Vol-3724/},
  venue = {Hersonissos, Greece},
  eventdate = {2024-05-27},
  title = {Proceedings of the First International Workshop of Semantic Digital Humanities (SemDH 2024)}
}

Versions of Generated Data on Zenodo
