Help transliterating Hebrew into Latin script in XeLaTeX

Answer 1

The best solution I found lies outside LaTeX. I now parse the .tex file with a Python script first and then compile the resulting document.

See the Python script below:

# This Python script takes a string (a single word or one or more sentences) and, in each word,
# removes the sheva on the last consonant. If that sheva was present, it also removes the sheva
# on the second-to-last consonant. A dagesh between a consonant and its sheva does not prevent the removal.
import re

def remove_sheva(text):
    # Defining dagesh and sheva
    sheva = '\u05B0'
    dagesh = '\u05BC'

    # Splitting the text into single words
    words = text.split()

    # Treating each word separately
    new_words = []
    for word in words:
        # Finding the indices of all consonants in the word
        consonant_indices = [m.start() for m in re.finditer(r'[\u05D0-\u05EA]', word)]

        # If the word has no consonants, add it to the list of new words unchanged
        if not consonant_indices:
            new_words.append(word)
            continue

        # Finding the indices of the last and second-to-last consonants.
        last_consonant_index = consonant_indices[-1]
        second_last_consonant_index = consonant_indices[-2] if len(consonant_indices) > 1 else None

        # Checking for any sheva after last consonant
        sheva_after_last_consonant = False
        if last_consonant_index + 1 < len(word) and word[last_consonant_index + 1] == sheva:
            sheva_after_last_consonant = True
            word = word[:last_consonant_index + 1] + word[last_consonant_index + 2:]
        elif last_consonant_index + 2 < len(word) and word[last_consonant_index + 1] == dagesh and word[last_consonant_index + 2] == sheva:
            sheva_after_last_consonant = True
            word = word[:last_consonant_index + 2] + word[last_consonant_index + 3:]

        # Removing sheva between the two last consonants, but only if there is a sheva on last consonant.
        if second_last_consonant_index is not None and sheva_after_last_consonant:
            if second_last_consonant_index + 2 < len(word) and word[second_last_consonant_index + 1] == dagesh and word[second_last_consonant_index + 2] == sheva:
                word = word[:second_last_consonant_index + 2] + word[second_last_consonant_index + 3:]
            elif second_last_consonant_index + 1 < len(word) and word[second_last_consonant_index + 1] == sheva:
                word = word[:second_last_consonant_index + 1] + word[second_last_consonant_index + 2:]

        new_words.append(word)

    # Joining the words back into a sentence (all whitespace is normalized to single spaces).
    new_text = ' '.join(new_words)
    return new_text
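
One caveat when running the script above over a whole .tex file: text.split() followed by ' '.join() collapses every run of whitespace, so line breaks and the blank lines that separate paragraphs would be lost. The following is a minimal sketch of a layout-preserving variant, not the original author's code. The names remove_sheva_word, remove_sheva_keep_layout, preprocess_tex and compile_tex are hypothetical, and compile_tex assumes that the xelatex binary is on PATH.

```python
import re
import subprocess
from pathlib import Path

SHEVA = "\u05B0"
DAGESH = "\u05BC"
CONSONANT = re.compile("[\u05D0-\u05EA]")


def _drop_sheva_after(word, idx):
    """Drop a sheva directly after word[idx], allowing one dagesh in between.

    Returns the (possibly shortened) word and whether a sheva was removed.
    """
    rest = word[idx + 1:]
    if rest.startswith(SHEVA):
        return word[:idx + 1] + rest[1:], True
    if rest.startswith(DAGESH + SHEVA):
        return word[:idx + 2] + rest[2:], True
    return word, False


def remove_sheva_word(word):
    """Apply the same rule as remove_sheva above to a single token."""
    indices = [m.start() for m in CONSONANT.finditer(word)]
    if not indices:
        return word
    word, removed = _drop_sheva_after(word, indices[-1])
    # Only touch the second-to-last consonant if the last one lost a sheva.
    if removed and len(indices) > 1:
        word, _ = _drop_sheva_after(word, indices[-2])
    return word


def remove_sheva_keep_layout(text):
    # The capturing group makes re.split return the whitespace runs too,
    # so joining the pieces restores the original spacing and line breaks.
    return "".join(remove_sheva_word(part) for part in re.split(r"(\s+)", text))


def preprocess_tex(src, dst):
    """Hypothetical helper: write a sheva-stripped copy of src to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(remove_sheva_keep_layout(text), encoding="utf-8")


def compile_tex(path):
    """Hypothetical helper: assumes the xelatex binary is on PATH."""
    subprocess.run(["xelatex", "-interaction=nonstopmode", str(path)], check=True)
```

Calling preprocess_tex("input.tex", "input-stripped.tex") and then compile_tex("input-stripped.tex") would mirror the preprocess-then-compile workflow described above. The file names are placeholders.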
