TranslitKit

TranslitKit is a framework for Hebrew-English transliteration.

Installation

gem install translit_kit

# in your Gemfile
gem 'translit_kit'

Requires Ruby 2.2 or later.

Usage

Basic transliteration

  require 'translit_kit'
  word = HebrewWord.new "אַברָהָם"
  word.transliterate(:single)
  # => ["avrohom"]

  # Shortcut
  word.t(:single)
  # => ["avrohom"]

Transliteration is powered by phoneme maps: files that map Hebrew phonemes (units of sound) to English characters. The file format is described under "Adding Custom Phoneme maps".

Three phoneme maps are provided: :long, :short, and :single. You can also add your own (see "Adding Custom Phoneme maps").

  word.t(:single)
  # => ["avrohom"]
  word.t(:short)
  # => ["avroom", "avroam", "avroem", "avrohom", "avroham",
  # "avrohem", "avraom", "avraam", "avraem", "avrahom",
  # "avraham", "avrahem", "avreom", "avream", "avreem",
  # "avrehom", "avreham", "avrehem"]
  word.t(:long)
  # => ["avroom", "avrooom", "avroohm", ... ] # 5,997 more!

The default is :short:

  word.t == word.t(:short)
  # => true

To get the total permutation count for each built-in map, call HebrewWord#inspect.

word.inspect
# => "אַברָהָם: Permutations: 1 single | 18 short | 6000 long"
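
The counts above are consistent with each phoneme contributing an independent set of choices, so that the total is the product of the per-phoneme option counts. The following is a minimal sketch of that idea, not TranslitKit's actual implementation. The `options` array and the `permutations` helper are hypothetical, and the option lists are inferred from the :short output shown earlier rather than read from the gem's map files.

```ruby
# Hypothetical per-phoneme replacement options, inferred from the :short
# results listed above. They are NOT loaded from TranslitKit's map files.
options = [
  ["a"], ["v"], ["r"],
  ["o", "a", "e"],  # a vowel with three candidate spellings
  ["", "h"],        # a consonant that may be silent or written as "h"
  ["o", "a", "e"],  # the same vowel again
  ["m"]
]

# Hypothetical helper: builds every combination by taking the Cartesian
# product of the option lists, then joins each combination into a word.
def permutations(options)
  first, *rest = options
  first.product(*rest).map(&:join)
end

results = permutations(options)

# The total equals the product of the individual option counts: 3 * 2 * 3.
count = options.map(&:length).reduce(1, :*)

puts results.length    # 18
puts count             # 18
puts results.first(4).inspect
# ["avroom", "avroam", "avroem", "avrohom"]
```

Under this assumption, a map that offers more spellings per phoneme (such as :long) multiplies the total quickly, which matches the jump from 18 to 6000 results.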

Adding Custom Phoneme maps

Format

Phoneme Maps are simply JSON files, placed in the lib/phoneme_maps directory.

The file should map each String (a phoneme) to an Array of replacement characters.

{
  "ב": ["v"],
  "בּ": ["b", "bb"]
}

A phoneme can be a Hebrew character (א), a nekuda (ָ), or a character with modifiers, such as a dagesh (בּ). Keep in mind that many characters are normalized during pre-processing (see "Appendix: Pre-Processing").
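
Before installing a custom map, it can help to check that the file has the shape described above. This is a minimal sketch using Ruby's standard JSON library. The `valid_phoneme_map?` helper is hypothetical and not part of TranslitKit, and the gem may apply different or additional checks.

```ruby
require 'json'

# Hypothetical helper (not part of TranslitKit): returns true when the
# parsed map is a Hash whose keys are Strings and whose values are
# Arrays of Strings, as described in the Format section.
def valid_phoneme_map?(map)
  map.is_a?(Hash) &&
    map.all? do |phoneme, replacements|
      phoneme.is_a?(String) &&
        replacements.is_a?(Array) &&
        replacements.all? { |r| r.is_a?(String) }
    end
end

# The example map from above, parsed from a JSON string.
good = JSON.parse('{ "ב": ["v"], "בּ": ["b", "bb"] }')

# A malformed map: the value is a bare String instead of an Array.
bad = JSON.parse('{ "ב": "v" }')

puts valid_phoneme_map?(good)  # true
puts valid_phoneme_map?(bad)   # false
```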

Installation

To install your custom map, place the file in lib/resources.

Your file will be available as the symbol :<filename>, without the .json extension.

Example: klingon.json becomes :klingon

Now you can use it anywhere:

  word.transliterate(:klingon)
  # => (Results)

At present, custom maps are not included in the output of HebrewWord#inspect.
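
The filename-to-symbol rule described in this section can be sketched with Ruby's standard File.basename. The `map_name_for` helper is hypothetical and only illustrates the naming rule. TranslitKit's own lookup code may work differently.

```ruby
# Hypothetical helper (not part of TranslitKit): derives the map symbol
# from a file path by dropping the directory and the .json extension.
def map_name_for(path)
  File.basename(path, ".json").to_sym
end

puts map_name_for("klingon.json").inspect           # :klingon
puts map_name_for("some/dir/klingon.json").inspect  # :klingon
```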

Contributing

TranslitKit is currently maintained by @AnalyzePlatypus. Contributions welcome!

Appendix: Pre-Processing

When a word is transliterated, it is pre-processed to normalize certain characters. Specifically:

Whitespace is stripped
The final letters [םןךףץ] are normalized to their standard forms
CHATAF nekudos ['ֲ','ֳ','ֱ'] are normalized to their standard forms
Full CHIRIK, TZEIREI, and CHOLOM nekudos have their letters removed
DAGESH characters are removed from all but the characters [בוכפת]
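
Four of the five steps above can be sketched with plain String methods. This is an illustration under stated assumptions, not the gem's code: the `normalize` method and the constants are hypothetical, "stripped" is assumed to mean leading and trailing whitespace, the order of the steps is assumed, and the full CHIRIK, TZEIREI, and CHOLOM step is left out. Characters are written as Unicode escapes to avoid right-to-left display issues.

```ruby
# Hypothetical constants (not from TranslitKit).
FINAL_LETTERS    = "\u05DD\u05DF\u05DA\u05E3\u05E5" # final mem, nun, kaf, pe, tsadi
STANDARD_LETTERS = "\u05DE\u05E0\u05DB\u05E4\u05E6" # mem, nun, kaf, pe, tsadi
CHATAF_NEKUDOS   = "\u05B1\u05B2\u05B3"             # chataf segol, patach, kamatz
PLAIN_NEKUDOS    = "\u05B6\u05B7\u05B8"             # segol, patach, kamatz
DAGESH           = "\u05BC"
KEEP_DAGESH      = "\u05D1\u05D5\u05DB\u05E4\u05EA" # bet, vav, kaf, pe, tav

# Matches a Hebrew letter, any points that sit between it and its dagesh,
# and then the dagesh itself.
DAGESH_PATTERN = /([\u05D0-\u05EA])([\u05B0-\u05BB\u05BD\u05BF\u05C1\u05C2\u05C7]*)\u05BC/

# Hypothetical sketch of the pre-processing steps, in an assumed order.
def normalize(text)
  text.strip                                  # assumed: trim surrounding whitespace
      .tr(FINAL_LETTERS, STANDARD_LETTERS)    # final forms to standard forms
      .tr(CHATAF_NEKUDOS, PLAIN_NEKUDOS)      # chataf nekudos to plain nekudos
      .gsub(DAGESH_PATTERN) do
        letter = Regexp.last_match(1)
        marks  = Regexp.last_match(2)
        # Keep the dagesh only on the letters listed above.
        KEEP_DAGESH.include?(letter) ? "#{letter}#{marks}#{DAGESH}" : "#{letter}#{marks}"
      end
end

# Final mem becomes a standard mem, and surrounding whitespace is dropped.
puts normalize("  \u05D0\u05DD ") == "\u05D0\u05DE"  # true
# Dagesh is kept on bet but removed from mem.
puts normalize("\u05D1\u05BC") == "\u05D1\u05BC"      # true
puts normalize("\u05DE\u05BC") == "\u05DE"            # true
```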
