@broofa/stringlang

3.0.0 • Public • Published

stringlang

Utility functions for analyzing strings by Unicode block

Installation

npm i @broofa/stringlang

import {unicodeBlock, unicodeBlockCount, BLOCKS} from '@broofa/stringlang';

unicodeBlock()

Get the Unicode block of a given character or code point.

Note: Runs at 10M+ chars/second on a modern Mac laptop (test data)

// Get block (codePoint)
unicodeBlock(30028); // => 'CJK Unified Ideographs'
// Get block (string)
unicodeBlock('界'); // => 'CJK Unified Ideographs'
// Get block (string, character index)
unicodeBlock('Aα界', 2); // => 'CJK Unified Ideographs'
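
To illustrate the kind of lookup described above, here is a minimal sketch of a binary search over a sorted range table. It is not the package's implementation: `RANGE_TABLE` is a hypothetical five-entry subset in the same `[name, min, max]` shape as `BLOCKS`, and `lookupBlock` is an illustrative name. Returning `undefined` for code points outside every range is an assumption of this sketch, not documented behavior of `unicodeBlock()`.

```js
// Hypothetical subset of a BLOCKS-style table: [name, min, max], sorted by code point.
const RANGE_TABLE = [
  ['Basic Latin', 0x0000, 0x007f],
  ['Latin-1 Supplement', 0x0080, 0x00ff],
  ['Greek and Coptic', 0x0370, 0x03ff],
  ['Hiragana', 0x3040, 0x309f],
  ['CJK Unified Ideographs', 0x4e00, 0x9fff],
];

// Accepts a code point, or a string plus an optional UTF-16 index into it.
// Binary-searches the table for the range containing the code point.
function lookupBlock(input, index = 0, table = RANGE_TABLE) {
  const codePoint = typeof input === 'string' ? input.codePointAt(index) : input;
  if (codePoint === undefined) return undefined; // index past end of string

  let lo = 0;
  let hi = table.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [name, min, max] = table[mid];
    if (codePoint < min) hi = mid - 1;
    else if (codePoint > max) lo = mid + 1;
    else return name;
  }
  return undefined; // code point falls in a gap of this sample table
}

console.log(lookupBlock(30028)); // 'CJK Unified Ideographs'
console.log(lookupBlock('Aα界', 1)); // 'Greek and Coptic'
```

Because the table is ordered by code point and the ranges do not overlap, each lookup takes a number of comparisons logarithmic in the table size.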

unicodeBlockCount()

Count the characters in a string, grouped by Unicode block.

unicodeBlockCount('Hello World or Καλημέρα κόσμε or こんにちは 世界'); // =>
// {
//   'Basic Latin': 21,
//   'CJK Unified Ideographs': 2,
//   'Greek and Coptic': 13,
//   Hiragana: 5
// }
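
One possible use of a result like the one above is picking the block that accounts for the most characters. The sketch below works on a plain object in the shape shown; `dominantBlock` is a hypothetical helper, not part of the package.

```js
// Return the block name with the highest count, or undefined for an empty object.
// Ties resolve to whichever block is enumerated first.
function dominantBlock(counts) {
  let best;
  let bestCount = -Infinity;
  for (const [name, count] of Object.entries(counts)) {
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  return best;
}

// Counts copied from the example output above.
const counts = {
  'Basic Latin': 21,
  'CJK Unified Ideographs': 2,
  'Greek and Coptic': 13,
  Hiragana: 5,
};

console.log(dominantBlock(counts)); // 'Basic Latin'
```

Note that spaces and punctuation count toward 'Basic Latin', so short mixed-script strings can skew toward it.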

BLOCKS

Array of [block name, min code point, max code point] entries, ordered by code point.

BLOCKS; // =>
// [
//   [ 'Basic Latin', 0, 127 ],
//   [ 'Latin-1 Supplement', 128, 255 ],
//   ... 308 more entries
// ]
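
Since each entry is a `[name, min, max]` tuple, the array can be reshaped for lookups by name. The following sketch uses only the two entries shown above; `FIRST_BLOCKS` and `rangeByName` are illustrative names, and the same code should apply to the full `BLOCKS` array assuming every entry has this shape.

```js
// The two entries shown above, standing in for the full BLOCKS array.
const FIRST_BLOCKS = [
  ['Basic Latin', 0, 127],
  ['Latin-1 Supplement', 128, 255],
];

// Index the entries by block name; size counts both endpoints of the range.
const rangeByName = new Map(
  FIRST_BLOCKS.map(([name, min, max]) => [name, {min, max, size: max - min + 1}])
);

console.log(rangeByName.get('Latin-1 Supplement')); // { min: 128, max: 255, size: 128 }
```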

Weekly Downloads

30

License

ISC

Unpacked Size

15.6 kB

Total Files

6

Collaborators

  • broofa