Title: | Chinese Numerals Processing |
---|---|
Description: | Chinese numerals processing in R, such as conversion between Chinese numerals and Arabic numerals as well as detection and extraction of Chinese numerals in character objects and strings. This package supports the casual scale naming system and the respective SI prefix systems used in mainland China and Taiwan: "China Statutory Measurement Units" State Administration for Market Regulation (2019) <http://gkml.samr.gov.cn/nsjg/jls/201902/t20190225_291134.html>; "Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples" Ministry of Economic Affairs (2019) <https://gazette.nat.gov.tw/egFront/detail.do?metaid=108965>. |
Authors: | Elgar Teo [aut, cre] |
Maintainer: | Elgar Teo <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.3 |
Built: | 2024-11-01 04:08:51 UTC |
Source: | https://github.com/elgarteo/cnum |
Functions to convert between Chinese and Arabic numerals.
c2num(x, lang = default_cnum_lang(), mode = "casual", financial = FALSE, literal = FALSE)
num2c(x, lang = default_cnum_lang(), mode = "casual", financial = FALSE, literal = FALSE, single = FALSE)
x | the Arabic/Chinese numerals to be converted, or a vector of them. The absolute value must not be greater than 1e+18. |
lang | the language of the Chinese numerals. |
mode | the scale naming system to be enforced. See the ‘Details’ section for the list of supported modes. |
financial | logical: should the financial numerals be used (daxie shuzi)? |
literal | logical: should the numerals be converted literally? (e.g. 721 to be converted to "qi er yi" instead of "qibai ershiyi" and vice versa) |
single | logical: should the result be returned with only one scale character? (e.g. 1.5e+08 as "yi dian wuyi" instead of "yiyi wuqianwan") |
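The difference between positional and literal conversion can be illustrated with a minimal base-R sketch. It does not call the package: `digit_names` and `literal_sketch` are hypothetical names introduced here, and the romanised syllables stand in for the Chinese characters that num2c actually returns.

```r
# Romanised stand-ins for the digits 0 to 9 (assumption: illustration only).
digit_names <- c("ling", "yi", "er", "san", "si", "wu", "liu", "qi", "ba", "jiu")

# Literal conversion: read the number digit by digit, ignoring place value.
literal_sketch <- function(n) {
  stopifnot(n >= 0, n == floor(n))
  # format() avoids scientific notation such as "1e+05"
  digits <- as.integer(strsplit(format(n, scientific = FALSE), "")[[1]])
  paste(digit_names[digits + 1], collapse = " ")
}

literal_sketch(721)  # "qi er yi"
```

A positional conversion would instead insert scale words for hundreds and tens, giving "qibai ershiyi" for the same input.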
c2num returns a numeric vector.
num2c returns a character vector.
c2num: Convert Chinese Numerals to Arabic Numerals.
num2c: Convert Arabic Numerals to Chinese Numerals.
The following scale naming systems are supported:
"casual": the casual naming system used outside of mainland China, e.g. 1e+12 is referred to as "yi zhao".
"casualPRC": the casual naming system used in mainland China, e.g. 1e+12 is referred to as "yi wanyi".
"SIprefix": the SI prefix system used in Taiwan as stipulated in the document Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples.
"SIprefixPRC": the SI prefix system used in mainland China as stipulated in the document China Statutory Measurement Units.
"SIprefixPRClong": a variant of "SIprefixPRC" with long prefixes, e.g. 1e+09 is referred to as "yi jika" instead of "yi ji".
The modes "casual" and "casualPRC" implement a “myriad scale” with an interval of 1e+04 for large numbers, i.e. "yi" is 10,000 times "wan". This differs from some of the interval systems used in ancient Chinese writings.
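A myriad scale groups digits in blocks of four rather than three. The following base-R sketch splits a whole number into such blocks; it is an illustration under stated assumptions, not the package's implementation, and `myriad_groups` is a hypothetical helper with romanised scale names.

```r
# Split a non-negative whole number into groups of 1e+04 (myriad scale).
myriad_groups <- function(x) {
  # Doubles represent whole numbers exactly only up to 2^53.
  stopifnot(x >= 0, x == floor(x), x <= 2^53)
  scales <- c("units", "wan", "yi", "zhao")
  groups <- numeric(length(scales))
  for (i in seq_along(scales)) {
    groups[i] <- x %% 1e4   # lowest four digits
    x <- x %/% 1e4          # shift down by one myriad
  }
  names(groups) <- scales
  rev(groups)               # largest scale first
}

myriad_groups(150000000)
# zhao 0, yi 1, wan 5000, units 0
```

The result for 1.5e+08 is one "yi" and 5000 "wan", which corresponds to the reading "yiyi wuqianwan" quoted for the single argument.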
This package supports conversion of numbers with absolute value not greater than 1e+18. Note that numbers in R are stored in double precision, which carries approximately 16 significant digits. Conversion accuracy for numbers with more significant digits than this is therefore not guaranteed.
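The precision caveat can be checked directly in base R, independently of the package. The variable names below are introduced only for this illustration.

```r
# A double has a 53-bit significand on IEEE 754 platforms.
significand_bits <- .Machine$double.digits

# Above 2^53, consecutive whole numbers can no longer be told apart.
exact_limit <- 2^53
collides <- (exact_limit + 1) == exact_limit   # TRUE

# The package's upper bound of 1e+18 lies well above that limit.
above_limit <- 1e18 > exact_limit              # TRUE
```

Any input with more significant digits than a double can hold has already been rounded before a conversion function sees it.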
The standard for mode "SIprefix", Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples, is available from https://gazette.nat.gov.tw/egFront/detail.do?metaid=108965 (in Traditional Chinese).
The standard for mode "SIprefixPRC", China Statutory Measurement Units, is available from http://gkml.samr.gov.cn/nsjg/jls/201902/t20190225_291134.html (in Simplified Chinese).
Functions for detection and extraction
c2num("EXAMPLE CHECK")
num2c(721)
num2c(-6)
num2c(3.14)
num2c(721, literal = TRUE)
num2c(1.45e12, financial = TRUE)
num2c(6.85e12, lang = "sc", mode = "casualPRC")
num2c(1.5e9, mode = "SIprefix", single = TRUE)
cnum
Function to check the default language for cnum functions.
default_cnum_lang()
This package supports Traditional Chinese and Simplified Chinese. The language can be specified with the lang parameter in every function, with "tc" for Traditional Chinese and "sc" for Simplified Chinese. The default is "tc", but this can be changed by setting options(cnum.lang = "sc").
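The option mechanism described above is the standard base-R one. The sketch below shows how such a lookup with a "tc" fallback could behave; `lookup_lang` is a hypothetical stand-in and is not claimed to be how default_cnum_lang() is implemented.

```r
# Hypothetical lookup: read the option, fall back to "tc" when it is unset.
lookup_lang <- function() getOption("cnum.lang", default = "tc")

old <- options(cnum.lang = NULL)  # clear any existing setting and remember it
before <- lookup_lang()           # "tc"

options(cnum.lang = "sc")         # switch the session default
after <- lookup_lang()            # "sc"

options(old)                      # restore the previous state
```

Saving the return value of options() and passing it back restores the session to its earlier state, which is useful inside scripts that change the default only temporarily.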
The default language for cnum functions.
# Set the default language to Simplified Chinese
options(cnum.lang = "sc")
default_cnum_lang()
Functions to detect and extract Chinese numerals in character objects and strings.
is_cnum(x, lang = default_cnum_lang(), mode = "casual", financial = FALSE, literal = FALSE, strict = FALSE, ...)
has_cnum(x, lang = default_cnum_lang(), mode = "casual", financial = FALSE, ...)
extract_cnum(x, lang = default_cnum_lang(), mode = "casual", financial = FALSE, prefix = NULL, suffix = NULL, ...)
x | the character object or string to be tested or to extract from. |
lang | the language of the Chinese numerals. |
mode | the scale naming system to be enforced. See the ‘Details’ section for the list of supported modes. |
financial | logical: should the financial numerals be used (daxie shuzi)? |
literal | logical: should the numerals be converted literally? (e.g. 721 to be converted to "qi er yi" instead of "qibai ershiyi" and vice versa) |
strict | logical: Should the Chinese numerals format be strictly enforced? A casual test only checks if |
... | optional arguments to be passed to |
prefix | the prefix of the Chinese numerals. Only numerals with the designated prefix are extracted. Supports regular expression(s). |
suffix | the suffix of the Chinese numerals. Only numerals with the designated suffix are extracted. Supports regular expression(s). |
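Restricting extraction by prefix or suffix can be approximated with regular-expression lookarounds. The base-R sketch below is an assumption-laden illustration, not the package's code: `cn_chars` is a deliberately small character class (one to nine, ten, hundred, thousand) and `extract_with_affix` is a hypothetical helper.

```r
# Assumed, simplified class of numeral characters, written as Unicode escapes.
cn_chars <- "[\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d\u5341\u767e\u5343]"

# Keep only runs of numerals preceded by `prefix` and/or followed by `suffix`.
extract_with_affix <- function(x, prefix = NULL, suffix = NULL) {
  pattern <- paste0(cn_chars, "+")
  # PCRE lookbehind requires a prefix pattern of bounded length.
  if (!is.null(prefix)) pattern <- paste0("(?<=", prefix, ")", pattern)
  if (!is.null(suffix)) pattern <- paste0(pattern, "(?=", suffix, ")")
  regmatches(x, gregexpr(pattern, x, perl = TRUE))
}

# "san ren wu yuan": three people, five yuan
x <- "\u4e09\u4eba\u4e94\u5143"
all_runs  <- extract_with_affix(x)                     # "san" and "wu"
yuan_only <- extract_with_affix(x, suffix = "\u5143")  # only "wu"
```

Because lookarounds are zero-width, the prefix and suffix themselves are not part of the returned match.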
is_cnum returns a logical vector indicating, for each element of x, whether it is a Chinese numeral.
has_cnum returns a logical vector indicating, for each element of x, whether it contains Chinese numerals.
extract_cnum returns a list of character vectors containing the extracted Chinese numerals.
is_cnum: Test if a character object is a Chinese numeral. A wrapper around grepl.
has_cnum: Test if a string contains Chinese numerals. A wrapper around grepl.
extract_cnum: Extract Chinese numerals from a string. A wrapper around str_extract_all from stringr.
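The distinction between the three functions comes down to how a pattern is applied: anchored to the whole string, matched anywhere, or collected as matches. A minimal base-R sketch follows, assuming a simplified numeral character class. The `*_sketch` helpers and `cn_chars` are hypothetical and do not reproduce the package's patterns, and regmatches with gregexpr is used here in place of str_extract_all to keep the example dependency-free.

```r
# Assumed, simplified class: one to nine, ten, hundred, thousand.
cn_chars <- "[\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d\u5341\u767e\u5343]"

# "is": the whole string consists of numeral characters (anchored match).
is_sketch <- function(x) grepl(paste0("^", cn_chars, "+$"), x)
# "has": at least one numeral character occurs anywhere.
has_sketch <- function(x) grepl(cn_chars, x)
# "extract": every maximal run of numeral characters, one list entry per input.
extract_sketch <- function(x) regmatches(x, gregexpr(paste0(cn_chars, "+"), x))

x <- c("\u4e00\u767e\u4e8c\u5341\u4e00",  # "yibai ershiyi" (121)
       "\u4e00\u767e\u516b\u5341\u5143")  # "yibai bashi yuan" (180 yuan)

is_sketch(x)       # TRUE FALSE
has_sketch(x)      # TRUE TRUE
extract_sketch(x)  # the second element drops the trailing "yuan"
```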
The following scale naming systems are supported:
"casual": the casual naming system used outside of mainland China, e.g. 1e+12 is referred to as "yi zhao".
"casualPRC": the casual naming system used in mainland China, e.g. 1e+12 is referred to as "yi wanyi".
"SIprefix": the SI prefix system used in Taiwan as stipulated in the document Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples.
"SIprefixPRC": the SI prefix system used in mainland China as stipulated in the document China Statutory Measurement Units.
"SIprefixPRClong": a variant of "SIprefixPRC" with long prefixes, e.g. 1e+09 is referred to as "yi jika" instead of "yi ji".
The standard for mode "SIprefix", Names, Definitions and Symbols of the Legal Units of Measurement and the Decimal Multiples and Submultiples, is available from https://gazette.nat.gov.tw/egFront/detail.do?metaid=108965 (in Traditional Chinese).
The standard for mode "SIprefixPRC", China Statutory Measurement Units, is available from http://gkml.samr.gov.cn/nsjg/jls/201902/t20190225_291134.html (in Simplified Chinese).
is_cnum("yibai ershiyi")
has_cnum("yibai bashi yuan")
extract_cnum("shisiyi ren")