2024/05/06

Guava String Utilities 2

CharMatcher

CharMatcher is used for character-level operations on strings: trimming, collapsing, removing, and retaining characters.

    @Test
    public void char_matchers() {
        String string = "12 ab 34 CD\r\n 56 ef789GH";
        String noControl = CharMatcher.javaIsoControl().removeFrom(string); // remove control characters
        String theDigits = CharMatcher.digit().retainFrom(string); // only the digits
        String spaced = CharMatcher.whitespace().trimAndCollapseFrom(string, ' ');
        // trim whitespace at ends, and replace/collapse whitespace into single spaces
        String noDigits = CharMatcher.javaDigit().replaceFrom(string, "*"); // star out all digits
        String lowerAndDigit = CharMatcher.javaDigit().or(CharMatcher.javaLowerCase()).retainFrom(string);
        // eliminate all characters that aren't digits or lowercase

        assertEquals("12 ab 34 CD 56 ef789GH", noControl);
        assertEquals("123456789", theDigits);
        assertEquals("12 ab 34 CD 56 ef789GH", spaced);
        assertEquals("** ab ** CD\r\n ** ef***GH", noDigits);
        assertEquals("12ab3456ef789", lowerAndDigit);
    }

Besides the predefined matchers used above, there are three general-purpose factory methods:

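| Method | Description | Example |
| --- | --- | --- |
| anyOf(CharSequence) | Matches any character contained in the given sequence | CharMatcher.anyOf("aeiou") |
| is(char) | Matches exactly one specific char | CharMatcher.is('C') |
| inRange(char, char) | Matches any character in the inclusive range | CharMatcher.inRange('a', 'z') |
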
    @Test
    public void char_matchers2() {
        String string = "12 ab 34 CD\r\n 56";
        String anyOfResult = CharMatcher.anyOf("cdef\r\n").removeFrom(string);
        String isResult = CharMatcher.is('C').removeFrom(string);
        String inRangeResult = CharMatcher.inRange('A', 'Z').removeFrom(string);

        assertEquals("12 ab 34 CD 56", anyOfResult);
        assertEquals("12 ab 34 D\r\n 56", isResult);
        assertEquals("12 ab 34 \r\n 56", inRangeResult);
    }

Methods for applying a CharMatcher

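| Method | Description |
| --- | --- |
| collapseFrom(CharSequence, char) | Replaces each run of consecutive matching characters with a single replacement character, e.g. CharMatcher.whitespace().collapseFrom(string, ' ') reduces every run of whitespace to one space |
| matchesAllOf(CharSequence) | Returns true if every character in the sequence matches |
| removeFrom(CharSequence) | Removes the matching characters |
| retainFrom(CharSequence) | Keeps only the matching characters |
| trimFrom(CharSequence) | Removes leading and trailing matching characters |
| replaceFrom(CharSequence, CharSequence) | Replaces each matching character with the given sequence |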
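
The tests earlier in this post do not exercise collapseFrom, trimFrom, or matchesAllOf, so here is a minimal sketch of those three. It assumes Guava is on the classpath; the class name CharMatcherMethodsSketch and its helper methods are hypothetical names chosen only for illustration. ASCII digits are matched with inRange('0', '9') to stay on the general-purpose factory methods.

```java
import com.google.common.base.CharMatcher;

// Hypothetical demo class, not part of Guava.
public class CharMatcherMethodsSketch {

    // Replaces every run of whitespace with one space; the ends are not trimmed.
    static String collapseWhitespace(String s) {
        return CharMatcher.whitespace().collapseFrom(s, ' ');
    }

    // Strips '-' from both ends only; dashes in the middle are kept.
    static String trimDashes(String s) {
        return CharMatcher.is('-').trimFrom(s);
    }

    // True when every character is an ASCII digit.
    static boolean isAllAsciiDigits(String s) {
        return CharMatcher.inRange('0', '9').matchesAllOf(s);
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("a  \t b\r\n c")); // a b c
        System.out.println(trimDashes("--ab-cd--"));             // ab-cd
        System.out.println(isAllAsciiDigits("12345"));           // true
        System.out.println(isAllAsciiDigits("12a45"));           // false
    }
}
```

Note that collapseFrom differs from trimAndCollapseFrom in that a leading or trailing run is collapsed to the replacement character rather than removed.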

Charsets

Note: do NOT write code like this:

    try {
      bytes = string.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
      // how can this possibly happen?
      throw new AssertionError(e);
    }

Use Charsets instead. The Charset overload of getBytes does not throw a checked exception:

    bytes = string.getBytes(Charsets.UTF_8);
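
As a minimal sketch of the round trip, assuming Guava is on the classpath (CharsetsSketch and its helper methods are hypothetical names used only here): encoding and decoding with a Charset constant needs no try/catch. On Java 7 and later, java.nio.charset.StandardCharsets.UTF_8 provides the same charset from the JDK, and Guava's own documentation suggests preferring it.

```java
import com.google.common.base.Charsets;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Guava.
public class CharsetsSketch {

    // No checked exception: the Charset overload cannot fail on an unknown name.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(Charsets.UTF_8);
    }

    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, Charsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // 'é' takes two bytes in UTF-8
        byte[] bytes = encodeUtf8(text);

        System.out.println(bytes.length);                                   // 6
        System.out.println(decodeUtf8(bytes).equals(text));                 // true
        System.out.println(Charsets.UTF_8.equals(StandardCharsets.UTF_8));  // true
    }
}
```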

CaseFormat

| Format | Example |
| --- | --- |
| LOWER_CAMEL | lowerCamel |
| LOWER_HYPHEN | lower-hyphen |
| LOWER_UNDERSCORE | lower_underscore |
| UPPER_CAMEL | UpperCamel |
| UPPER_UNDERSCORE | UPPER_UNDERSCORE |

    CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.LOWER_CAMEL, "CONSTANT_NAME");
    // returns "constantName"
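
The same to(targetFormat, string) call works between any pair of formats. The following is a minimal sketch, assuming Guava is on the classpath; CaseFormatSketch and its helper methods are hypothetical names used only for illustration.

```java
import com.google.common.base.CaseFormat;

// Hypothetical demo class, not part of Guava.
public class CaseFormatSketch {

    // CONSTANT_NAME -> constantName
    static String constantToCamel(String s) {
        return CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.LOWER_CAMEL, s);
    }

    // constantName -> constant_name
    static String camelToSnake(String s) {
        return CaseFormat.LOWER_CAMEL.to(CaseFormat.LOWER_UNDERSCORE, s);
    }

    // lower-hyphen -> LowerHyphen
    static String hyphenToUpperCamel(String s) {
        return CaseFormat.LOWER_HYPHEN.to(CaseFormat.UPPER_CAMEL, s);
    }

    public static void main(String[] args) {
        System.out.println(constantToCamel("CONSTANT_NAME"));   // constantName
        System.out.println(camelToSnake("constantName"));       // constant_name
        System.out.println(hyphenToUpperCamel("lower-hyphen")); // LowerHyphen
    }
}
```

The source format has to be stated explicitly because CaseFormat does not guess it from the input; it is the caller's job to pick the enum constant that matches the incoming string.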

References

StringsExplained · google/guava Wiki · GitHub
