java.lang.Object
  de.tudarmstadt.ukp.jwktl.parser.ru.wikokit.base.wikipedia.util.StringUtilRegular
public class StringUtilRegular
Useful string functions implemented via regular expressions.
Constructor Summary | |
---|---|
StringUtilRegular() | |
Method Summary | |
---|---|
static String | encodeRussianToLatinitsa(String text, String enc_from, String enc_to): Encodes Russian text to latinitsa (Latin transliteration), e.g. женьшень -> zhen'shen'. |
static int | getFirstEmptyLinePosition(int start_pos, String text): Gets position of first empty line in text from start_pos. |
static int | getFirstHeaderPosition(int start_pos, String text): Gets position of first header in text from start_pos, e.g. 2nd, 3rd, 4th, or 5th level header ==? |
static String | getLettersTillHyphen(String text): Gets the leading letters up to the first hyphen "-". |
static String | getLettersTillSpace(String text): Gets the leading letters up to the first space. |
static String | getLettersTillSpaceHyphenOrPipe(String text): Gets the leading letters up to the first space " ", ... or pipe "\|" (shortest string). |
static String | getTextTillFirstHeaderOrEmptyLine(int start_pos, String text): Gets text from 'start_pos' up to the nearest of: (1) the first header, (2) the first empty line, or (3) the end of text (if there are no headers or empty lines). |
static String | getTextTillFirstHeaderPosition(int start_pos, String text): Gets text from 'start_pos' up to the position of the first header in text, or up to the end of text (if there is no header). |
static String | replaceComplexSpacesByTrivialSpaces(String text): Replaces special spaces by usual whitespace, e.g. in quote author names "Name Surname". |
static void | stripNonWordLetters(String[] words): Strips non-word characters from the strings in the source array "words". |
static String | substringAndchopLastNewline(String text, int start_pos, int end_pos): Gets the substring of text from 'start_pos' to 'end_pos' and chops the last character if it is a newline (\n). |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public StringUtilRegular()
Method Detail |
---|
public static void stripNonWordLetters(String[] words)
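The method returns void, so the cleaned strings presumably end up back in the array that was passed in. The following is a minimal sketch of that idea, not the library's implementation. The class name StripNonWordSketch is hypothetical, and it is an assumption that "non-word" means anything that is not a Unicode letter or digit (Java's plain \W is ASCII-only by default and would strip Cyrillic letters).

```java
import java.util.regex.Pattern;

final class StripNonWordSketch {
    // Assumption: a "non-word" character is anything except a Unicode letter or digit.
    private static final Pattern NON_WORD = Pattern.compile("[^\\p{L}\\p{N}]+");

    /** Overwrites each non-null element of words with a copy that has non-word characters removed. */
    static void stripNonWordLetters(String[] words) {
        for (int i = 0; i < words.length; i++) {
            if (words[i] != null) {
                words[i] = NON_WORD.matcher(words[i]).replaceAll("");
            }
        }
    }
}
```

Whether the real method removes such characters everywhere or only at the edges of each word is not stated on this page; the sketch removes them everywhere.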
public static String getLettersTillSpace(String text)
public static String getLettersTillSpaceHyphenOrPipe(String text)
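Both "letters till" methods can be read as "return the longest prefix that contains none of the delimiters". A minimal sketch with a negated character class follows. The class name LettersTillSketch and the helper prefix are hypothetical, and the delimiter set for the second method (space, hyphen, pipe) is taken from its name rather than from the source code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LettersTillSketch {
    // Longest prefix without a space.
    private static final Pattern TILL_SPACE = Pattern.compile("^[^ ]*");
    // Longest prefix without a space, hyphen, or pipe (assumed delimiter set).
    private static final Pattern TILL_SPACE_HYPHEN_OR_PIPE = Pattern.compile("^[^ \\-|]*");

    static String getLettersTillSpace(String text) {
        return prefix(TILL_SPACE, text);
    }

    static String getLettersTillSpaceHyphenOrPipe(String text) {
        return prefix(TILL_SPACE_HYPHEN_OR_PIPE, text);
    }

    // Hypothetical helper: returns the text matched at the start of the input.
    private static String prefix(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : "";
    }
}
```

When no delimiter is present, the sketch returns the whole input; the behaviour of the real methods in that case is not documented here.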
public static String replaceComplexSpacesByTrivialSpaces(String text)
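Author names in wiki quotes often contain a no-break space or another typographic space between name and surname. A minimal sketch of the normalization, assuming that "special spaces" means the Unicode space-separator category Zs (the exact set handled by the real method is not listed on this page; the class name TrivialSpacesSketch is hypothetical):

```java
final class TrivialSpacesSketch {
    /** Replaces every Unicode space separator (no-break space, thin space, etc.) with U+0020. */
    static String replaceComplexSpacesByTrivialSpaces(String text) {
        // \p{Zs} also matches the ordinary space, which is simply replaced by itself.
        return text.replaceAll("\\p{Zs}", " ");
    }
}
```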
public static String getLettersTillHyphen(String text)
public static String encodeRussianToLatinitsa(String text, String enc_from, String enc_to)
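The documented example, женьшень -> zhen'shen', suggests a letter-by-letter transliteration table. The sketch below illustrates that idea only. The class name LatinitsaSketch and the method toLatinitsa are hypothetical, and the table is an assumed scheme chosen to reproduce the example, so mappings other than those five letters may differ from the library's. The role of enc_from and enc_to is not described on this page, so the sketch leaves them out and works on an already-decoded Java String.

```java
final class LatinitsaSketch {
    // Assumed Latin equivalents for the lowercase letters U+0430..U+044F, in code point order.
    private static final String[] LATIN = {
        "a", "b", "v", "g", "d", "e", "zh", "z", "i", "j", "k", "l", "m", "n", "o", "p",
        "r", "s", "t", "u", "f", "kh", "c", "ch", "sh", "shh", "\"", "y", "'", "eh", "ju", "ja"
    };

    /** Transliterates lowercase Russian letters; every other character is copied unchanged. */
    static String toLatinitsa(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '\u0430' && c <= '\u044F') {
                sb.append(LATIN[c - '\u0430']);
            } else if (c == '\u0451') {   // the letter "yo" lies outside the contiguous range
                sb.append("jo");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```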
public static int getFirstHeaderPosition(int start_pos, String text)
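Wiki headers are lines that begin with a run of '=' characters, so the header position can be found with a multiline regular expression searched from start_pos. This is a minimal sketch under two assumptions that this page does not confirm: a header is any line starting with two or more '=', and -1 is returned when no header follows. The class name HeaderPositionSketch is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeaderPositionSketch {
    // Assumption: a header is a line that starts with "==" (2nd level) or more '=' characters.
    private static final Pattern HEADER = Pattern.compile("^={2,}", Pattern.MULTILINE);

    /** Returns the index of the first header line at or after startPos, or -1 if there is none. */
    static int getFirstHeaderPosition(int startPos, String text) {
        Matcher m = HEADER.matcher(text);
        // find(int) searches from startPos; '^' still only matches at a real line start.
        return m.find(startPos) ? m.start() : -1;
    }
}
```

For the text "intro\n== H2 ==\nbody\n=== H3 ===\n", a search from 0 finds the header at index 6, and a search from 7 skips the rest of that line and finds the next header at index 20.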
public static int getFirstEmptyLinePosition(int start_pos, String text)
public static String getTextTillFirstHeaderPosition(int start_pos, String text)
public static String getTextTillFirstHeaderOrEmptyLine(int start_pos, String text)
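The summary describes this method as cutting the text at whichever comes first: a header, an empty line, or the end of text. A minimal sketch of that "nearest boundary" logic follows. The class name TextTillSketch and the helper firstMatch are hypothetical, and both patterns are assumptions (a header is a line starting with two or more '=', an empty line contains only spaces or tabs).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TextTillSketch {
    // Assumed definitions of "header" and "empty line".
    private static final Pattern HEADER = Pattern.compile("^={2,}", Pattern.MULTILINE);
    private static final Pattern EMPTY_LINE = Pattern.compile("^[ \\t]*$", Pattern.MULTILINE);

    /** Returns text from startPos up to the nearest header, empty line, or end of text. */
    static String getTextTillFirstHeaderOrEmptyLine(int startPos, String text) {
        int end = text.length();
        int header = firstMatch(HEADER, startPos, text);
        int empty = firstMatch(EMPTY_LINE, startPos, text);
        if (header >= 0) end = Math.min(end, header);
        if (empty >= 0) end = Math.min(end, empty);
        return text.substring(startPos, end);
    }

    // Hypothetical helper: index of the first match at or after startPos, or -1.
    private static int firstMatch(Pattern p, int startPos, String text) {
        Matcher m = p.matcher(text);
        return m.find(startPos) ? m.start() : -1;
    }
}
```

The sketch keeps the newline that precedes the boundary; whether the real method chops it is not stated on this page.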
public static String substringAndchopLastNewline(String text, int start_pos, int end_pos)
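The last method combines a substring with removal of a single trailing newline. A minimal sketch, assuming end_pos is exclusive as in String.substring (the page does not say whether it is inclusive); the class name ChopSketch is hypothetical:

```java
final class ChopSketch {
    /** Returns text[startPos, endPos) without its final character if that character is '\n'. */
    static String substringAndchopLastNewline(String text, int startPos, int endPos) {
        String s = text.substring(startPos, endPos);
        return s.endsWith("\n") ? s.substring(0, s.length() - 1) : s;
    }
}
```

Only one newline is removed, so a substring ending in "\n\n" still ends in "\n" afterwards.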