java.lang.Object
  de.tudarmstadt.ukp.jwktl.parser.WiktionaryPageParser<WiktionaryPage>
    de.tudarmstadt.ukp.jwktl.parser.WiktionaryArticleParser
public class WiktionaryArticleParser
extends WiktionaryPageParser<WiktionaryPage>
Parses a Wiktionary XML dump and stores the parsed information as a
Berkeley DB within a specified directory. The parsed Wiktionary dump
can then be accessed using the main JWKTL API. This implementation
parses only article pages within the main namespace; discussions, user
pages, revisions, etc. are not handled. An article page's text is
passed to an implementation of IWiktionaryEntryParser, which is
either automatically detected from the Wiktionary's base URL or
specified in the constructor. Note that each directory can only
contain one Wiktionary database.
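The flow described above (keep only main-namespace article pages, then hand their text to an entry parser that is either detected from the base URL or supplied explicitly) can be sketched roughly as follows. This is a minimal, self-contained illustration and not JWKTL code: `ArticleFilterSketch`, `EntryParser`, `detectLanguageCode`, and `defaultParserFor` are hypothetical stand-ins, and both the "language code is the first host label of the base URL" rule and the "main namespace has an empty prefix" rule are assumptions made only for this example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class ArticleFilterSketch {

    /** Hypothetical stand-in for an entry parser; not the JWKTL interface. */
    interface EntryParser {
        String parse(String title, String text);
    }

    /** Assumption for this sketch: the language code is the first label of the host name. */
    static String detectLanguageCode(String baseUrl) {
        String host = URI.create(baseUrl).getHost();
        return host.substring(0, host.indexOf('.'));
    }

    /** Assumption for this sketch: main-namespace pages carry an empty (or absent) prefix. */
    static boolean isMainNamespace(String namespace) {
        return namespace == null || namespace.isEmpty();
    }

    /** Hypothetical default parser: summarizes the page instead of really parsing it. */
    static EntryParser defaultParserFor(String languageCode) {
        return (title, text) -> languageCode + ":" + title + ":" + text.length();
    }

    private final EntryParser entryParser;
    private final List<String> stored = new ArrayList<>();

    /** Mirrors the "automatically detected" variant: choose a parser from the base URL. */
    ArticleFilterSketch(String baseUrl) {
        this(defaultParserFor(detectLanguageCode(baseUrl)));
    }

    /** Mirrors the "specified in the constructor" variant. */
    ArticleFilterSketch(EntryParser entryParser) {
        this.entryParser = entryParser;
    }

    /** Skips pages outside the main namespace; parses and stores all others. */
    void handlePage(String namespace, String title, String text) {
        if (!isMainNamespace(namespace)) {
            return;
        }
        stored.add(entryParser.parse(title, text));
    }

    List<String> stored() {
        return stored;
    }

    public static void main(String[] args) {
        ArticleFilterSketch parser =
                new ArticleFilterSketch("http://en.wiktionary.org/wiki/Wiktionary:Main_Page");
        parser.handlePage("", "boat", "==English==");      // article page: stored
        parser.handlePage("Talk", "boat", "discussion");   // discussion page: skipped
        System.out.println(parser.stored());
    }
}
```

In the sketch the in-memory list plays the role of the database; the real class writes to a Berkeley DB through the `wiktionaryDB` field instead.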
Field Summary | |
---|---|
protected IWiktionaryEntryParser | entryParser |
protected IWritableWiktionaryEdition | wiktionaryDB |
Fields inherited from class de.tudarmstadt.ukp.jwktl.parser.WiktionaryPageParser |
---|
currentNamespace, dumpInfo, page |
Constructor Summary | |
---|---|
WiktionaryArticleParser(IWritableWiktionaryEdition wiktionaryDB) | Creates a caching article parser that saves the parsed Wiktionary data into the Berkeley DB behind the given wiktionaryDB edition. The entry parser is detected automatically from the Wiktionary's base URL. |
WiktionaryArticleParser(IWritableWiktionaryEdition wiktionaryDB, IWiktionaryEntryParser entryParser) | Creates a caching article parser that saves the parsed Wiktionary data into the Berkeley DB behind the given wiktionaryDB edition, using the specified entry parser. |
Method Summary | | |
---|---|---|
protected WiktionaryPage | createPage() | |
protected boolean | isAllowed(IWiktionaryPage page) | |
void | onClose(IDumpInfo dumpInfo) | Hotspot that is invoked after the parser has finished its work. |
void | onPageEnd() | Hotspot that is invoked upon finishing the current article page. |
void | onSiteInfoComplete(IDumpInfo dumpInfo) | Hotspot that is invoked after the siteinfo header has been read. |
protected void | saveParsedWiktionaryPage() | |
void | setText(String text) | Hotspot that is invoked after the current page's text is read. |
Methods inherited from class de.tudarmstadt.ukp.jwktl.parser.WiktionaryPageParser |
---|
onPageStart, onParserEnd, onParserStart, setAuthor, setPageId, setRevision, setTimestamp, setTitle |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected IWritableWiktionaryEdition wiktionaryDB
protected IWiktionaryEntryParser entryParser
Constructor Detail |
---|
public WiktionaryArticleParser(IWritableWiktionaryEdition wiktionaryDB) throws WiktionaryException
Throws:
WiktionaryException - if the target dictionary is not empty and overwriteExisting was set to false.
public WiktionaryArticleParser(IWritableWiktionaryEdition wiktionaryDB, IWiktionaryEntryParser entryParser) throws WiktionaryException
Throws:
WiktionaryException - if the target dictionary is not empty and overwriteExisting was set to false.
Method Detail |
---|
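public void onSiteInfoComplete(IDumpInfo dumpInfo)
Description copied from interface: IWiktionaryPageParser
Hotspot that is invoked after the siteinfo header has been read.
Specified by: onSiteInfoComplete in interface IWiktionaryPageParser
Overrides: onSiteInfoComplete in class WiktionaryPageParser<WiktionaryPage>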
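public void onPageEnd()
Description copied from interface: IWiktionaryPageParser
Hotspot that is invoked upon finishing the current article page.
Specified by: onPageEnd in interface IWiktionaryPageParser
Overrides: onPageEnd in class WiktionaryPageParser<WiktionaryPage>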
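public void onClose(IDumpInfo dumpInfo)
Description copied from interface: IWiktionaryPageParser
Hotspot that is invoked after the parser has finished its work, that is, after all IWiktionaryPageParser.onParserEnd(IDumpInfo) calls have been handled.
Specified by: onClose in interface IWiktionaryPageParser
Overrides: onClose in class WiktionaryPageParser<WiktionaryPage>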
protected WiktionaryPage createPage()
createPage in class WiktionaryPageParser<WiktionaryPage>
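public void setText(String text)
Description copied from interface: IWiktionaryPageParser
Hotspot that is invoked after the current page's text is read.
Specified by: setText in interface IWiktionaryPageParser
Overrides: setText in class WiktionaryPageParser<WiktionaryPage>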
protected void saveParsedWiktionaryPage()
protected boolean isAllowed(IWiktionaryPage page)
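
The hotspots documented above are callbacks that a dump reader invokes while it walks the XML file. The sketch below illustrates one plausible call order using a hypothetical driver. `HotspotOrderSketch`, `PageHandler`, `RecordingHandler`, and `drive` are stand-ins invented for this example, the callback signatures are simplified (plain strings instead of IDumpInfo), and the exact sequence used by JWKTL is an assumption inferred from the method descriptions rather than something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HotspotOrderSketch {

    /** Hypothetical, simplified callback interface; not the JWKTL IWiktionaryPageParser. */
    interface PageHandler {
        void onSiteInfoComplete(String baseUrl);
        void setText(String text);
        void onPageEnd();
        void onClose();
    }

    /** Records every callback so that the call order can be inspected afterwards. */
    static class RecordingHandler implements PageHandler {
        final List<String> events = new ArrayList<>();
        private String currentText = "";

        @Override
        public void onSiteInfoComplete(String baseUrl) {
            events.add("siteinfo " + baseUrl);
        }

        @Override
        public void setText(String text) {
            currentText = text;
            events.add("text");
        }

        @Override
        public void onPageEnd() {
            // A real implementation would parse and save the page here.
            events.add("pageEnd " + currentText.length());
            currentText = "";
        }

        @Override
        public void onClose() {
            events.add("close");
        }
    }

    /** Hypothetical driver: header first, then each page, then a single close call. */
    static void drive(PageHandler handler, String baseUrl, List<String> pageTexts) {
        handler.onSiteInfoComplete(baseUrl);
        for (String text : pageTexts) {
            handler.setText(text);
            handler.onPageEnd();
        }
        handler.onClose();
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        drive(handler, "http://en.wiktionary.org", Arrays.asList("==English==", "==German=="));
        for (String event : handler.events) {
            System.out.println(event);
        }
    }
}
```

Recording the events in a list keeps the sketch easy to check: the per-page work belongs in `onPageEnd`, once the page text has arrived, and any final cleanup belongs in `onClose`.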