de.tudarmstadt.ukp.jwktl.parser.en
Class ENWiktionaryEntryParser

java.lang.Object
  extended by de.tudarmstadt.ukp.jwktl.parser.WiktionaryEntryParser
      extended by de.tudarmstadt.ukp.jwktl.parser.en.ENWiktionaryEntryParser
All Implemented Interfaces:
IWiktionaryEntryParser

public class ENWiktionaryEntryParser
extends WiktionaryEntryParser

An implementation of the IWiktionaryEntryParser interface for parsing the contents of article pages from the English Wiktionary.

Author:
Christian M. Meyer

Field Summary
 
Fields inherited from class de.tudarmstadt.ukp.jwktl.parser.WiktionaryEntryParser
COMMENT_PATTERN, entryId, handlers, IMAGE_PATTERN, language, redirectTemplate, REFERENCES_PATTERN
 
Constructor Summary
ENWiktionaryEntryParser()
          Initializes the English entry parser.
 
Method Summary
protected  ParsingContext createParsingContext(WiktionaryPage page)
           
protected  boolean isStartOfBlock(String line)
          Checks whether the given line starts a new section.
 
Methods inherited from class de.tudarmstadt.ukp.jwktl.parser.WiktionaryEntryParser
checkForRedirect, getLanguage, parse, register, selectHandler
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ENWiktionaryEntryParser

public ENWiktionaryEntryParser()
Initializes the English entry parser. That is, the language and the redirection pattern are defined, and the handlers for extracting the information from the article constituents are registered.

Method Detail

createParsingContext

protected ParsingContext createParsingContext(WiktionaryPage page)
Specified by:
createParsingContext in class WiktionaryEntryParser

isStartOfBlock

protected boolean isStartOfBlock(String line)
Checks whether the given line starts a new section. A line counts as a section start if it begins with =, {{, or [[, unless it matches the Wikipedia or translation patterns.

Specified by:
isStartOfBlock in class WiktionaryEntryParser
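
The rule described for isStartOfBlock can be illustrated with a small standalone sketch. This is not the library's implementation: the class name BlockStartSketch, the method looksLikeBlockStart, and the two excluded prefixes ({{wikipedia and {{trans-) are assumptions chosen for illustration, since this page does not show the actual Wikipedia and translation patterns.

```java
import java.util.Locale;

public class BlockStartSketch {

    // Hypothetical stand-ins for the Wikipedia and translation patterns;
    // the patterns used by the real parser are not shown in this documentation.
    private static final String[] EXCLUDED_PREFIXES = { "{{wikipedia", "{{trans-" };

    /**
     * Returns true if the line begins with =, {{ or [[ and does not
     * begin with one of the (assumed) excluded template prefixes.
     */
    public static boolean looksLikeBlockStart(String line) {
        if (line == null) {
            return false;
        }
        String trimmed = line.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        // Excluded templates never open a new section.
        for (String prefix : EXCLUDED_PREFIXES) {
            if (lower.startsWith(prefix)) {
                return false;
            }
        }
        return trimmed.startsWith("=")
                || trimmed.startsWith("{{")
                || trimmed.startsWith("[[");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeBlockStart("==English=="));               // true
        System.out.println(looksLikeBlockStart("[[Category:English nouns]]")); // true
        System.out.println(looksLikeBlockStart("{{trans-top|a vehicle}}"));    // false
        System.out.println(looksLikeBlockStart("# A definition line"));        // false
    }
}
```

Checking the exclusions before the general prefixes keeps the sketch simple: a translation or Wikipedia template also begins with {{, so it has to be ruled out first.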


Copyright © 2011-2013 Ubiquitous Knowledge Processing (UKP) Lab. All Rights Reserved.