de.tudarmstadt.ukp.jwktl
Class JWKTL

java.lang.Object
  extended by de.tudarmstadt.ukp.jwktl.JWKTL

public class JWKTL
extends Object

Main entry point of the JWKTL API. Use this class to parse a Wiktionary XML dump file and to access previously parsed dump files.

Author:
Christian M. Meyer

Constructor Summary
JWKTL()
           
 
Method Summary
static void deleteEdition(File parsedData)
          Deletes all files of a previously parsed Wiktionary from the specified directory.
static String getVersion()
          Returns the software version.
static IWiktionaryCollection openCollection(File... parsedDumps)
          Opens the parsed Wiktionary language editions stored at the given locations and aggregates them in an IWiktionaryCollection.
static IWiktionaryCollection openCollection(Long cacheSize, File... parsedDumps)
          Opens the parsed Wiktionary language editions stored at the given locations and aggregates them in an IWiktionaryCollection.
static IWiktionaryEdition openEdition(File parsedDump)
          Opens the parsed Wiktionary language edition stored at the given location.
static IWiktionaryEdition openEdition(File parsedDump, Long cacheSize)
          Opens the parsed Wiktionary language edition stored at the given location.
static void parseWiktionaryDump(File dumpFile, File targetDirectory, boolean overwriteExisting)
          Parses the given XML dump file of Wiktionary and stores the parsed data within the specified target directory.
static void parseWiktionaryDump(File dumpFile, File targetDirectory, boolean overwriteExisting, boolean parseWikiSaurus)
          Parses the given XML dump file of Wiktionary and stores the parsed data within the specified target directory.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

JWKTL

public JWKTL()

Method Detail

getVersion

public static String getVersion()
Returns the software version.


openCollection

public static IWiktionaryCollection openCollection(File... parsedDumps)
Opens the parsed Wiktionary language editions stored at the given locations and aggregates them in an IWiktionaryCollection. This method uses a default cache size for the Berkeley DB.

Throws:
WiktionaryException - in case of any JWKTL-related error.
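
Because openCollection takes a File... parameter, a File[] can be passed in place of individual arguments. The following is a minimal sketch of one way to build such an array, assuming each parsed language edition lives in its own subdirectory of a common base directory. The class ParsedDumpLocator and its method are hypothetical helpers and are not part of JWKTL.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper, not part of JWKTL.
class ParsedDumpLocator {

    /**
     * Returns the direct subdirectories of baseDirectory in sorted order.
     * Assumption: every subdirectory holds exactly one parsed edition.
     */
    static File[] parsedDumpDirectories(File baseDirectory) {
        // listFiles returns null if baseDirectory is not a readable directory.
        File[] dirs = baseDirectory.listFiles(File::isDirectory);
        if (dirs == null) {
            throw new IllegalArgumentException("Not a directory: " + baseDirectory);
        }
        Arrays.sort(dirs); // File is Comparable; sorting gives a stable order
        return dirs;
    }
}
```

The returned array could then be handed to the method directly, for example JWKTL.openCollection(ParsedDumpLocator.parsedDumpDirectories(base)).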

openCollection

public static IWiktionaryCollection openCollection(Long cacheSize,
                                                   File... parsedDumps)
Opens the parsed Wiktionary language editions stored at the given locations and aggregates them in an IWiktionaryCollection. This method uses the given cache size for connecting to the Berkeley DB.

Throws:
WiktionaryException - in case of any JWKTL-related error.

openEdition

public static IWiktionaryEdition openEdition(File parsedDump)
Opens the parsed Wiktionary language edition stored at the given location. This method uses a default cache size for the Berkeley DB.

Throws:
WiktionaryException - in case of any JWKTL-related error.

openEdition

public static IWiktionaryEdition openEdition(File parsedDump,
                                             Long cacheSize)
Opens the parsed Wiktionary language edition stored at the given location. This method uses the given cache size for connecting to the Berkeley DB.

Throws:
WiktionaryException - in case of any JWKTL-related error.
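
This page does not state the unit of cacheSize. The sketch below assumes the value is handed to Berkeley DB unchanged and is therefore measured in bytes, which should be verified against the JWKTL version in use. It shows one way to derive a Long from a size in megabytes while keeping it below a fraction of the JVM heap. CacheSizes is a hypothetical helper and is not part of JWKTL.

```java
// Hypothetical helper, not part of JWKTL.
class CacheSizes {

    /**
     * Converts a size in megabytes to bytes, capped at the given fraction
     * of the maximum heap available to this JVM.
     * Assumption: the cache size is expected in bytes.
     */
    static Long cacheSizeBytes(long megabytes, double maxHeapFraction) {
        if (megabytes <= 0) {
            throw new IllegalArgumentException("megabytes must be positive");
        }
        if (maxHeapFraction <= 0.0 || maxHeapFraction > 1.0) {
            throw new IllegalArgumentException("maxHeapFraction must be in (0, 1]");
        }
        // multiplyExact throws ArithmeticException instead of overflowing silently.
        long requested = Math.multiplyExact(megabytes, 1024L * 1024L);
        long cap = (long) (Runtime.getRuntime().maxMemory() * maxHeapFraction);
        return Math.min(requested, cap);
    }
}
```

Under that assumption, a call might look like JWKTL.openEdition(parsedDump, CacheSizes.cacheSizeBytes(256, 0.5)).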

parseWiktionaryDump

public static void parseWiktionaryDump(File dumpFile,
                                       File targetDirectory,
                                       boolean overwriteExisting)
Parses the given XML dump file of Wiktionary and stores the parsed data within the specified target directory. Note that each target directory can only contain one parsed Wiktionary database. This method is equivalent to WiktionaryDumpParser.parse(File) using a registered WiktionaryArticleParser. The parsing does not include information from Wikisaurus.

Parameters:
dumpFile - file name of the Wiktionary dump in XML format.
targetDirectory - directory for storing the parsed data.
overwriteExisting - if true, previously parsed Wiktionary data files are removed from the targetDirectory.
Throws:
WiktionaryException - in case of any parser errors.
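
Since a target directory can hold only one parsed database, a caller may want to check the directory before starting a long-running parse. The following is a minimal sketch of such a pre-flight check. The DumpParser interface is a hypothetical stand-in that mirrors the three-argument signature above, so the example compiles without JWKTL on the classpath. DumpParsePreflight is likewise not part of the library.

```java
import java.io.File;

// Hypothetical helper, not part of JWKTL.
class DumpParsePreflight {

    /** Stand-in with the same parameter list as the three-argument parseWiktionaryDump. */
    interface DumpParser {
        void parse(File dumpFile, File targetDirectory, boolean overwriteExisting);
    }

    /**
     * Runs the parser only if the dump file is readable and the target
     * directory is empty, missing, or explicitly allowed to be overwritten.
     * Returns true if the parser was invoked, false if the parse was skipped.
     */
    static boolean parseIfSafe(File dumpFile, File targetDirectory,
                               boolean overwriteExisting, DumpParser parser) {
        if (!dumpFile.isFile() || !dumpFile.canRead()) {
            throw new IllegalArgumentException("Dump file not readable: " + dumpFile);
        }
        // list() returns null if the directory does not exist yet.
        String[] existing = targetDirectory.list();
        boolean occupied = existing != null && existing.length > 0;
        if (occupied && !overwriteExisting) {
            return false; // keep the existing data untouched
        }
        parser.parse(dumpFile, targetDirectory, overwriteExisting);
        return true;
    }
}
```

With JWKTL available, the method reference JWKTL::parseWiktionaryDump should fit the DumpParser parameter, provided WiktionaryException is an unchecked exception in the version in use.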

parseWiktionaryDump

public static void parseWiktionaryDump(File dumpFile,
                                       File targetDirectory,
                                       boolean overwriteExisting,
                                       boolean parseWikiSaurus)
Parses the given XML dump file of Wiktionary and stores the parsed data within the specified target directory. Note that each target directory can only contain one parsed Wiktionary database. This method is equivalent to WiktionaryDumpParser.parse(File) using a registered WiktionaryArticleParser. Optionally, information from Wikisaurus is added to the parsed database using the WikisaurusArticleParser.

Parameters:
dumpFile - file name of the Wiktionary dump in XML format.
targetDirectory - directory for storing the parsed data.
overwriteExisting - if true, previously parsed Wiktionary data files are removed from the targetDirectory.
parseWikiSaurus - if true, Wikisaurus pages are parsed and the parsed information is added to the corresponding articles.
Throws:
WiktionaryException - in case of any parser errors.

deleteEdition

public static void deleteEdition(File parsedData)
Deletes all files of a previously parsed Wiktionary from the specified directory. This method is equivalent to BerkeleyDBWiktionaryEdition.deleteParsedWiktionary(File).



Copyright © 2011-2013 Ubiquitous Knowledge Processing (UKP) Lab. All Rights Reserved.