de.tudarmstadt.ukp.jwktl.parser
Class WritableBerkeleyDBWiktionaryEdition

java.lang.Object
  extended by de.tudarmstadt.ukp.jwktl.api.entry.AbstractWiktionary
      extended by de.tudarmstadt.ukp.jwktl.api.entry.WiktionaryEdition
          extended by de.tudarmstadt.ukp.jwktl.api.entry.BerkeleyDBWiktionaryEdition
              extended by de.tudarmstadt.ukp.jwktl.parser.WritableBerkeleyDBWiktionaryEdition
All Implemented Interfaces:
IWiktionary, IWiktionaryEdition, IWritableWiktionaryEdition

public class WritableBerkeleyDBWiktionaryEdition
extends BerkeleyDBWiktionaryEdition
implements IWritableWiktionaryEdition

Extends the Berkeley DB implementation with the ability to modify the database contents. Write access is required by the parsers, but not by the querying and iterating interface.

Author:
Christian M. Meyer

Nested Class Summary
 
Nested classes/interfaces inherited from class de.tudarmstadt.ukp.jwktl.api.entry.BerkeleyDBWiktionaryEdition
BerkeleyDBWiktionaryEdition.WiktionaryEntryProxy, BerkeleyDBWiktionaryEdition.WiktionarySenseProxy
 
Field Summary
protected  long entryCount
           
protected  boolean entryIndexByTitle
           
protected  long pageCount
           
protected  long senseCount
           
 
Fields inherited from class de.tudarmstadt.ukp.jwktl.api.entry.BerkeleyDBWiktionaryEdition
DATABASE_NAME, dbPath, entryById, entryByKey, env, language, openCursors, pageById, pageByNormalizedTitle, pageByTitle, properties, PROPERTY_FILE_NAME, senseByKey, store
 
Fields inherited from class de.tudarmstadt.ukp.jwktl.api.entry.WiktionaryEdition
isClosed
 
Constructor Summary
WritableBerkeleyDBWiktionaryEdition(File dbPath, boolean overwriteExisting)
          Shorthand for WritableBerkeleyDBWiktionaryEdition(File, boolean, Long) with cacheSize set to half of the current JVM maximum memory.
WritableBerkeleyDBWiktionaryEdition(File dbPath, boolean overwriteExisting, Long cacheSize)
          Instantiates the writable Wiktionary database for the given database path.
 
Method Summary
 void commit()
          Force a database commit of the pages saved so far.
protected  void connect(boolean isReadOnly, boolean allowCreateNew, boolean overwriteExisting, Long cacheSize)
           
 boolean getEntryIndexByTitle()
          Returns whether IWiktionaryEntrys should be ordered alphabetically.
 void savePage(WiktionaryPage page)
          Adds the given Wiktionary page to the database.
 void saveProperties(IDumpInfo dumpInfo)
          Hotspot called after parsing has finished to save the metadata of the dump file and the basic parsing statistics.
 void setEntryIndexByTitle(boolean entryIndexByTitle)
          Sorts the entries by word form before assigning an ID to them.
 void setLanguage(ILanguage language)
          Assigns the given language to the Wiktionary edition.
 
Methods inherited from class de.tudarmstadt.ukp.jwktl.api.entry.BerkeleyDBWiktionaryEdition
deleteParsedWiktionary, doClose, getAllPages, getDBName, getDBPath, getEntryForId, getLanguage, getPageForId, getPageForWord, getPagesForWord, getSenseForKey, loadPage, prepareTargetDirectory
 
Methods inherited from class de.tudarmstadt.ukp.jwktl.api.entry.WiktionaryEdition
close, ensureOpen, getAllEntries, getAllSenses, getEntriesForWord, getEntryForId, getEntryForWord, getSenseForId, getSenseForId, getSensesForWord, getSensesForWord, getSensesForWord, isClosed
 
Methods inherited from class de.tudarmstadt.ukp.jwktl.api.entry.AbstractWiktionary
getAllEntries, getAllEntries, getAllEntries, getAllEntries, getAllEntries, getAllPages, getAllPages, getAllPages, getAllPages, getAllPages, getAllSenses, getAllSenses, getAllSenses, getAllSenses, getAllSenses, getEntriesForWord, getEntriesForWord, getEntriesForWord, getPagesForWord, getSensesForWord, getSensesForWord, getSensesForWord
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.tudarmstadt.ukp.jwktl.parser.IWritableWiktionaryEdition
getPageForId, getPageForWord
 
Methods inherited from interface de.tudarmstadt.ukp.jwktl.api.IWiktionaryEdition
getDBPath, getEntryForId, getEntryForId, getEntryForWord, getLanguage, getSenseForId, getSenseForId, getSenseForKey, getSensesForWord, getSensesForWord
 
Methods inherited from interface de.tudarmstadt.ukp.jwktl.api.IWiktionary
close, getAllEntries, getAllEntries, getAllEntries, getAllEntries, getAllEntries, getAllEntries, getAllPages, getAllPages, getAllPages, getAllPages, getAllPages, getAllPages, getAllSenses, getAllSenses, getAllSenses, getAllSenses, getAllSenses, getAllSenses, getEntriesForWord, getEntriesForWord, getEntriesForWord, getEntriesForWord, getPagesForWord, getPagesForWord, getSensesForWord, getSensesForWord, getSensesForWord, getSensesForWord, isClosed
 

Field Detail

pageCount

protected long pageCount

entryCount

protected long entryCount

senseCount

protected long senseCount

entryIndexByTitle

protected boolean entryIndexByTitle
Constructor Detail

WritableBerkeleyDBWiktionaryEdition

public WritableBerkeleyDBWiktionaryEdition(File dbPath,
                                           boolean overwriteExisting)
Shorthand for WritableBerkeleyDBWiktionaryEdition(File, boolean, Long) with cacheSize set to half of the current JVM maximum memory.
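
The default cache size described for this constructor can be approximated with the standard Runtime API. The following is a minimal sketch, not the library's actual code: the class CacheSizeSketch and its methods are hypothetical names, and it assumes that "max memory" refers to the value reported by Runtime.maxMemory().

```java
// Sketch only: CacheSizeSketch is a hypothetical helper, not part of JWKTL.
final class CacheSizeSketch {

    // Half of the given memory size, in bytes.
    static long halfOf(long maxMemoryBytes) {
        return maxMemoryBytes / 2;
    }

    // Assumption: the default cacheSize is derived from Runtime.maxMemory(),
    // i.e. the maximum heap the JVM will attempt to use (see -Xmx).
    static long defaultCacheSize() {
        return halfOf(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("cacheSize (bytes): " + defaultCacheSize());
    }
}
```

A caller who wants a different cache size would pass an explicit value to the three-argument constructor instead.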


WritableBerkeleyDBWiktionaryEdition

public WritableBerkeleyDBWiktionaryEdition(File dbPath,
                                           boolean overwriteExisting,
                                           Long cacheSize)
Instantiates the writable Wiktionary database for the given database path.

Parameters:
overwriteExisting - if set to false, parsing a Wiktionary dump using this database will cause an exception if the database path is not empty. Otherwise, an existing parsed Wiktionary database will be overwritten.
cacheSize - the size of the cache (in bytes) used by the Berkeley DB.
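
The overwriteExisting rule above can be illustrated with a small, self-contained check. This is a sketch under stated assumptions rather than the library's implementation: OverwriteCheckSketch and checkTarget are hypothetical names, and the exception type is chosen for illustration only, since the documentation does not say which exception is raised.

```java
import java.io.File;

// Sketch only: OverwriteCheckSketch is hypothetical and not part of JWKTL.
final class OverwriteCheckSketch {

    // Rejects a non-empty target directory unless overwriting is allowed.
    // Assumption: IllegalStateException stands in for whatever the library throws.
    static void checkTarget(File dbPath, boolean overwriteExisting) {
        String[] children = dbPath.list(); // null if dbPath is not a directory
        boolean nonEmpty = children != null && children.length > 0;
        if (nonEmpty && !overwriteExisting) {
            throw new IllegalStateException(
                    "Target directory is not empty: " + dbPath);
        }
    }

    public static void main(String[] args) {
        // An empty or missing directory passes regardless of the flag.
        checkTarget(new File("does-not-exist"), false);
        System.out.println("ok");
    }
}
```
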
Method Detail

connect

protected void connect(boolean isReadOnly,
                       boolean allowCreateNew,
                       boolean overwriteExisting,
                       Long cacheSize)
                throws com.sleepycat.je.DatabaseException
Overrides:
connect in class BerkeleyDBWiktionaryEdition
Throws:
com.sleepycat.je.DatabaseException

getEntryIndexByTitle

public boolean getEntryIndexByTitle()
Returns whether IWiktionaryEntrys should be ordered alphabetically.


setEntryIndexByTitle

public void setEntryIndexByTitle(boolean entryIndexByTitle)
Description copied from interface: IWritableWiktionaryEdition
Sorts the entries by word form before assigning an ID to them. THIS METHOD IS KEPT FOR COMPATIBILITY. YOU SHOULD NOT USE THIS METHOD.

Specified by:
setEntryIndexByTitle in interface IWritableWiktionaryEdition

setLanguage

public void setLanguage(ILanguage language)
Description copied from interface: IWritableWiktionaryEdition
Assigns the given language to the Wiktionary edition.

Specified by:
setLanguage in interface IWritableWiktionaryEdition

commit

public void commit()
            throws WiktionaryException
Description copied from interface: IWritableWiktionaryEdition
Force a database commit of the pages saved so far.

Specified by:
commit in interface IWritableWiktionaryEdition
Throws:
WiktionaryException
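
Because commit() applies to "the pages saved so far", a caller may choose to commit in batches while saving pages. The sketch below shows that pattern against an in-memory stand-in. PageSink, InMemorySink and writeAll are hypothetical names introduced for illustration; they only mirror the savePage/commit pair documented here and do not use the JWKTL classes or Berkeley DB.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a stand-in for the savePage/commit pair, not the JWKTL API.
final class CommitBatchSketch {

    // Hypothetical minimal view of a writable edition.
    interface PageSink {
        void savePage(String title);
        void commit();
    }

    // In-memory sink that separates pending pages from committed ones.
    static final class InMemorySink implements PageSink {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        public void savePage(String title) {
            pending.add(title);
        }

        public void commit() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    // Saves all pages, committing after every batchSize pages and once more
    // at the end if pages remain uncommitted. Returns the number of commits.
    static int writeAll(PageSink sink, List<String> titles, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int sinceCommit = 0;
        int commits = 0;
        for (String title : titles) {
            sink.savePage(title);
            if (++sinceCommit == batchSize) {
                sink.commit();
                commits++;
                sinceCommit = 0;
            }
        }
        if (sinceCommit > 0) {
            sink.commit();
            commits++;
        }
        return commits;
    }
}
```

The batch size trades commit overhead against the number of saved pages that remain uncommitted at any time; a suitable value depends on the dump size and available memory.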

saveProperties

public void saveProperties(IDumpInfo dumpInfo)
                    throws WiktionaryException
Description copied from interface: IWritableWiktionaryEdition
Hotspot called after parsing has finished to save the metadata of the dump file and the basic parsing statistics.

Specified by:
saveProperties in interface IWritableWiktionaryEdition
Throws:
WiktionaryException

savePage

public void savePage(WiktionaryPage page)
              throws com.sleepycat.je.DatabaseException
Adds the given Wiktionary page to the database.

Specified by:
savePage in interface IWritableWiktionaryEdition
Throws:
com.sleepycat.je.DatabaseException - if the page could not be stored, for example because the DB is in read-only mode.


Copyright © 2011-2013 Ubiquitous Knowledge Processing (UKP) Lab. All Rights Reserved.