8.2. Example HTTP Connector
The Example HTTP connector has a simple goal: it reads comma-separated data over HTTP. For example, if you have a large amount of data in CSV format, you can point the Example HTTP connector at this data and write a SQL query to process it.
Code
The Example HTTP connector can be found in the presto-example-http directory in the root of the Presto source tree.
Maven Project
The Example HTTP connector is built with Maven, using the pom.xml file in the root of the plugin directory.
Project Dependencies
Plugins depend on the SPI from Presto:
<dependency>
    <groupId>com.facebook.presto</groupId>
    <artifactId>presto-spi</artifactId>
    <scope>provided</scope>
</dependency>
The plugin uses the Maven provided scope because Presto provides the classes from the SPI at runtime and thus the plugin should not include them in the plugin assembly.
There are a few other dependencies that are provided by Presto, such as javax.inject and Jackson. In particular, Jackson is used for serializing handles, and thus plugins must use the version provided by Presto.
All other dependencies are based on what the plugin needs for its own implementation. Plugins are loaded in a separate class loader to provide isolation and to allow plugins to use a different version of a library that Presto uses internally.
Plugin Implementation
The plugin implementation in the Example HTTP connector looks very similar to other plugin implementations. Most of the implementation is devoted to handling optional configuration and the only function of interest is the following:
@Override
public <T> List<T> getServices(Class<T> type)
{
    if (type == ConnectorFactory.class) {
        return ImmutableList.of(type.cast(new ExampleConnectorFactory(getOptionalConfig())));
    }
    return ImmutableList.of();
}
Note that the ImmutableList class is a utility class from Guava.
As with all plugins, this plugin overrides the getServices() method and returns an ExampleConnectorFactory in response to a request for a service of type ConnectorFactory.
ConnectorFactory Implementation
In Presto, the primary object that handles the connection between Presto and a particular type of data source is the Connector object, which is created by a ConnectorFactory.
This implementation is available in the class ExampleConnectorFactory. The first thing the connector factory implementation does is specify the name of this connector. This is the same string used to reference this connector in Presto configuration.
@Override
public String getName()
{
    return "example-http";
}
The real work in a connector factory happens in the create() method. In the ExampleConnectorFactory class, the create() method configures the connector and then asks Guice to create the object. This is the meat of the create() method without parameter validation and exception handling:
// A plugin is not required to use Guice; it is just very convenient
Bootstrap app = new Bootstrap(
        new JsonModule(),
        new ExampleModule(connectorId));

Injector injector = app
        .strictConfig()
        .doNotInitializeLogging()
        .setRequiredConfigurationProperties(requiredConfig)
        .setOptionalConfigurationProperties(optionalConfig)
        .initialize();

return injector.getInstance(ExampleConnector.class);
Connector: ExampleConnector
This class allows Presto to obtain references to the various services provided by the connector.
Metadata: ExampleMetadata
This class is responsible for reporting table names, table metadata, column names, column metadata and other information about the schemas that are provided by this connector. ConnectorMetadata is also called by Presto to ensure that a particular connector can understand and handle a given table name.
The ExampleMetadata implementation delegates many of these calls to ExampleClient, a class that implements much of the core functionality of the connector.
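The delegation described above can be illustrated with a minimal, self-contained sketch. It does not use the Presto SPI: CatalogClientSketch and MetadataSketch are hypothetical stand-ins for ExampleClient and ExampleMetadata, and their method names are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the client: it owns the catalog data
// (schema name -> table names) that the metadata class reports.
final class CatalogClientSketch
{
    private final Map<String, List<String>> tablesBySchema = new LinkedHashMap<>();

    void addTable(String schema, String table)
    {
        tablesBySchema.computeIfAbsent(schema, key -> new ArrayList<>()).add(table);
    }

    List<String> getSchemaNames()
    {
        return new ArrayList<>(tablesBySchema.keySet());
    }

    List<String> getTableNames(String schema)
    {
        // Unknown schemas yield an empty list rather than null.
        return tablesBySchema.getOrDefault(schema, Collections.emptyList());
    }
}

// Hypothetical stand-in for the metadata class: it holds no catalog
// state of its own and forwards every question to the client.
final class MetadataSketch
{
    private final CatalogClientSketch client;

    MetadataSketch(CatalogClientSketch client)
    {
        this.client = client;
    }

    List<String> listSchemaNames()
    {
        return client.getSchemaNames();
    }

    List<String> listTables(String schema)
    {
        return client.getTableNames(schema);
    }

    // Lets the caller check whether this connector knows a given table name.
    boolean tableExists(String schema, String table)
    {
        return client.getTableNames(schema).contains(table);
    }
}
```

Keeping the catalog data in one client object means the metadata class stays a thin adapter, and the same client could serve other parts of a connector.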
Split Manager: ExampleSplitManager
The split manager partitions the data for a table into the individual chunks that Presto will distribute to workers for processing. In the case of the Example HTTP connector, each table contains one or more URIs pointing at the actual data. One split is created per URI.
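The one-split-per-URI rule can be sketched without the Presto SPI. In the sketch below, UriSplit and SplitSketch are hypothetical names introduced for illustration, not classes from the connector, and the URIs in the usage note are placeholders.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a connector split: the table it belongs to
// and the single URI a worker would read for it.
final class UriSplit
{
    private final String tableName;
    private final URI uri;

    UriSplit(String tableName, URI uri)
    {
        this.tableName = tableName;
        this.uri = uri;
    }

    String getTableName()
    {
        return tableName;
    }

    URI getUri()
    {
        return uri;
    }
}

final class SplitSketch
{
    private SplitSketch() {}

    // Creates exactly one split per source URI, preserving the input order.
    static List<UriSplit> splitsFor(String tableName, List<URI> sources)
    {
        List<UriSplit> splits = new ArrayList<>();
        for (URI source : sources) {
            splits.add(new UriSplit(tableName, source));
        }
        return splits;
    }
}
```

For a table backed by two placeholder URIs such as http://example.com/numbers-1.csv and http://example.com/numbers-2.csv, splitsFor returns two splits, so the two sources can be processed independently.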
Record Set Provider: ExampleRecordSetProvider
The record set provider creates a record set, which in turn creates a record cursor that returns the actual data to Presto. ExampleRecordCursor reads data from a URI via HTTP. Each line corresponds to a single row. Each line is split on commas into individual field values, which are then returned to Presto.
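The line-and-comma parsing described above can be sketched as follows. This is a simplified illustration and not the ExampleRecordCursor code: CsvRowsSketch and readRows are hypothetical names, the rows are collected eagerly instead of being exposed through a cursor, and the input is taken from any Reader so the sketch runs without a network connection.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CsvRowsSketch
{
    private CsvRowsSketch() {}

    // Reads every line from the source as one row and splits it on commas.
    // Quoting and escaping are deliberately not handled.
    static List<List<String>> readRows(Reader source)
    {
        List<List<String>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // A limit of -1 keeps trailing empty fields, so "a,," yields three values.
                rows.add(Arrays.asList(line.split(",", -1)));
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

To read from an HTTP source instead of an in-memory string, one option is to pass new InputStreamReader(url.openStream(), StandardCharsets.UTF_8) for a java.net.URL named url.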