Skip to content

Commit

Permalink
Example: FFI Table Provider as dynamic module loading (apache#13183)
Browse files Browse the repository at this point in the history
* Initial commit for example on using FFI Table provider in rust as a module loading system

* Add license text

* formatting cargo.toml files

* Correct typos in readme

* Update documentation per PR feedback

* Add additional documentation to example

* Do not publish example

* apply prettier

* Add text describing async calls

* Update datafusion-examples/examples/ffi/ffi_example_table_provider/Cargo.toml

Co-authored-by: Alexander Hirner <[email protected]>

---------

Co-authored-by: Alexander Hirner <[email protected]>
  • Loading branch information
timsaucer and ahirner authored Nov 5, 2024
1 parent 4f82371 commit a6586cc
Show file tree
Hide file tree
Showing 10 changed files with 432 additions and 44 deletions.
3 changes: 3 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,9 @@ members = [
"datafusion/substrait",
"datafusion/wasmtest",
"datafusion-examples",
"datafusion-examples/examples/ffi/ffi_example_table_provider",
"datafusion-examples/examples/ffi/ffi_module_interface",
"datafusion-examples/examples/ffi/ffi_module_loader",
"test-utils",
"benchmarks",
]
Expand Down
48 changes: 48 additions & 0 deletions datafusion-examples/examples/ffi/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Example FFI Usage

The purpose of these crates is to provide an example of how one can use the
DataFusion Foreign Function Interface (FFI). See [API Docs] for detailed
usage.

This example is broken into three crates.

- `ffi_module_interface` is a common library to be shared by both the module
to be loaded and the program that will load it. It defines how the module
is to be structured.
- `ffi_example_table_provider` creates a library to exposes the module.
- `ffi_module_loader` is an example program that loads the module, gets data
from it, and displays this data to the user.

## Building and running

In order for the program to run successfully, the module to be loaded must be
built first. This example expects both the module and the program to be
built using the same build mode (debug or release).

```shell
cd ffi_example_table_provider
cargo build
cd ../ffi_module_loader
cargo run
```

[api docs]: http://docs.rs/datafusion-ffi/latest
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "ffi_example_table_provider"
version = "0.1.0"
edition = { workspace = true }
publish = false

[dependencies]
abi_stable = "0.11.3"
arrow = { workspace = true }
arrow-array = { workspace = true }
arrow-schema = { workspace = true }
datafusion = { workspace = true }
datafusion-ffi = { workspace = true }
ffi_module_interface = { path = "../ffi_module_interface" }

[lib]
name = "ffi_example_table_provider"
crate-type = ["cdylib", 'rlib']
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use std::sync::Arc;

use abi_stable::{export_root_module, prefix_type::PrefixTypeTrait};
use arrow_array::RecordBatch;
use datafusion::{
arrow::datatypes::{DataType, Field, Schema},
common::record_batch,
datasource::MemTable,
};
use datafusion_ffi::table_provider::FFI_TableProvider;
use ffi_module_interface::{TableProviderModule, TableProviderModuleRef};

fn create_record_batch(start_value: i32, num_values: usize) -> RecordBatch {
let end_value = start_value + num_values as i32;
let a_vals: Vec<i32> = (start_value..end_value).collect();
let b_vals: Vec<f64> = a_vals.iter().map(|v| *v as f64).collect();

record_batch!(("a", Int32, a_vals), ("b", Float64, b_vals)).unwrap()
}

/// Here we only wish to create a simple table provider as an example.
/// We create an in-memory table and convert it to it's FFI counterpart.
extern "C" fn construct_simple_table_provider() -> FFI_TableProvider {
let schema = Arc::new(Schema::new(vec![
Field::new("a", DataType::Int32, true),
Field::new("b", DataType::Float64, true),
]));

// It is useful to create these as multiple record batches
// so that we can demonstrate the FFI stream.
let batches = vec![
create_record_batch(1, 5),
create_record_batch(6, 1),
create_record_batch(7, 5),
];

let table_provider = MemTable::try_new(schema, vec![batches]).unwrap();

FFI_TableProvider::new(Arc::new(table_provider), true)
}

#[export_root_module]
/// This defines the entry point for using the module.
pub fn get_simple_memory_table() -> TableProviderModuleRef {
TableProviderModule {
create_table: construct_simple_table_provider,
}
.leak_into_prefix()
}
26 changes: 26 additions & 0 deletions datafusion-examples/examples/ffi/ffi_module_interface/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "ffi_module_interface"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
abi_stable = "0.11.3"
datafusion-ffi = { workspace = true }
49 changes: 49 additions & 0 deletions datafusion-examples/examples/ffi/ffi_module_interface/src/lib.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use abi_stable::{
declare_root_module_statics,
library::{LibraryError, RootModule},
package_version_strings,
sabi_types::VersionStrings,
StableAbi,
};
use datafusion_ffi::table_provider::FFI_TableProvider;

#[repr(C)]
#[derive(StableAbi)]
#[sabi(kind(Prefix(prefix_ref = TableProviderModuleRef)))]
/// This struct defines the module interfaces. It is to be shared by
/// both the module loading program and library that implements the
/// module. It is possible to move this definition into the loading
/// program and reference it in the modules, but this example shows
/// how a user may wish to separate these concerns.
pub struct TableProviderModule {
/// Constructs the table provider
pub create_table: extern "C" fn() -> FFI_TableProvider,
}

impl RootModule for TableProviderModuleRef {
declare_root_module_statics! {TableProviderModuleRef}
const BASE_NAME: &'static str = "ffi_example_table_provider";
const NAME: &'static str = "ffi_example_table_provider";
const VERSION_STRINGS: VersionStrings = package_version_strings!();

fn initialization(self) -> Result<Self, LibraryError> {
Ok(self)
}
}
29 changes: 29 additions & 0 deletions datafusion-examples/examples/ffi/ffi_module_loader/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "ffi_module_loader"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
abi_stable = "0.11.3"
datafusion = { workspace = true }
datafusion-ffi = { workspace = true }
ffi_module_interface = { path = "../ffi_module_interface" }
tokio = { workspace = true, features = ["rt-multi-thread", "parking_lot"] }
63 changes: 63 additions & 0 deletions datafusion-examples/examples/ffi/ffi_module_loader/src/main.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

use std::sync::Arc;

use datafusion::{
error::{DataFusionError, Result},
prelude::SessionContext,
};

use abi_stable::library::{development_utils::compute_library_path, RootModule};
use datafusion_ffi::table_provider::ForeignTableProvider;
use ffi_module_interface::TableProviderModuleRef;

#[tokio::main]
async fn main() -> Result<()> {
// Find the location of the library. This is specific to the build environment,
// so you will need to change the approach here based on your use case.
let target: &std::path::Path = "../../../../target/".as_ref();
let library_path = compute_library_path::<TableProviderModuleRef>(target)
.map_err(|e| DataFusionError::External(Box::new(e)))?;

// Load the module
let table_provider_module =
TableProviderModuleRef::load_from_directory(&library_path)
.map_err(|e| DataFusionError::External(Box::new(e)))?;

// By calling the code below, the table provided will be created within
// the module's code.
let ffi_table_provider =
table_provider_module
.create_table()
.ok_or(DataFusionError::NotImplemented(
"External table provider failed to implement create_table".to_string(),
))?();

// In order to access the table provider within this executable, we need to
// turn it into a `ForeignTableProvider`.
let foreign_table_provider: ForeignTableProvider = (&ffi_table_provider).into();

let ctx = SessionContext::new();

// Display the data to show the full cycle works.
ctx.register_table("external_table", Arc::new(foreign_table_provider))?;
let df = ctx.table("external_table").await?;
df.show().await?;

Ok(())
}
Loading

0 comments on commit a6586cc

Please sign in to comment.