Skip to content

Commit

Permalink
reorganized. wip
Browse files Browse the repository at this point in the history
  • Loading branch information
elijahpetty committed Jan 25, 2024
1 parent 148dad8 commit b62825d
Showing 1 changed file with 33 additions and 29 deletions.
62 changes: 33 additions & 29 deletions docs/working-with-deephaven-tables/merging-tables.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
---
id: merging-tables
title: How to merge tables in Deephaven
sidebar_label: Merging tables in Deephaven
title: Merging tables
---

This guide discusses how to merge tables in Deephaven.
Expand All @@ -12,21 +11,38 @@ Merge operations combine two tables by stacking the tables one on top of the oth

:::

The basic `merge` syntax follows:
## Syntax

```syntax
t = merge(tables...)
There are two methods for merging tables in Deephaven: `merge` and `merge_sorted`.

### `merge`

The `merge` method simply stacks one or more tables on top of another.

```python syntax
t = merge(tables: List[Table])
```

All of the source tables must have the same schema (column names and column types), and `NULL` inputs will be ignored.
:::note

The columns for each table need to have the same names and types, or a column mismatch error will occur. `NULL` inputs are ignored.

The resulting table is all of the source tables stacked vertically. If the source tables dynamically change, such as for ticking data, rows will be inserted within the stack. For example, if a row is added to the end of the third source table, in the resulting table, that new row appears after all other rows from the third source table and before all rows from the fourth source table.
:::

### `merge_sorted`

The `merge_sorted` method is similar to the `merge` method, but it sorts the result table after merging the data. The `merge_sorted` method is more efficient than using `merge` followed by `sort`.

```python syntax
t = merge(tables: List[Table])
t = merge_sorted(tables: List[Table], order_by: str)
```

## Examples

Next we will demonstrate Deephaven's `merge` methods with some examples. First, we will create some source tables to use in the following examples:
The following code block initializes three tables, each with two columns. We will use these tables in the following examples.

```python order=source1,source2,source3
```python test-set=1 order=source1,source2,source3,result_merged,result_merge_sorted
from deephaven import merge, merge_sorted, new_table
from deephaven.column import int_col, string_col

Expand All @@ -35,43 +51,31 @@ source2 = new_table([string_col("Letter", ["C", "D", "E"]), int_col("Number", [1
source3 = new_table([string_col("Letter", ["E", "F", "A"]), int_col("Number", [22, 25, 27])])
```

The sections below discuss basic merge operations, and how to merge tables effectively, especially when you have many tables to combine.

### `merge`

The above source tables can be combined, or vertically stacked, by providing each table as an argument to the merge method.

:::note

The columns for each table need to have the same names and types, or a column mismatch error will occur.

:::

The following query merges two tables:
Next, let's merge two of our tables using the `merge` method.

```python order=result
```python test-set=1 order=result
result = merge([source1, source2])
```

The following query merges three tables:

```python order=result
result = merge([source1, source2, source3])
```
The resulting table `result_merged` is all of the source tables stacked vertically. If the source tables dynamically change, such as for ticking data, rows will be inserted within the stack. For example, if a row is added to the end of the third source table, in the resulting table, that new row appears after all other rows from the third source table and before all rows from the fourth source table.

### `merge_sorted`

The `merge_sorted` method is similar to the `merge` method, but it sorts the result table after merging the data. The `merge_sorted` method is more efficient than using `merge` followed by `sort`.

```python order=result
result = merge([source1, source2, source3], "Number")
```python test-set=1 order=result
result = merge_sorted([source1, source2, source3], "Number")
```

The resulting table is all of the source tables stacked vertically and sorted by the `Number` column.

## Perform efficient merges

When performing more than one merge operation, it is best to perform all the merges at the same time, rather than nesting several merges.

In this example, a table named result is initialized. As new tables are generated, the results are merged at every iteration. Calling the merge method on each iteration makes this example inefficient.
In this example, a table named `result` is initialized. As new tables are generated, the results are merged at every iteration. Calling the merge method on each iteration makes this example inefficient.

```python order=result
from deephaven import merge, new_table
Expand Down

0 comments on commit b62825d

Please sign in to comment.