Add note about Data Prepper versus ingest processors #6886

Merged 7 commits on Apr 11, 2024
6 changes: 5 additions & 1 deletion _ingest-pipelines/processors/append.md
@@ -6,10 +6,14 @@ nav_order: 10
redirect_from:
- /api-reference/ingest-apis/processors/append/
---


This documentation describes using the `append` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `add_entries` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/add-entries/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Append processor

The `append` processor is used to add values to a field:

- If the field is an array, the `append` processor appends the specified values to that array.
- If the field is a scalar field, the `append` processor converts it to an array and appends the specified values to that array.
- If the field does not exist, the `append` processor creates an array with the specified values.
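
The three cases above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the processor's implementation; `append_values` is a hypothetical helper name, and a plain `dict` stands in for the document.

```python
from typing import Any


def append_values(doc: dict[str, Any], field: str, values: list[Any]) -> dict[str, Any]:
    """Return a copy of doc with values appended to field (hypothetical helper)."""
    result = dict(doc)
    if field not in result:
        # The field does not exist: create an array with the specified values.
        result[field] = list(values)
    elif isinstance(result[field], list):
        # The field is already an array: append the values to it.
        result[field] = result[field] + list(values)
    else:
        # The field is a scalar: convert it to an array, then append.
        result[field] = [result[field]] + list(values)
    return result


print(append_values({"tags": "prod"}, "tags", ["web"]))
```

The helper returns a new dictionary rather than mutating its input so that each case can be checked in isolation.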

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/convert.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/convert/
---

This documentation describes using the `convert` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `convert_entry_type` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/convert_entry_type/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Convert processor

The `convert` processor converts a field in a document to a different type, for example, a string to an integer or an integer to a string. For an array field, all values in the array are converted.
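
As a rough illustration of the behavior described above, the following Python sketch converts a scalar field or every element of an array field. The `convert_field` helper and the `CONVERTERS` table are assumptions made for this example; they cover only three target types and are not the processor's implementation.

```python
from typing import Any, Callable

# Assumed subset of target types, chosen for illustration only.
CONVERTERS: dict[str, Callable[[Any], Any]] = {
    "integer": int,
    "float": float,
    "string": str,
}


def convert_field(doc: dict[str, Any], field: str, target_type: str) -> dict[str, Any]:
    """Return a copy of doc with field converted to target_type (hypothetical helper)."""
    convert = CONVERTERS[target_type]
    result = dict(doc)
    value = result[field]
    if isinstance(value, list):
        # For an array field, every value in the array is converted.
        result[field] = [convert(item) for item in value]
    else:
        result[field] = convert(value)
    return result


print(convert_field({"price": "42"}, "price", "integer"))
print(convert_field({"ids": [1, 2, 3]}, "ids", "string"))
```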

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/copy.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/copy/
---

This documentation describes using the `copy` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `copy_values` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/copy-values/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Copy processor

The `copy` processor copies an entire object in an existing field to another field.

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/csv.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/csv/
---

This documentation describes using the `csv` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `csv` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/csv/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# CSV processor

The `csv` processor is used to parse CSVs and store them as individual fields in a document. The processor ignores empty fields.

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/date.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/date/
---

This documentation describes using the `date` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `date` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/date/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Date processor

The `date` processor is used to parse dates from document fields and to add the parsed data to a new field. By default, the parsed data is stored in the `@timestamp` field.

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/dissect.md
@@ -5,6 +5,9 @@ parent: Ingest processors
nav_order: 60
---

This documentation describes using the `dissect` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `dissect` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/dissect/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Dissect

The `dissect` processor extracts values from a document text field and maps them to individual fields based on dissect patterns. The processor is well suited for field extractions from log messages with a known structure. Unlike the `grok` processor, `dissect` does not use regular expressions and has a simpler syntax.

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/drop.md
@@ -5,6 +5,9 @@ parent: Ingest processors
nav_order: 70
---

This documentation describes using the `drop` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `drop_events` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/drop-events/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Drop processor

The `drop` processor is used to discard documents without indexing them. This can be useful for preventing documents from being indexed based on certain conditions. For example, you might use a `drop` processor to prevent documents that are missing important fields or contain sensitive information from being indexed.

6 changes: 3 additions & 3 deletions _ingest-pipelines/processors/grok.md
@@ -6,13 +6,13 @@ grand_parent: Ingest pipelines
nav_order: 140
---

This documentation describes using the `grok` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Grok processor

The `grok` processor is used to parse and structure unstructured data using pattern matching. You can use the `grok` processor to extract fields from log messages, web server access logs, application logs, and other log data that follows a consistent format.

This documentation describes using the `grok` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `grok` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/grok/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

## Grok basics

The `grok` processor uses a set of predefined patterns to match parts of the input text. Each pattern consists of a name and a regular expression. For example, the pattern `%{IP:ip_address}` matches an IP address and assigns it to the field `ip_address`. You can combine multiple patterns to create more complex expressions. For example, the pattern `%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}` matches a line from a web server access log and extracts the client IP address, the HTTP method, the request URI, the number of bytes sent, and the duration of the request.
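
To make the name-plus-regular-expression idea concrete, the Python sketch below expands each `%{PATTERN:field}` token into a named capture group and applies the result to a sample access-log line. It uses the access-log expression with every token closed separately. The regular expressions in `PATTERNS` are simplified stand-ins assumed for this example, not grok's built-in definitions, and `grok_to_regex` and the sample line are hypothetical.

```python
import re

# Simplified stand-ins for built-in patterns (assumptions for illustration only).
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATHPARAM": r"\S+",
    "NUMBER": r"-?\d+(?:\.\d+)?",
}


def grok_to_regex(expression: str) -> re.Pattern:
    """Expand %{PATTERN:field} tokens into named groups (hypothetical helper)."""

    def expand(token: re.Match) -> str:
        pattern_name, field_name = token.group(1), token.group(2)
        return f"(?P<{field_name}>{PATTERNS[pattern_name]})"

    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expression))


expression = (
    "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} "
    "%{NUMBER:bytes} %{NUMBER:duration}"
)
match = grok_to_regex(expression).match("192.168.0.1 GET /index.html 1024 0.043")
if match:
    # Every extracted value is a string; type conversion is a separate step.
    print(match.groupdict())
```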

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/kv.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/lowercase/
---

This documentation describes using the `kv` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `key_value` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/key-value/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# KV processor

The `kv` processor automatically extracts specific event fields or messages that are in a `key=value` format. This structured format organizes your data by grouping it together based on keys and values. It's helpful for analyzing, visualizing, and using data, such as user behavior analytics, performance optimizations, or security investigations.
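
The `key=value` extraction described above can be approximated in Python as follows. This is a minimal sketch under stated assumptions: `parse_kv` is a hypothetical helper, its `field_split` and `value_split` arguments are treated as literal strings for simplicity, and the sample message is invented for the example.

```python
def parse_kv(text: str, field_split: str = " ", value_split: str = "=") -> dict[str, str]:
    """Split text into pairs, then split each pair into a key and a value (hypothetical helper)."""
    fields: dict[str, str] = {}
    for pair in text.split(field_split):
        if value_split not in pair:
            # Skip fragments that are not in key=value form.
            continue
        key, value = pair.split(value_split, 1)
        fields[key] = value
    return fields


print(parse_kv("user=alice status=200 path=/login"))
```

Splitting each pair only once keeps any later `=` characters inside the value, for example in a query string.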

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/lowercase.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/lowercase/
---

This documentation describes using the `lowercase` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `lowercase_string` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/lowercase-string/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Lowercase processor

The `lowercase` processor converts all the text in a specific field to lowercase letters.

3 changes: 3 additions & 0 deletions _ingest-pipelines/processors/uppercase.md
@@ -7,6 +7,9 @@ redirect_from:
- /api-reference/ingest-apis/processors/uppercase/
---

This documentation describes using the `uppercase` processor in OpenSearch ingest pipelines. Consider using the [Data Prepper `uppercase_string` processor]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/uppercase-string/), which runs on the OpenSearch cluster, if your use case involves large or complex datasets.
{: .note}

# Uppercase processor

The `uppercase` processor converts all the text in a specific field to uppercase letters.