docs: add info about compression #15699

Status: Open. 1 commit to be merged into base branch `main`.
14 changes: 11 additions & 3 deletions docs/sources/configure/bp-configure.md
@@ -23,22 +23,22 @@ One issue many people have with Loki is their client receiving errors for out of

There are a few things to dissect from that statement. The first is this restriction is per stream. Let’s look at an example:

```
```bash
{job="syslog"} 00:00:00 i'm a syslog!
{job="syslog"} 00:00:01 i'm a syslog!
```

If Loki received these two lines which are for the same stream, everything would be fine. But what about this case:

```
```bash
{job="syslog"} 00:00:00 i'm a syslog!
{job="syslog"} 00:00:02 i'm a syslog!
{job="syslog"} 00:00:01 i'm a syslog! <- Rejected out of order!
```

What can you do about this? What if this was because the sources of these logs were different systems? You can solve this with an additional label which is unique per system:

```
```bash
{job="syslog", instance="host1"} 00:00:00 i'm a syslog!
{job="syslog", instance="host1"} 00:00:02 i'm a syslog!
{job="syslog", instance="host2"} 00:00:01 i'm a syslog! <- Accepted, this is a new stream!
```
@@ -50,6 +50,14 @@ But what if the application itself generated logs that were out of order? Well,

It's also worth noting that the batching nature of the Loki push API can produce out-of-order errors that are really false positives. (For example, if a batch partially succeeded and is sent again, the entries that previously succeeded return an out-of-order error, while any new entries are accepted.)
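
The per-stream ordering rule from the syslog examples above can be sketched in a few lines of Python. This is only an illustration of the rule as this page describes it, not Loki's implementation: the `OrderingChecker` class, its `accept` method, and the integer timestamps (seconds) are hypothetical names and simplifications.

```python
from typing import Dict, FrozenSet, Tuple

# A stream is identified by its full label set (assumption for this sketch).
StreamKey = FrozenSet[Tuple[str, str]]


class OrderingChecker:
    """Hypothetical helper: tracks the newest timestamp seen per stream."""

    def __init__(self) -> None:
        self._newest: Dict[StreamKey, int] = {}

    def accept(self, labels: Dict[str, str], timestamp: int) -> bool:
        """Return True if the entry is in order for its own stream."""
        key = frozenset(labels.items())
        newest = self._newest.get(key)
        if newest is not None and timestamp < newest:
            return False  # older than the newest entry in this stream
        self._newest[key] = timestamp
        return True


checker = OrderingChecker()
results = [
    checker.accept({"job": "syslog"}, 0),
    checker.accept({"job": "syslog"}, 2),
    checker.accept({"job": "syslog"}, 1),  # same stream, older: rejected
    checker.accept({"job": "syslog", "instance": "host2"}, 1),  # new stream
]
```

Because the check is keyed on the whole label set, adding a label such as `instance` creates a separate stream with its own ordering.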

## Use the `snappy` compression algorithm

`snappy` is currently the Loki compression algorithm of choice. It compresses and decompresses much faster than `gzip`, but it achieves a lower compression ratio, so it is less efficient in storage. This was an acceptable tradeoff for us.

Grafana Labs found that `gzip` compressed very well but was very slow, which caused slow query responses.

`LZ4` was a good compromise between speed and compression ratio. However, its compressed output was non-deterministic: two ingesters compressing the same data could produce chunks with different checksums, even though both chunks would still decompress back to the same input data. This interfered with syncing chunks to reduce duplicates.
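
To see why differing compressed bytes matter for deduplication, here is a minimal Python sketch. It does not use LZ4 or any Loki code: as a stand-in, it compresses the same input with `zlib` at two different levels, which is an assumption chosen only because it reliably yields different bytes for identical content. The sample log line is taken from the examples earlier on this page.

```python
import hashlib
import zlib

# The same uncompressed input, as two ingesters might hold it.
data = b"{job=\"syslog\"} 00:00:00 i'm a syslog!\n" * 1000

# Stand-in for two ingesters whose compressors emit different bytes.
chunk_a = zlib.compress(data, 1)
chunk_b = zlib.compress(data, 9)

# Both chunks decompress back to the identical input...
same_content = zlib.decompress(chunk_a) == zlib.decompress(chunk_b) == data

# ...but a checksum over the compressed bytes does not match, so a
# checksum-based comparison would treat them as different chunks.
same_checksum = (
    hashlib.sha256(chunk_a).hexdigest() == hashlib.sha256(chunk_b).hexdigest()
)
```

Any deduplication scheme that compares checksums of compressed chunks needs the compressor to produce identical bytes for identical input.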

## Use `chunk_target_size`

Using `chunk_target_size` instructs Loki to try to fill all chunks to a target _compressed_ size of 1.5MB. These larger chunks are more efficient for Loki to process.