From 7ba1a939abe20a47cd4b5c49fc3e129ef8787fc3 Mon Sep 17 00:00:00 2001
From: Julie Stickler
Date: Fri, 10 Jan 2025 17:33:30 -0500
Subject: [PATCH] docs: add info about compression

---
 docs/sources/configure/bp-configure.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/docs/sources/configure/bp-configure.md b/docs/sources/configure/bp-configure.md
index 3d00867201815..3edc76a474aef 100644
--- a/docs/sources/configure/bp-configure.md
+++ b/docs/sources/configure/bp-configure.md
@@ -23,14 +23,14 @@ One issue many people have with Loki is their client receiving errors for out of
 
 There are a few things to dissect from that statement. The first is this restriction is per stream. Let’s look at an example:
 
-```
+```bash
 {job="syslog"} 00:00:00 i'm a syslog!
 {job="syslog"} 00:00:01 i'm a syslog!
 ```
 
 If Loki received these two lines which are for the same stream, everything would be fine. But what about this case:
 
-```
+```bash
 {job="syslog"} 00:00:00 i'm a syslog!
 {job="syslog"} 00:00:02 i'm a syslog!
 {job="syslog"} 00:00:01 i'm a syslog!   <- Rejected out of order!
@@ -38,7 +38,7 @@ If Loki received these two lines which are for the same stream, everything would
 
 What can you do about this? What if this was because the sources of these logs were different systems? You can solve this with an additional label which is unique per system:
 
-```
+```bash
 {job="syslog", instance="host1"} 00:00:00 i'm a syslog!
 {job="syslog", instance="host1"} 00:00:02 i'm a syslog!
 {job="syslog", instance="host2"} 00:00:01 i'm a syslog!   <- Accepted, this is a new stream!
@@ -50,6 +50,14 @@ But what if the application itself generated logs that were out of order? Well,
 
 It's also worth noting that the batching nature of the Loki push API can lead to some instances of out of order errors being received which are really false positives. (Perhaps a batch partially succeeded and was present; or anything that previously succeeded would return an out of order entry; or anything new would be accepted.)
 
+## Use `snappy` compression algorithm
+
+`Snappy` is currently Loki's compression algorithm of choice. It is much faster than `gzip`, though it does not compress as well; Grafana Labs considers this an acceptable tradeoff.
+
+Grafana Labs found that `gzip` compressed well but was very slow, which led to slow query responses.
+
+`LZ4` offered a good balance of speed and compression ratio, but its compressed output was non-deterministic: two ingesters compressing the same data could produce chunks with different checksums, even though both chunks decompressed back to the same input data. This interfered with syncing chunks to reduce duplicates.
+
 ## Use `chunk_target_size`
 
 Using `chunk_target_size` instructs Loki to try to fill all chunks to a target _compressed_ size of 1.5MB. These larger chunks are more efficient for Loki to process.
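
---

Supplementary note: the surrounding docs recommend adding a label that is unique per system so each host's logs form their own stream. A minimal sketch of how that might look in a Promtail scrape config — the `host1` value and `/var/log/syslog` path are hypothetical, not part of this patch:

```yaml
# Sketch only: each host sets its own `instance` value, so its entries land
# in a distinct stream and cannot be rejected against another host's timestamps.
scrape_configs:
  - job_name: syslog
    static_configs:
      - targets: [localhost]
        labels:
          job: syslog
          instance: host1            # unique per system
          __path__: /var/log/syslog  # illustrative path
```

In practice the `instance` value would be templated per host (for example from the hostname) rather than hard-coded.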
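Supplementary note: the compression and chunk-size guidance added by this patch corresponds to two ingester settings in the Loki configuration. A hedged sketch — values are illustrative and defaults may differ between Loki versions:

```yaml
# Sketch only: select snappy chunk compression and a ~1.5MB compressed
# chunk target, per the guidance added in this patch.
ingester:
  chunk_encoding: snappy       # other options include none, gzip, lz4
  chunk_target_size: 1572864   # 1.5 * 1024 * 1024 bytes
```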