diff --git a/README.md b/README.md
index d7684fba..dbf4cf43 100644
--- a/README.md
+++ b/README.md
@@ -158,8 +158,11 @@ Each object is represented as a simple Python `dict`, with the following propert
 |`bottom`| Distance of bottom of the character from top of page.|
 |`doctop`| Distance of top of character from top of document.|
 |`matrix`| The "current transformation matrix" for this character. (See below for details.)|
-|`stroking_color`|The color of the character's outline (i.e., stroke), expressed as a tuple, with length determined by the “color space” used (1 for grayscale, 3 for RBG, 4 for CMYK).|
-|`non_stroking_color`|The character's interior color.|
+|`ncs`|TKTK|
+|`stroking_pattern`|TKTK|
+|`non_stroking_pattern`|TKTK|
+|`stroking_color`|The color of the character's outline (i.e., stroke). See [docs/colors.md](docs/colors.md) for details.|
+|`non_stroking_color`|The character's interior color. See [docs/colors.md](docs/colors.md) for details.|
 |`object_type`| "char"|
 
 __Note__: A character’s `matrix` property represents the “current transformation matrix,” as described in Section 4.2.2 of the [PDF Reference](https://ghostscript.com/~robin/pdf_reference17.pdf) (6th Ed.). The matrix controls the character’s scale, skew, and positional translation. Rotation is a combination of scale and skew, but in most cases can be considered equal to the x-axis skew. The `pdfplumber.ctm` submodule defines a class, `CTM`, that assists with these calculations. For instance:
@@ -186,8 +189,8 @@ my_char_rotation = my_char_ctm.skew_x
 |`bottom`| Distance of bottom of the line from top of page.|
 |`doctop`| Distance of top of line from top of document.|
 |`linewidth`| Thickness of line.|
-|`stroking_color`|The color of the line, expressed as a tuple, with length determined by the “color space” used (1 for grayscale, 3 for RBG, 4 for CMYK).|
-|`non_stroking_color`|The non-stroking color specified for the line’s path.|
+|`stroking_color`|The color of the line. See [docs/colors.md](docs/colors.md) for details.|
+|`non_stroking_color`|The non-stroking color specified for the line’s path. See [docs/colors.md](docs/colors.md) for details.|
 |`object_type`| "line"|
 
 #### `rect` properties
@@ -205,8 +208,8 @@ my_char_rotation = my_char_ctm.skew_x
 |`bottom`| Distance of bottom of the rectangle from top of page.|
 |`doctop`| Distance of top of rectangle from top of document.|
 |`linewidth`| Thickness of line.|
-|`stroking_color`|The color of the rectangle's outline, expressed as a tuple, with length determined by the “color space” used (1 for grayscale, 3 for RBG, 4 for CMYK).|
-|`non_stroking_color`|The rectangle’s fill color.|
+|`stroking_color`|The color of the rectangle's outline. See [docs/colors.md](docs/colors.md) for details.|
+|`non_stroking_color`|The rectangle’s fill color. See [docs/colors.md](docs/colors.md) for details.|
 |`object_type`| "rect"|
 
 #### `curve` properties
@@ -226,8 +229,8 @@ my_char_rotation = my_char_ctm.skew_x
 |`doctop`| Distance of curve's highest point from top of document.|
 |`linewidth`| Thickness of line.|
 |`fill`| Whether the shape defined by the curve's path is filled.|
-|`stroking_color`|The color of the curve's outline, expressed as a tuple, with length determined by the “color space” used (1 for grayscale, 3 for RBG, 4 for CMYK).|
-|`non_stroking_color`|The curve’s fill color.|
+|`stroking_color`|The color of the curve's outline. See [docs/colors.md](docs/colors.md) for details.|
+|`non_stroking_color`|The curve’s fill color. See [docs/colors.md](docs/colors.md) for details.|
 |`object_type`| "curve"|
 
 #### Derived properties
diff --git a/docs/colors.md b/docs/colors.md
new file mode 100644
index 00000000..d39f227b
--- /dev/null
+++ b/docs/colors.md
@@ -0,0 +1,41 @@
+# Colors
+
+In the PDF specification, as well as in `pdfplumber`, most graphical objects can have two color attributes:
+
+- `stroking_color`: The color of the object's outline
+- `non_stroking_color`: The color of the object's interior, or "fill"
+
+In the PDF specification, colors have both a "color space" and a "color value".
+
+## Color Spaces
+
+Valid color spaces are grouped into three categories:
+
+- Device color spaces
+    - `DeviceGray`
+    - `DeviceRGB`
+    - `DeviceCMYK`
+- CIE-based color spaces
+    - `CalGray`
+    - `CalRGB`
+    - `Lab`
+    - `ICCBased`
+- Special color spaces
+    - `Indexed`
+    - `Pattern`
+    - `Separation`
+    - `DeviceN`
+
+To read more about the differences between those color spaces, see section 4.5 [here](https://ghostscript.com/~robin/pdf_reference17.pdf).
+
+`pdfplumber` aims to expose those color spaces as `scs` (stroking color space) and `ncs` (non-stroking color space), represented as a __string__.
+
+__Caveat__: The only information `pdfplumber` can __currently__ expose is the non-stroking color space for `char` objects. The rest (stroking color space for `char` objects and either color space for the other types of objects) will require a pull request to `pdfminer.six`.
+
+## Color Values
+
+The color value determines *what specific color* in the color space should be used. With the exception of the "special color spaces," these color values are specified as a series of numbers. For `DeviceRGB`, for example, the color values are three numbers, representing the intensities of red, green, and blue.
+
+In `pdfplumber`, those color values are exposed as `stroking_color` and `non_stroking_color`, represented as a __tuple of numbers__.
+
+The pattern specified by the `Pattern` color space is exposed via the `non_stroking_pattern` and `stroking_pattern` attributes.
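
The color attributes described in `docs/colors.md` can be illustrated with a short sketch. It does not call `pdfplumber`; it works on a plain `dict` shaped like the objects in the README tables. The helper names `guess_device_color_space` and `describe_colors` are hypothetical, and the sample `char` values are made up for illustration. The length-to-space mapping (1 for grayscale, 3 for RGB, 4 for CMYK) follows the README wording this diff removes and is only a heuristic, since other color spaces (for example `CalRGB`) can also use three components.

```python
from typing import Optional

# Assumption: tuple length hints at a device color space, as in the older
# README text (1 = grayscale, 3 = RGB, 4 = CMYK). This is a heuristic only.
_LENGTH_TO_SPACE = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}


def guess_device_color_space(color) -> Optional[str]:
    """Hypothetical helper: guess a device color space from a color tuple."""
    if not isinstance(color, (tuple, list)):
        return None  # missing color, or a value this sketch does not handle
    return _LENGTH_TO_SPACE.get(len(color))


def describe_colors(obj: dict) -> dict:
    """Hypothetical helper: summarize the color-related keys of one object."""
    return {
        # `ncs` is only expected on `char` objects, per the caveat above.
        "ncs": obj.get("ncs"),
        "stroking_guess": guess_device_color_space(obj.get("stroking_color")),
        "non_stroking_guess": guess_device_color_space(obj.get("non_stroking_color")),
    }


# A made-up dict shaped like a `char` object; not output from a real PDF.
char = {
    "object_type": "char",
    "ncs": "DeviceRGB",
    "stroking_color": None,
    "non_stroking_color": (1, 0, 0),
}
summary = describe_colors(char)
print(summary)
```

When `ncs` is present it should be preferred over the length-based guess, which cannot distinguish color spaces that share a component count.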