#531 Add the info about the new feature to README.

AbsaOSS · Jan 8, 2025 · b031c17 · b031c17
1 parent 1f50835
commit b031c17
Showing 1 changed file with 47 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -237,13 +237,59 @@ Let's take a look on components of a data pipeline in more detail.
 
 ## Pipeline components
 
-A pipeline consists of _sources_, _the metastore_ and _sinks_.
+A pipeline consists of _common options_, _sources_, _the metastore_, _sinks_, and _operations_. Together, these
+definitions form the workflow config. For big pipelines, these definitions can be split among multiple files. Check out
+the `examples/` folder for example workflow definitions. Let's take a look at each section of a workflow separately.
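+
+As a minimal sketch of splitting a workflow across files, HOCON's standard `include` directive can pull shared settings
+into a main file. The file names below are hypothetical, and whether Pramen recommends this or another mechanism for
+multi-file workflows is not stated here.
+
+```hocon
+# main_workflow.conf (hypothetical file name)
+# Pull in settings kept in separate files (hypothetical names).
+include "common.conf"
+include "sources.conf"
+
+pramen {
+  pipeline.name = "CDC PoC"
+}
+```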
 
 Currently there are 3 types of jobs:
 - _Ingestion_ jobs to get data from external sources to the metastore.
 - _Transformation jobs_ to transform data inside the metastore.
 - _Sink_ jobs to send data from the metastore to external systems.
 
+### Common options
+A Pramen pipeline should have several options defined. Here is the minimum configuration. For the list of all options
+and their default values, check out [reference.conf](pramen/core/src/main/resources/reference.conf).
+
+```hocon
+pramen {
+  environment.name = "AWS Glue (DEV)"
+  pipeline.name = "CDC PoC"
+
+  bookkeeping.enabled = true
+  bookkeeping.jdbc {
+    driver = "org.postgresql.Driver"
+    url = "jdbc:postgresql://myhost:5432/pramen_database"
+    user = "postgresql_user"
+    password = "password"
+  }
+  temporary.directory = "s3://bucket/prefix/tmp/"
+}
+```
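+
+The password above is stored in plain text. A possible alternative is to read it from the environment. This sketch
+assumes the config is resolved with the default Typesafe Config behaviour, which falls back to environment variables
+for substitutions. The variable name `PRAMEN_JDBC_PASSWORD` is hypothetical.
+
+```hocon
+pramen.bookkeeping.jdbc {
+  # Resolution fails if PRAMEN_JDBC_PASSWORD (hypothetical name) is not set.
+  password = ${PRAMEN_JDBC_PASSWORD}
+}
+```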
+
+#### Email notifications
+One section of the config defines options for email notifications. You can define the following options:
+```hocon
+mail {
+  # SMTP configuration
+  # Any options from https://javaee.github.io/javamail/docs/api/com/sun/mail/smtp/package-summary.html
+  smtp.host = "smtp.example.com"
+  smtp.port = "25"
+  smtp.auth = "false"
+  smtp.starttls.enable = "false"
+  smtp.EnableSSL.enable = "false"
+  debug = "false"
+  
+  # A custom email sender (optional)
+  send.from = "Pramen <[email protected]>"
+  
+  # Email recipients
+  send.to = "[email protected], [email protected]"
+  
+  # A list of allowed domains (optional)
+  allowed.domains = [ "example.com", "test.com" ]
+}
+```
+
 ### Dates
 Before diving into pipeline definition it is important to understand how dates are handled. Pramen is a batch data
 pipeline manager for input data updates coming from applications which are usually referred to as _source systems_. Pramen is designed