Release CDAP 6.1.2 · cdapio/cdap

Summary

This release focuses primarily on bug fixes and performance improvements. Highlights include:

Performance improvements
- Improve preview performance and limit concurrent preview runs to 10 by default
- Move polling logic to the UI to avoid polling leaks in the Node.js server
- Use batch APIs in the UI to reduce the load on backend services

Pipeline and Plugin fixes
- Support Field Level Lineage for Streaming pipelines
- Improve Field Level Lineage computation algorithm
- Add support for Spark 2.4
- Improve memory consumption during pipeline execution
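
As a rough illustration of the cap on concurrent preview runs listed above, the sketch below uses a bounded semaphore to admit at most 10 runs at a time. It is not CDAP's implementation: the `PreviewLimiter` class and its methods are hypothetical names, and whether CDAP rejects or queues runs beyond the limit is an assumption made here for simplicity (this sketch rejects them).

```python
import threading


class PreviewLimiter:
    """Hypothetical helper that caps the number of in-flight preview runs."""

    def __init__(self, max_concurrent=10):
        # BoundedSemaphore raises an error if released more often than acquired.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_start(self):
        # Non-blocking acquire: returns False when every slot is in use.
        return self._slots.acquire(blocking=False)

    def finish(self):
        # Free a slot once a preview run completes.
        self._slots.release()


limiter = PreviewLimiter(max_concurrent=10)

# Eleven start requests arrive while no run has finished yet.
started = [limiter.try_start() for _ in range(11)]

# One run finishes, so the next request can be admitted.
limiter.finish()
accepted_after_finish = limiter.try_start()
```

With the default of 10, the first ten requests get a slot and the eleventh is turned away until a running preview finishes.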

Added the ability for SparkCompute and SparkSink to record field lineage. (CDAP-15579)
Added support for Spark 2.4. (CDAP-16107)
Added the ability to record field lineage for streaming pipelines. (CDAP-13643)

Fixed a bug that caused errors when Wrangler's parse-as-csv with header was used when reading multiple small files. (CDAP-16002)
Fixed the BigQuery sink to properly allow certain types as clustering fields. (CDAP-16526)
Fixed a bug that would cause zombie processes when using the Remote Hadoop Provisioner. (CDAP-16471)
Fixed a bug where getSchema did not work for database plugins. (CDAP-16472)
Fixed a bug that made the DBSource plugin fail in preview mode. (CDAP-16453)
Fixed a race condition that could cause failures when running Spark programs. (CDAP-16309)

Added an option to skip the header in files in delimited, csv, tsv, and text formats. (CDAP-16517)
Added an option for the database source to replace characters in field names. (CDAP-16525)
Reduced preview startup time by 60%. Also added a limit on the maximum number of concurrent preview runs (10 by default). (CDAP-16308)
Reduced the memory footprint of StructuredRecord, which improves overall memory consumption during pipeline execution. (CDAP-16509)
Introduced a new REST endpoint for fetching the scheduled time for multiple programs. (CDAP-16339)
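
To make the skip-header and field-name options above more concrete, here is a minimal sketch of what such options typically do. It is not the plugin code: `read_delimited`, `sanitize_field_name`, and the default replacement pattern are all hypothetical, chosen only for illustration.

```python
import csv
import io
import re


def read_delimited(files, delimiter=",", skip_header=True):
    """Yield data rows from several in-memory files (hypothetical helper)."""
    for text in files:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter)
        if skip_header:
            # Drop the first line of *each* file, not just the first file.
            next(reader, None)
        yield from reader


def sanitize_field_name(name, pattern=r"[^A-Za-z0-9_]", replacement="_"):
    """Replace characters matching the pattern (an assumed default) in a field name."""
    return re.sub(pattern, replacement, name)


# Two small files that each carry their own header line.
files = ["id,name\n1,a\n", "id,name\n2,b\n"]
rows = list(read_delimited(files))

# A column name containing a space and a symbol.
clean = sanitize_field_name("order id#")
```

Here `rows` contains only the two data rows, and `clean` is `order_id_`. Skipping the header per file matters when many small files are read together, since each file repeats the header.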