Mainframe Condensed data #637
Comments
Hi, no it is not supported at the moment. We might add support, depending on the complexity of the feature. Could you please point me to some documentation with more details about the data format? Specifically:
@yruslan I have a requirement very similar to the original issue, so I'm posting my use-case details here to get your advice on how to proceed.
I have a packaged financial application that maintains a transactional journal in VBVR VSAM dataset(s). The datasets are somewhat large, and the system vendor has implemented a method by which unused sections of the copybook are skipped and not written to the VSAM. My system vendor refers to this as compression, but @manchikalapudi referred to this as "collapsed", which might be a better term.
The data is not compressed with any compression codec. It has also not been implemented using OCCURS DEPENDING ON. Instead, entire sections of the copybook are skipped based on the contents of certain fields, and this can be different on each and every record.
Consider the following copybook (this simplified example has been adapted from our very complex copybook)
TR-LNGTH, TR-ACCT-NUM, TR-EFF-DATE, and TR-COM-DATA are always populated on each and every record.
The TR-MONETARY-DATA group is only populated when RT-TRAN-TYPE is a C or D; otherwise this section is skipped and does not appear in the record (the bytes were skipped and never written to DASD).
The TR-XCHG-DATA group is only populated when TR-MONETARY-DATA.TR-XCHG-IND is "Y"; otherwise this section is skipped and does not appear in the record (the bytes were never written to DASD).
The TR-CAPTURE-DATA group is always populated, but which members it contains depends on the TR-CAPTURE-BIT-CD field. In both cases, we need to extract the raw bits to derive 8 boolean flags. Based on these flags, we would determine which fields to skip. For example, one of the bits controls whether TR-ORIGIN is present, another whether TR-OPER is present, etc. If an individual bit is 0, then that field was never written to DASD.
Our actual use case is a lot more complicated, but I've tried to distill it down to a simpler representative example.
At first I thought that generate_record_bytes might be useful when combined with a row processor. However, I quickly realized that I would need to know how to access your lower-level methods and functions directly so that I could create a custom record parser. I would be very interested in your advice on how to tackle this.
When a section or field is skipped I would like to set its value to null; otherwise I want to assign the value from the conversion code. Essentially, I need to decode sections of the file incrementally, and then use that data to decide whether to skip or process data groups and fields. As I mentioned, this can be different on each and every record, and not all of the controlling variables are in the common (always present) section.
I'm sure you remember that in 2020 you and I collaborated on the VBVR support using a custom record extractor.
Is there such a thing as a custom record parser, where I could write a class to do what I have described?
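To make the incremental decoding described above concrete, here is a minimal sketch in plain Python. It is not Cobrix code and does not use any Cobrix API. The copybook is not reproduced in this thread, so every field width, data type, and bit assignment below is an assumption made for illustration, as are the field names TR-AMOUNT and TR-XCHG-RATE and the choice of the cp037 EBCDIC code page. TR-EFF-DATE and TR-COM-DATA are left out to keep the sketch short.

```python
import struct

EBCDIC = "cp037"  # assumption: text fields use the US EBCDIC code page

# Hypothetical flattened field list; skipped fields stay None (null).
ALL_FIELDS = (
    "TR-LNGTH", "TR-ACCT-NUM", "RT-TRAN-TYPE",
    "TR-AMOUNT", "TR-XCHG-IND", "TR-XCHG-RATE",
    "TR-CAPTURE-BIT-CD", "TR-ORIGIN", "TR-OPER",
)


class Cursor:
    """Reads consecutive fields from the bytes of a single record."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError(f"record truncated at offset {self.pos}")
        self.pos += n
        return chunk

    def text(self, n: int) -> str:
        return self.take(n).decode(EBCDIC).rstrip()

    def int32(self) -> int:
        return struct.unpack(">i", self.take(4))[0]


def decode_record(data: bytes) -> dict:
    """Decode one collapsed record, skipping sections that were never written."""
    cur = Cursor(data)
    rec = dict.fromkeys(ALL_FIELDS)  # every field starts as None

    # Always-present section (widths are assumptions).
    rec["TR-LNGTH"] = struct.unpack(">H", cur.take(2))[0]
    rec["TR-ACCT-NUM"] = cur.text(4)
    rec["RT-TRAN-TYPE"] = cur.text(1)

    # TR-MONETARY-DATA exists only for transaction types C and D.
    if rec["RT-TRAN-TYPE"] in ("C", "D"):
        rec["TR-AMOUNT"] = cur.int32()
        rec["TR-XCHG-IND"] = cur.text(1)
        # TR-XCHG-DATA exists only when the indicator is "Y".
        if rec["TR-XCHG-IND"] == "Y":
            rec["TR-XCHG-RATE"] = cur.int32()

    # TR-CAPTURE-DATA: one flag byte decides which members follow.
    bits = cur.take(1)[0]
    rec["TR-CAPTURE-BIT-CD"] = bits
    if bits & 0x80:  # assumption: high bit controls TR-ORIGIN
        rec["TR-ORIGIN"] = cur.text(2)
    if bits & 0x40:  # assumption: next bit controls TR-OPER
        rec["TR-OPER"] = cur.text(3)
    return rec
```

The point of the sketch is that the read position advances only for fields that were actually written, so each decoded value can decide what is read next. A real solution would need the same capability inside the parser rather than a fixed offset per field.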
Thanks @mark-weghorst for the very detailed description of the issue! I think support can be added to handle such cases. Several ideas come to mind; I will get back to you with a proposal.
Background [Optional]
"Condensed" data refers to data that has been compressed or shortened to take up less space in memory. This is typically done by removing any unnecessary or redundant information from the data. For example, a condensed representation of a series of numbers might store only the differences between each number, rather than the actual values themselves. This can help reduce the amount of memory required to store the data, which can be especially important in systems with limited resources.
Question
Does Cobrix parse Condensed mainframe files?