Data Dictionary File Sizing

James Zukowski · November 08, 2022, 05:24:06 PM

When the Data Dictionary creates a file, there is a rather significant "buffer" of extra record space added. Is there any particular reason the number is that large? Does it have any real effect on the space used by a VLR or EFF file's records?
More curiosity than anything else...
Thanks!

Mike King · November 08, 2022, 09:49:06 PM

When a file is created, the system attempts to use the key size(s) and record size to estimate a buffer size.

The file itself (VLR or EFF) consists of a header (512 bytes) followed by fixed size blocks which can contain keys and/or records. When the file is created, it will contain the 512 byte header, a block for space management, and at least one block for the data dictionary. When the first record is added, the system will add the first key block and the first data block -- meaning a 1 record file will require 4 blocks plus the 512 byte file header, or typically 16.5K based on 4K blocks.
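
To make the arithmetic above concrete, here is a small Python sketch (not PxPlus code). The function name and defaults are made up for illustration; only the 512 byte header, the four blocks, and the 4K block size come from the description above.

```python
HEADER_BYTES = 512  # file header size as described above


def min_file_size(block_size: int = 4096, blocks: int = 4) -> int:
    """Header plus the space-management, data dictionary, key, and data blocks."""
    return HEADER_BYTES + blocks * block_size


size = min_file_size()
print(size, size / 1024)  # 16896 16.5
```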

As records are added to the file, the data will be placed into data blocks which have enough room for the record data. To allow for record expansion, only blocks which have more free space than the threshold percentage (default 10%) are considered. So, for example, assuming you have a 4K block size, the system will not consider blocks with less than about 400 bytes free for new records, based on a 10% threshold.
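
The threshold rule can be sketched in Python roughly as follows. This is an illustration only: the function name is hypothetical, and treating the threshold purely as an eligibility test (rather than space kept in reserve after the insert) is an assumption, not a statement about the actual PxPlus internals.

```python
def block_accepts_new_record(free_bytes: int, record_bytes: int,
                             block_size: int = 4096,
                             threshold_pct: float = 10.0) -> bool:
    """Return True if a block would be considered for a new record.

    Assumption: a block qualifies when its free space exceeds the
    threshold and the record itself fits in that free space.
    """
    min_free = block_size * threshold_pct / 100  # 409.6 bytes for a 4K block
    return free_bytes > min_free and free_bytes >= record_bytes


print(block_accepts_new_record(400, 100))  # below the ~400 byte threshold
print(block_accepts_new_record(500, 100))  # above the threshold, record fits
```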

Data records, which vary in size, will be packed into the blocks using only the space required for the data.

File keys will also be placed in blocks, where the number of keys per block is determined by the key size plus 5 bytes for the record pointer.
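
As a rough Python illustration of the key-block math: the function below is hypothetical and ignores any per-block overhead (which is not described here), so it should be read as an upper bound rather than the exact count.

```python
POINTER_BYTES = 5  # record pointer size per key, as stated above


def keys_per_block(key_size: int, block_size: int = 4096) -> int:
    """Upper bound on keys per block, assuming no per-block overhead."""
    return block_size // (key_size + POINTER_BYTES)


print(keys_per_block(20))  # a 20 byte key takes 25 bytes per entry
```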

Hopefully this answers your question.

James Zukowski · November 09, 2022, 08:46:35 AM

Mike,
Thanks for the explanation.
So, when I define a Data Dictionary file that adds up to 140 bytes and the DD creates it with a Maximum Record Size of 384, it doesn't really mean that much, aside from my being able to throw a lot more data into the record than I originally intended before generating an error.

Mike King · November 09, 2022, 11:24:00 AM

Technically, BB since its origin didn't impose field lengths, only record size lengths. This is because each field was delimited by a field separator character as opposed to having a preset position in the record.

This remains the same today, meaning if you declare a record with, say, a company name field of 30 characters, your application could write 31 or more characters into the field as long as the total record length remained below the record size defined (basically stealing the extra bytes from the space set aside for other fields).
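
A minimal Python sketch of the idea (not PxPlus code): fields are joined with a separator and only the total length is checked. The separator byte, the function name, and counting the separators toward the record length are all assumptions made for illustration.

```python
SEP = b"\x8a"  # assumed field separator byte, for illustration only


def pack_record(fields: list[str], max_record_size: int) -> bytes:
    """Join fields with a separator; only the total length is enforced."""
    record = b"".join(f.encode("ascii") + SEP for f in fields)
    if len(record) > max_record_size:
        raise ValueError(f"record is {len(record)} bytes, limit is {max_record_size}")
    return record


# A 35 character name in a field "declared" as 30 still fits in a 384 byte record.
rec = pack_record(["A" * 35, "12345"], max_record_size=384)
print(len(rec))  # 42
```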

In PxPlus VLR files, the record size is really just used to set up the buffers used to hold the records while being read/written. The active data in this record buffer is what is read from/written to the data blocks -- so overestimating the record size has no real impact on the actual file size. The file size is really impacted by the actual record size required to hold the data fields.

Now there is one 'caveat' to this. When using the DD and you plan to use ODBC to access your data, make sure you define the correct maximum field size. If you declared a field as 30 bytes long but actually wrote 35, most programs (e.g. Excel) will use the declared field length to define a buffer to hold the data and will get an error on the record where the data exceeds the defined length.
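
One way to spot this kind of problem before an ODBC client does is to compare the data against the declared lengths. The Python sketch below is a generic illustration; the function and field names are hypothetical and it does not use any PxPlus or ODBC API.

```python
def oversized_fields(record: dict[str, str], declared: dict[str, int]) -> dict[str, int]:
    """Return {field name: actual length} for fields longer than declared."""
    return {
        name: len(value)
        for name, value in record.items()
        if len(value) > declared[name]
    }


declared = {"company_name": 30, "city": 20}          # hypothetical definition
record = {"company_name": "X" * 35, "city": "Toronto"}
print(oversized_fields(record, declared))  # {'company_name': 35}
```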

Lastly, if desired, you can enable data verification in PxPlus, where it will verify that the data you are writing to the file adheres to the field definition. This would include length, type (number/string), and even validation rules.

James Zukowski · November 09, 2022, 11:53:10 AM

That clears up a lot. Thanks.
I guess the same sort of applies to FLRs, though they're probably becoming rarer over time. (They seem to have the extra 'buffer' space, as well.)

Mike King · November 09, 2022, 01:38:14 PM

For fixed length records it's a bit different. We still create blocks of keys, but the records themselves are not put into blocks. Basically, when the file is created we have a 512 byte file header and at least one key block. Additional blocks can also be created to hold the data dictionary. The first record is then created following these blocks, and as additional key blocks are needed they are intermixed with the fixed length records.

Free space on a FLR is reclaimed using a linked list of deleted records.
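
The general technique of reclaiming fixed-size slots through a linked list of deleted records can be sketched in Python as below. This is a generic in-memory free list, not the actual FLR file layout; the class name and the last-deleted, first-reused order are assumptions.

```python
class FixedLengthStore:
    """Fixed-size slots with a linked list threading through deleted ones."""

    def __init__(self):
        self.slots = []        # ("used", record) or ("free", next_free_index)
        self.free_head = None  # index of the most recently deleted slot

    def add(self, record):
        if self.free_head is not None:
            index = self.free_head
            self.free_head = self.slots[index][1]  # follow the link
            self.slots[index] = ("used", record)
            return index
        self.slots.append(("used", record))        # no free slot: extend
        return len(self.slots) - 1

    def delete(self, index):
        if self.slots[index][0] != "used":
            raise ValueError("slot is already free")
        self.slots[index] = ("free", self.free_head)  # link to previous head
        self.free_head = index


store = FixedLengthStore()
for name in ("a", "b", "c"):
    store.add(name)
store.delete(1)
print(store.add("d"))  # reuses slot 1 instead of growing the store
```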
