VESvault                                                        J. Zubov
Draft Document                                             VESvault Corp
Category: Security                                             July 2023

                         VESencrypt Data Entry

Abstract

The VESencrypt Data Entry format provides multiple options for balancing between the security level and the size of the ciphertext, distinguishing between encrypted and unencrypted data, and handling encrypted data corruption.

Status of This Memo

This document is an internal material of VESvault Corp.

Copyright Notice

Copyright (c) 2023 VESvault Corp

1. Introduction

VESencrypt is a suite of transparent proxy services over various application protocols that perform encryption and decryption of data intended to be stored at rest by the corresponding application server. VESencrypt uses VES as the encryption key management infrastructure.

VESencrypt Data Entry is a format for an encrypted form of an arbitrary block of generic binary data that meets the following criteria:

* Distinguish between a VESencrypt Data Entry and a block of unencrypted data with sufficient practical reliability, thus allowing a seamless introduction of data encryption without disruptions to any unencrypted data pre-existing on the application server;
* Use an ASCII compatible or binary format for the VESencrypt Data Entry, depending on the specifics of the application server storage;
* Provide an option to store a specific number of leading and trailing characters of the original data block in an unencrypted form, either as a part of the VESencrypt Data Entry or as a separate satellite entry, for the purpose of a preview and indexed search;
* Provide an option for including a data integrity check in the entry;
* Provide an option to use either deterministic encryption (one to one mapping between encrypted entries and unencrypted data blocks), or non-deterministic encryption with an arbitrary seed size;
* Provide an option to gracefully handle truncation of trailing bytes of the VESencrypt Data Entry;
* Provide an option of incorporating a variable length padding prior to encrypting the
  data block to mitigate a data length side channel;
* Provide an option to use certain compression algorithms on the data block prior to encryption.

2. VESencrypt Data Entry Structure

A VESencrypt Data Entry MUST comply with the following format (here "quotes" denote literal values, whitespace denotes concatenation, [brackets] denote optional elements):

   [lead_chars] "$" "ve" "$" flag_char [seed] "$" ciphertext "$" [trail_chars]

lead_chars
   Optional unencrypted leading characters of the data block, for preview and indexing purposes. Maximum length is 16 bytes. SHOULD NOT include non-printable characters {0x00-0x1f, 0x7f}. MUST NOT include a "$" character.

"ve"
   The VESencrypt magic signature.

flag_char
   An ASCII character whose lower 6 bits are treated as flags, see Section 3. The upper 2 bits SHOULD be 01 or 00, chosen so that the resulting character is printable.

seed
   An optional seed for non-deterministic encryption. Maximum length is 32 bytes. Although treated as binary, MUST NOT include any characters outside of the RFC 4648 Section 5 dictionary.

ciphertext
   The ciphertext for the encrypted original data block, except for bytes supplied in lead_chars and trail_chars. If the BIN flag is set (Section 3), a binary ciphertext that may contain any byte values, including a "$" character. If the BIN flag is not set, a Base64 encoded ciphertext according to RFC 4648 Section 5.

trail_chars
   Optional unencrypted trailing characters of the data block, for preview and indexing purposes. Maximum length is 16 bytes. SHOULD NOT include non-printable characters {0x00-0x1f, 0x7f}. MUST NOT include a "$" character.

Since the ciphertext may contain a "$" character when the BIN flag is set, a parser should assume that trail_chars starts after the last occurrence of the "$" character in the VESencrypt Data Entry, provided that this occurrence is at least the 4th "$" from the beginning. The latter condition is needed to properly handle possible truncation.

3. Flags

The lower 6 bits of the flag_char byte are to be interpreted as follows:

   RSV1 RSV2 COMP PAD MAC BIN

RSV1, RSV2
   Reserved, MUST be 0. In the current revision the VESencrypt Data Entry MUST be considered invalid if any of those bits is set to 1.

COMP
   The data was compressed before the encryption. The first byte of the plaintext identifies the compression algorithm. The rest of the plaintext MUST be handled according to that compression algorithm, which is specified in a separate document.

PAD
   The data was padded before the encryption. The first byte of the plaintext specifies the number of trailing bytes to be removed from the end of the plaintext once it has been decrypted. If combined with COMP, the padding is applied on top of the compression.

MAC
   The ciphertext includes a Message Authentication Code, see Section 4.

BIN
   The ciphertext is provided in a binary form, see Section 2.

4. Encryption

The encryption is performed by a VESencrypt Proxy on specific data blocks passed by a client to the Application Server. Which data needs to be encrypted, and the corresponding options, are determined by the VESencrypt Proxy based on the application protocol specifics and supplied settings.

The following values are prerequisites for the Encryption:

data_block
   The block of data to be encrypted into a VESencrypt Data Entry, treated as binary;

key
   The binary encryption key, retrieved from the VESencrypt Profile;

profile_seed
   The initial seed value, retrieved from the VESencrypt Profile.

The encryption process SHOULD proceed as follows:

* If the settings require, identify lead_chars and trail_chars in the data_block, see Section 2. Prepare lead_chars and trail_chars to be passed unencrypted in the VESencrypt Data Entry, and strip these values from the data_block.
* If the data_block size is 0, the Proxy MAY decide to bypass the further encryption steps and pass the unencrypted value to the Application Server.
* If the settings require, apply compression and/or padding to the data_block, see Section 3.
* If the settings require, generate a random seed byte sequence (see Section 2). Prepare a seed_hash value (whitespace denotes concatenation, seed might be empty):

   seed_hash = SHA256 ( profile_seed seed )

* According to the settings, use either AES 256 CTR or AES 256 GCM to encrypt the data_block processed through the previous steps. Use the corresponding number of the leading bytes of seed_hash as the initialization vector for the encryption. If GCM is used, append a 16 byte GMAC value to the end of the ciphertext.
* Encode the VESencrypt Data Entry according to Section 2, using the proper flags according to Section 3.

5. Decryption

The decryption is performed by a VESencrypt Proxy on specific data blocks passed by the Application Server to a client. The VESencrypt Proxy SHOULD attempt to decrypt any data blocks that reasonably might have been encrypted using current or past settings of the Proxy.

The following values are prerequisites for the Decryption:

data_entry
   The block of data retrieved from the Application Server that may or may not be a VESencrypt Data Entry;

key
   The binary encryption key, retrieved from the VESencrypt Profile;

profile_seed
   The initial seed value, retrieved from the VESencrypt Profile.

The decryption process SHOULD proceed as follows:

* Attempt to parse the data_entry according to Section 2. If the parsing fails, consider the data_entry unencrypted and skip any further steps. In certain configurations, if the parsing was successful up to the ciphertext part but the following "$" character is missing, the VESencrypt Proxy MAY attempt to salvage the truncated VESencrypt Data Entry and proceed with decrypting the partial ciphertext.
* Check the flag_char. If illegal flags are set (Section 3), abort the decryption and consider the data_entry unencrypted.
* If the BIN flag is not set, decode the ciphertext into binary according to RFC 4648 Section 5.
  If any illegal characters are detected, abort the decryption and treat the data_entry as unencrypted.
* Prepare a seed_hash value (whitespace denotes concatenation, seed might be empty):

   seed_hash = SHA256 ( profile_seed seed )

* If the MAC flag is set, use the trailing 16 bytes of the ciphertext as GMAC with AES 256 GCM decryption. Otherwise, use AES 256 CTR. If the GMAC validation has failed and the entry was not truncated, the VESencrypt Proxy MAY either return a failure or consider the data_entry unencrypted.
* If the PAD flag is set, remove padding from the decrypted data (Section 3). If the padding length is inconsistent with the data length, abort the decryption and treat the data_entry as unencrypted.
* If the COMP flag is set, attempt decompression (Section 3). If the algorithm is not supported or the decompression fails, abort the decryption and treat the data_entry as unencrypted.
* Prepend the lead_chars and append the trail_chars to the decrypted data. The resulting decrypted value is to be passed to the client.

6. Database Proxy Design Considerations

The following considerations apply to a VESencrypt Proxy design for any SQL database protocol:

* The database architect needs to identify specific columns that need to be transparently encrypted by the proxy;
* The solution that uses the database needs to be analyzed to identify any queries that involve expressions that depend on the encrypted columns, any stored procedures, triggers, views etc., and to assess the effect of the column encryption on those components;
* The width and type (char/binary) of the columns that are intended to be encrypted may need to be adjusted to accommodate VESencrypt Data Entries;
* If an encrypted column needs to be searchable by its first/last characters, a separate Preview Column should be considered. The Proxy automatically populates this column with the configured number of first/last unencrypted characters, and the column is fully indexed.

6.1. Write Queries (INSERT / REPLACE / UPDATE)

The Proxy MUST process all Write Queries to transform any literals assigned to the encrypted columns into VESencrypt Data Entries.

An example query:

   UPDATE cards SET card_num='1234 5678 8765 4321' WHERE id=123;

Assuming the cards.card_num column is configured to be encrypted with the first 2 and last 4 characters as a preview and a 3 byte seed, the query rewritten by the Proxy will look like:

   UPDATE cards SET card_num='12$ve$@fG0$fwpwBL5w8gEPPN0yLA$4321' WHERE id=123;

If a separate preview column is configured for card_num, the query will be rewritten by the Proxy as:

   UPDATE cards SET card_num='$ve$@fG0$w5JhJiTfoSktGLQDEFZXwekXGA$',card_num_preview='124321' WHERE id=123;

6.2. SELECT Queries

In SELECT queries, the Proxy SHOULD replace any mention of an encrypted column in conditions with the corresponding preview column, if there is one configured. For example,

   SELECT * FROM cards WHERE card_num LIKE '%4321';

is to be rewritten by the Proxy as

   SELECT * FROM cards WHERE card_num_preview LIKE '%4321';

In the responses, any column that is configured to be encryptable SHOULD be passed through the Decryption process (Section 5). Since value truncation is a realistic scenario in a database, the Proxy SHOULD make a reasonable attempt to salvage any truncated VESencrypt Data Entries.

6.2.1. Auto-Encrypt Option

The Proxy MAY be designed to allow an auto-encrypt option. The auto-encrypt option specifies one or more key columns for the particular database table. When a SELECT query on an auto-encrypt table returns a row that contains the values of the key columns and has unencrypted content in any encryptable columns for the table, the Proxy issues an UPDATE statement in the background to put encrypted content into the encryptable columns of that row, matching the row by the key columns.
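
As an illustration of how a Proxy might detect unencrypted content in an encryptable column, the following Python sketch parses a value according to Section 2 and reports whether it looks like a VESencrypt Data Entry. It is a minimal sketch under stated assumptions, not a reference implementation. The bit positions of the flags are an assumption (the flags are taken in the order listed in Section 3, most significant bit first), and the names parse_entry, ParsedEntry and needs_auto_encrypt are hypothetical and not part of this specification.

```python
import re
from typing import NamedTuple, Optional

# ASSUMPTION: flags occupy the lower 6 bits in the order listed in
# Section 3, most significant bit first. The document does not state this.
FLAG_RSV1, FLAG_RSV2 = 0x20, 0x10
FLAG_COMP, FLAG_PAD, FLAG_MAC, FLAG_BIN = 0x08, 0x04, 0x02, 0x01

# RFC 4648 Section 5 (URL-safe Base64) alphabet, without padding.
_B64URL = re.compile(rb"[A-Za-z0-9_-]*\Z")


class ParsedEntry(NamedTuple):
    lead: bytes        # unencrypted leading characters
    flags: int         # lower 6 bits of flag_char
    seed: bytes
    ciphertext: bytes  # still Base64 encoded unless the BIN flag is set
    trail: bytes       # unencrypted trailing characters
    truncated: bool    # True if the closing "$" is missing


def parse_entry(value: bytes) -> Optional[ParsedEntry]:
    """Return the parsed fields, or None if value is not a Data Entry."""
    start = value.find(b"$")
    if start < 0 or start > 16:          # lead_chars is at most 16 bytes
        return None
    if value[start:start + 4] != b"$ve$" or len(value) < start + 5:
        return None
    flags = value[start + 4] & 0x3F
    if flags & (FLAG_RSV1 | FLAG_RSV2):  # reserved bits MUST be 0
        return None
    seed_end = value.find(b"$", start + 5)
    if seed_end < 0 or seed_end - (start + 5) > 32:
        return None
    seed = value[start + 5:seed_end]
    if not _B64URL.match(seed):
        return None
    body = value[seed_end + 1:]
    # A binary ciphertext may contain "$", so use the last "$" in that case.
    close = body.rfind(b"$") if flags & FLAG_BIN else body.find(b"$")
    if close < 0:
        ciphertext, trail, truncated = body, b"", True
    else:
        ciphertext, trail, truncated = body[:close], body[close + 1:], False
    if len(trail) > 16 or b"$" in trail:
        return None
    if not flags & FLAG_BIN and not _B64URL.match(ciphertext):
        return None
    return ParsedEntry(value[:start], flags, seed, ciphertext, trail, truncated)


def needs_auto_encrypt(value: bytes) -> bool:
    """True if a non-empty column value does not parse as a Data Entry."""
    return len(value) > 0 and parse_entry(value) is None
```

With this sketch, the rewritten literal from Section 6.1 parses into lead "12", seed "fG0" and trail "4321", while the original card number is reported as needing encryption. Empty values are skipped here, in line with the option in Section 4 to bypass encryption of empty data blocks.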

The auto-encrypt option provides a transparent way of encrypting previously existing content in the table by simply running SELECT statements through the Proxy.

Depending on the database protocol, the auto-encrypt approach may involve buffering the data on SELECT queries that return a large number of rows prior to sending the UPDATE queries. In such cases, the Proxy MAY limit the number and/or total size of auto-update queries per SELECT.

6.3. DELETE Queries

As a safeguard, the Proxy MAY be designed and configured to reject any DELETE queries that involve expressions on encrypted fields, because such queries may cause unexpected results.

6.4. Stored Procedures and Functions

The Proxy MAY be designed to allow encrypting specific literal arguments passed to specific stored procedure and function calls, and to decrypt values returned by certain functions via SELECT statements.

6.5. Other SQL Queries

This document does not cover rewriting any SQL queries besides those mentioned above. Future revisions may identify and address more cases.

7. IANA Considerations

Not applicable for private protocols.

Author's Address

Jim Zubov
VESvault Corp

Email: jz@vesvault.com
URI: https://vesvault.com