Kdbutl file utility

kdbutl – verify, recover, and optimize KitaroDB files

In this section…
kdbutl usage
Arguments
Discussion
Verifying file integrity
Using the Information Advisor
Other options
KitaroDB ISAM errors

The kdbutl utility enables you to detect and possibly fix file and data corruption (which is rare and is generally due to hardware failure). Kdbutl is available with the Win32 KitaroDB distribution. To use it with a Windows Store application, you must therefore install the Win32 version of KitaroDB along with the WinRT version, and you must copy the database files for the Windows Store application to a non-Windows-Store location before running kdbutl.

KitaroDB databases are built on indexed sequential access method (ISAM) files, and a database consists of two files: an index file and a data file. (See KitaroDB basics for more information.) File corruption occurs when the control information in the index file doesn’t correspond to the records or key-value pairs in the data file. Data corruption occurs when records or key-value pairs in the data file are unexpectedly overwritten or inserted. File corruption can be detected by running the kdbutl utility with the verify (-v) option and corrected by running kdbutl with the re-index (-r) option. But when data corruption is detected, kdbutl fails, usually with a BADSEG error, and the file remains unchanged. In this case, you will be prompted to run kdbutl again with the -a option. Any record or key-value pair that cannot be processed will be written to an exception file with the extension .exc in the current directory.

Note that when recovering data, we recommend that you first make a copy of the files (both index and data files). Then, if recovery should fail with one method, you can try other methods. If one or more data records or key-value pairs near the beginning of a file are corrupted, kdbutl can skip them (these bad record segments are automatically sent to the exception file as they are found in the data file) and continue recovering the rest of the file. You may be able to reconstruct the lost records or key-value pairs by examining the exception file.
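The backup step recommended above can be scripted. The following is a minimal Python sketch, not part of KitaroDB: the function name back_up_database and the backup directory are hypothetical, and both file paths are passed in explicitly because the name of the data file is not assumed here.

```python
import shutil
from pathlib import Path


def back_up_database(index_file, data_file, backup_dir):
    """Copy the index and data files into backup_dir and return the copies."""
    sources = [Path(index_file), Path(data_file)]

    # Check both files first so a missing file never leaves a partial backup.
    for source in sources:
        if not source.is_file():
            raise FileNotFoundError(f"missing database file: {source}")

    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copies = []
    for source in sources:
        # copy2 keeps timestamps, which helps when comparing files later.
        copies.append(Path(shutil.copy2(source, backup_dir / source.name)))
    return copies
```

Run a recovery attempt against the original files only after the copies exist. If one method fails, restore from the backup before trying another.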

kdbutl usage

In the following, variables that either represent or should be replaced with specific data are in italic type. Optional arguments are enclosed in [italic square brackets]. Arguments that can be repeated one or more times are followed by an ellipsis…

kdbutl -r [-re-index_options] [-other_options] filename[ …]

or

kdbutl -v [-verify_options] [-other_options] filename[ …]

or

kdbutl -i filename

Arguments

-r   Re-index the specified KitaroDB file.

re-index_options

(optional) One or more of the following options for re-indexing:

o key# Order database file by the specified key during the re-index operation.
p density Pack index blocks to the specified percentage for each defined key during the re-index operation. Note that -p does not change the file density setting.

-v  Verify the specified file.

verify_options

(optional) One or more of the following options for verifying the file(s):

b Report bucket usage statistics and freelist usage.
i Use the Information Advisor to provide a full analysis of file organization and content. (See Discussion below.)
l Coordinated access locking. (Default)
le Exclusive access.
ln No locking (for read-only files only).
n Bypass use of the data file and verify only the index during the verify operation. Do not use this option when file integrity is in question.
z Scan for all problems rather than stopping at the first detected problem.

-i  Use the Information Advisor.

other_options

(optional) One or more of the following options:

h or ? Display help screen.
mlevel# Specify a message level that defines the amount of information displayed during an operation, where level# is a value from 0 to 3. (See Discussion below.)
t directory Specify a temporary file directory.
% Display a running status (0 to 100) to indicate the percentage completed by the operation.

filename

The name of the KitaroDB file(s) that you want to re-index or whose index structure you want to verify. The default extension is .kdb.

Discussion

The KitaroDB File Maintenance Utility (kdbutl) can perform one of the following functions:

  • Re-index a KitaroDB file
  • Verify the integrity of a KitaroDB file
  • Use the Information Advisor

Re-indexing a KitaroDB file

Re-indexing causes index blocks to be packed and arranged adjacent to related index blocks to enhance lookup performance. If you do not specify index density, the following default packing percentages are used:

Page size          Packing percentage
1024               80%
2048               90%
4096               95%
8192 and higher    97%

If at least three key entries cannot fit into the space that’s left, the default percentage is reduced to 80% for all page sizes. If more (or less) empty index space is desired, specify the packing density explicitly with -p.
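As an illustration of the default packing rule, here is a small Python sketch. It is not kdbutl's actual implementation. The function name is hypothetical, and the way "the space that's left" is computed (the unpacked share of the page, ignoring any block overhead) is an assumption made only for this example.

```python
# Default packing percentages by page size, as listed in the table above.
DEFAULT_PACKING = {1024: 80, 2048: 90, 4096: 95}


def default_packing_percent(page_size, key_entry_size=None):
    """Return the default packing percentage for an index page size."""
    if page_size >= 8192:
        percent = 97
    elif page_size in DEFAULT_PACKING:
        percent = DEFAULT_PACKING[page_size]
    else:
        raise ValueError(f"no documented default for page size {page_size}")

    if key_entry_size is not None:
        # ASSUMPTION: the space left is the unpacked share of the page,
        # with no allowance for block headers or other overhead.
        free_bytes = page_size * (100 - percent) / 100
        if 3 * key_entry_size > free_bytes:
            # Fewer than three key entries fit, so fall back to 80%.
            percent = 80
    return percent
```

Under these assumptions, a 4096-byte page with 100-byte key entries falls back to 80%, because three entries need 300 bytes and only about 205 bytes are left at 95% packing.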

In addition, the data file can be ordered by a preferred key to maximize sequential read performance. As a file grows, the speed of sequential access to a primary key is greatly reduced due to large file disk seeks. The -o option orders the data in key# order for high-speed sequential file access of the key specified, making it significantly more efficient. Key# must be a valid key number defined by the file.

When ordering data (with -ro filename), kdbutl generates a sort temporary file called filename_is1.257. (For example, if your .is1 file is named armast.ms1, the temporary file is armast_ms1.257.) If this file is left on your system due to abnormal termination of kdbutl, do not remove it. If kdbutl was in the process of writing data records to the .is1 file when termination occurred, the .257 file is required to completely restore your data. All other temporary files created by kdbutl (having the same filename with the extensions .000 – .256 or .258) are short-lived. Assuming these files are not in use, they can be removed without consequences. To resume the sort, use -r again.
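The naming rule above can be expressed as a short Python sketch, which may be useful in a cleanup script that must never delete the .257 file. The helper names are hypothetical, and the sketch assumes temporary file extensions are always three ASCII digits, as in the ranges given above.

```python
from pathlib import Path


def sort_temp_name(data_file):
    """Return the name of the sort temporary file for a data file.

    Follows the pattern described above: armast.ms1 -> armast_ms1.257
    """
    path = Path(data_file)
    extension = path.suffix.lstrip(".")
    return f"{path.stem}_{extension}.257"


def is_short_lived_temp(temp_file):
    """True for the .000 - .256 and .258 temporary files, False otherwise."""
    suffix = Path(temp_file).suffix.lstrip(".")
    if not (len(suffix) == 3 and suffix.isascii() and suffix.isdigit()):
        return False
    number = int(suffix)
    # .257 is the sort file and must be kept, so it is excluded here.
    return number <= 256 or number == 258
```

A cleanup script would remove only the files for which is_short_lived_temp returns True, and only when kdbutl is not running.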

If the data reorganization option (-o key#) is not specified, a new index file is created without altering the existing data file.

When ordering (-o) data, all unused record file address (RFA) space and free space due to record deletions is reclaimed. To explicitly reclaim this space and reduce data file size, we suggest ordering the data periodically.

If the index packing density option (-p density) is not specified, the density defined for that key (or the default density if none is defined) is used.

Verifying file integrity

By default, the verify command (-v) uses the fastest optimized methods to assure file integrity. If kdbutl detects something that would result in a Synergy data access failure, it generates an error and stops the verify operation. With the -z option, kdbutl performs a more linear scan and displays all errors in context, which may reveal the full extent of the problem. The -z option, however, can be more time-consuming, especially with very large non-optimized files. In other words, regardless of how many problems a file contains, -v stops at the first problem detected and generates an error, while -vz detects all the problems that it can but might be somewhat slower.

We recommend running without -z first, and then using -z only on files that show problems.
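The two-pass approach recommended above might be automated as follows. This is a Python sketch under stated assumptions: kdbutl is on the PATH, and the helper name verify_files and its run parameter are hypothetical. Only the -v and -vz forms documented above are used.

```python
import subprocess


def verify_files(filenames, run=subprocess.call):
    """Quick-verify each file, then rerun only the failures with -vz.

    Returns {filename: (quick_status, full_status)}; full_status is None
    when the quick verify succeeded and no second pass was needed.
    """
    results = {}
    for name in filenames:
        # One file per invocation, so the exit status belongs to that file.
        quick_status = run(["kdbutl", "-v", name])
        full_status = None
        if quick_status != 0:
            # The slower linear scan reports every problem it can find.
            full_status = run(["kdbutl", "-vz", name])
        results[name] = (quick_status, full_status)
    return results
```

The run parameter exists so that the command runner can be replaced, for example with a stub during testing.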

Using the Information Advisor

The Information Advisor (the -i option) displays helpful advice based on file organization and content. The identified file conditions can range from high-risk issues that may result in file failure to low-risk, performance-related issues. If the file condition is correctable, the Information Advisor will suggest corrective actions and/or ways to enhance performance. Note that if the Information Advisor has nothing to report, it is not displayed.

The conditions reported are not errors. They point out things that may not otherwise be detectable and offer suggestions that you can choose to ignore or act upon. Having one or more conditions displayed for a file does not mean the file is corrupted.

Use -i filename to quickly check static configuration information, or -vi filename to generate a full analysis based on content.

The -i option (or -vi) reports the following static conditions:

  • Duplicates exceed 80% full on one or more keys. Duplicates that exceed 100% will be denied with an $ERR_FILFUL error.
  • Index freelist overflow. Performance during STORE operations may be suffering.
  • Duplicates ordered at the beginning on one or more keys. Kdbutl performance may suffer.
  • Index depth exceeds 3 on one or more keys. Increasing PAGE size may improve overall performance.
  • Free space exceeds half. The percentage of free space is specified. Reducing file size may improve performance.

The following conditions are only available with -vi:

  • One or more keys exhibit excessive blank duplicates. Change to a replicating null key to improve performance.
  • Index exhibits a low optimization level. Keyed and sequential file access may not be optimal.
  • Data exhibits a low optimization level. Sequential file access may not be optimal.
  • Existing data exhibits some compression benefit. Compressing data will reduce file size and improve performance.

Other options

The level# argument to the -m option can have one of the following values:

0 No output is generated. All errors generated are returned in the form of an exit value.
1 Only errors and necessary output are displayed. (default)
2 Process information is displayed, in addition to errors and necessary output.
3 Verbose key information is displayed if this message level is specified with the -v option. (If -m3 is specified with -r, the message level defaults to -m1.)

To measure the degree of a file’s optimization, specify the -m2 flag on the verify (-v) command. For each key, a line will be displayed that indicates, as a percentage, the optimization level as well as the sequential order of the data file. For example:

Primary key, 751406 blocks (728570 leaf), 19416428 records
Index density 50%, leaf 50%, separator 50%
Optimization: index 44%, data 95%

The index percentage indicates the percentage of on-disk index blocks for the target key that can be accessed quickly from a previous index block with the least amount of disk overhead. The data percentage indicates the percentage of data records in a sorted order giving the least amount of disk overhead when reading sequentially by that key. The significance of these percentages varies depending on file size and hardware configuration. They tend to become more significant as the file size exceeds the available file cache memory on a system. No additional overhead is consumed as a result of getting this information.

To get this optimization information quickly and accurately on a large file, you can also specify the -n verify flag. However, you shouldn’t rely on the verification results, because only the index is verified.
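If you capture the -m2 output in a script, the optimization levels can be extracted with a regular expression. The Python sketch below assumes the output lines look exactly like the example shown above. The function name is hypothetical, and the real output may differ in spacing or wording.

```python
import re

# Matches lines such as: "Optimization: index 44%, data 95%"
OPTIMIZATION_LINE = re.compile(
    r"Optimization:\s*index\s+(\d+)%,\s*data\s+(\d+)%"
)


def parse_optimization(output):
    """Return one (index_percent, data_percent) pair per key reported."""
    return [
        (int(index), int(data))
        for index, data in OPTIMIZATION_LINE.findall(output)
    ]


sample = """Primary key, 751406 blocks (728570 leaf), 19416428 records
Index density 50%, leaf 50%, separator 50%
Optimization: index 44%, data 95%"""

levels = parse_optimization(sample)
```

For the sample above, levels holds a single pair with an index level of 44 and a data level of 95.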

The -t option specifies a directory for all temporary work files and can be specified with either the -r or -v command. The directory specification must be a valid path specification or logical that references a local or network drive. The default location for temporary files is the current directory.

Writing temporary files to a secondary disk may improve overall performance.
When processing large files, make sure sufficient disk space is available for the temporary work files.

The amount of disk space required for temporary files varies with the operation. In general, you can assume the following:

Operation              Maximum temporary file size (approximate)
Re-index only (-r)     2 * (size of largest key * #records)
Order data (-ro)       1.2 * size of in-use data
Verify (-v)            (overall index density * size of index file) + (size of largest key * #records), or about 80% of the size of the index file
Verify linear (-vz)    No temporary files used

When re-indexing only, the total disk space occupied (ISAM file plus temporary files) will not exceed the original size of the ISAM file (unless the packing density is reduced).
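To check available disk space before a long run, the approximate formulas above can be turned into a small estimator. This Python sketch is illustrative only. The function name, the operation labels, and the parameter names are hypothetical. Sizes are in bytes, index_density is a fraction between 0 and 1, and for -v the sketch uses the first formula rather than the 80% rule of thumb.

```python
def temp_space_estimate(operation, *, largest_key=0, records=0,
                        in_use_data=0, index_density=0.0, index_size=0):
    """Approximate maximum temporary file size, in bytes, for an operation."""
    if operation == "reindex":          # kdbutl -r
        return 2 * largest_key * records
    if operation == "order":            # kdbutl -ro
        return round(1.2 * in_use_data)
    if operation == "verify":           # kdbutl -v
        return round(index_density * index_size) + largest_key * records
    if operation == "verify_linear":    # kdbutl -vz uses no temporary files
        return 0
    raise ValueError(f"unknown operation: {operation}")
```

For example, re-indexing a file of 19,416,428 records whose largest key is 8 bytes gives an estimate of 310,662,848 bytes of temporary space.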

The -% option can be specified with either the -r or -v command. When specified with -r, the numbers displayed indicate the percentage of the overall re-indexing operation completed for each file. When specified with -v, the numbers displayed indicate the percentage of the overall verify operation completed for each file. When message level 2 (-m2) is also specified, an individual process percentage as well as a total overall percentage is displayed for each file.

Kdbutl generates an exit status, which can be especially useful if you’ve used the -m0 option. Possible exit statuses are as follows:

This status    Indicates
0              Kdbutl was successful.
Error          Kdbutl failed as a result of the specified error. See KitaroDB ISAM Errors below for message text for error numbers.
-1             More than one file was specified on the command line, and at least one file failure occurred.

If you use the -z option on a corrupted file, the exit status reflects the first error only.
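A calling script can translate the exit status into a readable message. The Python sketch below is hypothetical: the function name is invented, only a few entries from the KitaroDB ISAM errors list are included, and it assumes the status has already been obtained as a signed integer (how a -1 exit value is reported can depend on the calling environment).

```python
# A small excerpt of the KitaroDB ISAM error list; extend as needed.
ISAM_ERRORS = {
    1: "Bad ISAM file control",
    6: "Index incongruity error",
    16: "I/O error: No disk space",
    17: "Not an ISAM file",
    57: "File not found",
}


def describe_exit_status(status):
    """Turn a kdbutl exit status into a short human-readable message."""
    if status == 0:
        return "Kdbutl was successful."
    if status == -1:
        return "More than one file was specified and at least one failed."
    message = ISAM_ERRORS.get(status)
    if message is None:
        # Not in the excerpt above; look the number up in the full list.
        return f"ISAM error {status}"
    return f"ISAM error {status}: {message}"
```

This is most useful together with -m0, where no message text is displayed and the exit status is the only result.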

When the file is successfully processed, the current date is written to the index control record to indicate the last recover or verify.

Kdbutl does not support loading records with binary data from sequential files (excluding counted files). Attempting to do so can cause some records to split into two records in the ISAM file.

Kdbutl generates a log file named kdbutl.log that records its operations and results. Each log file entry specifies the ISAM filename, the operation performed, the date and time the operation was performed, the command line options supplied to kdbutl, the exit status, and the amount of time the operation took. The log file is created in the Windows Temp directory. The maximum size of the log file is 1 megabyte.

KitaroDB ISAM errors

The following list maps KitaroDB ISAM error numbers to their message text. If you run kdbutl with messaging enabled, the text specified here will be displayed. If you run without messaging enabled (-m0), only the number will be displayed.

1 Bad ISAM file control
2 Specified key out of range
3 Lock failure
4 Filename length too long
5 EOF encountered
6 Index incongruity error
7 Illegal decimal key of reference
8 Illegal alpha key of reference
9 Invalid open mode, requires update
10 Invalid RFA
11 I/O error
12 Illegal record size
13 Key not same
14 No current record established
15 No duplicates allowed
16 I/O error: No disk space
17 Not an ISAM file
18 Record not locked for WRITE/DELETE
19 Cannot open data file
20 Cannot open index file
21 Qualifier incongruity error
22 I/O error: Read failure
23 Record is locked
24 Input size exceeds destination size
25 I/O error: Write failure
26 Data incongruity, key to deleted rec
27 Data compression/uncompression error
28 Data freelist error
29 Deleted record error
30 Cannot create file
31 Insufficient memory for attempted op
32 invalid option
33 Invalid compression option
34 Invalid key length
35 Invalid record length
36 Invalid start position
37 Missing required parameter
38 Mismatched segments
39 Key spans end of record
40 Existing file, cannot overwrite
41 Undefined keys, cannot create
42 Flush error
43 Encountered incompatible ISAM file
44 Record not found
45 Invalid null value
47 File in use by another user
49 Too many open files
55 No privilege to this file or directory
57 File not found
58 Bad file specification
59 Invalid I/O mode on open
60 Bad file org on open
61 Bad I/O options in I/O statement
62 Operations timed out
63 Illegal function for this control
66 Bad decimal key value
67 Partial numeric key not allowed
68 Invalid overlay of numeric key
72 Invalid key type
73 Invalid key order
74 Incorrect number of types specified
75 Incorrect number of orders specified
76 Invalid non-key integer size
77 Non-key integer data cannot overlap
78 Invalid overlay of non-key integer data
79 Specified segment out of range
80 Null value doesn’t exist for key
84 No connection
91 Record not same
92 File requires network encryption
100 Illegal record number
101 No room to write to file
103 Cannot delete file
104 Device not available
107 No winsock
108 TCP/IP init error
109 TCP/IP bad remote user name
110 Cannot connect to port
111 Cannot create client connection
112 bad host name
113 Network problem
114 Server not running on remote host
115 Deadlock condition detected
118 Network error after open