1 SORT
     The OpenVMS Sort/Merge utility sorts records or merges input files.

     To sort one or more input files, specify the SORT command. These
     files are sorted according to the fields you select and one
     reordered output file is generated.

     To merge input files that have previously been sorted according to
     the same key fields, specify the MERGE command. One output file is
     generated.

     High-performance Sort/Merge utility: On Alpha systems, you can also
     choose the high-performance Sort/Merge utility. It uses the same
     command line interface. Any differences are noted with the
     appropriate SORT and MERGE qualifiers.

     Use the SORTSHR logical to select the high-performance Sort/Merge
     utility. Define SORTSHR to point to the high-performance sort
     executable in SYS$LIBRARY as follows:

        $ DEFINE SORTSHR SYS$LIBRARY:HYPERSORT.EXE

     To return to SORT/MERGE, deassign SORTSHR. (The SORT/MERGE utility
     is the default if SORTSHR is not defined.)

     For additional information, you can enter one of the following
     topics:

     Command_Qualifiers      Provides a brief description of the
                             qualifiers that can be used with the SORT
                             and MERGE commands.

     Input_File_Qualifier    Provides information about qualifiers that
                             can be used to modify input files (such as
                             /FORMAT).

     Output_File_Qualifiers  Provides information about qualifiers that
                             can be used to modify output files (such as
                             /SEQUENTIAL).

     For a complete description of the Sort/Merge Utility, see the
     OpenVMS User's Manual.

     Formats

        SORT   input-file-spec [,...] output-file

        MERGE  input-file-spec [,...] output-file
2 Parameters

 input-file-spec [,...]

     Specifies the file or files to be sorted or merged. You can specify
     up to 10 input files. Multiple file specifications must be
     separated by commas. The default input file type is .DAT.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility allows you to specify up to 12 input files.

 output-file

     Specifies the file to be created. You can specify one output file
     only.
     If you omit a file type in the file specification, the command
     defaults to the file type of the first input file.
2 Command_Qualifiers

 /CHECK_SEQUENCE

     Verifies the sequence of the records only in Merge input files. By
     default Merge checks the sequence of the records. Use only with the
     MERGE command.

 /COLLATING_SEQUENCE

     Selects one of three predefined collating sequences for character
     key fields, or specifies the name of a National character set (NCS)
     collating sequence to be used in comparing character keys. Sort
     arranges characters in ASCII sequence by default; the EBCDIC and
     Multinational sequences can also be used.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility currently supports all collating sequences except NCS.

 /DUPLICATES

     By default, Sort retains multiple records with duplicate keys. The
     /NODUPLICATES qualifier eliminates all but one of multiple records
     with duplicate keys.

 /KEY

     Describes key fields, including the position, size, sorting order,
     and data type.

 /PROCESS

     Defines the internal sorting process. The /PROCESS qualifier allows
     you to choose one of four processes: record, tag, address, or
     index. By default, SORT uses a record sorting process. Use only
     with the SORT command.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility currently supports only the record process.

 /SPECIFICATION

     Identifies a Sort or Merge specification file.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support the use of specification files.

 /STABLE

     Directs records with equal keys to the output file in their input
     file order. The default condition is /NOSTABLE.

 /STATISTICS

     Displays a statistical summary that can be used for optimization.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support the display of a statistical
     summary.

 /WORK_FILES

     Increases the number of Sort work files by any number from 1 to 10
     inclusively to make each work file smaller.
     If the available disks are too small or too full for work files,
     increasing the number of files can improve the efficiency of the
     sort operation. Use only with the SORT command.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility supports from 1 to 255 work files.
2 /CHECK_SEQUENCE

     Verifies the sequence of the records only in Merge input files. By
     default, Merge checks the sequence of records. Use only with the
     MERGE command.

     Formats

        /CHECK_SEQUENCE

        /NOCHECK_SEQUENCE
3 Full_Description

     The /CHECK_SEQUENCE qualifier is unique to the MERGE command. By
     default, Merge does sequence checking to ensure that the input
     files have been sorted on the same key.

     You can also use the /CHECK_SEQUENCE qualifier to check whether the
     records of one or more files (up to 10) have been sorted. (The
     records will still be directed to an output file, which you must
     specify.) If you are checking whether records are sorted on a key
     field other than the entire record, you must specify key
     information, along with requesting sequence checking.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility allows you to specify up to 12 files.
3 Examples

     1.$ MERGE/KEY=(SIZE:4,POSITION:3)/NOCHECK_SEQUENCE PRICE1.DAT, -
       _$ PRICE2.DAT PRICE.LIS

       The /NOCHECK_SEQUENCE qualifier specifies that the sequence of
       the input files, PRICE1.DAT and PRICE2.DAT, need not be checked.
       (Checking is not necessary because the records in those files are
       sorted on the same key and the sequence of records is correct.)

     2.$ MERGE/SPECIFICATION=PAYROLL.SRT/CHECK_SEQUENCE -
       _$ MAY3.DAT,MAY10.DAT,MAY17.DAT,MAY24.DAT TOTAL.LIS

       In this example, the specification file, PAYROLL.SRT, includes
       the /NOCHECK_SEQUENCE qualifier. The /CHECK_SEQUENCE qualifier on
       the MERGE command line is necessary to override the
       /NOCHECK_SEQUENCE qualifier in the specification file. The
       sequence of records in the four input files is to be checked.
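     The relationship between sequence checking and merging described
     above can be modelled outside DCL. The following is a minimal
     Python sketch, not the Sort/Merge implementation: the function
     names (key_field, check_sequence, merge_files) and the sample
     records are hypothetical, and the key is treated as plain
     character data at POSITION:3, SIZE:4 as in Example 1.

```python
import heapq


def key_field(record, position, size):
    """Return the key starting at 1-based POSITION with length SIZE."""
    start = position - 1
    return record[start:start + size]


def check_sequence(records, position, size):
    """Return True if the records are in ascending order on the key field."""
    keys = [key_field(r, position, size) for r in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def merge_files(inputs, position, size, check=True):
    """Merge already-sorted inputs; optionally verify each input first.

    check=True loosely mirrors /CHECK_SEQUENCE, check=False mirrors
    /NOCHECK_SEQUENCE (an assumption made for illustration only).
    """
    if check:
        for number, records in enumerate(inputs, start=1):
            if not check_sequence(records, position, size):
                raise ValueError(f"input {number} is out of sequence")
    return list(heapq.merge(*inputs,
                            key=lambda r: key_field(r, position, size)))


# Hypothetical contents standing in for PRICE1.DAT and PRICE2.DAT.
price1 = ["A 0100 widget", "B 0300 gadget"]
price2 = ["C 0200 sprocket", "D 0400 gizmo"]
merged = merge_files([price1, price2], position=3, size=4)
```

     Skipping the check (check=False) saves one pass over each input
     but silently produces an unordered result if an input was not
     actually sorted on the key.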
2 /COLLATING_SEQUENCE

     Selects one of three predefined collating sequences for character
     key fields, or specifies the name of a National character set (NCS)
     collating sequence to be used in comparing character keys. Sort
     arranges characters in ASCII sequence by default; the EBCDIC and
     Multinational sequences can also be used.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility currently supports all collating sequences except NCS.

     Formats

        /COLLATING_SEQUENCE=type

        /COLLATING_SEQUENCE=cs-name
3 Parameters

 type

     o  ASCII

        Arranges characters according to ASCII sequence. ASCII is the
        default sequence and need not be specified.

     o  EBCDIC

        Arranges characters according to EBCDIC sequence. The characters
        remain in ASCII representation; only the order is changed.

     o  Multinational

        Arranges characters according to Multinational sequence, which
        collates the international character set. When you use the
        Multinational sequence, characters are ordered according to the
        following rules:

        -  All diacritical (accented) forms of a character are given the
           collating value of the character (A', A", A' collate as A).

        -  Lowercase characters are given the collating value of their
           uppercase equivalents (a collates as A, a" collates as A").

        -  If two strings compare as equal, tie-breaking is performed.
           The strings are compared to detect differences due to
           diacritical marks, ignored characters, or characters that
           collate as equal although they are actually different. If the
           strings still compare as equal, another comparison is done
           based on the numeric codes of the characters. In this final
           comparison, lowercase characters are ordered before
           uppercase.

        Care should be taken when sorting or merging files for further
        processing using the Multinational sequence. Sequence checking
        procedures in most programming languages compare numeric
        characters.
        Because Multinational is based on actual graphic characters and
        not on the codes representing those characters, normal sequence
        checking does not work.

 cs-name

     Arranges character keys according to the named sequence, which must
     be a collating sequence defined in an NCS library.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility currently supports only the ASCII, EBCDIC, and
     Multinational collating sequences.
3 Full_Description

     By default, Sort/Merge arranges records according to ASCII
     sequence. However, it can also arrange records according to EBCDIC
     and Multinational sequence. These three collating sequences can be
     modified to meet your particular needs through the use of a
     specification file. You can also define your own collating sequence
     by using a specification file if one of the three collating
     sequences does not suit your needs.
3 Example

        $ SORT/COLLATING_SEQUENCE=MULTINATIONAL -
        _$ NAMES.DAT,NOM.DAT LIST.LIS

     This SORT command arranges the input files NAMES.DAT and NOM.DAT
     according to the Multinational collating sequence to create the
     output file LIST.LIS.
2 /DUPLICATES

     By default, Sort retains multiple records with duplicate keys. The
     /NODUPLICATES qualifier eliminates all but one of multiple records
     with duplicate keys.

     Formats

        /DUPLICATES

        /NODUPLICATES
3 Full_Description

     By default, Sort/Merge retains records with equal keys. The
     /NODUPLICATES qualifier eliminates all but one record with equal
     keys. The retained records may not appear in the same order as they
     appeared in the input file.

     If you want to specify which duplicate record to keep, invoke Sort
     at the program level and specify an equal-key routine.

     The /STABLE and the /NODUPLICATES qualifiers are mutually
     exclusive.
3 Example

        $ SORT/KEY=(POSITION:3,SIZE:5,DECIMAL)/NODUPLICATES -
        _$ ACCT1,ACCT2 ACCT.LIS

     This SORT command arranges the two input files according to the key
     supplied and eliminates all but one of multiple records with equal
     keys.
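     The effect of /NODUPLICATES on a decimal key can be illustrated
     with a short Python sketch. This is an illustration under stated
     assumptions, not the utility's algorithm: the function name
     sort_no_duplicates and the sample records are hypothetical, and
     the sketch happens to keep the first record seen for each key
     because Python's sort is stable. Sort/Merge itself makes no such
     promise about which duplicate is retained.

```python
def sort_no_duplicates(records, position, size):
    """Sort on a decimal key and keep one record per distinct key value.

    position is 1-based, as with POSITION:n; size is the digit count.
    """
    def key(record):
        start = position - 1
        return int(record[start:start + size])

    kept = []
    seen = set()
    for record in sorted(records, key=key):
        value = key(record)
        if value not in seen:  # later records with an equal key are dropped
            seen.add(value)
            kept.append(record)
    return kept


# Hypothetical records; the key occupies positions 3 through 7.
acct = ["AA00042 first", "BB00007 only", "CC00042 second"]
result = sort_no_duplicates(acct, position=3, size=5)
```

     To control which duplicate survives, the text above points to an
     equal-key routine at the program level; the sketch does not model
     that interface.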
2 /KEY

     Describes key fields, including the position, size, sorting order,
     and data type. Both the position and size of the key field must be
     specified, except for floating point data types where the size is
     known.

     By default, Sort reorders a file by sorting entire records with
     character data in ascending order. Any other type of key field must
     be specified. When you specify multiple keys, use a separate /KEY
     qualifier for each key.

     Format

        /KEY=(field [,...])
3 Fields

 POSITION:n

     Specifies the position of the first byte in the key field. A value
     of 1 to 32,767 may be specified. The first byte in a record is
     considered position 1. The POSITION:n field must be specified.
     (Note that the SIZE:n field must also be specified, except for
     floating point data types where the size is known.)

 SIZE:n

     Specifies the length of the key field. The total composite size of
     all keys and the original input record length must be less than
     32,767 bytes. If the decimal sign is stored in a separate byte in
     the key field, that byte is not counted toward the size of the
     data. Both the POSITION:n and SIZE:n fields must be specified,
     except for floating point data types where the size is known.

     The data type of the key determines what values are acceptable when
     specifying size as well as the units in which the size is
     specified:

     o  1 to 32,767 (characters) for character data

     o  1, 2, 4, 8, or 16 (bytes) for binary data

        High-performance Sort/Merge: The high-performance Sort/Merge
        utility currently supports only 1, 2, 4, and 8-byte binary keys.

     o  1 to 31 (digits) for decimal data

     o  No value is necessary for floating point data

 ASCENDING

     Orders the sorting operation in ascending alphabetical or numerical
     order. ASCENDING is the default order.

 DESCENDING

     Orders the sorting operation in descending alphabetical or
     numerical order.

 CHARACTER

     Specifies character data in the key field. CHARACTER is the default
     data type.

 BINARY

     Specifies binary data in the key field.

 SIGNED

     Specifies signed binary or decimal data in the key field. SIGNED is
     the default for binary and decimal data.

 UNSIGNED

     Specifies unsigned binary or decimal data in the key field.

 F_FLOATING

     Specifies F_FLOATING format data in the key field.

 D_FLOATING

     Specifies D_FLOATING format data in the key field.

 G_FLOATING

     Specifies G_FLOATING format data in the key field.

 H_FLOATING

     Specifies H_FLOATING format data in the key field.

     High-performance Sort/Merge: Not supported by the high-performance
     Sort/Merge utility.

 S_FLOATING

     On Alpha systems, specifies IEEE S_FLOATING format data in the key
     field.

 T_FLOATING

     On Alpha systems, specifies IEEE T_FLOATING format data in the key
     field.

 DECIMAL

     Specifies decimal data in the key field.

 TRAILING_SIGN

     Specifies trailing sign decimal data in the key field.
     TRAILING_SIGN is the default for decimal data.

 LEADING_SIGN

     Specifies leading sign decimal data in the key field. The leading
     sign must be in the first position of the field and the field must
     be left zero padded.

 OVERPUNCHED_SIGN

     Specifies overpunched decimal data in the key field.
     OVERPUNCHED_SIGN is the default for decimal data.

 SEPARATE_SIGN

     Specifies separate sign decimal data in the key field.

 ZONED

     Specifies zoned decimal data in the key field.

     High-performance Sort/Merge: Not supported by the high-performance
     Sort/Merge utility.

 PACKED_DECIMAL

     Specifies packed decimal data in the key field.

 NUMBER:n

     Specifies the order of priority of each key if you do not list
     multiple keys in the order of their priority. A value of 1 to 255
     may be specified.
3 Full_Description

     The /KEY qualifier specifies all the necessary information about a
     key field. If the file is to be sorted using entire records with
     character data in ascending order, you do not need to specify the
     key information.

     When a key field must be described, you must specify both the
     position and the size of the key.
     In addition, if the sorting or merging operation is to be done in
     descending alphabetic or numeric order, specify DESCENDING in the
     key description.

     If the data in the key fields is not character data, you must
     specify the data type. The following data types are recognized by
     the Sort/Merge utility:

        BINARY, [SIGNED]
        BINARY, UNSIGNED
        CHARACTER
        DECIMAL, LEADING_SIGN, SEPARATE_SIGN, [SIGNED]
        DECIMAL, LEADING_SIGN, [OVERPUNCHED_SIGN, SIGNED]
        DECIMAL [,SIGNED, TRAILING_SIGN, OVERPUNCHED_SIGN]
        DECIMAL, [TRAILING_SIGN], SEPARATE_SIGN, [SIGNED]
        DECIMAL, UNSIGNED
        D_FLOATING
        F_FLOATING
        G_FLOATING
        H_FLOATING
        PACKED_DECIMAL
        S_FLOATING (Alpha only)
        T_FLOATING (Alpha only)
        ZONED

     The items in brackets are defaults and need not be specified.

     Multiple Keys

     You can specify up to 255 key fields in a sorting operation. If you
     do specify multiple keys, decide which is primary, which is
     secondary, and so on; then, in the command string, list them in the
     order of their priority. By default, Sort assigns 1 to the first
     key specified in the command line, 2 to the second key, and so on.
     If you do not list the keys in the order of their priority, specify
     the order of each with the parameter NUMBER:n.

     For each Sort key, you must use a separate /KEY qualifier. If Sort
     finds /KEY parameters repeated after a single /KEY qualifier, it
     does not treat these as specifications for multiple keys; instead,
     the duplicate parameters override previously specified parameters.
3 Example

        $ SORT/KEY=(POS:16,SIZ:3)/KEY=(POS:1,SIZ:11) -
        _$ /KEY=(POS:40,SIZ:2,DESC) YRENDAVG.DAT YRAVGSRT.LIS

     This SORT command identifies three key fields. The input file,
     YRENDAVG, is first sorted by the key beginning in position 16, then
     by the key beginning in position 1, and finally by the key
     beginning in position 40. The third key used sorts in descending
     order.
2 /PROCESS

     Defines the internal sorting process. The /PROCESS qualifier allows
     you to choose one of four processes: record, tag, address, or
     index.
     By default, SORT uses a record sorting process. Use only with the
     SORT command.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility currently supports only the record process.

     Format

        /PROCESS=type
3 Parameters

 type

     o  RECORD

        Keeps records intact while sorting and produces an output file
        consisting of complete records. Record is the default sorting
        process.

        High-performance Sort/Merge: The high-performance Sort/Merge
        utility currently supports only the record process.

     o  TAG

        Sorts only the keys and then rereads the input file to produce
        an output file consisting of complete records.

     o  ADDRESS

        Sorts only the keys and produces an output file that is an index
        of record addresses in binary format. The index must be
        submitted to a program for further processing.

     o  INDEX

        Creates an output file containing both record file addresses
        (RFAs) and key fields, plus a file number when sorting multiple
        files. The format of these key fields is the same as in the
        input files. If the program needs key field content for a
        decision during future processing, select index sort rather than
        address sort.
3 Example

        $ SORT/KEY=(POS:40,SIZ:2,DESC)/PROCESS=TAG YRENDAVG.DAT -
        _$ DESCYRAVG.LIS

     This sort operation uses a tag sorting process to create the output
     file DESCYRAVG.LIS.
2 /SPECIFICATION

     Identifies a Sort or Merge specification file.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support the use of specification files.

     Format

        /SPECIFICATION=file-spec
3 Qualifier_Value

 file-spec

     Specifies the Sort/Merge specification file. The default file type
     is .SRT.
3 Full_Description

     The /SPECIFICATION qualifier identifies the specification file to
     be used in a sort or merge operation.
     A specification file allows you to do the following:

     o  Change the format and length of the records in the output file

     o  Conditionally alter record order and data fields

     o  Omit specified records from the process

     o  Include specified records in the process

     o  Change the way in which characters are ordered

     o  Reassign work files

     o  Define commonly used sort or merge operations

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support the use of specification files.

     Specification files may be created by any standard editor or the
     DCL CREATE command. The commands within a specification file are
     formatted differently than those on the DCL command line and some
     have different meanings.

     Each command in the specification file should start with a slash
     (/), and continuation characters are not required if a command
     spans more than one line. Comments may be included in a
     specification file by preceding them with an exclamation point (!).

     Many of the qualifiers used in the specification file are similar
     to the DCL qualifiers used in the Sort/Merge command line. Note,
     however, that the format of these qualifiers can be different. For
     example, the /KEY qualifier at DCL level has a different format
     than the /KEY qualifier in the specification file.
3 Example

        $ SORT/SPECIFICATION=ACCTS.SRT SALES1.DAT,SALES2.DAT MAILING.LIS

     This SORT command arranges the input files according to the
     instructions detailed in the specification file, ACCTS.SRT.
2 /STABLE

     Directs records with equal keys to the output file in their input
     file order. The default condition is /NOSTABLE.

     Formats

        /STABLE

        /NOSTABLE
3 Full_Description

     When the input files contain records with equal keys, those records
     may not maintain the same order that they appeared in the input
     file. Specifying the /STABLE qualifier arranges records with equal
     keys in the order of the input files on output.
     If you use this qualifier when sorting multiple input files, on
     output, records with equal keys in the first file precede those
     from the second file and so on.

     The /STABLE and /NODUPLICATES qualifiers are mutually exclusive.
3 Example

        $ SORT/KEY=(POS:1,SIZ:5,DECIMAL)/STABLE PRICESA.DAT,PRICESB.DAT, -
        _$ PRICESC.DAT SUMMARY.LIS

     In this sort operation, records with equal keys from PRICESA.DAT
     will be listed first, followed by those from PRICESB.DAT, followed
     by those from PRICESC.DAT.
2 /STATISTICS

     Displays a statistical summary that can be used for optimization.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support this.

     Format

        /STATISTICS
3 Full_Description

     When the /STATISTICS qualifier is used, Sort/Merge displays
     statistics in SYS$OUTPUT. You can use these statistics to judge the
     efficiency of the ordering operation and to determine adjustments
     that can improve its performance.

     To save these statistics in a file, specify the following command:

        $ DEFINE/USER SYS$ERROR output-file

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility does not currently support this.

     The following statistical display results when you use the
     /STATISTICS qualifier:

                     OpenVMS Sort/Merge Statistics

     Records read:        nnn      Input record length:     nnn
     Records sorted:      nnn      Internal length:         nnn
     Records output:      nnn      Output record length:    nnn
     Working set extent:  nnn      Sort tree size:          nnn
     Virtual memory:      nnn      Number of initial runs:  nnn
     Direct I/O:          nnn      Maximum merge order:     nnn
     Buffered I/O:        nnn      Number of merge passes:  nnn
     Page faults:         nnn      Work file allocation:    nnn
     Elapsed time: nn:nn:nn.nn     Elapsed CPU:     nn:nn:nn.nn

     Records read is the number of records read by Sort or Merge.

     Records sorted is the number of records that have been processed
     using Sort. This number could be less than the number of records
     read if a specification file is used to select only certain records
     for the sort or merge operation.

     Records output is the number of records written to the output file.
     This number could be less than the number of records sorted if
     /NODUPLICATES was selected or if I/O errors occurred when the
     output records were being written.

     Working set extent shows the number of pages in the process working
     set extent. This value is used as an upper limit on the size of the
     sort data structure. Adjusting this value is one way to improve the
     efficiency of a sort operation.

     Virtual memory is the number of pages of virtual memory added to
     the Sort image to hold the data.

     The total of the direct I/O and buffered I/O is the number of I/O
     movements needed to read and write data. The lower this total value
     is, the more efficient the ordering operation.

     The number of page faults indicates how well the data fits into
     memory: the higher the number of page faults, the less efficient
     the ordering operation.

     Elapsed time is the total wall clock time used by the sort or merge
     operation in hours, minutes, seconds, and hundredths of seconds.

     The input record length value is obtained from the Record
     Management Services (OpenVMS RMS) unless the user supplies it.

     Internal length is the size in bytes of an internal format node.
     This includes any keys, data, a word to store the length, record
     file addresses (RFAs), and converted keys.

     Output record length is the length of the output record. The length
     is computed from the input record length, the sort process, and the
     record reformatting requested.

     Sort tree size is the number of records that fit in Sort's internal
     data structure.

     Number of initial runs is one indication of how well the data fits
     into memory.

     The maximum merge order is the maximum number of sorted strings
     that are merged at one time.

     The number of merge passes is the number of times the Sort utility
     merges strings until one sorted output string is produced.

     The number of initial runs and the number of merge passes indicate
     how well the data fits in memory.
     The higher these numbers, the further the working set size is from
     containing the data and the longer the sorting takes.

     Work file allocation is the number of blocks used for the work
     files. When more than one merge pass is needed, this size is
     approximately twice the size of the input file allocation.

     Elapsed CPU is the CPU time used by the ordering operation; it does
     not include time spent waiting for I/O operations to complete or
     time spent waiting while another process executes.
3 Example

        $ SORT/STATISTICS PRICE1.DAT,PRICE2.DAT PRICE.LIS

     This SORT command results in the following statistical display:

                     OpenVMS Sort/Merge Statistics

     Records read:          793    Input record length:       80
     Records sorted:        793    Internal length:           80
     Records output:        793    Output record length:      80
     Working set extent:    100    Sort tree size:           412
     Virtual memory:        433    Number of initial runs:     2
     Direct I/O:             22    Maximum merge order:        2
     Buffered I/O:            9    Number of merge passes:     1
     Page faults:          3418    Work file allocation:     114
     Elapsed time:  00:00:05.98    Elapsed CPU:      00:00:03.63

     In the sample statistics display, the Sort data structure size is
     limited by the small working set extent. By doubling the working
     set extent you can almost double the Sort data structure size,
     enabling all the records to fit in memory without using work files.
2 /WORK_FILES

     Increases the number of Sort work files by any number from 1 to 10
     inclusively to make each work file smaller. If the available disks
     are too small or too full for work files, increasing the number of
     files can improve the efficiency of the sort operation. Use only
     with the SORT command.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility allows you to specify from 1 to 255 work files.

     Format

        /WORK_FILES=n
3 Qualifier_Value

 n

     Specifies the number of work files requested; 1 to 10 files may be
     specified. The default value is 2.

     High-performance Sort/Merge: The high-performance Sort/Merge
     utility allows you to specify from 1 to 255 work files. The default
     value is 1.
3 Full_Description

     Sort does not create work files until it needs them. If Sort needs
     work files, it creates them, places them in the SYS$SCRATCH
     directory, and assigns them SORTWORKn logicals.

     Usually, there is no advantage to requesting more than one work
     file. However, if the available disks are too small or too full for
     Sort work files, you can increase the number of work files to make
     each work file smaller.

     High-performance Sort/Merge: You can also enhance performance by
     assigning each work file to a different disk.
3 Examples

     1.$ ASSIGN DRA5: SORTWORK0
       $ ASSIGN DB0: SORTWORK1
       $ ASSIGN DB1: SORTWORK2
       $ SORT/KEY=(POS:1,SIZ:80)/WORK_FILES=3 -
       _$ STATS1,STATS2,STATS3,STATS4 SUMMARY.LIS

       Because the input files in this sort operation are large files,
       specifying three work files improves the efficiency of the sort
       operation.

       Note that you can also assign the work files to a specific
       directory on a device by including the directory name. For
       example, to assign SORTWORK0 to the [WORKSPACE] directory on
       DRA5, enter the following command:

       $ ASSIGN DRA5:[WORKSPACE] SORTWORK0
2 Input_File_Qualifier

     Must be specified immediately after the input file specification in
     the Sort or Merge command line.
3 /FORMAT

     Defines input file characteristics; allows you to specify or
     override record or file size.

     You can also use /FORMAT as an output file qualifier; enter HELP
     SORT OUTPUT_FILE_QUALIFIERS/FORMAT at the DCL prompt for more
     information.

     Format

        input-file-spec/FORMAT=(type:n,[...])
4 Qualifier_Values

 RECORD_SIZE:n

     Specifies the input file's longest record length (LRL) in bytes.
     The maximum longest record length that can be specified depends on
     the file organization:

        Sequential files           32,767
        Relative files             16,383
        Indexed-sequential files   16,362

     These totals include control bytes for variable records with
     fixed-length control (VFC) format.

 FILE_SIZE:n

     Specifies input file size in blocks. The maximum file size accepted
     is 4,294,967,295 blocks.
4 Full_Description

     Sort obtains the file's longest record length (LRL) and file size
     from RMS. If you know the LRL that RMS has defined for the input
     files is incorrect, you can override this value by specifying the
     record size with RECORD_SIZE. For multiple input files, LRL is the
     length of the longest record in all files.

     If you do not know the LRL value for a file, use the ANALYZE
     /RMS_FILE command. The LRL value appears in the file attributes
     section in the statistical report generated for the file that you
     specify.

     Sort uses input file size information to determine the amount of
     memory needed, as well as the size of the work files for the sort
     operation. If the file size is unknown (for example, you are
     sorting files not residing on disk or standard ANSI magnetic tape),
     Sort assumes a fairly large file size.

     If this default is too large, Sort overestimates its memory and
     work file requirements; the sort operation will be more efficient
     if you specify a smaller input file size. If the default is too
     small, Sort underestimates its memory requirements; therefore, you
     should specify a larger input file size, provided the Sort data
     structure size is not limited by the working set extent.
4 Examples

     1.$ SORT/KEY=(POS:40,SIZ:2,DESC) -
       _$ CRA0:YRENDAVG.DAT/FORMAT=(RECORD_SIZE:41,FILE_SIZE:3) -
       _$ DESCYRAVG.LIS

       Because the input file YRENDAVG.DAT does not reside on a disk
       device or ANSI magnetic tape, file organization must be described
       by the /FORMAT qualifier.

     2.$ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/FORMAT=FIXED:80

       The input file STATS.DAT consists of variable-length records that
       are 80 bytes in length. The /FORMAT qualifier specifies that the
       output file SUMMARY.LIS consists of fixed-length records.
2 Output_File_Qualifiers

     Must be specified immediately after the output file specification
     in the SORT or MERGE command line.
3 /ALLOCATION

     Specifies the number of blocks to be preallocated for the output
     file.
     Used for optimization when you know that the output file allocation
     will differ substantially from the total input file allocation
     (because you are reformatting data or omitting records).

     Format

        output-file-spec/ALLOCATION=n
4 Qualifier_Value

 n

     Specifies the number of blocks to be allocated. A value of 1 to
     4,294,967,295 is allowed.
4 Full_Description

     Sort/Merge preallocates space for the output file based on total
     input file allocation, thereby avoiding the overhead of extending
     the file every time another few blocks are written to it. However,
     if you know that the output file allocation will differ
     substantially from the total input file allocation (because you are
     reformatting data or omitting records), you can specify the number
     of blocks to be preallocated for the output file.

     The /ALLOCATION qualifier is required if the /CONTIGUOUS qualifier
     is used.
4 Example

        $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT -
        _$ SUMMARY.LIS/ALLOCATION=1000/CONTIGUOUS

     This SORT command allocates 1000 contiguous blocks for the output
     file SUMMARY.LIS.
3 /BUCKET_SIZE

     Specifies the OpenVMS RMS bucket size (the number of 512-byte
     blocks per bucket) for the output file. Used with relative and
     indexed-sequential output disk files for optimization.

     Format

        output-file-spec/BUCKET_SIZE=n
4 Qualifier_Value

 n

     Specifies the bucket size. A value of 1 to 32 is allowed.
4 Full_Description

     Use the /BUCKET_SIZE qualifier with relative and indexed-sequential
     output disk files to specify OpenVMS RMS bucket size (the number of
     512-byte blocks per bucket).

     If the output file organization is the same as for the input files,
     the default value is the same as the first input file bucket size.
     If output file organization is different, the default value is 1.
     The maximum number of blocks per bucket is 32.
4 Example

        $ SORT/KEY=(POS:1,SIZ:80) STATS1.DAT,STATS2.DAT -
        _$ SUMMARY.LIS/BUCKET_SIZE=16/RELATIVE

     This SORT command results in the output file SUMMARY.LIS that has a
     bucket size of 16 with relative organization.
3 /CONTIGUOUS

     Requests that the output file be stored in contiguous disk blocks,
     thereby decreasing access time. Note that you must also specify the
     /ALLOCATION qualifier because if the preallocated space is too
     small, OpenVMS RMS may be unable to extend the file contiguously.

     Format

        output-file-spec/CONTIGUOUS
4 Full_Description

     By default, Sort/Merge does not allocate contiguous disk blocks for
     the output file. You can request, however, that the output file be
     stored in contiguous disk blocks by specifying the /CONTIGUOUS
     qualifier, thereby decreasing access time.

     If you use the /CONTIGUOUS qualifier, you must also specify the
     /ALLOCATION qualifier because if the preallocated space is too
     small, OpenVMS RMS may be unable to extend the file contiguously.
4 Example

        $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT -
        _$ SUMMARY.LIS/ALLOCATION=1000/CONTIGUOUS

     This SORT command allocates 1,000 contiguous blocks for the output
     file SUMMARY.LIS.
3 /FORMAT

     Specifies the output file record format if it differs from the
     input file format.

     Format

        output-file-spec/FORMAT=(type:n ...)
4 Qualifier_Values

 BLOCK_SIZE:n

     Specifies the output file's block size, in bytes, if you have
     directed the file to magnetic tape. You can also accept the
     default. If the input file is a tape file, the block size of the
     output file defaults to that of the input file. Otherwise, the
     output file block size defaults to the size used when the tape was
     mounted.

     Acceptable values for block size n range from 20 to 65,532. To
     ensure correct data interchange with other Digital systems,
     however, specify a block size of not more than 512 bytes. For
     compatibility with systems that are not made by Digital, the block
     size should not exceed 2,048 bytes.

 CONTROLLED:n

     Specifies variable with fixed-length control (VFC) records in the
     output file.

     n    Optionally indicates the maximum record size (in bytes) of the
          output records. The maximum record size allowed depends on the
          file organization.
Sequential files 32,767 Relative files 16,383 Indexed-sequential files 16,362 These totals include control bytes. If you do not specify the maximum record size, the default is a length large enough to hold the longest output record. FIXED:n Specifies fixed-length records in the output file. SIZE:n Specifies the size, in bytes, of the fixed portion of VFC (CONTROLLED) records, up to a maximum of 255 bytes. If you do not specify SIZE, the default is the size of the fixed portion of the first input file. If you specify this size as 0, OpenVMS RMS defaults the value to 2 bytes. VARIABLE:n Specifies variable-length records in the output file. 4 Full_Description If the sort operation is a record or tag sort, the default output record format is the same as the first input file record format. If the sort operation is an address or index sort, the default output record format is fixed record format. If the input files have different record formats, Sort provides an output record size that is large enough to contain the largest record in the input files. When you specify the output record format, you can indicate the maximum record size, in bytes, of the output records. You can specify fixed-length records, variable-length records, or variable with fixed-length control records. 4 Example $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/FORMAT=FIXED:80 The input file STATS.DAT consists of variable-length records that are 80 bytes in length. The /FORMAT qualifier specifies that the output file, SUMMARY.LIS, consists of fixed-length records. 3 /INDEXED_SEQUENTIAL Defines the output file organization as indexed sequential. The output file must exist and must be empty. Used with the /OVERLAY qualifier. High-performance Sort/Merge: The high-performance Sort/Merge utility does not currently support indexed sequential output file organization. 
Format output-file-spec/INDEXED_SEQUENTIAL 4 Full_Description If the organization of the output file is to be different from that of the input files, then you must specify the new organization. Use the /INDEXED_SEQUENTIAL qualifier to define indexed-sequential organization for the output file. Additionally, the output file must exist and must be empty, and you must use the /OVERLAY qualifier. High-performance Sort/Merge: The high-performance Sort/Merge utility does not currently support indexed sequential output file organization. 4 Example $ CREATE/FDL=NEW.FDL AVERAGE.DAT $ SORT/KEY=(POS:1,SIZ:80) DATA.DAT,STATS.DAT - _$ AVERAGE.DAT/INDEXED_SEQUENTIAL/OVERLAY The CREATE/FDL command creates the empty file AVERAGE.DAT. The SORT command specifies that the output file have an indexed-sequential organization and be written to the empty file AVERAGE.DAT. 3 /OVERLAY Specifies that the output is to be overlaid on, or written to, an existing empty file. The /OVERLAY qualifier is required when you use the /INDEXED_SEQUENTIAL qualifier. High-performance Sort/Merge: The high-performance Sort/Merge utility does not currently support use of the /OVERLAY qualifier. Format output-file-spec/OVERLAY 4 Full_Description To specify that an empty file is to be overlaid with sorted records, use the /OVERLAY qualifier. High-performance Sort/Merge: The high-performance Sort/Merge utility does not currently support use of the /OVERLAY qualifier. If the input file organization is indexed-sequential, the output file must already exist and must be empty. If the output file is not empty, /OVERLAY does not write over the file. Instead, it appends the result of the sort to the existing output file. If the input file organization is sequential or relative, you can create an empty file for the sorted records using an OpenVMS RMS program and use the /OVERLAY qualifier to specify that the output file is to be overlaid. 
You can use the Create/FDL utility to create an empty data file; use the /OVERLAY qualifier to specify that Sort is to write output to that file. Any attributes that you specify when creating the empty file then become attributes of the Sort output file. 4 Example $ CREATE/FDL=NEW.FDL AVERAGE.DAT $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT AVERAGE.DAT/OVERLAY The FDL file NEW.FDL specifies special attributes for the file AVERAGE.DAT. When Sort writes output to that file, the resulting Sort output file has the attributes specified by the FDL file. 3 /RELATIVE Defines the output file organization as relative. Format output-file-spec/RELATIVE 4 Full_Description If the organization of the output file is to be different from that of the input files, then you must specify the new organization. If you do not specify file organization, the default for record and tag sorts is the organization of the first input file. You must use the /RELATIVE qualifier to specify relative output file organization. 4 Example $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/RELATIVE Because the input file STATS.DAT is not a relative file and the output file, SUMMARY.LIS, will be, /RELATIVE qualifies the output file specification. 3 /SEQUENTIAL Defines output file organization as sequential. This is the default for address and index sorts. (The default for record and tag sorts is the organization of the first input file.) Format output-file-spec/SEQUENTIAL 4 Full_Description If the organization of the output file is to be different from that of the input files, you must specify the new organization. If you do not specify file organization, the default for record and tag sorts is the organization of the first input file. If you do not specify file organization, the default organization for address and index sorts is sequential. Use the /SEQUENTIAL qualifier when the default is not sequential file organization and you want an output file with sequential file format. 
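The default output organization described above can be summarized in a short Python sketch. This is an illustration only: the function name and the lowercase strings are assumptions made for the sketch and are not part of Sort/Merge.

```python
def default_output_organization(process, first_input_organization):
    """Return the organization Sort would default to (illustrative only)."""
    if process in ("record", "tag"):
        # Record and tag sorts inherit the organization of the first input file.
        return first_input_organization
    if process in ("address", "index"):
        # Address and index sorts default to sequential output.
        return "sequential"
    raise ValueError("unknown process: " + process)


# A record sort of a relative file stays relative unless /SEQUENTIAL is given.
print(default_output_organization("record", "relative"))
print(default_output_organization("address", "relative"))
```

In the first case the /SEQUENTIAL qualifier would be needed to obtain a sequential output file; in the second it would be redundant.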
4 Example $ SORT/KEY=(POS:1,SIZ:80) STATS.DAT SUMMARY.LIS/SEQUENTIAL Because the input file STATS.DAT is not a sequential file and the output file SUMMARY.LIS will be, /SEQUENTIAL qualifies the output file specification. 2 Specification_File_Qualifiers Qualifiers used in a Sort/Merge specification file are similar to the DCL qualifiers used in the SORT or MERGE command line. However, in some cases, the format of these qualifiers can be different. For example, the /KEY qualifier at DCL level has a different format than the /KEY qualifier in the specification file. If you specify DCL command qualifiers in the SORT or MERGE command line, those qualifiers override corresponding entries in the specification file. High-performance Sort/Merge: The high-performance Sort/Merge utility does not currently support the use of specification files. 3 Specification_File_Example
/FIELD=(NAME=RECORD_TYPE,POS:1,SIZ:1)    ! Record type, one-byte field
/FIELD=(NAME=PRICE,POS:2,SIZ:8)          ! Price field, both files
/FIELD=(NAME=TAXES,POS:10,SIZ:5)         ! Taxes field, both files
/FIELD=(NAME=STYLE_A,POS:15,SIZ:10)      ! Style field, format A file
/FIELD=(NAME=STYLE_B,POS:20,SIZ:10)      ! Style field, format B file
/FIELD=(NAME=ZIP_A,POS:25,SIZ:5)         ! Zip code field, format A file
/FIELD=(NAME=ZIP_B,POS:15,SIZ:5)         ! Zip code field, format B file
/CONDITION=(NAME=FORMAT_A,               ! Condition test, format A file
            TEST=(RECORD_TYPE EQ "A"))
/CONDITION=(NAME=FORMAT_B,               ! Condition test, format B file
            TEST=(RECORD_TYPE EQ "B"))
/INCLUDE=(CONDITION=FORMAT_A,            ! Output format, type A records
          KEY=ZIP_A,
          DATA=PRICE,
          DATA=TAXES,
          DATA=STYLE_A,
          DATA=ZIP_A)
/INCLUDE=(CONDITION=FORMAT_B,            ! Output format, type B records
          KEY=ZIP_B,
          DATA=PRICE,
          DATA=TAXES,
          DATA=STYLE_B,
          DATA=ZIP_B)
In this example, two input files from two different branches of a real estate agency are sorted according to the instructions specified in a specification file. 
The records in the first file that begin with an A in the first position have this format: |A|PRICE|TAXES|STYLE|ZIP| 1 2 10 15 25 The records in the second file begin with a B in the first position and have the style and zip code fields reversed, as follows: |B|PRICE|TAXES|ZIP|STYLE| 1 2 10 15 20 To sort these two files on the zip code field in the format of record A, first define the fields in both records with the /FIELD qualifiers. Then, specify a test to distinguish between the two types of records with the /CONDITION qualifiers. Finally, the /INCLUDE qualifiers change the record format of type B to record format of type A on output. Note that, if you specify either key or data fields in an /INCLUDE qualifier, you must explicitly specify all the key and data fields for the sort operation in the /INCLUDE qualifier. Also note that records that are not type A or type B are omitted from the sort. 3 /CDD_PATH_NAME Identifies fields and attributes defined for use with the Common Data Dictionary (CDD/Plus). Once the fields have been identified, they can then be used later with other specification file qualifiers, such as /KEY, /CONDITION, /INCLUDE, or /OMIT. You can use the /CDD_PATH_NAME qualifier only if your system has CDD/Plus installed. Format /CDD_PATH_NAME="cdd-path-name" 4 Qualifier_Values "cdd-path-name" Specifies the CDD/Plus record definition within CDD/Plus. 4 Full_Description /CDD_PATH_NAME can be used in place of or in conjunction with /FIELD statements. The /CDD_PATH_NAME qualifier identifies CDD/Plus defined fields and attributes for SORT. Identifying these fields with this qualifier is the same as specifying them with the /FIELD qualifier. 4 Example /CDD_PATH_NAME="customer" The /CDD_PATH_NAME qualifier identifies the customer record, which was previously identified in CDD/Plus. 3 /CHECK_SEQUENCE Specifies whether or not the sequence of records in the input files is checked when files are merged. By default, the sequence of records is checked. 
Use only with the MERGE command. Formats /CHECK_SEQUENCE /NOCHECK_SEQUENCE 4 Full_Description By default, Merge checks the sequence of records in the input files. If you want to override that default, specify /NOCHECK_SEQUENCE in your specification file text. 4 Example /NOCHECK_SEQUENCE The /NOCHECK_SEQUENCE qualifier overrides Merge's default behavior. 3 /COLLATING_SEQUENCE Specifies the collating instructions for a sort or merge operation. With the /COLLATING_SEQUENCE qualifier, you can specify ASCII (the default), EBCDIC, or Multinational sequence; you can also define your own sequence. Formats /COLLATING_SEQUENCE= (SEQUENCE=sequence_type [,MODIFICATION=(character operator character)] [,IGNORE=character or character-range,...] [,FOLD] [,[NO]TIE_BREAK]) 4 Qualifier_Values SEQUENCE=sequence_type ASCII Specifies ASCII collating sequence, which is the default sequence. EBCDIC Arranges characters according to EBCDIC sequence. The characters remain in ASCII representation; only the order is changed. MULTINATIONAL Arranges characters according to Multinational sequence, which collates the international character set. When you use the Multinational sequence, characters are ordered according to the following rules: o All diacritical forms of a character are given the collating value of the character (A',A",A' collate as A). o Lowercase characters are given the collating value of their uppercase equivalents (a collates as A, a" collates as A"). o If two strings compare as equal, tie-breaking is performed. The strings are compared to detect differences due to diacritical marks, ignored characters, or characters that collate as equal although they are actually different. If the strings still compare as equal, another comparison is done based on the numeric codes of the characters. In this final comparison, lowercase characters are ordered before uppercase. Care should be taken when sorting or merging files for further processing using the Multinational sequence. 
Sequence checking procedures in most programming languages compare numeric characters. Because Multinational is based on actual graphic characters and not on the codes representing those characters, normal sequence checking does not work. user-defined-sequence Specifies a user-defined collating sequence. Define a collating sequence by specifying a string of single or double characters or ranges of single characters. (A double character is any set of two single characters collated as if they were one character. For example, "CH" can be defined to collate as "C".) This string should be enclosed in parentheses. You can also represent characters by their corresponding octal, decimal, or hexadecimal values using the radix operators: %O, %D, %X. You must observe the following rules when defining your collating sequence: o Enclose characters in quotation marks (" "). o Separate each character and character range with a comma, and enclose the entire list in parentheses. o Give all the characters appearing in the character keys in the sort or merge operation a collating value. Any character not given a collating value will be ignored unless the FOLD or MODIFICATION options are specified. o Do not define a character more than once. o Do not specify the null character by using quotation marks (""). Instead, use a radix operator such as %X0. o Specify quotation marks by enclosing them within another set of quotation marks ("" "") or by using a radix operator. MODIFICATION=(character operator character) Specifies a change to the collating sequence specified in the SEQUENCE option. You can modify the ASCII, EBCDIC, Multinational, or user-defined sequence. The sequence being modified must be specified with the SEQUENCE qualifier even if the sequence is the default (ASCII). character Specifies a character in the collating sequence. You can specify a single or double character. A double character is any set of two single characters collated as if they were a single character. 
Enclose the character in quotation marks. operator Specifies the operator used to compare the characters. You can specify greater than (>), less than (<), or equal to (=). These are the kinds of changes permitted in the MODIFICATION option: o A single or double character can be equated to a single character that has already been assigned a collating value ("a"="A"). o A single or double character can collate after a single character that has already been assigned a collating value ("CH">"C"). o A single or double character can collate before a single character that has already been assigned a collating value ("D"<"A"). o A double character can be equated to a previously defined double character ("CH" = "SH"). o A single character can be equated to a double character sequence ("C" = "CH"). IGNORE Specifies that Sort/Merge ignore a character or character range in the collating sequence when making an initial comparison. Note that, when tie-breaking takes place, Sort/Merge considers the characters specified with the IGNORE qualifier. Tie-breaking takes place when two or more strings have compared as equal and the Multinational sequence is being used or when two or more strings have compared as equal and the TIE_BREAK qualifier has been specified. FOLD Specifies that all lowercase letters be given the collating value of their uppercase equivalents. For ASCII, EBCDIC, and user-defined sequences, the lowercase letters are a to z. Because the lowercase letters in the Multinational sequence already have the collating value of their uppercase equivalents, using FOLD is unnecessary. TIE_BREAK Specifies whether or not Sort/Merge should use numeric values to break any ties between characters that have equivalent values. By default, tie-breaking occurs with the Multinational sequence. Specifying NOTIE_BREAK overrides this default and ensures that no further comparisons are made after the initial comparison. 
A TIE_BREAK option must be specified for the ASCII, EBCDIC, and user-defined sequences in order for tie-breaking to occur. TIE_BREAK should be used when specifying FOLD or MODIFICATION for these sequences. 4 Full_Description The MODIFICATION, IGNORE, FOLD, and [NO]TIE_BREAK options of the /COLLATING_SEQUENCE qualifier can also be used to modify the collating sequence. You can make more than one modification to the collating sequence. If you intend to modify any collating sequence, you must specify the sequence in the SEQUENCE option, even if it is the default sequence (ASCII). Because the FOLD, MODIFICATION, and IGNORE qualifiers are processed in the order in which they are specified, care should be taken when specifying the order of those qualifiers. Normally, FOLD should be specified after all MODIFICATION and IGNORE qualifiers to ensure that the effects of the MODIFICATION and IGNORE qualifiers apply to uppercase and lowercase characters. You can request that Sort/Merge ignore a character or character range within the given collating sequence by using the IGNORE qualifier. By default, in the Multinational collating sequence, Sort/Merge folds lowercase letters into their uppercase equivalents. If you want this folding to occur in the other collating sequences, you must specify a FOLD qualifier with the instructions for the collating sequence. Also, by default in the Multinational collating sequence, Sort/Merge uses numeric comparisons to break any ties in the collating values. Ties occur when two equal keys collate the same. If you do not want the default when using the Multinational collating sequence, specify the keyword NOTIE_BREAK. For tie breaking in the other collating sequences, specify a TIE_BREAK qualifier. 
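The interaction of IGNORE, FOLD, and TIE_BREAK can be pictured with a small Python sketch. It is a simplified model, not the Sort/Merge implementation: the function name collating_key is hypothetical, plain character codes stand in for the collating values of the ASCII sequence, and the options are applied in a fixed order rather than in the order specified.

```python
def collating_key(text, ignore=(), fold=False, tie_break=False):
    """Return a sort key that mimics IGNORE, FOLD, and TIE_BREAK (simplified)."""
    primary = []
    for ch in text:
        if ch in ignore:
            continue  # IGNORE: skipped in the initial comparison
        if fold and "a" <= ch <= "z":
            ch = ch.upper()  # FOLD: a to z take the value of A to Z
        primary.append(ord(ch))
    # TIE_BREAK: fall back to the numeric codes of the original string,
    # so ignored characters are considered again at this stage.
    secondary = [ord(ch) for ch in text] if tie_break else []
    return (primary, secondary)


phones = ["252-3412", "252 3412", "2523412"]

# Without tie-breaking, the three fields produce one and the same key.
keys = {str(collating_key(p, ignore="- ")) for p in phones}

# With tie-breaking, the original character codes decide the order.
ordered = sorted(phones, key=lambda p: collating_key(p, ignore="- ", tie_break=True))
print(len(keys), ordered)
```

The two-part key reflects the description above: the initial comparison uses only the modified collating values, and the numeric comparison is consulted only when that first comparison ends in a tie.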
4 Examples 1./COLLATING_SEQUENCE=(SEQUENCE=ASCII,IGNORE=("-"," ")) This /COLLATING_SEQUENCE qualifier with an IGNORE option specified results in the following fields being compared as equal before tie breaking: 252-3412 252 3412 2523412 2./COLLATING_SEQUENCE=(SEQUENCE=("A"-"L","LL","M"-"R","RR","S"-"Z")) This /COLLATING_SEQUENCE qualifier defines a sequence in which the double character LL collates as a single character between L and M, and the double character RR collates as a single character between R and S. These double characters would otherwise appear in their usual alphabetical order. By default, this user-defined sequence does not define any other characters, such as lowercase a to z. 3./COLLATING_SEQUENCE=(SEQUENCE= ("AN","EB","AR","PR","AY","UN","UL", "UG","EP","CT","OV","EC","0"-"9"), MODIFICATION=("'"="19"), FOLD) This /COLLATING_SEQUENCE qualifier defines a collating sequence. It includes a user-defined sequence that gives each month a unique value in chronological order. For example, if you want to order a file called SEMINAR.DAT according to the date, the file SEMINAR.DAT would be set up as follows: 16 NOV 1983 Communication Skills 05 APR 1984 Coping with Alcoholism 11 Jan '84 How to Be Assertive 12 OCT 1983 Improving Productivity 15 MAR 1984 Living with Your Teenager 08 FEB 1984 Single Parenting 07 Dec '83 Stress --- Causes and Cures 14 SEP 1983 Time Management The primary key is the year field; the secondary key is the month field. Because the month field is not numeric and you want the months ordered chronologically, you must define your own collating sequence. You can do this by sorting on the second two letters of each month, in their chronological sequence, giving each month a unique key value. The MODIFICATION option specifies that the apostrophe (') be equated to 19, thereby allowing a comparison of '83 and 1984. The FOLD option specifies that uppercase and lowercase letters are treated as equal. 
The output from this sort operation appears as follows: 14 SEP 1983 Time Management 12 OCT 1983 Improving Productivity 16 NOV 1983 Communication Skills 07 Dec '83 Stress --- Causes and Cures 11 Jan '84 How to Be Assertive 08 FEB 1984 Single Parenting 15 MAR 1984 Living with Your Teenager 05 APR 1984 Coping with Alcoholism 3 /CONDITION Defines conditions for key and data handling and for record selection. Formats /CONDITION= (NAME=condition-name, TEST=(field-name operator test-condition [logical-operator ...])) 4 Qualifier_Values NAME=condition-name Specifies the name of the condition you are testing. This condition-name can be used in /KEY, /DATA, /OMIT, and /INCLUDE qualifiers after it has been defined using the /CONDITION qualifier. TEST=(field-name operator test-condition) Specifies the conditional test. field-name Specifies the name of the field you are testing. The field-name must be defined previously by a /FIELD qualifier. operator Specifies the logical or relational operator used in the conditional test. The logical operators that you can use are AND and OR. The relational operators that you can specify are as follows: EQ Equal to NE Not equal to GT Greater than GE Greater than or equal to LT Less than LE Less than or equal to test-condition Specifies the constant or field-name against which you are testing. A constant is specified with the following format: Decimal_digits (default) %Ddecimal_digits %Ooctal_digits %Xhexadecimal_digits "character" NOTE Normally, you do not need to specify the radix operator (%D); however, test-condition will assume the same data type as the field-name. The field-name must be defined by a /FIELD qualifier. 4 Full_Description A specification file can be used to change the relative order of a record or to alter the contents of certain fields of a record. You must first use a /CONDITION qualifier to define a conditional test. 
Once you define a test using a /CONDITION qualifier, you can use that same test with a /KEY qualifier to change the order of records or with a /DATA qualifier to change the contents of a field. You can also use the test with an /OMIT or /INCLUDE qualifier to select which records appear in the output file. If you want to change the order of records in the output file, first specify a condition name with a /CONDITION qualifier and set up a test for what meets that condition. Then, specify the relative order with a /KEY qualifier of the form: /KEY=(IF condition-name THEN value ELSE value) You can use any values to specify the relative order of the records. The /CONDITION qualifier also permits you to change the contents of a field in the output records. First specify a condition name, and then set up a test for what meets the condition. Specify the contents you want in the field in a /DATA qualifier of the form: /DATA=(IF condition-name THEN "new-contents" ELSE "new-contents") 4 Examples 1./FIELD=(NAME=AGENT,POSITION:20,SIZE:15) /CONDITION=(NAME=AGENCY, TEST=(AGENT EQ "Real-T Trust" OR AGENT EQ "Realty Trust")) /DATA=(IF AGENCY THEN "Realty Trust" ELSE AGENT) In this example, two real estate files are being sorted. One file refers to an agency as Real-T Trust; the other refers to the same agency as Realty Trust. The /CONDITION and /DATA qualifiers instruct Sort to list the AGENT field in the sorted output file as Realty Trust. 2./FIELD=(NAME=ZIP,POSITION:60,SIZE:6) /CONDITION=(NAME=LOCATION, TEST=(ZIP EQ "01863")) /KEY=(IF LOCATION THEN 1 ELSE 2) In this example, all the records with a zip code of 01863 will appear at the beginning of the sorted output file. The conditional test is on the ZIP field, defined with the /FIELD qualifier; the condition is named LOCATION. The values 1 and 2 in this /KEY qualifier signify a relative order for those records that satisfy the condition and those that do not. 
3./FIELD=(NAME=ZIP,POSITION:60,SIZE:6) /CONDITION=(NAME=LOCATION, TEST=(ZIP EQ "01863")) /DATA=(IF LOCATION THEN "NORTH CHELMSFORD" ELSE "Outside district") In this example, the /CONDITION qualifier tests for the 01863 zip code. The /DATA qualifier specifies that the name of town field will be added to the output record, depending on the test results. 4./FIELD=(NAME=FFLOAT,POS:1,SIZ:0,F_FLOATING) /CONDITION=(NAME=CFFLOAT,TEST=(FFLOAT GE 100)) /OMIT=(CONDITION=CFFLOAT) In this example, the number 100 is considered to be an F_FLOATING data type because field FFLOAT is defined as F_FLOATING in the /FIELD qualifier. 3 /DATA Specifies the fields of a record to be directed to the output file. Formats /DATA= field-name /DATA= (IF condition-name THEN "new-contents" ELSE "new-contents") 4 Qualifier_Values field-name Specifies the name of a field in a record. The field-name must be defined previously in a /FIELD qualifier. condition-name Specifies a condition-name that has been defined previously in a /CONDITION qualifier. new-contents Specifies how the record is to be altered. The new-contents can be a constant or a field-name that has been defined in a /FIELD qualifier. 4 Full_Description A /DATA qualifier must identify every field in the records you are directing to the output file. Specify the data fields in the order you want them to appear in the output record. By default, the record format for an output file is the same as that for the input file. If you want to eliminate or reorder fields from the output record, you can use the /DATA qualifier, causing only those fields identified by the /DATA qualifier to be directed to the output file. You can conditionally change the contents of a field in the output records by first specifying a condition name and then setting up a test for what meets the condition in a /CONDITION qualifier. 
You then specify the contents you want in the field in a /DATA qualifier of the form: /DATA=(IF condition-name THEN "new-contents" ELSE "new-contents") 4 Examples 1./FIELD=(NAME=AGENT,POSITION:1,SIZE:5) /FIELD=(NAME=ZIP,POSITION:6,SIZE:3) /FIELD=(NAME=STYLE,POSITION:10,SIZE:5) /FIELD=(NAME=CONDITION,POSITION:16,SIZE:9) /FIELD=(NAME=PRICE,POSITION:26,SIZE:5) /FIELD=(NAME=TAXES,POSITION:32,SIZE:5) /DATA=PRICE /DATA=" " /DATA=TAXES /DATA=" " /DATA=STYLE /DATA=" " /DATA=ZIP /DATA=" " /DATA=AGENT The /FIELD qualifiers define the fields in the records from an input file that has the following format: AGENT ZIP STYLE CONDITION PRICE TAXES The /DATA qualifiers, which use the field-names defined in the /FIELD qualifiers, reformat the records to create output records of the following format: PRICE TAXES STYLE ZIP AGENT 2./FIELD=(NAME=AGENT,POSITION:20,SIZE:15) /CONDITION=(NAME=AGENCY, TEST=(AGENT EQ "Real-T Trust" OR AGENT EQ "Realty Trust")) /DATA=(IF AGENCY THEN "Realty Trust" ELSE AGENT) In this example, two real estate files are being sorted. One file refers to an agency as Real-T Trust; the other refers to the same agency as Realty Trust. The /CONDITION and /DATA qualifiers instruct Sort to list the AGENT field in the sorted output file as Realty Trust. 3 /FIELD Specification File Qualifier Defines the input record fields to be used for a sort or merge operation or in a conditional evaluation, or whose order or format will change in the output record. You identify each field by specifying a name, its position and size in the record, and its data type. You can also use /FIELD to define a constant and assign it a value of any valid sort/merge data-type for use in /KEY, /DATA, and /CONDITION statements. Formats /FIELD=(NAME=field-name,POSITION:n,SIZE:n,[DIGITS:n,]data-type) /FIELD=(NAME=field-name,VALUE:n,SIZE:n,[DIGITS:n,]data-type) 4 Qualifier_Values NAME=field-name Specifies the name of the field. 
The field-name cannot have any embedded spaces, must begin with an alphabetic character, and can be no longer than 31 characters. POSITION:n Specifies the position of the field in the record. VALUE:n Assigns a value to a constant field for use in a /KEY, /DATA, or /CONDITION statement. If you specify VALUE:n, do not specify POSITION:n, because the field is a constant and not part of an input record. SIZE:n Specifies the size of a field containing character or binary data. In the specification file, SIZE implies byte lengths. The data type determines what values are acceptable, as well as the units in which the size is specified: o For character data, the size must not exceed 32,767 (characters). o For binary data, the size specified must be 1, 2, 4, 8, or 16 (bytes). High-performance Sort/Merge: The high-performance Sort/Merge utility currently supports only 1, 2, 4, and 8-byte binary keys. o For floating-point data, no size is specified. DIGITS:n Specifies the size of a field containing decimal data. The size of a field containing decimal data is specified in digits. The size must not exceed 31 digits. Note that DIGITS:n is used only when describing a field containing decimal data. data-type Specifies the data type of the field. You are not required to specify the data-type if it is character; Sort assumes character data type by default. 
The following data types are recognized by OpenVMS Sort/Merge: CHARACTER BINARY[,SIGNED] BINARY,UNSIGNED D_FLOATING DECIMAL,LEADING_SIGN,[OVERPUNCHED_SIGN,SIGNED] DECIMAL,LEADING_SIGN,SEPARATE_SIGN[,SIGNED] DECIMAL[,SIGNED,TRAILING_SIGN,OVERPUNCHED_SIGN] DECIMAL,[TRAILING_SIGN],SEPARATE_SIGN[,SIGNED] DECIMAL,UNSIGNED F_FLOATING G_FLOATING H_FLOATING PACKED_DECIMAL S_FLOATING, IEEE (Alpha systems only) T_FLOATING, IEEE (Alpha systems only) ZONED 4 Full_Description Use the /FIELD qualifier to define input record fields to be used for a sort or merge operation or in a conditional evaluation, or whose order or format will change in the output record. Identify each field by specifying a name in the /FIELD qualifier, a constant value or the field position, and the size and data type of the field. Field names must be unique; no duplicate field names are allowed. You cannot use more than 255 key field definitions. Once the field-name has been specified in the /FIELD qualifier, it can be used in the /CONDITION, /KEY, and /DATA qualifiers. 4 Example /FIELD=(NAME=SALARY,POSITION:10,DIGITS:8,DECIMAL) This /FIELD qualifier identifies a field in a record by the name SALARY, specifies that it starts in position 10 of the record, is 8 digits long, and consists of decimal data. 3 /INCLUDE Specification File Qualifier Specifies record selection as well as multiple record formats. Formats /INCLUDE=(CONDITION=condition-name [,KEY=...][,DATA=...]) 4 Qualifier_Values CONDITION=condition-name Refers to the condition-name specified in a previous /CONDITION qualifier. KEY=... Defines a key field because the default record type defined in the /KEY qualifier is not being used. DATA=... Defines a data field because the default record type defined in the /DATA qualifier is not being used. 4 Full_Description You can specify that records are to be conditionally included in an output file. 
After defining a condition in a /CONDITION qualifier, specify record selection in an /INCLUDE qualifier requesting that records satisfying the condition are to be included in the output file. You can specify multiple /INCLUDE and /OMIT qualifiers in a specification file. The order in which you specify them determines the order the input records are tested for inclusion. After the last /INCLUDE qualifier, all records that have not already been included or explicitly omitted are omitted. You can unconditionally include any records not previously omitted or included by specifying /INCLUDE without a condition. When sorting multiple record formats, one /INCLUDE qualifier should be specified for each different record format among the records to be sorted. If you do not specify a KEY option within the INCLUDE qualifier, Sort assumes the default key definitions. If the KEY is specified in the /INCLUDE qualifier, the default key definitions are not used. The order of the KEY fields in the /INCLUDE qualifier determines how the internal key is built for sorting. The order of the DATA fields in the /INCLUDE qualifier determines the way the output record is formatted. If you specify a key or data field in an /INCLUDE qualifier, you must define all other key or data fields in the record. 4 Example /FIELD=(NAME=ZIP,POSITION:20,SIZE:6) /CONDITION=(NAME=LOCATION, TEST=(ZIP EQ "01863")) /INCLUDE=(CONDITION=LOCATION) These /CONDITION and /INCLUDE qualifiers specify that records with the zip code 01863 will be included in the output file. 3 /KEY Specification File Qualifier Identifies key field names, specifies sorting order, and changes the order of records in the output file. Formats /KEY=field-name /KEY=(field-name,order) /KEY=([IF condition-name THEN value ELSE]...value [,order]) 4 Qualifier_Values field-name Specifies the name of the key field. The field-name has been previously specified in a /FIELD qualifier. order Specifies the order of the sort. 
The ASCENDING option specifies ascending order for a sort or merge operation. This option is the default. The DESCENDING option specifies descending order for a sort or merge operation. value Specifies the key. The value can be a constant or a field-name that has been defined in a /FIELD qualifier. 4 Full_Description If you are sorting on the entire record using character data, you do not need to specify your key field. Otherwise, specify a /KEY qualifier for each of the keys, in the order of their priority. You can sort on as many as 255 key fields. There are three ways to use the /KEY qualifier: o To identify the key field name. o To identify the key field name and to specify sorting order. In this case, enclose the field name and the order option in parentheses. o As a conditional qualifier, to change the order of records in the output file. First, specify a condition name in a /CONDITION qualifier, and set up a test for what meets that condition. Then, specify the relative order in a /KEY qualifier of the form: /KEY=(IF condition-name THEN value ELSE value) You can use any values to specify the relative order of the records. 4 Examples 1./FIELD=(NAME=SALARY,POSITION:10,DIGITS:8,DECIMAL) /KEY=(SALARY,DESCENDING) This /KEY qualifier specifies that the key field is SALARY and that the sorting order is descending. 2./FIELD=(NAME=ZIP,POSITION:20,SIZE:6) /CONDITION=(NAME=LOCATION, TEST=(ZIP EQ "01863")) /KEY=(IF LOCATION THEN 1 ELSE 2) In this example, all the records with the zip code 01863 are to appear at the beginning of the sorted output file. The conditional test LOCATION (defined in a /CONDITION qualifier) is on the ZIP field (named in a /FIELD clause). The values of 1 and 2 in this /KEY clause signify a relative order for those records that satisfy the condition and those that do not. 3 /OMIT Specifies that records are to be omitted from the output file based on a condition defined with a /CONDITION qualifier. 
   Format

     /OMIT=(CONDITION=condition-name)

4 Qualifier_Value

   CONDITION=condition-name

      Refers to the condition-name previously specified in a
      /CONDITION qualifier.

4 Full_Description

   You can specify that records are to be omitted from the output file
   by using the /OMIT qualifier. First, define a condition with the
   /CONDITION qualifier. Then specify your record selection with an
   /OMIT qualifier to request that the records satisfying that
   condition be omitted from your sort. By default, Sort/Merge
   includes all the other input records in the output file.

   You can specify multiple /OMIT and /INCLUDE qualifiers in your
   specification file. The order in which you specify them determines
   the order in which the input records are tested for omission. After
   the last /OMIT qualifier, all records that have not already been
   included or omitted are included. You can unconditionally omit any
   records not previously omitted or included by specifying /OMIT
   without a condition.

4 Example

     /FIELD=(NAME=ZIP,POSITION:20,SIZE:6)
     /CONDITION=(NAME=LOCATION, TEST=(ZIP EQ "01863"))
     /OMIT=(CONDITION=LOCATION)

   These /CONDITION and /OMIT qualifiers specify that records with the
   zip code 01863 are to be omitted from your output file.

3 /PAD
   Allows you to specify a pad character to use when reformatting
   records or when comparing strings of unequal length.

   Format

     /PAD=single-character

4 Qualifier_Value

   single-character

      Specifies the character that the Sort utility will use to pad a
      string. You can give the pad character as a quoted character or
      as a decimal, octal, or hexadecimal value. Specify the pad
      character as follows:

      o  Use quotation marks for a character. For example, "#" would
         specify the number sign.

      o  Use decimal radix for decimal digits. For example, %D35 would
         specify the decimal number 35.

      o  Use octal radix for octal digits. For example, %O043 would
         specify the octal number 043.

      o  Use hexadecimal radix for hexadecimal digits. For example,
         %X23 would specify the hexadecimal number 23.
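   The four example values in the list above all denote the same
   character, because decimal 35, octal 043, and hexadecimal 23 are
   each the ASCII code of the number sign. The Python fragment below
   is a minimal sketch that checks this arithmetic. It is not part of
   Sort, and parse_pad is a hypothetical helper written only for this
   illustration; it makes no claim about how Sort parses the
   qualifier.

```python
# Hypothetical helper, not part of Sort: models the four pad-character
# notations described in the list and returns the character each denotes.
def parse_pad(spec: str) -> str:
    radixes = {"%D": 10, "%O": 8, "%X": 16}
    prefix = spec[:2].upper()
    if prefix in radixes:
        # Radix notation: convert the digits to a character code.
        return chr(int(spec[2:], radixes[prefix]))
    if len(spec) == 3 and spec[0] == '"' and spec[-1] == '"':
        # Quoted notation: the character between the quotation marks.
        return spec[1]
    raise ValueError(f"not a single-character pad specification: {spec!r}")


pads = [parse_pad(s) for s in ('"#"', "%D35", "%O043", "%X23")]
print(pads)  # prints ['#', '#', '#', '#']
```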
4 Full_Description

   Use the /PAD qualifier to specify a pad character when comparing
   strings of unequal length or when reformatting records. By default,
   Sort uses the null character for padding, ensuring conformity with
   the previous versions.

   Double characters that can be defined as single characters
   ("ch" > "c") cannot be used as pad characters.

4 Example

     /PAD="."

   This example of a /PAD qualifier specifies that records will be
   padded with periods.

3 /PROCESS
   Defines the processing method (record, tag, address, or index) for
   the sorting operation. Use only with the SORT command.

   Format

     /PROCESS=type

4 Qualifier_Values

   RECORD

      Specifies the record sort. This sort process is the default.

   TAG

      Specifies the tag sort.

   ADDRESS

      Specifies the address sort.

   INDEX

      Specifies the index sort.

4 Full_Description

   By default, Sort uses a record sorting process. You can also
   specify a tag, address, or index sorting process. If you intend to
   reformat the output records, you cannot use address or index sort.

   For a comparison of the four processes, see the description of
   /PROCESS in the Command Qualifiers section.

   Use the /PROCESS qualifier with the SORT command only.

4 Example

     /PROCESS=tag

   This example of the /PROCESS qualifier specifies that Sort use a
   tag sorting process.

3 /STABLE
   Specifies that records with equal keys are directed to the output
   file in their input file order. The default condition is /NOSTABLE.

   Formats

     /STABLE
     /NOSTABLE

4 Full_Description

   By default, when records with identical keys are sorted, their
   order in the output file may differ from their order in the input
   file. Specifying the /STABLE qualifier in a specification file
   arranges records with equal keys in the output file in the order of
   the input files as specified in the command line. If you use this
   qualifier when sorting multiple input files, records with equal
   keys in the first file precede those from the second file in the
   output, and so on.
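   The ordering rule for equal keys can be pictured with a small
   Python model. This is a sketch of the behavior described above,
   not Sort's implementation: the two lists stand in for two
   hypothetical input files named in that order on the command line,
   and Python's built-in sorted() is used because it is documented to
   be a stable sort.

```python
# Illustrative model only. Each record is (key, origin); the data is made up.
file1 = [("SMITH", "file1 rec1"), ("JONES", "file1 rec2"), ("SMITH", "file1 rec3")]
file2 = [("JONES", "file2 rec1"), ("SMITH", "file2 rec2")]

# Read the input files in the order given on the command line.
records = file1 + file2

# Sort on the key only. A stable sort keeps equal-key records in
# the order in which they were read.
stable_output = sorted(records, key=lambda rec: rec[0])

for key, origin in stable_output:
    print(key, origin)
# prints:
# JONES file1 rec2
# JONES file2 rec1
# SMITH file1 rec1
# SMITH file1 rec3
# SMITH file2 rec2
```

   Within each key, records from the first list come before records
   from the second, and records from the same list keep their
   original relative order.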
4 Example

     /STABLE

   This example of the /STABLE qualifier ensures that records with
   equal keys will have the same order in the input and output files.

3 /WORK_FILES
   Reassigns work files to different disk-structured devices to
   improve performance. Use only with the SORT command.

   Format

     /WORK_FILES=(device[,...])

4 Qualifier_Value

   device

      Specifies a logical name for the work file. Unlike the DCL
      qualifier /WORK_FILES=n, the specification file qualifier
      /WORK_FILES=(device[,...]) specifies work file assignments, not
      the number of work files.

4 Full_Description

   You can improve the performance of Sort by placing work files on
   different disk-structured devices. Using the /WORK_FILES qualifier
   in a specification file to reassign work files makes it unnecessary
   to make logical assignments prior to invoking Sort at the command
   or program level.

4 Example

     /WORK_FILES=("WRKD$:")

   This example of a /WORK_FILES qualifier assigns one of Sort's work
   files to the device WRKD$: because that device has the most space
   available.