.ds DT July 9, 1993 \" use troff -mm
.nr C 3
.nr N 2
.SA 1 \" right justified
.TL "311466-6713" "49059-6" \" charging case filing case
Guidelines for writing \f5ksh-93\fP built-in commands
.AU "David G. Korn" DGK FP 11267 8062 D-237 "(research!dgk)"
.AF
.TM 11267-930???-93 \" technical memo + TM numbers
.MT 4
.AS 2 \" abstract start for TM
One of the features of \f5ksh-93\fP, the latest version of \f5ksh\fP,
is the ability to add built-in commands at run time.
This feature only works on operating systems that have the ability
to load and link code into the current process at run time.
Some examples of the systems that have this feature
are System V Release 4, Solaris, SunOS, HP-UX Release 8 and above,
AIX 3.2 and above, and Microsoft Windows systems.
.P
This memo describes how to write and compile programs
that can be loaded into \f5ksh\fP at run time as built-in
commands.
.AE \" abstract end
.OK Shell "Command interpreter" Language UNIX \" keyword
.MT 1 \" memo type
.H 1 INTRODUCTION
A built-in command is executed without creating a separate process.
Instead, the command is invoked as a C function by \f5ksh\fP.
If this function has no side effects in the shell process,
then the behavior of this built-in is identical to that of
the equivalent stand-alone command. The primary difference
in this case is performance. The overhead of process creation
is eliminated. For commands of short duration, the effect
can be dramatic. For example, on SunOS 4.1,
\f5wc\fP on a small file of about 1000 bytes runs
about 50 times faster as a built-in command.
.P
In addition, built-in commands that have side effects on the
shell environment can be written.
This is usually done to extend the application domain for
shell programming.
For example, an X-windows extension
that makes heavy use of the shell variable namespace
was added as a group of built-in commands that
are loaded at run time.
The result is a windowing shell that can be used to write
X-windows applications.
.P
While there are definite advantages to adding built-in
commands, there are some disadvantages as well.
Since the built-in command and \f5ksh\fP share the same
address space, a coding error in the built-in program
may affect the behavior of \f5ksh\fP, perhaps causing
it to core dump or hang.
Debugging is also more complex since your code is now
a part of a larger entity.
The isolation provided by a separate process
guarantees that all resources used by the command
will be freed when the command completes;
a built-in has no such guarantee and must release its resources itself.
Also, since the address space of \f5ksh\fP will be larger,
this may increase the time it takes \f5ksh\fP to fork() and
exec() a non-built-in command.
It makes no sense to add a built-in command that takes
a long time to run or that is run only once, since the performance
benefits will be negligible.
Built-ins that have side effects in the current shell
environment have the disadvantage of increasing the
coupling between the built-in and \f5ksh\fP, making
the overall system less modular and more monolithic.
.P
Despite these drawbacks, in many cases extending
\f5ksh\fP by adding built-in
commands makes sense and allows reuse of the shell
scripting ability in an application-specific domain.
This memo describes how to write \f5ksh\fP extensions.
.H 1 "WRITING BUILT-IN COMMANDS"
There is a development kit available for writing \f5ksh\fP
built-ins. The development kit has three directories,
\f5include\fP, \f5lib\fP, and \f5bin\fP.
The \f5include\fP directory contains a sub-directory
named \f5ast\fP that contains interface prototypes
for functions that you can call from built-ins.
The \f5lib\fP
directory contains the \fBast\fP library\*F
.FS
\fBast\fP stands for Advanced Software Technology.
.FE
and a library named \fBlibcmd\fP that contains versions
of several of the standard POSIX\*(Rf
.RS
.I "POSIX \- Part 2: Shell and Utilities,"
IEEE Std 1003.2-1992, ISO/IEC 9945-2:1993.
.RF
utilities that can be made run time built-ins.
It is best to set the value of the environment variable
\fB\s-1PACKAGE_\s+1ast\fP to the pathname of the directory
containing the development kit.
Users of \f5nmake\fP\*(Rf
.RS
Glenn Fowler,
Nmake reference needed
.RF
2.3 and above will then be able to
use the rule
.nf
.in .5i
\f5:PACKAGE: ast\fP
.in
.fi
in their makefiles and not have to specify any \f5-I\fP switches
to the compiler.
.P
A built-in command has a calling convention similar to
the \f5main\fP function of a program,
.nf
.in .5i
\f5int main(int argc, char *argv[])\fP.
.in
.fi
However, instead of \f5main\fP, you must use the function name
\f5b_\fP\fIname\fP, where \fIname\fP is the name
of the built-in you wish to define.
The built-in function takes a third
\f5void*\fP argument which you can define as \f5NULL\fP.
Instead of \f5exit\fP, you need to use \f5return\fP
to terminate your command.
The return value will become the exit status of the command.
.P
The steps necessary to create and add a run time built-in are
illustrated in the following simple example.
Suppose you wish to add a built-in command named \f5hello\fP
which requires one argument and prints the word hello followed
by its argument.
First, write the following program in the file
\f5hello.c\fP:
.nf
.in .5i
\f5#include <stdio.h>
int b_hello(int argc, char *argv[], void *context)
{
	if(argc != 2)
	{
		fprintf(stderr,"Usage: hello arg\en");
		return(2);
	}
	printf("hello %s\en",argv[1]);
	return(0);
}\fP
.in
.fi
.P
Next, the program needs to be compiled.
On some systems it is necessary to specify a compiler
option to produce position independent code
for dynamic linking.
If you do not compile with \f5nmake\fP,
it is important to specify a special include directory
when compiling built-ins,
.nf
.in .5i
\f5cc -pic -I$PACKAGE_ast/include -c hello.c\fP
.in
.fi
since the special version of \f5<stdio.h>\fP
in the development kit is required.
This command generates \f5hello.o\fP in the current
directory.
.P
On some systems, you cannot load \f5hello.o\fP directly;
you must build a shared library instead.
Unfortunately, the method for generating a shared library
differs from one operating system to another.
However, if you are building with the AT\&T \f5nmake\fP
program you can use the \f5:LIBRARY:\fP rule to specify
this in a system independent fashion.
In addition, if you have several built-ins, it is desirable
to build a shared library that contains them all.
.P
The final step is using the built-in.
This can be done with the \f5ksh\fP command \f5builtin\fP.
To load the shared library \f5hello.so\fP and to add
the built-in \f5hello\fP, invoke the command
.nf
.in .5i
\f5builtin -f hello hello\fP
.in
.fi
The suffix for the shared library can be omitted, in
which case the shell will add an appropriate suffix
for the system that it is loading from.
Once this command has been invoked, you can invoke \f5hello\fP
as you do any other command.
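.P
Because a built-in is an ordinary C function, it can be exercised
from a small stand-alone driver before it is loaded into \f5ksh\fP,
which keeps early debugging outside the address space of the shell.
The following is only a minimal sketch under stated assumptions:
the helper \f5run_builtin()\fP and the type \f5Builtin_f\fP are
hypothetical names that are not part of \f5ksh\fP or the development kit,
and \f5b_hello()\fP is repeated from \f5hello.c\fP so that the
sketch is complete.
.nf
.in .5i
.ft 5
.eo
```c
#include <stdio.h>

/* Calling convention shared by run time built-ins. */
typedef int (*Builtin_f)(int, char*[], void*);

/* The hello built-in from hello.c, repeated for completeness. */
int b_hello(int argc, char *argv[], void *context)
{
	(void)context;		/* not used by this built-in */
	if(argc != 2)
	{
		fprintf(stderr,"Usage: hello arg\n");
		return(2);
	}
	printf("hello %s\n",argv[1]);
	return(0);
}

/*
 * Hypothetical helper, not part of ksh: count the arguments in a
 * NULL terminated vector and call the built-in with NULL as the
 * third argument. The value returned is the built-in's exit status.
 */
int run_builtin(Builtin_f fn, char *argv[])
{
	int argc = 0;
	while(argv[argc])
		argc++;
	return((*fn)(argc, argv, NULL));
}
```
.ec
.ft P
.in
.fi
A \f5main()\fP that passes its own argument vector to
\f5run_builtin(b_hello, argv)\fP and returns the result turns this
file into an ordinary test program.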
.P
It is often desirable to make a command \fIbuilt-in\fP
the first time that it is referenced. The first
time \f5hello\fP is invoked, \f5ksh\fP should load and execute it,
whereas for subsequent invocations \f5ksh\fP should just execute the built-in.
This can be done by creating a file named \f5hello\fP
with the following contents:
.nf
.in .5i
\f5function hello
{
	unset -f hello
	builtin -f hello hello
	hello "$@"
}\fP
.in
.fi
This file \f5hello\fP needs to be placed in a directory that is
in your \fB\s-1FPATH\s+1\fP variable. In addition, the full
pathname for \f5hello.so\fP should be used in this script
so that the run time loader will be able to find this shared library
no matter where the command \f5hello\fP is invoked.
.H 1 "CODING REQUIREMENTS AND CONVENTIONS"
As mentioned above, the entry point for built-ins must be of
the form \f5b_\fP\fIname\fP.
Your built-ins can call functions from the standard C library,
the \fBast\fP library, interface functions provided by \f5ksh\fP,
and your own functions.
You should avoid using any global symbols beginning with
.BR sh_ ,
.BR nv_ ,
and
.B ed_
since these are used by \f5ksh\fP itself.
In addition, \f5#define\fP constants in \f5ksh\fP interface
files use symbols beginning with \fBSH_\fP, so you should
avoid using names beginning with \fBSH_\fP.
.H 2 "Header Files"
The development kit provides a portable interface
to the C library and to \fBlibast\fP.
The header files in the development kit are compatible with
K&R C\*(Rf,
.RS
Brian W. Kernighan and Dennis M. Ritchie,
.IR "The C Programming Language" ,
Prentice Hall, 1978.
.RF
ANSI-C\*(Rf,
.RS
American National Standard for Information Systems \- Programming
Language \- C, ANSI X3.159-1989.
.RF
and C++\*(Rf.
.RS
Bjarne Stroustrup,
.IR "C++" ,
Addison Wesley, xxxx
.RF
.P
The best thing to do is to include the header file \f5<shell.h>\fP.
This header file causes the \f5<ast.h>\fP header, the
\f5<error.h>\fP header and the \f5<stak.h>\fP
header to be included, and it defines prototypes
for functions that you can call to get shell
services for your built-ins.
The header file \f5<ast.h>\fP
provides prototypes for many \fBlibast\fP functions
and all the symbol and function definitions from the
ANSI-C headers \f5<stddef.h>\fP,
\f5<stdlib.h>\fP, \f5<stdarg.h>\fP, \f5<limits.h>\fP,
and \f5<string.h>\fP.
It also provides all the symbols and definitions for the
POSIX\*(Rf
.RS
.I "POSIX \- Part 1: System Application Program Interface,"
IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990.
.RF
headers \f5<sys/types.h>\fP, \f5<fcntl.h>\fP, and
\f5<unistd.h>\fP.
You should include \f5<ast.h>\fP instead of one or more of
these headers.
The \f5<error.h>\fP header provides the interface to the error
and option parsing routines described below.
The \f5<stak.h>\fP header provides the interface to the memory
allocation routines described below.
.P
Programs that want to use the information in \f5<sys/stat.h>\fP
should include the file \f5<ls.h>\fP instead.
This provides the complete POSIX interface to \f5stat()\fP
related functions even on non-POSIX systems.
.H 2 "Input/Output"
\f5ksh\fP uses \fBsfio\fP,
the Safe/Fast I/O library\*(Rf,
.RS
David Korn and Kiem-Phong Vo,
.IR "SFIO - A Safe/Fast Input/Output library,"
Proceedings of the Summer Usenix,
pp. , 1991.
.RF
to perform all I/O operations.
The \fBsfio\fP library, which is part of \fBlibast\fP,
provides a superset of the functionality provided by the standard
I/O library defined in ANSI-C.
If none of the additional functionality is required,
and if you are not familiar with \fBsfio\fP and
you do not want to spend the time learning it,
then you can use \fBsfio\fP via the \fBstdio\fP library
interface. The development kit contains the header \f5<stdio.h>\fP
which maps \fBstdio\fP calls to \fBsfio\fP calls.
In most instances the mapping is done
by macros or inline functions so that there is no overhead.
The man page for the \fBsfio\fP library is in an Appendix.
.P
However, there are some very nice extensions and
performance improvements in \fBsfio\fP
and if you plan any major extensions I recommend
that you use it natively.
.H 2 "Error Handling"
For error messages it is best to use the \fBast\fP library
function \f5errormsg()\fP rather than sending output to
\f5stderr\fP or the equivalent \f5sfstderr\fP directly.
Using \f5errormsg()\fP will make error messages appear
more uniform to the user.
Furthermore, using \f5errormsg()\fP should make it easier
to do error message translation for other locales
in future versions of \f5ksh\fP.
.P
The first argument to
\f5errormsg()\fP specifies the dictionary in which the string
will be searched for translation.
The second argument to \f5errormsg()\fP contains the error type
and value. The third argument is a \fIprintf\fP style format
and the remaining arguments are arguments to be printed
as part of the message. A new-line is inserted at the
end of each message and therefore should not appear as
part of the format string.
The second argument should be one of the following:
.VL .5i
.LI \f5ERROR_exit(\fP\fIn\fP\f5)\fP:
If \fIn\fP is nonzero, the built-in will exit with value \fIn\fP after
printing the message.
.LI \f5ERROR_system(\fP\fIn\fP\f5)\fP:
Exit the built-in with exit value \fIn\fP after printing the message.
The system message corresponding to \f5errno\fP is appended
to the message, enclosed within \f5[\ ]\fP.
.LI \f5ERROR_usage(\fP\fIn\fP\f5)\fP:
Will generate a usage message and exit. If \fIn\fP is nonzero,
the exit value will be 2. Otherwise the exit value will be 0.
.LI \f5ERROR_debug(\fP\fIn\fP\f5)\fP:
Will print a level \fIn\fP debugging message and will then continue.
.LI \f5ERROR_warn(\fP\fIn\fP\f5)\fP:
Prints a warning message. \fIn\fP is ignored.
.LE
.H 2 "Option Parsing"
The first thing that a built-in should do is to check
the arguments for correctness and to print any usage
messages on standard error.
For consistency with the rest of \f5ksh\fP, it is best
to use the \fBlibast\fP functions \f5optget()\fP and
\f5optusage()\fP for this
purpose.
The header \f5<error.h>\fP includes prototypes for
these functions.
The \f5optget()\fP function is similar to the
System V C library function \f5getopt()\fP,
but provides some additional capabilities.
Built-ins that use \f5optget()\fP provide a more
consistent user interface.
.P
The \f5optget()\fP function is invoked as
.nf
.in .5i
\f5int optget(char *argv[], const char *optstring)\fP
.in
.fi
where \f5argv\fP is the argument list and \f5optstring\fP
is a string that specifies the allowable options and
additional information that is used to format \fIusage\fP
messages.
In fact a complete man page in \f5troff\fP or \f5html\fP
can be generated by passing a usage string as described
by the \f5getopts\fP command.
As with \f5getopt()\fP,
single letter options are represented by the letter itself,
and options that take a string argument are followed by the \f5:\fP
character.
Option strings have the following special characters:
.VL .5i
.LI \f5:\fP
Used after a letter option to indicate that the option
takes an option argument.
The variable \f5opt_info.arg\fP will point to this
value after the given argument is encountered.
.LI \f5#\fP
Used after a letter option to indicate that the option
can only take a numerical value.
The variable \f5opt_info.num\fP will contain this
value after the given argument is encountered.
.LI \f5?\fP
Used after a \f5:\fP or \f5#\fP (and after the optional \f5?\fP)
to indicate that the
preceding option argument is not required.
.LI \f5[\fP...\f5]\fP
After a \f5:\fP or \f5#\fP, the characters contained
inside the brackets are used to identify the option
argument when generating a \fIusage\fP message.
.LI \fIspace\fP
The remainder of the string will only be used when generating
usage messages.
.LE
.P
The \f5optget()\fP function returns the matching option letter if
one of the legal options is matched.
Otherwise, \f5optget()\fP returns
.VL .5i
.LI \f5':'\fP
If there is an error. In this case the variable \f5opt_info.arg\fP
contains the error string.
.LI \f50\fP
Indicates the end of options.
The variable \f5opt_info.index\fP contains the number of arguments
processed.
.LI \f5'?'\fP
A usage message has been requested.
You normally call \f5optusage()\fP to generate and display
the usage message.
.LE
.P
The following is an example of the option parsing portion
of the \f5wc\fP utility.
.nf
.in +5
\f5#include <shell.h>
/* optget() returns 0 at the end of options, which ends the loop */
while(n=optget(argv,"xf:[file]")) switch(n)
{
    case 'f':
	file = opt_info.arg;
	break;
    case ':':
	error(ERROR_exit(0), "%s", opt_info.arg);
	break;
    case '?':
	error(ERROR_usage(2), "%s", opt_info.arg);
	break;
}\fP
.in
.fi
.H 2 "Storage Management"
It is important that any memory used by your built-in
be returned. Otherwise, if your built-in is called frequently,
\f5ksh\fP will eventually run out of memory.
You should avoid using \f5malloc()\fP for memory that must
be freed before returning from your built-in, because by default,
\f5ksh\fP will terminate your built-in in the event of an
interrupt and the memory will not be freed.
.P
The best way to allocate variable sized storage is
through calls to the \fBstak\fP library
which is included in \fBlibast\fP
and which is used extensively by \f5ksh\fP itself.
Objects allocated with the \f5stakalloc()\fP
function are freed when your function completes
or aborts.
The \fBstak\fP library provides a convenient way to
build variable length strings and other objects dynamically.
The man page for the \fBstak\fP library is contained
in the Appendix.
.P
Before \f5ksh\fP calls each built-in command, it saves
the current stack location and restores it after
it returns.
It is not necessary to save and restore the stack
location in the \f5b_\fP entry function,
but you may want to write functions that use this stack
and restore it when leaving the function.
The following coding convention will do this in
an efficient manner:
.nf
.in .5i
\fIyourfunction\fP\f5()
{
	char	*savebase;
	int	saveoffset;
	if(saveoffset=staktell())
		savebase = stakfreeze(0);
	\fP...\f5
	if(saveoffset)
		stakset(savebase,saveoffset);
	else
		stakseek(0);
}\fP
.in
.fi
.H 1 "CALLING \f5ksh\fP SERVICES"
Some of the more interesting applications are those that extend
the functionality of \f5ksh\fP in application-specific directions.
A prime example of this is the X-windows extension which adds
built-ins to create and delete widgets.
The \fBnval\fP library is used to interface with the shell
name space.
The \fBshell\fP library is used to access other shell services.
.H 2 "The nval library"
A great deal of power is derived from the ability to use
portions of the hierarchical variable namespace provided by \f5ksh-93\fP
and turn these names into active objects.
.P
The \fBnval\fP library is used to interface with shell
variables.
A man page for this library is provided in an Appendix.
You need to include the header \f5<nval.h>\fP
to access the functions defined in the \fBnval\fP library.
All the functions provided by the \fBnval\fP library begin
with the prefix \f5nv_\fP.
Each shell variable is an object in an associative table
that is referenced by name.
The type \f5Namval_t*\fP is a pointer to a shell variable.
To operate on a shell variable, you first get a handle
to the variable with the \f5nv_open()\fP function
and then supply the handle returned as the first
argument of the function that provides an operation
on the variable.
You must call \f5nv_close()\fP when you are finished
using this handle so that the space can be freed once
the value is unset.
The two most frequent operations are to get the value of
the variable, and to assign a value to the variable.
The \f5nv_getval()\fP function returns a pointer to the
value of the variable.
In some cases the pointer returned is to a region that
will be overwritten by the next \f5nv_getval()\fP call
so that if the value isn't used immediately, it should
be copied.
Many variables can also generate a numeric value.
The \f5nv_getnum()\fP function returns a numeric
value for the given variable pointer, calling the
arithmetic evaluator if necessary.
.P
The \f5nv_putval()\fP function is used to assign a new
value to a given variable.
The second argument to \f5nv_putval()\fP is the value
to be assigned
and the third argument is a \fIflag\fP which
is used in interpreting the second argument.
.P
Each shell variable can have one or more attributes.
The \f5nv_isattr()\fP function is used to test for the existence
of one or more attributes.
See the appendix for a complete list of attributes.
.P
By default, each shell variable passively stores the string you
give it with \f5nv_putval()\fP, and returns the value
with \f5nv_getval()\fP. However, it is possible to turn
any node into an active entity by assigning functions
to it that will be called whenever \f5nv_putval()\fP
and/or \f5nv_getval()\fP is called.
In fact there are up to five functions that can be
associated with each variable to override the
default actions.
The type \f5Namfun_t\fP is used to define these functions.
Only those that are non-\f5NULL\fP override the
default actions.
To override the default actions, you must allocate an
instance of \f5Namfun_t\fP, and then assign
the functions that you wish to override.
The \f5putval()\fP
function is called by the \f5nv_putval()\fP function.
A \f5NULL\fP for the \fIvalue\fP argument
indicates a request to unset the variable.
The \fItype\fP argument might contain the \f5NV_INTEGER\fP
bit so you should be prepared to do a conversion if
necessary.
The \f5getval()\fP
function is called by \f5nv_getval()\fP
and must return a string.
The \f5getnum()\fP
function is called by the arithmetic evaluator
and must return a double.
If \f5getnum()\fP is omitted, the evaluator will call \f5nv_getval()\fP and
convert the result to a number.
.P
The functionality of a variable can further be increased
by adding discipline functions that
can be associated with the variable.
A discipline function allows a script that uses your
variable to define functions whose name is
\fIvarname\fP\f5.\fP\fIdiscname\fP
where \fIvarname\fP is the name of the variable, and \fIdiscname\fP
is the name of the discipline.
When the user defines such a function, the \f5settrap()\fP
function will be called with the name of the discipline and
a pointer to the parse tree corresponding to the discipline
function.
The application determines when these functions are actually
executed.
By default, \f5ksh\fP defines \f5get\fP,
\f5set\fP, and \f5unset\fP as discipline functions.
.P
In addition, it is possible to provide a data area that
will be passed as an argument to
each of these functions whenever any of these functions is called.
To have private data, you need to define and allocate a structure
that looks like
.nf
.in .5i
\f5struct \fIyours\fP
{
	Namfun_t	fun;
	\fIyour_data_fields\fP;
};\fP
.in
.fi
.H 2 "The shell library"
There are several functions that are used by \f5ksh\fP itself
that can also be called from built-in commands.
The man pages for these routines are in the Appendix.
.P
The \f5sh_addbuiltin()\fP function can be used to add or delete
built-in commands. It takes the name of the built-in, the
address of the function that implements the built-in, and
a \f5void*\fP pointer that will be passed to this function
as the third argument whenever it is invoked.
If the function address is \f5NULL\fP, the specified built-in
will be deleted. However, special built-in functions cannot
be deleted or modified.
.P
The \f5sh_fmtq()\fP function takes a string and returns
a string that is quoted as necessary so that it can
be used as shell input.
This function is used to implement the \f5%q\fP option
of the shell built-in \f5printf\fP command.
.P
The \f5sh_parse()\fP function returns a parse tree corresponding
to a given file stream. The tree can be executed by supplying
it as the first argument to
the \f5sh_trap()\fP function and giving a value of \f51\fP as the
second argument.
Alternatively, the \f5sh_trap()\fP function can parse and execute
a string by passing the string as the first argument and giving \f50\fP
as the second argument.
.P
The \f5sh_isoption()\fP function can be used to see whether one
or more of the option settings is enabled.