:::::::::::::: ::::::::::::::::::::::::::::::
Last Modified: 2001-07-29 :
Author : Laura Fairhead :
:::::::::::::: ::::::::::::::::::::::::::::::
How do I get a single line text file into an environment variable?
------------------------------------------------------------------
A common problem in DOS, and one whose solution is not obvious,
is how to read a line of text from a file into a
DOS environment variable.
For example, to get the current directory one might start with;
CD>FILE.TXT
If only you could do something like;
SET VAR=readfile("FILE.TXT")
But DOS has no such abilities.
Presented here are a number of the known ways to perform this function.
The same naming convention is used throughout;
FILE.TXT = the single line text file to read
VAR = the name of the variable to get the line into
Auxiliary file solution
~~~~~~~~~~~~~~~~~~~~~~~
This is the standard book solution.
(i) At the command prompt enter "COPY CON FILE1.BAT", type the
text "SET VAR=" and terminate the line by pressing CTRL+Z first,
then ENTER. This creates a file that contains the text "SET VAR="
with no newline on the end.
(ii) Use the following code in your batch file;
======================================================================
COPY FILE1.BAT FILE2.BAT
TYPE FILE.TXT >>FILE2.BAT
CALL FILE2.BAT
DEL FILE2.BAT
======================================================================
After this, VAR will be set to the contents of FILE.TXT.
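Why the missing newline in step (i) matters can be shown with a small model. The following is only a sketch in Python, not DOS code; the function name build_batch and the example path C:\DOS are assumptions made for illustration. It concatenates the stub and the text file in memory and lists the lines that would end up in FILE2.BAT.

```python
# Toy model of the auxiliary-file method using in-memory byte strings.
# build_batch and the sample path are illustrative assumptions.

def build_batch(stub: bytes, file_txt: bytes) -> list:
    """Concatenate stub + file contents and return the resulting batch lines."""
    combined = stub + file_txt
    return combined.decode("ascii").splitlines()

# What CD>FILE.TXT might produce (example path only).
file_txt = b"C:\\DOS\r\n"

# Stub made with COPY CON and CTRL+Z: no newline after "SET VAR=".
good = build_batch(b"SET VAR=", file_txt)

# Stub made with ECHO: a newline follows "SET VAR=".
bad = build_batch(b"SET VAR=\r\n", file_txt)

print(good)  # ['SET VAR=C:\\DOS']
print(bad)   # ['SET VAR=', 'C:\\DOS']
```

With the newline present the batch file holds two separate lines, so the SET receives no value and the text from FILE.TXT would be run as a command of its own.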
The problem with this method is that it requires an auxiliary file that
goes together with the main batch program. The above lines work
fine if they are run from the directory that holds the auxiliary file,
but fail from anywhere else. It is desirable for the auxiliary file
to be stored alongside the batch file in its directory, and
there are 2 possibilities for solving this problem;
(A) "hard-code" the location of FILE1.BAT. So if the batch file
and FILE1.BAT are put into C:\UTILS you would use the following lines
======================================================================
REM the path to FILE1.BAT is hard-coded in the next line
COPY C:\UTILS\FILE1.BAT FILE2.BAT
TYPE FILE.TXT >>FILE2.BAT
CALL FILE2.BAT
DEL FILE2.BAT
======================================================================
Hard-coding paths to files is generally considered extremely bad programming
practice, but this may be sufficient in certain contexts (for example
if the batch file is only ever run from a floppy disk).
(B) Locate the directory of the batch file & FILE1.BAT using code.
This is the usual solution for professional software. However,
doing this using only batch language is extremely awkward and
requires around a dozen or so lines of code.
DEBUG script edit solution
~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem with the "auxiliary file" algorithm could easily be
solved by getting the batch file to create FILE1.BAT itself.
There is a problem with this, however;
ECHO SET VAR=>FILE1.BAT
This creates a file with the text "SET VAR=" and a newline appended. There
is no way to convince the ECHO command not to append the
newline; it does this with everything that it outputs.
Here is where DEBUG can help out;
======================================================================
ECHO E100"SET VAR=">$
FOR %%_ IN (RCX 8 NFILE1.BAT W Q) DO ECHO %%_>>$
DEBUG <$ >NUL
======================================================================
The first 2 lines of the above create a DEBUG script file "$";
======================================================================
E100"SET VAR="
RCX
8
NFILE1.BAT
W
Q
======================================================================
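When fed into DEBUG it creates an 8-byte file consisting of the text
"SET VAR=" with no newline on the end: E enters the text at offset 100,
RCX sets the number of bytes to write (8), N names the file, W writes
it and Q quits. So having created FILE1.BAT
(in the current directory) the program can go on to use it to
complete the algorithm;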
======================================================================
ECHO E100"SET VAR=">$ (1)
FOR %%_ IN (RCX 8 NFILE1.BAT W Q) DO ECHO %%_>>$ (2)
DEBUG <$ >NUL (3)
DEL $ (4)
TYPE FILE.TXT>>FILE1.BAT (5)
CALL FILE1.BAT (6)
DEL FILE1.BAT (7)
======================================================================
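The value given to RCX is the byte count of the text, and DEBUG reads all numbers as hexadecimal, so it has to be recomputed if the variable name changes. The following Python sketch generates the script lines for an arbitrary name. The helper name debug_script and the variable name CURRENTDIR are assumptions made for illustration, not part of the method above.

```python
# Sketch: generate the DEBUG script that writes "SET <name>=" with no newline.
# debug_script is a hypothetical helper; DEBUG itself is not invoked here.

def debug_script(var_name: str, out_file: str = "FILE1.BAT") -> list:
    text = "SET " + var_name + "="
    # The text is wrapped in double quotes for the E command, so it must
    # not contain one itself.
    assert '"' not in text
    return [
        'E100"' + text + '"',        # enter the text at offset 100
        "RCX",                       # select the CX register...
        format(len(text), "X"),      # ...and set it to the length, in hex
        "N" + out_file,              # name the file to write
        "W",                         # write CX bytes starting at offset 100
        "Q",                         # quit DEBUG
    ]

for line in debug_script("VAR"):
    print(line)
```

For VAR this reproduces the six-line script shown earlier. For a longer name such as CURRENTDIR the text is 15 bytes long, so the RCX value becomes F rather than 15.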
Many variations on this are possible and common. The creation
of the file can actually be done much more compactly;
======================================================================
ECHO EXIT|%COMSPEC%/KPROMPT E100'SET VAR='$_RCX$_8$_NFILE1.BAT$_W$_Q|DEBUG>NUL
======================================================================
This replaces lines 1-4 in the above. The temporary file is now implicit
(handled by DOS pipes) so it doesn't require deletion. The first two stages
in the pipeline produce the following output;
======================================================================
E100'SET VAR='
RCX
8
NFILE1.BAT
W
QEXIT
======================================================================
The PROMPT command has been utilized here in order to output newlines
directly. With the COMMAND /K switch the argument is executed
as a shell command, and COMMAND then continues by executing commands
read from standard input (here the single line EXIT).
There is no need for an additional '$_' after the 'Q' because DEBUG
only looks at the 'Q' itself and ignores any trailing garbage on the
line.
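How the prompt text turns into the script listed above can be modelled in a few lines. This Python sketch is a toy model only: it expands the '$_' code into a newline and appends the echoed EXIT, and it makes no attempt to reproduce anything else COMMAND does. The function name expand_prompt is an assumption made for illustration.

```python
# Toy model: expand the PROMPT "$_" code and append the echoed EXIT line.
# expand_prompt is a hypothetical helper, not a model of COMMAND.COM.

def expand_prompt(prompt_text: str) -> str:
    """Replace each $_ (PROMPT's newline code) with a line break."""
    return prompt_text.replace("$_", "\n")

prompt_text = "E100'SET VAR='$_RCX$_8$_NFILE1.BAT$_W$_Q"

# The prompt is displayed, then the EXIT read from standard input
# appears directly after it on the same line.
piped = expand_prompt(prompt_text) + "EXIT"
lines = piped.split("\n")

for line in lines:
    print(line)
```

The last line comes out as QEXIT, which is why the behaviour of DEBUG's Q command described above matters.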
SED solution
~~~~~~~~~~~~
SED is an industry standard UNIX utility and a very good, compact,
freeware port for DOS exists;
ftp://ftp.simtel.net/pub/simtelnet/msdos/txtutl/sed15x.zip
SED can solve many tasks standing on its head that plain DOS finds either
impossible or _extremely_ awkward. This is one of them;
======================================================================
SED "s/^/SET VAR=/" FILE.TXT >FILE1.BAT
CALL FILE1.BAT
DEL FILE1.BAT
======================================================================
ASCII code solution
~~~~~~~~~~~~~~~~~~~
ASCII code is machine code that can be represented by printable ASCII
characters. Often this is very carefully crafted by hand so that the
program does not contain any binary values which are unprintable ASCII
codes (less than hex20), shell special characters (<, >, |...)
or possibly untransportable 8-bit characters (greater than hex7F).
The result is a machine code .COM program which can be created simply
using ECHO (program ) >CODE.COM. This is very flexible and transportable
code and reduces file access overhead;
=====================================================================
ECHO XPYP[*'CC-\1P\QX,=P,APZ5O!PQ2O~5aaI~}Ksx>_.COM
=====================================================================
This is an ASCII code program that, when run, simply echoes text
but with no newline on the end.
So;
_ SET VAR=>FILE1.BAT
then creates the desired base FILE1.BAT.
Of course having created the _.COM echo program it can be used again
and again throughout the batch, only being deleted on exit. It is
obviously useful for a great deal more than this one particular
task.
The full atomic operation is now performed by the
lines;
=====================================================================
ECHO XPYP[*'CC-\1P\QX,=P,APZ5O!PQ2O~5aaI~}Ksx>_.COM
_.COM SET VAR=>FILE1.BAT
TYPE FILE.TXT>>FILE1.BAT
CALL FILE1.BAT
DEL FILE1.BAT
DEL _.COM
=====================================================================
Here is a full disassembly of the machine code generated
with commenting;
=====================================================================
;on entry to .COM program:
; SP=FFFE, a 0 word is on the stack
; DS=ES=SS=CS
; CS:0->PSP
;
; contrary to some information out there AX is not always 0
; this is the 4th revision of this code, the 3rd assumed AX=0000
;
; Since all the segment registers are the same and remain the
; same throughout they are not mentioned any further
;AX=0000 CX=0000 BX=0000
1777:0100 58 X POP AX
1777:0101 50 P PUSH AX
1777:0102 59 Y POP CX
1777:0103 50 P PUSH AX
1777:0104 5B [ POP BX
;[BX]=[0]=BYTE AT PSP OFFSET 0
;the word at PSP offset 0 is an INT 20 instruction (CD 20)
;before a COM program is entered 0 is pushed to the stack
;this means that a RET instruction will 'JMP 0', execute
;the INT 20 and consequently return to DOS
;however here only the byte value 'CD' is used
;and AH=00-CD =33
1777:0105 2A 27 *' SUB AH,[BX]
;BX=0002
1777:0107 43 C INC BX
1777:0108 43 C INC BX
;AX=01A4
1777:0109 2D 5C 31 -\1 SUB AX,315C
;SP=01A4
1777:010C 50 P PUSH AX
1777:010D 5C \ POP SP
;AX=0000
1777:010E 51 Q PUSH CX
1777:010F 58 X POP AX
;AX=00C3
1777:0110 2C 3D ,= SUB AL,3D
;[01A2]=C3 00
1777:0112 50 P PUSH AX
;AX=0082
1777:0113 2C 41 ,A SUB AL,41
;DX=0082
1777:0115 50 P PUSH AX
1777:0116 5A Z POP DX
;AX=21CD
1777:0117 35 4F 21 5O! XOR AX,214F
;[01A0]=CD 21 C3 00
1777:011A 50 P PUSH AX
;
; now at: 01A0: CD 21 INT 21
; 01A2: C3 RET
; 01A3: 00
;
;stack pointer is just below it at 01A0 (predecrementing).
;This is adjusted deliberately to be as far away from the
;program as possible (for stack space) but still being
;able to reach this _generated_ code with a relative jump. *[1]
;
;and now a 0 word is pushed. This is so that when the 'RET'
;at 01A2 is encountered the program will 'JMP 0' and hence
;INT 20 and return to DOS
;get 0 word ready for next POP (the RET)
1777:011B 51 Q PUSH CX
;CL=00 XOR [0080] = [0080]
; The number of characters in the command line tail (in the PSP)
1777:011C 32 4F 7E 2O~ XOR CL,[BX+007E]
;AX=40xx (21CD XOR 6161=40xx)
; only the value in AH is important
1777:011F 35 61 61 5aa XOR AX,6161
; this has been preparing for a DOS call;
;
; AH=40 write handle function
; BX=0001 DOS stdout
; CX=#chars chars to write
; DX=0082 start of text to output
;
;DX is set to 0082 instead of 0081 (the actual start of the command line
;tail data in the PSP) because DOS stores the command tail
;from/including the character delimiting the command. So if you started
;at 0081 the result of 'ECHOCOMMAND hello' would be the text ' hello'.
;Just incrementing to 0082 doesn't solve the problem completely: CX needs
;to be decremented so that 1 less character is output.
1777:0122 49 I DEC CX
;Now if CX has decremented to -1 (from 0) we must skip the call. JNG
;will jump if ZF=1 OR SF<>OF. The overflow flag must be clear due to
;the limited range of CX, so that evaluates to ZF=1 OR SF=1. So it
;skips the output call if there are no characters to output as well.
1777:0123 7E 7D ~} JNG 01A2
;BX=0001
1777:0125 4B K DEC BX
;CF=0 after the XOR AX,6161 by definition and the intermediate instructions
;do not change it so this is simply a 'JMP'. Of course JMP = 0EBh
;which is outside the range of printable ASCII characters (020h to 07Eh)
;that are allowed in the program.
1777:0126 73 78 sx JNC 01A0
---------------------------------------------------------------------
*[1]:
Certain instructions that _need_ to be executed cannot be
represented with the ASCII codes 020h - 07Eh, so those
instructions have to be generated somehow.
I had seen other ASCII machine code programs that did this
by modifying later parts of themselves, so that when they fell
through to those stages the instructions were there. There is
a serious problem with self-modifying code like this on 486 and
earlier processors: the prefetch queue is loaded ahead of code
execution (on the 486 I think it may be as much as 32 bytes) and
any modification to code that falls in that range will only
affect memory; the preloaded (unmodified) instructions will
still execute. On the Pentium the processor logic was changed
so that a write in the range of the prefetch queue causes
it to be reloaded from memory immediately. On 486 and earlier
you need some sort of JMP instruction to force the queue to
be reloaded first. Try making a program this small with a jump
instruction to another part of the program.....
=====================================================================
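The register values annotated in the disassembly can be re-derived mechanically. The following Python sketch models only the 8-bit and 16-bit arithmetic and the short-jump displacements; it is not an emulator and does not run the program. It assumes, as the comments in the listing state, that the byte at PSP offset 0 is CD. The names used (short_jump_target and so on) are illustrative.

```python
# Re-derive the values annotated in the disassembly. Only the 8/16-bit
# arithmetic is modelled; this is not an emulator.
CODE = r"XPYP[*'CC-\1P\QX,=P,APZ5O!PQ2O~5aaI~}Ksx"

# Every byte must be printable ASCII (20h..7Eh) for ECHO to emit it.
assert all(0x20 <= ord(c) <= 0x7E for c in CODE)

PSP_BYTE_0 = 0xCD                        # first byte of the INT 20 at PSP:0000

# AX starts as 0000 because it is popped from the 0 word on the stack.
ah = (0x00 - PSP_BYTE_0) & 0xFF          # SUB AH,[BX]
sp = ((ah << 8) - 0x315C) & 0xFFFF       # SUB AX,315C / PUSH AX / POP SP
ret_word = (0x00 - 0x3D) & 0xFF          # SUB AL,3D   -> C3, the RET opcode
dx = (ret_word - 0x41) & 0xFF            # SUB AL,41   -> offset of the text
int21_word = dx ^ 0x214F                 # XOR AX,214F -> stored as CD 21
ah_func = (int21_word ^ 0x6161) >> 8     # XOR AX,6161 -> AH, the DOS function

def short_jump_target(addr: int, disp: int) -> int:
    """Target of a 2-byte short jump at addr with displacement byte disp."""
    if disp >= 0x80:                     # displacement is a signed byte
        disp -= 0x100
    return (addr + 2 + disp) & 0xFFFF

print(f"AH={ah:02X} SP={sp:04X} DX={dx:04X}")          # AH=33 SP=01A4 DX=0082
print(f"INT 21 word={int21_word:04X} AH={ah_func:02X}")  # INT 21 word=21CD AH=40
print(f"JNG -> {short_jump_target(0x0123, 0x7D):04X}")   # JNG -> 01A2
print(f"JNC -> {short_jump_target(0x0126, 0x78):04X}")   # JNC -> 01A0
```

The 40 characters of the string occupy offsets 0100 to 0127, which matches the last instruction in the listing ending at 0127.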