Actually, I only need the UNIX-to-DOS direction, that is, '\n' -> '\r\n'.
With sed, the following does the job:
sed -e "s/$/\r/" myunix.txt > mydos.txt
What I want to ask now, though, is whether there is a simpler way, preferably one that needs no third-party tools.
I already understand the principle. The following is excerpted from: http://hi.baidu.com/dbsuit/blog/item/511eb409f9abe7266b60fba8.html
Conversion between Windows/DOS and Unix file formats
2007-06-21 13:25
Windows/DOS and Unix text file formats differ.
First, a few symbols:
*******************
0A  LF  ^J  Line feed (newline)
0D  CR  ^M  Carriage return
*******************
DOS/Windows text files use both CR (carriage return, \r) and LF (line feed, \n),
so every line in the file ends with '\r\n'.
Unix text files use only the line feed, so every line ends with '\n'.
This is why C programs edited under Windows can produce warnings when compiled under Unix, such as "no newline at end of file" (the exact wording depends on the compiler).
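To check which convention a given piece of text actually uses, one could simply count the endings. A minimal sketch in C, assuming the file contents are already in a memory buffer; `count_eols` and `struct eol_counts` are names made up for this illustration and are not from the quoted article:

```c
#include <stddef.h>

/* Hypothetical helper type: tallies of each line-ending style found. */
struct eol_counts {
    size_t crlf;    /* "\r\n" pairs (DOS/Windows) */
    size_t lf;      /* lone '\n' (Unix) */
    size_t cr;      /* lone '\r', not followed by '\n' */
};

/* Scan len bytes of buf and count each kind of line ending. */
struct eol_counts count_eols(const char *buf, size_t len)
{
    struct eol_counts c = {0, 0, 0};
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r') {
            if (i + 1 < len && buf[i + 1] == '\n') {
                c.crlf++;
                i++;        /* skip the LF that belongs to this pair */
            } else {
                c.cr++;
            }
        } else if (buf[i] == '\n') {
            c.lf++;
        }
    }
    return c;
}
```

If the bytes come from a file on DOS/Windows, it should be opened in binary mode ("rb"), because text mode there already translates '\r\n' to '\n' while reading.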
Conversion between the two file formats
Unix -> Dos
'\n' -> '\r\n'
//////////////////////////////////////////
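int ch;  /* int rather than char, so that EOF can be told apart from data */
while ( (ch = fgetc(in)) != EOF )
{
    if ( ch == '\n' )
        putchar('\r');  /* insert a CR before every LF */
    putchar(ch);
}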
//////////////////////////////////////////
All it takes is inserting a '\r' before every '\n' found in the Unix file.
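The same idea can be sketched on a memory buffer. This variant additionally assumes that a '\n' already preceded by '\r' should be left alone, so running it on text that is already in DOS format does not produce '\r\r\n'; that check is an addition of this sketch, not part of the quoted article. `unix_to_dos` is a hypothetical name:

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy len bytes from src to dst, writing "\r\n" for
 * every '\n' that is not already preceded by '\r'.
 * dst must have room for 2 * len bytes (worst case: every byte is '\n').
 * Returns the number of bytes written to dst.
 */
size_t unix_to_dos(const char *src, size_t len, char *dst)
{
    size_t out = 0;
    char prev = '\0';
    for (size_t i = 0; i < len; i++) {
        char ch = src[i];
        if (ch == '\n' && prev != '\r')
            dst[out++] = '\r';  /* bare LF: put a CR in front of it */
        dst[out++] = ch;
        prev = ch;
    }
    return out;
}
```

When the result is written to a file on DOS/Windows, the output stream should be opened in binary mode ("wb"); in text mode the C library there expands '\n' to '\r\n' on its own, which would double the CR.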
DOS -> Unix
'\r\n' -> '\n'
Going from DOS to Unix is a bit more complicated: you can't simply strip every '\r' read from the file,
because a text line in a DOS file sometimes has a bare carriage return embedded in it; this comes up with impact printers.
So before converting, check whether the '\r' is immediately followed by '\n'.
If it is, drop the '\r'.
If it is not, keep the '\r'.
//////////////////////////////////////////
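int ch;
int cr_flag = 0; /* No CR encountered yet */
while ( (ch = fgetc(in)) != EOF )
{
    if ( cr_flag && ch != '\n' ) {
        /* The previous CR did not precede an LF: emit it */
        putchar('\r');
    }
    /* Hold back a CR until the next character is known */
    if ( !(cr_flag = (ch == '\r')) )
        putchar(ch);
}
if ( cr_flag )
    putchar('\r'); /* the input ended with a lone CR */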
//////////////////////////////////////////
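The rule described above (drop a '\r' only when a '\n' follows it) can also be applied to a buffer in place, since the output never grows. A minimal sketch under that assumption; `dos_to_unix` is a hypothetical name, not something from the quoted article:

```c
#include <stddef.h>

/*
 * Hypothetical helper: rewrite buf in place, dropping each '\r' that is
 * immediately followed by '\n' and keeping every other byte, including
 * a lone '\r'. Returns the new length.
 */
size_t dos_to_unix(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;           /* CR of a CR LF pair: skip it */
        buf[out++] = buf[i];    /* out <= i, so unread input is never overwritten */
    }
    return out;
}
```

Looking one byte ahead replaces the cr_flag bookkeeping, which is only possible here because the whole text is in memory; a stream-based loop has to hold the CR back until the next character arrives.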
[ Last edited by honghunter on 2008-1-10 at 02:15 PM ]