Saturday, November 26, 2011

How to stop cygwin sed from changing windows CRLF to LF

This question got asked a lot on the net, largely in cygwin forums, and the answers were usually dismissive and snarky.

The CRLF conversion is essentially the defined behavior for cygwin sed.

There are "text mount" settings in cygwin that change the default handling of various file types. These are apparently set when mounting a disk, which is way overkill for fixing one sed command.

There is a trick with sed where if you feed it a windows-like file name, it will not do the default behavior. It's supposed to detect either a colon or a backslash in the file name to do this. In practice I did not find this straightforward to get working.

I experimented for a while with ways to get find or grep to return a windows-like filename. There are ways.

However, it turns out that all of that was a big detour; cygwin sed has a -b option that treats the file as "binary", so sed stops mashing the CRLF and just writes each line as is.
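
Here is a minimal sketch of the -b option in use, assuming GNU sed (which is what cygwin ships). The file names and the demo_dir variable are made up for the example. It builds a small CRLF file, runs a substitution with -b, and dumps the result so the \r \n pairs can be inspected.

```shell
# Scratch directory for the demo (hypothetical names throughout).
demo_dir=$(mktemp -d)

# Create a small sample file with Windows (CRLF) line endings.
printf 'alpha\r\nbeta\r\n' > "$demo_dir/sample.txt"

# -b (--binary) asks GNU sed to open the file in binary mode, so the CR
# in front of each LF is passed through instead of being dropped.
sed -b 's/alpha/gamma/' "$demo_dir/sample.txt" > "$demo_dir/out.txt"

# Dump the bytes; each line should still end in \r \n.
od -c "$demo_dir/out.txt"
```

One thing to watch for: with -b the CR is part of the line that sed sees, so a pattern anchored with $ (for example s/alpha$/gamma/) will not match unless it accounts for the trailing CR.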

It was at about this point that I finally began to understand the -i option, in which files are modified "in place".
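
The two options can be combined. The following is a rough sketch, again assuming GNU sed and using made-up file names: -i.bak rewrites the file in place and keeps the original as a .bak copy, while -b leaves the CRLF endings alone.

```shell
# Scratch directory and file for the demo (hypothetical names).
work_dir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$work_dir/notes.txt"

# -i.bak edits notes.txt in place and saves the original as notes.txt.bak;
# -b keeps the CR at the end of each line.
sed -b -i.bak 's/one/uno/' "$work_dir/notes.txt"

# Dump the edited file to check that the \r \n endings survived.
od -c "$work_dir/notes.txt"
```

Leaving off the suffix (plain -i) skips the backup, which is fine once you trust the command.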