Neutralizing data from Windows to Unix

2024-5-21

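I just noticed that files coming from another platform (Windows) are causing me a lot of errors, for example here, in my regular expressions. In most cases the lines end with ^M.

There seem to be many ways to do dos2unix, such as those described here. However, I am not sure whether that is enough to solve the ^M character problem in general, i.e. for files coming from any Windows system to Unix.

Example code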

while (my $file = readdir($DIR)) {
    ## Reset the counter
    my $c=0;
    ## Skip any files that aren't .tex
    next unless $file =~ /\.tex$/;

    ## Open the file
    open(my $fh, '<', "$path/$dir/$file") or die "Cannot open $path/$dir/$file: $!";
    ######### TODO  I think the replacement should be done here
    ######### Pseudocode : 's/\r\n/\n/' input.txt

    while (<$fh>) {...}

        s/\r\n\z//;  # TODO bug here:
                     # this line does not affect the file as a whole.
                     # I need to somehow apply the replacement to the file.
                     # It probably has to be done globally, since this seems to act only locally.
                     # What do you think?

which returns data like this on my UNIX system

\subsection{3}^MA 45-year-old man reports that over the past year he has occasionally vomited particles of food eaten several days earlier.

What is the difference between the following substitutions?

#1

s/\r\n\z//;

It does not make any visible replacement at all.

#2

s/\r\n\z//g;

It does not return the correct data. The data shown above, {3}^MA 4, can still be seen in the output.
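
For what it is worth, #1 and #2 should behave identically: `\z` matches only at the very end of the string, so the pattern can match in at most one place and `/g` has nothing extra to do. Both also change only `$_` in memory, never the file on disk, and neither touches a `\r` that is not directly followed by the final `\n`. A minimal sketch, using made-up sample strings rather than the real file, and with `strip_once` and `strip_global` as hypothetical helper names:

```perl
use strict;
use warnings;

# Attempts #1 and #2 from the question, wrapped in hypothetical helper subs.
sub strip_once   { my ($s) = @_; $s =~ s/\r\n\z//;  return $s }
sub strip_global { my ($s) = @_; $s =~ s/\r\n\z//g; return $s }

# Made-up sample lines, not taken from the real .tex files.
my @samples = (
    "\\subsection{3}\r\n",       # ordinary CRLF line ending
    "\\subsection{3}\rA 45\n",   # lone CR in the middle of a line
);

for my $line (@samples) {
    my ($once, $global) = (strip_once($line), strip_global($line));
    printf "identical: %s, CR still present: %s\n",
        ($once eq $global ? "yes" : "no"),
        ($once =~ /\r/    ? "yes" : "no");
}
```

If the second sample resembles the real data, that would be consistent with the `^M` surviving both attempts, since a lone `\r` in mid-line is never matched by `\r\n\z`.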

#3

s/\R//g;

It returns badly laid-out data, for example

\subsection{4}All the following are associated with an increased risk for gallstones, except:
\begin{question}{Why hemolysis can give gall stones?}Bilirubin accumulation.\end{question}
...
\begin{question}{Vomiting. Why?}Neuropathy of stomach. Changes in the nervous system of the intestinal system. Not using insulin all the time. % Ketone irritation possible. % ketoacidosis \end{question}

Everything is put on a single line, even the comments, so this one is not correct either.
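
The one-line result is what one would expect from `\R`: it matches any line break (`\r\n` as a unit, a lone `\n`, a lone `\r`, and a few rarer characters), so replacing it with nothing removes every newline as well. Replacing it with `\n` instead keeps the line structure. A minimal sketch, assuming Perl 5.10 or later and using `normalize_eol` as a hypothetical helper name with made-up input:

```perl
use strict;
use warnings;
use 5.010;    # \R needs Perl 5.10 or later

# Hypothetical helper: turn every kind of line break into a single "\n".
sub normalize_eol {
    my ($text) = @_;
    $text =~ s/\R/\n/g;
    return $text;
}

# Made-up input mixing CRLF, a lone CR and a plain LF.
my $mixed = "\\subsection{4}\r\nfirst\rsecond\nthird % comment\r\n";
print normalize_eol($mixed);
```

Because each line keeps its own `\n`, a LaTeX `%` comment still ends at the end of its line instead of swallowing the text that follows.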

What is the correct way to do win2unix inside a Perl script?

Answer 1

There is a good answer in the comments from 任思源. One way or another, use the command

tofrodos
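
If calling an external tool from the script is not an option, roughly the same conversion can be sketched in pure Perl: read the whole file in raw mode, rewrite the line endings, and write it back. This is only a sketch under assumptions: `convert_file` is a hypothetical name, the files are assumed small enough to hold in memory, and there is no backup of the original.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite one file so every CRLF or lone CR becomes LF.
sub convert_file {
    my ($path) = @_;

    # :raw keeps any I/O layer from translating line endings on its own.
    open(my $in, '<:raw', $path) or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };   # slurp the whole file
    close($in);

    $text =~ s/\r\n?/\n/g;               # CRLF or lone CR -> LF

    open(my $out, '>:raw', $path) or die "Cannot write $path: $!";
    print {$out} $text;
    close($out) or die "Cannot close $path: $!";
    return;
}

# Usage: perl script.pl file1.tex file2.tex ...
convert_file($_) for @ARGV;
```

Converting the files once, before the parsing loop runs, means the regular expressions in the rest of the script only ever see `\n` line endings.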
