Removing dates from a CSV file

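I have a long list of tweets, and I want to delete the tweets from before a certain date. I think I need to use sed, awk, or grep to remove them, but I'm not sure of the syntax. The date is in the second column, in the form "2017-9-18 XX:XX:XX", and I want to delete the tweets from before 2017-9-15.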

Thanks a lot, everyone!

Answer 1

You can use dategrep. From perldoc /usr/local/bin/dategrep:

NAME
    dategrep - print lines matching a date range

SYNOPSIS
      dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
      dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
      dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
      dategrep --last-minutes 5 --format rsyslog syslog
      cat syslog | dategrep --end "12:15"

DESCRIPTION
    Do you even remember how often in your life you needed to find lines in a
    log file falling in a date range? And how often you build brittle regexs
    in grep to match entries spanning over a hour change?

    dategrep hopes to solve this problem once and for all.

...

INSTALLATION
    It is possible to install this script via perl normal install routines.

      perl Makefile.PL && make && make install

    Or via CPAN:

      cpan App::dategrep

    You can also install one of the two prebuild versions, which already
    include all or some of dategrep's dependencies. Which to choose mainly
    depends on how hard it is for you to install Date::Manip. The small
    version is just 22.3KB big and includes all libraries except Date::Manip.
    The big one packs everything in a nice, neat package for you, but will
    cost you almost 10MB of disk space. Both are always included in the latest
    release <https://github.com/mdom/dategrep/releases/latest>.

    So, to install the big version you could just type:

      wget -O /usr/local/bin/dategrep https://github.com/mdom/dategrep/releases/download/v0.58/dategrep-standalone-big
      chmod +x /usr/local/bin/dategrep

    And for the small one (with the apt-get for Debian):

      apt-get install libdate-manip-perl
      wget -O /usr/local/bin/dategrep https://github.com/mdom/dategrep/releases/download/v0.58/dategrep-standalone-small
      chmod +x /usr/local/bin/dategrep

Answer 2

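Regular expressions make this easy. The first version prints lines dated 2017-9-01 or later, through the end of 2017, assuming months are not zero-padded.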

grep -E "2017-([9]|[0-1][0-9])" file > output_file
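To see what that pattern does and does not match, here is a quick check on a few made-up rows (the sample data and the `sample_rows` helper are hypothetical, not from the question). Note that the pattern is not tied to the second column, so a matching string anywhere on the line, including the tweet text, would also count.

```shell
# Made-up sample rows: id, timestamp, text
sample_rows() {
  printf '%s\n' \
    '1,2017-8-30 23:59:59,too early' \
    '2,2017-9-1 00:00:00,first of September' \
    '3,2017-10-12 08:15:00,October' \
    '4,2016-12-31 10:00:00,previous year'
}

# Same pattern as above, reading from stdin instead of a file.
# Rows 2 and 3 are printed; rows 1 and 4 are dropped.
sample_rows | grep -E "2017-([9]|[0-1][0-9])"
```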

The second example filters the output further to exclude dates before 2017-9-15. This only works if the day of the month is zero-padded.

grep -E "2017-([9]|[0-1][0-9])-([0-9]|[0-9][0-9])" file | grep -Ev "2017-9-(0[0-9]|1[0-4])" > output_file

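Each pair of square brackets matches a single digit. The | character means "or" in a regular expression. See Chapter 4, "Regular expressions", of the Bash Guide for Beginners for more details.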
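If the days are not zero-padded, one alternative is to compare the date numerically with awk instead of matching it as text. This is only a sketch under some assumptions: the fields are separated by commas, the date is the second field, and no earlier field contains a quoted comma. The function name `filter_since` is made up for illustration.

```shell
# Keep only rows whose date (second comma-separated field) is on or
# after a cutoff. Assumes simple CSV with no quoted commas before the date.
filter_since() {
  # $1 = cutoff as "YYYY-M-D", $2 = input file
  awk -F',' -v cutoff="$1" '
    function key(d,   p) {          # "2017-9-5 10:00:00" -> 20170905
      gsub(/"/, "", d)              # drop optional quotes around the field
      sub(/^[ \t]+/, "", d)         # drop leading whitespace
      sub(/ .*/, "", d)             # drop the time part
      split(d, p, "-")
      return p[1] * 10000 + p[2] * 100 + p[3]
    }
    BEGIN { min = key(cutoff) }
    key($2) >= min                  # print rows on or after the cutoff
  ' "$2"
}
```

For example, `filter_since 2017-9-15 file > output_file` would keep tweets from 2017-9-15 onward, whether or not the month and day are zero-padded. A header row does not parse as a date, so it would be dropped along with the older tweets.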
