我有一个由管道分隔的数据文件(请参阅下面的三行示例)。
我想要做的就是“删除”字段并将它们放入变量中。它们具有预先定义的长度。我想将文件转换为 SQL 脚本,如下所示:
输入:
| 416|CAWNBORE LIMITED |CAWNBORE | 8| 0| 0|00.00 | 0| 0|********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* | 0| 0| 0| 0| 0|
| 431|MAIN HOLDINGS LIMITED |MAINHOLDINGSCHA | 8| 0| 0|00.00 | 0| 0|********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* |********NO ADDRESS DETAILS******* | 0| 19650509| 0| 0| 0|
| 432|DUBLIN NORTH CITY MILLING COMPANY LIMITED |DUBLINNORTHCITY | 8| 0| 1|00.00 | 18750125| 19830124|113 PHIBSBORO ROAD |DUBLIN | | | 216410| 19901106| 0| 20030124| 0|
基本上,剥离管道 - 用逗号分隔数据并插入 SQL 字符串 INSERT INTO .... VALUES( data_in_here comma split );
期望的输出:
INSERT INTO tbcrocompany (id_company, nm_company, id_keyword, cd_status, dt_company_status, cd_type, cd_principle_obj, dt_register, dt_last_ar, ad_line_1, ad_line_2, ad_line_3, ad_line_4, cd_town_number, dt_dissolved, dt_bond_expiry, dt_next_ar, dt_last_accounts) VALUES (416,'CAWNBORE LI|MITED','CAWNBORE',8,0,0, '00.00', 0, 0, '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', 0, 0, 0, 0, 0);
INSERT INTO tbcrocompany (id_company, nm_company, id_keyword, cd_status, dt_company_status, cd_type, cd_principle_obj, dt_register, dt_last_ar, ad_line_1, ad_line_2, ad_line_3, ad_line_4, cd_town_number, dt_dissolved, dt_bond_expiry, dt_next_ar, dt_last_accounts) VALUES (431,'MAIN HOLDIN|GS LIMITED','MAINHOLDINGSCHA',8,0,0, '00.00', 0, 0, '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', '********NO ADDRESS DETAILS*******', 0, 19650509, 0, 0, 0);
INSERT INTO tbcrocompany (id_company, nm_company, id_keyword, cd_status, dt_company_status, cd_type, cd_principle_obj, dt_register, dt_last_ar, ad_line_1, ad_line_2, ad_line_3, ad_line_4, cd_town_number, dt_dissolved, dt_bond_expiry, dt_next_ar, dt_last_accounts) VALUES (432,'DUBLIN NORTH CITY MILLING COMPANY LIMITED','DUBLINNORTHCITY',8,0,1, '00.00', 18750125, 19830124, '113 PHIBSBORO ROAD', 'DUBLIN', '', '', 216410, 19901106, 0, 20030124, 0);
答案1
假设您的数据位于文件中data
,这应该适用于合理的输入(例如,您的数据中没有|
换行符):
sed -e 's/^ *| *//' -e 's/ *$//' -e 's/ *| */|/g' data |
while IFS='|' read -r f1 f2 f3 f4 f5 f6
do
# INSERT INTO mt (F1, F2, F3, F4, F5, F6) VALUES ( 16524,01,'10/17/2012','3930621977','XXNPUES ', 'S1');
echo "INSERT INTO mt (F1, F2, F3, F4, F5, F6) VALUES ($f1,$f2,'$f3','$f4','$f5','$f6');"
done
第一行删除数据输入中的前导和尾随空格,并丢弃初始的|
,从而| 416|CABlah |Somewhere else |
变为416|CABlah|Somewhere else|
。如果表达式不适合您的用例,则可以修改或完全删除表达式。
答案2
这是一个perl
可以完成这项工作的脚本。只需更改$tablename
字符串以及@fields
和数组即可轻松修改它以适用于其他表@types
。
如果输入文件的第一行包含字段名称,则此脚本会显着缩短(这在许多 CSV 文件中很常见)。该@types
数组仍然需要手动创建,否则脚本无法知道哪些字段是字符串,哪些字段是数字。 OTOH,如果输入文件有这样的第一行,则可以编写脚本以使用Text::CSV
或DBD::CSV
模块。
它从命令行上指定的任何/所有文件名和/或从标准输入获取输入。
#! /usr/bin/perl
use strict;
my $tablename = 'tbcrocompany';
# field names array. This will be used later to construct
# the %fields hash, and to print the field names in the
# INSERT statement.
my @fields = qw(id_company nm_company id_keyword cd_status
dt_company_status cd_type cd_principle_obj dt_register
dt_last_ar ad_line_1 ad_line_2 ad_line_3 ad_line_4
cd_town_number dt_dissolved dt_bond_expiry dt_next_ar
dt_last_accounts);
# fields can be string (s) or number (n)
my @types = qw(n s s n n n n n n s s s s n n n n n);
# initialise the field types hash.
my %types=();
my $fieldnum=0;
foreach my $f (@fields) {
$types{$f} = $types[$fieldnum++];
}
while (<>) {
my %fields=();
# remove leading '|' from line
s/^\|//;
# split each input line by '|'
my @row = split(/\|/);
# add each element of @row to the appropriate record in the %fields hash
my $fieldnum=0;
foreach my $f (@fields) {
$fields{$f} = $row[$fieldnum++];
# strip leading and trailing spaces from field value
$fields{$f} =~ s/^\s*|\s*$//g;
}
# construct the VALUES( ... ) section of the output line,
# with quotes for string values but not for numbers.
my @values = ();
foreach my $f (@fields) {
if ($types{$f} eq 's') {
push @values, "'$fields{$f}'"
} else {
push @values, $fields{$f}
};
}
# and, finally, output the line.
print "INSERT INTO $tablename (", join(', ',@fields),
") VALUES (", join(", ",@values), ";\n";
}