I have an addresses.csv file in this format:
Street 1
Street 10
Street 100
Street 1000
Straße 1b
Straße1b
Street 1 B
Street, 1B
The Street 1B
The-Street 1B
The'Street 1B
The&Street 1B
The Str. 1B
Street 1-3
Street 1 - 3
Street 1A-3B
Street 1A -3 B
Super's Street-Str., 1 - 1000B
Is there a way to separate/extract all street names and street numbers?
output-name.csv:
Street
Street
Street
Street
Straße
Straße
Street
Street
The Street
The-Street
The'Street
The&Street
The Str.
Street
Street
Street
Street
Super's Street-Str.
output-number.csv:
1
10
100
1000
1b
1b
1 B
1B
1B
1B
1B
1B
1B
1-3
1 - 3
1A-3B
1A -3 B
1 - 1000B
I found a solution, which I would like to share here:
Answer 1
My solution is:
- Check whether the address format is valid:
if [[ ${var_street_and_number} =~ ^[[:alpha:][:space:]\.\'\&\-]+[,]?[[:space:]]?[0-9]{1,4}[[:space:]]?[a-zA-Z]?[[:space:]]?[-]?[[:space:]]?[0-9]{0,4}[[:space:]]?[a-zA-Z]?$ ]];
then
echo "Address format is valid :)";
else
echo 'Address format is invalid!';
fi;
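The variable var_street_and_number should hold a single line containing the street name + house number.
If you have a file with many streets and numbers (= many lines), you can use: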
while IFS= read -r line; do
if [[ ${line} =...
done < addresses.csv
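For illustration, the check and the loop above could be combined roughly as follows. This is only a sketch: the names address_re, is_valid_address and check_addresses are hypothetical, and the pattern is a variant of the one above, stored in a variable so that no backslash escapes are needed inside the bracket expression. Matching "ß" with [[:alpha:]] assumes a UTF-8 locale.

```bash
#!/usr/bin/env bash

# Variant of the pattern above (assumption: '.', "'", '&' and '-' are meant
# literally). Keeping it in a variable avoids shell quoting issues with =~.
address_re="^[[:alpha:][:space:].'&-]+,?[[:space:]]?[0-9]{1,4}[[:space:]]?[a-zA-Z]?[[:space:]]?-?[[:space:]]?[0-9]{0,4}[[:space:]]?[a-zA-Z]?$"

# Succeeds (exit status 0) if the single line in $1 matches the pattern.
is_valid_address() {
  [[ $1 =~ $address_re ]]
}

# Reads the file given as $1 line by line and reports each line's status.
# The "|| [[ -n $line ]]" part also handles a last line without a newline.
check_addresses() {
  local line
  while IFS= read -r line || [[ -n $line ]]; do
    if is_valid_address "$line"; then
      printf 'valid:   %s\n' "$line"
    else
      printf 'invalid: %s\n' "$line"
    fi
  done < "$1"
}
```

It would be called as check_addresses addresses.csv. Note that -r keeps read from interpreting backslashes and IFS= preserves leading and trailing whitespace.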
- If the address format is valid, you can use sed to split each line:
sed 's/[,]\{0,1\}[[:space:]]\{0,1\}[[:digit:]].*$//' addresses.csv > output-name.csv
sed 's/^[^[:digit:]]*//' addresses.csv > output-number.csv
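To try the two sed expressions without touching any files, they can be wrapped in small filters and fed a few of the sample lines from the question. The function names street_name and street_number are hypothetical; the sed expressions themselves are unchanged.

```bash
#!/usr/bin/env bash

# Remove an optional comma, an optional whitespace character, and everything
# from the first digit onward, leaving only the street name.
street_name() {
  sed 's/[,]\{0,1\}[[:space:]]\{0,1\}[[:digit:]].*$//'
}

# Remove everything before the first digit, leaving only the house number.
street_number() {
  sed 's/^[^[:digit:]]*//'
}

# A few sample lines taken from the question.
samples='Street 1
Street, 1B
The-Street 1B
Street 1A -3 B'

printf '%s\n' "$samples" | street_name
printf '%s\n' "$samples" | street_number
```

For these four lines the first command prints Street, Street, The-Street, Street and the second prints 1, 1B, 1B, 1A -3 B. Both expressions key on the first digit in the line, so a street name that itself contains a digit would be cut too early; this is why the format check comes first.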