I need to generate billions of records in a text file so that I can load them into a table.
My target table definition is:
CREATE TABLE txnrecords12(
txnno int,
txndate string,
custno int,
amount double,
category string,
product string,
city string,
state string,
spendby string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
To generate the data, my code is:
############!/bin/sh
####### Create file dynamically
if [ ! -d hero_work ]
then
mkdir hero_work
fi
TEMPDIR=$HOME/hero_work
cd $TEMPDIR
touch $TEMPDIR/big_data_file_$$
echo $"string1\n",printf "string1\n",printf "string1\n">>big_data_file_$$
####################################
### Create data file dynamically####
####################################
if [ ! -d hero_work ]
then
mkdir hero_work
fi
TEMPDIR=$HOME/hero_work
cd $TEMPDIR
touch $TEMPDIR/big_data_file_$$
n1='
'
state_factor=$1
category_factor=$2
city_factor=$3
product_factor=$4
hiphen="-"
comma=","
### Write Table Columns Below
Col1="Txnno"
Col2="Txndate"
Col3="Custno"
Col4="Amount"
Col5="Category"
Col6="Product"
Col7="City"
Col8="State"
Col9="Spend_by"
####### Randomvariable declarations
rand1_Date_d="01"
rand2_Date_m="01"
rand5_Date_year="1999"
rand3_Transaction="0014"
rand4_cust_no="01155"
rand6_amount_no="0000"
######Column related variable declaration
var1_col1_txnno=0
var2_col2_txndate=0
var3_col3_custno=0
var4_col4_amount=0
var5_col5_category=0
var6_col6_product=0
var7_col7_city=0
var8_col8_state=0
var9_col9_spendby=0
write_value=${var1_col1_txnno}${comma}${var2_col2_txndate}${comma}${var3_col3_custno}${comma}${var4_col4_amount}${comma}${var5_col5_category}${comma}${var6_col6_product}${comma}${var7_col7_city}${comma}${var8_col8_state}${comma}${var9_col9_spendby}
Column_list=$Col1${comma}Col2${comma}$Col3${comma}$Col4${comma}$Col5${comma}$Col6${comma}$Col7${comma}$Col8${comma}$Col9
echo "$Column_list">big_data_file_$$
#####
####### Array of States
State[0]="UP"
State[1]="MP"
State[2]="Punjab"
State[3]="Delhi"
State[4]="WB"
### Array of Cities
City[0]="ABC"
City[1]="BCD"
City[2]="KJL"
City[3]="CGL"
City[4]="PPL"
#### Array of Products
Product[0]="ICECREAM"
Product[1]="Wheat"
Product[2]="CLOTHES"
Produt[3]="Laptop"
Product[4]="Bags"
Product[5]="Books"
#### Array of Categories
Category[0]="Foods"
Category[1]="Wearings"
Category[2]="Electronics"
###########3 Loop variables were initialized below
var_state_loop=0
var_city_loop=0
var_category_loop=0
var_product_loop=0
while (( var_state_loop -le $state_factor ))
do
if[ $var_state_loop -le 4 ]
then
echo "State loop part starts here.."
$var8_col8_state=${State[$var_state_loop]}
else
echo "State loop part ends here.."
while((var_city_loop -le ${city_factor} ))
do
echo "City Loop starts here"
if[ $var_city_loop -le 4 ]
then
$var7_col7_city=${City[$var_city_loop]}
else
echo "City Loop ends here"
while((var_category_loop -le ${category_factor}))
do
echo "Category loop started from here"
if[ $var_category_loop -le 3 ]
then
$var5_col5_category=${Category[$var_category_loop]}
else
echo"Category loop ended"
while((var_product_loop -le 6))
do
if [ $var_product_loop -le 6 ]
then
$var6_col6_product=${Product[$var_product_loop]}
$var1_col1_txnno=${var8_col8_state}${var7_col7_city}${var5_col5_category}${var6_col6_product}${rand3_Transaction}
while((rand5_Date_year -le 2016))
do
echo "starting date writing"
if[ ${rand1_Date_d} -le 31 -a ${rand2_Date_m} -le 12 ]
then
$var2_col2_txndate=${rand1_Date_d}${hiphen}${rand2_Date_m}${hiphen}${rand5_Date_year}
else
echo "Date part completed"
((ran5_Date_year+=1)))
done
$var3_col3_custno=${var8_col8_state}${var7_col7_city}${var5_col5_category}${var6_col6_product}${rand4_cust_no}
$var4_col4_amount=${rand6_amount_no}
$var9_col9_spendby=${var3_col3_custno}${hiphen}${var7_col7_city}
echo "The product loop finished for one product"
write_value=${var1_col1_txnno}${comma}${var2_col2_txndate}${comma}${var3_col3_custno}${comma}${var4_col4_amount}${comma}${var5_col5_category}${comma}${var6_col6_product}${comma}${var7_col7_city}${comma}${var8_col8_state}${comma}${var9_col9_spendby}
echo ${write_value}>>big_data_file_$$
##### Product end variable declaration
((rand3_Transaction+=1))
((rand6_amount_no+=212))
((var_product_loop+=1))
done
((var_category+=1))
done
((var_city_loop+=1))
done
((var_state_loop+=1))
done
When I run the code, I get the following error every time:
line 94: syntax error near unexpected token `then'
biggun.ksh: line 94: ` then'
Answer 1
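Some errors I found:
- The shebang must be #!/bin/sh. All those extra # characters are wrong: they turn the first line into an ordinary comment, so it no longer acts as a shebang.
- Lines 15-21 are identical to lines 4-11 (delete them).
- Line 58: the $ is missing in front of the variable Col2 in Column_list=
- Line 59: echo "$Column_list">big_data_file_$$ will erase everything previously written to big_data_file_$$. Change > to >>.
- Lines 93, 103, 112, 124: if[ should be changed to if [ (with a space).
- Line 114: echo"Category must have a space: echo "Category
- Line 129: there are 3 closing ) characters; remove one.
- Line 129: the (( )) construct is not valid in sh (your shebang).
- Several closing fi keywords are missing (at least 5), and possibly one done as well.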
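As a rough illustration of the spacing, fi/done, and arithmetic points in the list above, here is a minimal sketch in plain POSIX sh. The function name demo_loop and the variables i and label are made up for this example and do not come from the original script. It also shows that an assignment takes no $ on the left-hand side, which the script gets wrong in lines such as $var8_col8_state=...

```sh
#!/bin/sh
# Hypothetical example: demo_loop, i and label are illustrative names only.
demo_loop() {
    i=0
    # "[" is a command, so it needs a space after "while"/"if" and around its arguments.
    while [ "$i" -le 2 ]; do
        if [ "$i" -le 1 ]; then
            label="low"          # assignment: no "$" in front of the name
        else
            label="high"
        fi                       # every "if" is closed by "fi"
        echo "$i $label"
        i=$((i + 1))             # POSIX arithmetic expansion instead of (( i += 1 ))
    done                         # every "while" is closed by "done"
}

demo_loop
```

If the script is meant to keep using arrays and (( )), the alternative is to change the shebang to a shell that supports them, such as ksh or bash, rather than rewriting everything for sh.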
I'm tired. Test your code, clean it up, and do your homework.
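For the > versus >> point, the following sketch shows one way the write pattern could look once the syntax is cleaned up. It is only an outline under assumptions: generate_rows, the output path, the shortened value lists, and the sample field values (including "cash" for spendby) are placeholders invented here, not the asker's intended data.

```sh
#!/bin/sh
# Hypothetical sketch: generate_rows and all sample values are placeholders.
generate_rows() {
    out_file=$1
    txnno=1
    # ">" truncates the file, so use it exactly once, for the first write.
    echo "txnno,txndate,custno,amount,category,product,city,state,spendby" > "$out_file"
    for state in UP MP; do
        for city in ABC BCD; do
            for product in ICECREAM Wheat; do
                # ">>" appends, so rows written earlier are kept.
                echo "$txnno,01-01-1999,$((1155 + txnno)),$((txnno * 212)).0,Foods,$product,$city,$state,cash" >> "$out_file"
                txnno=$((txnno + 1))
            done
        done
    done
}

# Usage: header plus 2 * 2 * 2 = 8 data rows in a temporary file.
sample_file="${TMPDIR:-/tmp}/txn_sample_$$.csv"
generate_rows "$sample_file"
wc -l < "$sample_file"
rm -f "$sample_file"
```

For very large outputs it is usually cheaper to redirect the whole loop once (done > "$out_file") than to reopen the file for every row, since each >> opens and closes the file again.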