在将 bash 变量存储到 apache hive 数据库之前对其进行转义

在将 bash 变量存储到 apache hive 数据库之前对其进行转义

我正在运行一个脚本文件sqoop_import_ATM.sh,并希望将日志存储在SQL数据库中。

我做的第一件事是将日志定向到我自己的变量中:

OUTPUT="$(/home/me/sqoop_insights_extract/./sqoop_import_ATM.sh 2>&1)"

这同时得到stdoutstderr

然后我将其存储OUTPUTSQL数据库中(这部分不起作用):

hive -e "INSERT INTO SCHEDULED_MIGRATION_LOGS VALUES ('$OUTPUT')"

以下是可能包含的日志示例OUTPUT

rm: `transaction_detail_atm': No such file or directory Warning: /usr/iop/4.1.0.0/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 16/07/26 10:28:33 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6_IBM_20 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/iop/4.1.0.0/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J

为了删除我尝试使用的有害单引号

$OUTPUT | sed "s/'/\ /g"

这似乎奏效了。

现在线路有问题[jar:file:/usr/iop/4.1.0.0/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

它开始检查我的目录中是否有该文件或其他内容

如果我遗漏了一些明显的东西,我很抱歉,我对 bash 很陌生。

知道如何简单地将此日志发布到我的SQL数据库中

编辑 - 整个日志

rm: `transaction_detail_atm': No such file or directory Warning: /usr/iop/4.1.0.0/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 16/07/26 10:28:33 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6_IBM_20 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/iop/4.1.0.0/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/iop/4.1.0.0/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 16/07/26 10:28:33 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 16/07/26 10:28:33 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override 16/07/26 10:28:33 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc. 16/07/26 10:28:33 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time. 16/07/26 10:28:33 INFO manager.SqlManager: Using default fetchSize of 1000 16/07/26 10:28:33 INFO tool.CodeGenTool: Beginning code generation 16/07/26 10:28:34 INFO manager.SqlManager: Executing SQL statement: select QueryResult.java sqoop_import_ALL.sh sqoop_import_ATM.sh sqoop_import_CC.sh sqoop_import_TD.sh sqoop_import_VD.sh testscript.sh transaction_detail.java transaction_detail_visa_debit.java from transaction_detail_atm where posted_dte = current_date-1 and (1 = 0) 16/07/26 10:28:35 INFO manager.SqlManager: Executing SQL statement: select QueryResult.java sqoop_import_ALL.sh sqoop_import_ATM.sh sqoop_import_CC.sh sqoop_import_TD.sh sqoop_import_VD.sh testscript.sh transaction_detail.java transaction_detail_visa_debit.java from transaction_detail_atm where posted_dte = current_date-1 and (1 = 0) 16/07/26 10:28:35 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/iop/4.1.0.0/hadoop Note: /tmp/sqoop-u28158/compile/5fe23466bd0c4860e3529366cde274b4/QueryResult.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. 16/07/26 10:28:37 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-u28158/compile/5fe23466bd0c4860e3529366cde274b4/QueryResult.jar 16/07/26 10:28:37 INFO mapreduce.ImportJobBase: Beginning query import. 16/07/26 10:28:38 INFO impl.TimelineClientImpl: Timeline service address: http://rhdtmomg1.mid.aib.pri:8188/ws/v1/timeline/ 16/07/26 10:28:38 INFO client.RMProxy: Connecting to ResourceManager at rhdtmomg1.mid.aib.pri/10.30.39.1:8050 16/07/26 10:28:43 INFO db.DBInputFormat: Using read commited transaction isolation 16/07/26 10:28:44 INFO mapreduce.JobSubmitter: number of splits:1 16/07/26 10:28:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1464347352640_0138 16/07/26 10:28:44 INFO impl.YarnClientImpl: Submitted application application_1464347352640_0138 16/07/26 10:28:44 INFO mapreduce.Job: The url to track the job: http://rhdtmomg1.mid.aib.pri:8088/proxy/application_1464347352640_0138/ 16/07/26 10:28:44 INFO mapreduce.Job: Running job: job_1464347352640_0138 16/07/26 10:28:49 INFO mapreduce.Job: Job job_1464347352640_0138 running in uber mode : false 16/07/26 10:28:49 INFO mapreduce.Job: map 0% reduce 0% 16/07/26 10:31:21 INFO mapreduce.Job: map 100% reduce 0% 16/07/26 10:31:22 INFO mapreduce.Job: Job job_1464347352640_0138 completed successfully 16/07/26 10:31:22 INFO mapreduce.Job: Counters: 30 File System Counters FILE: Number of bytes read=0 FILE: Number of bytes written=147903 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=87 HDFS: Number of bytes written=2514798055 HDFS: Number of read operations=4 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=1 Other local map tasks=1 Total time spent by all maps in occupied slots (ms)=149743 Total time spent by all reduces in occupied slots (ms)=0 Total time spent by all map tasks (ms)=149743 Total vcore-seconds taken by all map tasks=149743 Total megabyte-seconds taken by all map tasks=306673664 Map-Reduce Framework Map input records=4073603 Map output records=4073603 Input split bytes=87 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=2264 CPU time spent (ms)=127770 Physical memory (bytes) snapshot=368541696 Virtual memory (bytes) snapshot=3798663168 Total committed heap usage (bytes)=331350016 File Input Format Counters Bytes Read=0 File Output Format Counters Bytes Written=2514798055 16/07/26 10:31:22 INFO mapreduce.ImportJobBase: Transferred 2.3421 GB in 164.7606 seconds (14.5563 MB/sec) 16/07/26 10:31:22 INFO mapreduce.ImportJobBase: Retrieved 4073603 records. 16/07/26 10:31:22 INFO manager.SqlManager: Executing SQL statement: select QueryResult.java sqoop_import_ALL.sh sqoop_import_ATM.sh sqoop_import_CC.sh sqoop_import_TD.sh sqoop_import_VD.sh testscript.sh transaction_detail.java transaction_detail_visa_debit.java from transaction_detail_atm where posted_dte = current_date-1 and (1 = 0) 16/07/26 10:31:23 INFO manager.SqlManager: Executing SQL statement: select QueryResult.java sqoop_import_ALL.sh sqoop_import_ATM.sh sqoop_import_CC.sh sqoop_import_TD.sh sqoop_import_VD.sh testscript.sh transaction_detail.java transaction_detail_visa_debit.java from transaction_detail_atm where posted_dte = current_date-1 and (1 = 0) 16/07/26 10:31:23 WARN hive.TableDefWriter: Column DISPENSED_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column POSTED_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column POSTED_REF_CRNCY_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_1_DAY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_3_DAY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_4_DAY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_5_DAY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_6_DAY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column UNCLEARED_TEMP had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column POSTED_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column TRANS_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column POSTED_COMMISSION_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column COMMISSION_REF_CRNCY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column SEQUENCE_NO had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column TRACER_NO had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column FORCE_POST_RECVRY had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column FORCE_POST_RSN had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column CURR_LEDG_BAL had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ATM_POSTED_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ACQUIRER_SETTLEMENT_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ISSUER_SETTLEMENT_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column REQUESTED_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ATM_DISPENSED_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column SETTLEMENT_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column CARD_EXCH_RTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column SETTLEMENT_EXCH_RTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column COMMISSION_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ERR_INFO_CDE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column REVERSAL_RSN_CDE had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column CARD_MBR_NO had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column SURCHARGE_DR_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column SURCHARGE_ORIG_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ACQUIRER_SURCHARGE_AMT had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column ACQUIRER_INSTIT_ID had to be cast to a less precise type in Hive 16/07/26 10:31:23 WARN hive.TableDefWriter: Column LOAD_DTE had to be cast to a less precise type in Hive 16/07/26 10:31:23 INFO hive.HiveImport: Loading uploaded data into Hive Logging initialized using configuration in jar:file:/usr/iop/4.1.0.0/hive/lib/hive-common-1.2.1-IBM-12.jar!/hive-log4j.properties OK Time taken: 1.034 seconds OK Time taken: 0.15 seconds Loading data to table insights.transaction_detail_atm_archive chgrp: changing ownership of 'hdfs://rhdtmomg1.mid.aib.pri:8020/apps/hive/warehouse/insights.db/transaction_detail_atm_archive/part-m-00000_copy_5': User does not belong to hadoop Table insights.transaction_detail_atm_archive stats: [numFiles=6, numRows=0, totalSize=15088788330, rawDataSize=0] OK Time taken: 0.587 seconds

答案1

在 SQL 单引号字符串文字中,可以通过双引号来包含单引号。在 bash 中,您可以使用参数扩展特征来获取 的值,并将所有出现的替换为。${VARIABLE//PATTERN/REPLACEMENT}VARIABLEPATTERNREPLACEMENT

hive -e "INSERT INTO SCHEDULED_MIGRATION_LOGS VALUES ('${OUTPUT//\'/\'\'}')"

请注意,如果OUTPUT包含一个无效的多字节序列,后跟',它'可能会被解析为无效序列的一部分,并且'由加倍代码添加的第二个序列将结束字符串文字。除非您知道数据库如何处理多字节序列,否则这是很难防御的。我不知道这在 Apache Hive 中是如何工作的。如果您的数据应该仅包含 ASCII,您可以检查它是否确实仅包含 ASCII 字符。以下代码片段也拒绝控制字符。

LC_ALL=C
OUTPUT="$(…)"
case "$OUTPUT" in
  *[! -~]*) echo 1>&2 "Output contains characters that are not printable ASCII, aborting."; exit 3;;
esac

相关内容