This is my first attempt at using Apache Sqoop to import a SQL Server table (6 columns, 4 records) into Hive. Here is the command I ran:
sqoop import \
  --connect "jdbc:sqlserver://192.168.10.101:1433;database=Testdb" \
  --username abc --password abc \
  --table "DimEmployee" \
  --create-hive-table --hive-import \
  --hive-table DboDimEmployee
It runs without errors, but it stalls at this line of output:
19/01/25 13:19:36 INFO mapreduce.Job: Running job: job_1548438714494_0003
I checked the Hadoop web UI. No resources are allocated to this particular application and the progress stays at 0%. I am not sure what I am doing wrong.
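For what it is worth, the same state can also be seen from the command line; this is the kind of check I mean (a sketch, using the application id from the full log at the end of this post):

# List all YARN applications with their state and progress
yarn application -list

# Detailed status (state, progress, tracking URL) of the stuck application
yarn application -status application_1548697949348_0001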
Here is some additional information:
- The SQL Server connection parameters are correct; I have already tested the connection from the Hadoop side (see the sketch right after this list).
- Hive works fine. I am able to create databases and tables in Hive.
- The whole Hadoop cluster runs inside VirtualBox on my laptop. The master node has 4 GB of memory and each data node has 1 GB.
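By "tested the connection" in the first point, I mean a quick check of this kind (a rough sketch using the same host and placeholder credentials as in the import command):

# Sanity check: list the tables in Testdb over the same JDBC connection
sqoop list-tables \
  --connect "jdbc:sqlserver://192.168.10.101:1433;database=Testdb" \
  --username abc --password abc

# Or run a trivial query against the source table
sqoop eval \
  --connect "jdbc:sqlserver://192.168.10.101:1433;database=Testdb" \
  --username abc --password abc \
  --query "SELECT COUNT(*) FROM DimEmployee"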
Below is the only memory-related configuration I have made in Hadoop. I am not sure whether the problem is memory-related, but I am posting it just in case.
vi mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>256</value>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>128</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>128</value>
</property>
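Since the settings above are the only memory-related changes I made, the related scheduler properties in yarn-site.xml should still be at their stock values. For reference, these are the documented Hadoop defaults (not values I set myself):

<!-- yarn-site.xml: stock Hadoop defaults, assuming nothing here was overridden -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- total memory the NodeManager may hand out to containers on one node -->
  <value>8192</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <!-- smallest container the scheduler will allocate; requests are rounded up to this -->
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- largest container the scheduler will allocate -->
  <value>8192</value>
</property>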
I never see any MapReduce progress, such as the map and reduce percentages.
My installation does not include HBase, HCatalog, Accumulo, or ZooKeeper. I do not think I need them, but I could be wrong.
Below are all of the execution messages I get from Sqoop:
Warning: /home/admin1/sqoop/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/admin1/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/admin1/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/admin1/sqoop/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/01/28 09:56:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/01/28 09:56:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/01/28 09:56:12 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
19/01/28 09:56:12 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
19/01/28 09:56:12 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
19/01/28 09:56:12 INFO manager.SqlManager: Using default fetchSize of 1000
19/01/28 09:56:12 INFO tool.CodeGenTool: Beginning code generation
19/01/28 09:56:12 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DimEmployee AS t WHERE 1=0
19/01/28 09:56:12 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DimEmployee AS t WHERE 1=0
19/01/28 09:56:12 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/admin1/hadoop
Note: /tmp/sqoop-admin1/compile/e8e0b042e5ecc16c39484556762dae8a/DimEmployee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/01/28 09:56:17 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-admin1/compile/e8e0b042e5ecc16c39484556762dae8a/DimEmployee.jar
19/01/28 09:56:18 INFO mapreduce.ImportJobBase: Beginning import of DimEmployee
19/01/28 09:56:18 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/01/28 09:56:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM DimEmployee AS t WHERE 1=0
19/01/28 09:56:19 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
19/01/28 09:56:19 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/01/28 09:56:29 INFO db.DBInputFormat: Using read commited transaction isolation
19/01/28 09:56:29 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(EmployeeKey), MAX(EmployeeKey) FROM DimEmployee
19/01/28 09:56:29 INFO db.IntegerSplitter: Split size: 73; Num splits: 4 from: 1 to: 296
19/01/28 09:56:29 INFO mapreduce.JobSubmitter: number of splits:4
19/01/28 09:56:29 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
19/01/28 09:56:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1548697949348_0001
19/01/28 09:56:31 INFO impl.YarnClientImpl: Submitted application application_1548697949348_0001
19/01/28 09:56:31 INFO mapreduce.Job: The url to track the job: http://name1:8088/proxy/application_1548697949348_0001/
19/01/28 09:56:31 INFO mapreduce.Job: Running job: job_1548697949348_0001
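If container logs would help with diagnosing this, the command I would use to pull them is something like the following (assuming log aggregation is enabled in my setup):

yarn logs -applicationId application_1548697949348_0001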