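I have seen:
- Installing parquet-tools
- Unable to compile parquet-tools
- Could not transfer artifact (https://repo.maven.apache.org/maven2): Received fatal alert: protocol_version -> [Help 1]
- Maven project fails to execute "maven-thrift-plugin"
- How to install libthrift-dev on Ubuntu?
and a few other posts about installing thrift. I really don't want to build thrift from source just to build parquet-mr. All I want is parquet-tools.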
I'm on:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
$
Things I've tried:
Building from master and from some release tags, e.g. 1.11.x. This fails with various errors, such as:
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project parquet-generator: Error rendering velocity resource.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    ...
Caused by: org.apache.maven.plugin.MojoExecutionException: Error rendering velocity resource.
    at org.apache.maven.plugin.resources.remote.ProcessRemoteResourcesMojo.processResourceBundles (ProcessRemoteResourcesMojo.java:1246)
    ...
Caused by: java.lang.NullPointerException
    at java.util.Objects.requireNonNull (Objects.java:203)
    ...
Installing thrift with
sudo apt-get install thrift-compiler
(this installs 0.9.x, but building parquet-mr then fails on the version check):
[DEBUG] (f) arguments = [-c, thrift -version | fgrep 'Thrift version 0.12.0' && exit 0; echo "================================================================================="; echo "========== [FATAL] Build is configured to require Thrift version 0.12.0 =========="; echo -n "========== Currently installed: "; thrift -version; echo "================================================================================="; exit 1]
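The [FATAL] check in that log is a fixed-string match on the output of thrift -version. A minimal sketch of the same test, useful for checking a machine before starting a long Maven build, could look like this. The function name thrift_version_matches is hypothetical and not part of parquet-mr or thrift.

```shell
#!/bin/sh
# Hypothetical helper (not part of parquet-mr): succeed only if the text
# reported by `thrift -version` contains the exact required version string,
# mirroring the fgrep check shown in the build log.
thrift_version_matches() {
    required="$1"   # e.g. 0.12.0
    reported="$2"   # e.g. "Thrift version 0.9.1"
    # -F: treat the pattern as a fixed string, -q: print nothing
    printf '%s\n' "$reported" | grep -Fq "Thrift version $required"
}

# Usage (assumes thrift is on PATH):
#   thrift_version_matches 0.12.0 "$(thrift -version)" \
#       || echo "installed thrift does not match the required version"
```

With the 0.9.x package from apt this returns non-zero, which matches the failure seen during the build.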
Building thrift from source, which gives me errors such as:
checking whether we are cross compiling... configure: error: in `/home/kash/vm_share/thrift-0.13.0':
configure: error: cannot run C compiled programs.
Looking for a prebuilt thrift 0.12.0 or 0.13.0, but I couldn't find one. It seems that only 0.9.0 is available for bionic.
Please! I just want to view the metadata of a parquet file on the command line.
Answer 1
So I finally managed to compile it from source.
Summary:
- Compile thrift with --host=x86_64.
- Use the apache-parquet-1.11.11 tag instead of master on the parquet-mr repo.
- Update the thrift dependency version from 0.12.0 to 0.13.0 in parquet-mr/pom.xml and add the Maven Central repo (codehaus is dead):
+ <repository>
+ <id>mvnrepository</id>
+ <url>https://repo1.maven.org/maven2/</url>
+ </repository>
...
- <thrift.version>0.12.0</thrift.version>
+ <thrift.version>0.13.0</thrift.version>
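The thrift.version change in the diff can also be applied from the command line instead of by hand. This is only a sketch: set_thrift_version is a hypothetical helper name, and it assumes the property sits on a single line as in the diff. The repository entry still has to be added manually.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the <thrift.version> property of a pom.xml.
# Assumes the opening and closing tags are on the same line.
set_thrift_version() {
    pom="$1"       # path to pom.xml
    version="$2"   # e.g. 0.13.0
    tmp="$pom.tmp"
    # Write to a temporary file and move it into place, which avoids
    # relying on the non-portable `sed -i`.
    sed "s|<thrift.version>[^<]*</thrift.version>|<thrift.version>$version</thrift.version>|" \
        "$pom" > "$tmp" && mv "$tmp" "$pom"
}

# Usage (run from the directory that contains the parquet-mr checkout):
#   set_thrift_version parquet-mr/pom.xml 0.13.0
```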
# install dependencies as described here: https://thrift.apache.org/docs/install/debian.html
# install thrift from source
wget -nv http://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz
tar xzf thrift-0.13.0.tar.gz
cd thrift-0.13.0
chmod +x ./configure
./configure --host=x86_64 --disable-libs
sudo make install
# build parquet-tools from source
git clone https://github.com/Parquet/parquet-mr.git
cd parquet-mr
git checkout apache-parquet-1.11.11
# build only parquet-tools and its dependencies
# had to skip tests because one failed
mvn package -pl parquet-tools -am -Plocal -Dmaven.test.skip=true
# Use
java -jar parquet-tools/target/parquet-tools-*.jar --help
# Or if you're lazy like me:
alias parquet-tools="java -jar $(realpath ./parquet-tools/target/parquet-tools-*.jar)"
parquet-tools -h
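Note that the alias above resolves the jar path once, at the moment it is defined, so it has to be defined from inside the parquet-mr checkout after the build. As a rough alternative, a small function can look the jar up on each call. The name find_parquet_tools_jar and the file example.parquet are hypothetical. The meta subcommand is what prints a file's metadata, which was the original goal.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first parquet-tools jar found
# under <checkout>/parquet-tools/target, or return non-zero if none exists.
# If several jars match, the first one in glob order is used.
find_parquet_tools_jar() {
    checkout="$1"   # path to the parquet-mr checkout
    for jar in "$checkout"/parquet-tools/target/parquet-tools-*.jar; do
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    return 1
}

# Usage (assumes the build succeeded and example.parquet exists):
#   java -jar "$(find_parquet_tools_jar ./parquet-mr)" meta example.parquet
```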
Answer 2
If you're interested, you can do it with Homebrew:
brew install parquet-tools
It worked for me (on 20.04 LTS), but it took a while and pulled in a lot of dependencies.