Port binding error in PySpark

I have been trying to get PySpark to work, using the PyCharm IDE on a Windows 10 machine. For the setup, I took the following steps:

  • Installed PySpark
  • Installed Java 8u211
  • Downloaded and pasted winutils.exe
  • Declared SPARK_HOME, JAVA_HOME and HADOOP_HOME in Path
  • Added the spark folder and zip files to the content root
  • Already tried: export SPARK_LOCAL_IP="127.0.0.1" in load-spark-env.sh, as well as other hostname-related fixes

I get the error below when launching from cmd (the same error appears when running from inside PyCharm). How can I fix this?

Error message:

Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32  
Type "help", "copyright", "credits" or "license" for more information.  
19/05/14 21:33:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable  
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties  
Setting default log level to "WARN".  
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).  
19/05/14 21:33:21 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.  
[the WARN line above repeats 16 times in total as Spark retries random ports]
19/05/14 21:33:21 ERROR SparkContext: Error initializing SparkContext.  
java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.  
        at sun.nio.ch.Net.bind0(Native Method)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)  
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)  
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)  
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)  
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)  
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)  
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)  
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)  
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)  
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)  
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)  
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)  
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)  
        at java.lang.Thread.run(Unknown Source)  
19/05/14 21:33:21 WARN SparkContext: Another SparkContext is being constructed (or threw an exception in its constructor).  This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:  
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)  
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)  
sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)  
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)  
java.lang.reflect.Constructor.newInstance(Unknown Source)  
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)  
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)  
py4j.Gateway.invoke(Gateway.java:238)  
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)  
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)  
py4j.GatewayConnection.run(GatewayConnection.java:238)  
java.lang.Thread.run(Unknown Source)  
19/05/14 21:33:21 WARN Utils: Service 'sparkDriver' could not bind on a random free port. You may check whether configuring an appropriate binding address.  
[the WARN line above again repeats 16 times in total]
19/05/14 21:33:21 ERROR SparkContext: Error initializing SparkContext.  
java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.  
        at sun.nio.ch.Net.bind0(Native Method)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)  
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)  
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)  
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)  
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)  
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)  
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)  
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)  
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)  
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)  
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)  
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)  
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)  
        at java.lang.Thread.run(Unknown Source)  
W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\shell.py:45: UserWarning: Failed to initialize Spark session.  
  warnings.warn("Failed to initialize Spark session.")  
Traceback (most recent call last):  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\shell.py", line 41, in <module>  
    spark = SparkSession._create_shell_session()  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\sql\session.py", line 583, in _create_shell_session  
    return SparkSession.builder.getOrCreate()  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\sql\session.py", line 173, in getOrCreate  
    sc = SparkContext.getOrCreate(sparkConf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 367, in getOrCreate  
    SparkContext(conf=conf or SparkConf())  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 136, in __init__  
    conf, jsc, profiler_cls)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 198, in _do_init  
    self._jsc = jsc or self._initialize_context(self._conf._jconf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\pyspark\context.py", line 306, in _initialize_context  
    return self._jvm.JavaSparkContext(jconf)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__  
    answer, self._gateway_client, None, self._fqn)  
  File "W:\Spark\spark-2.4.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value  
    format(target_id, ".", name), value)  
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.  
: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries (on a random free port)! Consider explicitly setting the appropriate binding address for the service 'sparkDriver' (for example spark.driver.bindAddress for SparkDriver) to the correct binding address.  
        at sun.nio.ch.Net.bind0(Native Method)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.Net.bind(Unknown Source)  
        at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)  
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)  
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)  
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1283)  
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)  
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)  
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:989)  
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)  
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:364)  
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)  
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)  
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)  
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)  
        at java.lang.Thread.run(Unknown Source)  

Answer 1

In case anyone runs into the same problem:

import pyspark

# Bind the driver explicitly to the loopback address instead of
# letting Spark resolve the machine's hostname.
conf = pyspark.SparkConf().set('spark.driver.host', '127.0.0.1')
sc = pyspark.SparkContext(master='local', appName='myAppName', conf=conf)

That worked.
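If you create a SparkSession rather than a bare SparkContext (which is what the PySpark shell in the traceback above does), the same setting can be passed through the builder. A minimal sketch, reusing the local master and app name from above:

from pyspark.sql import SparkSession

# Same fix expressed through the SparkSession builder API (Spark 2.x).
spark = (SparkSession.builder
         .master('local')
         .appName('myAppName')
         .config('spark.driver.host', '127.0.0.1')
         .getOrCreate())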

Answer 2

Not sure why Bishu's answer got downvoted -- for Windows users it is the correct one... it worked for me.

Steps on Windows

For anyone who doesn't know how to set a system variable in Windows, here are the steps:

  1. In an open folder window (with the folder navigation pane on the left), locate "This PC"
  2. Right-click "This PC" and select "Properties"
  3. In the left navigation menu, select "Advanced system settings"
  4. In the new dialog, click "Environment Variables..." at the bottom
  5. In the second (bottom) pane, click "New..."
  6. At the "Variable name" prompt, enter: SPARK_LOCAL_IP
  7. At the "Variable value" prompt, enter: localhost
  8. Note: it might be 127.0.0.1 or something else on your system -- you should really check what is listed in C:\Windows\System32\Drivers\etc\hosts
  9. When finished, OK your way out of all the dialogs
  10. Note: only a newly opened cmd prompt will load/recognize the new system variable -- open a fresh cmd prompt (a per-script alternative is sketched after this list)
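If you'd rather not change system settings at all, the variable can also be exported per script from Python before Spark starts -- a sketch, relying on the fact that the driver JVM launched by PySpark inherits the Python process's environment:

import os

# Must run before any SparkContext/SparkSession is created.
os.environ['SPARK_LOCAL_IP'] = 'localhost'

from pyspark import SparkContext
sc = SparkContext(master='local', appName='myAppName')  # app name reused from Answer 1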

-- Hope this helps

Answer 3

On Windows

Create an environment variable as follows:

Variable -> SPARK_LOCAL_IP, Value -> localhost
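To confirm that a freshly opened prompt actually sees the new variable, a quick hypothetical check from Python:

import os

# Should print 'localhost' once the variable is set and a new shell is opened.
print(os.environ.get('SPARK_LOCAL_IP'))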
