--archives flag missing from SparkSubmitHook


The current SparkSubmitHook doesn't appear to support the --archives flag.
From the Spark docs:

"spark.yarn.dist.archives (none): Comma separated list of archives to be
extracted into the working directory of each executor."
https://spark.apache.org/docs/latest/running-on-yarn.html

This flag is necessary for deploying zipped virtualenvs or other packages
across the cluster. For now, I'll have to maintain my own copy of this hook,
but I will contribute the change back if nobody else is planning to.
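
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is a standalone illustration, not the actual SparkSubmitHook code: the function name build_spark_submit_command and its parameters are made up for this example, and the only real pieces are the spark-submit options --master and --archives.

```python
from typing import List, Optional


def build_spark_submit_command(
    application: str,
    master: str = "yarn",
    archives: Optional[str] = None,
) -> List[str]:
    """Assemble a spark-submit argument list, optionally with --archives.

    Hypothetical helper for illustration only; it does not mirror the
    hook's real internals.
    """
    command = ["spark-submit", "--master", master]
    if archives:
        # --archives takes a comma-separated list of archives; on YARN each
        # one is extracted into the working directory of every executor.
        command += ["--archives", archives]
    # The application file goes last, after all options.
    command.append(application)
    return command


if __name__ == "__main__":
    # Example: ship a zipped virtualenv alongside the job.
    print(build_spark_submit_command("job.py", archives="venv.zip#venv"))
```

In the hook itself this would presumably be one more optional constructor argument that is appended to the command only when set, so existing DAGs would be unaffected.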

Is there any context on why this isn't included?

Ben