|Robert James||Jul 8, 2014 6:22 pm|
|Patrick Wendell||Jul 9, 2014 12:45 am|
|Koert Kuipers||Jul 9, 2014 5:20 am|
|Surendranauth Hiraman||Jul 9, 2014 5:30 am|
|Robert James||Jul 9, 2014 6:47 am|
|Jerry Lam||Jul 9, 2014 7:14 am|
|Andrei||Jul 9, 2014 8:34 am|
|Sandy Ryza||Jul 9, 2014 9:05 am|
|Koert Kuipers||Jul 9, 2014 9:14 am|
|Jerry Lam||Jul 9, 2014 9:25 am|
|Sandy Ryza||Jul 9, 2014 9:28 am|
|Ron Gonzalez||Jul 9, 2014 9:37 am|
|Ron Gonzalez||Jul 9, 2014 9:40 am|
|Andrew Or||Jul 9, 2014 6:39 pm|
|Koert Kuipers||Jul 10, 2014 6:10 am|
|Subject:||Re: Purpose of spark-submit?|
|From:||Surendranauth Hiraman (sure...@velos.io)|
|Date:||Jul 9, 2014 5:30:38 am|
Are there any gaps beyond convenience and code/config separation in using spark-submit versus SparkConf/SparkContext if you are willing to set your own config?
If there are any gaps, +1 on having parity within SparkConf/SparkContext where possible. In our use case, we launch our jobs programmatically; in theory we could shell out to spark-submit, but that's not the best option for us.
So far we have only used standalone cluster mode, though, so I'm not familiar with the complexities of the other modes.
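For reference, a programmatic launch against a standalone cluster can look roughly like the sketch below. This is an illustrative sketch, not code from this thread; the master URL, jar path, and object name are placeholder values:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ProgrammaticLaunch {
  def main(args: Array[String]): Unit = {
    // Everything spark-submit would normally inject is set explicitly here;
    // all values are illustrative placeholders.
    val conf = new SparkConf()
      .setAppName("my-job")
      .setMaster("spark://master-host:7077")        // standalone cluster master
      .setJars(Seq("/path/to/my-job-assembly.jar")) // ship the application jar to executors
      .set("spark.executor.memory", "2g")

    val sc = new SparkContext(conf)
    try {
      val sum = sc.parallelize(1 to 100).reduce(_ + _)
      println(s"sum = $sum")
    } finally {
      sc.stop()
    }
  }
}
```

This is essentially what works today on standalone mode; as discussed below in the thread, the equivalent on YARN involves extra setup that spark-submit handles.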
On Wed, Jul 9, 2014 at 8:20 AM, Koert Kuipers <koe...@tresata.com> wrote:
not sure I understand why unifying how you submit an app for different platforms, and dynamic configuration, cannot be part of SparkConf and SparkContext?
for classpath, a simple script similar to "hadoop classpath" that shows what needs to be added should be sufficient.
on spark standalone I can launch a program just fine with just SparkConf and SparkContext. not on yarn, so the spark-submit script must be doing a few extra things there that I am missing... which makes things more difficult, because I am not sure it's realistic to expect every application that needs to run something on spark to be launched using spark-submit.
On Jul 9, 2014 3:45 AM, "Patrick Wendell" <pwen...@gmail.com> wrote:
It fulfills a few different functions. The main one is giving users a way to inject Spark as a runtime dependency separately from their program and make sure they get exactly the right version of Spark. So a user can bundle an application and then use spark-submit to send it to different types of clusters (or using different versions of Spark).
It also unifies the way you bundle and submit an app for YARN, Mesos, etc. This was something that had become very fragmented over time before spark-submit was added.
Another feature is allowing users to set configuration values dynamically rather than compiling them into their program. That's the one you mention here. You can choose to use this feature or not: if you know your configs are not going to change, then you don't need to set them with spark-submit.
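Concretely, the dynamic-configuration mechanism works because a no-argument SparkConf reads any spark.* Java system properties that spark-submit has already set before the application starts. A rough sketch (the flags and property shown in the comment are illustrative, not from this thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SubmitFriendlyApp {
  def main(args: Array[String]): Unit = {
    // An empty SparkConf picks up spark.* system properties, so anything
    // passed on the spark-submit command line, for example:
    //   spark-submit --master yarn --conf spark.executor.memory=4g \
    //     --class SubmitFriendlyApp my-app.jar
    // takes effect here without recompiling the program.
    val conf = new SparkConf().setAppName("submit-friendly-app")
    val sc = new SparkContext(conf)
    // ... job logic ...
    sc.stop()
  }
}
```

Values set explicitly in code take precedence over ones passed by spark-submit, which is why a program that hard-codes its configs effectively opts out of this feature.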
On Wed, Jul 9, 2014 at 10:22 AM, Robert James <srob...@gmail.com> wrote:
What is the purpose of spark-submit? Does it do anything outside of the standard val conf = new SparkConf ... val sc = new SparkContext ... ?
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io
W: www.velos.io