20 Commits

Author SHA1 Message Date
Ayaz Salikhov
03e5fe572d Fix docs: we're not installing stable version of spark anymore (#2165) 2024-10-29 10:11:31 +00:00
Ayaz Salikhov
b744182207 Start using spark4-preview versions (#2159)
* Start using spark4-preview versions

* Allow to download preview versions

* Expect warnings in spark

* Disable local_sparklyr test for now
2024-10-22 11:47:45 +01:00
Ayaz Salikhov
c4cb04ec37 Make Spark scripts more robust: support preview versions and Spark 4 output 2024-10-19 19:08:04 +01:00
Ayaz Salikhov
5365b9f79f Rename: ROOT_CONTAINER->ROOT_IMAGE, BASE_CONTAINER->BASE_IMAGE (#2155)
* Rename: ROOT_CONTAINER->ROOT_IMAGE, BASE_CONTAINER->BASE_IMAGE

* Add changelog
2024-10-09 15:02:53 +01:00
Ayaz Salikhov
afe30f0c9a Use argparse to setup spark (#2082) 2024-01-17 15:07:15 +04:00
Ayaz Salikhov
bf33945b9e Do not bloat spark image with ENV variables (#2081)
* Do not bloat spark image with ENV variables

* Remove HadoopVersionTagger
2024-01-17 13:34:33 +04:00
Ayaz Salikhov
14a29d12d8 Improve comments in images 2024-01-15 14:56:23 +04:00
Ayaz Salikhov
e84bfdf4ae Add logger to setup_julia and setup_spark 2024-01-07 15:55:41 +04:00
Ayaz Salikhov
c294e9e2d9 Automatically install latest spark version (#2075)
* Automatically install latest pyspark version

* Better text

* Do not use shutil to keep behaviour

* Make setup_script cwd independent

* Use _get_program_version to calculate spark version

* Update setup_spark.py reqs

* Update setup_spark.py

* Add info about HADOOP_VERSION

* Add customization back

* Better text

* Specify build args when they are actually needed

* Better text

* Better code

* Better code

* Better text

* Get rid of warning

* Improve code

* Remove information about checksum

* Better text
2024-01-07 10:01:23 +04:00
Ayaz Salikhov
06cdadd0bf Improve spark pandas version information 2024-01-05 15:06:26 +04:00
Ayaz Salikhov
2927745fb2 Add order of precedence for spark-config script 2023-12-04 12:05:56 +01:00
Ayaz Salikhov
d8c60bc42c Fix more grammar issues 2023-11-19 12:16:19 +01:00
xieshuaihu
66f7beff16 add grpcio grpcio-status to support spark connect (#2017)
* add grpcio grpcio_status to support spark connect

* Sort install list

* Fix package name

* Update pyspark docs with new deps grpcio and grpcio-status

* set grpcio and grpcio-status version as 1.56

* exclude grpcio and grpcio-status in test_packages.py

* Update selecting.md

* Update test_packages.py

* Update Dockerfile

---------

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>
2023-10-30 12:11:32 +01:00
Ayaz Salikhov
00a6728161 Move from Docker Hub to quay.io (#2010)
* Move from Docker Hub to quay.io

* Fix http->https

* Update registry-overviews

* Remove Docker Hub name
2023-10-20 22:31:45 +02:00
Ayaz Salikhov
3e04ded3a3 Add DockerHub warning (#2009) 2023-10-20 22:30:14 +02:00
Ayaz Salikhov
814219407b Remove some Docker Hub usage 2023-10-19 22:06:58 +02:00
Ayaz Salikhov
f8cd90ade1 Add an ability to specify registry when using docker images (#2008)
* Add an ability to specify registry when using docker images

* Fix typo

* [TMP] Speedup workflow

* Revert "[TMP] Speedup workflow"

This reverts commit 3af0055ccf.
2023-10-19 21:15:10 +02:00
Bjørn Jørgensen
52a999a554 Upgrade Apache Spark to 3.5.0 (#1995)
* 1.

* add note for pandas version

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update images/pyspark-notebook/Dockerfile

Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>
2023-09-16 20:56:10 +04:00
Ayaz Salikhov
cf9a8b6624 Small improvements to startup hooks (#1976)
* Small improvements to startup hooks

* Fix
2023-08-20 16:55:54 +02:00
Ayaz Salikhov
a5b40a6f11 Move all images to images dir (#1972) 2023-08-19 17:25:20 +02:00