Running Jupyter on Slurm

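Notes on the SBATCH options used in all the example job scripts below:

  • supervisor: replace this with the name of the billing account associated with your user account
  • Adjust the other SBATCH options to suit your actual needs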

First, create a Slurm job script.

vi jupyterLab.sh

Running Jupyter on a CPU node

Paste the following text into the job script, then save the file.

#!/bin/bash

#SBATCH --job-name=jupyter
#SBATCH --partition=hpxg
#SBATCH --ntasks-per-node=1
#SBATCH --time=06:00:00
#SBATCH --output=./jupyter.log

hostip=$(ip a | grep 10.255.255.255 | awk '{print $2}' | awk -F/ '{print $1}')  # Do not delete or modify this line.
echo "ssh -L8888:${hostip}:8888 ${USER}@swarm.whu.edu.cn" > ./jupyter.log # Do not delete or modify this line.

module load jupyter # Loads the system-wide Jupyter. To use your own Jupyter installation, comment out or delete this line.

jupyter lab --ip=${hostip} --port=8888 # Start the JupyterLab service on port 8888 of the compute node
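The hostip line in the script can be read as a three-stage pipeline: select the line of `ip a` output that carries the cluster's broadcast address, take its second field (the address with its prefix length), and cut off the prefix. A minimal sketch on a made-up sample line follows. The interface name `eth0`, the `/8` prefix, and the helper name `extract_hostip` are assumptions for illustration only, not values taken from the cluster.

```shell
#!/bin/bash
# Illustrative sample of one line of `ip a` output (hypothetical values).
sample='    inet 10.34.0.12/8 brd 10.255.255.255 scope global eth0'

# Same three stages as the job script, reading from stdin instead of `ip a`.
extract_hostip() {
    grep '10.255.255.255' | awk '{print $2}' | awk -F/ '{print $1}'
}

hostip=$(printf '%s\n' "$sample" | extract_hostip)
echo "$hostip"
```

On a compute node the same pipeline reads from `ip a` directly, so the value depends on that node's network configuration.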

Running Jupyter on a GPU node

Paste the following text into the job script, then save the file.

#!/bin/bash

#SBATCH --account=supervisor
#SBATCH --job-name=jupyter
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --time=06:00:00
#SBATCH --output=./jupyter.log

hostip=$(ip a | grep 10.255.255.255 | awk '{print $2}' | awk -F/ '{print $1}')  # Do not delete or modify this line.
echo "ssh -L8888:${hostip}:8888 ${USER}@swarm.whu.edu.cn" > ./jupyter.log # Do not delete or modify this line.

module load jupyter # Loads the system-wide Jupyter. To use your own Jupyter installation, comment out or delete this line.

jupyter lab --ip=${hostip} --port=8888 # Start the JupyterLab service on port 8888 of the compute node

Submit the job script to Slurm:

sbatch jupyterLab.sh

Now check whether your job is running:

squeue -u $USER
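Rather than re-running squeue by hand, the check can be scripted. The sketch below assumes the job name `jupyter` from the scripts above and relies on squeue's `-h` (no header), `-n` (filter by job name) and `-o %T` (print only the job state) options. The helper names `job_state` and `wait_until_running` are hypothetical.

```shell
#!/bin/bash
# Print the state (e.g. PENDING, RUNNING) of the user's job named "jupyter".
job_state() {
    squeue -h -u "${USER:-$(id -un)}" -n jupyter -o "%T" | head -n 1
}

# Poll until the job is RUNNING; give up if it is no longer in the queue.
# $1 is the polling interval in seconds (default 10).
wait_until_running() {
    local interval="${1:-10}" state
    while true; do
        state=$(job_state)
        case "$state" in
            RUNNING) echo "job is running"; return 0 ;;
            "")      echo "job not found in queue" >&2; return 1 ;;
            *)       sleep "$interval" ;;
        esac
    done
}

# Usage (on the login node, after sbatch):
#   wait_until_running 10 && cat ./jupyter.log
```

If several jobs share the name `jupyter`, only the first one listed is inspected; filtering by job ID would be more precise.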

Once you have confirmed that the job has started, check the jupyter.log output for the information needed to create the SSH tunnel.

[user@swarm ~]$ cat ./jupyter.log

If the job started successfully, the log file looks like the following. Only the first line and the last line matter.

ssh -L8888:10.34.0.12:8888 xinming@swarm.whu.edu.cn
[I 2023-04-19 15:50:35.298 ServerApp] jupyterlab | extension was successfully linked.
[I 2023-04-19 15:50:35.305 ServerApp] nbclassic | extension was successfully linked.
[I 2023-04-19 15:50:38.073 ServerApp] notebook_shim | extension was successfully linked.
[I 2023-04-19 15:50:38.074 ServerApp] panel.io.jupyter_server_extension | extension was successfully linked.
[I 2023-04-19 15:50:38.260 ServerApp] notebook_shim | extension was successfully loaded.
[I 2023-04-19 15:50:38.261 LabApp] JupyterLab extension loaded from /software/conda/anaconda3/2023.03/lib/python3.10/site-packages/jupyterlab
[I 2023-04-19 15:50:38.262 LabApp] JupyterLab application directory is /software/conda/anaconda3/2023.03/share/jupyter/lab
[I 2023-04-19 15:50:38.270 ServerApp] jupyterlab | extension was successfully loaded.
[I 2023-04-19 15:50:38.313 ServerApp] nbclassic | extension was successfully loaded.
[I 2023-04-19 15:50:38.314 ServerApp] panel.io.jupyter_server_extension | extension was successfully loaded.
[I 2023-04-19 15:50:38.317 ServerApp] Serving notebooks from local directory: /home/xinming
[I 2023-04-19 15:50:38.317 ServerApp] Jupyter Server 1.23.4 is running at:
[I 2023-04-19 15:50:38.317 ServerApp] http://10.34.0.12:8888/lab?token=5ab3861accf57e32f1351ab895cb456c30a20cf9cd10f86c
[I 2023-04-19 15:50:38.317 ServerApp]  or http://127.0.0.1:8888/lab?token=5ab3861accf57e32f1351ab895cb456c30a20cf9cd10f86c
[I 2023-04-19 15:50:38.317 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2023-04-19 15:50:38.438 ServerApp] 

    To access the server, open this file in a browser:
        file:///home/xinming/.local/share/jupyter/runtime/jpserver-17370-open.html
    Or copy and paste one of these URLs:
        http://10.34.0.12:8888/lab?token=5ab3861accf57e32f1351ab895cb456c30a20cf9cd10f86c
     or http://127.0.0.1:8888/lab?token=5ab3861accf57e32f1351ab895cb456c30a20cf9cd10f86c
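Because only the first and the last lines of the log are needed, they can be pulled out directly. A minimal sketch, assuming the log has the layout shown above; the function names `tunnel_command` and `local_url` are hypothetical.

```shell
#!/bin/bash
# First line of the log: the ssh tunnel command written by the job script.
tunnel_command() {
    head -n 1 "${1:-./jupyter.log}"
}

# Last URL in the log that points at 127.0.0.1: the address to open
# in the local browser once the tunnel is up.
local_url() {
    grep -o 'http://127\.0\.0\.1:[0-9]*/[^[:space:]]*' "${1:-./jupyter.log}" | tail -n 1
}

# Usage:
#   tunnel_command ./jupyter.log
#   local_url ./jupyter.log
```

Matching on the URL rather than taking the literal last line keeps the result stable when Jupyter appends further messages to the log later.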

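On your local computer, open a new terminal window, then copy and run the command from the first line of ./jupyter.log to create an SSH tunnel.

On Windows, run the command in cmd (Command Prompt); on Linux or macOS, run it in a terminal.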

ssh -L8888:10.34.0.12:8888 xinming@swarm.whu.edu.cn  # After entering your password and logging in, do not close this terminal window.

Then open a browser on your local computer and visit the URL from the last line of ./jupyter.log.

http://127.0.0.1:8888/lab?token=5ab3861accf57e32f1351ab895cb456c30a20cf9cd10f86c

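If the log reports that port 8888 is already in use, cancel the job and change every occurrence of port 8888 in the job script to another port (any port above 1024).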
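The port appears three times in the job script (twice in the `ssh -L` line and once in `--port`), and all occurrences have to change together. A minimal sketch using sed follows; the replacement port 8899 and the helper name `change_port` are arbitrary illustrations. The running job can be cancelled beforehand with `scancel` followed by the job ID shown by squeue.

```shell
#!/bin/bash
# Replace every occurrence of the old port with the new one in a job script.
# A .bak copy of the original file is kept next to it.
# Assumes the old port number does not occur anywhere else in the script.
change_port() {
    local script="$1" old="$2" new="$3"
    sed -i.bak "s/${old}/${new}/g" "$script"
}

# Usage:
#   change_port jupyterLab.sh 8888 8899
#   sbatch jupyterLab.sh
```

After resubmitting, both the ssh command on the first line of jupyter.log and the URL to open in the browser carry the new port.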
