CLI¶
Installing the package also installs the command-line script pytorch-igniter-demo.
Features¶
- Runs training, dataprep, evaluation and inference locally or on AWS servers.
- Model package is self-contained, with all dependencies and configuration.
- Integrates with pytorch-igniter for creating an experiment CLI and managing training.
- Integrates with MLflow for tracking training runs, including hyperparameters and metrics.
- Integrates with AWS SageMaker using aws-sagemaker-remote for tracking training runs and executing training remotely on managed containers.
Command-Line Arguments¶
The set of arguments and their defaults is configured through code. See the pytorch-igniter documentation.
pytorch-igniter demo script
usage: pytorch-igniter-demo [-h] {train,eval,train-and-eval,dataprep} ...
command¶
| command | Command to execute. Possible choices: train, eval, train-and-eval, dataprep |
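The sub-command layout above can be sketched with argparse sub-parsers. This is an illustrative reconstruction only: the real parser is generated through pytorch-igniter, and the helper name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the top-level parser; the real one is
    # configured through code in pytorch-igniter and may differ.
    parser = argparse.ArgumentParser(
        prog="pytorch-igniter-demo",
        description="pytorch-igniter demo script",
    )
    subparsers = parser.add_subparsers(dest="command", help="Command to execute")
    for name, help_text in [
        ("train", "Train a model"),
        ("eval", "Evaluate a model"),
        ("train-and-eval", "Train and evaluate a model"),
        ("dataprep", "Prepare dataset"),
    ]:
        subparsers.add_parser(name, help=help_text)
    return parser


args = build_parser().parse_args(["train-and-eval"])
print(args.command)  # prints "train-and-eval"
```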
Sub-commands:¶
train¶
Train a model
pytorch-igniter-demo train [-h] [--sagemaker-profile SAGEMAKER_PROFILE]
[--sagemaker-run [SAGEMAKER_RUN]]
[--sagemaker-wait [SAGEMAKER_WAIT]]
[--sagemaker-spot-instances [SAGEMAKER_SPOT_INSTANCES]]
[--sagemaker-script SAGEMAKER_SCRIPT]
[--sagemaker-source SAGEMAKER_SOURCE]
[--sagemaker-training-instance SAGEMAKER_TRAINING_INSTANCE]
[--sagemaker-training-image SAGEMAKER_TRAINING_IMAGE]
[--sagemaker-training-role SAGEMAKER_TRAINING_ROLE]
[--sagemaker-base-job-name SAGEMAKER_BASE_JOB_NAME]
[--sagemaker-job-name SAGEMAKER_JOB_NAME]
[--sagemaker-experiment-name SAGEMAKER_EXPERIMENT_NAME]
[--sagemaker-trial-name SAGEMAKER_TRIAL_NAME]
[--sagemaker-volume-size SAGEMAKER_VOLUME_SIZE]
[--sagemaker-max-run SAGEMAKER_MAX_RUN]
[--sagemaker-max-wait SAGEMAKER_MAX_WAIT]
[--pytorch-igniter-demo PYTORCH_IGNITER_DEMO]
[--model-dir MODEL_DIR] [--output-dir OUTPUT_DIR]
[--checkpoint-dir CHECKPOINT_DIR]
[--sagemaker-checkpoint-s3 SAGEMAKER_CHECKPOINT_S3]
[--sagemaker-checkpoint-container SAGEMAKER_CHECKPOINT_CONTAINER]
[--input INPUT] [--device DEVICE]
[--classes CLASSES] [--max-epochs N]
[--n-saved N_SAVED] [--learning-rate LEARNING_RATE]
[--weight-decay WEIGHT_DECAY]
[--train-batch-size TRAIN_BATCH_SIZE]
[--mlflow-enable [MLFLOW_ENABLE]]
[--mlflow-experiment-name MLFLOW_EXPERIMENT_NAME]
[--mlflow-run-name MLFLOW_RUN_NAME]
[--mlflow-tracking-uri MLFLOW_TRACKING_URI]
[--mlflow-tracking-username MLFLOW_TRACKING_USERNAME]
[--mlflow-tracking-password MLFLOW_TRACKING_PASSWORD]
[--mlflow-tracking-secret-name MLFLOW_TRACKING_SECRET_NAME]
[--mlflow-tracking-secret-profile MLFLOW_TRACKING_SECRET_PROFILE]
[--mlflow-tracking-secret-region MLFLOW_TRACKING_SECRET_REGION]
Named Arguments¶
| --model-dir | Directory to save final model. Default: “output/model” |
| --output-dir | Directory for logs, images, or other output files. Default: “output/output” |
SageMaker¶
SageMaker options
| --sagemaker-profile | AWS profile for SageMaker session. Default: “default” |
| --sagemaker-run | Run training on SageMaker (yes/no). Default: False |
| --sagemaker-wait | Wait for SageMaker training to complete and tail log files (yes/no). Default: True |
| --sagemaker-spot-instances | Use spot instances for training (yes/no). Default: False |
| --sagemaker-script | Script to run on SageMaker. Default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo/main.py” |
| --sagemaker-source | Source to upload to SageMaker. Must contain the script. If blank, defaults to the directory containing the script. Default: “” |
| --sagemaker-training-instance | Instance type for training. Default: “ml.m5.large” |
| --sagemaker-training-image | Docker image for training. Default: “683880991063.dkr.ecr.us-east-1.amazonaws.com/columbo-sagemaker-training:latest” |
| --sagemaker-training-role | AWS role for training. Default: “aws-sagemaker-remote-training-role” |
| --sagemaker-base-job-name | Base job name for tracking and organization on S3. A job name will be generated from the base job name unless a job name is specified. Default: “training-job” |
| --sagemaker-job-name | Job name for tracking. Use --sagemaker-base-job-name instead and a job name will be automatically generated with a timestamp. Default: “” |
| --sagemaker-experiment-name | Name of experiment in SageMaker tracking. |
| --sagemaker-trial-name | Name of experiment trial in SageMaker tracking. |
| --sagemaker-volume-size | Volume size in GB. Default: 30 |
| --sagemaker-max-run | Maximum runtime in seconds. Default: 43200 |
| --sagemaker-max-wait | Maximum time to wait for spot instances in seconds. Default: 86400 |
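The usage line shows yes/no options such as `[--sagemaker-run [SAGEMAKER_RUN]]` taking an optional value. A minimal sketch of how such a flag can be declared with argparse follows. The converter `str_to_bool`, and the assumption that a bare flag means "yes", are illustrative and not confirmed by the package.

```python
import argparse


def str_to_bool(value):
    # Assumed converter: accepts common yes/no spellings.
    text = str(value).strip().lower()
    if text in ("yes", "y", "true", "t", "1"):
        return True
    if text in ("no", "n", "false", "f", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected yes/no, got {value!r}")


parser = argparse.ArgumentParser()
# nargs="?" makes the value optional; const is used when the flag is bare.
parser.add_argument("--sagemaker-run", nargs="?", type=str_to_bool,
                    const=True, default=False)
parser.add_argument("--sagemaker-wait", nargs="?", type=str_to_bool,
                    const=True, default=True)

print(parser.parse_args([]).sagemaker_run)                           # False
print(parser.parse_args(["--sagemaker-run"]).sagemaker_run)          # True
print(parser.parse_args(["--sagemaker-wait", "no"]).sagemaker_wait)  # False
```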
Dependencies¶
Dependencies to upload to SageMaker
| --pytorch-igniter-demo | Directory for dependency [pytorch_igniter_demo] (default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo”). Default: “pytorch_igniter_demo” |
Checkpoints¶
Checkpointing options
| --checkpoint-dir | Local directory to store checkpoints for resuming training. Default: “output/checkpoint” |
| --sagemaker-checkpoint-s3 | Location to store checkpoints on S3, or “default”. Default: “default” |
| --sagemaker-checkpoint-container | Location to store checkpoints on container. Default: “/opt/ml/checkpoints” |
Inputs¶
Inputs (local or S3)
| --input | Input channel [input]. Set to a local path and it will be uploaded to S3 and downloaded to SageMaker. Set to an S3 path and it will be downloaded to SageMaker. Default: “output/data” |
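The two cases described for `--input` can be told apart by the URI scheme. The sketch below shows one way to make that distinction. The helper names `is_s3_uri` and `describe_input` are hypothetical and are not part of aws-sagemaker-remote.

```python
from urllib.parse import urlparse


def is_s3_uri(path):
    # An S3 location is written as s3://bucket/key.
    return urlparse(path).scheme == "s3"


def describe_input(path):
    # Mirrors the two behaviours described for the --input channel.
    if is_s3_uri(path):
        return "download from S3 to SageMaker"
    return "upload to S3, then download to SageMaker"


print(describe_input("output/data"))
print(describe_input("s3://my-bucket/datasets/demo"))
```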
Training¶
Training arguments
| --max-epochs | Number of epochs to train. Default: 10 |
| --n-saved | Number of checkpoints to keep. Default: 10 |
| --learning-rate | Default: 0.001 |
| --weight-decay | Default: 1e-05 |
| --train-batch-size | Default: 32 |
MLflow¶
MLflow arguments
| --mlflow-enable | Enable logging to MLflow. Default: True |
| --mlflow-experiment-name | Experiment name in MLflow. Default: “default” |
| --mlflow-run-name | Run name in MLflow. Default: None |
| --mlflow-tracking-uri | URI of MLflow tracking server. Default: None |
| --mlflow-tracking-username | Username for MLflow tracking server. Default: None |
| --mlflow-tracking-password | Password for MLflow tracking server. Default: None |
| --mlflow-tracking-secret-name | Secret for accessing MLflow. Default: None |
| --mlflow-tracking-secret-profile | Profile for accessing the MLflow secret. Default: None |
| --mlflow-tracking-secret-region | Region for accessing the MLflow secret. Default: None |
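The tracking URI, username and password options correspond to settings that the MLflow client can also read from the environment variables `MLFLOW_TRACKING_URI`, `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD`. The sketch below maps option values to those variables. Whether pytorch-igniter passes them this way is an assumption, and `mlflow_env` is a hypothetical helper.

```python
def mlflow_env(tracking_uri=None, username=None, password=None):
    # Hypothetical helper: collect only the settings that were provided,
    # keyed by the environment variable names the MLflow client reads.
    candidates = {
        "MLFLOW_TRACKING_URI": tracking_uri,
        "MLFLOW_TRACKING_USERNAME": username,
        "MLFLOW_TRACKING_PASSWORD": password,
    }
    return {key: value for key, value in candidates.items() if value is not None}


env = mlflow_env(tracking_uri="http://localhost:5000", username="demo")
print(sorted(env))
```

Options left at their default of None are omitted, so the MLflow client would fall back to its own defaults.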
eval¶
Evaluate a model
pytorch-igniter-demo eval [-h] [--sagemaker-profile SAGEMAKER_PROFILE]
[--sagemaker-run [SAGEMAKER_RUN]]
[--sagemaker-wait [SAGEMAKER_WAIT]]
[--sagemaker-spot-instances [SAGEMAKER_SPOT_INSTANCES]]
[--sagemaker-script SAGEMAKER_SCRIPT]
[--sagemaker-source SAGEMAKER_SOURCE]
[--sagemaker-training-instance SAGEMAKER_TRAINING_INSTANCE]
[--sagemaker-training-image SAGEMAKER_TRAINING_IMAGE]
[--sagemaker-training-role SAGEMAKER_TRAINING_ROLE]
[--sagemaker-base-job-name SAGEMAKER_BASE_JOB_NAME]
[--sagemaker-job-name SAGEMAKER_JOB_NAME]
[--sagemaker-experiment-name SAGEMAKER_EXPERIMENT_NAME]
[--sagemaker-trial-name SAGEMAKER_TRIAL_NAME]
[--sagemaker-volume-size SAGEMAKER_VOLUME_SIZE]
[--sagemaker-max-run SAGEMAKER_MAX_RUN]
[--sagemaker-max-wait SAGEMAKER_MAX_WAIT]
[--pytorch-igniter-demo PYTORCH_IGNITER_DEMO]
[--model-dir MODEL_DIR] [--output-dir OUTPUT_DIR]
[--checkpoint-dir CHECKPOINT_DIR]
[--sagemaker-checkpoint-s3 SAGEMAKER_CHECKPOINT_S3]
[--sagemaker-checkpoint-container SAGEMAKER_CHECKPOINT_CONTAINER]
[--input INPUT] [--device DEVICE]
[--classes CLASSES]
[--eval-batch-size EVAL_BATCH_SIZE]
[--mlflow-enable [MLFLOW_ENABLE]]
[--mlflow-experiment-name MLFLOW_EXPERIMENT_NAME]
[--mlflow-run-name MLFLOW_RUN_NAME]
[--mlflow-tracking-uri MLFLOW_TRACKING_URI]
[--mlflow-tracking-username MLFLOW_TRACKING_USERNAME]
[--mlflow-tracking-password MLFLOW_TRACKING_PASSWORD]
[--mlflow-tracking-secret-name MLFLOW_TRACKING_SECRET_NAME]
[--mlflow-tracking-secret-profile MLFLOW_TRACKING_SECRET_PROFILE]
[--mlflow-tracking-secret-region MLFLOW_TRACKING_SECRET_REGION]
Named Arguments¶
| --model-dir | Directory to save final model. Default: “output/model” |
| --output-dir | Directory for logs, images, or other output files. Default: “output/output” |
SageMaker¶
SageMaker options
| --sagemaker-profile | AWS profile for SageMaker session. Default: “default” |
| --sagemaker-run | Run training on SageMaker (yes/no). Default: False |
| --sagemaker-wait | Wait for SageMaker training to complete and tail log files (yes/no). Default: True |
| --sagemaker-spot-instances | Use spot instances for training (yes/no). Default: False |
| --sagemaker-script | Script to run on SageMaker. Default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo/main.py” |
| --sagemaker-source | Source to upload to SageMaker. Must contain the script. If blank, defaults to the directory containing the script. Default: “” |
| --sagemaker-training-instance | Instance type for training. Default: “ml.m5.large” |
| --sagemaker-training-image | Docker image for training. Default: “683880991063.dkr.ecr.us-east-1.amazonaws.com/columbo-sagemaker-training:latest” |
| --sagemaker-training-role | AWS role for training. Default: “aws-sagemaker-remote-training-role” |
| --sagemaker-base-job-name | Base job name for tracking and organization on S3. A job name will be generated from the base job name unless a job name is specified. Default: “training-job” |
| --sagemaker-job-name | Job name for tracking. Use --sagemaker-base-job-name instead and a job name will be automatically generated with a timestamp. Default: “” |
| --sagemaker-experiment-name | Name of experiment in SageMaker tracking. |
| --sagemaker-trial-name | Name of experiment trial in SageMaker tracking. |
| --sagemaker-volume-size | Volume size in GB. Default: 30 |
| --sagemaker-max-run | Maximum runtime in seconds. Default: 43200 |
| --sagemaker-max-wait | Maximum time to wait for spot instances in seconds. Default: 86400 |
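The relationship between the job name and the base job name can be sketched as follows: an explicit job name wins, otherwise a name is derived from the base name and a timestamp. The timestamp format and the helper `resolve_job_name` are assumptions made for illustration; the actual generated names may look different.

```python
import time


def resolve_job_name(job_name="", base_job_name="training-job", now=None):
    # An explicit job name takes precedence over the base job name.
    if job_name:
        return job_name
    # Assumed format: base name followed by a UTC timestamp.
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime(now))
    return f"{base_job_name}-{stamp}"


print(resolve_job_name(job_name="my-fixed-name"))  # prints "my-fixed-name"
print(resolve_job_name())  # base name plus the current UTC time
```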
Dependencies¶
Dependencies to upload to SageMaker
| --pytorch-igniter-demo | Directory for dependency [pytorch_igniter_demo] (default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo”). Default: “pytorch_igniter_demo” |
Checkpoints¶
Checkpointing options
| --checkpoint-dir | Local directory to store checkpoints for resuming training. Default: “output/checkpoint” |
| --sagemaker-checkpoint-s3 | Location to store checkpoints on S3, or “default”. Default: “default” |
| --sagemaker-checkpoint-container | Location to store checkpoints on container. Default: “/opt/ml/checkpoints” |
Inputs¶
Inputs (local or S3)
| --input | Input channel [input]. Set to a local path and it will be uploaded to S3 and downloaded to SageMaker. Set to an S3 path and it will be downloaded to SageMaker. Default: “output/data” |
MLflow¶
MLflow arguments
| --mlflow-enable | Enable logging to MLflow. Default: True |
| --mlflow-experiment-name | Experiment name in MLflow. Default: “default” |
| --mlflow-run-name | Run name in MLflow. Default: None |
| --mlflow-tracking-uri | URI of MLflow tracking server. Default: None |
| --mlflow-tracking-username | Username for MLflow tracking server. Default: None |
| --mlflow-tracking-password | Password for MLflow tracking server. Default: None |
| --mlflow-tracking-secret-name | Secret for accessing MLflow. Default: None |
| --mlflow-tracking-secret-profile | Profile for accessing the MLflow secret. Default: None |
| --mlflow-tracking-secret-region | Region for accessing the MLflow secret. Default: None |
train-and-eval¶
Train and evaluate a model
pytorch-igniter-demo train-and-eval [-h]
[--sagemaker-profile SAGEMAKER_PROFILE]
[--sagemaker-run [SAGEMAKER_RUN]]
[--sagemaker-wait [SAGEMAKER_WAIT]]
[--sagemaker-spot-instances [SAGEMAKER_SPOT_INSTANCES]]
[--sagemaker-script SAGEMAKER_SCRIPT]
[--sagemaker-source SAGEMAKER_SOURCE]
[--sagemaker-training-instance SAGEMAKER_TRAINING_INSTANCE]
[--sagemaker-training-image SAGEMAKER_TRAINING_IMAGE]
[--sagemaker-training-role SAGEMAKER_TRAINING_ROLE]
[--sagemaker-base-job-name SAGEMAKER_BASE_JOB_NAME]
[--sagemaker-job-name SAGEMAKER_JOB_NAME]
[--sagemaker-experiment-name SAGEMAKER_EXPERIMENT_NAME]
[--sagemaker-trial-name SAGEMAKER_TRIAL_NAME]
[--sagemaker-volume-size SAGEMAKER_VOLUME_SIZE]
[--sagemaker-max-run SAGEMAKER_MAX_RUN]
[--sagemaker-max-wait SAGEMAKER_MAX_WAIT]
[--pytorch-igniter-demo PYTORCH_IGNITER_DEMO]
[--model-dir MODEL_DIR]
[--output-dir OUTPUT_DIR]
[--checkpoint-dir CHECKPOINT_DIR]
[--sagemaker-checkpoint-s3 SAGEMAKER_CHECKPOINT_S3]
[--sagemaker-checkpoint-container SAGEMAKER_CHECKPOINT_CONTAINER]
[--input INPUT] [--device DEVICE]
[--classes CLASSES] [--max-epochs N]
[--n-saved N_SAVED]
[--learning-rate LEARNING_RATE]
[--weight-decay WEIGHT_DECAY]
[--train-batch-size TRAIN_BATCH_SIZE]
[--eval-batch-size EVAL_BATCH_SIZE]
[--mlflow-enable [MLFLOW_ENABLE]]
[--mlflow-experiment-name MLFLOW_EXPERIMENT_NAME]
[--mlflow-run-name MLFLOW_RUN_NAME]
[--mlflow-tracking-uri MLFLOW_TRACKING_URI]
[--mlflow-tracking-username MLFLOW_TRACKING_USERNAME]
[--mlflow-tracking-password MLFLOW_TRACKING_PASSWORD]
[--mlflow-tracking-secret-name MLFLOW_TRACKING_SECRET_NAME]
[--mlflow-tracking-secret-profile MLFLOW_TRACKING_SECRET_PROFILE]
[--mlflow-tracking-secret-region MLFLOW_TRACKING_SECRET_REGION]
Named Arguments¶
| --model-dir | Directory to save final model. Default: “output/model” |
| --output-dir | Directory for logs, images, or other output files. Default: “output/output” |
SageMaker¶
SageMaker options
| --sagemaker-profile | AWS profile for SageMaker session. Default: “default” |
| --sagemaker-run | Run training on SageMaker (yes/no). Default: False |
| --sagemaker-wait | Wait for SageMaker training to complete and tail log files (yes/no). Default: True |
| --sagemaker-spot-instances | Use spot instances for training (yes/no). Default: False |
| --sagemaker-script | Script to run on SageMaker. Default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo/main.py” |
| --sagemaker-source | Source to upload to SageMaker. Must contain the script. If blank, defaults to the directory containing the script. Default: “” |
| --sagemaker-training-instance | Instance type for training. Default: “ml.m5.large” |
| --sagemaker-training-image | Docker image for training. Default: “683880991063.dkr.ecr.us-east-1.amazonaws.com/columbo-sagemaker-training:latest” |
| --sagemaker-training-role | AWS role for training. Default: “aws-sagemaker-remote-training-role” |
| --sagemaker-base-job-name | Base job name for tracking and organization on S3. A job name will be generated from the base job name unless a job name is specified. Default: “training-job” |
| --sagemaker-job-name | Job name for tracking. Use --sagemaker-base-job-name instead and a job name will be automatically generated with a timestamp. Default: “” |
| --sagemaker-experiment-name | Name of experiment in SageMaker tracking. |
| --sagemaker-trial-name | Name of experiment trial in SageMaker tracking. |
| --sagemaker-volume-size | Volume size in GB. Default: 30 |
| --sagemaker-max-run | Maximum runtime in seconds. Default: 43200 |
| --sagemaker-max-wait | Maximum time to wait for spot instances in seconds. Default: 86400 |
Dependencies¶
Dependencies to upload to SageMaker
| --pytorch-igniter-demo | Directory for dependency [pytorch_igniter_demo] (default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo”). Default: “pytorch_igniter_demo” |
Checkpoints¶
Checkpointing options
| --checkpoint-dir | Local directory to store checkpoints for resuming training. Default: “output/checkpoint” |
| --sagemaker-checkpoint-s3 | Location to store checkpoints on S3, or “default”. Default: “default” |
| --sagemaker-checkpoint-container | Location to store checkpoints on container. Default: “/opt/ml/checkpoints” |
Inputs¶
Inputs (local or S3)
| --input | Input channel [input]. Set to a local path and it will be uploaded to S3 and downloaded to SageMaker. Set to an S3 path and it will be downloaded to SageMaker. Default: “output/data” |
Training¶
Training arguments
| --max-epochs | Number of epochs to train. Default: 10 |
| --n-saved | Number of checkpoints to keep. Default: 10 |
| --learning-rate | Default: 0.001 |
| --weight-decay | Default: 1e-05 |
| --train-batch-size | Default: 32 |
MLflow¶
MLflow arguments
| --mlflow-enable | Enable logging to MLflow. Default: True |
| --mlflow-experiment-name | Experiment name in MLflow. Default: “default” |
| --mlflow-run-name | Run name in MLflow. Default: None |
| --mlflow-tracking-uri | URI of MLflow tracking server. Default: None |
| --mlflow-tracking-username | Username for MLflow tracking server. Default: None |
| --mlflow-tracking-password | Password for MLflow tracking server. Default: None |
| --mlflow-tracking-secret-name | Secret for accessing MLflow. Default: None |
| --mlflow-tracking-secret-profile | Profile for accessing the MLflow secret. Default: None |
| --mlflow-tracking-secret-region | Region for accessing the MLflow secret. Default: None |
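When driving the CLI from another Python program, the train-and-eval options documented above can be assembled into an argument list. This is a sketch only: `build_command` is a hypothetical helper, and the option values shown are arbitrary examples.

```python
import shlex


def build_command(subcommand, **options):
    # Hypothetical helper: turn keyword arguments into documented flags,
    # e.g. max_epochs=5 becomes "--max-epochs 5".
    argv = ["pytorch-igniter-demo", subcommand]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


argv = build_command(
    "train-and-eval",
    max_epochs=5,
    learning_rate=0.001,
    sagemaker_run="yes",
)
print(shlex.join(argv))
# prints: pytorch-igniter-demo train-and-eval --max-epochs 5 --learning-rate 0.001 --sagemaker-run yes
```

With the package installed, the list could be passed to `subprocess.run(argv, check=True)` to launch the run.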
dataprep¶
Prepare dataset
pytorch-igniter-demo dataprep [-h] [--sagemaker-profile SAGEMAKER_PROFILE]
[--sagemaker-run [SAGEMAKER_RUN]]
[--sagemaker-wait [SAGEMAKER_WAIT]]
[--sagemaker-script SAGEMAKER_SCRIPT]
[--sagemaker-python SAGEMAKER_PYTHON]
[--sagemaker-job-name SAGEMAKER_JOB_NAME]
[--sagemaker-base-job-name SAGEMAKER_BASE_JOB_NAME]
[--sagemaker-runtime-seconds SAGEMAKER_RUNTIME_SECONDS]
[--sagemaker-role SAGEMAKER_ROLE]
[--sagemaker-requirements SAGEMAKER_REQUIREMENTS]
[--sagemaker-configuration-script SAGEMAKER_CONFIGURATION_SCRIPT]
[--sagemaker-configuration-command SAGEMAKER_CONFIGURATION_COMMAND]
[--sagemaker-image SAGEMAKER_IMAGE]
[--sagemaker-instance SAGEMAKER_INSTANCE]
[--sagemaker-volume-size SAGEMAKER_VOLUME_SIZE]
[--sagemaker-output-json SAGEMAKER_OUTPUT_JSON]
[--sagemaker-output-mount SAGEMAKER_OUTPUT_MOUNT]
[--output OUTPUT] [--output-s3 OUTPUT_S3]
[--output-mode OUTPUT_MODE]
[--sagemaker-module-mount SAGEMAKER_MODULE_MOUNT]
[--aws-sagemaker-remote AWS_SAGEMAKER_REMOTE]
SageMaker¶
SageMaker options
| --sagemaker-profile | AWS profile for SageMaker session. Default: “default” |
| --sagemaker-run | Run processing on SageMaker (yes/no). Default: False |
| --sagemaker-wait | Wait for SageMaker processing to complete and tail logs (yes/no). Default: True |
| --sagemaker-script | Python script to execute. Default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/checkouts/latest/pytorch_igniter_demo/dataprep.py” |
| --sagemaker-python | Python executable to use in container. Default: “python3” |
| --sagemaker-job-name | Job name for SageMaker processing. If not provided, it will be generated from the base job name. Leave blank for most use cases. Default: “” |
| --sagemaker-base-job-name | Base job name for SageMaker processing. The job name will be generated from the base name and a timestamp. Default: “pytorch-igniter-demo-dataprep” |
| --sagemaker-runtime-seconds | SageMaker maximum runtime in seconds. Default: 3600 |
| --sagemaker-role | AWS role for SageMaker execution. Default: “aws-sagemaker-remote-processing-role” |
| --sagemaker-requirements | Requirements file to install on SageMaker. Default: None |
| --sagemaker-configuration-script | Bash configuration script to source on SageMaker. Default: None |
| --sagemaker-configuration-command | Bash command to run on SageMaker for configuration. Default: “pip3 install --upgrade sagemaker sagemaker-experiments” |
| --sagemaker-image | AWS ECR image URI of Docker image to run SageMaker processing. Default: “683880991063.dkr.ecr.us-east-1.amazonaws.com/columbo-compute:latest” |
| --sagemaker-instance | AWS SageMaker instance to run processing. Default: “ml.t3.medium” |
| --sagemaker-volume-size | AWS SageMaker volume size in GB. Default: 30 |
| --sagemaker-output-json | Write SageMaker training details to JSON file. Default: None |
Output¶
Output options
| --sagemaker-output-mount | Mount point for outputs. If running on SageMaker, outputs written here are uploaded to S3. If running locally, S3 outputs written here are uploaded to S3. No effect on local outputs when running locally. Default: “/opt/ml/processing/output” |
| --output | Output [output] local path. If running locally, set to a local path. Default: “output/data” |
| --output-s3 | Output [output] S3 URI. Upload results to this URI. Empty string automatically generates a URI. Default: “default” |
| --output-mode | Output [output] mode. Set to Continuous or EndOfJob. Default: “EndOfJob” |
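The two documented values for `--output-mode` could be enforced with argparse `choices`, as in the sketch below. Whether the real CLI validates the value this way is an assumption; the parser here is a stand-alone illustration.

```python
import argparse

# Stand-alone illustration, not the package's own parser.
output_parser = argparse.ArgumentParser()
output_parser.add_argument(
    "--output-mode",
    choices=["Continuous", "EndOfJob"],  # the two documented modes
    default="EndOfJob",
)

print(output_parser.parse_args([]).output_mode)
print(output_parser.parse_args(["--output-mode", "Continuous"]).output_mode)
```

Any other value makes argparse exit with a usage error.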
Modules¶
Module options
| --sagemaker-module-mount | Mount point for modules. If running on SageMaker, modules are mounted here and this directory is added to PYTHONPATH. Default: “/opt/ml/processing/modules” |
| --aws-sagemaker-remote | Directory of [aws_sagemaker_remote] module. If running on SageMaker, modules are uploaded and placed on PYTHONPATH. Default: “/home/docs/checkouts/readthedocs.org/user_builds/pytorch-igniter-demo/envs/latest/lib/python3.7/site-packages/aws_sagemaker_remote” |
See the pytorch-igniter documentation for detailed option documentation.
See the aws-sagemaker-remote documentation for detailed SageMaker option documentation.