#
Quickstart
#
Getting Access to the MoAI Platform
Please obtain a container or virtual machine on the MoAI Platform from your infrastructure provider and follow the instructions to connect via SSH. For example, you can apply for a trial container on the MoAI Platform or use public cloud services based on the MoAI Platform.
- MoAI Platform Trial Container (Inquiries: support@moreh.io)
- KT Cloud's Hyperscale AI Computing (https://cloud.kt.com/solution/hyperscaleAiComputing/)
#
Verifying the MoAI Accelerator
To train models such as the sLLM introduced in this tutorial, it is important to select a MoAI Accelerator of an appropriate size. First, use the moreh-smi command to check which MoAI Accelerator is currently in use.
$ moreh-smi
+-------------------------------------------------------------------------------------------------+
|  Current Version: 24.11.0  Latest Version: 24.11.0                                              |
+-------------------------------------------------------------------------------------------------+
|  Device  |  Name              |  Model        |  Memory Usage  |  Total Memory  |  Utilization  |
+=================================================================================================+
|  * 0     |  MoAI Accelerator  |  Large.256GB  |  -             |  -             |  -            |
+-------------------------------------------------------------------------------------------------+
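If a script needs to confirm the accelerator size before launching a job, the same table can be read programmatically. The following is a minimal Python sketch, assuming the output keeps the pipe-delimited layout shown above. The names parse_moreh_smi and SAMPLE_OUTPUT are illustrative and not part of the MoAI tooling, and treating the `*` marker as "currently selected device" is an assumption.

```python
from typing import Dict, List

# Sample text copied from the moreh-smi output shown in this section.
SAMPLE_OUTPUT = """\
+--------------------------------------------------------------------------+
| Current Version: 24.11.0 Latest Version: 24.11.0 |
+--------------------------------------------------------------------------+
| Device | Name | Model | Memory Usage | Total Memory | Utilization |
+==========================================================================+
| * 0 | MoAI Accelerator | Large.256GB | - | - | - |
+--------------------------------------------------------------------------+
"""


def parse_moreh_smi(output: str) -> List[Dict[str, str]]:
    """Return one dict per device row, keyed by the table's column headers."""
    devices: List[Dict[str, str]] = []
    header: List[str] = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and the shell prompt
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "Device":
            header = cells  # column names row
        elif header and len(cells) == len(header):
            devices.append(dict(zip(header, cells)))
    return devices


if __name__ == "__main__":
    for device in parse_moreh_smi(SAMPLE_OUTPUT):
        # Assumption: a leading "*" marks the device currently in use.
        marker = "(selected)" if device["Device"].startswith("*") else ""
        print(device["Name"], device["Model"], marker)
```

To use it against a live system, the sample text could be replaced with the captured standard output of the moreh-smi command, for example through subprocess.run.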
#
Verifying PyTorch Installation
After connecting to the container via SSH, run the following command to check if PyTorch is installed in the current conda environment:
$ conda list torch
...
# Name Version Build Channel
torch 2.1.0+cu118.moreh24.11.0 pypi_0 pypi
...
If you see the message conda: command not found, if the torch package is not listed, or if the torch package exists but does not include "moreh" in the version name, please follow the instructions in the Installation part below.
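The version check described above can also be done from Python. This is a minimal sketch, assuming the package is installed under the distribution name torch as in the listing above. The helper names installed_torch_version and is_moreh_build are hypothetical, and the check only covers the "not listed" and "no moreh in the version" cases, not a missing conda command.

```python
from importlib import metadata
from typing import Optional


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def is_moreh_build(version: Optional[str]) -> bool:
    """True when the version string contains "moreh", as this section requires."""
    return version is not None and "moreh" in version


if __name__ == "__main__":
    version = installed_torch_version()
    if is_moreh_build(version):
        print(f"Found MoAI-enabled PyTorch: {version}")
    else:
        print(f"torch version is {version!r}; follow the Installation part.")
```

Run it with the Python interpreter of the conda environment you intend to train in, since the result depends on which environment is active.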
#
Installation (if required)
#
Setting up an Environment with Anaconda
To begin training, first create a conda environment:
conda create --name <my-env> python=3.8
Replace <my-env> with your desired environment name.
Activate the conda environment:
conda activate <my-env>
The MoAI Platform supports several PyTorch versions, so you can choose the one that fits your needs. For example, to install the Moreh build of PyTorch 1.13.1:
pip install torch==1.13.1+moreh24.11.209
#
Running a Sample Code
- Clone the repository that contains the example code
git clone https://github.com/moreh-dev/quickstart
cd quickstart
- Install the dependency packages
pip install -r requirements/requirements_llama3.txt
- Run a training script
python tutorial/train_llama3.py \
  --epochs 1 \
  --batch-size 256 \
  --block-size 1024 \
  --lr 0.00001
- You should see logs similar to the following.
| INFO | __main__:main:242 - Model load and warmup done. Duration: 316.09
| INFO | __main__:main:252 - [Step 5/1121] | Loss: 1.9141 | Duration: 32.67 sec | Throughput: 32100.30 tokens/sec
| INFO | __main__:main:252 - [Step 10/1121] | Loss: 1.9297 | Duration: 40.53 sec | Throughput: 32337.05 tokens/sec
| INFO | __main__:main:252 - [Step 15/1121] | Loss: 1.9297 | Duration: 40.80 sec | Throughput: 32129.29 tokens/sec
| INFO | __main__:main:252 - [Step 20/1121] | Loss: 1.9062 | Duration: 40.06 sec | Throughput: 32715.97 tokens/sec
| INFO | __main__:main:252 - [Step 25/1121] | Loss: 1.9062 | Duration: 40.14 sec | Throughput: 32656.86 tokens/sec
| INFO | __main__:main:252 - [Step 30/1121] | Loss: 1.8672 | Duration: 40.36 sec | Throughput: 32477.78 tokens/sec
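To track training progress from these logs, the step lines can be parsed and summarized. The sketch below is one possible approach in Python, assuming the log format stays exactly as in the sample above. STEP_RE, parse_steps, and SAMPLE_LOG are illustrative names and are not part of the quickstart repository.

```python
import re
from statistics import mean
from typing import Dict, List

# Matches step lines such as:
# [Step 5/1121] | Loss: 1.9141 | Duration: 32.67 sec | Throughput: 32100.30 tokens/sec
STEP_RE = re.compile(
    r"\[Step (?P<step>\d+)/(?P<total>\d+)\] \| "
    r"Loss: (?P<loss>[\d.]+) \| "
    r"Duration: (?P<duration>[\d.]+) sec \| "
    r"Throughput: (?P<throughput>[\d.]+) tokens/sec"
)

# Two lines copied from the sample log in this section.
SAMPLE_LOG = (
    "| INFO | __main__:main:252 - [Step 5/1121] | Loss: 1.9141 | "
    "Duration: 32.67 sec | Throughput: 32100.30 tokens/sec\n"
    "| INFO | __main__:main:252 - [Step 10/1121] | Loss: 1.9297 | "
    "Duration: 40.53 sec | Throughput: 32337.05 tokens/sec\n"
)


def parse_steps(log_text: str) -> List[Dict[str, float]]:
    """Extract step number, loss, duration, and throughput from each step line."""
    return [
        {
            "step": int(m["step"]),
            "total": int(m["total"]),
            "loss": float(m["loss"]),
            "duration": float(m["duration"]),
            "throughput": float(m["throughput"]),
        }
        for m in STEP_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    steps = parse_steps(SAMPLE_LOG)
    print(f"Parsed {len(steps)} step entries")
    print(f"Mean throughput: {mean(s['throughput'] for s in steps):.2f} tokens/sec")
```

Lines that do not match the step pattern, such as the warmup message, are ignored, so the whole log file can be passed in unchanged.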