launching from the web ui
the web ui handles environment bundling, dataset upload, and job submission. the final step is the launch screen.
before you launch
review the summary on the launch step. it shows:
- task: your system prompt and completion tags
- dataset: number of training and eval examples
- rewards: configured reward components and their weights
- tools (RAG only): search tool configuration and modes
if anything looks wrong, go back to the relevant step and fix it. you can navigate freely between steps without losing state.
launching
click launch. the web ui will:
- upload your dataset splits to storage
- build your environment (generates Python code, runs it remotely, pickles and uploads the result)
- validate the environment (optional smoke test)
- create the training run record and submit the job
you land on your training run page once the job is submitted. GPUs take a few minutes to warm up.
downloading as code
to inspect or modify what was generated, click export configuration on the launch step. you can download your config as a .py, .ipynb, or .json file, with an option to include your dataset. run the exported code locally to iterate or customize beyond what the web ui offers.
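for example, a .json export can be inspected programmatically. the filename and keys below are hypothetical, so check what your export actually contains:

import json

# hypothetical filename; use whatever path you saved the export to
with open("my-search-config.json") as f:
    config = json.load(f)

# the launch summary covers task, dataset, rewards, and tools, so expect keys along those lines
print(sorted(config.keys()))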
launching from the python sdk
once you have a dataset and an environment, one call launches the full training pipeline.
quickstart
import trainer
from trainer.corpus.corpora.search import CorporaSearch
from trainer.envs.search_env import SearchEnv
from trainer.trainer.pipeline import train

search = CorporaSearch(
    api_key=API_KEY,
    corpus_name="my-docs",
    base_url=BASE_URL,
)

experiment_id = train(
    env_class=SearchEnv,
    env_args={"search": search},
    train_dataset=train_data,
    eval_dataset=eval_data,
    prefix="my-search",
    api_key=API_KEY,
    local_modules=[trainer],
)

train() returns an experiment ID. view it at https://app.castform.com/experiments/{experiment_id}.
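train_data and eval_data are lists of dicts (see parameters below). the exact fields depend on what your environment expects, so the schema here is purely illustrative, with placeholder keys:

# illustrative only: substitute whatever fields your environment reads from each example
train_data = [
    {"question": "example question 1", "answer": "example answer 1"},
    {"question": "example question 2", "answer": "example answer 2"},
]
eval_data = [
    {"question": "held-out example question", "answer": "held-out example answer"},
]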
parameters
experiment_id = train(
    env_class=SearchEnv,            # your environment class
    env_args={"search": search},    # constructor args (SearchClient, judge config, etc.)
    train_dataset=list_of_dicts,    # training examples
    eval_dataset=list_of_dicts,     # evaluation examples
    prefix="my-search",             # namespace prefix for upload paths
    api_key=API_KEY,
    local_modules=[trainer],        # local modules to bundle with the job
    dry_run=False,                  # set True to validate without launching
)

dry run
use dry_run=True to validate your environment bundle and dataset upload and run the remote smoke test, all without launching a training job. useful for catching issues before committing to a run.
result = train(
    env_class=SearchEnv,
    env_args={"search": search},
    train_dataset=train_data,
    eval_dataset=eval_data,
    prefix="my-search",
    api_key=API_KEY,
    dry_run=True,
)

# result is a dict with validation info instead of an experiment ID
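one pattern is to validate first and launch only once the dry run looks clean. the keys of the validation dict aren't specified here, so inspect it manually; a sketch using the same arguments as above:

common_args = dict(
    env_class=SearchEnv,
    env_args={"search": search},
    train_dataset=train_data,
    eval_dataset=eval_data,
    prefix="my-search",
    api_key=API_KEY,
)

# validate the bundle, upload, and smoke test without starting a job
validation = train(dry_run=True, **common_args)
print(validation)

# if the validation info looks clean, launch for real
experiment_id = train(dry_run=False, **common_args)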
after launching

rewards will fluctuate in the first few dozen steps. give it time before drawing conclusions. see managing training runs for what to watch for, including pass@k vs average reward trajectories and common warning signs.
next steps
- managing training runs: metrics, rollout inspection, health indicators
- evaluating: compare your model against baselines
- search environment: how the search environment and reward function work
- testing: validate your environment before training