Documentation

Everything you need to fine-tune, test, and ship a model on Soramai.

Soramai is a managed control plane. The docs cover dataset preparation, fine-tuning, deployment, billing, and security.

Quickstart

From sign-up to a deployed adapter in five steps.

  1. 01

    Create an account

    Sign up with email or Google. New accounts get a one-time trial credit.

  2. 02

    Top up credits

    Add a credit pack from the dashboard. $5 unlocks a small end-to-end fine-tuning run.

  3. 03

    Upload a dataset

    Drop a JSONL file (text) or a ZIP of image and caption pairs (image). Soramai validates before queueing.

  4. 04

    Start a fine-tuning run

    Pick a base model, take the defaults, and start. The dashboard shows the cost estimate before you confirm.

  5. 05

    Deploy the adapter

    When the run finishes, promote the adapter to an autoscaling endpoint with a single click.

Concepts

The vocabulary Soramai uses across the dashboard, CLI, and API.

Fine-tuning job

A single fine-tuning run that takes a dataset plus base model and produces a LoRA adapter.

LoRA adapter

A small set of weight deltas fine-tuned on top of a frozen base model. Soramai stores adapters, not full checkpoints.

Inference endpoint

An autoscaling HTTPS endpoint that loads a base model plus one or more adapters and serves completions or images.

Credit

Soramai's usage-based billing unit. Credits are consumed by fine-tuning time and inference time. 100 credits = $1.

Dataset

An uploaded archive validated by Soramai. Text datasets are JSONL. Image datasets are ZIP archives of image and caption pairs.

Account setup

  • Sign in. Email + password or Google. Sign up or sign in.
  • Credits. Top up from the dashboard. Credits never expire while the account is active.
  • API keys. Create and rotate from the account page. Keys are shown once at creation.
  • Billing. Receipts and invoices are emailed and downloadable from the dashboard.

Frequently asked

The most common questions about fine-tuning, models, and billing.

Which base models does Soramai support?

For text: Qwen 2.5 (7B/14B/32B), Llama 3.1 (8B/70B), Mistral 7B, Gemma 2 (9B/27B), Phi 3. Any Hugging Face causal LM compatible with PEFT LoRA also works. For images: FLUX.1 dev and schnell, Stable Diffusion XL, SDXL Turbo.

How long does a typical fine-tuning run take?

A 500-step text LoRA on a 7B model typically finishes in 4 to 7 minutes on an A100. A 1,500-step image LoRA on SDXL typically finishes in 10 to 20 minutes. The dashboard shows a live estimate.

Can I export the fine-tuned adapter?

Yes. Adapter files are stored in your account and can be downloaded as a single archive. Soramai never locks you into the hosted endpoint.

Is Soramai compatible with OpenAI client libraries?

Deployed text endpoints expose a chat-completions-style API that works with most OpenAI-compatible clients. See the Deploy API page for the exact schema.

Where is data stored?

Datasets and artifacts are stored encrypted at rest in US regions by default. Multi-region storage is available on enterprise plans.