Documentation

Everything you need to fine-tune, test, and ship a model on Soramai.

Soramai is a managed control plane. The docs cover dataset preparation, fine-tuning, deployment, billing, and security.

Open dashboard View pricing

Get started

Getting started guide

Concepts

Fine-tuning jobs, LoRA adapters, inference endpoints, credits, and how Soramai pieces them together.

Account setup

Datasets

Datasets reference

JSONL and image dataset formats, validation rules, AI-assisted generation, and multi-dataset merging.

Dataset formats (summary)

One-page overview of accepted file shapes and limits.

Fine-tuning

Fine-tuning reference

Base model catalogue, LoRA hyperparameters, live monitoring, refund policy, and worked cost examples.

Text fine-tuning (marketing)

Feature overview for fine-tuning text base models.

Image fine-tuning (marketing)

Feature overview for fine-tuning LoRAs on FLUX and SDXL.

Inference & Deploy

Inference & Deploy reference

Playground vs Deploy API, request/response schemas, streaming via SSE, rate limits, API key management, and billing.

Deploy API (marketing)

Feature overview for autoscaling inference endpoints.

Playground

Open the in-dashboard playground to test fine-tuned adapters.

Operations

Pricing and credits

Usage-based pricing, credit packs, and how fine-tuning and inference are billed.

Security and trust

Data handling, isolation, encryption, and responsible disclosure.

Service status

Live status of every sub-system, SLO targets, and incident history.

FAQ

Common questions about fine-tuning quality, billing, and supported models.

Quickstart

From sign-up to a deployed adapter in five steps.

01
Create an account
Sign up with email or Google. New accounts get a one-time trial credit.
02
Top up credits
Add a credit pack from the dashboard. $5 unlocks a small end-to-end fine-tuning run.
03
Upload a dataset
Drop a JSONL file (text) or a ZIP of image and caption pairs (image). Soramai validates before queueing.
04
Start a fine-tuning run
Pick a base model, take the defaults, and start. The dashboard shows the cost estimate before you confirm.
05
Deploy the adapter
When the run finishes, promote the adapter to an autoscaling endpoint with a single click.

Concepts

The vocabulary Soramai uses across the dashboard, CLI, and API.

Fine-tuning job

A single fine-tuning run that takes a dataset plus base model and produces a LoRA adapter.

LoRA adapter

A small set of weight deltas fine-tuned on top of a frozen base model. Soramai stores adapters, not full checkpoints.

Inference endpoint

An autoscaling HTTPS endpoint that loads a base model plus one or more adapters and serves completions or images.

Credit

Soramai's usage-based billing unit. Credits are consumed by fine-tuning time and inference time. 100 credits = $1.

Dataset

An uploaded archive validated by Soramai. Text datasets are JSONL. Image datasets are ZIP archives of image and caption pairs.

Account setup

Sign in. Email + password or Google. Sign up or sign in.
Credits. Top up from the dashboard. Credits never expire while the account is active.
API keys. Create and rotate from the account page. Keys are shown once at creation.
Billing. Receipts and invoices are emailed and downloadable from the dashboard.

Frequently asked

The most common questions about fine-tuning, models, and billing.

Which base models does Soramai support?

For text: Qwen 2.5 (7B/14B/32B), Llama 3.1 (8B/70B), Mistral 7B, Gemma 2 (9B/27B), Phi 3. Any Hugging Face causal LM compatible with PEFT LoRA also works. For images: FLUX.1 dev and schnell, Stable Diffusion XL, SDXL Turbo.

How long does a typical fine-tuning run take?

A 500-step text LoRA on a 7B model typically finishes in 4 to 7 minutes on an A100. A 1,500-step image LoRA on SDXL typically finishes in 10 to 20 minutes. The dashboard shows a live estimate.

Can I export the fine-tuned adapter?

Yes. Adapter files are stored in your account and can be downloaded as a single archive. Soramai never locks you into the hosted endpoint.

Is Soramai compatible with OpenAI client libraries?

Deployed text endpoints expose a chat-completions-style API that works with most OpenAI-compatible clients. See the Deploy API page for the exact schema.

Where is data stored?

Datasets and artifacts are stored encrypted at rest in US regions by default. Multi-region storage is available on enterprise plans.