MaaS
Introduction
StarLandAI offers advanced API services for text, voice, image, video, and music, utilizing the latest multimodal models. The multimodal large model APIs enable a variety of product forms, including digital characters for animation and film, text-based games, language translation, voice cloning, face swapping in images, and music generation.
All models are fine-tuned and deployed on cloud-native, DePIN-enhanced devices, which reduces overall costs by 50% and supports a diverse range of computing resources, including high-end H100 GPUs, mid-range 3090 and 3080 GPUs, and CPUs. Each model can be swiftly integrated with various blockchains, including Solana, Ethereum, BNB Chain, and Layer 2 blockchains.
Available Multimodal Models
StarLandAI supports a wide range of multimodal models, including but not limited to:
Conversational Models
- Dedicated Digital Humans: Create conversational digital humans with customizable character design, background story, personality traits, and conversational learning. Users can converse with these digital humans, which have long-term memory and can continue topics based on the chat history.
- Knowledge Base Q&A: Users can upload documents and hold AI-based conversations about their content. The AI can restructure the content as the conversation requires and cite the relevant sources.
- Text-based Games: Play text-based games with AI in open-ended interaction scenarios such as “Sea Turtle Soup” and “Life Simulator.” In “Life Simulator,” the AI randomly generates a person's life, and the player, acting as that person, encounters randomly generated events at different stages of life. The player makes choices during these events, and the model generates the outcomes. The game ends when the character dies, with a summary of the character's entire life.
Voice Models
- TTS (Text-to-Speech): Converts text into speech with customizable voice characteristics. Users enter text, and the system reads it aloud in the specified voice tone.
- ASR (Automatic Speech Recognition): Converts speech into text.
- Voice Cloning: Clones a voice from about one minute of sample audio.
- Speech Translation: Translates speech from one language to another while retaining the same voice characteristics.
- AI Singing: Choose a song and a voice style; the AI separates the vocals from the song and re-synthesizes them with the selected voice characteristics.
Image Models
- Basic Text-to-Image: Generates images from textual prompts with options for detail level and image dimensions. Supports partial redrawing and size extension.
- 3D Character Generation: Converts a real person's photo into a 3D-style cartoon image or generates a 3D model based on input.
- Face Swapping: Takes a photo of Person A and another of Person B, then swaps Person B's face with Person A's.
- Style Imitation: Allows users to provide a reference image to control the style of the generated image.
- Photorealistic Image Generation: Produces high-quality photorealistic images with rich details, minimizing the artificial AI look.
- Fine Detail Editing: Edits existing images based on textual prompts, allowing modifications to facial expressions, poses, backgrounds, and other details.
Video Models
- Video Re-Rendering: Users can submit a video with a character, and the system will change the character's face.
- Illustrated Video: Given a text material and a selected video generation style, the system automatically generates an illustrated video corresponding to the text.
- Lip-Synced Video: Input a text and a pre-recorded talking-head video, and the AI synchronizes the lip movements in the video with the input text to create a coherent lip-sync.
For more MaaS APIs, please visit: MaaS
API
1. Chat - Avatar Chat
- Description Create digital avatars with configurable appearance, background, personality traits, and example dialogues. Users can converse with preset digital avatars, which have long-term memory and can continue topics based on the chat history.
- Models
- LLM
- AI Agent Development Framework
- Reasoning and Acting
- High-performance distributed inference service
- Vector Database
- Output Format Control
- API Example
Request:
POST /api/avatar/chat/completes HTTP/1.1
Content-Type:application/json
{
"avatarID": "xxxx", // avatar id
"conversationID": "xxx", // conversation id
"model": "xxx", // model name
"messages": [
{
"role": "human", // message role, support human and ai
"message": "xxx" // message
}
] // messages
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"message": "xxx" // chat response message
}
}
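The request and response shapes above can be sketched in Python as follows. This is a minimal illustration, not an official client: the base URL is a placeholder (the source gives only the path), no authentication is shown because none is documented, and the helper names are hypothetical.

```python
import json

# Hypothetical host; the documentation above only specifies the path.
BASE_URL = "https://api.example.com"


def build_avatar_chat_request(avatar_id, conversation_id, model, text):
    """Return (url, headers, body_bytes) for POST /api/avatar/chat/completes."""
    payload = {
        "avatarID": avatar_id,
        "conversationID": conversation_id,
        "model": model,
        # The documented roles are "human" and "ai".
        "messages": [{"role": "human", "message": text}],
    }
    url = BASE_URL + "/api/avatar/chat/completes"
    headers = {"Content-Type": "application/json"}
    return url, headers, json.dumps(payload).encode("utf-8")


def parse_chat_response(raw):
    """Return the reply text, or raise if the error code is non-zero."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"API error {body.get('code')}: {body.get('err_msg')}")
    return body["result"]["message"]
```

The returned tuple can be handed to any HTTP client; keeping payload construction separate from transport makes the shape easy to check without a network call.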
2. Image Generation - AI Face Swap
- Description Provide a photo of Person A and a photo of Person B, and replace the face in Person B's photo with the face of Person A.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/faceswap HTTP/1.1
Content-Type: application/json
{
"img": "xxxxx",
"face_img": "xxx"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"image": "xxxx"
}
}
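A hedged sketch of preparing the face-swap payload in Python. Two things here are assumptions rather than documented facts: that image fields carry base64-encoded file content (the speech endpoints state this, the image endpoints do not), and that "img" is the photo to be modified while "face_img" supplies the replacement face. The function names are hypothetical.

```python
import base64
import json


def encode_image(image_bytes):
    """Base64-encode raw image bytes as ASCII text (assumed wire format)."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_faceswap_payload(target_bytes, face_bytes):
    """Assumed mapping: 'img' is the photo to edit, 'face_img' the donor face."""
    return {
        "img": encode_image(target_bytes),
        "face_img": encode_image(face_bytes),
    }


def extract_swapped_image(raw):
    """Decode the result image, raising on a non-zero error code."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"face swap failed with code {body.get('code')}")
    return base64.b64decode(body["data"]["image"])
```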
3. Image Generation - AI Style Transform
- Description Transform images into a specified style.
- Models
- StableDiffusion
- InstantID
- Prompt Engineering
- API Example
Request:
POST /api/img2img/styletransfer HTTP/1.1
Content-Type: application/json
{
"styleName": "cyberpunk",
"img": "xxxx"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
4. Speech Synthesis - TTS
- Description Text to Speech
- Set Voice Tone.
- After entering the text, the speech is generated according to the voice tone.
- Models
- Speech Synthesis
- API Example
Request:
POST /api/tts HTTP/1.1
Content-Type: application/json
{
"text": "hello, my name is Jack.",
"role_id": "faa54ce7-01a2-4604-9fac"
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx",
"result": "xxx" // base64(FILE_CONTENT)
}
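Because the TTS response carries the audio as base64(FILE_CONTENT), a client has to decode it before use. The following Python sketch shows one way to do that; the helper names are hypothetical, and the audio container format is not stated in the source, so the caller decides how to store the bytes.

```python
import base64
import json


def build_tts_payload(text, role_id):
    """Request body for POST /api/tts; role_id selects the voice tone."""
    return {"text": text, "role_id": role_id}


def decode_tts_audio(raw):
    """Return the synthesized audio as bytes, raising on a non-zero code."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"TTS error {body.get('code')}: {body.get('err_msg')}")
    return base64.b64decode(body["result"])
```

The decoded bytes can then be written to disk with `open(path, "wb")`.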
5. Speech Synthesis - ASR
- Description Automatic Speech Recognition.
- Models Automatic Speech Recognition
- API Example
Request:
POST /api/asr HTTP/1.1
Content-Type:application/json
{
"speech": "xxxx" // base64(FILE_CONTENT)
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx",
"result": "xxx"
}
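The ASR request mirrors TTS in reverse: the audio goes in as base64(FILE_CONTENT) and text comes back in "result". A minimal Python sketch, with hypothetical helper names:

```python
import base64
import json


def build_asr_payload(audio_bytes):
    """Request body for POST /api/asr with the audio base64-encoded."""
    return {"speech": base64.b64encode(audio_bytes).decode("ascii")}


def parse_transcript(raw):
    """Return the recognized text, raising on a non-zero error code."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"ASR error {body.get('code')}: {body.get('err_msg')}")
    return body["result"]
```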
6. Speech Synthesis - Voice Clone
- Description Using about 1 minute of speech material for voice cloning.
- Models VITS
- API Example
Request:
POST /api/clone HTTP/1.1
Content-Type:application/json
{
"audio_base64": "xxx",
"audio_note": "jack",
"audio_type": "mp3"
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx",
"result": "0584f013-613079d47"
}
7. Speech Synthesis - Oral Translation
- Description Using the same voice tone, convert Chinese speech into English speech.
- Models
- Voice Clone
- Automatic Speech Recognition
- API Example
Request:
POST /api/tts_clone HTTP/1.1
Content-Type: application/json
{
"text": "hello, my name is Jack.",
"audio_base64": "xxx",
"audio_note": "jack",
"audio_type": "mp3"
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx",
"result": "xxx" // base64(FILE_CONTENT)
}
8. Speech Synthesis - AI Singing
- Description Select a song and choose a voice tone. The AI will separate the vocals from the song and automatically synthesize them with the selected voice tone.
- Models
- Vocal Accompaniment Separation
- Harmonic Removal
- Music Synthesis
- API Example
Request:
POST /api/svcmusic HTTP/1.1
Content-Type: application/json
{
"keyword": "song keyword",
"role_id": "Jack",
"infer_config": {
"vc_transform": 0,
"auto_f0": false,
"cluster_ratio": 0.5,
"slice_db": -40,
"noise_scale": 0.4,
"pad_seconds": 0.5,
"cl_num": 0,
"lg_num": 0,
"lgr_num": 0.75,
"f0_predictor": "pm",
"enhancer_adaptive_key": 0,
"cr_threshold": 0.05,
"k_step": 100,
"use_spk_mix": false,
"second_encoding": false,
"loudness_envelope_adjustment": 0
}
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"task_id": "xx"
}
}
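The "infer_config" object has many fields, so a client may prefer to start from the values shown in the example and override only what it needs. The Python sketch below does that. Treating the example values as defaults is an assumption, and the helper names are hypothetical. The source does not document how to retrieve the finished result for a "task_id", so that step is left out.

```python
import json

# Values copied from the request example above; treating them as defaults
# is an assumption, not something the documentation states.
DEFAULT_INFER_CONFIG = {
    "vc_transform": 0,
    "auto_f0": False,
    "cluster_ratio": 0.5,
    "slice_db": -40,
    "noise_scale": 0.4,
    "pad_seconds": 0.5,
    "cl_num": 0,
    "lg_num": 0,
    "lgr_num": 0.75,
    "f0_predictor": "pm",
    "enhancer_adaptive_key": 0,
    "cr_threshold": 0.05,
    "k_step": 100,
    "use_spk_mix": False,
    "second_encoding": False,
    "loudness_envelope_adjustment": 0,
}


def build_singing_payload(keyword, role_id, **overrides):
    """Merge overrides into the example config, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_INFER_CONFIG)
    if unknown:
        raise ValueError(f"unknown infer_config keys: {sorted(unknown)}")
    return {
        "keyword": keyword,
        "role_id": role_id,
        "infer_config": {**DEFAULT_INFER_CONFIG, **overrides},
    }


def extract_task_id(raw):
    """Return the task id from the response, raising on a non-zero code."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"singing request failed with code {body.get('code')}")
    return body["data"]["task_id"]
```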
9. Image Generation - Basic Txt2Img
- Description
- Basic text-to-image generation feature, with options to select detail level and size.
- Hundreds of models available for selection.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/txt2img HTTP/1.1
Content-Type: application/json
{
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
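A small Python sketch of building the text-to-image request and unpacking the response. It assumes that each entry in "images" is a base64-encoded image file (the source shows only placeholders) and adds basic client-side checks that are not part of the documented API. Function names are hypothetical.

```python
import base64
import json


def build_txt2img_payload(prompt, negative_prompt="", seed=0,
                          batch_size=1, width=512, height=768):
    """Request body for POST /api/txt2img with simple sanity checks."""
    if not prompt:
        raise ValueError("prompt must not be empty")
    for name, value in (("batch_size", batch_size),
                        ("width", width), ("height", height)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "seed": seed,
        "batch_size": batch_size,
        "width": width,
        "height": height,
    }


def decode_images(raw):
    """Return (list_of_image_bytes, seed) from a successful response."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"txt2img failed with code {body.get('code')}")
    data = body["data"]
    # Assumption: each image is base64-encoded file content.
    return [base64.b64decode(item) for item in data["images"]], data["seed"]
```

Since the response returns the seed that was used, storing it alongside the images makes a generation reproducible later.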
10. Image Generation - Inpaint
- Description Redraw the obscured areas of an image by painting them black and then providing prompt words for the AI to regenerate the obscured parts.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/inpaint HTTP/1.1
Content-Type: application/json
{
"mask_img": "xxxxx",
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
11. Image Generation - Outpaint
- Description Expand a small-sized image to a larger one, and fill the expanded area with content based on the prompt words to ensure consistency in the image content.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/outpaint HTTP/1.1
Content-Type: application/json
{
"mask_img": "xxxxx",
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
12. Image Generation - 3D Cartoon
- Description Replace a real person's photo with a 3D-styled cartoon character photo.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2cartoon HTTP/1.1
Content-Type: application/json
{
"img": "xxxxx",
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
13. Chat - RAG
- Description Users can upload documents and engage in dialogues with AI regarding the content of the uploaded documents. The AI can reorganize the content of the documents in accordance with the requirements of the dialogue and provide the sources of the relevant references.
- Models
- LLM
- AI Agent Development Framework
- Reasoning and Acting
- High-performance distributed inference service
- Vector Database
- Output Format Control
- API Example
Request:
POST /api/rag/chat/completes HTTP/1.1
Content-Type:application/json
{
"conversationID": "xxx", // conversation id
"corpusID": "xxx", // knowledge corpus id
"model": "xxx", // model name
"messages": [
{
"role": "human", // message role, support human and ai
"message": "xxx" // message
}
] // messages
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"message": "xxx", // chat response message
"references": [
"xxxx"
] // references
}
}
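Since a RAG answer arrives together with its source references, a client will usually want to present the two side by side. The Python sketch below parses the response and renders numbered citations. The helper names and the citation layout are illustrative choices, not part of the API.

```python
import json


def parse_rag_response(raw):
    """Return (message, references) from a successful RAG chat response."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"RAG error {body.get('code')}: {body.get('err_msg')}")
    result = body["result"]
    # Fall back to an empty list if no references were returned.
    return result["message"], result.get("references", [])


def format_answer(message, references):
    """Append the references to the answer as a numbered list."""
    lines = [message]
    for number, reference in enumerate(references, start=1):
        lines.append(f"[{number}] {reference}")
    return "\n".join(lines)
```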
14. Image Generation - Style Imitation
- Description Provide a reference photo of a specific style, allowing control over the style of the generated photo.
- Models
- StableDiffusion
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/ipadapter HTTP/1.1
Content-Type: application/json
{
"style_img": "xxxxx",
"face_img": "xxx",
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"image": "xxxx"
}
}
15. Image Generation - Realistic Photo Generation
- Description Generate high-quality, realistic photos of real people with rich details and without an artificial feel.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/txt2img/refine HTTP/1.1
Content-Type: application/json
{
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"image": "xxxx"
}
}
16. Image Generation - Facial Expression Modification
- Description Provide emotional prompt words to modify the facial expressions of the people in the original photo.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/expression HTTP/1.1
Content-Type: application/json
{
"expression": "laugh",
"img": "xxxx"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
17. Image Generation - Pose Modification
- Description Provide a pose photo and a face photo, and generate a photo with the corresponding face and pose based on the prompt words.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/controlnet HTTP/1.1
Content-Type: application/json
{
"pose_img": "xxxxx",
"face_img": "xxx",
"prompt": "1 girl",
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
18. Image Generation - Background Change
- Description Provide a photo, and the AI will automatically remove the background and generate a new one based on the prompt words.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/changebg HTTP/1.1
Content-Type: application/json
{
"raw_img": "xxxxx",
"prompt": "in the park", // the bg wanted to change to
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
19. Image Generation - 3D Model
- Description Generate 3D Model.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/txt2img/3d HTTP/1.1
Content-Type: application/json
{
"prompt": "a cat", // input prompt
"negative_prompt": "ugly",
"seed": 0,
"batch_size": 4,
"width": 512,
"height": 768
}
Response:
{
"code": 0,
"data": {
"images": ["xx"],
"seed": 123
}
}
20. Image Generation - Emoticon Pack
- Description Generate an emoticon (meme) pack from a provided photo.
- Models
- Lora
- ControlNet+DWPose
- Img2Img
- InsightFace
- FreeU
- IP-Adapter
- ESRGAN
- API Example
Request:
POST /api/img2img/meme HTTP/1.1
Content-Type: application/json
{
"raw_img": "xxxxx",
"seed": 0,
"batch_size": 4
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"images": ["xx"],
"seed": 123
}
}
21. Video Generation - Video Redrawing
- Description Provide a video of a person, and swap the face of the main character in the video.
- Models
- Scene Detection
- Face Detect
- Swap Face
- Speech Enhancement
- Translation
- Speech synthesis
- Lip Syncing
- Face Restoration
- API Example
Request:
POST /api/video2video HTTP/1.1
Content-Type: application/json
{
"raw_video": "xxxxx",
"video_style": "anime"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"video": "xx"
}
}
22. Video Generation - Illustrated Video
- Description Provide textual material, select the video generation style, and automatically generate an illustrated video corresponding to the text.
- Models
- Scene Detection
- Face Detect
- Swap Face
- Speech Enhancement
- Translation
- Speech synthesis
- Lip Syncing
- Face Restoration
- API Example
Request:
POST /api/story2video HTTP/1.1
Content-Type: application/json
{
"story": "xxxxx", // input story
"style": "anime"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"video": "xx"
}
}
23. Chat - Text-based Games
- Description The game randomly generates a character, whom you play through the various stages of life. At each age you encounter randomly generated events and must make choices; the model then generates the outcomes of those choices. The game ends when the character dies, at which point a summary of the life lived is produced.
- Models
- LLM
- AI Agent Development Framework
- Reasoning and Acting
- High-performance distributed inference service
- Vector Database
- Output Format Control
- API Example
Request:
POST /api/game/chat/completes HTTP/1.1
Content-Type:application/json
{
"gameID": "xxxx", // game id
"sessionID": "xxx", // session id
"model": "xxx", // model name
"messages": [
{
"role": "human", // message role, support human and ai
"message": "xxx" // message
}
] // messages
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"message": "xxx" // chat response message
}
}
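A turn-based game needs the client to track the exchange across requests. The Python sketch below keeps a running list of "human" and "ai" messages and resends it on each turn. Whether the server expects the full history or only the latest message (relying on "sessionID" for the rest) is not stated in the source, so resending the history is an assumption, and the class is hypothetical.

```python
import json


class GameSession:
    """Tracks one play-through and builds successive request payloads."""

    def __init__(self, game_id, session_id, model):
        self.game_id = game_id
        self.session_id = session_id
        self.model = model
        self.messages = []

    def next_payload(self, choice):
        """Record the player's choice and return the next request body."""
        self.messages.append({"role": "human", "message": choice})
        return {
            "gameID": self.game_id,
            "sessionID": self.session_id,
            "model": self.model,
            # Copy the list so later turns do not mutate earlier payloads.
            "messages": list(self.messages),
        }

    def record_reply(self, raw):
        """Store the model's outcome text and return it."""
        body = json.loads(raw)
        if body.get("code") != 0:
            raise RuntimeError(f"game error {body.get('code')}: {body.get('err_msg')}")
        reply = body["result"]["message"]
        self.messages.append({"role": "ai", "message": reply})
        return reply
```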
26. Video Generation - Voiceover Video
- Description Input text and a voiceover video. The AI will automatically change the voiceover in the video to match the input text, ensuring that the lip movements and pronunciation are consistent.
- Models
- Scene Detection
- Face Detect
- Swap Face
- Speech Enhancement
- Translation
- Speech synthesis
- Lip Syncing
- Face Restoration
- API Example
Request:
POST /api/videoretalking HTTP/1.1
Content-Type: application/json
{
"prompt": "xxxxx", // input prompt
"video": "xxx"
}
Response:
{
"code": 0, // error code, 0 means Ok
"data": {
"video": "xx"
}
}
27. Text - Translation
- Description Translate text between various languages accurately and contextually. The AI can handle different dialects and colloquialisms, ensuring the translated content maintains the original meaning and tone.
- Models
- LLM
- High-performance distributed inference service
- Language Detection Algorithms
- Contextual Understanding and Reasoning
- Translation Memory Database
- API Example
Request:
POST /api/text/translate HTTP/1.1
Content-Type: application/json
{
"sourceLanguage": "xxx", // source language code
"targetLanguage": "xxx", // target language code
"text": "xxx", // text to translate
"model": "xxx", // model name
"context": {
"topic": "xxx", // optional, topic for better context
"formality": "xxx" // optional, formality level
}
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"translatedText": "xxx", // translated text
"sourceLanguageDetected": "xxx", // detected source language
"targetLanguage": "xxx", // target language
"modelUsed": "xxx" // model name used
}
}
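The "context" fields are marked optional, so a client can leave them out when it has nothing to say. A minimal Python sketch follows. Omitting the whole "context" object when both fields are absent is an assumption about what the server accepts, and the helper names are hypothetical.

```python
import json


def build_translate_payload(text, source, target, model,
                            topic=None, formality=None):
    """Request body for POST /api/text/translate with optional context."""
    payload = {
        "sourceLanguage": source,
        "targetLanguage": target,
        "text": text,
        "model": model,
    }
    context = {}
    if topic is not None:
        context["topic"] = topic
    if formality is not None:
        context["formality"] = formality
    if context:
        # Assumption: "context" may be omitted when no hints are given.
        payload["context"] = context
    return payload


def parse_translation(raw):
    """Return (translated_text, detected_source_language)."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"translate error {body.get('code')}: {body.get('err_msg')}")
    result = body["result"]
    return result["translatedText"], result["sourceLanguageDetected"]
```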
28. Text - Document Search Service
- Description Enable searching within a large corpus of documents to find relevant information based on user queries. The AI can handle various document formats, support full-text search, semantic search, and provide relevance-ranked results. It supports filtering by metadata and can return excerpts of documents with highlighted query terms.
- Models
- LLM
- High-performance distributed inference service
- Vector Search and Embeddings
- Natural Language Processing (NLP)
- Semantic Search Algorithms
- Metadata Filtering and Indexing
- API Example
Request:
POST /api/text/docsearch HTTP/1.1
Content-Type: application/json
{
"query": "xxx", // user query
"corpusId": "xxx", // ID of the document corpus
"model": "xxx", // model name
"filters": {
"author": "xxx", // optional, filter by author
"dateRange": {
"start": "yyyy-mm-dd", // optional, start date
"end": "yyyy-mm-dd" // optional, end date
},
"tags": ["xxx", "xxx"] // optional, filter by tags
},
"highlight": true, // optional, highlight query terms in results
"topK": 10 // optional, number of top results to return
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"query": "xxx", // original query
"corpusId": "xxx", // ID of the document corpus
"results": [
{
"documentId": "xxx", // document ID
"title": "xxx", // document title
"snippet": "xxx", // excerpt of the document with highlighted terms
"score": 0.95, // relevance score
"metadata": {
"author": "xxx", // author of the document
"date": "yyyy-mm-dd", // date of the document
"tags": ["xxx", "xxx"] // tags associated with the document
}
}
]
}
}
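The filter object nests several optional fields, which is easy to get wrong by hand. The Python sketch below assembles it from keyword arguments and keeps only results above a relevance threshold. Dropping unused filters from the payload is an assumption about server behavior; the function names are hypothetical.

```python
import json
from datetime import date


def build_docsearch_payload(query, corpus_id, model, author=None,
                            start=None, end=None, tags=None,
                            highlight=False, top_k=10):
    """Request body for POST /api/text/docsearch; dates are datetime.date."""
    filters = {}
    if author:
        filters["author"] = author
    if start or end:
        date_range = {}
        if start:
            date_range["start"] = start.isoformat()  # yyyy-mm-dd
        if end:
            date_range["end"] = end.isoformat()
        filters["dateRange"] = date_range
    if tags:
        filters["tags"] = list(tags)
    payload = {
        "query": query,
        "corpusId": corpus_id,
        "model": model,
        "highlight": highlight,
        "topK": top_k,
    }
    if filters:
        payload["filters"] = filters
    return payload


def hits_above(raw, min_score):
    """Return results scoring at least min_score, best first."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"search error {body.get('code')}: {body.get('err_msg')}")
    hits = [hit for hit in body["result"]["results"] if hit["score"] >= min_score]
    return sorted(hits, key=lambda hit: hit["score"], reverse=True)
```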
29. Text - Sentiment Analysis
- Description Analyze the sentiment of text to determine the emotional tone, such as positive, negative, or neutral. The AI can handle different languages and dialects, provide sentiment scores, and identify key emotional phrases. It supports batch processing and can analyze sentiment over specific aspects of a text.
- Models
- LLM
- High-performance distributed inference service
- Sentiment Analysis Models
- Natural Language Processing (NLP)
- Aspect-based Sentiment Analysis (ABSA)
- Multilingual Support
- API Example
Request:
POST /api/text/sentiment HTTP/1.1
Content-Type: application/json
{
"text": "xxx", // text to analyze
"language": "xxx" // optional, language code
}
Response:
{
"code": 0, // error code, 0 means Ok
"err_msg": "xxx", // error message
"result": {
"text": "xxx", // original text
"sentiment": "positive", // sentiment classification
"confidence": 0.95, // confidence score
"modelUsed": "xxx" // model name used
}
}
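Because the response includes a confidence score, a caller can decline to act on weak classifications. The Python sketch below shows one way to do that. The 0.8 threshold and the "uncertain" label are arbitrary illustrative choices, not values defined by the API.

```python
import json


def classify(raw, min_confidence=0.8):
    """Return the sentiment label, or "uncertain" below the threshold."""
    body = json.loads(raw)
    if body.get("code") != 0:
        raise RuntimeError(f"sentiment error {body.get('code')}: {body.get('err_msg')}")
    result = body["result"]
    if result["confidence"] < min_confidence:
        # Client-side convention only; the API itself never returns this.
        return "uncertain"
    return result["sentiment"]
```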
30. Text - Embedding
- Description Generate vector embeddings for text, images, or other data types to facilitate similarity search, clustering, and semantic understanding. The service supports various embedding models optimized for different data types and use cases, providing high-dimensional vector representations that capture the semantic essence of the input data.
- Models
- LLM
- High-performance distributed inference service
- Vector Embedding Models
- Dimensionality Reduction Techniques
- Semantic Understanding Algorithms
- API Example
Request:
POST /api/text/embeddings HTTP/1.1
Content-Type: application/json
{
"text": "xxx", // text to embed
"normalize": true // optional, normalize the embedding vector
}
Response:
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [0.123, 0.456, ...] // embedding vector
}
]
}
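The description mentions similarity search, which typically means comparing embedding vectors by cosine similarity. The Python sketch below extracts vectors from a response of the shape shown above and compares two of them. Note that this response has no "code" field, so no error check is attempted. The helper names are hypothetical.

```python
import json
import math


def extract_embeddings(raw):
    """Return the embedding vectors ordered by their "index" field."""
    body = json.loads(raw)
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

If the request sets "normalize" to true and the returned vectors are unit length, the dot product alone gives the same ranking.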