How does StarLandAI Control LLM Output Format?

StarLandAI (Maintainer)


Large language models can produce fluent text, but they do not always follow directions when a specific output format is required. When creating characters on StarLandAI, key character attributes must be extracted from multiple user messages and turned into structured character attribute information. How reliably the model produces output in the required format directly affects the experience of configuring character attributes on StarLandAI. Various prompt-design strategies can improve the consistency and reliability of the generated text, yet they are not always sufficient. So how can the LLM's output format be controlled?

StarLandAI uses lm-format-enforcer [1] to solve this problem. By restricting, at each step, the set of tokens the language model is allowed to produce, lm-format-enforcer ensures compliance with the desired output format while imposing minimal constraints on the model's capabilities.

1. How does it work?

lm-format-enforcer works by combining a character-level parser with a tokenizer prefix tree into a smart token-filtering system.


(1) Character Level Parser

Parsing a string in any given format can be viewed as walking an implicit tree: at any point during parsing there is a specific set of permissible next characters, each of which, once consumed, leads to a further set of allowable characters, and so on.
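The idea can be sketched with a toy character-level parser that accepts any string from a fixed set of values. This is an illustrative sketch only, not lm-format-enforcer's actual API; the class and method names are invented for the example:

```python
class StringSetParser:
    """Toy character-level parser accepting any string from a fixed set.

    Each instance is one node of the implicit parse tree described above.
    """

    def __init__(self, allowed_strings, prefix=""):
        self.allowed_strings = allowed_strings
        self.prefix = prefix  # the characters consumed so far

    def get_allowed_characters(self):
        # Characters that extend the current prefix toward a valid string.
        return {s[len(self.prefix)]
                for s in self.allowed_strings
                if s.startswith(self.prefix) and len(s) > len(self.prefix)}

    def add_character(self, ch):
        # Consuming a character moves us to the next node of the tree.
        return StringSetParser(self.allowed_strings, self.prefix + ch)

    def can_end(self):
        # True when the string parsed so far is a complete, valid value.
        return self.prefix in self.allowed_strings


parser = StringSetParser({"hero", "healer", "villain"})
print(sorted(parser.get_allowed_characters()))                      # ['h', 'v']
print(sorted(parser.add_character("h").get_allowed_characters()))   # ['e']
```

The same interface generalizes to regular expressions or JSON schemas: the parser only ever needs to answer "which characters may come next?" and "may the string end here?".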

(2) Tokenizer Prefix Tree

Given the tokenizer of a particular language model, we can build a prefix tree containing every token the model might produce, by inserting each token's character sequence into the tree.
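A minimal sketch of that construction, using a toy vocabulary in place of a real tokenizer's (with a real model, something like Hugging Face's `tokenizer.get_vocab()` would supply the token-string-to-id mapping):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.token_ids = []  # ids of tokens whose string ends exactly here


def build_token_prefix_tree(vocab):
    # vocab maps token strings to token ids.
    root = TrieNode()
    for token_str, token_id in vocab.items():
        node = root
        for ch in token_str:
            node = node.children.setdefault(ch, TrieNode())
        node.token_ids.append(token_id)
    return root


toy_vocab = {"he": 0, "hero": 1, "hell": 2, "vi": 3}
root = build_token_prefix_tree(toy_vocab)
print(sorted(root.children))                       # ['h', 'v']
print(root.children["h"].children["e"].token_ids)  # [0]
```

The tree is built once per tokenizer and reused across every generation step.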

(3) Combining the two

With a character-level parser and a tokenizer prefix tree in hand, we can efficiently filter the tokens the language model is allowed to produce at the next step: we traverse only those characters that are present both in the current node of the character-level parser and in the current node of the tokenizer prefix tree. Recursing on both trees in this way yields the full set of permissible tokens. When the language model emits a token, we advance the character-level parser with the newly generated characters, so it can narrow the options for the next timestep.
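The joint traversal can be sketched as follows (a self-contained toy, with the trie as nested dicts and the parser reduced to a fixed set of allowed strings; none of this is lm-format-enforcer's real API):

```python
def build_trie(vocab):
    # Token prefix tree as nested dicts; "<id>" marks a complete token.
    root = {}
    for token_str, token_id in vocab.items():
        node = root
        for ch in token_str:
            node = node.setdefault(ch, {})
        node["<id>"] = token_id
    return root


def allowed_next_chars(allowed_strings, prefix):
    # One node of a character-level parser over a fixed set of strings.
    return {s[len(prefix)] for s in allowed_strings
            if s.startswith(prefix) and len(s) > len(prefix)}


def allowed_tokens(allowed_strings, prefix, trie_node):
    # Walk parser and trie in lockstep: follow only characters permitted
    # by BOTH structures; every complete token found on the way is legal.
    ids = [trie_node["<id>"]] if "<id>" in trie_node else []
    for ch in allowed_next_chars(allowed_strings, prefix):
        if ch in trie_node:
            ids += allowed_tokens(allowed_strings, prefix + ch, trie_node[ch])
    return ids


trie = build_trie({"he": 0, "hero": 1, "hell": 2, "vi": 3, "villain": 4})
print(sorted(allowed_tokens({"hero", "villain"}, "", trie)))  # [0, 1, 3, 4]
```

Note that the token "hell" (id 2) is correctly excluded: after "he", the parser only permits "r", so the trie branch for "l" is never entered.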

2. Achieved effect

By applying this technique, StarLandAI can force the LLM to generate specific enumerated values. For example, when categorizing user-created characters and generating recommended character tags, StarLandAI describes the list of allowed tags with a regular expression, converts that regular expression into a character-level parser, and applies the parser to the LLM's output logits, filtering out any next token that would violate the regular expression. The LLM then selects the highest-probability token from those that conform.
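The final filtering step amounts to masking the logits of forbidden tokens before picking the next token. A minimal sketch of greedy decoding under such a mask (illustrative only; real inference stacks apply the mask to a logits tensor inside the sampling loop):

```python
import math

def constrained_pick(logits, allowed_ids):
    # Greedy decoding under the format constraint: set every forbidden
    # token's logit to -inf, then take the argmax of what remains.
    masked = [x if i in allowed_ids else -math.inf
              for i, x in enumerate(logits)]
    return max(range(len(masked)), key=lambda i: masked[i])


logits = [2.5, 0.3, 4.1, 1.0]            # unconstrained argmax is token 2
print(constrained_pick(logits, {1, 3}))  # token 2 is filtered out -> prints 3
```

Because only the mask changes per step, the model's own probability ranking among the permitted tokens is preserved.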


3. Feature in StarLandAI

On StarLandAI, configuring a custom character doesn’t require users to fill out extensive and complex forms, nor is there a need to understand the significance behind each field. Users simply need to have a casual chat with the Starland assistant, informing Starland of the character they wish to create. Starland itself handles the extraction and completion of the form attributes.


Yes, Starland uses lm-format-enforcer in the feature that lets users configure custom characters. Through conversation with the user, Starland understands the type of character the user wishes to create. It then leverages the LLM to offer inspiring suggestions and assistance for the character being configured. Finally, it summarizes the entire conversation history to generate a structured custom character configuration. Just like that, a custom character is configured.

References

[1] https://github.com/noamgat/lm-format-enforcer

StarLandAI, the first AI MaaS DePIN network supporting all types of large multimodal model applications

StarLandAI (Maintainer)

StarLandAI is a decentralized AI and blockchain network for the Web3 era, aiming to create a globally accessible ecosystem for AI + Web3 applications and to accelerate global innovation in large multimodal models. It provides AI capabilities such as AI-generated content creation and conversational AI, while supporting Web3 capabilities like wallets, NFT generation, data crowdfunding, and computing power staking. Serving as an AI layer, StarLandAI can cater to multiple public chains and is committed to becoming the world's first AI MaaS DePIN network application ecosystem.

StarLandAI supplies generative AI models as a service (MaaS), providing multimodal large models, including text, voice, image, and video, through cloud services and APIs, allowing for scalability, ease of access, and flexibility. StarLandAI uses cloud-native Enhanced DePIN Networks and utilizes microservices, containers, immutable infrastructure, and declarative APIs to ensure the rapid and resilient deployment of GenAI services on any type of computing device. With its generative AI application Matrix, StarLandAI offers GenAI applications such as AI avatar, image, voice, music, and video generation at GPT-level quality, compatible with blockchains including Solana, Ethereum, Bitcoin, and more.


At the crossroads of AI and crypto, many problems remain to be solved. GenAI models are difficult: deploying, training, fine-tuning, and managing multimodal large models, and integrating text, images, sound, databases, and distributed cloud-native systems, is highly complex. Types of computing power vary widely: AI computing spans server-grade GPUs such as the H100 and A100, consumer-grade cards such as the 4090, 3090, and 3080, integrated graphics, and CPUs, making unified management challenging. Low-end computing power sits idle: running large models efficiently on low-end resources such as the 3090, integrated graphics, and CPUs is highly challenging, leaving those resources unused. Finally, there is a lack of focus on applications: current DePIN networks, despite integrating substantial AI computing power, lack adequate support for developing, deploying, and maintaining generative AI applications, leading to limited real-world usage.


StarLandAI is the first decentralized AI avatar network based on large-scale multimodal models. The StarLandAI team itself is a mature team that has been deeply involved in AI for many years. They have developed several core AI technologies, including:

  • Multimodal large-scale models: Using deep learning techniques such as CNN, RNN, and Transformers, integrating data from multiple sources such as text, images, audio, and video, to enhance the understanding and adaptability of AI avatars;

  • Distributed low-memory training and inference: Distributing the model across multiple GPUs to exploit tensor parallelism, data parallelism, pipeline parallelism, gradient accumulation, and memory optimization techniques; this allows GPUs with smaller memory capacities to participate in inference calculations;

  • Cross-chain AI Avatar mining: Supporting Solana, Ethereum, BNB Chain, etc.—through the improved protocols of SPL 22 and ERC721/ERC404, it promotes the separation of NFT ownership and usage rights, and improves liquidity and participation by securitizing the inference revenue rights represented by NFT;

  • DePIN devices that support running large-scale models with multiple modules: Enabling various devices, such as PCs, smartphones, Internet of Things (IoT) devices, and edge computing nodes, to execute complex models involving multiple modalities, such as text, images, and audio;


StarLandAI's vision is to harness all idle compute resources. It integrates various unused computing power, including GPUs such as the 4090, 3090, and 3080, as well as compute from PCs, edge devices, and mobile platforms, transforming them into a versatile resource pool for multimodal large-scale models. With an innovative AI Layer 2 for blockchains, it advances Solana and others, bringing Web 2.0 AI users to the blockchain. StarLandAI aims to build a healthy economic ecosystem among compute providers, developers, AI creators, and public blockchains, fostering mutual growth and innovation. StarLandAI also offers one-click APIs, giving developers easy-to-use APIs and cloud services that enable seamless use and further development of all open-source large models without requiring in-depth technical knowledge.

StarLandAI's GenAI MaaS product is already running, generating over $500,000 in revenue. The AI Avatar product (www.starland.ai), on par with c.ai and GPT in response speed and multimodal capabilities, is live. StarLandAI also leverages low-end DePIN computing, like 3090 and 3080 GPUs, reducing total costs by 90%. Furthermore, StarLandAI has integrated with the Solana blockchain via point smart contracts, NFTs, and model mining.


For the ecosystem, StarLandAI takes advantage of Solana's high efficiency and strong ecosystem to create a complete narrative network of its own, building a full network of roles such as compute providers, AI digital human creators, and AI users. Each role can find a position in the ecosystem and earn income.

Today's Solana ecosystem is showing remarkable vitality. Compared with the previous bull market, Solana has achieved impressive results in infrastructure, ecosystem applications, market popularity, and wealth effects. Built on Solana, StarLandAI is all the more worth anticipating.
