AI API in use

Last modified by Munoz Matéo on 2023/06/22 20:10

API usage for AI from different sources

This page describes how APIs built for AI models and LLMs are designed, based on their documentation.

This document has two parts: one comparing non-open-source APIs, and another covering open-source ones.

Non Open Source API

Stable Diffusion:

Stable Diffusion's official API is a REST API for Stable Diffusion, an AI model that produces images from a text description.

This API is available in Python.

You have to authenticate with a key.

List of the endpoints available through this API:

  • Text to image
  • Image to Image
  • Inpainting
  • Fetch queued image
  • System Load
  • Super resolution

The API returns JSON.

Here is a request example using the Text to image endpoint:

{
 "key": "",
 "prompt": "ultra realistic close up portrait ((beautiful pale cyberpunk female with heavy black eyeliner)), blue eyes, shaved side haircut, hyper detail, cinematic lighting, magic neon, dark red city, Canon EOS R3, nikon, f/1.4, ISO 200, 1/160s, 8K, RAW, unedited, symmetrical balance, in-frame, 8K",
 "negative_prompt": "((out of frame)), ((extra fingers)), mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), (((tiling))), ((naked)), ((tile)), ((fleshpile)), ((ugly)), (((abstract))), blurry, ((bad anatomy)), ((bad proportions)), ((extra limbs)), cloned face, (((skinny))), glitchy, ((extra breasts)), ((double torso)), ((extra arms)), ((extra hands)), ((mangled fingers)), ((missing breasts)), (missing lips), ((ugly face)), ((fat)), ((extra legs)), anime",
 "width": "512",
 "height": "512",
 "samples": "1",
 "num_inference_steps": "20",
 "seed": null,
 "guidance_scale": 7.5,
 "safety_checker": "yes",
 "webhook": null,
 "track_id": null
}

Example response:

{
 "status": "success",
 "generationTime": 2.920767068862915,
 "id": 302455,
 "output": [
     "https://d1okzptojspljx.cloudfront.net/generations/05c3260d-6a2e-4aa5-82f0-e952f2a5fa10-0.png"
 ],
 "meta": {
     "H": 512,
     "W": 512,
     "enable_attention_slicing": "true",
     "file_prefix": "05c3260d-6a2e-4aa5-82f0-e952f2a5fa10",
     "guidance_scale": 7.5,
     "model": "runwayml/stable-diffusion-v1-5",
     "n_samples": 1,
     "negative_prompt": "((out of frame)), ((extra fingers)), mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), (((tiling))), ((naked)), ((tile)), ((fleshpile)), ((ugly)), (((abstract))), blurry, ((bad anatomy)), ((bad proportions)), ((extra limbs)), cloned face, (((skinny))), glitchy, ((extra breasts)), ((double torso)), ((extra arms)), ((extra hands)), ((mangled fingers)), ((missing breasts)), (missing lips), ((ugly face)), ((fat)), ((extra legs)), anime",
     "outdir": "out",
     "prompt": "ultra realistic close up portrait ((beautiful pale cyberpunk female with heavy black eyeliner)), blue eyes, shaved side haircut, hyper detail, cinematic lighting, magic neon, dark red city, Canon EOS R3, nikon, f/1.4, ISO 200, 1/160s, 8K, RAW, unedited, symmetrical balance, in-frame, 8K",
     "revision": "fp16",
     "safety_checker": "none",
     "seed": 1793745243,
     "steps": 20,
     "vae": "stabilityai/sd-vae-ft-mse"
 }
}
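
The request and response shapes above can be handled with two small helpers. This is only a sketch: the function names `build_text2img_payload` and `image_urls` are hypothetical, the field names are copied from the example request, and no HTTP call is made, since the endpoint URL is not given on this page.

```python
from typing import Any, Dict, List, Optional


def build_text2img_payload(
    key: str,
    prompt: str,
    negative_prompt: str = "",
    width: int = 512,
    height: int = 512,
    samples: int = 1,
    steps: int = 20,
    guidance_scale: float = 7.5,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a request body using the field names of the example request."""
    return {
        "key": key,
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        # The example request sends these numeric settings as strings.
        "width": str(width),
        "height": str(height),
        "samples": str(samples),
        "num_inference_steps": str(steps),
        "seed": seed,  # None is serialized as JSON null
        "guidance_scale": guidance_scale,
        "safety_checker": "yes",
        "webhook": None,
        "track_id": None,
    }


def image_urls(response: Dict[str, Any]) -> List[str]:
    """Return the generated image URLs, or raise if the status is not 'success'."""
    if response.get("status") != "success":
        raise RuntimeError(f"generation failed: {response!r}")
    return list(response.get("output", []))
```

The payload can be passed through `json.dumps` before being sent to the Text to image endpoint with the HTTP client of your choice.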

Source

DeepL:

DeepL's API is a REST API for the DeepL translation service, available in many programming languages (Java, .NET, Node.js, PHP, Python).

You have to authenticate with a key to use it.

An OpenAPI specification is provided.

List of the endpoints available through this API:

  • Text translation
  • Translate Document
  • Manage glossaries (create a glossary, delete a glossary, etc.)

The API returns JSON.

Here is a request example using the Text translation endpoint:

curl -X POST 'https://api-free.deepl.com/v2/translate' \
 -H 'Authorization: DeepL-Auth-Key [yourAuthKey]' \
 -d 'text=Hello%2C%20world!' \
 -d 'target_lang=DE'

Example response:

{
  "translations": [
    {
      "detected_source_language": "EN",
      "text": "Hallo, Welt!"
    }
  ]
}
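
The same call can be sketched in Python with the standard library only. The helper names `build_translate_request` and `parse_translations` are hypothetical; the URL, the `DeepL-Auth-Key` header and the form fields come from the curl example above. The request is built but not sent.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request

DEEPL_URL = "https://api-free.deepl.com/v2/translate"  # from the curl example


def build_translate_request(auth_key: str, text: str, target_lang: str) -> Request:
    """Build a form-encoded POST request equivalent to the curl call."""
    form = urlencode({"text": text, "target_lang": target_lang}).encode("ascii")
    return Request(
        DEEPL_URL,
        data=form,
        headers={"Authorization": f"DeepL-Auth-Key {auth_key}"},
        method="POST",
    )


def parse_translations(body: str) -> List[str]:
    """Extract the translated strings from a JSON response body."""
    return [item["text"] for item in json.loads(body)["translations"]]
```

Passing the request object to `urllib.request.urlopen` would send it; the response body can then be handed to `parse_translations`.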

Source

OpenAI:

OpenAI's API is a REST API for OpenAI's models (DALL-E, GPT-3.5, GPT-4...), provided with a Python library and a Node.js one (there are more, but they are community-made libraries).

Information

There are multiple APIs from OpenAI (in fact, every endpoint is an API of its own).

You have to authenticate with a key to use it.

List of the endpoints available through this API:

  • Text completion
  • Chat completion
  • Image generation
  • Fine-Tuning
  • Embeddings
  • Speech to text
  • Moderation (checks whether content complies with the OpenAI policy)

The API returns JSON.

Here is a request example in Python using the chat completion endpoint:

import openai  # legacy (pre-1.0) interface of the openai Python package

# The API key is read from the OPENAI_API_KEY environment variable by default.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

Example response:

{
 'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
 'object': 'chat.completion',
 'created': 1677649420,
 'model': 'gpt-3.5-turbo',
 'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
 'choices': [
   {
    'message': {
      'role': 'assistant',
      'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers.'},
    'finish_reason': 'stop',
    'index': 0
   }
  ]
}
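
Because the chat endpoint is stateless, the caller has to resend the conversation history on every turn. A minimal sketch of that bookkeeping, working on plain dictionaries shaped like the example response (the helper names `reply_text` and `next_turn` are hypothetical, and no API call is made):

```python
from typing import Any, Dict, List

Message = Dict[str, str]


def reply_text(response: Dict[str, Any]) -> str:
    """Return the content of the first choice in a chat completion response."""
    return response["choices"][0]["message"]["content"]


def next_turn(
    messages: List[Message], response: Dict[str, Any], user_input: str
) -> List[Message]:
    """Return a new message list extended with the assistant reply and a new user message."""
    assistant = response["choices"][0]["message"]
    return messages + [
        {"role": assistant["role"], "content": assistant["content"]},
        {"role": "user", "content": user_input},
    ]
```

The list returned by `next_turn` can be passed as the `messages` argument of the next request.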

Source

Hugging Face:

Hugging Face's API is a REST API for using LLMs (public or private ones).

The API can be used from Python, Java, Go, JavaScript and Ruby, but the transformers library, which provides access to the pre-trained models, is written in Python.

You have to authenticate with a token to use it.

List of the endpoints available using this API:

  • Natural Language Processing Tasks
  • Audio Tasks
  • Object detection and image segmentation
  • Fine-Tuning
  • And more, since you can use a lot of models

The API returns JSON.

Here is a request example using one of the Natural Language Processing tasks available:

import requests

API_TOKEN = "<your-api-token>"  # placeholder: replace with your own token
API_URL = "https://api-inference.huggingface.co/models/bert-base-uncased"
headers = {"Authorization": f"Bearer {API_TOKEN}"}


def query(payload):
    # Send the payload as JSON and decode the JSON response.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()


data = query({"inputs": "The answer to the universe is [MASK]."})

Example response:

[
        {
            "sequence": "the answer to the universe is no.",
            "score": 0.1696,
            "token": 2053,
            "token_str": "no",
        },
        {
            "sequence": "the answer to the universe is nothing.",
            "score": 0.0734,
            "token": 2498,
            "token_str": "nothing",
        },
        {
            "sequence": "the answer to the universe is yes.",
            "score": 0.0580,
            "token": 2748,
            "token_str": "yes",
        },
        {
            "sequence": "the answer to the universe is unknown.",
            "score": 0.044,
            "token": 4242,
            "token_str": "unknown",
        },
        {
            "sequence": "the answer to the universe is simple.",
            "score": 0.0402,
            "token": 3722,
            "token_str": "simple",
        },
    ]
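
Each entry of the response carries a `score`, so picking an answer is a matter of taking the highest-scoring candidate. A small sketch, assuming the response has already been decoded into a list of dictionaries as above (the function name `best_prediction` and the `min_score` threshold are hypothetical):

```python
from typing import Any, Dict, List, Optional


def best_prediction(
    predictions: List[Dict[str, Any]], min_score: float = 0.0
) -> Optional[Dict[str, Any]]:
    """Return the highest-scoring candidate, or None if none reaches min_score."""
    candidates = [p for p in predictions if p["score"] >= min_score]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["score"])
```

With the example response, `best_prediction(data)["token_str"]` would select the candidate whose score is 0.1696.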

There is also an interesting tool that was recently added (as of 12/05/2023): Transformers Agent.

Source

Similarities

As we can see, APIs for these uses are almost always available in Python. This is a useful indication if we want to use an existing API like the ones above.

Here is a summary of the similarities between these implementations:

  • All APIs use REST architecture.
  • They require authentication with a key or token.
  • They support JSON as the return type.
  • They provide various functionalities for natural language processing, image processing, translation, and other tasks.
  • They all support Python.
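
Because the APIs share the same building blocks (REST, a key or token, JSON), the authentication step can be described by one small structure. The sketch below is illustrative only: `ApiProfile` is a hypothetical name, and the header schemes are the ones visible in the DeepL, Hugging Face and Databerry examples on this page. The Stable Diffusion example sends its key in the request body instead, so it is not covered here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ApiProfile:
    """Describes how one API expects its key or token to be presented."""

    name: str
    auth_scheme: str  # prefix placed before the secret in the Authorization header

    def headers(self, secret: str) -> Dict[str, str]:
        """Build the Authorization header for this API."""
        return {"Authorization": f"{self.auth_scheme} {secret}"}


# Schemes taken from the request examples on this page.
DEEPL = ApiProfile("DeepL", "DeepL-Auth-Key")
HUGGING_FACE = ApiProfile("Hugging Face", "Bearer")
DATABERRY = ApiProfile("Databerry", "Bearer")
```

A wrapper written for our own use could keep one such profile per provider and reuse the same request and JSON-decoding code for all of them.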

The main difference lies in the endpoints available, which are chosen according to the specifics of each LLM / AI model. If we want a multi-modal app with our own API, we should consider endpoints that fit well with many models.

Also, fine-tuning is a very interesting feature, since it allows us to adapt a model to our needs.

All of the APIs listed above are well documented.

Open Source "API"

In this part, some of the presented "APIs" are just frameworks and do not necessarily implement a ready-made API. However, we could use one of these frameworks to build a small REST API for our own use.

In Python, for example, an API built with FastAPI (a framework) should not be hard to create with some light modifications to the chosen source.
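
To give an idea of how small such a wrapper can be, here is a sketch of a JSON endpoint. It deliberately uses Python's standard `http.server` module instead of FastAPI so that it runs without extra dependencies; a FastAPI route would wrap the same `handle_generate` logic. The route `/generate`, the `run_model` stub and the response shape are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, Tuple


def run_model(prompt: str) -> str:
    """Hypothetical stand-in for a call into the chosen framework."""
    return f"echo: {prompt}"


def handle_generate(raw_body: bytes) -> Tuple[int, Dict[str, Any]]:
    """Turn a JSON request body into an HTTP status code and a JSON-ready dict."""
    try:
        prompt = json.loads(raw_body)["prompt"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'prompt' field"}
    if not isinstance(prompt, str):
        return 400, {"error": "'prompt' must be a string"}
    return 200, {"results": [{"text": run_model(prompt)}]}


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/generate":  # hypothetical route name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_generate(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), GenerateHandler)
```

Calling `make_server().serve_forever()` would start the endpoint locally. Keeping the request handling in a plain function makes it easy to test and to move to another framework later.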

LMFlow:

LMFlow is a toolkit (i.e. multi-modal) that provides fine-tuning features. It is a lightweight framework with a built-in function-based API.

It is available in Python.

Here is a summary list of the endpoints available:

Example of a request in Python (from the endpoint from_dict, in the Dataset operation part):

    input_dataset = dataset.from_dict({
        "type": "text_only",
        "instances": [ { "text": context_ } ]
    })

Example response (from the endpoint from_dict, in the Dataset operation part):

{
  "type": "text_only",
  "instances": [
    { "key_1": "VALUE_1.1", "key_2": "VALUE_1.2", "etc.." },
    { "key_1": "VALUE_2.1", "key_2": "VALUE_2.2", "etc.." }
  ]
}
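
The dictionary passed to from_dict can be prepared from a plain list of strings. The sketch below only builds that dictionary and does not import LMFlow; the helper name `make_text_only_dataset` is hypothetical, while the "type" and "instances" keys follow the request example above.

```python
from typing import Any, Dict, Iterable


def make_text_only_dataset(texts: Iterable[str]) -> Dict[str, Any]:
    """Wrap raw strings in the 'text_only' structure shown in the request example."""
    instances = []
    for text in texts:
        if not isinstance(text, str):
            raise TypeError(f"expected str, got {type(text).__name__}")
        instances.append({"text": text})
    return {"type": "text_only", "instances": instances}
```

The result has the same shape as the argument of `dataset.from_dict` in the example request.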

Here is a list of the features:

Source

Databerry:

Databerry is a no-code platform made to bring specific data sources into a datastore and connect them to ChatGPT (with the OpenAI API) or any other LLM (with the Databerry API).

The platform is available in Node.js.

Databerry's API is a REST API. For authentication, you have to provide a key in the header of each request.

Here is a list of the endpoints available:

  • Query Agent
  • Query Datastore
  • Upsert Datasource
  • Update Datasource
  • Delete Datasource
  • File Upload - Generate Link
  • File Upload

Example of a request (with Query Datastore endpoint):

curl --location --request POST 'https://api.databerry.ai/datastores/query/<datastoreId>' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <API_KEY>' \
--data-raw '{
    "query": "Lorem Ipsum...",
    "topK": 5
}'

Example response from this endpoint:

{
  "results": [
    {
      "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
      "score": 0.89,
      "source": "https://en.wikipedia.org/wiki/Lorem_ipsum"
    },
    {
      "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.",
      "score": 0.42,
      "source": "https://en.wikipedia.org/wiki/Lorem_ipsum"
    }
  ]
}
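
A Python equivalent of the query, together with a filter on the returned scores, could look like the sketch below. The URL, headers and body fields are taken from the curl example; the helper names and the `min_score` threshold are hypothetical, and the request is built without being sent.

```python
import json
from typing import Any, Dict, List
from urllib.request import Request


def build_query_request(
    datastore_id: str, api_key: str, query: str, top_k: int = 5
) -> Request:
    """Build the POST request for the Query Datastore endpoint."""
    body = json.dumps({"query": query, "topK": top_k}).encode("utf-8")
    return Request(
        f"https://api.databerry.ai/datastores/query/{datastore_id}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def relevant_results(
    response: Dict[str, Any], min_score: float = 0.5
) -> List[Dict[str, Any]]:
    """Keep results scoring at least min_score, best match first."""
    hits = [r for r in response["results"] if r["score"] >= min_score]
    return sorted(hits, key=lambda r: r["score"], reverse=True)
```

With the example response and the default threshold, only the result scored 0.89 would be kept.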

Here is a list of the features available using Databerry:

  • Load data from anywhere
  • No-code: User-friendly interface to manage your datastores and chat with your data
  • Secured API endpoint for querying your data
  • Auto sync data sources (not yet implemented)
  • Auto generates a ChatGPT Plugin for each datastore

Source

Text-generation-webui:

Text-generation-webui is a Gradio web UI for running Large Language Models.

It is available in Python.

The app allows you to enable extensions, such as an API with or without streaming (basically REST or streaming), or an OpenAI-compatible API.

Here is a list of the endpoints for the basic API extension:

  • Streaming at /api/v1/stream on port 5005.
  • Blocking at /api/v1/generate on port 5000.

Example of a request for the Blocking endpoint:

   {
        "prompt": "In order to make homemade bread, follow these steps:\n1)",
        "max_new_tokens": 250,
        "do_sample": true,
        "temperature": 1.3,
        "top_p": 0.1,
        "typical_p": 1,
        "repetition_penalty": 1.18,
        "top_k": 40,
        "min_length": 0,
        "no_repeat_ngram_size": 0,
        "num_beams": 1,
        "penalty_alpha": 0,
        "length_penalty": 1,
        "early_stopping": false,
        "seed": -1,
        "add_bos_token": true,
        "truncation_length": 2048,
        "ban_eos_token": false,
        "skip_special_tokens": true,
        "stopping_strings": []
    }

Example response (really basic API):

{
 "results": 
  [{"text": " Make a loaf of bread.\n2) Place the dough in a bowl and add the flour, baking powder, salt, baking soda, and cinnamon.\n3) Add the yeast mixture to the wet ingredients and mix until well combined.\n4) Add the dry ingredients and mix again until incorporated.\n5) Add the remaining ingredients and mix until completely combined.\n6) Add the water and mix until fully combined.\n7) Add the remaining ingredients and mix until thoroughly mixed together.\n8) Add the remaining ingredients and mix until completely combined.\n9) Add the remaining ingredients and mix until completely combined.\n10) Add the remaining ingredients and mix until completely combined.\n11) Add the remaining ingredients and mix until completely combined.\n12) Add the remaining ingredients and mix until completely combined.\n13) Add the remaining ingredients and mix until completely combined.\n14) Add the remaining ingredients and mix until completely combined.\n15) Add the remaining ingredients and mix until completely combined.\n16) Add the remaining ingredients and mix until completely combined.\n17) Add the remaining ingredients and mix until completely combined.\n18) Add the remaining ingredients and mix until completely combined.\n19) Add the remaining ingredients"}]
}
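
Most of the request fields are generation settings that rarely change, so a client can keep defaults and override only what it needs. The following sketch builds a request for the Blocking endpoint without sending it. The path and port come from the endpoint list above; the `http://localhost` host, the helper names and the choice of default fields are assumptions.

```python
import json
from typing import Any, Dict
from urllib.request import Request

# A subset of the parameters from the example request, used as defaults.
DEFAULT_PARAMS: Dict[str, Any] = {
    "max_new_tokens": 250,
    "do_sample": True,
    "temperature": 1.3,
    "top_p": 0.1,
    "top_k": 40,
    "repetition_penalty": 1.18,
    "seed": -1,
    "stopping_strings": [],
}


def build_generate_request(
    prompt: str, host: str = "http://localhost:5000", **overrides: Any
) -> Request:
    """Merge defaults with overrides and build the POST request."""
    payload = {**DEFAULT_PARAMS, **overrides, "prompt": prompt}
    return Request(
        f"{host}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generated_text(response: Dict[str, Any]) -> str:
    """Return the text of the first result in a Blocking endpoint response."""
    return response["results"][0]["text"]
```

For example, `build_generate_request("Hello", temperature=0.7)` keeps every default except the temperature.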

Another list of endpoints, for the OpenAI-compatible API, can be read here.

Finally, the list of features:

  • Dropdown menu for switching between models.
  • Notebook mode that resembles OpenAI's playground.
  • Chat mode.
  • Instruct mode compatible with various formats, including Alpaca, Vicuna, Open Assistant, Dolly, Koala, ChatGLM, MOSS, RWKV-Raven, Galactica, StableLM, WizardLM, Baize, MPT, and INCITE
  • Multimodal pipelines, including LLaVA and MiniGPT-4.
  • Markdown output for GALACTICA, including LaTeX rendering
  • Nice HTML output for GPT-4chan
  • Custom chat characters
  • Advanced chat features (send images, get audio responses with TTS)
  • Very efficient text streaming
  • Parameter presets
  • LLaMA model
  • 4-bit GPTQ mode
  • LoRA (loading and training)
  • llama.cpp
  • RWKV model
  • 8-bit mode
  • Layers splitting across GPU(s), CPU, and disk
  • CPU mode
  • FlexGen
  • DeepSpeed ZeRO-3
  • API with streaming and without streaming 
  • Extensions - see the user extensions list

Source

Conclusion on Open Source API

Considering the sources above, we can state that the available Open Source API resources (where they exist) are, for the most part, not complete enough for our use and need to be extended to meet our needs.

Apart from that, the differences lie in how each project was conceived in the first place. Most of the projects listed above were initially made to train Open Source LLMs and to experiment. With that in mind, Databerry should be a logical choice to make things easier.

We should consider using the project whose API is as close as possible to our final goals, but we also need to take the full potential of each one into account to avoid possible future limitations (i.e. the more capabilities, the better).

Conclusion

There are several options to choose from. 

On the non-Open-Source side:

  • If we require image generation and REST architecture, the Stable Diffusion API is a suitable choice.
  • For translation services, the DeepL API is recommended.
  • The OpenAI API offers a broad range of text-related functionalities.
  • If our focus is on natural language processing, the Hugging Face API is a strong contender with an easy way to test models.

On the Open Source side:

  • For more customization and control over large language models, the LMFlow Framework stands out with features like fine-tuning and data management.
  • The Databerry Framework is well-suited for querying data sources via the REST architecture.
  • If you prefer a Python-supported web UI integration for text generation, the Text-generation-webui Framework is worth considering.
