
Detection

This page describes the detection endpoints available in the API.

To use an endpoint, send it a base64-encoded image together with any relevant configuration; the generic request pattern is sketched after the list below.

We currently have:

  • Face Detection
  • Text Detection
  • General Object Detection
  • Vehicle Detection
  • People Detection
  • Animal Detection
  • Furniture Detection
  • Clothing Detection
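Every endpoint follows the same request shape: a POST with a JSON body containing the base64-encoded image (plus any endpoint-specific options) and a bearer token in the Authorization header. Below is a minimal generic sketch of that pattern based on the examples in this page; the helper name send_detection_request and the placeholder path and token are illustrative, not part of the API.

import base64
import requests

def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')

def send_detection_request(endpoint: str, **options) -> dict:
    # POST a base64-encoded image to the chosen detection endpoint and
    # return the parsed JSON response.
    url = f"https://gateway.ezml.io/api/v1/functions/{endpoint}"
    payload = {"image": image_to_base64("<path to image>"), **options}
    headers = {"Authorization": "Bearer <token from /auth>"}
    res = requests.post(url, json=payload, headers=headers)
    res.raise_for_status()  # surface HTTP errors early
    return res.json()

For example, send_detection_request("vehicle_detection") or send_detection_request("object_detection", labels=["rabbit", "chicken"]) would match the payloads shown in the sections below, and a helper like this can stand in for the send_detection_request() placeholder used in the visualization example that follows.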

Output explanation

For each detection a bounding box (bbox) is returned as a list of four numbers, [x1, y1, x2, y2], where (x1, y1) is the top-left corner of the detected object and (x2, y2) is the bottom-right corner. The coordinates are in pixels, measured from the top-left corner of the image.

from PIL import Image, ImageDraw

# demonstration of how to use the output

img_path = "<path to image>"

def display_on_image(image, det):
    # Draw each detection's bounding box on the image; bbox values are
    # pixel coordinates measured from the top-left corner of the image.
    draw = ImageDraw.Draw(image)

    for detection in det:
        top_left_x, top_left_y, bottom_right_x, bottom_right_y = detection["bbox"]
        draw.rectangle((top_left_x, top_left_y, bottom_right_x, bottom_right_y), outline="red")
    image.show()

# any of the requests sent below can be used here
response_data = send_detection_request()

image = Image.open(img_path)
display_on_image(image, response_data["result"])
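Most endpoints also return a label for each detection, so a small extension of the helper above can annotate each box. This is a hedged sketch: it assumes a "label" field is present, which for example the people detection results below do not include.

from PIL import Image, ImageDraw

def display_with_labels(image, detections):
    # Draw each bounding box and, when present, the detected label just above it.
    draw = ImageDraw.Draw(image)
    for detection in detections:
        x1, y1, x2, y2 = detection["bbox"]
        draw.rectangle((x1, y1, x2, y2), outline="red")
        label = detection.get("label")
        if label:
            draw.text((x1, max(y1 - 12, 0)), label, fill="red")
    image.show()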

Face Detection (/face_detection)

Our face detection model, which also supports emotion and age detection.

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')


url = "https://gateway.ezml.io/api/v1/functions/face_detection"

payload = {
    "image": image_to_base64("<path to image>"),
    "labels": ["region", "age"] # detect face and age
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for face in res.json()["result"]:
    print(f"Face bounding box: {face['bbox']}")
    print(f"Age: {face['age']}")

Text Detection (/text_detection)

Either detect (location + text) or recognize (text only) text, with support for multiple languages.

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')

url = "https://gateway.ezml.io/api/v1/functions/text_detection"

payload = {
    "image": image_to_base64("<path to image>"),
    "type": "DETECTION", # locate and recognize text
    "language": "es" # set language to spanish
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for text in res.json()["result"]:
    print(f"Text: {text['label']}")
    print(f"Bounding box: {text['bbox']}")

General Object Detection (/object_detection)

This is our general zero-shot object detection endpoint. Just send a list of objects you want to detect and let us handle the rest!

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')


url = "https://gateway.ezml.io/api/v1/functions/object_detection"

payload = {
    "image": image_to_base64("<path to image>"),
    "labels": ["rabbit", "chicken", "cow", "pig"] # detect different species of animals
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for obj in res.json()["result"]:
    print(f"Detected object: {obj['label']}")
    print(f"Bounding box: {obj['bbox']}")

Vehicle Detection (/vehicle_detection)

This endpoint detects common vehicles in an image, specifically: "car", "truck", "bus", "van", "bicycle", "motorcycle", "scooter", "train", "airplane", "boat", "ship", "helicopter", "rocket", "tractor", "tank".

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')


url = "https://gateway.ezml.io/api/v1/functions/vehicle_detection"

payload = {
    "image": image_to_base64("<path to image>"),
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for obj in res.json()["result"]:
    print(f"Detected vehicle: {obj['label']}")
    print(f"Bounding box: {obj['bbox']}")

People Detection (/people_detection)

This is a model that specializes in detecting people in an image.

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')


url = "https://gateway.ezml.io/api/v1/functions/people_detection"

payload = {
    "image": image_to_base64("<path to image>"),
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for obj in res.json()["result"]:
    print(f"Bounding box: {obj['bbox']}")

Furniture Detection (/furniture_detection)

This endpoint detects common furniture in an image, specifically: "chair", "table", "couch", "bed", "desk", "lamp", "cabinet", "painting", "vase", "closet", "stove".

import base64
import requests


def image_to_base64(image_path: str) -> str:
    with open(image_path, "rb") as image_file:
        # Read the image, encode it in base64, and convert to string
        return base64.b64encode(image_file.read()).decode('utf-8')


url = "https://gateway.ezml.io/api/v1/functions/furniture_detection"

payload = {
    "image": image_to_base64("<path to image>"),
}
headers = {
    "Authorization": "Bearer <token from /auth>"
}

res = requests.post(url, json=payload, headers=headers)

for obj in res.json()["result"]:
    print(f"Detected furniture: {obj['label']}")
    print(f"Bounding box: {obj['bbox']}")