Skip to main content

Subscription Tiers

CTGT API offers two tiers designed for different needs:

Free Tier

Perfect for getting started
  • 3 AI models
  • 20 req/min, 100 req/hour
  • 500 req/day
  • 100K tokens/day
  • Pay-as-you-go only
  • No credit card required

Paid Tier

For production applications
  • All 10 AI models
  • 100 req/min, 1,000 req/hour
  • 10,000 req/day
  • 10M tokens/day
  • $10/month + usage
  • Priority support

Rate Limits Comparison

LimitFree TierPaid TierIncrease
Requests per minute201005x
Requests per hour1001,00010x
Requests per day50010,00020x
Tokens per day100,00010,000,000100x
Available models3103.6x
Rate limits reset at the start of each time period (minute, hour, day).

Check Your Subscription

View your current subscription status and limits:
curl https://api.ctgt.ai/v1/subscription/info \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Response (Free Tier):
{
  "user_id": "user_abc123",
  "email": "john@example.com",
  "subscription_tier": "free",
  "stripe_customer_id": "cus_abc123",
  "rate_limits": {
    "requests_per_minute": 20,
    "requests_per_hour": 100,
    "requests_per_day": 500,
    "tokens_per_day": 100000
  },
  "available_models": [
    {
      "model": "gemini-2.5-flash",
      "display_name": "Gemini 2.5 Flash",
      "description": "Fast and efficient model",
      "tier": "free",
      "pricing": {
        "input": 0.50,
        "output": 2.70
      }
    }
  ],
  "upgrade_url": "https://ctgt.ai/pricing"
}
Response (Paid Tier):
{
  "user_id": "user_abc123",
  "email": "john@example.com",
  "subscription_tier": "paid",
  "stripe_customer_id": "cus_abc123",
  "rate_limits": {
    "requests_per_minute": 100,
    "requests_per_hour": 1000,
    "requests_per_day": 10000,
    "tokens_per_day": 10000000
  },
  "available_models": [
    // All 10 models listed
  ],
  "upgrade_url": null
}

Upgrading to Paid Tier

Step 1: Initiate Upgrade

curl -X POST https://api.ctgt.ai/v1/subscription/upgrade \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Response:
{
  "message": "Checkout session created. Complete payment to upgrade.",
  "checkout_url": "https://checkout.stripe.com/c/pay/cs_test_abc123...",
  "session_id": "cs_test_abc123",
  "tier": "free"
}

Step 2: Complete Payment

  1. Visit the checkout_url in your browser
  2. Complete the Stripe checkout process
  3. Your tier will be automatically upgraded to “paid”
The upgrade takes effect immediately after successful payment. No need to restart your application.

Step 3: Verify Upgrade

curl https://api.ctgt.ai/v1/subscription/info \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Look for:
{
  "subscription_tier": "paid",
  "rate_limits": {
    "requests_per_minute": 100,
    ...
  }
}

Downgrading to Free Tier

You can downgrade to free tier at any time:
curl -X POST https://api.ctgt.ai/v1/subscription/downgrade \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Response:
{
  "message": "Successfully downgraded to free tier",
  "tier": "free"
}
Downgrading will:
  • Reduce rate limits immediately
  • Restrict access to only 3 free-tier models
  • Cancel your monthly subscription
  • Billing remains pay-as-you-go for usage

Billing & Usage

View Current Usage

Monitor your token consumption and costs in real-time:
curl https://api.ctgt.ai/v1/billing/usage \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Response:
{
  "user_id": "user_abc123",
  "email": "john@example.com",
  "period": {
    "start": "2025-12-01T00:00:00Z",
    "end": "2025-12-31T23:59:59Z"
  },
  "usage": {
    "total_requests": 1234,
    "total_tokens": 567890,
    "total_cost_usd": 12.45,
    "by_model": {
      "gemini-2.5-flash": {
        "requests": 800,
        "prompt_tokens": 120000,
        "completion_tokens": 80000,
        "cost_usd": 4.20
      },
      "claude-sonnet-4-5-20250929": {
        "requests": 434,
        "prompt_tokens": 200000,
        "completion_tokens": 167890,
        "cost_usd": 8.25
      }
    }
  }
}

View Billing History

Get historical billing data for a date range:
curl "https://api.ctgt.ai/v1/billing/history?start_date=2025-11-01&end_date=2025-12-01" \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"

Generate Invoice

Request an invoice for your usage:
curl -X POST https://api.ctgt.ai/v1/billing/generate-invoice \
  -H "Authorization: Bearer sk-ctgt-YOUR_API_KEY"
Response:
{
  "message": "Invoice generated and sent",
  "usage": {
    "total_tokens": 567890,
    "total_cost_usd": 12.45
  },
  "invoice_id": "in_abc123",
  "amount_usd": 12.45
}

Rate Limit Headers

Every API response includes rate limit information in the headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1702847523
Headers explained:
  • X-RateLimit-Limit: Maximum requests allowed in the current window
  • X-RateLimit-Remaining: Requests remaining in current window
  • X-RateLimit-Reset: Unix timestamp when the limit resets

Example: Checking Rate Limits

import requests

response = requests.post(
    "https://api.ctgt.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "gemini-2.5-flash", "messages": [...]}
)

print(f"Limit: {response.headers.get('X-RateLimit-Limit')}")
print(f"Remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Resets at: {response.headers.get('X-RateLimit-Reset')}")

Handling Rate Limits

When you exceed your rate limits: Status Code: 429 Too Many Requests Response:
{
  "detail": "Rate limit exceeded. Please try again later."
}

Best Practices

import time
import requests

def make_request_with_retry(url, headers, data, max_retries=5):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        
        if response.status_code == 429:
            wait_time = 2 ** attempt  # Exponential backoff
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
            continue
        
        return response
    
    raise Exception("Max retries exceeded")
def monitor_rate_limits(response):
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
    
    if remaining < 10:
        print(f"Warning: Only {remaining} requests remaining!")
    
    if remaining == 0:
        reset_time = int(response.headers.get('X-RateLimit-Reset', 0))
        wait_seconds = reset_time - time.time()
        print(f"Rate limit exhausted. Waiting {wait_seconds}s")
        time.sleep(wait_seconds)
from functools import lru_cache
import hashlib
import json

@lru_cache(maxsize=1000)
def cached_api_call(prompt_hash):
    # Make actual API call
    response = requests.post(...)
    return response.json()

def get_completion(prompt):
    # Create hash of prompt for caching
    prompt_hash = hashlib.md5(prompt.encode()).hexdigest()
    return cached_api_call(prompt_hash)
# Instead of 100 individual requests
for item in items:
    result = api_call(item)

# Batch into fewer requests with multiple items
for batch in chunks(items, 10):
    results = api_call_batch(batch)
Pro tip: Upgrade to paid tier for 5-100x higher rate limits if you’re consistently hitting limits.

Cost Management

Setting Budget Alerts

Monitor your spending with the usage endpoint:
import requests

def check_budget_alert(api_key, monthly_budget=100):
    url = "https://api.ctgt.ai/v1/billing/usage"
    headers = {"Authorization": f"Bearer {api_key}"}
    
    response = requests.get(url, headers=headers)
    usage = response.json()
    
    current_cost = usage['usage']['total_cost_usd']
    
    if current_cost > monthly_budget * 0.8:
        print(f"⚠️ Warning: Used ${current_cost:.2f} of ${monthly_budget} budget")
    
    if current_cost > monthly_budget:
        print(f"🚨 Alert: Budget exceeded! ${current_cost:.2f} / ${monthly_budget}")
        return False
    
    return True

Optimize Costs

Choose the Right Model

Use cheaper models for simple tasks:
  • Gemini Flash Lite: $0.30 input
  • GPT-5 Nano: $0.25 input

Control Token Limits

Set max_tokens to limit response length:
{"max_tokens": 500}

Optimize Prompts

Shorter prompts = lower costs:
  • Be concise
  • Remove unnecessary context
  • Use system prompts efficiently

Cache Common Responses

Store and reuse responses for:
  • FAQ answers
  • Common queries
  • Static content

Pricing Summary

Monthly Subscription

TierMonthly FeeWhat’s Included
Free$03 models, basic limits
Paid$10All 10 models, 100x limits

Usage Costs (Pay-as-you-go)

Both tiers pay for token usage at the same rates:
Model CategoryInput (per 1M)Output (per 1M)
Most Affordable0.250.25 - 0.500.600.60 - 2.70
Mid-Range1.201.20 - 4.005.205.20 - 14.00
Premium5.005.00 - 10.0017.0017.00 - 30.00
See the Models & Pricing page for complete pricing details.

Example Cost Scenarios

Scenario 1: Small Project (Free Tier)

Usage:
  • 500 requests/day
  • Average 100 input + 300 output tokens per request
  • Using Gemini 2.5 Flash
Monthly Cost:
  • Input: 500 × 30 × 100 tokens = 1.5M tokens = $0.75
  • Output: 500 × 30 × 300 tokens = 4.5M tokens = $12.15
  • Total: $12.90/month (usage only, no subscription)

Scenario 2: Medium Project (Paid Tier)

Usage:
  • 5,000 requests/day
  • Average 200 input + 500 output tokens per request
  • Mix of Gemini Flash and GPT-5
Monthly Cost:
  • Subscription: $10
  • Usage: ~$150-200
  • Total: $160-210/month

Scenario 3: Large Project (Paid Tier)

Usage:
  • 50,000 requests/day
  • Using advanced models (Claude Sonnet, GPT-5.2)
  • Complex queries with higher token counts
Monthly Cost:
  • Subscription: $10
  • Usage: ~$1,500-2,500
  • Total: $1,510-2,510/month
All scenarios assume normal usage patterns. Your costs may vary based on actual token consumption.

Next Steps