βš™οΈProvider Setup

NeurosLink AI supports multiple AI providers with flexible authentication methods. This guide covers complete setup for all supported providers.

Supported Providers

  • OpenAI - GPT-4o, GPT-4o-mini, GPT-4-turbo

  • Amazon Bedrock - Claude 3.7 Sonnet, Claude 3.5 Sonnet, Claude 3 Haiku

  • Amazon SageMaker - Custom models deployed on SageMaker endpoints

  • Google Vertex AI - Gemini 2.5 Flash, Claude Sonnet 4

  • Google AI Studio - Gemini 1.5 Pro, Gemini 2.0 Flash, Gemini 1.5 Flash

  • Anthropic - Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku

  • Azure OpenAI - GPT-4, GPT-3.5-Turbo

  • LiteLLM - 100+ models from all providers via proxy server

  • Hugging Face - 100,000+ open source models including DialoGPT, GPT-2, GPT-Neo

  • Ollama - Local AI models including Llama 2, Code Llama, Mistral, Vicuna

  • Mistral AI - Mistral Tiny, Small, Medium, and Large models

💰 Model Availability & Cost Considerations

Important Notes:

  • Model Availability: Specific models may not be available in all regions or may require special access

  • Cost Variations: Pricing differs significantly between providers and models (e.g., Claude 3.5 Sonnet vs GPT-4o)

  • Rate Limits: Each provider has different rate limits and quota restrictions

  • Local vs Cloud: Ollama (local) has no per-request cost but requires hardware resources

  • Enterprise Tiers: AWS Bedrock, Google Vertex AI, and Azure typically offer enterprise pricing

Best Practices:

  • Use createBestAIProvider() for automatic cost-optimized provider selection

  • Monitor usage through built-in analytics to track costs

  • Consider local models (Ollama) for development and testing

  • Check provider documentation for current pricing and availability

🏢 Enterprise Proxy Support

All providers support corporate proxy environments automatically. Simply set environment variables:
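
A typical setup looks like this; HTTPS_PROXY and HTTP_PROXY are the standard conventions rather than NeurosLink-specific variables, and the host and port are placeholders:

```bash
# Standard corporate proxy variables (values are placeholders)
export HTTPS_PROXY="http://proxy.yourcompany.com:8080"
export HTTP_PROXY="http://proxy.yourcompany.com:8080"
export NO_PROXY="localhost,127.0.0.1"
```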

No code changes required - NeurosLink AI automatically detects and uses proxy settings.

For detailed proxy setup → See Enterprise & Proxy Setup Guide

OpenAI Configuration

Basic Setup
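
A minimal sketch; only the API key is required, and the value shown is a placeholder:

```bash
# Required: your OpenAI API key
export OPENAI_API_KEY="sk-..."
```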

Optional Configuration

Supported Models

  • gpt-4o (default) - Latest multimodal model

  • gpt-4o-mini - Cost-effective variant

  • gpt-4-turbo - High-performance model

Usage Example
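
A hedged TypeScript sketch. The import path and the shape of the generate() call are assumptions based on the createBestAIProvider() helper mentioned earlier, not a verified API:

```typescript
// Illustrative only: the package name and method signature are assumptions.
import { createBestAIProvider } from "@juspay/neurolink";

// With OPENAI_API_KEY set, OpenAI is selected as the highest-priority provider.
const provider = createBestAIProvider();

const result = await provider.generate({
  input: { text: "Write a haiku about autumn." },
});

console.log(result.content);
```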

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: OPENAI_TIMEOUT='45s' (optional)

Amazon Bedrock Configuration

🚨 Critical Setup Requirements

⚠️ IMPORTANT: Anthropic Models Require Inference Profile ARN

For Anthropic Claude models in Bedrock, you MUST use the full inference profile ARN, not simple model names:
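
For example (the BEDROCK_MODEL variable name is an assumption; the ARN shows the general inference-profile format, with placeholders for your region and account):

```bash
# ❌ A bare model name will be rejected for Anthropic models:
# BEDROCK_MODEL="claude-3-5-sonnet"

# ✅ Use the full inference profile ARN instead:
export BEDROCK_MODEL="arn:aws:bedrock:us-east-1:<account_id>:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0"
```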

Basic AWS Credentials
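
The standard AWS credential variables apply (values are placeholders):

```bash
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
```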

Session Token Support (Development)

For temporary credentials (common in development environments):
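
Add AWS_SESSION_TOKEN alongside the access keys, for example when using credentials issued by `aws sts assume-role` or AWS SSO:

```bash
export AWS_ACCESS_KEY_ID="ASIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_SESSION_TOKEN="..."
```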

Available Inference Profile ARNs

Replace <account_id> with your AWS account ID:
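
Cross-region (US) profiles generally follow this pattern; the exact profile IDs depend on which models your account has access to:

```text
arn:aws:bedrock:us-east-1:<account_id>:inference-profile/us.anthropic.claude-3-7-sonnet-20250219-v1:0
arn:aws:bedrock:us-east-1:<account_id>:inference-profile/us.anthropic.claude-3-5-sonnet-20241022-v2:0
arn:aws:bedrock:us-east-1:<account_id>:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0
```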

Why Inference Profiles?

  • Cross-Region Access: Faster access across AWS regions

  • Better Performance: Optimized routing and response times

  • Higher Availability: Improved model availability and reliability

  • Different Permissions: Separate permission model from base models

Complete Bedrock Configuration

Usage Example

Timeout Configuration

  • Default Timeout: 45 seconds (longer due to cold starts)

  • Supported Formats: Milliseconds (45000), human-readable ('45s', '1m', '2m')

  • Environment Variable: BEDROCK_TIMEOUT='1m' (optional)

Account Setup Requirements

To use AWS Bedrock, ensure your AWS account has:

  1. Bedrock Service Access: Enable Bedrock in your AWS region

  2. Model Access: Request access to Anthropic Claude models

  3. IAM Permissions: Your credentials need bedrock:InvokeModel permissions

  4. Inference Profile Access: Access to the specific inference profiles

IAM Policy Example
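
A minimal policy sketch; in production, scope Resource down to the specific model and inference-profile ARNs you use. The streaming action is an addition beyond the bedrock:InvokeModel permission named above:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
```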

Amazon SageMaker Configuration

Amazon SageMaker allows you to use your own custom models deployed on SageMaker endpoints. This provider is perfect for:

  • Custom Model Hosting - Deploy your fine-tuned models

  • Enterprise Compliance - Full control over model infrastructure

  • Cost Optimization - Pay only for inference usage

  • Performance - Dedicated compute resources

Basic AWS Credentials

SageMaker-Specific Configuration
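
Variable names below match the reference table later in this section; the endpoint name is a placeholder:

```bash
# Required: the name of your deployed SageMaker endpoint
export SAGEMAKER_DEFAULT_ENDPOINT="my-model-endpoint"

# Optional tuning
export SAGEMAKER_TIMEOUT="30000"   # request timeout in ms
export SAGEMAKER_MAX_RETRIES="3"   # retry attempts
```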

Advanced Model Configuration

Session Token Support (for IAM Roles)

Complete SageMaker Configuration

Usage Example

CLI Commands

Timeout Configuration

Configure request timeouts for SageMaker endpoints:

Prerequisites

  1. SageMaker Endpoint: Deploy a model to SageMaker and get the endpoint name

  2. AWS IAM Permissions: Ensure your credentials have sagemaker:InvokeEndpoint permission

  3. Endpoint Status: Endpoint must be in "InService" status

IAM Policy Example
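
A minimal policy sketch; replace the placeholders and narrow the Resource to your endpoint:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:us-east-1:<account_id>:endpoint/my-model-endpoint"
    }
  ]
}
```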

Environment Variables Reference

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| AWS_ACCESS_KEY_ID | ✅ | - | AWS access key |
| AWS_SECRET_ACCESS_KEY | ✅ | - | AWS secret key |
| AWS_REGION | ✅ | us-east-1 | AWS region |
| SAGEMAKER_DEFAULT_ENDPOINT | ✅ | - | SageMaker endpoint name |
| SAGEMAKER_TIMEOUT | ❌ | 30000 | Request timeout (ms) |
| SAGEMAKER_MAX_RETRIES | ❌ | 3 | Retry attempts |
| AWS_SESSION_TOKEN | ❌ | - | For temporary credentials |

📖 Complete SageMaker Guide

For comprehensive SageMaker setup, advanced features, and production deployment, see the 📖 Complete SageMaker Integration Guide, which includes:

  • Model deployment examples

  • Cost optimization strategies

  • Enterprise security patterns

  • Multi-model endpoint management

  • Performance testing and monitoring

  • Troubleshooting and debugging

Google Vertex AI Configuration

NeurosLink AI supports three authentication methods for Google Vertex AI to accommodate different deployment environments:

Method 1: Service Account File Path (Recommended for Production)

Best for production environments where you can store service account files securely.

Setup Steps:

  1. Create a service account in Google Cloud Console

  2. Download the service account JSON file

  3. Set the file path in GOOGLE_APPLICATION_CREDENTIALS
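
For example (the path is a placeholder):

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
```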

Method 2: Service Account JSON String (Good for Containers/Cloud)

Best for containerized environments where file storage is limited.

Setup Steps:

  1. Copy the entire contents of your service account JSON file

  2. Set it as a single-line string in GOOGLE_SERVICE_ACCOUNT_KEY

  3. NeurosLink AI will automatically create a temporary file for authentication
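
For example (JSON shortened here; paste the full single-line contents of your key file):

```bash
export GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"your-project","private_key":"-----BEGIN PRIVATE KEY-----\n...","client_email":"svc@your-project.iam.gserviceaccount.com"}'
```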

Method 3: Individual Environment Variables (Good for CI/CD)

Best for CI/CD pipelines where individual secrets are managed separately.

Setup Steps:

  1. Extract client_email and private_key from your service account JSON

  2. Set them as individual environment variables

  3. NeurosLink AI will automatically assemble them into a temporary service account file
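
For example (values are placeholders; keep the \n escapes inside the private key):

```bash
export GOOGLE_AUTH_CLIENT_EMAIL="svc@your-project.iam.gserviceaccount.com"
export GOOGLE_AUTH_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
```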

Authentication Detection

NeurosLink AI automatically detects and uses the best available authentication method in this order:

  1. File Path (GOOGLE_APPLICATION_CREDENTIALS) - if file exists

  2. JSON String (GOOGLE_SERVICE_ACCOUNT_KEY) - if provided

  3. Individual Variables (GOOGLE_AUTH_CLIENT_EMAIL + GOOGLE_AUTH_PRIVATE_KEY) - if both provided

Complete Vertex AI Configuration

Usage Example

Timeout Configuration

  • Default Timeout: 60 seconds (longer due to GCP initialization)

  • Supported Formats: Milliseconds (60000), human-readable ('60s', '1m', '2m')

  • Environment Variable: VERTEX_TIMEOUT='90s' (optional)

Supported Models

  • gemini-2.5-flash (default) - Fast, efficient model

  • claude-sonnet-4@20250514 - High-quality reasoning (Anthropic via Vertex AI)

Claude Sonnet 4 via Vertex AI Configuration

NeurosLink AI provides first-class support for Claude Sonnet 4 through Google Vertex AI. This configuration has been thoroughly tested and verified working.

Working Configuration Example
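
A sketch of the relevant pieces, assuming service-account-file authentication; the VERTEX_MODEL variable name is an assumption, while the model ID comes from the supported models list above:

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export VERTEX_MODEL="claude-sonnet-4@20250514"   # variable name is an assumption
```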

Performance Metrics (Verified)

  • Generation Response: ~2.6 seconds

  • Health Check: Working status detection

  • Streaming: Fully functional

  • Tool Integration: Ready for MCP tools

Usage Examples

Google Cloud Setup Requirements

To use Google Vertex AI, ensure your Google Cloud project has:

  1. Vertex AI API Enabled: Enable the Vertex AI API in your project

  2. Service Account: Create a service account with Vertex AI permissions

  3. Model Access: Ensure access to the models you want to use

  4. Billing Enabled: Vertex AI requires an active billing account

Service Account Permissions

Your service account needs these IAM roles:

  • Vertex AI User or Vertex AI Admin

  • Service Account Token Creator (if using impersonation)

Google AI Studio Configuration

Google AI Studio provides direct access to Google's Gemini models with a simple API key authentication.

Basic Setup
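
A minimal sketch; the GOOGLE_AI_API_KEY variable name is an assumption (it mirrors the GOOGLE_AI_TIMEOUT variable documented below), and the key value is a placeholder:

```bash
export GOOGLE_AI_API_KEY="AIza..."
```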

Optional Configuration

Supported Models

  • gemini-2.5-pro - Comprehensive, detailed responses for complex tasks

  • gemini-2.5-flash (recommended) - Fast, efficient responses for most tasks

Usage Example

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: GOOGLE_AI_TIMEOUT='45s' (optional)

How to Get Google AI Studio API Key

  1. Visit Google AI Studio: Go to aistudio.google.com

  2. Sign In: Use your Google account credentials

  3. Create API Key:

    • Navigate to the API Keys section

    • Click Create API Key

    • Copy the generated key (starts with AIza)

  4. Set Environment: Add to your .env file or export directly

Google AI Studio vs Vertex AI

| Feature | Google AI Studio | Google Vertex AI |
| --- | --- | --- |
| Setup Complexity | 🟢 Simple (API key only) | 🟡 Complex (service account) |
| Authentication | API key | Service account JSON |
| Free Tier | ✅ Generous free limits | ❌ Pay-per-use only |
| Enterprise Features | ❌ Limited | ✅ Full enterprise support |
| Model Selection | 🎯 Latest Gemini models | 🔄 Broader model catalog |
| Best For | Prototyping, small projects | Production, enterprise apps |

Complete Google AI Studio Configuration

Rate Limits and Quotas

Google AI Studio includes generous free tier limits:

  • Free Tier: 15 requests per minute, 1,500 requests per day

  • Paid Usage: Higher limits available with billing enabled

  • Model-Specific: Different models may have different rate limits

Error Handling for Google AI Studio

Security Considerations

  • API Key Security: Treat API keys as sensitive credentials

  • Environment Variables: Never commit API keys to version control

  • Rate Limiting: Implement client-side rate limiting for production apps

  • Monitoring: Monitor usage to avoid unexpected charges

LiteLLM Configuration

LiteLLM provides access to 100+ models through a unified proxy server, allowing you to use any AI provider through a single interface.

Prerequisites

  1. Install LiteLLM (command below)

  2. Start the LiteLLM proxy server (command below)
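
A sketch of both steps; the model and port are illustrative (4000 is LiteLLM's conventional default):

```bash
# 1. Install the proxy (Python package)
pip install 'litellm[proxy]'

# 2. Start the proxy server
litellm --model gpt-4o --port 4000
```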

Basic Setup

Optional Configuration

Supported Model Formats

LiteLLM uses the provider/model format:
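
For example:

```text
openai/gpt-4o
anthropic/claude-3-5-sonnet-20241022
bedrock/anthropic.claude-3-haiku-20240307-v1:0
ollama/llama2
```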

LiteLLM Configuration File (Optional)

Create litellm_config.yaml for advanced configuration:
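
A sketch following LiteLLM's model_list schema; the model names and keys are illustrative:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY
```

Start the proxy with `litellm --config litellm_config.yaml`.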

Usage Example

Advanced Features

  • Cost Tracking: Built-in usage and cost monitoring

  • Load Balancing: Automatic failover between providers

  • Rate Limiting: Built-in rate limiting and retry logic

  • Caching: Optional response caching for efficiency

Production Considerations

  • Deployment: Run LiteLLM proxy as a separate service

  • Security: Configure authentication for production environments

  • Scaling: Use Docker/Kubernetes for high-availability deployments

  • Monitoring: Enable logging and metrics collection

Hugging Face Configuration

Basic Setup
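
A minimal sketch; the token value is a placeholder (Hugging Face tokens start with hf_):

```bash
export HUGGINGFACE_API_KEY="hf_..."
```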

Optional Configuration

Model Selection Strategy

Hugging Face hosts 100,000+ models. Choose based on:

  • Task: text-generation, conversational, code

  • Size: Larger models = better quality but slower

  • License: Check model licenses for commercial use

Rate Limiting

  • Free tier: Limited requests

  • PRO tier: Higher limits

  • Handle 503 errors (model loading) with retry logic

Usage Example

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: HUGGINGFACE_TIMEOUT='45s' (optional)

  • Note: Model loading may take additional time on first request

Supported Models

  • microsoft/DialoGPT-medium (default) - Conversational AI

  • gpt2 - Classic GPT-2

  • distilgpt2 - Lightweight GPT-2

  • EleutherAI/gpt-neo-2.7B - Large open model

  • bigscience/bloom-560m - Multilingual model

Getting Started with Hugging Face

  1. Create Account: Visit huggingface.co

  2. Generate Token: Go to Settings → Access Tokens

  3. Create Token: Click "New token" with "read" scope

  4. Set Environment: Export token as HUGGINGFACE_API_KEY

Ollama Configuration

Local Installation Required

Ollama must be installed and running locally.

Installation Steps

  1. macOS: install via Homebrew (command below)

  2. Linux: run the official install script (command below)

  3. Windows: Download the installer from ollama.ai
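
The usual install commands for steps 1 and 2 (check ollama.ai for the current instructions):

```bash
# macOS (Homebrew)
brew install ollama

# Linux (official install script)
curl -fsSL https://ollama.ai/install.sh | sh
```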

Model Management
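
Typical commands for pulling and inspecting local models:

```bash
ollama pull llama2   # download a model
ollama list          # show installed models
ollama run llama2    # quick interactive test
```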

Privacy Benefits

  • 100% Local: No data leaves your machine

  • No API Keys: No authentication required

  • Offline Capable: Works without internet

Usage Example

Timeout Configuration

  • Default Timeout: 5 minutes (longer for local model processing)

  • Supported Formats: Milliseconds (300000), human-readable ('5m', '10m', '30m')

  • Environment Variable: OLLAMA_TIMEOUT='10m' (optional)

  • Note: Local models may need longer timeouts for complex prompts

Supported Models

  • llama2 (default) - Meta's Llama 2

  • codellama - Code-specialized Llama

  • mistral - Mistral 7B

  • vicuna - Fine-tuned Llama

  • phi - Microsoft's small model

Environment Variables

Performance Optimization

Mistral AI Configuration

Basic Setup
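
A minimal sketch; the MISTRAL_API_KEY variable name is an assumption (it mirrors the MISTRAL_TIMEOUT variable documented below), and the key value is a placeholder:

```bash
export MISTRAL_API_KEY="..."
```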

European Compliance

  • GDPR compliant

  • Data processed in Europe

  • No training on user data

Model Selection

  • mistral-tiny: Fast responses, basic tasks

  • mistral-small: Balanced choice (default)

  • mistral-medium: Complex reasoning

  • mistral-large: Maximum capability

Cost Optimization

Mistral offers competitive pricing:

  • Tiny: $0.14 / 1M tokens

  • Small: $0.60 / 1M tokens

  • Medium: $2.50 / 1M tokens

  • Large: $8.00 / 1M tokens

Usage Example

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: MISTRAL_TIMEOUT='45s' (optional)

Getting Started with Mistral AI

  1. Create Account: Visit mistral.ai

  2. Get API Key: Navigate to API Keys section

  3. Generate Key: Create new API key

  4. Add Billing: Set up payment method

Environment Variables

Multilingual Support

Mistral models excel at multilingual tasks:

  • English, French, Spanish, German, Italian

  • Code generation in multiple programming languages

  • Translation between supported languages

Anthropic Configuration

Direct access to Anthropic's Claude models without going through AWS Bedrock.

Basic Setup
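
A minimal sketch; the key value is a placeholder (Anthropic keys start with sk-ant-):

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```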

Optional Configuration

Supported Models

  • claude-3-7-sonnet-20250219 - Latest Claude 3.7 Sonnet

  • claude-3-5-sonnet-20241022 (default) - Claude 3.5 Sonnet v2

  • claude-3-opus-20240229 - Most capable model

  • claude-3-haiku-20240307 - Fastest, most cost-effective

Usage Example

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: ANTHROPIC_TIMEOUT='45s' (optional)

Getting Started with Anthropic

  1. Create Account: Visit anthropic.com

  2. Get API Key: Navigate to API Keys section

  3. Generate Key: Create new API key

  4. Set Environment: Export key as ANTHROPIC_API_KEY

Azure OpenAI Configuration

Azure OpenAI provides enterprise-grade access to OpenAI models through Microsoft Azure.

Basic Setup
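
Variable names below match the reference table later in this section; values are placeholders:

```bash
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_DEPLOYMENT_ID="your-deployment-name"
```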

Optional Configuration

Supported Models

Azure OpenAI supports deployment of:

  • gpt-4o - Latest multimodal model

  • gpt-4 - Advanced reasoning

  • gpt-4-turbo - Optimized performance

  • gpt-3.5-turbo - Cost-effective

Usage Example

Timeout Configuration

  • Default Timeout: 30 seconds

  • Supported Formats: Milliseconds (30000), human-readable ('30s', '1m', '5m')

  • Environment Variable: AZURE_TIMEOUT='45s' (optional)

Azure Setup Requirements

  1. Azure Subscription: Active Azure subscription

  2. Azure OpenAI Resource: Create Azure OpenAI resource in Azure Portal

  3. Model Deployment: Deploy a model to get deployment ID

  4. API Key: Get API key from resource's Keys and Endpoint section

Environment Variables Reference

| Variable | Required | Description |
| --- | --- | --- |
| AZURE_OPENAI_API_KEY | ✅ | Azure OpenAI API key |
| AZURE_OPENAI_ENDPOINT | ✅ | Resource endpoint URL |
| AZURE_OPENAI_DEPLOYMENT_ID | ✅ | Model deployment name |
| AZURE_OPENAI_API_VERSION | ❌ | API version (default: latest) |

OpenAI Compatible Configuration

Connect to any OpenAI-compatible API endpoint (LocalAI, vLLM, Ollama with OpenAI compatibility, etc.).

Basic Setup
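
The variable names below are hypothetical, shown only to illustrate the shape of the configuration (a base URL pointing at the server's /v1 route plus an optional key); check the Environment Variables section of your NeurosLink AI version for the exact names:

```bash
# Hypothetical variable names, for illustration only
export OPENAI_COMPATIBLE_BASE_URL="http://localhost:8000/v1"
export OPENAI_COMPATIBLE_API_KEY="optional-key"
```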

Optional Configuration

Usage Example

Compatible Servers

This works with any server implementing the OpenAI API:

  • LocalAI - Local AI server

  • vLLM - High-performance inference server

  • Ollama (with OLLAMA_OPENAI_COMPAT=1)

  • Text Generation WebUI

  • Custom inference servers

Environment Variables

Redis Configuration

Redis integration for distributed conversation memory and session state.

Basic Setup
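
The URL form is the simplest (variable names match the reference table below):

```bash
export REDIS_URL="redis://localhost:6379"

# Or as individual settings:
export REDIS_HOST="localhost"
export REDIS_PORT="6379"
```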

Optional Configuration

Advanced Configuration

Usage Example

Redis Cloud Setup

For managed Redis (Redis Cloud, AWS ElastiCache, etc.):
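
Managed services typically issue a TLS URL with a password (host and port are placeholders):

```bash
export REDIS_URL="rediss://default:your-password@your-instance.example.com:12345"
```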

Docker Redis (Development)
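
A throwaway local instance for development:

```bash
docker run -d --name neurolink-redis -p 6379:6379 redis:7
```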

Features Enabled by Redis

  • Distributed Memory: Share conversation state across instances

  • Session Persistence: Conversations survive application restarts

  • Export/Import: Export full session history as JSON

  • Multi-tenant: Isolate conversations by session ID

  • Scalability: Handle thousands of concurrent conversations

Environment Variables Reference

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| REDIS_URL | Recommended | - | Full Redis connection URL |
| REDIS_HOST | Alternative | localhost | Redis host |
| REDIS_PORT | Alternative | 6379 | Redis port |
| REDIS_PASSWORD | If auth enabled | - | Redis password |
| REDIS_DB | ❌ | 0 | Database number |
| REDIS_KEY_PREFIX | ❌ | neurolink: | Key prefix |

Environment File Template

Create a .env file in your project root:
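
A sketch covering the variables documented in this guide; keep only the providers you actually use, and replace every placeholder:

```bash
# OpenAI
OPENAI_API_KEY=sk-...

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...

# AWS (Bedrock / SageMaker)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
SAGEMAKER_DEFAULT_ENDPOINT=my-model-endpoint

# Google Vertex AI (any one of the three auth methods)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Hugging Face
HUGGINGFACE_API_KEY=hf_...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT_ID=your-deployment-name

# Redis (optional, for distributed memory)
REDIS_URL=redis://localhost:6379
```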

Provider Priority and Fallback

Automatic Provider Selection

NeurosLink AI automatically selects the best available provider:
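
A hedged sketch; the import path is an assumption, while createBestAIProvider() is the helper named earlier in this guide:

```typescript
// Illustrative import path
import { createBestAIProvider } from "@juspay/neurolink";

// Inspects configured credentials and returns the highest-priority
// provider that is ready to use (see the priority order below).
const provider = createBestAIProvider();
```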

Provider Priority Order

The default priority order (most reliable first):

  1. OpenAI - Most reliable, fastest setup

  2. Anthropic - High quality, simple setup

  3. Google AI Studio - Free tier, easy setup

  4. Azure OpenAI - Enterprise reliable

  5. Google Vertex AI - Good performance, multiple auth methods

  6. Mistral AI - European compliance, competitive pricing

  7. Hugging Face - Open source variety

  8. Amazon Bedrock - High quality, requires careful setup

  9. Ollama - Local only, no fallback

Custom Priority

Environment-Based Selection

Testing Provider Configuration

CLI Status Check

Programmatic Testing
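
A hedged smoke-test sketch; the import path and generate() shape are assumptions, but the try/catch pattern applies regardless of the exact API:

```typescript
// Illustrative only: package name and method signature are assumptions.
import { createBestAIProvider } from "@juspay/neurolink";

async function smokeTest(): Promise<void> {
  try {
    const provider = createBestAIProvider();
    const result = await provider.generate({ input: { text: "ping" } });
    console.log("Provider OK:", result.content);
  } catch (err) {
    console.error("Provider check failed:", err);
  }
}

smokeTest();
```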

Common Configuration Issues

OpenAI Issues

Solution: Set the OPENAI_API_KEY environment variable.

Bedrock Issues

Solutions:

  1. Use full inference profile ARN (not simple model name)

  2. Check AWS account has Bedrock access

  3. Verify IAM permissions include bedrock:InvokeModel

  4. Ensure model access is enabled in your AWS region

Vertex AI Issues

Solution: Install peer dependency: npm install @google-cloud/vertexai

Solutions:

  1. Verify service account JSON is valid

  2. Check project ID is correct

  3. Ensure Vertex AI API is enabled

  4. Verify service account has proper permissions

Security Best Practices

Environment Variables

  • Never commit API keys to version control

  • Use different keys for development/staging/production

  • Rotate keys regularly

  • Use minimal permissions for service accounts

AWS Security

  • Use IAM roles instead of access keys when possible

  • Enable CloudTrail for audit logging

  • Use VPC endpoints for additional security

  • Implement resource-based policies

Google Cloud Security

  • Use service account keys with minimal permissions

  • Enable audit logging

  • Use VPC Service Controls for additional isolation

  • Rotate service account keys regularly

General Security

  • Use environment-specific configurations

  • Implement rate limiting in your applications

  • Monitor usage and costs

  • Use HTTPS for all API communications


← Back to Main README | Next: API Reference →
