---
description: nango-integrations best practice rules for integration files
glob: nango-integrations/*
ruleType: always
alwaysApply: true
---

# Persona

You are a top-tier integrations engineer. You are methodical, pragmatic and systematic in how you write integration scripts. You follow best practices and look carefully at the existing patterns and coding styles in this project. You always attempt to test your work with the "dryrun" command, using a provided connection when one is given, or discovering a valid connection via the API. You always run the available commands to ensure your work compiles, lints successfully and has a valid nango.yaml.

## Configuration - nango.yaml

- If `sync_type: full`, then the sync should also have `track_deletes: true`
- If the sync requires metadata, then the sync should be set to `auto_start: false`. The metadata should be documented as an input in the nango.yaml
- Scopes should be documented
- For optional properties in models, use the `?` suffix after the property name
- Endpoints should be concise and simple, not necessarily reflecting the exact third-party API path
- Model names and endpoint paths should not be duplicated within an integration
- When adding a new integration, take care not to remove unrelated entries in the nango.yaml
- For enum values in models, do not use quotes around the values

### Endpoint Naming Guidelines

Keep endpoint definitions simple and consistent:

```yaml
# ✅ Good: Simple, clear endpoint definition
endpoint:
  method: PATCH
  path: /events
  group: Events

# ❌ Bad: Overly specific, redundant path
endpoint:
  method: PATCH
  path: /google-calendars/custom/events/{id}
  group: Events

# ✅ Good: Clear resource identification
endpoint:
  method: GET
  path: /users
  group: Users

# ❌ Bad: Redundant provider name and verbose path
endpoint:
  method: GET
  path: /salesforce/v2/users/list/all
  group: Users
```

```yaml
integrations:
  hubspot:
    syncs:
      contacts:
        runs: every 5m
        sync_type: full
        track_deletes: true
        input: ContactMetadata
        auto_start: false
        scopes:
          - crm.objects.contacts.read
        description: A super informative and helpful description that tells us what the sync does.
        endpoint:
          method: GET
          path: /contacts
          group: Contacts

models:
  ContactMetadata:
    # Required property
    name: string
    # Optional property using ? suffix
    cursor?: string
    # Optional property with union type
    # Enum values without quotes
    type?: user | admin
    status: ACTIVE | INACTIVE
    employmentType: FULL_TIME | PART_TIME | INTERN | OTHER
```
## Scripts

### General Guidelines

- Use comments to explain the logic and link to external API documentation
- Add comments with the endpoint URL above each API request
- Avoid modifying arguments and prefer returning new values

### API Endpoints and Base URLs

When constructing API endpoints, always check the official providers.yaml configuration at:
[https://github.com/NangoHQ/nango/blob/master/packages/providers/providers.yaml](https://github.com/NangoHQ/nango/blob/master/packages/providers/providers.yaml)

This file contains:

- Base URLs for each provider
- Authentication requirements
- API version information
- Common endpoint patterns
- Required headers and configurations

Example of using providers.yaml information:

```typescript
const proxyConfig: ProxyConfiguration = {
    endpoint: '/v1/endpoint', // Path that builds on the `base_url` from the providers.yaml
    retries: 3,
    headers: {
        'Content-Type': 'application/json'
    }
};
```

### Imports and Types

- Add a `types.ts` file which contains typed third-party API responses
- Types in `types.ts` should be prefixed with the integration name (e.g., `GoogleUserResponse`, `AsanaTaskResponse`) as they represent the raw API responses
- This helps avoid naming conflicts with the user-facing types defined in `nango.yaml`
- Models defined in `nango.yaml` are automatically generated into a `models.ts` file
- Always import these types from the models file instead of redefining them in your scripts
- For non-type imports (functions, classes, etc.), always include the `.js` extension:

```typescript
// ❌ Don't omit .js extension for non-type imports
import { toEmployee } from '../mappers/to-employee';

// ✅ Do include .js extension for non-type imports
import { toEmployee } from '../mappers/to-employee.js';

// ✅ Type imports don't need .js extension
import type { TaskResponse } from '../../models';
```

- Follow proper type naming and importing conventions:

```typescript
// ❌ Don't define interfaces that match nango.yaml models
interface TaskResponse {
    tasks: Task[];
}

// ✅ Do import types from the auto-generated models file
import type { TaskResponse } from '../../models';

// ❌ Don't use generic names for API response types
interface UserResponse {
    // raw API response type
}

// ✅ Do prefix API response types with the integration name
interface AsanaUserResponse {
    // raw API response type
}
```

### API Calls and Configuration

- Proxy calls should use retries:
  - Default for syncs: 10 retries
  - Default for actions: 3 retries

```typescript
const proxyConfig: ProxyConfiguration = {
    retries: 10,
    // ... other config
};
```

- Use `await nango.log` for logging (avoid `console.log`)
- Use the `params` property instead of appending params to the endpoint
- Use the built-in `nango.paginate` wherever possible:

```typescript
const proxyConfig: ProxyConfiguration = {
    endpoint,
    retries: 10,
    paginate: {
        response_path: 'comments'
    }
};

for await (const pages of nango.paginate(proxyConfig)) {
    // ... handle pages
}
```

- Always use the `ProxyConfiguration` type when setting up requests
- Add API documentation links above the endpoint property:

```typescript
const proxyConfig: ProxyConfiguration = {
    // https://www.great-api-docs.com/endpoint
    endpoint,
    retries: 10,
};
```

## Validation

- Validate script inputs and outputs using `zod`
- Validate and convert date inputs:
  - Ensure dates are valid
  - Convert to the format expected by the provider using `new Date`
  - Allow users to pass their preferred format
- Use the nango zod helper for input validation:

```typescript
const parseResult = await nango.zodValidateInput({
    zodSchema: documentInputSchema,
    input,
});
```
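A minimal sketch of what such a schema could look like, combining input validation with date normalization. The `documentInputSchema` name comes from the snippet above; the `docId` and `modifiedSince` fields and the ISO conversion are illustrative assumptions, not a prescribed shape:

```typescript
import { z } from 'zod';

// Hypothetical input schema: accepts any parseable date string and
// normalizes it to an ISO timestamp, which providers commonly expect.
const documentInputSchema = z.object({
    docId: z.string().min(1),
    modifiedSince: z
        .string()
        .refine((value) => !Number.isNaN(new Date(value).getTime()), {
            message: 'modifiedSince must be a valid date'
        })
        .transform((value) => new Date(value).toISOString())
        .optional()
});
```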
## Syncs

- `fetchData` must be the default export at the top of the file
- Always paginate requests to retrieve all records
- Avoid parallelizing requests (it defeats the retry policy and rate limiting)
- Do not wrap syncs in try-catch blocks (Nango handles error reporting)
- Use dedicated mapper functions for data transformation:
  - Place shared mappers in a `mappers` directory
  - Name files as `mappers/to-${entity}` (e.g., `mappers/to-employee.ts`)

```typescript
import { toEmployee } from '../mappers/to-employee.js';

export default async function fetchData(nango: NangoSync) {
    const proxyConfig: ProxyConfiguration = {
        endpoint: '/employees'
    };
    const { data } = await nango.get(proxyConfig);
    return toEmployee(data);
}
```

- Avoid type casting to leverage TypeScript benefits:

```typescript
// ❌ Don't use type casting
return {
    user: userResult.records[0] as HumanUser,
    userType: 'humanUser'
};

// ✅ Do use proper type checks
if (isHumanUser(userResult.records[0])) {
    return {
        user: userResult.records[0],
        userType: 'humanUser'
    };
}
```

- For incremental syncs, use `nango.lastSyncDate` (see the sketch below)
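A minimal sketch of an incremental sync using `nango.lastSyncDate`. The `/employees` endpoint, `modified_since` parameter, `employees` response path, and `Employee` model name are illustrative assumptions:

```typescript
import type { NangoSync, ProxyConfiguration } from '../../models';
import { toEmployee } from '../mappers/to-employee.js';

export default async function fetchData(nango: NangoSync): Promise<void> {
    const proxyConfig: ProxyConfiguration = {
        // Hypothetical endpoint for this sketch
        endpoint: '/employees',
        params: {
            // On the first run lastSyncDate is undefined, so fetch everything;
            // afterwards only request records modified since the last run.
            ...(nango.lastSyncDate
                ? { modified_since: nango.lastSyncDate.toISOString() }
                : {})
        },
        retries: 10,
        paginate: {
            response_path: 'employees'
        }
    };

    for await (const employees of nango.paginate(proxyConfig)) {
        await nango.batchSave(employees.map(toEmployee), 'Employee');
    }
}
```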
## Actions

- `runAction` must be the default export at the top of the file
- Only use `ActionError` for specific error messages:

```typescript
// ❌ Don't use generic Error
throw new Error('Invalid response from API');

// ✅ Do use nango.ActionError with a message
throw new nango.ActionError({
    message: 'Invalid response format from API'
});
```

- Always return objects, not arrays
- Always define API calls using a typed `ProxyConfiguration` object with retries set to 3:

```typescript
// ❌ Don't make API calls without a ProxyConfiguration
const { data } = await nango.get({
    endpoint: '/some-endpoint',
    params: { key: 'value' }
});

// ❌ Don't make API calls without setting retries for actions
const proxyConfig: ProxyConfiguration = {
    endpoint: '/some-endpoint',
    params: { key: 'value' }
};

// ✅ Do use ProxyConfiguration with retries set to 3 for actions
const proxyConfig: ProxyConfiguration = {
    endpoint: '/some-endpoint',
    params: { key: 'value' },
    retries: 3 // Default for actions is 3 retries
};
const { data } = await nango.get(proxyConfig);
```

- When implementing pagination in actions, always return a cursor-based response so users can paginate through results:

```typescript
// ✅ Define input type with optional cursor
interface ListUsersInput {
    cursor?: string;
    limit?: number;
}

// ✅ Define response type with next_cursor
interface ListUsersResponse {
    users: User[];
    next_cursor?: string; // undefined means no more results
}

// ✅ Example action implementation with pagination
export default async function runAction(
    nango: NangoAction,
    input: ListUsersInput
): Promise<ListUsersResponse> {
    const proxyConfig: ProxyConfiguration = {
        endpoint: '/users',
        params: {
            limit: input.limit || 50,
            cursor: input.cursor
        },
        retries: 3
    };

    const { data } = await nango.get(proxyConfig);
    return {
        users: data.users,
        next_cursor: data.next_cursor // Pass through the API's cursor if available
    };
}

// ❌ Don't paginate without returning a cursor
export default async function runAction(
    nango: NangoAction,
    input: ListUsersInput
): Promise<User[]> {
    // Wrong: Returns array without pagination info
    const { data } = await nango.get({
        endpoint: '/users',
        params: { cursor: input.cursor }
    });
    return data.users;
}
```

```typescript
// Complete action example:
import type { NangoAction, ProxyConfiguration, FolderContentInput, FolderContent } from '../../models';
import { folderContentInputSchema } from '../schema.zod.js';

export default async function runAction(
    nango: NangoAction,
    input: FolderContentInput
): Promise<FolderContent> {
    // Validate the input against the generated schema
    await nango.zodValidateInput({
        zodSchema: folderContentInputSchema,
        input
    });

    const proxyConfig: ProxyConfiguration = {
        // https://api.example.com/docs/endpoint
        endpoint: '/some-endpoint',
        params: { key: 'value' },
        retries: 3 // Default for actions is 3 retries
    };

    const { data } = await nango.get(proxyConfig);
    return { result: data };
}
```

## Testing

In order to test you need a valid connectionId. You can programmatically discover a valid connection by using the Node SDK. Here's a complete example of finding Salesforce connections:

1. First, create a script (e.g., `find-connections.js`):

```javascript
import { Nango } from '@nangohq/node';
import * as dotenv from 'dotenv';

// Load environment variables from .env file
dotenv.config();

function findNangoSecretKey() {
    // Get all environment variables
    const envVars = process.env;

    // Find all NANGO_SECRET_KEY variables
    const nangoKeys = Object.entries(envVars)
        .filter(([key]) => key.startsWith('NANGO_SECRET_KEY'))
        .sort(([keyA], [keyB]) => {
            // Sort by specificity (env-specific keys first)
            const isEnvKeyA = keyA !== 'NANGO_SECRET_KEY';
            const isEnvKeyB = keyB !== 'NANGO_SECRET_KEY';
            if (isEnvKeyA && !isEnvKeyB) return -1;
            if (!isEnvKeyA && isEnvKeyB) return 1;
            return keyA.localeCompare(keyB);
        });

    if (nangoKeys.length === 0) {
        throw new Error('No NANGO_SECRET_KEY environment variables found');
    }

    // Use the first key after sorting
    const [key, value] = nangoKeys[0];
    console.log(`Using secret key: ${key}`);
    return value;
}

function isValidConnection(connection) {
    // Connection is valid if:
    // 1. No errors array exists, or
    // 2. Errors array is empty, or
    // 3. No errors with type "auth" exist
    if (!connection.errors) return true;
    if (connection.errors.length === 0) return true;
    return !connection.errors.some(error => error.type === 'auth');
}

async function findConnections(providerConfigKey) {
    const secretKey = findNangoSecretKey();
    const nango = new Nango({ secretKey });

    // List all connections
    const { connections } = await nango.listConnections();

    // Filter for the specific provider config key and valid connections
    const validConnections = connections.filter(conn =>
        conn.provider_config_key === providerConfigKey && isValidConnection(conn)
    );

    if (validConnections.length === 0) {
        console.log(`No valid connections found for integration: ${providerConfigKey}`);
        return;
    }

    console.log(`Found ${validConnections.length} valid connection(s) for integration ${providerConfigKey}:`);
    validConnections.forEach(conn => {
        console.log(`- Connection ID: ${conn.connection_id}`);
        console.log(`  Provider: ${conn.provider}`);
        console.log(`  Created: ${conn.created}`);
        if (conn.errors?.length > 0) {
            console.log(`  Non-auth Errors: ${conn.errors.length}`);
        }
        console.log('---');
    });
}

// Find connections for the salesforce integration
findConnections('salesforce').catch(console.error);
```
2. Make sure your `.env` file contains at least one secret key:

```env
# Environment-specific keys take precedence
NANGO_SECRET_KEY_DEV=your_dev_secret_key_here
NANGO_SECRET_KEY_STAGING=your_staging_secret_key_here

# Fallback key
NANGO_SECRET_KEY=your_default_secret_key_here
```

3. Run the script:

```bash
node find-connections.js
```

Example output for the salesforce integration:

```
Using secret key: NANGO_SECRET_KEY_DEV
Found 1 valid connection(s) for integration salesforce:
- Connection ID: 3374a138-a81c-4ff9-b2ed-466c86b3554d
  Provider: salesforce
  Created: 2025-02-18T08:41:24.156+00:00
  Non-auth Errors: 1
---
```

Each connection in the response includes:

- `connection_id`: The unique identifier you'll use for testing (e.g., "3374a138-a81c-4ff9-b2ed-466c86b3554d")
- `provider`: The API provider (e.g., 'salesforce')
- `provider_config_key`: The integration ID you searched for (e.g., 'salesforce')
- `created`: Timestamp of when the connection was created
- `end_user`: Information about the end user if available
- `errors`: Any sync or auth errors associated with the connection (connections with auth errors are filtered out)
- `metadata`: Additional metadata specific to the provider (like field mappings)

## Script Best Practices Checklist

- [ ] `nango.paginate` is used to paginate over responses in a sync
- [ ] If an action can have a paginated response, it returns a `cursor` so the user can paginate over the action response

## Integration Directory Structure

Your integration should follow this directory structure for consistency and maintainability:

```
nango-integrations/
├── nango.yaml                # Main configuration file
├── models.ts                 # Auto-generated models from nango.yaml
├── schema.zod.ts             # Generated zod schemas for validation
└── ${integrationName}/
    ├── types.ts              # Third-party API response types
    ├── actions/              # Directory for action implementations
    │   ├── create-user.ts
    │   ├── update-user.ts
    │   └── delete-user.ts
    ├── syncs/                # Directory for sync implementations
    │   ├── users.ts
    │   └── teams.ts
    └── mappers/              # Shared data transformation functions
        ├── to-user.ts
        └── to-team.ts
```

### Key Components

1. **Root Level Files**:
   - `nango.yaml`: Main configuration file for all integrations
   - `models.ts`: Auto-generated models from nango.yaml. If this doesn't exist or you have updated the `nango.yaml`, be sure to run `npx nango generate`
   - `schema.zod.ts`: Generated validation schemas

2. **Integration Level Files**:
   - `types.ts`: Third-party API response types specific to the integration

3. **Actions Directory**:
   - One file per action
   - Named after the action (e.g., `create-user.ts`, `update-user.ts`)
   - Each file exports a default `runAction` function

4. **Syncs Directory**:
   - One file per sync
   - Named after the sync (e.g., `users.ts`, `teams.ts`)
   - Each file exports a default `fetchData` function
5. **Mappers Directory**:
   - Shared data transformation functions
   - Named with pattern `to-${entity}.ts`
   - Used by both actions and syncs

### Running Tests

Test scripts directly against the third-party API using dryrun:

```bash
npx nango dryrun ${scriptName} ${connectionId} --integration-id ${INTEGRATION} --auto-confirm
```

Example:

```bash
npx nango dryrun settings g --integration-id google-calendar --auto-confirm
```

### Dryrun Options

- `--auto-confirm`: Skip prompts and show all output

```bash
npx nango dryrun settings g --auto-confirm --integration-id google-calendar
```

## Script Helpers

- `npx nango dryrun ${scriptName} ${connectionId} -e ${optionalEnvironment} --integration-id ${INTEGRATION}`
- `npx nango compile` -- ensure all integrations compile
- `npx nango generate` -- when adding an integration or updating the nango.yaml, run this command to update the models.ts file and the auto-generated schema files
- `npx nango sync:config.check` -- ensure the nango.yaml is valid and could compile successfully

## Deploying Integrations

Once your integration is complete and tested, you can deploy it using the Nango CLI:

```bash
npx nango deploy <environment>
```

### Deployment Options

- `--auto-confirm`: Skip all confirmation prompts
- `--debug`: Run CLI in debug mode with verbose logging
- `-v, --version [version]`: Tag this deployment with a version (useful for rollbacks)
- `-s, --sync [syncName]`: Deploy only a specific sync
- `-a, --action [actionName]`: Deploy only a specific action
- `-i, --integration [integrationId]`: Deploy all scripts for a specific integration
- `--allow-destructive`: Allow destructive changes without confirmation (use with caution)

### Examples

Deploy everything to production:

```bash
npx nango deploy production
```

Deploy a specific sync to staging:

```bash
npx nango deploy staging -s contacts
```

Deploy an integration with a version tag:

```bash
npx nango deploy production -i salesforce -v 1.0.0
```

Deploy with auto-confirmation:

```bash
npx nango deploy staging --auto-confirm
```

## Full Example of a Sync and Action in Nango

Here's a complete example of a GitHub integration that syncs pull requests and has an action to create a pull request:

`nango-integrations/nango.yaml`:
```yaml
integrations:
  github:
    syncs:
      pull-requests:
        runs: every hour
        description: |
          Get all pull requests from a Github repository.
        sync_type: incremental
        endpoint:
          method: GET
          path: /pull-requests
          group: Pull Requests
        input: GithubMetadata
        output: PullRequest
        auto_start: false
        scopes:
          - repo
          - repo:status
    actions:
      create-pull-request:
        description: Create a new pull request
        endpoint:
          method: POST
          path: /pull-requests
          group: Pull Requests
        input: CreatePullRequest
        output: PullRequest
        scopes:
          - repo
          - repo:status

models:
  GithubMetadata:
    owner: string
    repo: string
  CreatePullRequest:
    owner: string
    repo: string
    title: string
    head: string
    base: string
    body?: string
  PullRequest:
    id: number
    number: number
    title: string
    state: string
    body?: string
    created_at: string
    updated_at: string
    closed_at?: string
    merged_at?: string
    head:
      ref: string
      sha: string
    base:
      ref: string
      sha: string
```

`nango-integrations/github/types.ts`:

```typescript
export interface GithubPullRequestResponse {
    id: number;
    number: number;
    title: string;
    state: string;
    body: string | null;
    created_at: string;
    updated_at: string;
    closed_at: string | null;
    merged_at: string | null;
    head: {
        ref: string;
        sha: string;
    };
    base: {
        ref: string;
        sha: string;
    };
}
```

`nango-integrations/github/mappers/to-pull-request.ts`:

```typescript
import type { PullRequest } from '../../models';
import type { GithubPullRequestResponse } from '../types';

export function toPullRequest(response: GithubPullRequestResponse): PullRequest {
    return {
        id: response.id,
        number: response.number,
        title: response.title,
        state: response.state,
        body: response.body || undefined,
        created_at: response.created_at,
        updated_at: response.updated_at,
        closed_at: response.closed_at || undefined,
        merged_at: response.merged_at || undefined,
        head: {
            ref: response.head.ref,
            sha: response.head.sha
        },
        base: {
            ref: response.base.ref,
            sha: response.base.sha
        }
    };
}
```

`nango-integrations/github/syncs/pull-requests.ts`:

```typescript
import type { NangoSync, ProxyConfiguration, GithubMetadata } from '../../models';
import type { GithubPullRequestResponse } from '../types';
import { toPullRequest } from '../mappers/to-pull-request.js';

export default async function fetchData(
    nango: NangoSync
): Promise<void> {
    // Get metadata containing repository information
    const metadata = await nango.getMetadata<GithubMetadata>();

    const proxyConfig: ProxyConfiguration = {
        // https://docs.github.com/en/rest/pulls/pulls#list-pull-requests
        endpoint: `/repos/${metadata.owner}/${metadata.repo}/pulls`,
        params: {
            state: 'all',
            sort: 'updated',
            direction: 'desc'
        },
        retries: 10
    };

    // Use paginate to handle GitHub's pagination
    for await (const pullRequests of nango.paginate<GithubPullRequestResponse>(proxyConfig)) {
        const mappedPRs = pullRequests.map(toPullRequest);
        await nango.batchSave(mappedPRs, 'PullRequest');
    }
}
```

`nango-integrations/github/actions/create-pull-request.ts`:

```typescript
import type { NangoAction, ProxyConfiguration, PullRequest, CreatePullRequest } from '../../models';
import type { GithubPullRequestResponse } from '../types';
import { toPullRequest } from '../mappers/to-pull-request.js';

export default async function runAction(
    nango: NangoAction,
    input: CreatePullRequest
): Promise<PullRequest> {
    // https://docs.github.com/en/rest/pulls/pulls#create-a-pull-request
    const proxyConfig: ProxyConfiguration = {
        endpoint: `/repos/${input.owner}/${input.repo}/pulls`,
        data: {
            title: input.title,
            head: input.head,
            base: input.base,
            body: input.body
        },
        retries: 3
    };

    const { data } = await nango.post<GithubPullRequestResponse>(proxyConfig);
    return toPullRequest(data);
}
```
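With a valid connection (see the Testing section above), the sync could then be exercised end to end with dryrun; the connection id here is a placeholder, and the integration id of `github` is an assumption:

```bash
npx nango dryrun pull-requests <connectionId> --integration-id github --auto-confirm
```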
This example demonstrates:

1. A well-structured `nango.yaml` with models, sync, and action definitions
2. Proper type definitions for the GitHub API responses
3. A reusable mapper function for data transformation
4. An incremental sync that handles pagination and uses `getMetadata()`
5. An action that creates new pull requests
6. Following all best practices for file organization and code structure

# Advanced Integration Script Patterns

This guide covers advanced patterns for implementing different types of Nango integration syncs. Each pattern addresses specific use cases and requirements you might encounter when building integrations.

## Table of Contents

1. [Configuration Based Sync](#configuration-based-sync)
2. [Selection Based Sync](#selection-based-sync)
3. [Window Time Based Sync](#window-time-based-sync)
4. [Action Leveraging Sync Responses](#action-leveraging-sync-responses)
5. [24 Hour Extended Sync](#24-hour-extended-sync)

## Configuration Based Sync

### Overview

A configuration-based sync allows customization of the sync behavior through metadata provided in the nango.yaml file. This pattern is useful when you need to:

- Configure specific fields to sync
- Set custom endpoints or parameters
- Define filtering rules

### Key Characteristics

- Uses metadata in nango.yaml for configuration
- Allows runtime customization of sync behavior
- Supports flexible data mapping
- Can handle provider-specific requirements

### Implementation Notes

This pattern leverages metadata to define a dynamic schema that drives the sync. The implementation typically consists of two parts:

1. An action to fetch available fields using the provider's introspection endpoint
2. A sync that uses the configured fields to fetch data

Example configuration in `nango.yaml`:

```yaml
integrations:
  salesforce:
    syncs:
      configuration-based-sync:
        sync_type: full
        track_deletes: true
        endpoint: GET /dynamic
        description: Fetch all fields of a dynamic model
        input: DynamicFieldMetadata
        auto_start: false
        runs: every 1h
        output: OutputData

models:
  DynamicFieldMetadata:
    configurations: Configuration[]
  Configuration:
    model: string
    fields: Field[]
  Field:
    id: string
    name: string
    type: string
  OutputData:
    id: string
    model: string
    data:
      __string: any
```

Example field introspection action:

```typescript
export default async function runAction(
    nango: NangoAction,
    input: Entity,
): Promise<GetSchemaResponse> {
    const entity = input.name;

    // Query the API's introspection endpoint
    const response = await nango.get({
        endpoint: `/services/data/v51.0/sobjects/${entity}/describe`,
    });

    // ... process and return field schema
}
```
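To give a sense of the elided mapping step, here is a hedged sketch of how the Salesforce `describe` response could be mapped onto the `Field` model; the `SalesforceDescribeResponse` and `GetSchemaResponse` shapes are illustrative assumptions, not the canonical implementation:

```typescript
// Hypothetical raw response type for the describe endpoint (assumption)
interface SalesforceDescribeResponse {
    fields: Array<{ name: string; label: string; type: string }>;
}

// Assumed action output shape, mirroring the Field model from nango.yaml
interface GetSchemaResponse {
    fields: { id: string; name: string; type: string }[];
}

function toSchema(data: SalesforceDescribeResponse): GetSchemaResponse {
    return {
        fields: data.fields.map((field) => ({
            id: field.name, // the field's API name doubles as the id in this sketch
            name: field.label,
            type: field.type
        }))
    };
}
```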
Example sync implementation:

```typescript
import type { NangoSync, ProxyConfiguration, DynamicFieldMetadata, OutputData } from '../../models';

const SF_VERSION = 'v59.0';

export default async function fetchData(nango: NangoSync): Promise<void> {
    const metadata = await nango.getMetadata<DynamicFieldMetadata>();

    // Process each model configuration
    for (const config of metadata?.configurations ?? []) {
        const { model, fields } = config;

        // Construct SOQL query with field selection
        const fieldNames = fields.map(f => f.name).join(',');
        const soqlQuery = `SELECT ${fieldNames} FROM ${model}`;

        // Query Salesforce API using SOQL
        const proxyConfig: ProxyConfiguration = {
            endpoint: `/services/data/${SF_VERSION}/query`,
            params: { q: soqlQuery },
            retries: 10
        };
        const response = await nango.get(proxyConfig);

        // Map response to OutputData format and save
        const mappedData: OutputData[] = response.data.records.map(record => ({
            id: record.Id,
            model: model,
            data: fields.reduce((acc, field) => {
                acc[field.name] = record[field.name];
                return acc;
            }, {} as Record<string, any>)
        }));

        // Save the batch of records
        await nango.batchSave(mappedData, 'OutputData');
    }
}
```

Key implementation aspects:

- Uses metadata to drive the API queries
- Dynamically constructs field selections
- Supports multiple models from the third-party API in a single sync
- Maps responses to a consistent output format
- Requires a complementary action for field introspection
- Supports flexible schema configuration through nango.yaml

## Selection Based Sync

### Overview

A selection-based sync pattern allows users to specify exactly which resources to sync through metadata. This pattern is useful when you need to:

- Sync specific files or folders rather than an entire dataset
- Allow users to control the sync scope dynamically
- Handle nested resources efficiently
- Optimize performance by limiting the sync scope

### Key Characteristics

- Uses metadata to define sync targets
- Supports multiple selection types (e.g., files and folders)
- Handles nested resources recursively
- Processes data in batches
- Maintains clear error boundaries

### Visual Representation

```mermaid
graph TD
    A[Start] --> B[Load Metadata]
    B --> C[Process Folders]
    B --> D[Process Files]
    C --> E[List Contents]
    E --> F{Is File?}
    F -->|Yes| G[Add to Batch]
    F -->|No| E
    D --> G
    G --> H[Save Batch]
    H --> I[End]
```

### Implementation Example

Here's how this pattern is implemented in a Box files sync:

```yaml
# nango.yaml configuration
files:
  description: Sync files from specific folders or individual files
  input: BoxMetadata
  auto_start: false
  sync_type: full
  track_deletes: true

models:
  BoxMetadata:
    files: string[]
  folders: string[]
  BoxDocument:
    id: string
    name: string
    modified_at: string
    download_url: string
```
```typescript
import type { NangoSync, ProxyConfiguration, BoxMetadata, BoxDocument } from '../../models';

export default async function fetchData(nango: NangoSync) {
    const metadata = await nango.getMetadata<BoxMetadata>();
    const files = metadata?.files ?? [];
    const folders = metadata?.folders ?? [];
    const batchSize = 100;

    if (files.length === 0 && folders.length === 0) {
        throw new Error('Metadata for files or folders is required.');
    }

    // Process folders first
    for (const folder of folders) {
        await fetchFolder(nango, folder);
    }

    // Then process individual files
    let batch: BoxDocument[] = [];
    for (const file of files) {
        const fileMetadata = await getFileMetadata(nango, file);
        batch.push({
            id: fileMetadata.id,
            name: fileMetadata.name,
            modified_at: fileMetadata.modified_at,
            download_url: fileMetadata.shared_link?.download_url
        });
        if (batch.length >= batchSize) {
            await nango.batchSave(batch, 'BoxDocument');
            batch = [];
        }
    }
    if (batch.length > 0) {
        await nango.batchSave(batch, 'BoxDocument');
    }
}

async function fetchFolder(nango: NangoSync, folderId: string) {
    const proxy: ProxyConfiguration = {
        endpoint: `/2.0/folders/${folderId}/items`,
        params: {
            fields: 'id,name,modified_at,shared_link'
        },
        retries: 10,
        paginate: {
            type: 'cursor',
            response_path: 'entries'
        }
    };

    let batch: BoxDocument[] = [];
    const batchSize = 100;

    for await (const items of nango.paginate(proxy)) {
        for (const item of items) {
            if (item.type === 'folder') {
                await fetchFolder(nango, item.id);
            }
            if (item.type === 'file') {
                batch.push({
                    id: item.id,
                    name: item.name,
                    modified_at: item.modified_at,
                    download_url: item.shared_link?.download_url
                });
                if (batch.length >= batchSize) {
                    await nango.batchSave(batch, 'BoxDocument');
                    batch = [];
                }
            }
        }
    }

    if (batch.length > 0) {
        await nango.batchSave(batch, 'BoxDocument');
    }
}
```
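`getFileMetadata` is referenced above but not defined. Here is a minimal sketch under the assumption that it wraps Box's single-file endpoint; the `BoxFileResponse` type mirrors only the fields used above and is an assumption, not Box's full schema:

```typescript
// Hypothetical helper (sketch): fetch a single Box file's metadata
interface BoxFileResponse {
    id: string;
    name: string;
    modified_at: string;
    shared_link?: { download_url: string };
}

async function getFileMetadata(nango: NangoSync, fileId: string): Promise<BoxFileResponse> {
    const proxy: ProxyConfiguration = {
        // https://developer.box.com/reference/get-files-id/
        endpoint: `/2.0/files/${fileId}`,
        params: {
            fields: 'id,name,modified_at,shared_link'
        },
        retries: 10
    };

    const { data } = await nango.get<BoxFileResponse>(proxy);
    return data;
}
```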
### Best Practices

1. **Simple Metadata Structure**: Keep the selection criteria simple and clear
2. **Batch Processing**: Save data in batches for better performance
3. **Clear Resource Types**: Handle different resource types (files/folders) separately
4. **Error Boundaries**: Handle errors at the item level to prevent full sync failure
5. **Progress Logging**: Add debug logs for monitoring progress

### Common Pitfalls

1. Not validating metadata inputs
2. Missing batch size limits
3. Not handling API rate limits
4. Poor error handling for individual items
5. Missing progress tracking logs

## Window Time Based Sync

### Overview

A window time based sync pattern is designed to efficiently process large datasets by breaking the sync into discrete, time-bounded windows (e.g., monthly or weekly). This approach is essential when:

- The third-party API or dataset is too large to fetch in a single request or run.
- You want to avoid timeouts, memory issues, or API rate limits.
- You need to ensure incremental, resumable progress across large time ranges.

This pattern is especially useful for financial or transactional data, where records are naturally grouped by time periods.

### Key Characteristics

- Divides the sync into time windows (e.g., months).
- Iterates over each window, fetching and processing data in batches.
- Uses metadata to track progress and allow for resumable syncs.
- Handles both initial full syncs and incremental updates.
- Supports batching and pagination within each window.

### Visual Representation

```mermaid
graph TD
    A[Start] --> B[Load Metadata]
    B --> C{More Windows?}
    C -->|Yes| D[Set Window Start/End]
    D --> E[Build Query for Window]
    E --> F[Get Count]
    F --> G[Batch Fetch & Save]
    G --> H[Update Metadata]
    H --> C
    C -->|No| I[Check for Incremental]
    I -->|Yes| J[Fetch Since Last Sync]
    J --> K[Batch Fetch & Save]
    K --> L[Done]
    I -->|No| L
```

### Implementation Example

Here's a simplified example of the window time based sync pattern, focusing on the window selection and iteration logic (`calculateDateRange` is sketched after this block):

```typescript
export default async function fetchData(nango: NangoSync): Promise<void> {
    // 1. Load metadata and determine the overall date range
    const metadata = await nango.getMetadata<{ fromDate?: string; toDate?: string; useMetadata?: boolean }>();
    const lookBackPeriodInYears = 5;
    const { startDate, endDate } = calculateDateRange(metadata, lookBackPeriodInYears);

    let currentStartDate = new Date(startDate);

    // 2. Iterate over each time window (e.g., month)
    while (currentStartDate < endDate) {
        let currentEndDate = new Date(currentStartDate);
        currentEndDate.setMonth(currentEndDate.getMonth() + 1);
        currentEndDate.setDate(1);
        if (currentEndDate > endDate) {
            currentEndDate = new Date(endDate);
        }

        // 3. Fetch and process data for the current window
        const data = await fetchDataForWindow(currentStartDate, currentEndDate);
        await processAndSaveData(data);

        // 4. Update metadata to track progress
        await nango.updateMetadata({
            fromDate: currentEndDate.toISOString().split("T")[0],
            toDate: endDate.toISOString().split("T")[0],
            useMetadata: currentEndDate < endDate,
        });

        currentStartDate = new Date(currentEndDate.getTime());
        if (currentStartDate >= endDate) {
            await nango.updateMetadata({
                fromDate: endDate.toISOString().split("T")[0],
                toDate: endDate.toISOString().split("T")[0],
                useMetadata: false,
            });
            break;
        }
    }

    // 5. Optionally, handle incremental updates after the full windowed sync
    if (!metadata?.useMetadata) {
        // ... (incremental sync logic)
    }
}

async function fetchDataForWindow(start: Date, end: Date) {
    // Implement provider-specific logic to fetch data for the window
    return [];
}

async function processAndSaveData(data: any[]) {
    // Implement logic to process and save data
}
```
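A minimal sketch of what `calculateDateRange` could look like; the metadata fields (`fromDate`, `toDate`, `useMetadata`) follow the shape used in the loop above, but the exact semantics are an assumption:

```typescript
// Hypothetical helper (sketch): derive the window range either from resumable
// metadata saved by a previous run, or from a fixed look-back period.
function calculateDateRange(
    metadata: { fromDate?: string; toDate?: string; useMetadata?: boolean } | null,
    lookBackPeriodInYears: number
): { startDate: Date; endDate: Date } {
    const endDate = metadata?.toDate ? new Date(metadata.toDate) : new Date();

    // Resume from the last completed window if a previous run saved progress
    if (metadata?.useMetadata && metadata.fromDate) {
        return { startDate: new Date(metadata.fromDate), endDate };
    }

    // Otherwise start a fresh full sync covering the look-back period
    const startDate = new Date(endDate);
    startDate.setFullYear(startDate.getFullYear() - lookBackPeriodInYears);
    return { startDate, endDate };
}
```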
**Key implementation aspects:**

- **Windowing:** The sync iterates over each month (or other time window), building queries and fetching data for just that period.
- **Batching:** Large result sets are fetched in batches (e.g., 100,000 records at a time) within each window.
- **Metadata:** Progress is tracked in metadata, allowing the sync to resume from the last completed window if interrupted.
- **Incremental:** After the full windowed sync, the script can switch to incremental mode, fetching only records modified since the last sync.
- **Error Handling:** Each window and batch is processed independently, reducing the risk of a single failure stopping the entire sync.

### Best Practices

1. **Choose an appropriate window size** (e.g., month, week) based on data volume and API limits.
2. **Track progress in metadata** to support resumability and avoid duplicate processing.
3. **Batch large queries** to avoid memory and timeout issues.
4. **Log progress** for observability and debugging.
5. **Handle incremental updates** after the initial full sync.

### Common Pitfalls

1. Not updating metadata after each window, risking duplicate or missed data.
2. Using too large a window size, leading to timeouts or API errors.
3. Not handling incremental syncs after the initial windowed sync.
4. Failing to batch large result sets, causing memory issues.
5. Not validating or handling edge cases in date calculations.

## Action Leveraging Sync Responses

### Overview

An "Action Leveraging Sync Responses" pattern allows actions to efficiently return data that has already been fetched and saved by a sync, rather than always querying the third-party API. This approach is useful when:

- The data needed by the action is already available from a previous sync.
- You want to minimize API calls, reduce latency, and improve reliability.
- You want to provide a fast, consistent user experience even if the third-party API is slow or unavailable.

This pattern is especially valuable for actions that need to return lists of entities (e.g., users, projects, items) that are already available from a sync.

### Key Characteristics

- Uses previously fetched or synced data when available.
- Falls back to a live API call only if no data is available.
- Transforms data as needed before returning.
- Returns a consistent, typed response.

### Visual Representation

```mermaid
graph TD
    A[Action Called] --> B[Check for Synced Data]
    B -->|Data Found| C[Return Synced Data]
    B -->|No Data| D[Fetch from API]
    D --> E[Transform/Return API Data]
```

### Implementation Example

Here's a generic example of this pattern:

```typescript
/**
 * Fetch all entities for an action, preferring previously synced data.
 * 1) Try using previously synced data (Entity).
 * 2) If none found, fall back to fetching from the API.
 * 3) Return transformed entities.
 */
export default async function runAction(nango: NangoAction) {
    const syncedEntities: Entity[] = await getSyncedEntities(nango);

    if (syncedEntities.length > 0) {
        return {
            entities: syncedEntities.map(({ id, name, ...rest }) => ({
                id,
                name,
                ...rest,
            })),
        };
    }

    // Fallback: fetch from API (not shown; see the sketch below)
    return { entities: [] };
}

async function getSyncedEntities(nango: NangoAction): Promise<Entity[]> {
    // Implement logic to retrieve entities from previously synced data
    return [];
}
```
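As a hedged illustration of the fallback branch, the live API call could look like the following; the `/entities` endpoint, the `EntityResponse` type, and the field mapping are assumptions for this sketch:

```typescript
// Hypothetical raw API response type (assumption)
interface EntityResponse {
    id: string;
    name: string;
}

async function fetchEntitiesFromApi(nango: NangoAction): Promise<Entity[]> {
    const proxyConfig: ProxyConfiguration = {
        // Illustrative endpoint, not a real provider path
        endpoint: '/entities',
        retries: 3 // Default for actions is 3 retries
    };

    const { data } = await nango.get<EntityResponse[]>(proxyConfig);
    // Map raw responses onto the Entity model from nango.yaml
    return data.map((entity) => ({ id: entity.id, name: entity.name }));
}
```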
**Key implementation aspects:**

- **Synced data first:** The action first attempts to use data that was previously fetched by a sync.
- **Fallback:** If no records are found, it can fall back to a live API call.
- **Transformation:** The action transforms the data as needed before returning.
- **Consistent response:** Always returns a consistent, typed response, even if no data is found.

### Best Practices

1. **Prefer previously synced data** to minimize API calls and improve performance.
2. **Handle empty or special cases** gracefully.
3. **Return a consistent response shape** regardless of data source.
4. **Document fallback logic** for maintainability.
5. **Keep transformation logic simple and clear.**

### Common Pitfalls

1. Not keeping synced data up to date, leading to stale or missing data.
2. Failing to handle the case where no data is available from sync or API.
3. Returning inconsistent response shapes.
4. Not transforming data as needed.
5. Overcomplicating fallback logic.

## 24 Hour Extended Sync

### Overview

A 24-hour extended sync pattern is designed to handle large datasets that cannot be processed within a single sync run due to Nango's 24-hour script execution limit. This pattern is essential when:

- Your sync needs to process more data than can be handled within 24 hours
- You need to handle API rate limits while staying within the execution limit
- You're dealing with very large historical datasets
- You need to ensure data consistency across multiple sync runs

### Why This Pattern?

Nango enforces a 24-hour limit on script execution time for several reasons:

- To prevent runaway scripts that could impact system resources
- To ensure fair resource allocation across all integrations
- To maintain system stability and predictability
- To encourage efficient data processing patterns

When your sync might exceed this limit, you need to:

1. Break down the sync into manageable chunks
2. Track progress using metadata
3. Resume from where the last run stopped
4. Ensure data consistency across runs

### Visual Representation

```mermaid
graph TD
    A[Start Sync] --> B{Has Metadata?}
    B -->|No| C[Initialize]
    B -->|Yes| D[Resume]
    C --> E[Process Batch]
    D --> E
    E --> F{Check Status}
    F -->|Time Left| E
    F -->|24h Limit| G[Save Progress]
    F -->|Complete| H[Reset State]
    G --> I[End Sync]
    H --> I
```

### Key Characteristics

- Uses cursor-based pagination with metadata persistence
- Implements time-remaining checks
- Gracefully handles the 24-hour limit
- Maintains sync state across multiple runs
- Supports automatic resume functionality
- Ensures data consistency between runs

### Implementation Notes

This pattern uses metadata to track sync progress and implements time-aware cursor-based pagination. Here's a typical implementation:

```typescript
interface SyncCursor {
    currentStartTime: Date | null;
    lastProcessedId: string | null;
    totalProcessed: number;
}

interface DataBatchResponse {
    data: any[];
    lastId: string;
    isLastPage: boolean;
}

export default async function fetchData(nango: NangoSync): Promise<void> {
    const START_TIME = Date.now();
    const MAX_RUNTIME_MS = 23.5 * 60 * 60 * 1000; // 23.5 hours in milliseconds

    // Get or initialize sync metadata
    let metadata = await nango.getMetadata<SyncCursor>();

    // Initialize sync window if first run
    if (!metadata?.currentStartTime) {
        await nango.updateMetadata({
            currentStartTime: new Date(),
            lastProcessedId: null,
            totalProcessed: 0
        });
        metadata = await nango.getMetadata<SyncCursor>();
    }
    if (!metadata) {
        throw new Error('Failed to initialize sync metadata');
    }

    let shouldContinue = true;
    while (shouldContinue) {
        // Check if we're approaching the 24h limit
        const timeElapsed = Date.now() - START_TIME;
        if (timeElapsed >= MAX_RUNTIME_MS) {
            // Save progress and exit gracefully
            await nango.log('Approaching 24h limit, saving progress and exiting');
            return;
        }

        // Fetch and process data batch
        const response = await fetchDataBatch(nango, metadata.lastProcessedId);
        await processAndSaveData(response.data);

        // Update progress
        await nango.updateMetadata({
            lastProcessedId: response.lastId,
            totalProcessed: metadata.totalProcessed + response.data.length
        });

        // Check if we're done
        if (response.isLastPage) {
            // Reset metadata for a fresh start
            await nango.updateMetadata({
                currentStartTime: null,
                lastProcessedId: null,
                totalProcessed: 0
            });
            shouldContinue = false;
        }
    }
}

async function fetchDataBatch(nango: NangoSync, lastId: string | null): Promise<DataBatchResponse> {
    const config: ProxyConfiguration = {
        endpoint: '/data',
        params: {
            ...(lastId ? { after: lastId } : {}),
            limit: 100
        },
        retries: 10
    };
    const { data } = await nango.get<DataBatchResponse>(config);
    return data;
}
```

Key implementation aspects:

- Tracks elapsed time to respect the 24-hour limit
- Maintains detailed progress metadata
- Implements cursor-based pagination
- Provides automatic resume capability
- Ensures data consistency across runs
- Handles rate limits and data volume constraints

### Best Practices

1. Leave buffer time (e.g., stop at 23.5 hours) to ensure a clean exit
2. Save progress frequently
3. Use efficient batch sizes
4. Implement proper error handling
5. Log progress for monitoring
6. Test resume functionality thoroughly

### Common Pitfalls

1. Not accounting for API rate limits in time calculations
2. Insufficient progress tracking
3. Not handling edge cases in resume logic
4. Inefficient batch sizes
5. Poor error handling
6. Incomplete metadata management