
Overview

Sully.ai supports transcription and note generation in over 50 languages, enabling healthcare providers to document clinical encounters in the patient’s preferred language. The API offers two modes for language handling:
Mode | Best For | Behavior
Single-Language | Known language encounters | Audio in other languages is filtered out
Multilingual | Mixed-language conversations | Automatic language detection and transcription
Generated clinical notes are produced in the same language as the transcript, ensuring consistency throughout the documentation workflow.

Supported Languages

Sully.ai supports the following languages using BCP47 language tags:
Language | BCP47 Tags
Bulgarian | bg
Catalan | ca
Chinese (Mandarin Simplified) | zh, zh-CN, zh-Hans
Chinese (Mandarin Traditional) | zh-TW, zh-Hant
Chinese (Cantonese) | zh-HK
Czech | cs
Danish | da, da-DK
Dutch | nl
English | en, en-US, en-AU, en-GB, en-NZ, en-IN
Estonian | et
Finnish | fi
Flemish | nl-BE
French | fr, fr-CA
German | de, de-CH
Greek | el
Hindi | hi
Hungarian | hu
Indonesian | id
Italian | it
Japanese | ja
Korean | ko, ko-KR
Latvian | lv
Lithuanian | lt
Malay | ms
Norwegian | no
Polish | pl
Portuguese | pt, pt-BR, pt-PT
Romanian | ro
Russian | ru
Slovak | sk
Spanish | es, es-419
Swedish | sv, sv-SE
Thai | th, th-TH
Turkish | tr
Ukrainian | uk
Vietnamese | vi
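
If you validate language input on your side before calling the API, a minimal client-side check against this table might look like the following sketch. The tag set is copied from the table above; the helper itself is illustrative and not part of the SDK:

// Illustrative check against the supported BCP47 tags listed above.
const SUPPORTED_LANGUAGE_TAGS = new Set([
  'bg', 'ca', 'zh', 'zh-CN', 'zh-Hans', 'zh-TW', 'zh-Hant', 'zh-HK', 'cs',
  'da', 'da-DK', 'nl', 'nl-BE', 'en', 'en-US', 'en-AU', 'en-GB', 'en-NZ',
  'en-IN', 'et', 'fi', 'fr', 'fr-CA', 'de', 'de-CH', 'el', 'hi', 'hu', 'id',
  'it', 'ja', 'ko', 'ko-KR', 'lv', 'lt', 'ms', 'no', 'pl', 'pt', 'pt-BR',
  'pt-PT', 'ro', 'ru', 'sk', 'es', 'es-419', 'sv', 'sv-SE', 'th', 'th-TH',
  'tr', 'uk', 'vi',
]);

// Returns true if the tag appears in the supported-languages table.
function isSupportedLanguage(tag: string): boolean {
  return SUPPORTED_LANGUAGE_TAGS.has(tag);
}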

Single-Language Mode

When you know the language of the clinical encounter in advance, specify it explicitly. This improves transcription accuracy and filters out audio in other languages.

File Upload

Specify the language when uploading audio files:
import SullyAI from '@sullyai/sullyai';
import * as fs from 'fs';

const client = new SullyAI();

// Transcribe Spanish audio
const transcription = await client.audio.transcriptions.create({
  audio: fs.createReadStream('patient-visit.mp3'),
  language: 'es',
});

console.log(`Transcription ID: ${transcription.transcriptionId}`);

Regional Variants

Use regional variants for better accuracy with region-specific accents and terminology:
// British English
const transcription = await client.audio.transcriptions.create({
  audio: fs.createReadStream('uk-patient-visit.mp3'),
  language: 'en-GB',
});

// Brazilian Portuguese
const transcriptionBR = await client.audio.transcriptions.create({
  audio: fs.createReadStream('brazil-patient-visit.mp3'),
  language: 'pt-BR',
});

// Canadian French
const transcriptionCA = await client.audio.transcriptions.create({
  audio: fs.createReadStream('quebec-patient-visit.mp3'),
  language: 'fr-CA',
});
When a specific language is set, audio in other languages will be filtered out or ignored. This is useful for ensuring clean transcripts when the encounter language is known.

Multilingual Mode

For clinical encounters where multiple languages are spoken (such as with an interpreter or bilingual patients), use multilingual mode with language=multi.

When to Use Multilingual Mode

  • Patient and provider speak different languages
  • Interpreter-assisted visits
  • Bilingual patients who switch between languages
  • Family members speaking different languages during the visit

Supported Languages for Multilingual Mode

Multilingual mode works well with the following languages:
  • Dutch
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Portuguese
  • Russian
  • Spanish

File Upload with Multilingual Mode

import SullyAI from '@sullyai/sullyai';
import * as fs from 'fs';

const client = new SullyAI();

// Transcribe a multilingual encounter
const transcription = await client.audio.transcriptions.create({
  audio: fs.createReadStream('interpreter-visit.mp3'),
  language: 'multi',
});

console.log(`Transcription ID: ${transcription.transcriptionId}`);

// Poll for completion
let result = await client.audio.transcriptions.retrieve(
  transcription.transcriptionId
);

while (result.status === 'STATUS_PROCESSING') {
  await new Promise((resolve) => setTimeout(resolve, 2000));
  result = await client.audio.transcriptions.retrieve(
    transcription.transcriptionId
  );
}

// Transcript includes all detected languages
console.log('Multilingual Transcript:', result.payload?.transcription);

Language in Streaming

When using real-time WebSocket streaming, specify the language as a URL parameter.

WebSocket URL Parameters

wss://api.sully.ai/v1/audio/transcriptions/stream?sample_rate=16000&account_id={id}&api_token={token}&language={language}
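
If you prefer not to interpolate query strings by hand, the same URL can be assembled with the standard URL API. The parameter names below are exactly those shown above; the helper itself is an illustrative sketch:

// Build the streaming WebSocket URL from the documented query parameters.
function buildStreamUrl(accountId: string, apiToken: string, language: string): string {
  const url = new URL('wss://api.sully.ai/v1/audio/transcriptions/stream');
  url.searchParams.set('sample_rate', '16000');
  url.searchParams.set('account_id', accountId);
  url.searchParams.set('api_token', apiToken);
  url.searchParams.set('language', language);
  return url.toString();
}

// Example: buildStreamUrl(accountId, token, 'es')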

Single Language Streaming

// Get streaming token first
const token = await getStreamingToken();
const accountId = process.env.SULLY_ACCOUNT_ID!;

// Connect with Spanish language
const ws = new WebSocket(
  `wss://api.sully.ai/v1/audio/transcriptions/stream?sample_rate=16000&account_id=${accountId}&api_token=${token}&language=es`
);

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.text) {
    console.log('Spanish transcript:', data.text);
  }
};

Multilingual Streaming

// Connect with multilingual mode
const ws = new WebSocket(
  `wss://api.sully.ai/v1/audio/transcriptions/stream?sample_rate=16000&account_id=${accountId}&api_token=${token}&language=multi`
);

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.text) {
    // Automatically handles multiple languages
    console.log('Transcript:', data.text);
  }
};

Best Practices

Follow these guidelines to optimize language handling in your integration:

Use Specific Languages When Known

When you know the language of the encounter in advance, always specify it explicitly rather than using multilingual mode. Single-language mode provides:
  • Better transcription accuracy
  • Faster processing
  • Reduced false positives from background noise in other languages

Use Regional Variants When Relevant

Regional variants improve accuracy for:
  • Accents: en-GB for British accents, en-AU for Australian
  • Medical terminology: Regional differences in drug names and procedures
  • Spelling conventions: en-US vs en-GB spelling in generated notes

Reserve Multilingual Mode for True Multilingual Encounters

Only use language=multi when the conversation genuinely involves multiple languages:
  • Interpreter-assisted visits
  • Bilingual patient-provider conversations
  • Family discussions involving multiple languages
Using multilingual mode when only one language is spoken may reduce transcription accuracy. Always prefer single-language mode when the encounter language is known.
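One way to keep this rule explicit in your integration is a small selection helper. The sketch below is purely illustrative; the list of expected languages would come from your own scheduling or intake data, not from the Sully.ai API:

// Illustrative helper: choose the `language` parameter for an encounter.
// `expectedLanguages` is assumed to come from your own intake data.
function pickLanguageParam(expectedLanguages: string[]): string {
  if (expectedLanguages.length === 1) {
    // One known language: use its BCP47 tag for best accuracy.
    return expectedLanguages[0];
  }
  // Genuinely mixed-language encounter: fall back to multilingual mode.
  return 'multi';
}

// pickLanguageParam(['es'])       -> 'es'
// pickLanguageParam(['en', 'es']) -> 'multi'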

Note Language Consistency

Generated clinical notes are produced in the same language as the transcript:
  • Spanish transcript produces Spanish notes
  • Multilingual transcripts produce notes in the dominant language of the conversation
  • No separate language parameter is needed for note generation
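As a sketch only, the point is simply that no language field appears in the note request. The method name and payload fields below are assumptions for illustration, not the confirmed SDK surface; consult the SDK reference for the actual call:

// Hypothetical note-generation call shape; the method name and fields are
// illustrative assumptions. The key point: no `language` parameter is passed.
const transcriptText = result.payload?.transcription ?? ''; // from the polling example above
const note = await client.notes.create({
  transcript: transcriptText,
});
// The generated note is in the same language as the transcript.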

Next Steps