AssemblyAI Unveils C# .NET SDK for Advanced Audio Transcription and Analysis

AssemblyAI has released a C# .NET SDK for audio transcription and analysis, aimed at developers working in .NET languages such as C#, VB.NET, and F#. According to AssemblyAI, the SDK is intended to simplify working with the company's Speech AI models.

Key Features and Goals

The SDK has been developed with several key objectives in mind:

Provide an intuitive interface for all AssemblyAI models and features using idiomatic C#.
Ensure compatibility with multiple frameworks, including .NET 6.0, .NET Framework 4.6.2, and .NET Standard 2.0 and above.
Minimize dependencies to prevent version conflicts and the need for binding redirects.

Transcribing Audio Files

Audio transcription is one of the SDK's core functions. Developers can transcribe audio files asynchronously or in real time. The following example transcribes an audio file from a URL:

using AssemblyAI;
using AssemblyAI.Transcripts;

var client = new AssemblyAIClient("YOUR_API_KEY");

var transcript = await client.Transcripts.TranscribeAsync(new TranscriptParams
{
    AudioUrl = "https://storage.googleapis.com/aai-docs-samples/nbc.mp3"
});

transcript.EnsureStatusCompleted();

Console.WriteLine(transcript.Text);

To transcribe a local file, pass a file stream to TranscribeAsync instead of a URL. This example reuses the client created above:

await using var stream = new FileStream("./nbc.mp3", FileMode.Open);
var transcript = await client.Transcripts.TranscribeAsync(
    stream,
    new TranscriptOptionalParams
    {
        LanguageCode = TranscriptLanguageCode.EnUs
    }
);

transcript.EnsureStatusCompleted();

Console.WriteLine(transcript.Text);
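
The client in this section is created with the API key written as a string literal. A minimal sketch of one alternative is to read the key from an environment variable at startup. The helper below uses only the .NET base class library; the class name ApiKeyLoader and the variable name ASSEMBLYAI_API_KEY are assumptions made for illustration and are not defined by the SDK.

```csharp
using System;

public static class ApiKeyLoader
{
    // ASSEMBLYAI_API_KEY is an assumed name, not one the SDK requires.
    public static string Load(string variableName = "ASSEMBLYAI_API_KEY")
    {
        var key = Environment.GetEnvironmentVariable(variableName);

        // Fail early with a clear message instead of sending an empty key.
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                $"Set the {variableName} environment variable to your AssemblyAI API key.");
        }

        return key;
    }
}
```

With this in place, the client could be created as new AssemblyAIClient(ApiKeyLoader.Load()), which keeps the key out of source control.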

Real-Time Audio Transcription

The SDK also supports real-time audio transcription using Streaming Speech-to-Text. This feature is particularly useful for applications requiring immediate processing of audio data.

using AssemblyAI.Realtime;

await using var transcriber = new RealtimeTranscriber(new RealtimeTranscriberOptions
{
    ApiKey = "YOUR_API_KEY",
    SampleRate = 16_000
});

transcriber.PartialTranscriptReceived.Subscribe(transcript =>
{
    Console.WriteLine($"Partial: {transcript.Text}");
});
transcriber.FinalTranscriptReceived.Subscribe(transcript =>
{
    Console.WriteLine($"Final: {transcript.Text}");
});

await transcriber.ConnectAsync();

// Pseudocode: GetAudio is a placeholder for your own audio source (for example, a microphone)
GetAudio(async (chunk) => await transcriber.SendAudioAsync(chunk));

await transcriber.CloseAsync();
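
GetAudio in the example above is a placeholder. As one possible stand-in for testing, the sketch below reads raw 16-bit mono PCM from any Stream and passes it to a callback in chunks of roughly 100 ms. The class and method names are hypothetical, and the helper takes a generic callback so it does not depend on the exact signature of SendAudioAsync; it is assumed here that SendAudioAsync accepts a byte array.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class PcmChunker
{
    // Reads 16-bit mono PCM from `source` and passes it to `send`
    // in chunks of at most ~100 ms. Returns the number of chunks sent.
    public static async Task<int> SendInChunksAsync(
        Stream source, int sampleRate, Func<byte[], Task> send)
    {
        // 2 bytes per sample, 10 chunks per second of audio.
        int chunkSize = sampleRate * 2 / 10;
        var buffer = new byte[chunkSize];
        int chunksSent = 0;
        int read;

        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Copy the bytes so the callback never holds a buffer
            // that the next read will overwrite.
            var chunk = new byte[read];
            Array.Copy(buffer, chunk, read);
            await send(chunk);
            chunksSent++;
        }

        return chunksSent;
    }
}
```

Assuming a raw PCM file opened as a FileStream named file, the placeholder call could become await PcmChunker.SendInChunksAsync(file, 16_000, chunk => transcriber.SendAudioAsync(chunk)). A live microphone paces its own chunks, whereas a file is sent as fast as it can be read.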

Utilizing LeMUR for LLM Applications

The SDK integrates with LeMUR, which lets developers build large language model (LLM) applications on voice data. The following example asks LeMUR to summarize a transcript created earlier:

var lemurTaskParams = new LemurTaskParams
{
    Prompt = "Provide a brief summary of the transcript.",
    TranscriptIds = [transcript.Id],
    FinalModel = LemurModel.AnthropicClaude3_5_Sonnet
};

var response = await client.Lemur.TaskAsync(lemurTaskParams);

Console.WriteLine(response.Response);

Audio Intelligence Models

The SDK also has built-in support for audio intelligence models such as sentiment analysis. The following example enables sentiment analysis and prints each result:

var transcript = await client.Transcripts.TranscribeAsync(new TranscriptParams
{
    AudioUrl = "https://storage.googleapis.com/aai-docs-samples/nbc.mp3",
    SentimentAnalysis = true
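});

// Throw if the transcript did not complete, as in the earlier examples
transcript.EnsureStatusCompleted();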

foreach (var result in transcript.SentimentAnalysisResults!)
{
    Console.WriteLine(result.Text);
    Console.WriteLine(result.Sentiment); // POSITIVE, NEUTRAL, or NEGATIVE
    Console.WriteLine(result.Confidence);
    Console.WriteLine($"Timestamp: {result.Start} - {result.End}");
}
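
Beyond printing each sentence, the results can be aggregated. The sketch below counts sentences per sentiment label and skips low-confidence results. To stay self-contained it uses a hypothetical stand-in record, SentenceSentiment, rather than the SDK's own result type, and the 0.5 confidence threshold is an arbitrary assumption. Records require C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the SDK's per-sentence sentiment result.
public record SentenceSentiment(string Text, string Sentiment, double Confidence);

public static class SentimentSummary
{
    // Counts sentences per sentiment label, ignoring results whose
    // confidence falls below minConfidence.
    public static Dictionary<string, int> CountByLabel(
        IEnumerable<SentenceSentiment> results, double minConfidence = 0.5)
    {
        return results
            .Where(r => r.Confidence >= minConfidence)
            .GroupBy(r => r.Sentiment)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

To use it with the transcript above, each SDK result would first be mapped to the stand-in type, for example with Select(r => new SentenceSentiment(r.Text, r.Sentiment.ToString(), r.Confidence)), assuming those properties have compatible types.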

For more information, visit the official AssemblyAI blog.

