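Complete Guide to Building an AI Voice Transcription App: A Modern Voice Memo System with Whisper + Together.ai

⏱️ Estimated reading time: 20 minutes

Introduction

As voice technology becomes part of everyday life, demand is growing quickly for applications that convert speech to text and process the result with AI. In this tutorial we build a voice transcription app that can be used for meeting minutes, lecture notes, idea capture, and similar tasks.

Using Nutlope's Whisper app as a reference, we build the transcription system step by step with Together.ai's Whisper model and current web technologies.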

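Project Overview

Core Features

Audio file upload: supports multiple audio formats
AI transcription: uses the Together.ai Whisper model
AI text transformation: summarization, extraction, analysis, and more
User authentication: secure sign-in through Clerk
Dashboard: manage and search transcription history

Tech Stack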

tech_stack:
  frontend:
    - framework: "Next.js 14 App Router"
    - styling: "Tailwind CSS"
    - ui_components: "shadcn/ui"
    - state_management: "React Query"
  
  backend:
    - api: "Next.js API Routes"
    - orm: "Prisma"
    - database: "Neon PostgreSQL"
    - file_storage: "AWS S3"
  
  ai_services:
    - transcription: "Together.ai Whisper"
    - llm_framework: "Vercel AI SDK"
    - text_processing: "Together.ai LLM"
  
  infrastructure:
    - hosting: "Vercel"
    - authentication: "Clerk"
    - rate_limiting: "Upstash Redis"
    - monitoring: "Vercel Analytics"

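Project Setup

1. Environment Setup

# Create the project
npx create-next-app@latest whisper-ai-app --typescript --tailwind --eslint --app
cd whisper-ai-app

# Install required packages
npm install @clerk/nextjs @prisma/client prisma
npm install @aws-sdk/client-s3 @aws-sdk/s3-request-presigner
npm install @upstash/redis ai @ai-sdk/openai
npm install react-dropzone uuid date-fns
npm install @radix-ui/react-icons lucide-react
npm install class-variance-authority clsx tailwind-merge
# The @/components/ui/* imports used later come from shadcn/ui, which is added through its own CLI

# Dev dependencies
npm install -D @types/node

2. Environment Variables

# Create the .env.local file (the quoted 'EOF' prevents shell expansion inside the heredoc)
cat > .env.local << 'EOF'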
# Database
DATABASE_URL="postgresql://username:password@host:5432/whisper_db"

# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY="pk_test_..."
CLERK_SECRET_KEY="sk_test_..."
NEXT_PUBLIC_CLERK_SIGN_IN_URL="/sign-in"
NEXT_PUBLIC_CLERK_SIGN_UP_URL="/sign-up"

# Together.ai
TOGETHER_API_KEY="your_together_api_key"

# AWS S3
AWS_ACCESS_KEY_ID="your_aws_access_key"
AWS_SECRET_ACCESS_KEY="your_aws_secret_key"
AWS_REGION="us-east-1"
AWS_S3_BUCKET_NAME="whisper-audio-files"

# Upstash Redis
UPSTASH_REDIS_REST_URL="https://your-redis.upstash.io"
UPSTASH_REDIS_REST_TOKEN="your_redis_token"

# App
NEXT_PUBLIC_APP_URL="http://localhost:3000"
EOF
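
A missing variable in the file above only shows up when the first request that needs it fails. As a minimal sketch, a startup check could report every missing name at once. The helper names `findMissingEnv` and `assertEnv` are hypothetical and not part of the original project; the variable list mirrors the server-side entries in `.env.local`.

```typescript
// Hypothetical helper (not part of the original tutorial): fail fast when a
// required server-side environment variable is missing or blank.
const REQUIRED_ENV_VARS = [
  'DATABASE_URL',
  'CLERK_SECRET_KEY',
  'TOGETHER_API_KEY',
  'AWS_ACCESS_KEY_ID',
  'AWS_SECRET_ACCESS_KEY',
  'AWS_REGION',
  'AWS_S3_BUCKET_NAME',
  'UPSTASH_REDIS_REST_URL',
  'UPSTASH_REDIS_REST_TOKEN',
] as const

type EnvSource = Record<string, string | undefined>

// Returns the names of required variables that are unset or contain only whitespace.
function findMissingEnv(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim())
}

// Throws a single error that lists every missing variable.
function assertEnv(env: EnvSource): void {
  const missing = findMissingEnv(env)
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
}
```

Calling `assertEnv(process.env)` once from a server-only module would surface configuration mistakes at boot rather than in the middle of an upload.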

Database Design

Prisma Schema

// prisma/schema.prisma
generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model User {
  id        String   @id @default(cuid())
  clerkId   String   @unique
  email     String   @unique
  name      String?
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt

  transcriptions Transcription[]

  @@map("users")
}

model Transcription {
  id          String            @id @default(cuid())
  title       String
  audioUrl    String
  originalText String?
  duration    Int? // seconds
  status      TranscriptionStatus @default(PENDING)
  createdAt   DateTime          @default(now())
  updatedAt   DateTime          @updatedAt

  userId      String
  user        User              @relation(fields: [userId], references: [id], onDelete: Cascade)
  
  transformations Transformation[]

  @@map("transcriptions")
}

model Transformation {
  id            String            @id @default(cuid())
  type          TransformationType
  prompt        String
  result        String
  createdAt     DateTime          @default(now())

  transcriptionId String
  transcription   Transcription     @relation(fields: [transcriptionId], references: [id], onDelete: Cascade)

  @@map("transformations")
}

enum TranscriptionStatus {
  PENDING
  PROCESSING
  COMPLETED
  FAILED
}

enum TransformationType {
  SUMMARY
  BULLET_POINTS
  ACTION_ITEMS
  KEYWORDS
  CUSTOM
}
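
The `TranscriptionStatus` enum implies a lifecycle, but the schema itself does not enforce the order of states. The sketch below shows one way to guard status updates in application code. The transition table is an assumption (for example, that a `FAILED` job may be retried), and `canTransition` is a hypothetical helper, not part of the original project.

```typescript
// Mirrors the TranscriptionStatus enum from the Prisma schema.
type TranscriptionStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED'

// Assumed lifecycle: COMPLETED is terminal, FAILED may be retried.
const ALLOWED_TRANSITIONS: Record<TranscriptionStatus, readonly TranscriptionStatus[]> = {
  PENDING: ['PROCESSING', 'FAILED'],
  PROCESSING: ['COMPLETED', 'FAILED'],
  COMPLETED: [],
  FAILED: ['PROCESSING'],
}

// Returns true when moving from `from` to `to` is allowed by the table above.
function canTransition(from: TranscriptionStatus, to: TranscriptionStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1
}
```

Checking `canTransition` before each `prisma.transcription.update` would keep a late background job from overwriting a record that has already reached a terminal state.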

Database Initialization

# Generate the Prisma client and push the schema to the database
npx prisma generate
npx prisma db push

# Inspect the database in development
npx prisma studio

Core Component Implementation

1. Clerk Authentication Setup

// app/layout.tsx
import { ClerkProvider } from '@clerk/nextjs'
import { Inter } from 'next/font/google'
import './globals.css'

const inter = Inter({ subsets: ['latin'] })

export default function RootLayout({
  children,
}: {
  children: React.ReactNode
}) {
  return (
    <ClerkProvider>
      <html lang="ko">
        <body className={inter.className}>
          {children}
        </body>
      </html>
    </ClerkProvider>
  )
}

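// middleware.ts
// Note: authMiddleware is the Clerk v4 API. Newer @clerk/nextjs releases replace it with
// clerkMiddleware, so pin the package version or adapt this file if the import fails.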
import { authMiddleware } from '@clerk/nextjs'

export default authMiddleware({
  publicRoutes: ['/'],
  ignoredRoutes: ['/api/webhook'],
})

export const config = {
  matcher: ['/((?!.+\\.[\\w]+$|_next).*)', '/', '/(api|trpc)(.*)'],
}

2. Audio File Upload Component

// components/audio-upload.tsx
'use client'

import { useState, useCallback } from 'react'
import { useDropzone } from 'react-dropzone'
import { Upload, Music, AlertCircle } from 'lucide-react'
import { Button } from '@/components/ui/button'
import { Card, CardContent } from '@/components/ui/card'
import { Progress } from '@/components/ui/progress'

interface AudioUploadProps {
  onUploadComplete: (audioUrl: string, fileName: string) => void
  isUploading: boolean
}

export function AudioUpload({ onUploadComplete, isUploading }: AudioUploadProps) {
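  // Note: fetch() does not report upload progress, so this value stays at 0 unless the
  // upload is switched to XMLHttpRequest (xhr.upload.onprogress).
  const [uploadProgress, setUploadProgress] = useState(0)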
  const [error, setError] = useState<string | null>(null)

  const onDrop = useCallback(async (acceptedFiles: File[]) => {
    const file = acceptedFiles[0]
    if (!file) return

    setError(null)
    setUploadProgress(0)

    try {
      // Request a presigned URL from the server
      const response = await fetch('/api/upload/presigned-url', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          fileName: file.name,
          fileType: file.type,
        }),
      })

      if (!response.ok) {
        throw new Error('Failed to get a presigned URL')
      }

      const { presignedUrl, audioUrl } = await response.json()

      // Upload the file directly to S3
      const uploadResponse = await fetch(presignedUrl, {
        method: 'PUT',
        body: file,
        headers: {
          'Content-Type': file.type,
        },
      })

      if (uploadResponse.ok) {
        onUploadComplete(audioUrl, file.name)
      } else {
        throw new Error('업로드 실패')
      }
    } catch (err) {
      setError('파일 업로드 중 오류가 발생했습니다.')
      console.error('Upload error:', err)
    }
  }, [onUploadComplete])

  const { getRootProps, getInputProps, isDragActive } = useDropzone({
    onDrop,
    accept: {
      'audio/*': ['.mp3', '.wav', '.m4a', '.aac', '.ogg', '.flac']
    },
    maxSize: 100 * 1024 * 1024, // 100MB
    disabled: isUploading,
  })

  return (
    <Card className="w-full max-w-2xl mx-auto">
      <CardContent className="p-6">
        <div
          {...getRootProps()}
          className={`
            border-2 border-dashed rounded-lg p-8 text-center transition-colors cursor-pointer
            ${isDragActive ? 'border-blue-400 bg-blue-50' : 'border-gray-300'}
            ${isUploading ? 'opacity-50 cursor-not-allowed' : 'hover:border-gray-400'}
          `}
        >
          <input {...getInputProps()} />
          
          {isUploading ? (
            <div className="space-y-4">
              <Music className="mx-auto h-12 w-12 text-blue-500 animate-pulse" />
              <div>
                <p className="text-lg font-medium">업로드 중...</p>
                <Progress value={uploadProgress} className="mt-2" />
              </div>
            </div>
          ) : (
            <div className="space-y-4">
              <Upload className="mx-auto h-12 w-12 text-gray-400" />
              <div>
                <p className="text-lg font-medium">
                  {isDragActive ? '파일을 놓아주세요' : '오디오 파일을 드래그하거나 클릭하세요'}
                </p>
                <p className="text-sm text-gray-500 mt-1">
                  MP3, WAV, M4A, AAC, OGG, FLAC (최대 100MB)
                </p>
              </div>
            </div>
          )}
        </div>

        {error && (
          <div className="mt-4 flex items-center gap-2 text-red-600">
            <AlertCircle className="h-4 w-4" />
            <span className="text-sm">{error}</span>
          </div>
        )}
      </CardContent>
    </Card>
  )
}
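
The component above returns silently when react-dropzone rejects a file, for example because it exceeds `maxSize`. react-dropzone reports such files through its `onDropRejected` callback with error codes like `file-too-large` and `file-invalid-type`. The sketch below maps those codes to a message that could be passed to `setError`. `describeRejection` and the simplified `RejectionLike` type are hypothetical names introduced for illustration.

```typescript
// Simplified shape of the entries react-dropzone passes to onDropRejected.
interface RejectionLike {
  file: { name: string }
  errors: ReadonlyArray<{ code: string; message: string }>
}

// Turns the first rejection error into a user-facing message.
function describeRejection(rejection: RejectionLike): string {
  const name = rejection.file.name
  const first = rejection.errors[0]
  switch (first?.code) {
    case 'file-too-large':
      return `${name} is larger than the 100MB limit.`
    case 'file-invalid-type':
      return `${name} is not a supported audio format.`
    default:
      // Fall back to the library's own message when the code is not handled here.
      return first?.message ?? `${name} could not be accepted.`
  }
}
```

In the component this could be wired up as `onDropRejected: (rejections) => setError(describeRejection(rejections[0]))` inside the `useDropzone` options.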

3. S3 Upload API

// app/api/upload/presigned-url/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'
import { auth } from '@clerk/nextjs'
import { v4 as uuidv4 } from 'uuid'

const s3Client = new S3Client({
  region: process.env.AWS_REGION!,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
  },
})

export async function POST(request: NextRequest) {
  try {
    const { userId } = auth()
    if (!userId) {
      return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
    }

    const { fileName, fileType } = await request.json()
    
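    // Build a unique object key (note: the extension comes from the client and is not validated here)
    const fileExtension = fileName.split('.').pop()
    const uniqueFileName = `${userId}/${uuidv4()}.${fileExtension}`

    // Create a presigned PUT URL that is valid for 5 minutes
    const command = new PutObjectCommand({
      Bucket: process.env.AWS_S3_BUCKET_NAME!,
      Key: uniqueFileName,
      ContentType: fileType,
    })

    const presignedUrl = await getSignedUrl(s3Client, command, { expiresIn: 300 })
    // Plain object URL. An external service can only fetch it if the bucket allows public
    // reads; with a private bucket, hand out a presigned GET URL instead.
    const audioUrl = `https://${process.env.AWS_S3_BUCKET_NAME}.s3.${process.env.AWS_REGION}.amazonaws.com/${uniqueFileName}`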

    return NextResponse.json({ presignedUrl, audioUrl })
  } catch (error) {
    console.error('Presigned URL generation error:', error)
    return NextResponse.json(
      { error: 'Failed to generate upload URL' },
      { status: 500 }
    )
  }
}
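
The route above takes the file extension from the client-supplied name without checking it. As a hedged sketch, the key could be built by a small function that only accepts the extensions the upload component advertises. `buildObjectKey` is a hypothetical helper, and the allowlist is an assumption taken from the dropzone `accept` list.

```typescript
// Extensions accepted by the upload component (assumed to be the full allowlist).
const ALLOWED_EXTENSIONS = new Set(['mp3', 'wav', 'm4a', 'aac', 'ogg', 'flac'])

// Builds "<userId>/<uniqueId>.<ext>" or returns null when the extension is not allowed.
// uniqueId is supplied by the caller (for example uuidv4()) so the function stays pure.
function buildObjectKey(userId: string, fileName: string, uniqueId: string): string | null {
  const dot = fileName.lastIndexOf('.')
  const extension = dot > 0 ? fileName.slice(dot + 1).toLowerCase() : ''
  if (!ALLOWED_EXTENSIONS.has(extension)) {
    return null
  }
  return `${userId}/${uniqueId}.${extension}`
}
```

The route handler could respond with status 400 when the function returns `null`, which keeps path fragments and unexpected file types out of the bucket.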

AI Transcription Implementation

1. Together.ai Whisper Transcription API

// app/api/transcribe/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { auth } from '@clerk/nextjs'
import { prisma } from '@/lib/db'
import { rateLimiter } from '@/lib/rate-limiter'

export async function POST(request: NextRequest) {
  try {
    const { userId } = auth()
    if (!userId) {
      return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
    }

    // Rate limiting
    const isAllowed = await rateLimiter.check(userId)
    if (!isAllowed) {
      return NextResponse.json(
        { error: 'Rate limit exceeded' },
        { status: 429 }
      )
    }

    const { audioUrl, title } = await request.json()

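    // Create the transcription record
    const transcription = await prisma.transcription.create({
      data: {
        title,
        audioUrl,
        status: 'PROCESSING',
        userId: await getUserId(userId),
      },
    })

    // Start transcription without awaiting it. On serverless hosts the function may be
    // suspended once the response is sent, so a queue or background job is safer in production.
    processTranscription(transcription.id, audioUrl).catch((err) =>
      console.error('Background transcription failed:', err)
    )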

    return NextResponse.json({ transcriptionId: transcription.id })
  } catch (error) {
    console.error('Transcription error:', error)
    return NextResponse.json(
      { error: 'Failed to start transcription' },
      { status: 500 }
    )
  }
}

async function processTranscription(transcriptionId: string, audioUrl: string) {
  try {
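    // Call the Together.ai Whisper API.
    // Verify the request shape against the Together.ai API reference: OpenAI-compatible
    // transcription endpoints usually expect multipart/form-data rather than JSON, and the
    // model may need its full ID (for example 'openai/whisper-large-v3').
    const response = await fetch('https://api.together.xyz/v1/audio/transcriptions', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.TOGETHER_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        file: audioUrl,
        model: 'whisper-large-v3',
        language: 'ko', // transcribe Korean audio
        response_format: 'verbose_json',
      }),
    })

    const result = await response.json()

    if (response.ok) {
      // Save the transcription result
      await prisma.transcription.update({
        where: { id: transcriptionId },
        data: {
          originalText: result.text,
          // duration is an Int column, so round the fractional seconds from the API
          duration: typeof result.duration === 'number' ? Math.round(result.duration) : null,
          status: 'COMPLETED',
        },
      })
    } else {
      throw new Error(result.error?.message || 'Transcription failed')
    }
  } catch (error) {
    console.error('Transcription processing error:', error)
    
    // Mark the transcription as failed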
    await prisma.transcription.update({
      where: { id: transcriptionId },
      data: { status: 'FAILED' },
    })
  }
}

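async function getUserId(clerkId: string): Promise<string> {
  // upsert avoids a race when two requests arrive for a new user at the same time
  const user = await prisma.user.upsert({
    where: { clerkId },
    update: {},
    create: {
      clerkId,
      // email is @unique, so an empty string would collide as soon as a second user signs up.
      // This placeholder is keyed by clerkId; replace it with the real address from Clerk.
      email: `${clerkId}@users.invalid`,
    },
  })

  return user.id
}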
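
The Prisma schema declares `duration` as `Int`, while Whisper-style `verbose_json` responses report the duration in fractional seconds, and an error payload may contain no `text` at all. The sketch below validates the response before it is written to the database. `parseTranscriptionResult` is a hypothetical helper, and the field names `text` and `duration` are the ones the route above already reads.

```typescript
interface ParsedTranscription {
  text: string
  durationSeconds: number | null
}

// Validates an unknown JSON payload and normalises it for the Transcription model.
function parseTranscriptionResult(payload: unknown): ParsedTranscription {
  if (typeof payload !== 'object' || payload === null) {
    throw new Error('Unexpected transcription response')
  }
  const { text, duration } = payload as { text?: unknown; duration?: unknown }
  if (typeof text !== 'string') {
    throw new Error('Transcription response has no text field')
  }
  // Round to whole seconds because the duration column is an Int.
  const durationSeconds =
    typeof duration === 'number' && Number.isFinite(duration) ? Math.round(duration) : null
  return { text: text.trim(), durationSeconds }
}
```

Inside `processTranscription`, the parsed object could be used in place of reading `result.text` and `result.duration` directly, so a malformed response ends in the `FAILED` branch instead of a database error.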

2. Rate Limiting Implementation

// lib/rate-limiter.ts
import { Redis } from '@upstash/redis'

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_REST_URL!,
  token: process.env.UPSTASH_REDIS_REST_TOKEN!,
})

export class RateLimiter {
  private limit: number
  private window: number

  constructor(limit: number = 10, windowInSeconds: number = 3600) {
    this.limit = limit
    this.window = windowInSeconds
  }

  async check(userId: string): Promise<boolean> {
    const key = `rate_limit:${userId}`
    const now = Math.floor(Date.now() / 1000)
    const windowStart = now - this.window

    try {
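      // Count the requests in the current window.
      // Note: count-then-add is not atomic, so concurrent requests can slightly exceed the limit.
      const requests = await redis.zcount(key, windowStart, now)
      
      if (requests >= this.limit) {
        return false
      }

      // Record the new request
      await redis.zadd(key, { score: now, member: `${now}-${Math.random()}` })
      
      // Remove expired entries
      await redis.zremrangebyscore(key, 0, windowStart)
      
      // Refresh the TTL so idle keys expire on their own
      await redis.expire(key, this.window)

      return true
    } catch (error) {
      console.error('Rate limiter error:', error)
      return true // fail open: allow the request if Redis is unavailable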
    }
  }
}

export const rateLimiter = new RateLimiter(10, 3600) // 10 requests per hour
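
The Redis version stores one timestamp per request in a sorted set and counts the entries that fall inside the window. The same sliding-window logic can be sketched in memory, which is convenient for unit tests or local development without Upstash. `InMemorySlidingWindow` is a hypothetical class, and the injectable `now` clock is an assumption added to make the behaviour easy to test.

```typescript
// In-memory counterpart of the Redis sorted-set limiter (single process only).
class InMemorySlidingWindow {
  private readonly hits = new Map<string, number[]>()
  private readonly limit: number
  private readonly windowMs: number
  private readonly now: () => number

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true and records the request when the key is under its limit.
  check(key: string): boolean {
    const current = this.now()
    const cutoff = current - this.windowMs
    // Keep only the timestamps that are still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((timestamp) => timestamp > cutoff)
    const allowed = recent.length < this.limit
    if (allowed) {
      recent.push(current)
    }
    this.hits.set(key, recent)
    return allowed
  }
}
```

Unlike the Redis implementation, this state is lost on restart and is not shared between serverless instances, so it is a sketch for tests rather than a production replacement.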

AI Text Transformation

1. Text Transformation API

// app/api/transform/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { auth } from '@clerk/nextjs'
import { prisma } from '@/lib/db'
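import { createOpenAI } from '@ai-sdk/openai'
import { generateText } from 'ai'

// Together.ai exposes an OpenAI-compatible endpoint, so the OpenAI provider can be
// pointed at it and reuse TOGETHER_API_KEY (no separate OpenAI key is configured).
const together = createOpenAI({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: process.env.TOGETHER_API_KEY,
})

const TRANSFORMATION_PROMPTS = {
  SUMMARY: '다음 텍스트를 간결하고 핵심적인 내용으로 요약해주세요.',
  BULLET_POINTS: '다음 텍스트의 주요 포인트들을 불릿 포인트 형태로 정리해주세요.',
  ACTION_ITEMS: '다음 텍스트에서 실행해야 할 액션 아이템들을 추출해주세요.',
  KEYWORDS: '다음 텍스트의 핵심 키워드들을 추출해주세요.',
}

export async function POST(request: NextRequest) {
  try {
    const { userId } = auth()
    if (!userId) {
      return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
    }

    const { transcriptionId, type, customPrompt } = await request.json()

    // Load the transcription, scoped to the signed-in user
    const transcription = await prisma.transcription.findFirst({
      where: {
        id: transcriptionId,
        user: { clerkId: userId },
      },
    })

    if (!transcription || !transcription.originalText) {
      return NextResponse.json(
        { error: 'Transcription not found' },
        { status: 404 }
      )
    }

    // Build the prompt: the user's text for CUSTOM, a preset for every other type
    const systemPrompt =
      type === 'CUSTOM'
        ? (typeof customPrompt === 'string' ? customPrompt.trim() : '')
        : TRANSFORMATION_PROMPTS[type as keyof typeof TRANSFORMATION_PROMPTS]

    if (!systemPrompt) {
      return NextResponse.json(
        { error: 'Unknown transformation type or empty prompt' },
        { status: 400 }
      )
    }

    const fullPrompt = `${systemPrompt}\n\n텍스트:\n${transcription.originalText}`

    // Run the AI text transformation
    const { text } = await generateText({
      // Example model ID (an assumption); choose one from the Together.ai model list
      model: together.chat('meta-llama/Llama-3.3-70B-Instruct-Turbo'),
      prompt: fullPrompt,
      maxTokens: 1000,
    })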

    // Save the transformation result
    const transformation = await prisma.transformation.create({
      data: {
        type: type || 'CUSTOM',
        prompt: systemPrompt,
        result: text,
        transcriptionId,
      },
    })

    return NextResponse.json({ 
      id: transformation.id,
      result: text 
    })
  } catch (error) {
    console.error('Transformation error:', error)
    return NextResponse.json(
      { error: 'Failed to transform text' },
      { status: 500 }
    )
  }
}
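
Prompt construction is the part of this route that is easiest to get subtly wrong: an unknown type or an empty custom prompt should be rejected, and very long transcripts may exceed the model's context window. The sketch below separates that logic into two pure functions. `resolveInstruction`, `buildPrompt`, and the default `maxChars` of 12000 are hypothetical and chosen only for illustration.

```typescript
// Picks the instruction text: the user's prompt for CUSTOM, otherwise a preset.
// Returns null when the type is unknown or the custom prompt is blank.
function resolveInstruction(
  type: string,
  customPrompt: string | undefined,
  presets: Record<string, string>
): string | null {
  if (type === 'CUSTOM') {
    const trimmed = customPrompt?.trim() ?? ''
    return trimmed.length > 0 ? trimmed : null
  }
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(presets, type)) {
    return null
  }
  return presets[type] ?? null
}

// Joins instruction and transcript, truncating the transcript to maxChars characters.
function buildPrompt(instruction: string, transcript: string, maxChars = 12000): string {
  const body = transcript.length > maxChars ? transcript.slice(0, maxChars) : transcript
  return `${instruction}\n\nText:\n${body}`
}
```

Character-based truncation is a rough stand-in for token counting. Splitting the transcript and summarizing each part would be the more faithful approach for long recordings.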

2. Text Transformation Component

// components/text-transformer.tsx
'use client'

import { useState } from 'react'
import { Button } from '@/components/ui/button'
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
import { Textarea } from '@/components/ui/textarea'
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select'
import { Loader2, Wand2 } from 'lucide-react'

interface TextTransformerProps {
  transcriptionId: string
  originalText: string
}

export function TextTransformer({ transcriptionId, originalText }: TextTransformerProps) {
  const [transformationType, setTransformationType] = useState<string>('')
  const [customPrompt, setCustomPrompt] = useState('')
  const [result, setResult] = useState<string>('')
  const [isTransforming, setIsTransforming] = useState(false)

  const transformationOptions = [
    { value: 'SUMMARY', label: '요약' },
    { value: 'BULLET_POINTS', label: '불릿 포인트' },
    { value: 'ACTION_ITEMS', label: '액션 아이템' },
    { value: 'KEYWORDS', label: '키워드 추출' },
    { value: 'CUSTOM', label: '커스텀' },
  ]

  const handleTransform = async () => {
    if (!transformationType) return

    setIsTransforming(true)
    try {
      const response = await fetch('/api/transform', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          transcriptionId,
          type: transformationType,
          customPrompt: transformationType === 'CUSTOM' ? customPrompt : undefined,
        }),
      })

      const data = await response.json()
      if (!response.ok) {
        throw new Error(data.error || 'Transformation failed')
      }
      setResult(data.result)
    } catch (error) {
      console.error('Transformation error:', error)
    } finally {
      setIsTransforming(false)
    }
  }

  return (
    <div className="space-y-6">
      <Card>
        <CardHeader>
          <CardTitle className="flex items-center gap-2">
            <Wand2 className="h-5 w-5" />
            AI 텍스트 변환
          </CardTitle>
        </CardHeader>
        <CardContent className="space-y-4">
          <div className="flex gap-4">
            <Select value={transformationType} onValueChange={setTransformationType}>
              <SelectTrigger className="flex-1">
                <SelectValue placeholder="변환 유형 선택" />
              </SelectTrigger>
              <SelectContent>
                {transformationOptions.map((option) => (
                  <SelectItem key={option.value} value={option.value}>
                    {option.label}
                  </SelectItem>
                ))}
              </SelectContent>
            </Select>

            <Button 
              onClick={handleTransform} 
              disabled={!transformationType || isTransforming}
            >
              {isTransforming ? (
                <Loader2 className="h-4 w-4 animate-spin" />
              ) : (
                '변환'
              )}
            </Button>
          </div>

          {transformationType === 'CUSTOM' && (
            <Textarea
              placeholder="커스텀 프롬프트를 입력하세요..."
              value={customPrompt}
              onChange={(e) => setCustomPrompt(e.target.value)}
              rows={3}
            />
          )}
        </CardContent>
      </Card>

      {result && (
        <Card>
          <CardHeader>
            <CardTitle>변환 결과</CardTitle>
          </CardHeader>
          <CardContent>
            <div className="prose max-w-none">
              <pre className="whitespace-pre-wrap font-sans">{result}</pre>
            </div>
          </CardContent>
        </Card>
      )}
    </div>
  )
}

Dashboard Implementation

1. Transcription List Page

// app/dashboard/page.tsx
import { auth } from '@clerk/nextjs'
import { redirect } from 'next/navigation'
import { prisma } from '@/lib/db'
import { TranscriptionList } from '@/components/transcription-list'
import { AudioUpload } from '@/components/audio-upload'

async function getTranscriptions(userId: string) {
  return await prisma.transcription.findMany({
    where: {
      user: { clerkId: userId },
    },
    orderBy: { createdAt: 'desc' },
    include: {
      transformations: true,
    },
  })
}

export default async function DashboardPage() {
  const { userId } = auth()
  
  if (!userId) {
    redirect('/sign-in')
  }

  const transcriptions = await getTranscriptions(userId)

  return (
    <div className="container mx-auto py-8 space-y-8">
      <div className="text-center space-y-4">
        <h1 className="text-3xl font-bold">음성 전사 대시보드</h1>
        <p className="text-gray-600">
          음성 파일을 업로드하고 AI로 전사 및 변환하세요
        </p>
      </div>

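      {/* AudioUpload is a client component with required props (onUploadComplete, isUploading).
          Functions cannot be passed from a server component, so render it through a small
          client wrapper that supplies them and calls /api/transcribe. */}
      <AudioUpload />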

      <div className="space-y-4">
        <h2 className="text-2xl font-semibold">전사 기록</h2>
        <TranscriptionList transcriptions={transcriptions} />
      </div>
    </div>
  )
}

2. Transcription List Component

// components/transcription-list.tsx
'use client'

import { useState } from 'react'
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card'
import { Badge } from '@/components/ui/badge'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Clock, FileText, Search, Trash2 } from 'lucide-react'
import { formatDistance } from 'date-fns'
import { ko } from 'date-fns/locale'

interface TranscriptionListProps {
  transcriptions: Array<{
    id: string
    title: string
    status: string
    createdAt: Date
    duration?: number | null
    transformations: Array<{
      id: string
      type: string
      createdAt: Date
    }>
  }>
}

export function TranscriptionList({ transcriptions }: TranscriptionListProps) {
  const [searchTerm, setSearchTerm] = useState('')

  const filteredTranscriptions = transcriptions.filter(t =>
    t.title.toLowerCase().includes(searchTerm.toLowerCase())
  )

  const getStatusColor = (status: string) => {
    switch (status) {
      case 'COMPLETED': return 'bg-green-100 text-green-800'
      case 'PROCESSING': return 'bg-yellow-100 text-yellow-800'
      case 'FAILED': return 'bg-red-100 text-red-800'
      default: return 'bg-gray-100 text-gray-800'
    }
  }

  const formatDuration = (seconds?: number | null) => {
    if (!seconds) return null
    const minutes = Math.floor(seconds / 60)
    const remainingSeconds = seconds % 60
    return `${minutes}:${remainingSeconds.toString().padStart(2, '0')}`
  }

  return (
    <div className="space-y-4">
      <div className="flex items-center gap-2">
        <Search className="h-4 w-4 text-gray-400" />
        <Input
          placeholder="전사 기록 검색..."
          value={searchTerm}
          onChange={(e) => setSearchTerm(e.target.value)}
          className="max-w-sm"
        />
      </div>

      {filteredTranscriptions.length === 0 ? (
        <Card>
          <CardContent className="py-8 text-center text-gray-500">
            {searchTerm ? '검색 결과가 없습니다.' : '아직 전사 기록이 없습니다.'}
          </CardContent>
        </Card>
      ) : (
        <div className="grid gap-4">
          {filteredTranscriptions.map((transcription) => (
            <Card key={transcription.id} className="hover:shadow-md transition-shadow">
              <CardHeader className="flex flex-row items-center justify-between space-y-0 pb-2">
                <CardTitle className="text-lg">{transcription.title}</CardTitle>
                <div className="flex items-center gap-2">
                  <Badge className={getStatusColor(transcription.status)}>
                    {transcription.status === 'COMPLETED' ? '완료' :
                     transcription.status === 'PROCESSING' ? '처리중' :
                     transcription.status === 'FAILED' ? '실패' : '대기중'}
                  </Badge>
                  <Button variant="ghost" size="sm">
                    <Trash2 className="h-4 w-4" />
                  </Button>
                </div>
              </CardHeader>
              <CardContent>
                <div className="flex items-center justify-between text-sm text-gray-500">
                  <div className="flex items-center gap-4">
                    <div className="flex items-center gap-1">
                      <Clock className="h-4 w-4" />
                      {formatDistance(transcription.createdAt, new Date(), { 
                        addSuffix: true, 
                        locale: ko 
                      })}
                    </div>
                    {transcription.duration && (
                      <div className="flex items-center gap-1">
                        <FileText className="h-4 w-4" />
                        {formatDuration(transcription.duration)}
                      </div>
                    )}
                  </div>
                  <div className="text-xs">
                    변환 {transcription.transformations.length}개
                  </div>
                </div>
                
                {transcription.status === 'COMPLETED' && (
                  <div className="mt-3">
                    <Button variant="outline" size="sm">
                      상세 보기
                    </Button>
                  </div>
                )}
              </CardContent>
            </Card>
          ))}
        </div>
      )}
    </div>
  )
}

Real-Time Status Updates

1. Status Lookup API (polled by the client)

// app/api/transcriptions/[id]/status/route.ts
import { NextRequest, NextResponse } from 'next/server'
import { auth } from '@clerk/nextjs'
import { prisma } from '@/lib/db'

export async function GET(
  request: NextRequest,
  { params }: { params: { id: string } }
) {
  const { userId } = auth()
  if (!userId) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
  }

  try {
    const transcription = await prisma.transcription.findFirst({
      where: {
        id: params.id,
        user: { clerkId: userId },
      },
      include: {
        transformations: true,
      },
    })

    if (!transcription) {
      return NextResponse.json({ error: 'Not found' }, { status: 404 })
    }

    return NextResponse.json(transcription)
  } catch (error) {
    console.error('Status check error:', error)
    return NextResponse.json(
      { error: 'Failed to get status' },
      { status: 500 }
    )
  }
}

2. Polling-Based Status Check

// hooks/use-transcription-status.ts
import { useState, useEffect } from 'react'

interface TranscriptionStatus {
  id: string
  status: string
  originalText?: string
  duration?: number
}

export function useTranscriptionStatus(transcriptionId: string | null) {
  const [status, setStatus] = useState<TranscriptionStatus | null>(null)
  const [isLoading, setIsLoading] = useState(false)

  useEffect(() => {
    if (!transcriptionId) return

    let interval: NodeJS.Timeout

    const checkStatus = async () => {
      try {
        setIsLoading(true)
        const response = await fetch(`/api/transcriptions/${transcriptionId}/status`)
        const data = await response.json()
        setStatus(data)

        // Stop polling once the job has completed or failed
        if (data.status === 'COMPLETED' || data.status === 'FAILED') {
          clearInterval(interval)
        }
      } catch (error) {
        console.error('Status check error:', error)
      } finally {
        setIsLoading(false)
      }
    }

    // Run once immediately
    checkStatus()

    // Then check the status every 5 seconds
    interval = setInterval(checkStatus, 5000)

    return () => clearInterval(interval)
  }, [transcriptionId])

  return { status, isLoading }
}
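
The hook polls at a fixed 5-second interval for as long as the job is running. For long recordings, a capped exponential backoff reduces the number of requests while still responding quickly at the start. The sketch below shows the delay calculation only. `nextPollDelay`, `isTerminalStatus`, and the default base and cap values are hypothetical.

```typescript
// Returns true for the states in which the hook stops polling.
function isTerminalStatus(status: string): boolean {
  return status === 'COMPLETED' || status === 'FAILED'
}

// Delay before poll number `attempt` (0-based): baseMs, 2x, 4x, ... capped at maxMs.
function nextPollDelay(attempt: number, baseMs = 2000, maxMs = 30000): number {
  const safeAttempt = Math.max(0, Math.floor(attempt))
  return Math.min(maxMs, baseMs * 2 ** safeAttempt)
}
```

To use it, the hook would replace `setInterval` with a `setTimeout` that is rescheduled after each response using `nextPollDelay(attempt)` until `isTerminalStatus` returns true.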

Performance Optimization and User Experience

1. Progressive Upload

// components/progressive-upload.tsx
'use client'

import { useState } from 'react'
import { Progress } from '@/components/ui/progress'

export function ProgressiveUpload() {
  const [uploadProgress, setUploadProgress] = useState(0)

  const uploadWithProgress = async (file: File) => {
    const chunkSize = 1024 * 1024 // 1MB chunks
    const totalChunks = Math.ceil(file.size / chunkSize)
    
    for (let chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++) {
      const start = chunkIndex * chunkSize
      const end = Math.min(start + chunkSize, file.size)
      const chunk = file.slice(start, end)
      
      // Upload the chunk
      await uploadChunk(chunk, chunkIndex, totalChunks)

      // Update the progress percentage
      setUploadProgress(((chunkIndex + 1) / totalChunks) * 100)
    }
  }

  const uploadChunk = async (chunk: Blob, index: number, total: number) => {
    const formData = new FormData()
    formData.append('chunk', chunk)
    formData.append('chunkIndex', index.toString())
    formData.append('totalChunks', total.toString())

    // Note: the /api/upload/chunk endpoint is not implemented in this tutorial
    await fetch('/api/upload/chunk', {
      method: 'POST',
      body: formData,
    })
  }

  return (
    <div className="space-y-2">
      <div className="flex justify-between text-sm">
        <span>업로드 진행률</span>
        <span>{Math.round(uploadProgress)}%</span>
      </div>
      <Progress value={uploadProgress} />
    </div>
  )
}
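
The chunk boundaries in `uploadWithProgress` are computed inline inside the loop. Extracting that arithmetic into a pure function makes the edge cases (an empty file, a final partial chunk) easy to verify. `chunkRanges` is a hypothetical helper; `start` and `end` follow the same half-open convention as `Blob.slice`.

```typescript
interface ChunkRange {
  index: number
  start: number // inclusive byte offset
  end: number   // exclusive byte offset, as expected by Blob.slice
}

// Splits a file of fileSize bytes into consecutive ranges of at most chunkSize bytes.
function chunkRanges(fileSize: number, chunkSize: number): ChunkRange[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  const ranges: ChunkRange[] = []
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    ranges.push({ index, start, end: Math.min(start + chunkSize, fileSize) })
  }
  return ranges
}
```

The upload loop could then iterate over `chunkRanges(file.size, chunkSize)` and call `file.slice(range.start, range.end)` for each entry.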

2. Error Handling and Retry Logic

// lib/retry-logic.ts
export class RetryableError extends Error {
  constructor(message: string, public retryAfter?: number) {
    super(message)
    this.name = 'RetryableError'
  }
}

export async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelay: number = 1000
): Promise<T> {
  let lastError: Error

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn()
    } catch (error) {
      lastError = error as Error
      
      if (attempt === maxRetries) {
        throw lastError
      }

      if (error instanceof RetryableError) {
        const delay = error.retryAfter || baseDelay * Math.pow(2, attempt)
        await new Promise(resolve => setTimeout(resolve, delay))
      } else {
        throw error
      }
    }
  }

  throw lastError!
}
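
`withRetry` only retries when the wrapped function throws a `RetryableError`, so the caller has to decide which HTTP responses qualify. A minimal sketch of that decision follows. `classifyResponse` and the set of retryable status codes are assumptions, and only the delta-seconds form of the `Retry-After` header is handled (the HTTP-date form is ignored).

```typescript
interface RetryDecision {
  retryable: boolean
  retryAfterMs?: number // present only when the server sent a usable Retry-After value
}

// Status codes treated as transient (an assumption; adjust per API).
const RETRYABLE_STATUSES = new Set([408, 429, 500, 502, 503, 504])

// Decides whether a response should be retried and for how long to wait.
function classifyResponse(status: number, retryAfterHeader: string | null): RetryDecision {
  if (!RETRYABLE_STATUSES.has(status)) {
    return { retryable: false }
  }
  const seconds =
    retryAfterHeader === null || retryAfterHeader.trim() === '' ? NaN : Number(retryAfterHeader)
  if (Number.isFinite(seconds) && seconds >= 0) {
    return { retryable: true, retryAfterMs: Math.round(seconds * 1000) }
  }
  // No usable hint: the caller falls back to exponential backoff.
  return { retryable: true }
}
```

A fetch wrapper could call `classifyResponse(response.status, response.headers.get('retry-after'))` and throw `new RetryableError(message, decision.retryAfterMs)` when `retryable` is true, since `withRetry` passes `retryAfter` to `setTimeout` as milliseconds.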

Testing and Deployment

1. Unit Tests

// __tests__/transcription.test.ts
import { describe, it, expect, jest } from '@jest/globals'
import { POST } from '@/app/api/transcribe/route'
import { NextRequest } from 'next/server'

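// Mock Clerk. jest.fn lets individual tests override the return value.
jest.mock('@clerk/nextjs', () => ({
  auth: jest.fn(() => ({ userId: 'test-user-id' })),
}))

// Mock the database and rate limiter so the tests do not need real services
jest.mock('@/lib/db', () => ({
  prisma: {
    user: {
      findUnique: jest.fn(async () => ({ id: 'user-1' })),
      upsert: jest.fn(async () => ({ id: 'user-1' })),
    },
    transcription: {
      create: jest.fn(async () => ({ id: 'transcription-1' })),
      update: jest.fn(async () => ({})),
    },
  },
}))
jest.mock('@/lib/rate-limiter', () => ({
  rateLimiter: { check: jest.fn(async () => true) },
}))

describe('/api/transcribe', () => {
  it('should create transcription successfully', async () => {
    const request = new NextRequest('http://localhost:3000/api/transcribe', {
      method: 'POST',
      body: JSON.stringify({
        audioUrl: 'https://example.com/audio.mp3',
        title: 'Test Audio',
      }),
    })

    const response = await POST(request)
    const data = await response.json()

    expect(response.status).toBe(200)
    expect(data.transcriptionId).toBeDefined()
  })

  it('should reject unauthorized requests', async () => {
    // Make auth() return no user for this call only
    jest.mocked(require('@clerk/nextjs').auth).mockReturnValueOnce({ userId: null })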

    const request = new NextRequest('http://localhost:3000/api/transcribe', {
      method: 'POST',
      body: JSON.stringify({
        audioUrl: 'https://example.com/audio.mp3',
        title: 'Test Audio',
      }),
    })

    const response = await POST(request)
    expect(response.status).toBe(401)
  })
})

2. Deployment Configuration

# .github/workflows/deploy.yml
name: Deploy to Vercel

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      
      - run: npm ci
      - run: npm run test
      - run: npm run build

  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    
    steps:
      - uses: actions/checkout@v3
      - uses: amondnet/vercel-action@v25
        with:
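          # Secret names below are placeholders; use the names configured in your repository
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}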
          vercel-args: '--prod'

Monitoring and Analytics

1. Usage Tracking

// lib/analytics.ts
export class AnalyticsTracker {
  static async trackTranscription(userId: string, duration: number, success: boolean) {
    try {
      await fetch('/api/analytics/track', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          event: 'transcription_completed',
          userId,
          properties: { duration, success },
        }),
      })
    } catch (error) {
      console.error('Analytics tracking error:', error)
    }
  }

  static async trackTransformation(userId: string, type: string) {
    try {
      await fetch('/api/analytics/track', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          event: 'text_transformation',
          userId,
          properties: { type },
        }),
      })
    } catch (error) {
      console.error('Analytics tracking error:', error)
    }
  }
}

2. Error Logging

// lib/logger.ts
export class Logger {
  static error(message: string, error: Error, context?: any) {
    console.error(`[ERROR] ${message}`, {
      error: {
        name: error.name,
        message: error.message,
        stack: error.stack,
      },
      context,
      timestamp: new Date().toISOString(),
    })

    // In production, forward the error to an external logging service
    if (process.env.NODE_ENV === 'production') {
      // e.g. send to Sentry or LogRocket
    }
  }

  static info(message: string, data?: any) {
    console.log(`[INFO] ${message}`, {
      data,
      timestamp: new Date().toISOString(),
    })
  }
}

Future Improvements

1. Advanced Features

// Planned features
const futureFeatures = {
  realtime_transcription: {
    description: 'Real-time speech transcription',
    technology: 'WebRTC + WebSocket',
    priority: 'high'
  },
  
  speaker_diarization: {
    description: 'Speaker separation',
    technology: 'Together.ai Speaker Diarization',
    priority: 'medium'
  },
  
  multilingual_support: {
    description: 'Multilingual support',
    technology: 'Whisper Multilingual',
    priority: 'medium'
  },
  
  collaboration_features: {
    description: 'Team collaboration features',
    technology: 'Real-time collaboration',
    priority: 'low'
  },
  
  api_access: {
    description: 'Public API access',
    technology: 'REST API + API Keys',
    priority: 'medium'
  }
}

2. Performance Optimization

// Optimization plan
const optimizations = {
  caching: {
    strategy: 'Redis caching for frequent queries',
    impact: 'Reduce DB load'
  },
  
  cdn: {
    strategy: 'CloudFront for audio file delivery',
    impact: 'Faster file access'
  },
  
  compression: {
    strategy: 'Audio compression before upload',
    impact: 'Reduce upload time and storage cost'
  },
  
  edge_computing: {
    strategy: 'Vercel Edge Functions for auth',
    impact: 'Lower latency'
  }
}

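Conclusion

This tutorial walked through building an AI voice transcription application on a modern tech stack. The main results are as follows:

Key Results

Complete audio processing pipeline: upload → transcription → AI transformation
User-centered design: an intuitive UI with status feedback through polling
Extensible architecture: Next.js route handlers backed by managed cloud services (S3, Neon, Upstash)
Production groundwork: authentication, rate limiting, error handling, and monitoring

Technical Highlights

Together.ai Whisper: high-quality speech transcription
Next.js 14: React framework with the App Router
Clerk: straightforward user authentication
Prisma + Neon: modern database stack
AWS S3: reliable file storage
Vercel: simple deployment and hosting

Practical Applications

This application can be applied to real use cases such as:

Meeting minutes automation: transcribe and summarize meeting recordings automatically
Lecture notes: organize the content of online lectures and seminars
Interview analysis: analyze customer or research interviews
Content production: generate scripts for podcasts and video content

Building on Nutlope's Whisper app as a reference, this guide covers a broader feature set aimed at practical use. From here, add your own features to shape a transcription service that fits your needs.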
