fix(issue-7): enforce voice upload size limit before buffering (#22)

* fix(voice): enforce upload size limit before buffering (issue #7)

The previous implementation called request.formData() and audio.arrayBuffer()
before checking MAX_VOICE_UPLOAD_BYTES, meaning oversized uploads were fully
buffered into memory before rejection — a DoS/OOM risk.

Changes:
- Check Content-Length header early and return 413 if it exceeds the limit,
  preventing any request body from being read into memory for oversized uploads
- Export MAX_VOICE_UPLOAD_BYTES for use in tests
- Switch from instanceof File to duck-typing (checking .arrayBuffer method)
  to avoid cross-realm failures in jsdom test environments
- Return HTTP 413 Payload Too Large for oversized uploads (previously 400)
- Retain a secondary post-buffer size check to catch missing/spoofed
  Content-Length headers
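
The duck-typing change in the list above can be illustrated with a small type guard. This is a minimal sketch, not the code from the commit: `FileLike` and `isFileLike` are hypothetical names introduced here, and the guard only tests for an `arrayBuffer` method, as the change list describes.

```typescript
// Hypothetical structural type: anything that can hand back its bytes.
interface FileLike {
  arrayBuffer(): Promise<ArrayBuffer>;
  name?: string;
  type?: string;
}

// Type guard that avoids `instanceof File`, so it does not depend on which
// realm's File constructor created the value.
function isFileLike(value: unknown): value is FileLike {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof (value as { arrayBuffer?: unknown }).arrayBuffer === "function"
  );
}

// An object exposing arrayBuffer() passes; a plain string form field does not.
const fakeUpload = { arrayBuffer: async () => new ArrayBuffer(3), name: "a.wav" };
const acceptsFileLike = isFileLike(fakeUpload);
const rejectsString = !isFileLike("not a file");
const rejectsNull = !isFileLike(null);
```

A structural check like this tolerates a `File` from a different realm, at the cost of also accepting any object that merely looks like a file.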

Tests added (tests/unit/voiceTranscribe.test.ts):
- Content-Length exceeding limit → 413 before any buffering
- Content-Length at exactly the limit → proceeds normally
- No Content-Length header, small file → proceeds normally (200)
- No Content-Length header, oversized body → 413 after buffering
- Missing audio field → 400
- Empty audio file (0 bytes) → 400
- Malformed Content-Length header → falls through gracefully

Fixes: issue #7

* fix(issue-7): account for multipart overhead in Content-Length early check

The early Content-Length guard was comparing total multipart request size
against MAX_VOICE_UPLOAD_BYTES, but multipart/form-data includes boundary
and header overhead (~200-500 bytes). A valid file at exactly the 20 MB
limit was being rejected with 413.

Fix: add a 1 KB MULTIPART_OVERHEAD_ALLOWANCE to the early check threshold.
The post-buffer check remains the authoritative limit and measures actual
audio bytes. Updated tests to reflect the corrected early-check boundary.
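
The corrected boundary can be checked with a little arithmetic: 20 MB is 20,971,520 bytes, so the early check only rejects declared sizes above 20,972,544 bytes. The sketch below is illustrative only. The hypothetical `declaredSizeExceedsLimit` takes the raw header value as a string, whereas the route reads it from `request.headers`.

```typescript
const MAX_VOICE_UPLOAD_BYTES = 20 * 1024 * 1024; // 20,971,520
const MULTIPART_OVERHEAD_ALLOWANCE = 1024; // 1 KB, as in the commit

// Hypothetical helper mirroring the early check: true means "reject with 413
// before reading the body". Missing or malformed headers fall through.
function declaredSizeExceedsLimit(contentLengthHeader: string | null): boolean {
  if (contentLengthHeader === null) return false;
  const declared = Number(contentLengthHeader);
  if (Number.isNaN(declared)) return false;
  return declared > MAX_VOICE_UPLOAD_BYTES + MULTIPART_OVERHEAD_ALLOWANCE;
}

// A file at exactly the limit plus an assumed 300 bytes of multipart framing
// stays under the threshold.
const fileAtLimit = declaredSizeExceedsLimit(String(MAX_VOICE_UPLOAD_BYTES + 300));
// One byte past limit + allowance is rejected early.
const justOver = declaredSizeExceedsLimit(
  String(MAX_VOICE_UPLOAD_BYTES + MULTIPART_OVERHEAD_ALLOWANCE + 1),
);
const malformed = declaredSizeExceedsLimit("not-a-number");
const missing = declaredSizeExceedsLimit(null);
```

Because the allowance widens the early check slightly, a request can pass it and still be rejected by the post-buffer check, which measures the audio bytes themselves.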

---------

Co-authored-by: Neo (subagent) <neo@openclaw.local>
Co-authored-by: Neo <neo@openclaw.ai>
robotica4us-collab
2026-03-27 13:41:56 -05:00
committed by GitHub
parent fcecece1c3
commit fdc7a4223a
3 changed files with 258 additions and 11 deletions
+55 -10
@@ -4,32 +4,77 @@ import { transcribeVoiceWithOpenClaw } from "@/lib/openclaw/voiceTranscription";
 export const runtime = "nodejs";
-const MAX_VOICE_UPLOAD_BYTES = 20 * 1024 * 1024;
+export const MAX_VOICE_UPLOAD_BYTES = 20 * 1024 * 1024;
 export async function POST(request: Request) {
   try {
-    const formData = await request.formData();
-    const audio = formData.get("audio");
-    if (!(audio instanceof File)) {
-      return NextResponse.json({ error: "audio file is required." }, { status: 400 });
+    // ── Early size check via Content-Length ──────────────────────────────────
+    // Reject obviously-oversized uploads BEFORE buffering any request body
+    // into memory. This prevents a DoS/OOM attack where a huge payload is
+    // fully read before the limit is enforced.
+    //
+    // Important: Content-Length for multipart/form-data includes boundary
+    // headers and field metadata overhead — not just the raw audio bytes.
+    // A typical multipart envelope adds ~200-500 bytes; we use a generous
+    // 1 KB overhead allowance so that a file at exactly MAX_VOICE_UPLOAD_BYTES
+    // is never incorrectly rejected by this pre-buffer check.
+    //
+    // The post-buffer check (below) is the authoritative size limit and
+    // measures the actual audio bytes — this early check only eliminates
+    // obviously-oversized requests.
+    const MULTIPART_OVERHEAD_ALLOWANCE = 1024; // 1 KB — safe upper bound
+    const contentLengthHeader = request.headers.get("content-length");
+    if (contentLengthHeader !== null) {
+      const contentLength = Number(contentLengthHeader);
+      if (
+        !Number.isNaN(contentLength) &&
+        contentLength > MAX_VOICE_UPLOAD_BYTES + MULTIPART_OVERHEAD_ALLOWANCE
+      ) {
+        return NextResponse.json(
+          {
+            error: `Audio upload exceeds the ${MAX_VOICE_UPLOAD_BYTES} byte limit.`,
+          },
+          { status: 413 },
+        );
+      }
     }
-    const arrayBuffer = await audio.arrayBuffer();
+    const formData = await request.formData();
+    const audio = formData.get("audio");
+    // Use duck-typing instead of `instanceof File` to guard against cross-realm
+    // issues where jsdom/test environments expose a different File constructor.
+    if (
+      audio === null ||
+      typeof audio !== "object" ||
+      typeof (audio as File).arrayBuffer !== "function"
+    ) {
+      return NextResponse.json({ error: "audio file is required." }, { status: 400 });
+    }
+    const audioFile = audio as File;
+    const arrayBuffer = await audioFile.arrayBuffer();
     const byteLength = arrayBuffer.byteLength;
     if (byteLength <= 0) {
       return NextResponse.json({ error: "Audio upload is empty." }, { status: 400 });
     }
+    // ── Secondary (post-buffer) size check ──────────────────────────────────
+    // Guards against a missing or falsified Content-Length header. Status 413
+    // is used here too for consistency (the body IS too large, regardless of
+    // what the header claimed).
     if (byteLength > MAX_VOICE_UPLOAD_BYTES) {
       return NextResponse.json(
-        { error: `Audio upload exceeds the ${MAX_VOICE_UPLOAD_BYTES} byte limit.` },
-        { status: 400 },
+        {
+          error: `Audio upload exceeds the ${MAX_VOICE_UPLOAD_BYTES} byte limit.`,
+        },
+        { status: 413 },
       );
     }
     const result = await transcribeVoiceWithOpenClaw({
       buffer: Buffer.from(arrayBuffer),
-      fileName: audio.name,
-      mimeType: audio.type,
+      fileName: audioFile.name,
+      mimeType: audioFile.type,
     });
     return NextResponse.json({