# MobKT-Embed pipeline server

## POST /mdify
Convert a PDF into raw markdown, with one H2 section per page.

- request: multipart/form-data with field `file` (the PDF)
- response: `text/markdown`; each page starts with `## Page N`
  (headings inside a page are demoted to H3+)

    curl -X POST -F "file=@doc.pdf" http://localhost:8000/mdify
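
Because every page starts with a `## Page N` heading, the response can be split back into pages on the client side. The following is a minimal sketch, not part of the server: `split_pages` is a hypothetical helper, and it assumes the heading line contains nothing but `## Page N`.

```python
import re

# Matches the per-page heading described above, e.g. "## Page 3".
# Assumption: the heading line carries no extra text after the number.
PAGE_HEADING = re.compile(r"^## Page (\d+)[ \t]*$", re.MULTILINE)


def split_pages(markdown: str) -> dict[int, str]:
    """Map each page number to the markdown body under its `## Page N` heading."""
    matches = list(PAGE_HEADING.finditer(markdown))
    pages = {}
    for i, match in enumerate(matches):
        # A page body runs from the end of its heading to the next page heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        pages[int(match.group(1))] = markdown[match.end():end].strip()
    return pages
```

Headings demoted to H3+ stay inside the page body, since only lines of the form `## Page N` are treated as page boundaries.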

## POST /profile
Profile markdown per H2 section (one section = one page). Every sentence in
a section is profiled against the Statics2011 F2011 KC index (knn_1). Each
page returns its top-3 concepts, each weighted by the fraction of the page's
sentences whose top-1 KC is that concept.

- request: JSON `{"model": "<encoder>", "markdown": "<markdown>"}`
- `model`: a folder name under `profiler/models` (e.g. `Qwen3-Embedding-0.6B`)
- the knn_1 index is built from Statics2011 on first use and cached
- response: `{"model": ..., "pages": [{"page": N, "profile":
  {"kc": [...], "weight": [...]}}]}`
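
To make the weights concrete, here is a rough sketch of how a page profile could be derived once each sentence has a top-1 KC. It is an illustration, not the server's code: `page_profile` is a hypothetical name, the KC labels in the usage note are invented, and it assumes `kc` and `weight` are parallel lists ordered by descending weight. How the server breaks ties or rounds is not specified here.

```python
from collections import Counter


def page_profile(top1_kcs: list[str], k: int = 3) -> dict[str, list]:
    """Summarise per-sentence top-1 KC labels as the top-k concepts and their sentence fractions."""
    if not top1_kcs:
        # A page with no sentences has an empty profile.
        return {"kc": [], "weight": []}
    total = len(top1_kcs)
    # most_common orders by count; equal counts keep first-seen order.
    top = Counter(top1_kcs).most_common(k)
    return {
        "kc": [kc for kc, _ in top],
        "weight": [count / total for _, count in top],
    }
```

For a page of eight sentences whose top-1 KCs are `moment` (4 times), `friction` (2), `truss` (1), and `couple` (1), this yields the concepts `moment`, `friction`, `truss` with weights 0.5, 0.25, and 0.125. The weights need not sum to 1, because sentences matched to concepts outside the top 3 are not reported.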

    curl -X POST -H "Content-Type: application/json" \
         -d '{"model": "Qwen3-Embedding-0.6B", "markdown": "## Page 1\n..."}' \
         http://localhost:8000/profile
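
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: `build_profile_request` and `profile` are hypothetical helper names, the base URL matches the curl example, and the 60-second timeout is an arbitrary guess that leaves room for the index being built on first use.

```python
import json
import urllib.request


def build_profile_request(base_url: str, model: str, markdown: str) -> urllib.request.Request:
    """Build the JSON POST request for /profile without sending it."""
    body = json.dumps({"model": model, "markdown": markdown}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/profile",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def profile(base_url: str, model: str, markdown: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_profile_request(base_url, model, markdown)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `profile("http://localhost:8000", "Qwen3-Embedding-0.6B", "## Page 1\n...")` against a running server should return a dict with the `model` and `pages` keys described above.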

Available models: `F2LLM-v2-0.6B`, `F2LLM-v2-1.7B`, `F2LLM-v2-160M`, `F2LLM-v2-330M`, `F2LLM-v2-4B`, `Qwen3-Embedding-0.6B`, `Qwen3-Embedding-4B`, `harrier-oss-v1-0.6b`, `harrier-oss-v1-270m`

## GET /
Return this help text.