Demo AI Quoting System with Fine-tuning and Quantization

This article is not only a technical report on the demo AI quoting system; it also includes some thoughts on the rising AI Staff industry. It is also a treasured record of my internship experience. Many thanks to CX for giving me, a freshman still seeking knowledge, this opportunity.

0) Context

Company’s goal: apply “AI Staff” to manufacturing/distribution workflows
Use case: parse customer quote Excels/images/PDFs → normalized JSON → match against the price database.
Why an LLM rather than rules: document variety, long‑tail fields, layout drift.

I normalize each line item into this compact schema used throughout this post:

{"index": "", "model": "", "voltage": "", "spec": "", "unit": "", "num": ""}

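As a minimal sketch of how this schema can be enforced downstream, the snippet below checks that a model response is a JSON object with exactly these six string-valued keys, and then builds a lookup key for price matching. The names `parse_line_item`, `price_key`, and `PRICE_TABLE` are hypothetical, and the price is a placeholder; the real price database and its matching rules are not shown in this post.

```python
import json

SCHEMA_KEYS = ("index", "model", "voltage", "spec", "unit", "num")

def parse_line_item(text: str) -> dict:
    """Parse one model response and check it against the six-key schema."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    if set(obj) != set(SCHEMA_KEYS):
        raise ValueError(f"unexpected keys: {sorted(obj)}")
    if not all(isinstance(v, str) for v in obj.values()):
        raise ValueError("all values must be strings")
    return obj

def price_key(item: dict) -> tuple:
    """Hypothetical matching key: (model, voltage, spec)."""
    return (item["model"], item["voltage"], item["spec"])

# Hypothetical stand-in for the price database; 1.0 is a placeholder, not a real price.
PRICE_TABLE = {("ZC-YJLV22", "0.6/1kV", "4*120"): 1.0}

response = (
    '{"index": "3", "model": "ZC-YJLV22", "voltage": "0.6/1kV", '
    '"spec": "4*120", "unit": "m", "num": "1000"}'
)
item = parse_line_item(response)
unit_price = PRICE_TABLE.get(price_key(item))  # None when nothing matches
```

Keeping every value a string mirrors the schema above and leaves numeric conversion to the matching step.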

1) Model & Setup

Base model: Qwen3-4B-Instruct-2507
Hardware: RTX 4090, i9-14900K
Frameworks: LLaMA-Factory, llama.cpp
Approach: LoRA, GGUF (4-bit quant)
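
To give a rough sense of why 4-bit quantization matters for local inference, here is a back-of-the-envelope estimate of weight storage. It assumes about 4 billion parameters (read off the model name) and ignores quantization scales, the KV cache, and activations, so a real GGUF file will be somewhat larger.

```python
def weight_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return n_params * bits_per_weight / 8 / 2**30

N_PARAMS = 4e9  # assumption: roughly 4B parameters

fp16_gib = weight_size_gib(N_PARAMS, 16)  # half-precision weights
q4_gib = weight_size_gib(N_PARAMS, 4)     # idealized 4-bit weights
print(f"fp16: {fp16_gib:.1f} GiB, 4-bit: {q4_gib:.1f} GiB")
```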

2) Data Processing & Formatting

Source:

Thanks to my amazing colleagues, I was provided with Excel sheets that were already labelled to a high standard.

I do not publish any company data. All examples here are redacted; I only show a 5‑line preview for illustration. The full dataset is private.

Why only Excel in v1

Although the company handles PDFs/images/Excels, this demo focuses on Excel quotes only. Future work: OCR pipeline → the same JSON schema.

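序号	物料编码	货物（功能规格）描述	单位	数量	单价（元）	税率（%）	单价（元）	CX-型号	CX-电压	CX-规格	CX-单位	CX-数量
1	500135730	低压电力电缆,ERF,铜,300,1芯,ZC,无铠装,普通	千米	1	251885.1	13%	284630.16	ZC-ERF	0.6/1kV	1*300	m	1000
2	500109714	低压电力电缆,VV,铜,35/16,3+1芯,ZC,22,普通	千米	1	90639.79	13%	102422.96	ZC-VV22	0.6/1kV	3*35+1*16	m	1000
3	500109080	低压电力电缆,YJLV,铝,120,4芯,ZC,22,普通	千米	1	52033.58	13%	58797.95	ZC-YJLV22	0.6/1kV	4*120	m	1000
4	500132449	低压电力电缆,YJLV,铝,120,4芯,ZC,无铠装,普通	千米	1	44103.17	13%	49836.58	ZC-YJLV	0.6/1kV	4*120	m	1000
5	500015270	低压电力电缆,YJLV,铝,120,4芯,不阻燃,22,普通	千米	1	51573.13	13%	58277.64	YJLV22	0.6/1kV	4*120	m	1000

Column glossary: 序号 = row number; 物料编码 = material code; 货物（功能规格）描述 = goods description (function and specification); 单位 = unit; 数量 = quantity; 单价（元） = unit price in CNY (the second price column equals the first plus the 13% tax); 税率（%） = tax rate; CX-型号, CX-电压, CX-规格, CX-单位, CX-数量 = the annotated model, voltage, specification, unit, and quantity. 千米 means kilometre, so a quantity of 1 千米 is labelled as 1000 m. The description in row 1 reads "low-voltage power cable, ERF, copper, 300, 1 core, ZC, unarmoured, standard".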

Processing & formatting:

Because I use LLaMA-Factory as my fine-tuning framework, the dataset has to be converted to a JSONL file in Alpaca format for supervised fine-tuning. An Alpaca record typically contains four components: instruction (prompt), input (context), output (desired response), and system (system prompt). For details, see the LLM dataset formats documentation in LLaMA-Factory.
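
As a small illustration of this format, the sketch below builds one Alpaca-style record and serializes it as a single JSONL line. The field values are placeholders, and `check_alpaca_record` is a hypothetical helper, not part of LLaMA-Factory; it assumes that `instruction` and `output` are required while `input` and `system` are optional.

```python
import json

REQUIRED = ("instruction", "output")
OPTIONAL = ("input", "system")

def check_alpaca_record(rec: dict) -> None:
    """Raise ValueError if a record is missing a required field or has an unknown one."""
    for key in REQUIRED:
        if not isinstance(rec.get(key), str) or not rec[key]:
            raise ValueError(f"missing or empty field: {key}")
    for key in rec:
        if key not in REQUIRED + OPTIONAL:
            raise ValueError(f"unexpected field: {key}")

record = {
    "instruction": "<instruction text>",   # placeholder
    "input": "<one table row as text>",    # placeholder
    "output": json.dumps(
        {"index": "1", "model": "ZC-ERF", "voltage": "0.6/1kV",
         "spec": "1*300", "unit": "m", "num": "1000"},
        ensure_ascii=False,
    ),
    "system": "<system prompt>",           # placeholder
}

check_alpaca_record(record)
# One record per line; ensure_ascii=False keeps Chinese text readable in the file.
jsonl_line = json.dumps(record, ensure_ascii=False)
```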

Here is the Python script I wrote to process the dataset.


import argparse, json, re
from pathlib import Path
import pandas as pd

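# Hyphen variants that appear in column headers (ASCII, full-width, em dash, en dash).
DASH = r"[\-－—–]"
# Candidate header names per output field: a regex pattern first, then the plain ASCII name.
MODEL_COLS   = [f"CX{DASH}型号",   "CX-型号"]
VOLT_COLS    = [f"CX{DASH}电压",   "CX-电压"]
SPEC_COLS    = [f"CX{DASH}规格",   "CX-规格"]
UNIT_COLS    = [f"CX{DASH}单位",   "CX-单位"]
NUM_COLS     = [f"CX{DASH}数量",   "CX-数量"]

# Anchored patterns used to keep the annotated CX columns out of the model input.
CX_PATTERNS  = [f"^CX{DASH}型号$", f"^CX{DASH}电压$", f"^CX{DASH}规格$", f"^CX{DASH}单位$", f"^CX{DASH}数量$"]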


DEFAULT_INSTRUCTION = (
    "Given one table row of cables, produce ONLY a JSON object with keys exactly: index, model, voltage, spec, unit, num. If the input contains an index token like '<n>#', set index to '<n>' (no '#'). If the input has no index, set index to an empty string."
)

DEFAULT_SYSTEM = (
    "Return strict JSON with keys exactly: index, model, voltage, spec, unit, num. No extra text. Do not invent values. For index: if the input includes an index token like '<n>#',copy the number and output it as '<n>' (no '#'); otherwise set index to an empty string."
)


def to_str(x):
    if pd.isna(x):
        return ""
    s = str(x).strip()
    # collapse internal whitespace
    s = re.sub(r"\s+", " ", s)
    # drop trailing .0 for ints coming from Excel
    if re.match(r"^\d+\.0$", s):
        s = s[:-2]
    return s

def is_cx_col(colname: str) -> bool:
    if colname is None:
        return False
    name = str(colname).strip()
    for pat in CX_PATTERNS:
        if re.match(pat, name):
            return True
    return False

def pick_first_present(row, candidates):
    """Find the first candidate column name that exists (regex or literal)."""
    for cand in candidates:
        if any(ch in cand for ch in "-－—–[]^$\\"):
            for c in row.index:
                if re.match(cand, str(c).strip()):
                    return to_str(row[c])
        else:
            if cand in row.index:
                return to_str(row[cand])
    return ""

def build_input_values_only(row):
    """Join all NON-CX cell values from the row into one string (values only)."""
    vals = []
    for col in row.index:
        name = str(col).strip()
        if is_cx_col(name):
            continue
        v = to_str(row[col])
        if v != "":
            vals.append(v)
    return " | ".join(vals)

def make_output_obj(row, one_based_index):
    return {
        "index":   str(one_based_index),
        "model":   pick_first_present(row, MODEL_COLS),
        "voltage": pick_first_present(row, VOLT_COLS),
        "spec":    pick_first_present(row, SPEC_COLS),
        "unit":    pick_first_present(row, UNIT_COLS),
        "num":     pick_first_present(row, NUM_COLS),
    }

def process_excel(path, sheet, instruction, system, output_as_object):
    try:
        df = pd.read_excel(path, sheet_name=sheet)
    except Exception as e:
        print(f"[WARN] Skip {path} (read error): {e}")
        return []

    df = df.dropna(how="all")
    records = []
    for i, row in df.iterrows():
        index_1_based = i + 1
        input_text = build_input_values_only(row)
        out_obj = make_output_obj(row, index_1_based)

        rec = {
            "instruction": instruction,
            "input": input_text,
            "output": (out_obj if output_as_object else json.dumps(out_obj, ensure_ascii=False)),
        }
        if system:
            rec["system"] = system
        records.append(rec)
    return records

def iter_excels(paths):
    for p in paths:
        p = Path(p)
        if p.is_dir():
            for f in sorted(p.rglob("*.xls*")):
                yield f
        elif p.is_file() and p.suffix.lower().startswith(".xls"):
            yield p

def main():
    ap = argparse.ArgumentParser(description="Convert a folder of Excel files to Alpaca JSONL (values-only input, CX- output).")
    ap.add_argument("--in", dest="inputs", nargs="+", required=True, help="Folder(s) and/or file(s). Folders scanned recursively.")
    ap.add_argument("--sheet", default="CX-1", help="Sheet index (int) or name (str). Default: CX-1.")
    ap.add_argument("--out", default="CX_AI_Quote_813.jsonl", help="Output JSONL.")
    ap.add_argument("--instruction", default=DEFAULT_INSTRUCTION, help="Instruction text.")
    ap.add_argument("--no-system", action="store_true", help="Omit the system field.")
    ap.add_argument("--output-as-object", action="store_true",
                    help="Store 'output' as a JSON object instead of a JSON string.")
    args = ap.parse_args()

    sheet = int(args.sheet) if args.sheet.isdigit() else args.sheet
    system = None if args.no_system else DEFAULT_SYSTEM

    all_recs, files = [], list(iter_excels(args.inputs))
    for f in files:
        recs = process_excel(f, sheet, args.instruction, system, args.output_as_object)
        print(f"[OK] {f.name}: {len(recs)} rows")
        all_recs.extend(recs)

    Path(args.out).parent.mkdir(parents=True, exist_ok=True)
    with open(args.out, "w", encoding="utf-8") as w:
        for r in all_recs:
            w.write(json.dumps(r, ensure_ascii=False) + "\n")

    print(f"[DONE] Wrote {len(all_recs)} samples from {len(files)} file(s) → {args.out}")

if __name__ == "__main__":
    main()

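Before training, it can be worth checking the generated file. The following sketch assumes the script's default setting (`output` stored as a JSON string) and counts records whose `output` does not decode to the six expected keys. `count_bad_records` is a hypothetical helper, and the two sample lines are made up for the demonstration.

```python
import io
import json

EXPECTED_KEYS = {"index", "model", "voltage", "spec", "unit", "num"}

def count_bad_records(lines) -> int:
    """Count JSONL lines whose 'output' is not a JSON object with the expected keys."""
    bad = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            out = json.loads(json.loads(line)["output"])
        except (json.JSONDecodeError, KeyError, TypeError):
            bad += 1
            continue
        if not isinstance(out, dict) or set(out) != EXPECTED_KEYS:
            bad += 1
    return bad

good_output = json.dumps({k: "" for k in EXPECTED_KEYS})
good = json.dumps({"instruction": "x", "input": "y", "output": good_output})
broken = json.dumps({"instruction": "x", "input": "y", "output": '{"index": "1"}'})

sample = io.StringIO(good + "\n" + broken + "\n")
n_bad = count_bad_records(sample)
```

On the real file, the same function can be applied to an open file handle, for example `open("CX_AI_Quote_813.jsonl", encoding="utf-8")`.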

Instruction:

“Given one table row of cables, produce ONLY a JSON object with keys exactly: index, model, voltage, spec, unit, num. If the input contains an index token like '<n>#', set index to '<n>' (no '#'). If the input has no index, set index to an empty string.”

System:

“Return strict JSON with keys exactly: index, model, voltage, spec, unit, num. No extra text. Do not invent values. For index: if the input includes an index token like '<n>#',copy the number and output it as '<n>' (no '#'); otherwise set index to an empty string.”

Input:

The input is built from the cell values of each row in the Excel table, joined with a '|' delimiter. The index number, taken from the Excel row number, is followed by '#' to distinguish it from real cell values.

Marking each row with an index is meant to support multi-line inputs in the future. The index preserves a 1:1 mapping between inputs and outputs, which keeps post-processing simple without changing the schema.

Output:

The output format is {"index": "", "model": "", "voltage": "", "spec": "", "unit": "", "num": ""}. The output is taken from the CX- columns, which hold the annotated data, except for the index, which is also taken from the Excel row number.
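
The index convention described above can be sketched as follows. `format_row` and `align_by_index` are hypothetical helpers written for this illustration, not the code used in the project, and placing the `<n>#` token as the first delimited field is an assumption. The first helper builds an input line; the second maps parsed outputs back to their inputs by `index`, regardless of the order in which the outputs arrive.

```python
import json

def format_row(index: int, values: list[str]) -> str:
    """Prefix the row with '<n>#' and join the non-empty cell values with ' | '."""
    cells = [v for v in values if v != ""]
    return " | ".join([f"{index}#"] + cells)

def align_by_index(inputs: dict[str, str], outputs: list[str]) -> dict[str, tuple[str, dict]]:
    """Pair each parsed output with the input line that carries the same index."""
    aligned = {}
    for raw in outputs:
        obj = json.loads(raw)
        idx = obj["index"]
        if idx in inputs:  # outputs with an unknown index are skipped
            aligned[idx] = (inputs[idx], obj)
    return aligned

rows = {
    "1": format_row(1, ["500135730", "千米", "1"]),
    "2": format_row(2, ["500109714", "", "千米", "1"]),
}
outputs = [
    '{"index": "2", "model": "ZC-VV22", "voltage": "0.6/1kV", "spec": "3*35+1*16", "unit": "m", "num": "1000"}',
    '{"index": "1", "model": "ZC-ERF", "voltage": "0.6/1kV", "spec": "1*300", "unit": "m", "num": "1000"}',
]
aligned = align_by_index(rows, outputs)
```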

3) Fine-tuning Recipe

Method: LoRA on Qwen3-4B-Instruct (SFT via LLaMA-Factory)
Dataset size: ~1099 JSON objects
Formatting: Alpaca style (instruction/input/output[/system]); assistant output is JSON‑only
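
For LLaMA-Factory to find an Alpaca-style file, the dataset is typically registered in its `dataset_info.json`. The sketch below generates such an entry. The dataset name `cx_quote` is hypothetical, the file name is the default output of the conversion script, and the `columns` mapping follows the Alpaca layout in LLaMA-Factory's dataset documentation; it should be checked against the installed version.

```python
import json

dataset_info = {
    "cx_quote": {  # hypothetical dataset name
        "file_name": "CX_AI_Quote_813.jsonl",  # default --out of the conversion script
        "columns": {
            "prompt": "instruction",
            "query": "input",
            "response": "output",
            "system": "system",
        },
    }
}

# Text to merge into LLaMA-Factory's data/dataset_info.json.
dataset_info_json = json.dumps(dataset_info, ensure_ascii=False, indent=2)
print(dataset_info_json)
```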
