kk014 committed
Commit 5d2bf5a · verified · 1 Parent(s): e34ee3d

Update README.md

  1. README.md +115 -58
README.md CHANGED
@@ -1,66 +1,123 @@
  ---
  license: apache-2.0
- library_name: peft
  tags:
- - trl
- - sft
- - generated_from_trainer
  base_model: mistralai/Mistral-7B-v0.1
- model-index:
- - name: mistral-7b-docstring
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # mistral-7b-docstring

- This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.9943
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
- - eval_batch_size: 2
- - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 16
- - optimizer: Use OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1
- - mixed_precision_training: Native AMP
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 8.0055 | 0.4 | 200 | 1.0115 |
- | 7.9363 | 0.8 | 400 | 0.9943 |
-
-
- ### Framework versions
-
- - PEFT 0.11.1
- - Transformers 4.46.0
- - Pytorch 2.10.0+cu128
- - Datasets 2.19.0
- - Tokenizers 0.20.3

  ---
+ language: en
  license: apache-2.0
  tags:
+ - code
+ - python
+ - docstring
+ - mistral
+ - qlora
+ - peft
+ - code-generation
  base_model: mistralai/Mistral-7B-v0.1
+ datasets:
+ - code_search_net
  ---

  # mistral-7b-docstring

+ Mistral 7B fine-tuned with QLoRA on Python docstring generation from CodeSearchNet.
+
+ Outperforms Llama 3.3 70B, a model roughly 10x larger, on both ROUGE-L and BERTScore for NumPy-style docstring generation.
+
+ ## Evaluation results
+
+ Evaluated on 100 held-out Python functions from CodeSearchNet (never seen during training).
+
+ | Model | ROUGE-L | BERTScore F1 |
+ |---|---|---|
+ | **Mistral 7B fine-tuned (this model)** | **0.2033** | **0.7739** |
+ | Llama 3.3 70B via Groq | 0.1715 | 0.7594 |
+ | Mistral 7B base (no fine-tuning) | 0.1102 | 0.7118 |
+
+ The fine-tuned 7B model beats Llama 3.3 70B on ROUGE-L (+18.5% relative) and BERTScore F1 (+1.9% relative) while being roughly 10x smaller and running at a fraction of the inference cost.
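
Editorial note (not part of the commit): the percentages quoted above are relative improvements over the comparison model's score, not absolute differences. A minimal sketch that recomputes them from the table follows; the dictionary keys and the helper name `relative_gain` are illustrative, not taken from the author's evaluation code.

```python
# Scores copied from the evaluation table above.
scores = {
    "fine_tuned": {"rouge_l": 0.2033, "bertscore_f1": 0.7739},
    "llama_70b": {"rouge_l": 0.1715, "bertscore_f1": 0.7594},
    "base": {"rouge_l": 0.1102, "bertscore_f1": 0.7118},
}


def relative_gain(new: float, old: float) -> float:
    """Return the relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


for metric in ("rouge_l", "bertscore_f1"):
    tuned = scores["fine_tuned"][metric]
    vs_llama = relative_gain(tuned, scores["llama_70b"][metric])
    vs_base = relative_gain(tuned, scores["base"][metric])
    print(f"{metric}: {vs_llama:+.1f}% vs Llama 3.3 70B, {vs_base:+.1f}% vs base")
```

The same calculation shows that most of the gain is over the untuned base model, which is the comparison that isolates the effect of fine-tuning.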
+
+ ## How to use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import PeftModel
+ import torch
+
+ BASE_MODEL = "mistralai/Mistral-7B-v0.1"
+
+ # Load in 4-bit for efficient inference
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.float16,
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
+ base_model = AutoModelForCausalLM.from_pretrained(
+     BASE_MODEL,
+     quantization_config=bnb_config,
+     device_map="auto",
+ )
+ model = PeftModel.from_pretrained(base_model, "kk014/mistral-7b-docstring")
+ model.eval()
+
+ # Generate a docstring
+ function_code = """
+ def calculate_bmi(weight_kg, height_m):
+     return weight_kg / (height_m ** 2)
+ """.strip()
+
+ prompt = (
+     "You are a Python documentation expert. "
+     "Write a clear, concise NumPy-style docstring for the following Python function.\n\n"
+     f"### Function:\n{function_code}\n\n"
+     "### Docstring:"
+ )
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     outputs = model.generate(
+         **inputs,
+         max_new_tokens=150,
+         temperature=0.1,
+         do_sample=True,
+         pad_token_id=tokenizer.eos_token_id,
+     )
+
+ generated = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ docstring = generated[len(prompt):].strip()
+ print(docstring)
+ ```
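
Editorial note (not part of the commit): the prompt template in the usage example can be factored into small helpers so it is applied identically to every function. The sketch below is one possible arrangement. The names `build_prompt` and `extract_docstring` are hypothetical, and cutting the output at the next `###` marker is an assumption about how the model might run on, not documented behavior.

```python
def build_prompt(function_code: str) -> str:
    """Wrap a function's source in the prompt template from the usage example."""
    return (
        "You are a Python documentation expert. "
        "Write a clear, concise NumPy-style docstring for the following Python function.\n\n"
        f"### Function:\n{function_code.strip()}\n\n"
        "### Docstring:"
    )


def extract_docstring(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt and anything after a new '###' section marker."""
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        # The decoded text did not echo the prompt verbatim; keep everything.
        completion = generated_text
    # Assumption: a fresh "###" header means the model started a new section.
    return completion.split("\n###", 1)[0].strip()


# Usage with a hand-written stand-in for the model output (no model required).
prompt = build_prompt("def calculate_bmi(weight_kg, height_m):\n    return weight_kg / (height_m ** 2)")
fake_output = prompt + " Compute the body mass index.\n\n### Function:\ndef other(): pass"
print(extract_docstring(fake_output, prompt))
```

Keeping the template in one place matters because a fine-tuned model is typically sensitive to deviations from the prompt format it saw during training.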
+
+ ## Training details
+
+ | Parameter | Value |
+ |---|---|
+ | Base model | mistralai/Mistral-7B-v0.1 |
+ | Dataset | CodeSearchNet (Python split) |
+ | Training samples | 8,000 |
+ | Method | QLoRA (4-bit NF4 quantisation) |
+ | LoRA rank | 16 |
+ | LoRA alpha | 32 |
+ | Epochs | 1 |
+ | Batch size | 2 (effective 16 with gradient accumulation) |
+ | Learning rate | 2e-4 |
+ | Hardware | Kaggle T4 x2 (free tier) |
+ | Training time | ~4 hours |
+ | Framework | HuggingFace PEFT + TRL |
+
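
Editorial note (not part of the commit): the table values imply a few derived quantities that are easy to check by hand. The sketch below works them out. The warmup ratio of 0.03 is an assumption carried over from the auto-generated card that this commit replaces, and the calculation assumes a single training process, so the effective batch is the per-device batch times the accumulation steps.

```python
import math

# Values from the training details table above.
train_samples = 8_000
per_device_batch = 2
effective_batch = 16
lora_rank = 16
lora_alpha = 32

# Assumption: taken from the previous, auto-generated version of the card.
warmup_ratio = 0.03

# Gradient accumulation steps needed to reach the effective batch size.
grad_accum_steps = effective_batch // per_device_batch

# Optimizer steps in the single training epoch.
steps_per_epoch = math.ceil(train_samples / effective_batch)

# Approximate number of warmup steps under the assumed ratio.
warmup_steps = round(steps_per_epoch * warmup_ratio)

# Standard LoRA scaling factor applied to the low-rank update (alpha / r).
lora_scaling = lora_alpha / lora_rank

print(grad_accum_steps, steps_per_epoch, warmup_steps, lora_scaling)
```

These numbers agree with the removed training log, where step 200 was reported at epoch 0.4 (200 / 500).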
+ ## Limitations
+
+ - Trained specifically on NumPy-style docstrings; output may differ when Google or Sphinx style is requested
+ - Best on standalone functions under ~50 lines
+ - May repeat examples in generated output at very low temperatures
+ - Evaluated on the CodeSearchNet Python split only; performance on other codebases may vary
+
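
Editorial note (not part of the commit): the size guideline in the limitations can be checked before a function is sent to the model. The sketch below uses the standard-library `ast` module to measure function length in source lines. The helper names and the hard threshold of 50 are illustrative, since the card only gives an approximate figure.

```python
import ast

MAX_LINES = 50  # rough threshold suggested by the limitations list


def function_lengths(source: str) -> dict:
    """Map each function name in `source` to its length in source lines."""
    lengths = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the `def` line; end_lineno is the last line of the body.
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths


def within_recommended_size(source: str, max_lines: int = MAX_LINES) -> bool:
    """Return True if every function in `source` fits the size guideline."""
    return all(n <= max_lines for n in function_lengths(source).values())


example = "def calculate_bmi(weight_kg, height_m):\n    return weight_kg / (height_m ** 2)\n"
print(function_lengths(example), within_recommended_size(example))
```

Functions that exceed the threshold could still be submitted, but the card suggests quality is best below it.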
+ ## Citation
+
+ If you use this model, please cite the original QLoRA paper:
+
+ ```
+ @article{dettmers2023qlora,
+   title={QLoRA: Efficient Finetuning of Quantized LLMs},
+   author={Dettmers, Tim and others},
+   journal={arXiv preprint arXiv:2305.14314},
+   year={2023}
+ }
+ ```