LoRA inserts and trains low-rank decomposition matrices that approximate the update to the transformer's weight matrices. Prompt Tuning, on the other hand, typically learns a soft prompt, i.e. trainable prompt embeddings inside the model, rather than a hard prompt that a person writes to state the task directly. Both are effective lightweight (parameter-efficient) fine-tuning techniques, and soft prompt tuning in particular tends to work better than hand-written hard prompts.
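For intuition, a minimal sketch of the LoRA idea; the dimensions and variable names below are illustrative and not taken from kogpt:

import torch

d, k, r, alpha = 1024, 1024, 8, 32       # illustrative sizes; r is the LoRA rank
W = torch.randn(d, k)                    # frozen pretrained weight matrix
A = torch.randn(r, k) * 0.01             # trainable low-rank factor
B = torch.zeros(d, r)                    # trainable low-rank factor, zero-initialized so training starts from W
W_effective = W + (alpha / r) * (B @ A)  # low-rank update applied in the forward pass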
Both techniques can also be implemented using the peft module.
from peft import get_peft_model, PeftModel, TaskType, LoraConfig, PromptTuningConfig, PromptTuningInit
import os

# Locate the directory that holds the cached kogpt tokenizer files
for path, dirs, files in os.walk('/root/.cache/huggingface/hub/models--kakaobrain--kogpt'):
    for file in files:
        if file.endswith('tokenizer.json'):
            tokenizer_path = path
            print(tokenizer_path)
prompt_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=10,                     # number of trainable soft-prompt tokens
    prompt_tuning_init=PromptTuningInit.TEXT,  # initialize the soft prompt from real text
    prompt_tuning_init_text="Read the following and summarize:",
    tokenizer_name_or_path=tokenizer_path,
)
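Conceptually, the soft prompt is just a set of num_virtual_tokens trainable embedding vectors prepended to the input embeddings; a minimal sketch with made-up sizes (not kogpt's actual hidden size):

import torch

num_virtual_tokens, hidden_size = 10, 4096                                        # illustrative values only
soft_prompt = torch.nn.Parameter(torch.randn(num_virtual_tokens, hidden_size))    # the only trainable weights
input_embeds = torch.randn(1, 32, hidden_size)                                    # embeddings of the real input tokens
full_embeds = torch.cat([soft_prompt.unsqueeze(0), input_embeds], dim=1)          # fed to the frozen model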
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # rank of the LoRA decomposition
    lora_alpha=32,      # scaling factor
    lora_dropout=0.1,
    target_modules=['q_proj', 'v_proj'],     # attention projection layers to adapt
    # target_modules=r".*(q_proj|v_proj)",   # a regex pattern also works
)
However, the get_peft_model function accepts only the model and a single peft_config as parameters.
peft_model = get_peft_model(base_model, prompt_config)
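For comparison, applying only the LoRA config works the same way (a sketch, assuming base_model is the already-loaded kogpt model):

lora_model = get_peft_model(base_model, lora_config)
lora_model.print_trainable_parameters()   # prints how few parameters LoRA leaves trainable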
I want to use both techniques at the same time. How can I do that?