Force llama.cpp to output only one line - annna - Annna the nice friendly bot.
git clone git://bitreich.org/annna/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws6…
---
commit bef238f14dbe190805e443abd9026f6ede912668
parent 5e293f40c97b744439f3cabeb19ae749388b9116
Author: Julian Schweinsberg <[email protected]>
Date: Mon, 11 Nov 2024 13:02:58 +0100
Force llama.cpp to output only one line
sed is used to remove leading whitespace and the trailing [end of text]
marker, which can occur if fewer than the maximal number of tokens were
output. head -n1 is used to remove the empty leading line which appeared
in my testing.
Signed-off-by: Annna Robert-Houdin <[email protected]>
Diffstat:
M gpt | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
---
diff --git a/gpt b/gpt
@@ -3,16 +3,16 @@
#ggmlbase="/br/ai/ggml"
ggmlbase="/br/ai/llama.cpp"
#ggmlbin="./build/bin/gpt-2"
-ggmlbin="./build/bin/llama-simple"
+ggmlbin="./build/bin/llama-cli"
#ggmlmodel="models/gpt-2-1558M/ggml-model.bin"
ggmlmodel="models/zephyr-7b-beta.Q4_0.gguf"
ggmlntokens="69"
cd $ggmlbase
$ggmlbin -m $ggmlmodel -n $ggmlntokens \
- "$1 Begin all lines with OUTPUT:." 2>/dev/null \
- | grep "^OUTPUT:" \
- | cut -d':' -f 2- \
- | head -n 1 \
+ --simple-io --no-display-prompt --grammar 'root ::= ([^\x00-\x1F])*' \
+ -p "$1" 2>/dev/null \
+ | head -n1 \
+ | sed -E 's/^[[:blank:]]+//;s/[[:blank:]]*\[end of text\]$//' \
| tr -d '"'
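The cleanup stage of the new pipeline can be exercised on its own, without running llama-cli. A minimal sketch; the sample string below is made up to imitate raw generator output (leading blanks plus a trailing [end of text] marker, as described in the commit message), not actual model output:

```shell
#!/bin/sh
# Fabricated raw output: leading blanks, quotes, and the
# "[end of text]" marker that llama.cpp appends when generation
# stops before the token limit.
raw='   "Hello," said the bot. [end of text]'

# Same cleanup as the new pipeline: keep the first line, strip
# leading blanks and the trailing marker, then drop double quotes.
printf '%s\n' "$raw" \
	| head -n1 \
	| sed -E 's/^[[:blank:]]+//;s/[[:blank:]]*\[end of text\]$//' \
	| tr -d '"'
# prints: Hello, said the bot.
```

Note that the sed expression only matches the marker at end of line, so a literal "[end of text]" in the middle of a reply survives.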