View Full Version : Dllama - Local LLM Inference



drezgames
06-05-2024, 03:28 AM
https://dllama.dev/

A simple and easy-to-use library for doing local LLM inference directly from your favorite programming language.

Download
tinyBigGAMES/Dllama: Local LLM inference Library (github.com) (https://github.com/tinyBigGAMES/Dllama)

Simple example


uses
  System.SysUtils,
  Dllama;

begin
  // init
  if not Dllama_Init('config.json', nil) then
    Exit;
  try
    // add messages
    Dllama_AddMessage(ROLE_SYSTEM, 'You are a helpful AI assistant');
    Dllama_AddMessage(ROLE_USER, 'What is AI?');

    // do inference
    if Dllama_Inference('phi3', 1024, nil, nil, nil) then
    begin
      // success
    end
    else
    begin
      // error
    end;
  finally
    Dllama_Quit();
  end;
end.


Media

https://www.youtube.com/watch?v=fatbhKVTJeM

SilverWarior
07-05-2024, 04:11 PM
I just tried this project of yours and was not impressed. While it seemed that some text was being written by the AI, the output was an incoherent mess of various words. I don't have enough knowledge on this topic to even begin to guess what might be wrong.

Jonax
07-05-2024, 06:06 PM
I'm afraid I didn't even try the project. So much to do, so little time. Besides, I'm no GitHub user, so I can't really comment at all, other than the general point that it's good to see some Pascal activity. Keep up the good work.

drezgames
07-05-2024, 06:18 PM
I just tried this project of yours and was not impressed. While it seemed that some text was being written by the AI, the output was an incoherent mess of various words. I don't have enough knowledge on this topic to even begin to guess what might be wrong.
Hmm, which test did you run? What type of output did you get? Which model did you use? Did you change the template format in config.json? If the template is not correct, for example, you will not get correct output. Are you running in GPU or CPU mode, etc.?

The fact that local LLM inference was not even possible 2-3 years ago on consumer hardware is impressive in itself! It's amazing how fast AI is advancing.

If you can give some info about the problem, I'm sure we can get to the bottom of it. I want everyone to be able to enjoy AI.
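For reference, a per-model entry in config.json might look something like the sketch below. The field names here are hypothetical, shown only to illustrate the kind of chat-template setting being discussed (the template string itself follows the published Phi-3 prompt format); check the repository's sample config.json for the actual schema:

```json
{
  "model": "phi3",
  "template": "<|user|>\n{prompt}<|end|>\n<|assistant|>\n",
  "max_context": 1024
}
```

If the template string doesn't match the format the model was trained on, the model tends to produce exactly the kind of incoherent output described above.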

SilverWarior
07-05-2024, 10:22 PM
Hmm, which test did you run? What type of output did you get? Which model did you use? Did you change the template format in config.json? Are you running in GPU or CPU mode, etc.?

I ran the only uncommented test from the Example project, using the exact same model the Example project seems to be set up to use. The output is hard to describe: on one hand it seemed to contain only random numbers, but on the other it also looked as if it was outputting some commands, which doesn't seem to be the intended output. I used the existing prompts from the Example project by commenting and uncommenting the specific lines that store the example prompts.
The AI was supposed to run in AMD Vulkan mode.

When the existing example didn't work, I tried changing the model to GPT2, which required me to modify config.json. Trying the same prompts defined in the constants resulted in empty responses, except for one (I don't remember which) that returned "What are you doing?" as a response.

I will try again tomorrow if I find time. This time I intend to modify the example project so that it saves the console output to a text file, which I can then share with you for easier debugging.

PS: On the first run, the program stated it had done some extracting of GGUF data. Where is this extracted data stored? I haven't seen any new files created. I may have to delete it in order to repeat the extraction process in case something went wrong the first time. I did not get any error message.

drezgames
08-05-2024, 12:17 AM
I ran the only uncommented test from the Example project, using the exact same model the Example project seems to be set up to use. The output is hard to describe: on one hand it seemed to contain only random numbers, but on the other it also looked as if it was outputting some commands, which doesn't seem to be the intended output. I used the existing prompts from the Example project by commenting and uncommenting the specific lines that store the example prompts.
The AI was supposed to run in AMD Vulkan mode.

When the existing example didn't work, I tried changing the model to GPT2, which required me to modify config.json. Trying the same prompts defined in the constants resulted in empty responses, except for one (I don't remember which) that returned "What are you doing?" as a response.

I will try again tomorrow if I find time. This time I intend to modify the example project so that it saves the console output to a text file, which I can then share with you for easier debugging.

PS: On the first run, the program stated it had done some extracting of GGUF data. Where is this extracted data stored? I haven't seen any new files created. I may have to delete it in order to repeat the extraction process in case something went wrong the first time. I did not get any error message.
Did you happen to be using this prompt:

CPrompt = 'Почему снег холодный?'; // Why is snow cold?
If so, then this was to demonstrate the language capabilities of the tiny Phi3 model. I was surprised that this small model could output Russian text after being asked a question in Russian. Also, change gpu_layers to 0 in the config.json file to force it to run in CPU mode and see if you get coherent output. If you do, but not with the GPU enabled, maybe something is going on with the Vulkan backend. In that case, the next question would be: are you using updated drivers? I had a user with a similar issue, and updating the drivers fixed the problem.
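To force CPU mode as suggested, the change in config.json would look something like this sketch (the gpu_layers key is taken from the post above; the surrounding fields are illustrative placeholders, so check the repository's sample config for the real layout):

```json
{
  "model": "phi3",
  "gpu_layers": 0
}
```

Setting the layer count back to a positive number (or the model's full layer count) would re-enable GPU offload via the Vulkan backend.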

drezgames
08-05-2024, 12:22 AM
I'm afraid I didn't even try the project. So much to do, so little time. Besides, I'm no GitHub user, so I can't really comment at all, other than the general point that it's good to see some Pascal activity. Keep up the good work.
Hi, thanks. You can download it directly (https://github.com/tinyBigGAMES/Dllama/archive/refs/heads/main.zip) without having an account.

Jonax
10-05-2024, 11:28 AM
Thanks for the link :). I'm afraid I won't be able to try this anyway, despite the tempting foreign-language capabilities. But I have a couple of general questions:
Does this run on Embarcadero Delphi?
How much disk space is needed/recommended?

drezgames
10-05-2024, 06:30 PM
Thanks for the link :). I'm afraid I won't be able to try this anyway, despite the tempting foreign-language capabilities. But I have a couple of general questions:
Does this run on Embarcadero Delphi?
How much disk space is needed/recommended?
- Yes: Pascal (Delphi/Free Pascal) and C/C++ (C++ Builder, Visual Studio 2022, Pelles C).
- ~5MB for the distro, and then you will need a model to use. The smallest is Phi3, which is ~2.5GB in size; most models that I can run in VRAM are 4-8GB in size.