Hey everyone, I’m looking for a way to use an open-source local large language model (LLM) on Linux, particularly on low-spec hardware like a Raspberry Pi, to generate lengthy, coherent stories of 10k+ words from a single prompt. I recall reading about methods described in papers such as “Re3: Generating Longer Stories With Recursive Reprompting and Revision” (announced in a Twitter thread from October 2022) and “DOC: Improving Long Story Coherence With Detailed Outline Control” (announced in a Twitter thread from December 2022). Those papers used GPT-3, and since a while has passed, I was hoping there might be something similar built with only open-source tools. Does anyone have experience with this, or know of any resources that could help me achieve long, coherent story generation with an open-source LLM? Any advice or pointers would be greatly appreciated. Thank you!

  • kby@feddit.de · 7 months ago (edited)

    You can try setting up Ollama on your RPi, then use a highly quantized variant of the Mistral model (or quantize it yourself with GGUF + llama.cpp). Very heavy quantization (2-bit) will increase the error rate, but if you are only planning to use the generated text as a starting point, it might still be useful. Also see: https://github.com/ollama/ollama/blob/main/docs/import.md#importing-pytorch--safetensors
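
    For example, once Ollama is running you can hit its local REST API from Python. This is just a minimal sketch; the model tag below is an assumption, so check the Ollama library (or `ollama list`) for what you actually have installed:

      # Minimal sketch: call a locally running Ollama server from Python.
      # The model tag below is an assumption -- substitute whatever you pulled.
      import json
      import urllib.request

      def generate(prompt, model="mistral:7b-text-q2_K"):  # hypothetical tag
          req = urllib.request.Request(
              "http://localhost:11434/api/generate",  # Ollama's default endpoint
              data=json.dumps({"model": model, "prompt": prompt,
                               "stream": False}).encode(),
              headers={"Content-Type": "application/json"},
          )
          with urllib.request.urlopen(req) as resp:
              return json.loads(resp.read())["response"]

      print(generate("Write the opening scene of a mystery story."))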

    Here are some pre-quantized variants of Mistral 7B: https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF

    (All the tools and models mentioned in this comment are free and open source, and require no network connection during operation.)

  • Ziggurat@sh.itjust.works · 7 months ago

    Have you tried GPT4All (https://gpt4all.io/index.html)? It runs on the CPU, so it’s a bit slow, but it’s a plug-and-play, easy-to-use way to run various LLMs locally. That said, LLMs are huge and perform better on a GPU, provided the GPU is big enough. Here is the trap: how much do you want to spend on a GPU?
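
    If you’d rather script it than use the GUI, GPT4All also ships Python bindings. A minimal sketch, where the model filename is an assumption (pick any GGUF model from GPT4All’s download list):

      # Minimal sketch using GPT4All's Python bindings (pip install gpt4all).
      # The model filename is an assumption; it is fetched on first use.
      from gpt4all import GPT4All

      model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")  # hypothetical file
      print(model.generate(
          "Outline a three-act story about a lighthouse keeper.",
          max_tokens=400,
      ))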

    • kindenough@kbin.social · 7 months ago

      On GPU it is okay: a GTX 1080 with an R7 3700X.

      It just wrote a 24-page tourist-info booklet about the town I live in, and a lot of what it says about places to go is inaccurate or outdated. Fun and impressive anyway, and it only took a few minutes.

    • PeterPoopshit@lemmy.world · 7 months ago (edited)

      If you get just the right GGUF model (read the description when you download them so you pick the right K-quant variant) and actually use multithreading (llama.cpp supports multithreading, so in theory GPT4All should too), it’s reasonably fast. I’ve achieved roughly half the speed of ChatGPT on just an 8-core AMD FX with DDR3 RAM. Even 20B models can be usably fast.
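
      For reference, here is a minimal sketch of turning on multithreading through the llama-cpp-python bindings; the model path is an assumption:

        # Minimal sketch: CPU multithreading via llama-cpp-python
        # (pip install llama-cpp-python). The model path is an assumption.
        from llama_cpp import Llama

        llm = Llama(
            model_path="./mistral-7b-v0.1.Q4_K_M.gguf",  # hypothetical local file
            n_threads=8,   # match your physical core count
            n_ctx=4096,    # context window
        )
        out = llm("Continue the story:", max_tokens=256)
        print(out["choices"][0]["text"])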

  • Dojan@lemmy.world · 7 months ago

    The open-source LLMs are really capable; I think the method used to feed it the plot might be the more important part of making this work.

  • Thavron@lemmy.ca · 7 months ago

    Are you looking to make an easy buck by generating novels and self-publishing them on Amazon?

  • PeterPoopshit@lemmy.world · 7 months ago (edited)

    This probably isn’t very helpful, but the best way I’ve found to make an AI write an entire book is still a lot of work. You have to make it write the book in sections, adjust the prompts based on what’s happening, and spend a lot of time copy-pasting the good sentences into a better-quality section, then use those blocks of text to build chapters. You’re basically compiling a document out of AI-written drafts rather than making the AI shit it out in one continuous stream.

    If you could come up with a way to make an AI produce a document using only complete sentences from other AI-generated documents, maybe you could achieve a higher level of automation and still get similar quality. Otherwise it’s just as difficult as writing a book yourself.
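
    Here’s a rough sketch of that section-by-section loop, assuming a local generate() helper like the Ollama one earlier in the thread; the outline and prompts are illustrative, not a tested recipe:

      # Rough sketch of section-by-section book generation with a rolling
      # summary. Assumes a generate(prompt) helper backed by a local model
      # (e.g. the Ollama sketch above); prompts are illustrative only.
      def write_book(premise, n_chapters=10):
          outline = generate(
              f"Write a {n_chapters}-chapter outline for this premise: {premise}")
          chapters, summary = [], ""
          for i in range(1, n_chapters + 1):
              chapter = generate(
                  f"Premise: {premise}\n"
                  f"Outline:\n{outline}\n"
                  f"Summary of the story so far: {summary}\n"
                  f"Write chapter {i} in full, consistent with the above.")
              chapters.append(chapter)
              # Re-summarize so later chapters stay coherent without
              # feeding the model the whole book every time.
              summary = generate(f"Briefly summarize:\n{summary}\n{chapter}")
          return "\n\n".join(chapters)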

    As for software, use llama.cpp. It’s a CPU-based inference engine that can use multiple cores. You probably aren’t getting an Nvidia GPU running on any ARM board unless you have a really long white neckbeard and a degree in computer engineering. Download GGUF-compatible models for llama.cpp from Hugging Face.