I’ve been thinking about adding this to my “Fuck it, I’ll do it myself” / SHTF pile. I have a spare 10-15GB for a good selection of basic articles (across sciences, history, pop culture trivia etc).
https://get.kiwix.org/en/solutions/hotspots/content-bundles/
https://get.kiwix.org/en/solutions/hotspots/imager-service/
There’s something inherently cool about having wikipedia in a box (yes, you’d likely need to refresh it once a year) but I’ve never heard of anyone actually self hosting a Kiwix instance.


Do you actually train the LLM or use RAG? I have been looking for a local LLM + Wikipedia RAG solution for a while now.
For now I just have kiwix-serve + searxng doing a simple search but the Kiwix search is…questionable.