
PrivateGPT with Llama 2 uncensored

https://github.com/ollama/ollama/assets/3325447/20cf8ec6-ff25-42c6-bdd8-9be594e3ce1b

Note: this example is a slightly modified version of PrivateGPT that runs models such as Llama 2 Uncensored. All credit for PrivateGPT goes to its creator, Iván Martínez; see his GitHub repository.

Setup

Set up a virtual environment (optional):

python3 -m venv .venv
source .venv/bin/activate

Install the Python dependencies:

pip install -r requirements.txt

Pull the model you'd like to use:

ollama pull llama2-uncensored

Getting WeWork's latest quarterly earnings report (10-Q)

mkdir source_documents
curl https://d18rn0p25nwr6d.cloudfront.net/CIK-0001813756/975b3e9b-268e-4798-a9e4-2a9a7c92dc10.pdf -o source_documents/wework.pdf

Ingesting files

python ingest.py

Output should look like this:

Creating new vectorstore
Loading documents from source_documents
Loading new documents: 100%|██████████████████████| 1/1 [00:01<00:00,  1.73s/it]
Loaded 1 new documents from source_documents
Split into 90 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Using embedded DuckDB with persistence: data will be stored in: db
Ingestion complete! You can now run privateGPT.py to query your documents
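
The "Split into 90 chunks of text" line reflects a chunking step that runs before the embeddings are created. As a rough illustration only, fixed-size chunking with overlap could be sketched as below. This is not the code in ingest.py: the helper name `split_into_chunks` and the overlap of 50 are assumptions made for this sketch, and it counts characters rather than tokens to stay dependency-free.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters, so content near a
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "a" * 1200
    print([len(c) for c in split_into_chunks(sample)])
```

The overlap trades a little extra storage for a better chance that a passage relevant to a query is not cut in half at a chunk boundary.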

Ask questions

python privateGPT.py

Enter a query: How many locations does WeWork have?

> Answer (took 17.7 s.):
As of June 2023, WeWork has 777 locations worldwide, including 610 Consolidated Locations (as defined in the section entitled Key Performance Indicators).

Try a different model:

ollama pull llama2:13b
MODEL=llama2:13b python privateGPT.py
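
The `MODEL=llama2:13b` prefix sets an environment variable for that single invocation. A minimal sketch of how a script could pick it up is shown below. It assumes the variable name `MODEL` from the command above and a fallback to `llama2-uncensored`; the helper `resolve_model` is hypothetical and is not taken from privateGPT.py.

```python
import os
from typing import Mapping, Optional


def resolve_model(env: Optional[Mapping[str, str]] = None,
                  default: str = "llama2-uncensored") -> str:
    """Return the model name from the MODEL variable, or the default.

    `env` is injectable so the lookup can be exercised without touching
    the real process environment.
    """
    if env is None:
        env = os.environ
    return env.get("MODEL", default)


if __name__ == "__main__":
    print(f"Using model: {resolve_model()}")
```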

Adding more files

Put any additional files into the source_documents directory.

The supported extensions are:

  • .csv: CSV
  • .docx: Word Document
  • .doc: Word Document
  • .enex: Evernote
  • .eml: Email
  • .epub: EPUB
  • .html: HTML File
  • .md: Markdown
  • .msg: Outlook Message
  • .odt: Open Document Text
  • .pdf: Portable Document Format (PDF)
  • .pptx: PowerPoint Document
  • .ppt: PowerPoint Document
  • .txt: Text file (UTF-8)
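
To check which files in a directory match the extensions listed above, a small helper along these lines may be useful. It is only a sketch: `SUPPORTED_EXTENSIONS` and `find_supported_files` are names invented here, and the case-insensitive match and recursive search are assumptions rather than documented behaviour of ingest.py.

```python
from pathlib import Path
from typing import List

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".csv", ".docx", ".doc", ".enex", ".eml", ".epub", ".html",
    ".md", ".msg", ".odt", ".pdf", ".pptx", ".ppt", ".txt",
}


def find_supported_files(source_dir: str) -> List[Path]:
    """Return files under source_dir whose extension is in the supported set."""
    root = Path(source_dir)
    return sorted(
        path
        for path in root.rglob("*")  # walk subdirectories as well
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for path in find_supported_files("source_documents"):
        print(path)
```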