How to train your GPT (Part 2)
Uploading your custom documents to give your GPT a knowledge base
Welcome back to Year 2049, your source of practical insights, case studies, and resources to help you embrace and harness the power of AI in your life, work, and business.
How to customize your GPT’s knowledge base
One of the powers of building custom GPTs is uploading your own documents to give your GPT a unique “brain”, or knowledge base.
In Part 1 of this series, I covered how I integrated my newsletter’s content (75+ articles) into my GPT’s knowledge base.
Many of you sent me follow-up questions about uploading private documents directly into the GPT’s knowledge base, as opposed to using the web browsing feature to access content that already exists publicly online.
So, we’ll cover that in Part 2 today!
Choose your documents wisely
Under the Configure tab, you can upload files in the Knowledge section. Keep in mind that you can upload a maximum of 10 documents (at the time of writing).
The biggest mistake you can make is uploading documents all willy-nilly into your GPT.
Before you know it, your GPT’s sifting through documents like he’s Harry Potter swarmed by Hogwarts letters, just to answer a simple question. So make sure to define specific use cases and only upload knowledge that’s relevant for your GPT’s tasks.
🚨 Beware:
People using your GPT may be able to download your documents if Code Interpreter is enabled in the Capabilities section (right below Knowledge). Make sure to turn it off if you don’t want anyone accessing the full documents you’ve uploaded.
File formats
You can upload most common file formats into your GPT, including:
Text documents: TXT or Word
Spreadsheets
Presentations
PDFs
Regardless of the format, make sure you upload a clean and readable document that a computer can easily parse. My assumption is that GPTs use RAG (retrieval-augmented generation): your documents are split into chunks, and only the chunks most relevant to a question are retrieved and passed to the model.
For that reason, try to upload documents with simple, one-column layouts that can be easily parsed, cut up, and stored. If your data is all text, your safest bet is a TXT file. Beware of documents with complex layouts (like PDFs with two columns of content), as the content may be chunked and stored inaccurately.
If you rewatch Sam Altman’s live demo of building a GPT with custom documents, you’ll notice he used a TXT file containing his lecture notes. So if you have complex documents, it’s worth spending a bit of time reformatting them to ensure good knowledge transfer.
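To make the layout problem concrete, here is a minimal Python sketch. It assumes a simple character-based chunker with overlap, which is a common RAG technique but not necessarily what OpenAI does behind the scenes. The function name chunk_text and the sample two-column page are made up for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping character windows.

    Assumption: a simple illustrative chunker, not OpenAI's actual method.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Hypothetical two-column page: each list holds the lines of one column.
left_column = ["Payday is on the 15th", "of every month."]
right_column = ["Holidays are listed", "in the company calendar."]

# Reading each column top to bottom keeps the sentences intact.
column_by_column = " ".join(left_column + right_column)

# A naive extractor that reads straight across the page interleaves the columns.
row_by_row = " ".join(f"{l} {r}" for l, r in zip(left_column, right_column))

if __name__ == "__main__":
    for label, text in [("clean", column_by_column), ("scrambled", row_by_row)]:
        print(label, chunk_text(text, chunk_size=40, overlap=10))
```

In the scrambled version, the payday sentence is broken up by text from the other column before any chunking happens, so no chunk can contain a clean answer. That is the kind of damage a one-column layout avoids.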
But don’t worry, I built a custom GPT you can use to simplify your documents and convert them into a TXT format (how meta, I know!). I’ve used my GPT-Friendly Document Maker to convert a few PDFs and it works like a charm. Plus, it significantly reduces the file size!
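If you’d rather clean up text yourself, here is a rough sketch of the kind of tidying that helps, using only Python’s standard library. This is not how my GPT works internally. The function name to_plain_text and the cleanup rules are assumptions for illustration, and they are heuristics that can misfire (for example, a real hyphenated word split across lines will lose its hyphen).

```python
import re


def to_plain_text(raw: str) -> str:
    """Tidy text copied out of a PDF or Word file into simple paragraphs."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Re-join words hyphenated across a line break: "employ-\nees" -> "employees".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines that contain nothing but a number (likely page numbers).
    text = re.sub(r"(?m)^[ \t]*\d+[ \t]*$\n?", "", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Treat blank lines as paragraph breaks and join the lines inside each paragraph.
    paragraphs = []
    for block in re.split(r"\n\s*\n", text):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    sample = "Onboarding   guide\nfor new employ-\nees.\n\n12\n\nPayday is the 15th.\n"
    print(to_plain_text(sample))
```

Save the result as a TXT file and skim it before uploading, since no automatic cleanup catches everything.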
Recommended instructions
My initial attempts at integrating my private documents into my GPT were extremely frustrating. It kept ignoring the documents I uploaded and instead answered from GPT-4’s general knowledge.
I landed on an instruction that seems to be working for now, and you’ll find this useful if you want your GPT to only answer questions based on the documents you uploaded:
Prompt:
This GPT should always search its knowledge base before answering
I noticed that OpenAI was using those exact words when a GPT was referencing its knowledge base, so I decided to reuse them in my instruction. This has helped the GPT drift less into GPT-4’s general knowledge and focus more on the documents I uploaded.
Based on your use case, you may want to be even more specific with that prompt. If you uploaded multiple documents, you may need to specify which document to reference for specific questions.
For example: you may have built a GPT to onboard new employees. The GPT should reference the FAQ document for general questions, but the Company Calendar for questions relating to paydays and holidays.
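If you have several documents, it can help to write out one rule per document. Here is a small Python sketch that assembles such an instruction from a topic-to-document mapping. The function name build_instructions, the sentence templates, and the file names are all hypothetical, so adapt the wording to your own GPT.

```python
def build_instructions(routes: dict[str, str], fallback: str) -> str:
    """Build instruction text that tells the GPT which document to use for which topic."""
    lines = ["This GPT should always search its knowledge base before answering."]
    for topic, document in routes.items():
        lines.append(f"For questions about {topic}, reference {document}.")
    lines.append(f"For anything else, reference {fallback}.")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical file names for the onboarding example.
    print(build_instructions(
        {"paydays and holidays": "Company Calendar.txt"},
        fallback="FAQ.txt",
    ))
```

Paste the printed text into the Instructions field under the Configure tab, then test it with a question for each document.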
Videos from the week
I’m back in the rhythm of posting videos on TikTok and Instagram, and here’s what you missed if the algorithm is hiding me from you:
🤝 ChatGPT now has Team plans (TikTok | Instagram)
🐰 The Rabbit R1 is what Siri should’ve been (TikTok | Instagram)
🤯 Scientific AI is blowing my mind (TikTok | Instagram)
🤖 Sam Altman: AI will reduce the cost of intelligence significantly (TikTok | Instagram)