Next.js OpenAI Doc Search Starter
Template for building your own custom ChatGPT-style doc search powered by Next.js, OpenAI, and Supabase.
This starter takes all the .mdx files in the pages directory and processes them to use as custom context within OpenAI Text Completion prompts.
Deploy
Deploy this starter to Vercel. The Supabase integration will automatically set the required environment variables and configure your database schema. All you have to do is set your OPENAI_KEY and you're ready to go!
Technical Details
Building your own custom ChatGPT involves four steps:
- [👷 Build time] Pre-process the knowledge base (your .mdx files in your pages folder).
- [👷 Build time] Store embeddings in Postgres with pgvector.
- [🏃 Runtime] Perform vector similarity search to find the content that's relevant to the question.
- [🏃 Runtime] Inject content into OpenAI GPT-3 text completion prompt and stream response to the client.
👷 Build time
Steps 1 and 2 happen at build time, e.g. when Vercel builds your Next.js app. During the build, the generate-embeddings script runs and performs the following tasks:
sequenceDiagram
    participant Vercel
    participant DB (pgvector)
    participant OpenAI (API)
    loop 1. Pre-process the knowledge base
        Vercel->>Vercel: Chunk .mdx pages into sections
        loop 2. Create & store embeddings
            Vercel->>OpenAI (API): create embedding for page section
            OpenAI (API)->>Vercel: embedding vector(1536)
            Vercel->>DB (pgvector): store embedding for page section
        end
    end
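The chunking step in the diagram above could look roughly like the sketch below. It assumes sections are delimited by Markdown heading lines. The Section type and the splitIntoSections function are hypothetical names used for illustration; the starter's generate-embeddings script may split pages differently.

```typescript
// Hypothetical shape for one chunk of a page; not the starter's actual type.
interface Section {
  heading: string | null; // null for content that precedes the first heading
  content: string;
}

// Split Markdown/MDX source into sections at heading lines (#, ##, ...).
// Simplification: "#" lines inside fenced code blocks are not special-cased.
function splitIntoSections(source: string): Section[] {
  const sections: Section[] = [];
  let heading: string | null = null;
  let lines: string[] = [];

  // Push the section collected so far, skipping an empty leading section.
  const flush = (): void => {
    const content = lines.join("\n").trim();
    if (heading !== null || content.length > 0) {
      sections.push({ heading, content });
    }
  };

  for (const line of source.split(/\r?\n/)) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = (match[1] ?? "").trim();
      lines = [];
    } else {
      lines.push(line);
    }
  }
  flush();
  return sections;
}
```

Each returned section would then be sent to the embeddings API separately, so that a later similarity search can return only the relevant part of a page.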
In addition to storing the embeddings, the generate-embeddings script generates a checksum for each of your .mdx files and stores it in a separate database table, so that embeddings are only regenerated when a file has changed.
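As a rough illustration of the checksum idea, the sketch below hashes each file's content and reports which pages are stale. It assumes SHA-256 and in-memory maps in place of the database table; the hash algorithm the script actually uses is not stated here, and checksum and pagesToRegenerate are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// SHA-256 digest of a file's text, base64-encoded (algorithm is an assumption).
function checksum(content: string): string {
  return createHash("sha256").update(content).digest("base64");
}

// `stored` maps a file path to the checksum saved on the previous run.
// `current` maps a file path to the file's current content.
// Returns the paths that are new or whose content has changed.
function pagesToRegenerate(
  stored: Map<string, string>,
  current: Map<string, string>,
): string[] {
  const stale: string[] = [];
  current.forEach((content, path) => {
    if (stored.get(path) !== checksum(content)) {
      stale.push(path);
    }
  });
  return stale;
}
```

Only the returned paths would need new embedding requests, which keeps repeated builds cheap when most pages are unchanged.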
🏃 Runtime
Steps 3 and 4 happen at runtime, whenever the user submits a question. When this happens, the following sequence of tasks is performed:
sequenceDiagram
    participant Client
    participant Edge Function
    participant DB (pgvector)
    participant OpenAI (API)
    Client->>Edge Function: { query: lorem ipsum }
    critical 3. Perform vector similarity search
        Edge Function->>OpenAI (API): create embedding for query
        OpenAI (API)->>Edge Function: embedding vector(1536)
        Edge Function->>DB (pgvector): vector similarity search
        DB (pgvector)->>Edge Function: relevant docs content
    end
    critical 4. Inject content into prompt
        Edge Function->>OpenAI (API): completion request prompt: query + relevant docs content
        OpenAI (API)-->>Client: text/event-stream: completions response
    end
The relevant files for this are the SearchDialog (Client) component and the vector-search (Edge Function).
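To make the two runtime steps concrete, here is a minimal in-memory sketch. It is not the code of the vector-search function: in the starter the ranking happens inside Postgres via pgvector, and cosine similarity is used here only as a stand-in because the section does not say which operator the SQL uses. StoredSection, topMatches, buildPrompt, the threshold, and the prompt wording are all illustrative assumptions.

```typescript
// Hypothetical in-memory stand-in for a row in the page-section table.
interface StoredSection {
  content: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Step 3: keep sections scoring at least `threshold`, best first, at most `limit`.
function topMatches(
  query: number[],
  sections: StoredSection[],
  threshold: number,
  limit: number,
): StoredSection[] {
  return sections
    .map((section) => ({ section, score: cosineSimilarity(query, section.embedding) }))
    .filter((m) => m.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((m) => m.section);
}

// Step 4: inject the matched content into a completion prompt.
function buildPrompt(question: string, matches: StoredSection[]): string {
  const context = matches.map((m) => m.content.trim()).join("\n---\n");
  return [
    "Answer the question using only the context sections below.",
    "",
    "Context sections:",
    context,
    "",
    `Question: ${question}`,
    "",
    "Answer:",
  ].join("\n");
}
```

The query embedding passed to topMatches would come from the same embedding model used at build time, so that the query and the stored sections live in the same 1536-dimensional space.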
The initialization of the database, including the setup of the pgvector extension, is stored in the supabase/migrations folder, which is automatically applied to your local Postgres instance when running supabase start.
Local Development
Configuration
- Copy the example environment file: cp .env.example .env
- Set your OPENAI_KEY in the newly created .env file.
- Set NEXT_PUBLIC_SUPABASE_ANON_KEY and SUPABASE_SERVICE_ROLE_KEY. Note: You have to run Supabase to retrieve the keys.
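If you want to catch a missing key early, a small check like the sketch below can help. It only covers the three variables named above (your .env.example may list others), and missingEnvKeys is a hypothetical helper, not part of the starter.

```typescript
// The three keys this section asks you to set.
const REQUIRED_KEYS = [
  "OPENAI_KEY",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

// Return the required keys that are unset or blank in the given environment.
function missingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js script you could call missingEnvKeys(process.env) at startup and stop with a clear message if the returned list is not empty.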
Start Supabase
Make sure you have Docker installed and running locally. Then run:
supabase start
To retrieve NEXT_PUBLIC_SUPABASE_ANON_KEY and SUPABASE_SERVICE_ROLE_KEY, run:
supabase status
Start the Next.js App
In a new terminal window, run:
pnpm dev
Using your custom .mdx docs
- By default, your documentation needs to be in .mdx format. You can do this by renaming an existing (or compatible) Markdown .md file to .mdx.
- Run pnpm run embeddings to regenerate embeddings. Note: Make sure Supabase is running. To check, run supabase status. If it is not running, run supabase start.
- Run pnpm dev again to refresh the Next.js page rendered at localhost:3000.
Learn More
- Read the blog post on how we built ChatGPT for the Supabase Docs.
- [Docs] pgvector: Embeddings and vector similarity
- Watch Greg's "How I built this" video on the Rabbit Hole Syndrome YouTube Channel.
Licence
Apache 2.0