This is #2 of the Semantic Search with PostgreSQL series. In #1, I have setup PostgreSQL and pgvector extension. This time, let’s build on top of these and perform different queries.
By the end of this post, we will be able to search for photos using an arbituary location, natural language and image. We will also be able to combine everything to pull only photos shot by Nikon within 50km of New York City that contains modern architecture.
Why PostgreSQL?
In the previous experiment, we loaded the Unsplash 25k image embeddings with their respective metadata into Chroma, and performed semantic search. Chroma is super fast with an in-memory implementation of its embeddings/metadata. As long as our usage scenario is mostly similarity search using embeddings and basic metadata filtering, I think it is the perfect choice.
However, in scenarios where the data is more complex (e.g., e-commerce with photos, product info, users and transactions), a real relational database that allows combining vector similarity search with other SQL filter/join in a single SQL query will fit much better. Or in my existing project, where I need semantic (photo and text), relational (joining various tables of ryokan information) AND geospatial search, a more versatile platform such as PostgreSQL with extensions will fit the bill perfectly.
In this experiment, the Unsplash lite dataset contains several files, each linked up by the photo_id. Imagine this trip planning scenario:
- I have a photo of a landmark (e.g., Tokyo Tower)
- I want to find the most similar photos to it
- The targets must be shot around Tokyo (or, if we are adventurous, we can target Shanghai to see if there’s anything there that looks like Tokyo Tower)
- Once I have the targets, I want to check out the photo collections they are in, to expand my exploration. For example, if I get Oriental Pearl Tower in Shanghai as a target, it might have been added to a photo collection of other Shanghai modern skyscrappers. When I retrieve those, I can visit more places in my trip!
This scenario will be very hard to do in Chroma, especially #3 (spatial search) and #4 (the classic SQL join), and that’s when PostgreSQL comes in.
Obtain Unsplash dataset
Get all metadata
note: this is a rehash of what we did in the previous post. I am reposting it here for clarity.
First, we need to download the Unsplash Lite dataset with metadata such as photo ID, actual image URL, photographer name, description generated by AI, geolocation, image EXIF data, keyword used for the iamge, colors, etc.
After downloading the file, we load the data into a dataframe.
import numpy as np
import pandas as pd
import glob
path = '.'
documents = ['photos', 'keywords', 'collections', 'conversions', 'colors']
datasets = {}
for doc in documents:
print(doc)
files = glob.glob(doc + ".csv000*")
print(files)
subsets = []
for filename in files:
print(filename)
df = pd.read_csv(filename, sep='\t', header=0)
subsets.append(df)
if subsets:
datasets[doc] = pd.concat(subsets, axis=0, ignore_index=True)
else:
# Handle the empty case
datasets[doc] = pd.DataFrame() Get precomputed image embeddings
note: again this is a rehash of the previous post.
For the 25k Unsplash images, we will use precomputed image embeddings (50 Mb total size)provided by the SentenceTranformer documentation.
import requests
url = "http://sbert.net/datasets/unsplash-25k-photos-embeddings.pkl"
response = requests.get(url)
with open("unsplash-25k-photos-embeddings.pkl","wb") as file:
file.write(response.content)We then deserialize the file to get both the image names and embeddings.
import pickle
emb_filename="unsplash-25k-photos-embeddings.pkl"
with open(emb_filename, 'rb') as fIn:
img_names, img_emb = pickle.load(fIn)Get source images
We’ll also save the 25k Unsplash images (1.8 GiB total size) for offline use, per SentenceTranformer documentation.
img_folder = 'photos/'
if not os.path.exists(img_folder) or len(os.listdir(img_folder)) == 0:
os.makedirs(img_folder, exist_ok=True)
photo_filename = 'unsplash-25k-photos.zip'
if not os.path.exists(photo_filename): #Download dataset if does not exist
util.http_get('http://sbert.net/datasets/'+photo_filename, photo_filename)
#Extract all images
with zipfile.ZipFile(photo_filename, 'r') as zf:
for member in tqdm(zf.infolist(), desc='Extracting'):
zf.extract(member, img_folder)Ingest to Postgres
Next, we will use PostgreSQL to store the embeddings, together with their respective metadata.
Prepare data
Since we want to link up the embeddings to their respective metadata, we need to find a common value between them.
- in the embedding pickle file, the
img_namesis in the formatxyz.jpg - in
datasets['photos']dataframe, aphoto_idcolumn stores each image name without the.jpgsuffix
Thus we will
- Strip the
.jpgsuffix from the pickle’simg_names - Rename that column to
photo_id - Merge it with the dataframe using its
photo_idcolumn
First we strip the suffix
photo_ids = [name.rsplit('.', 1)[0] for name in img_names]We then create a DataFrame with one row per image and an embedding column whose entries are vectors we can push into pgvector later. For the embeddings, we normalize them to a 1-D Python list where each element is itself a 1-D list/array, so psycopg2 serializes it correctly.
df = pd.DataFrame({
'photo_id': photo_ids,
'embedding': [row for row in img_emb]
})We then merge the embeddings with metadata, unified by the photo_id key.
metadata = datasets['photos']
df = df.merge(metadata, on='photo_id', validate='one_to_one')Later, we will use photo_id as the primary key in our Postgres table DDL.
One last thing, we need to turn each numpy.ndarray of the embedding column into a plain Python list. Python lists adapt cleanly into Postgres arrays (and pgvector will parse them correctly).
df[‘embedding’] = df[‘embedding’].apply(lambda arr: arr.tolist())
Create a new database
Create a dedicated database:
sudo-u postgres createdb imagesdbEnable the vector extension:
sudo -u postgres psql -d imagesdb -c "CREATE EXTENSION IF NOT EXISTS vector;"
Connect to database from Python
Install psycopg2 (the PostgreSQL adapter for Python):
pip install psycopg2-binaryConnect to the newly created
imagesdbdatabaseconn = psycopg2.connect( host="127.0.0.1", port=5432, dbname="imagesdb", user="postgres", password=os.getenv("PG_PASS") ) cur = conn.cursor()
Enable pgvector extension
Even though we’ve installed the pgvector package in the previous post, the vector data type only exists in databases where we’ve enabled the extension. we have to run:CREATE EXTENSION vector; in every database where we plan to use the VECTOR(...) data type.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.commit()Create Vector Index (HNSW)
HNSW (Hierarchical Navigable Small Worlds) is an advanced graph-based indexing method used in pgvector for efficient approximate nearest neighbor (ANN) search on high-dimensional vector data. It is a popular choice for AI and machine learning applications like Retrieval-Augmented Generation (RAG) due to its excellent balance of search performance (speed) and result accuracy (recall).
It organizes vectors into a multi-layered graph. Searches start at the sparse top layer and quickly navigate down to the denser bottom layers to find the nearest neighbors, allowing the algorithm to “jump” over large amounts of data. The index structure remains optimal as new data is added, supporting concurrent inserts, updates, and deletes.
CREATE INDEX IF NOT EXISTS images_vector_idx
ON images
USING hnsw (embedding vector_l2_ops);When creating and querying an HNSW index, we can tune specific parameters to balance performance and accuracy. We will look at these parameters in my next post when benchmarking pgvector against Chroma.
Create a table
We need to make sure the table’s VECTOR(<dim>) matches our actual embedding length.
Since the metadata contains a column stats_views with a maximum value of 2.7 billion, we have to use BIGINT for that column. Ideally, we’ll detect overflows in any integer column and apply BIGINT automatically.
dim = len(df['embedding'].iloc[0])
int32_max = np.iinfo(np.int32).max
col_defs = []
for col, dtype in df.drop(['photo_id','embedding'], axis=1).dtypes.items():
if dtype.kind == 'i':
pg_type = 'BIGINT' if df[col].max() > int32_max else 'INTEGER'
elif dtype.kind == 'f':
pg_type = 'DOUBLE PRECISION'
else:
pg_type = 'TEXT'
col_defs.append(f"{col} {pg_type}")
create_sql = f"""
DROP TABLE IF EXISTS images;
CREATE TABLE images (
photo_id TEXT PRIMARY KEY,
embedding VECTOR({dim}),
{', '.join(col_defs)}
);
"""
cur.execute(create_sql)
conn.commit()Insert rows
columns = df.columns.tolist()
insert_sql = sql.SQL("INSERT INTO images ({}) VALUES %s").format(
sql.SQL(', ').join(map(sql.Identifier, columns))
)
records = [
tuple([row['photo_id'], row['embedding']] +
[row[c] for c in columns if c not in ('photo_id','embedding')])
for _, row in df.iterrows()
]
execute_values(cur, insert_sql.as_string(cur), records, page_size=1000)
conn.commit()Verify
Login to postgres and connect to the
imagesdbdatabase we just createdsudo -u postgres psql -d imagesdbList all tables in the database:
\dtWe’ll see the
imagestable we just created.Examine the
imagestable’s schema (column names, data types, default values, indices) with\d images, or get extended details (including storage and size) with\d+ imagesInspect
imagestable’s row count to make sure everything is ingested:SELECT COUNT(*) FROM images;
Query
First let’s connect to our database
import psycopg2
conn = psycopg2.connect(
host="127.0.0.1", port=5432,
dbname="imagesdb", user="postgres",
password=os.getenv("PG_PASS")
)
cur = conn.cursor()Search metadata
Let’s now try a simple query to display the first 5 images taken by Canon, using the exif_camera_make column in the metadata.
from IPython.display import HTML, display
cur.execute("SELECT photo_id FROM images WHERE exif_camera_make ILIKE %s LIMIT 5;", ("Canon",))
ids = [r[0] for r in cur.fetchall()]
img_tags = ""
for pid in ids:
path = f"photos/{pid}.jpg"
if os.path.exists(path):
img_tags += f"<div style='margin:4px'><img src='{path}' width='200'></div>"
else:
img_tags += f"<div style='margin:4px;color:red'>missing: {pid}.jpg</div>"
html = f"""
<div style="
display: flex;
flex-direction: row;
align-items: flex-start;
flex-wrap: nowrap;
">
{img_tags}
</div>
"""
display(HTML(html))we will get the following

Semantic query
Let’s reuse the embedding function from my previous post to generate embeddings for both text and image queries.
def get_embedding(query, type):
"""
Obtain an embedding using SentenceTransformer with CLIP.
This function handles both text and image, denoted by "type"
For image, if the query string starts with "http", it downloads the image.
Otherwise, it treats the query as a local file path.
Returns a NumPy array embedding.
"""
from PIL import Image
import requests
from io import BytesIO
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('clip-ViT-B-32')
if type=="image":
# Check if query is an online URL.
if query.startswith("http"):
response = requests.get(query)
if response.status_code != 200:
raise Exception(f"Failed to download image from URL: {query}")
image = Image.open(BytesIO(response.content))
else:
image = Image.open(query)
# Get the image embedding as a NumPy array.
embedding = model.encode(image, convert_to_tensor=False)
else:
embedding = model.encode(query)
return embeddingQuery with text
We can then do a text query to find the most similar images with a text phrase. We will be using pgvector’s K-NN search operator (<->) to rank by distance. The steps are
Compute a CLIP-ViT-B/32 embedding for our query text
query_emb = get_embedding("beautiful lake","text")Execute a SQL query against the
imagestable. Let’s break this down:- In psycopg2,
%sis a placeholder for a query parameter. When we callcur.execute(sql, (query_emb.tolist(),))later, psycopg2 takesquery_emb.tolist()and substitutes it for%sin the SQL statement, handling string quoting and escaping exactly as Postgres expects. ::vectorcasts that query embedding array into the pgvector vector type- The
<->operator now accepts the parameter as a true vector and computes the nearest-neighbor Euclidean distance. AS distancerenames that computed distance so we can reference it inORDER BY, to sorts the result set in ascending order.query_emb.tolist(): turns the embedding NumPy array into a plain Python list of floats. psycopg2 does not know how to adapt a NumPy ndarray, but can serialize a Python list into a Postgres array literal (e.g. {0.1,0.2,…}).
# Parameterized KNN query: smallest distance first sql = """ SELECT photo_id, exif_camera_make, embedding <-> %s::vector AS distance FROM images ORDER BY distance LIMIT 5; """ cur.execute(sql, (query_emb.tolist(),)) results = cur.fetchall()- In psycopg2,
Display the thumbnails inline, sorted by semantic distance from the text query.
items = [] for pid, make, dist in results: img_src = f"photos/{pid}.jpg" items.append(f""" <div style="margin:4px;text-align:center"> <img src="{img_src}" width="200"><br> <small>{pid}<br>{make}<br>d={dist:.2f}</small> </div> """) html = f"<div style='display:flex'>{''.join(items)}</div>" display(HTML(html))
We get this result: 
Query with image
Similarly, we can do an image query with the same Tokyo Tower photo we used last time. Just change the first line of code to
query_emb = get_embedding("tokyo-wallpaper.jpg","image")
and run it. We will get

Likewise, we can query with an online image, such as this Mt. Fuji photo from Wikipedia. Just substitute the first 2 lines of code with
imgURL="https://upload.wikimedia.org/wikipedia/commons/f/f8/View_of_Mount_Fuji_from_%C5%8Cwakudani_20211202.jpg"
query_emb = get_embedding(imgURL,"image")We will get this

Semantic and metadata search
What if we want to search with an image or natural language query, but also scope it to return photos taken by a specific camera make (e.g. Canon)? In this case, we just need to add a WHERE clause to the SQL statement, just like any normal TSQL query.
sql = """
SELECT photo_id,
exif_camera_make,
embedding <-> %s::vector AS distance
FROM images
WHERE exif_camera_make ILIKE %s
ORDER BY distance
LIMIT 5;
"""
params = (query_emb.tolist(), 'Canon%')
cur.execute(sql, params)This filters for rows where the camera make starts with “Canon”, case-insensitively. We can use = %s in the SQL statement if we want an exact match, or pass the parameter as %Canon% for substring matching.
Notice now we have two
%splaceholders and a two-element tuple for the parameters. We can pass any camera make (or even a list) to ’params` without changing the SQL string.
Rerun our code with this updated SQL statement for the Wikipedia Mt. Fuji photo, we get the following results:

Multilingual search
We can also perform text search using multiple languages by encoding our query text with the clip-ViT-B-32-multilingual-v1 text embedding model, which is aligned to the clip-ViT-B-32 and maps 50+ languages to the same vector space. For image query, we can continue to use the same clip-ViT-B-32 model.
Spatial search
One of the factors driving my switch from Chroma to Postgres is spatial search. There are all sorts of powerful extensions such as PostGIS, earthdistance/cube (provides earth-curved accuracy without PostGIS), etc. But even the native planar support from PostgreSQL proves to be quite useful for small areas. Let’s see!
note: The Unsplash Lite dataset comes with limited geodata, as only 28% of images have complete lat/long data. Nevertheless, this section is a quick proof-of-concept to show the versatility of using Postgres.
Update table for geometric data
We’ll first update our table schema to take advantage of PostgreSQL’s native geometric types such as point, which is a a 2D coordinate (x,y). Other native geometric types include lseg, box and path, which we are not going to used for this test.
ALTER TABLE images
ADD COLUMN geo_pt point;
UPDATE images
SET geo_pt = point(photo_location_longitude, photo_location_latitude);
ALTER TABLECreate Spatial Index (GiST)
This allows Postgres to instantly find photos within a specific radius without checking every photo in the DB.
CREATE INDEX IF NOT EXISTS images_geo_idx
ON images
USING gist (geo_pt);With both the HNSW images_vector_idx and GiST images_geo_idx, a combined spatial and embedding search will use index Scan. For example, if we want to search for a similar photo taken within 10km of Tokyo, the search will jump straight to the 10km circle using the GiST index, then use the HNSW graph to pick the top 10 most similar photos from that subset.
Geocoding
We’ll then use Nominatim, a free OpenStreetMap geocoding API to obtain the latitude and longitude of a place programmatically. First install it with pip install geopy.
Then we will write our geocode script to resolve a location name with its latitude and longitude:
from geopy.geocoders import Nominatim
def geocode_city(city_name: str):
geolocator = Nominatim(user_agent="geo_demo")
loc = geolocator.geocode(city_name)
if not loc:
raise ValueError(f"Couldn’t geocode: {city_name}")
return loc.latitude, loc.longitude
location="Kyoto"
lat, lon = geocode_city(location)
print(f"{location} → lat: {lat:.6f}, lon: {lon:.6f}")We’ll get Kyoto → lat: 35.011575, lon: 135.768144 successfully.
Simple spatial search
Now we have Kyoto’s geodata. Let’s find 5 images closest to it.
Our updated SQL statement looks like this, and takes advantage of the updated table with support for point data type. Let’s break down the spatial search part:
- in
geo_pt <-> point(%s, %s), the<->operator on two PostgreSQL point values returns their Euclidean (straight-line) distance between two points.- This is identical to when we compare embeddings, only that
<->always returns a result in units of the point’s coordinate system. Sincegeo_ptwas created aspoint(longitude, latitude), that distance is measured in degrees of longitude/latitude, not in meters or kilometers.
- This is identical to when we compare embeddings, only that
point(%s, %s)constructs an in-query point from the supplied longitude and latitude parameters incur.execute(sql, (lon, lat, 5))below.
note: For more advanced usage, we can search by containment (e.g. box @> point), && for bounding-box intersection, as well as use functions such as area(), length(), center(), perimeter().
sql = """
SELECT
photo_id,
photo_location_city,
photo_location_country,
geo_pt <-> point(%s, %s) AS planar_dist
FROM images
WHERE geo_pt IS NOT NULL
ORDER BY planar_dist
LIMIT %s;
"""We can then execute the query by passing along Kyoto’s geodata and displaying the results:
cur.execute(sql, (lon, lat, 5))
results = cur.fetchall()
items = []
for pid, city, country, dist in results:
img_src = f"photos/{pid}.jpg"
items.append(f"""
<div style="margin:4px;text-align:center">
<img src="{img_src}" width="200"><br>
<small>{pid}<br>{city}<br>d={dist:.4f}</small>
</div>
""")
html = f"<div style='display:flex'>{''.join(items)}</div>"
display(HTML(html))
Spatial and semantic search
We can combine our pgvector semantic k-NN and point-based spatial index to get “the most similar images to an image/phrase within X km of Tokyo.”
The beauty of this is that we’re mixing in natural language (e.g., to describe an ambience, particular object, etc.) to search for images nearest a location. In the following example, we want to look up 5 modern architecture photos within 50km of New York City.
Notice that this is a two step process. In our example, we want to find images within 50km of New York City.
Rank the spatially filtered rows by their semantic similarity with our query embedding.
embedding <-> %s::vector AS vec_dist: Computes the vector distance between each image’s embedding (embedding) and the supplied query embedding (%s::vector) using our created HSNW index.geo_pt <-> point(%s, %s) AS geo_dist: Computes the planar geospatial distance for each image’s geo-location with our search location using our created GiST index.
Use a spatial filter
WHERE geo_pt <-> point(%s, %s) < %sto narrow down candidates by geography. In the spatial search< %saccepts only all images whosegeo_ptis smaller a threshold value, which isradius_deg.- Because our
<->operator distance is in degrees, we must translate our real-world radius (50 km) into the equivalentradius_degthreshold. - We use
radius_deg = 50 / 111to do this. The numerator (50) is the search radius in kilometers and denominator (111) is the approximate kilometers per degree at mid-latitudes. - Thus
radius_deg≈ 0.45 degrees. Now any image whose planar degree-distance to New York City is less than 0.45 is roughly within 50 km.
Sort by ascending semantic distance
ORDER BY embedding <-> %s::vector.query_emb = get_embedding("modern architecture","text") # This triggers the "Iterative Index Scan" in pgvector 0.8.0+ sql = """ SELECT photo_id, photo_location_city, photo_location_country, embedding <-> %s::vector AS vec_dist, geo_pt <-> point(%s, %s) AS geo_dist FROM images WHERE geo_pt <-> point(%s, %s) < %s ORDER BY embedding <-> %s::vector LIMIT %s; """ # Parameters: Note we pass the embedding twice # (once for the SELECT/score and once for the ORDER BY/index search) lat, lon = geocode_city("New York City") radius_deg = 50/111 # Approx 50km params = ( query_emb.tolist(), # For vec_dist calculation lon, lat, # For geo_dist calculation lon, lat, # For WHERE clause center radius_deg, # For WHERE clause radius query_emb.tolist(), # For ORDER BY (Crucial for HNSW index) 5 # LIMIT ) cur.execute(sql, params) results = cur.fetchall()Reusing the display code from above, we will see

We can also experiment with using a query image and identify all similar images nearest a particular location.
Conclusion
In this blog post, we have performed multimodal queries by computing a CLIP embedding, ask Postgres for the 5 closest vectors, and render the matching photos in our notebook. We also combine semantic queries with spatial search to demonstrate the flexibility of using a hybrid database such as PostgreSQL.
In my next post, we’ll compare supported features and performance between PostgreSQL and Chroma as vector engines.