PostgreSQL’s JSON functionalities as an alternative to NoSQL
Optimizing GeoJSON processing with built-in JSON operations
The Challenge: Efficient GeoJSON Processing at Scale
At CodeComply.ai, we transform floor plans into actionable data by converting them into GeoJSON features using Togal.ai's processing capabilities. These features enable us to perform dozens of automated checks for fire safety, accessibility, and state-specific statutes.
As our system grew, we faced a critical challenge: efficiently processing large volumes of memory-intensive GeoJSON objects stored in PostgreSQL’s JSONB columns.
Our initial approach of fetching entire collections before filtering was causing performance issues — high memory usage and slow response times.
While a dedicated NoSQL database seemed like a good solution, we discovered that PostgreSQL’s native JSON operations could solve our problems without adding architectural complexity.
Let’s examine our approach using Koa.js, Sequelize, and PostgreSQL. These optimizations work with most recent PostgreSQL versions, so you won’t need the absolute latest release.
The Data Model
Here are the primary data structures we work with:
Individual features — representing elements like doors:
const point: Feature<Point> = {
type: "Feature",
geometry: {
type: "Point",
coordinates: [5230, 4957]
},
properties: {
name: "Door center",
id: "0193a241-2fd1-705a-aa18-4f9fdbc6c10f",
types: ["DOOR_POINT", "MANUAL"]
}
}
These point features contain essential metadata: coordinates, a unique identifier, and classification types we use for filtering.
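Because the `types` array drives our classification filtering, a small helper (illustrative only, not from our codebase) shows what that check looks like in memory:

```typescript
// Minimal structural type for the door-point features above.
type PointFeature = {
  type: "Feature";
  geometry: { type: "Point"; coordinates: [number, number] };
  properties: { name: string; id: string; types: string[] };
};

// Keep only features tagged with a given classification type, e.g. "DOOR_POINT".
function featuresOfType(features: PointFeature[], tag: string): PointFeature[] {
  return features.filter((f) => f.properties.types.includes(tag));
}
```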
Feature Collections — how we store related features:
const collection: FeatureCollection = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      geometry: {
        type: "Polygon",
        coordinates: [
          [
            [480, 3552],
            [529, 3552],
            [529, 3606],
            [480, 3606],
            [480, 3552],
          ],
        ],
      },
      properties: {
        id: "019312ff-0a6e-751d-8f7e-419a099ef666",
        str: "D",
        name: "text_1",
      },
    },
    // (other features)
  ],
};
Our database stores these collections in JSONB columns, allowing us to leverage PostgreSQL’s JSON operations for efficient querying.
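Since a JSONB column comes back from the driver as untyped JSON, a lightweight runtime guard helps preserve type safety at the application boundary. This is a sketch with deliberately minimal types (a small subset of the GeoJSON spec), not our production validator:

```typescript
// Deliberately minimal structural types — a subset of the GeoJSON spec.
type Feature = {
  type: "Feature";
  geometry: { type: string; coordinates: unknown } | null;
  properties: Record<string, unknown>;
};
type FeatureCollection = { type: "FeatureCollection"; features: Feature[] };

// Narrow an untyped JSONB payload to a FeatureCollection before using it.
function isFeatureCollection(value: unknown): value is FeatureCollection {
  const v = value as FeatureCollection;
  return (
    typeof value === "object" &&
    value !== null &&
    v.type === "FeatureCollection" &&
    Array.isArray(v.features) &&
    v.features.every((f) => f?.type === "Feature" && typeof f.properties === "object")
  );
}
```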
Steps to Optimize Queries
Our initial implementation retrieved complete FeatureCollections for processing:
const objects = await Model.findAll({
  attributes: ["geojson_text"], // fetches the complete FeatureCollection for every row
});
This approach produced the following metrics:
In the next step, we implemented database-level filtering using PostgreSQL’s JSON operators — we’ll only return text features with a specific prefix:
const text_path = `feature->'properties'->>'name'`;
const PARAGRAPH_PREFIX = "paragraph_";
const filtered_text = literal(`
jsonb_build_object(
'type', 'FeatureCollection',
'features', COALESCE(
(
SELECT jsonb_agg(feature)
FROM jsonb_array_elements(geojson_text->'features') feature
WHERE ${text_path} LIKE :paragraph_prefix || '%'
),
'[]'::jsonb
)
)
`);
const objects = await Model.findAll({
where,
attributes: [
[filtered_text, "geojson_text"],
],
replacements: {
paragraph_prefix: PARAGRAPH_PREFIX,
},
});
Key components of this optimization:
JSON path operations:
- ->: returns a JSON(B) value (object, array, or scalar)
- ->>: extracts the value as text
- Combined in expressions like feature->'properties'->>'name' for nested property access
PostgreSQL Functions:
- jsonb_build_object(): Creates a JSON object from key-value pairs, useful for restructuring our output
- jsonb_agg(): Aggregates multiple records into a JSON array
- jsonb_array_elements(): Expands a JSON array into a set of rows, enabling us to query individual array elements
- COALESCE(): Returns the first non-null value from its arguments; we use it to return '[]'::jsonb when no features match our criteria
Sequelize helpers with text matching:
- Sequelize’s literal(): Embeds raw SQL in a query when we need complex operations that Sequelize’s standard query builder doesn’t support
- replacements: binds the :paragraph_prefix parameter at query time, keeping the prefix value out of the raw SQL string and allowing flexible text matching. For example:
- With PARAGRAPH_PREFIX = 'text_', we match features named 'text_1', 'text_2', etc.
- The WHERE ${text_path} LIKE :paragraph_prefix || '%' condition performs this prefix matching efficiently at the database level
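To make the SQL's behavior concrete, here is the same filtering expressed as plain TypeScript. This is an illustrative reference, not what runs in production — the whole point of the optimization is to push this work into PostgreSQL:

```typescript
// Redeclared here so the snippet stands alone.
type Feature = { type: "Feature"; properties: { name?: unknown; [k: string]: unknown } };
type FeatureCollection = { type: "FeatureCollection"; features: Feature[] };

// Mirrors the SQL: jsonb_array_elements maps to .filter, the LIKE 'prefix%'
// predicate maps to startsWith, and COALESCE(..., '[]'::jsonb) maps to the
// empty features array returned when nothing matches.
function filterByNamePrefix(fc: FeatureCollection, prefix: string): FeatureCollection {
  const features = fc.features.filter(
    (f) => typeof f.properties.name === "string" && f.properties.name.startsWith(prefix)
  );
  return { type: "FeatureCollection", features };
}
```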
We immediately see a 26% shorter duration and a 76% reduction in response size. Pages with a zero count simply had no text features matching our filter.
Analysis of our test cases revealed we were sending unnecessary data — particularly geometry information. We implemented property selection:
const selected_properties = ["id", "str", "name"]; // fixed whitelist, safe to interpolate below
const filtered_text = literal(`
jsonb_build_object(
'type', 'FeatureCollection',
'features', COALESCE(
(
SELECT jsonb_agg(jsonb_build_object(
'type', 'Feature',
'properties', jsonb_build_object(
${selected_properties
.map((prop) => `'${prop}', feature->'properties'->>'${prop}'`)
.join(",")}
)
))
FROM jsonb_array_elements(geojson_text->'features') feature
WHERE ${text_path} LIKE :paragraph_prefix || '%'
),
'[]'::jsonb
)
)
`);
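Conceptually, the nested jsonb_build_object calls perform the projection below, shown here as illustrative TypeScript so the transformation is easy to follow (the real work still happens inside PostgreSQL):

```typescript
// Redeclared here so the snippet stands alone.
type Feature = { type: "Feature"; properties: Record<string, unknown> };

// Rebuild each feature with only the whitelisted properties and no geometry,
// mirroring the inner jsonb_build_object in the query above.
function projectProperties(features: Feature[], keys: string[]): Feature[] {
  return features.map((f) => ({
    type: "Feature",
    // ->> yields NULL for missing keys in SQL; ?? null mirrors that here.
    properties: Object.fromEntries(keys.map((k) => [k, f.properties[k] ?? null])),
  }));
}
```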
Below are the results of this optimization:
The response size shrinks again: an 87% reduction from baseline and 52% from the second approach. The duration is 38% faster than baseline and 15% faster than the filtered approach.
Our optimization journey demonstrates that PostgreSQL’s JSON capabilities can effectively replace dedicated NoSQL solutions while:
- Significantly reducing response sizes (87% reduction)
- Improving query performance (38% faster)
- Maintaining a simpler architecture
- Preserving type safety and data integrity
In the follow-up article, we’ll explore how we abstracted these patterns into a reusable query generator, covering:
- Dynamic property selection
- Complex filtering conditions
- Type-safe JSON operations
- Performance optimization techniques
Stay tuned!