SELECT * FROM image.png

PNG as a Database (PNGaaDB?)

screenshot of the demo website

SELECT * FROM `image.png`

A lot of people have heard of steganography, but I don’t know how many have actually tried using it or building with it. That was certainly true for me: I’ve done tons of CTFs in high school and college, yet I never really knew how it all worked. Thus, png-db. This project hides JSON data in obscure corners of PNG files and exposes it via a SQL-esque script. It’s very fun.

To understand how an image can store data, one must first be familiar with the structure of a Portable Network Graphics (PNG) file. A PNG is not a single block of data but a series of distinct data segments known as "chunks".

+------------+
| PNG Header |
| (8 bytes)  |
+------------+
| Length     | (4 bytes)
| Type       | (4 bytes) "IHDR"
| Data       | (13 bytes) Image Header data
| CRC        | (4 bytes)
+------------+
| Length     | (4 bytes)
| Type       | (4 bytes) "zTXt" for Schema
| Data       | (variable) Compressed JSON Schema
| CRC        | (4 bytes)
+------------+
| Length     | (4 bytes)
| Type       | (4 bytes) "zTXt" for Row 1
| Data       | (variable) Compressed JSON Row Data
| CRC        | (4 bytes)
+------------+
| ...        |
+------------+
| Length     | (4 bytes)
| Type       | (4 bytes) "IDAT" (Pixel Data)
| Data       | (variable) Compressed Image Pixels
| CRC        | (4 bytes)
+------------+
| ...        |
+------------+
| Length     | (4 bytes)
| Type       | (4 bytes) "IEND"
| Data       | (0 bytes)
| CRC        | (4 bytes)
+------------+

Every valid PNG file is required to contain three specific types of chunks. The file has to begin with an IHDR (Image Header) chunk, which provides foundational metadata like the image's dimensions. Following the header, the file contains one or more IDAT chunks, which hold the compressed pixel data of the image. The file must then be terminated by an IEND chunk, which signals the end of the data stream.

Beyond these required components, the PNG specification allows for a variety of optional, ancillary chunks that can store a wide range of information. For example, the tEXt chunk stores simple, uncompressed textual information as a keyword-value pair. The zTXt chunk is a compressed alternative for larger blocks of text, using zlib to compress the text portion while the keyword stays uncompressed. These chunks are where metadata like Title or Author is typically stored. Although the specification defines these chunks for specific purposes, decoders are free to ignore ancillary chunks they don't use, and nothing stops an application from putting arbitrary text in them, which creates an avenue for data embedding.
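
To make the zTXt layout concrete, here is a minimal Python sketch of the chunk's data field: a Latin-1 keyword, a NUL separator, a one-byte compression method (0, the only one the PNG specification defines), and then the zlib-compressed text. The helper names `encode_ztxt` and `decode_ztxt` are made up for illustration and are not taken from png-db.

```python
import zlib

def encode_ztxt(keyword: str, text: str) -> bytes:
    """Build the data field of a zTXt chunk (not the whole chunk)."""
    key = keyword.encode("latin-1")
    if not 1 <= len(key) <= 79:
        raise ValueError("zTXt keywords must be 1-79 bytes long")
    # keyword, NUL separator, compression method 0, zlib stream
    return key + b"\x00" + b"\x00" + zlib.compress(text.encode("latin-1"))

def decode_ztxt(data: bytes) -> tuple[str, str]:
    """Split a zTXt data field back into (keyword, text)."""
    key, _, rest = data.partition(b"\x00")
    if rest[:1] != b"\x00":
        raise ValueError("unsupported compression method")
    return key.decode("latin-1"), zlib.decompress(rest[1:]).decode("latin-1")

payload = encode_ztxt("Title", "A plain black square")
print(decode_ztxt(payload))
```

A tEXt chunk has the same shape minus the method byte and the compression, which is why zTXt only pays off for larger text.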

The structure of these chunks is a key enabler for using PNGs as a data format. Each chunk includes a four-byte length field, a four-byte chunk type code, the chunk's data, and a four-byte Cyclic Redundancy Check (CRC) computed over the type code and the data. This organized format provides a predictable container for custom data. The specification even reserves a naming convention for private chunks (a lowercase second letter in the type code), so applications can store data the specification itself doesn't define, further highlighting the format's flexibility.
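
The length/type/data/CRC layout described above is simple enough to walk by hand. The following is a rough Python sketch, not png-db's actual reader: `make_chunk` and `iter_chunks` are hypothetical helper names, and the demo bytes are a stripped-down stand-in (IHDR plus IEND, no pixel data) used only to exercise the parser.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk: big-endian length, type, data, CRC over type+data."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk, verifying each CRC along the way."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + data) != crc:
            raise ValueError(f"bad CRC in {ctype!r} chunk")
        yield ctype, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

# IHDR for a 1x1, 8-bit grayscale image: width, height, depth, color type,
# compression, filter, interlace (13 bytes total).
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
demo = PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND")
print([(t.decode(), len(d)) for t, d in iter_chunks(demo)])
```

Because the length comes first, a reader can skip any chunk it doesn't understand without parsing its contents, which is what lets extra chunks ride along unnoticed.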

The following table provides a quick reference for some of the most important PNG chunks:

Chunk Type	Description	Critical/Ancillary
IHDR	The image header, containing dimensions, bit depth, color type, and compression method.	Critical
IDAT	Contains the actual compressed pixel data. There can be multiple IDAT chunks.	Critical
IEND	Marks the end of the PNG file.	Critical
PLTE	The palette chunk, used for palette-based images.	Critical (required only for indexed-color images)
tEXt	Stores uncompressed textual metadata (keyword-value pairs).	Ancillary
zTXt	Stores compressed textual metadata (keyword-value pairs).	Ancillary
bKGD	Defines the default background color for the image.	Ancillary
cHRM	Defines the primary chromaticities and white point used for color calibration.	Ancillary
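
The critical/ancillary distinction isn't just a label in a table: it is encoded in the letter case of the four-character type code. An uppercase first letter means critical, an uppercase second letter means publicly defined, and a lowercase fourth letter means editors may copy the chunk even if they don't understand it. A small Python sketch, where `chunk_properties` is a name invented for this example:

```python
def chunk_properties(chunk_type: str) -> dict:
    """Decode the property bits that PNG stores in the case of each letter."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk types are exactly four ASCII letters")
    return {
        "critical": chunk_type[0].isupper(),      # lowercase = ancillary
        "public": chunk_type[1].isupper(),        # lowercase = private chunk
        "safe_to_copy": chunk_type[3].islower(),  # uppercase = depends on image data
    }

for name in ("IHDR", "PLTE", "zTXt", "bKGD"):
    print(name, chunk_properties(name))
```

By this rule PLTE counts as a critical chunk, even though it only has to be present in indexed-color images.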

Steganography Use Cases

Embedding data in a PNG is a form of steganography, the practice of concealing a message within another message. Steganography often involves modifying the raw pixel data itself to hide information, an approach that can be brittle (re-encoding or resizing the image destroys the payload) and may visibly degrade the image. This project’s approach, however, uses the PNG file format's extensible chunk structure to store data without altering the visual content of the image. The data is openly contained in custom or designated chunks.
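
For contrast, here is roughly what the pixel-modifying style looks like. This is a generic least-significant-bit sketch in Python, not something png-db does; `embed_lsb` and `extract_lsb` are illustrative names, and `pixels` stands in for a buffer of raw 8-bit sample values.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Overwrite the lowest bit of each sample with one bit of the message."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)

def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back out of the lowest bit of each sample."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for sample in pixels[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)

pixels = bytes([200] * 32)          # a flat gray strip of 32 samples
stego = embed_lsb(pixels, b"hi")
print(extract_lsb(stego, 2))
print(stego != pixels)              # the pixel data itself was changed
```

The payload here lives in the pixel values, so anything that touches them (lossy conversion, scaling, filters) can wipe it out. Chunk-based storage sidesteps that, at the cost of being easy to spot.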

How It Works

png-db exploits the PNG format's inherent flexibility to store structured data within a single file. While other steganographic methods manipulate pixel data, png-db uses the PNG's zTXt (compressed textual data) chunks to store all of its information. The image itself is simply a placeholder, a grid of black pixels that serves as a canvas for the embedded data (which is boring, but I made this in an afternoon, so meh).
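
A black placeholder image takes very little code to produce. The sketch below assumes 8-bit grayscale with no interlacing; the pixel format png-db actually writes isn't stated here, and `chunk` and `black_png` are names made up for the example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def chunk(ctype: bytes, data: bytes = b"") -> bytes:
    """Length, type, data, and a CRC computed over type + data."""
    body = ctype + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

def black_png(width: int, height: int) -> bytes:
    """Return the bytes of an all-black grayscale PNG (assumed format)."""
    # bit depth 8, color type 0 (grayscale), default compression/filter, no interlace
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter-type byte (0 = none) followed by `width` samples.
    raw = (b"\x00" + b"\x00" * width) * height
    return (PNG_SIGNATURE
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw))
            + chunk(b"IEND"))

with open("placeholder.png", "wb") as fh:
    fh.write(black_png(64, 64))
```

Since every sample is zero, the IDAT chunk compresses down to a few dozen bytes, so nearly all of the file size ends up being the embedded data.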

The project organizes its data into a schema and individual data rows. When the database is saved to a PNG file, the schema is serialized into a JSON object and written to a single zTXt chunk with the keyword "schema". Each data row is also serialized as a JSON object and written to its own separate zTXt chunk. The keyword for each data row chunk is generated from its coordinates, in the format "row_x_y". This method allows the database to be reconstructed simply by reading and parsing these specific chunks from the PNG file.

When a user queries the database, the project parses a simple WHERE clause and performs an in-memory linear scan of all the data rows. It compares the coordinates or the JSON fields within each row against the conditions specified in the query, returning only the rows that match.
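
Put together, the save, load, and query steps could look something like the Python sketch below. It is a guess at the shape of the idea rather than png-db's real code: the schema format, the function names (`save_db`, `load_db`, `select`), and the WHERE grammar (comparisons joined by AND, with values written as JSON literals) are all assumptions.

```python
import json
import operator
import re
import struct
import zlib

SIG = b"\x89PNG\r\n\x1a\n"
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       ">": operator.gt, "<=": operator.le, ">=": operator.ge}

def chunk(ctype, data=b""):
    body = ctype + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))

def ztxt(keyword, obj):
    # keyword, NUL, compression method 0, zlib-compressed JSON text
    text = json.dumps(obj).encode("latin-1")
    return chunk(b"zTXt", keyword.encode("latin-1") + b"\x00\x00" + zlib.compress(text))

def save_db(schema, rows):
    """rows maps (x, y) coordinates to a dict of field values."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)   # 1x1 grayscale placeholder
    parts = [SIG, chunk(b"IHDR", ihdr), ztxt("schema", schema)]
    parts += [ztxt(f"row_{x}_{y}", row) for (x, y), row in rows.items()]
    parts += [chunk(b"IDAT", zlib.compress(b"\x00\x00")), chunk(b"IEND")]
    return b"".join(parts)

def load_db(png):
    """Rebuild (schema, rows) by reading only the zTXt chunks."""
    schema, rows, pos = None, [], len(SIG)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype, data = png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length
        if ctype != b"zTXt":
            continue
        key, _, rest = data.partition(b"\x00")
        obj = json.loads(zlib.decompress(rest[1:]).decode("latin-1"))
        key = key.decode("latin-1")
        if key == "schema":
            schema = obj
        elif key.startswith("row_"):
            _, x, y = key.split("_")
            rows.append({"x": int(x), "y": int(y), **obj})
    return schema, rows

def select(rows, where):
    """Linear scan: keep rows that satisfy every 'field op value' condition."""
    conds = []
    for clause in re.split(r"\s+AND\s+", where.strip(), flags=re.IGNORECASE):
        match = re.fullmatch(r"(\w+)\s*(<=|>=|!=|=|<|>)\s*(.+)", clause)
        field, op, value = match.groups()
        conds.append((field, OPS[op], json.loads(value)))
    return [r for r in rows if all(f in r and op(r[f], v) for f, op, v in conds)]

png = save_db({"name": "string", "age": "number"},
              {(0, 0): {"name": "Ada", "age": 36}, (1, 0): {"name": "Linus", "age": 28}})
schema, rows = load_db(png)
print(select(rows, "age >= 30 AND x = 0"))
```

There is no index here on purpose: every query touches every row, which is fine at the sizes a single image is likely to hold. Coordinates are merged into each row as `x` and `y`, so they can be filtered with the same syntax as the JSON fields.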

Conclusion

png-db is a simple, focused tool that demonstrates the potential of using the PNG file format as a data container. I suppose next I’ll build a meme page where all the memes have hidden datasets contained in the image metadata, then pretend I’ve found a super secret CIA chat website.

Try it at https://pngdb.jonaylor.com
