• PostgreSQL
  • Technical
PostgreSQL® Data Types: Mappings to SQL, JDBC, and Java Data Types

1. Data Types

A long time ago at university, I learned my first high-level programming language, Pascal, from the book by Niklaus Wirth:

Algorithms + Data Structures = Programs

But I didn’t really learn much about databases studying computer science, as they were taught by another department (business systems). So maybe now’s the time to start with an imaginary book called something like:

Queries + Data Types = Databases?

Data types have a long and important history in computing, driven initially by word lengths and machine data types but becoming more powerful and abstract as computer science matured. One popular 1980’s magazine was even named after a data type (“BYTE”, early microprocessors such as the 8008, Z80, and 6800 were characterized by BYTE/8-bit word sizes, in an era when the PDP-11 had 16-bit words, the VAX had 32-bit words, and the Cray-1 a massive 64-bit word size).

Byte magazine December 1975 with two data types on the cover (BYTE and Character!)
Byte magazine December 1975 with two data types on the cover (BYTE and Character!)
(Source: Wikimedia)

2. PostgreSQL Data Types

“Data Types” is a popular PostgreSQL search, so I decided to do some investigation of my own into why they are so important. First of all, why do data types matter in PostgreSQL? Doing some preliminary research I found out that data types in PostgreSQL are important for at least the following aspects (possibly more!): 

  1. As column data types when creating a table
  2. For functions and operators
  3. For constraints
  4. For creating types and domains, and
  5. When using PostgreSQL from a programming language (e.g. PostgreSQL to/from Python, and “C”).

PostgreSQL has a lot of built-in data types that are described in Chapter 8 of the documentation. And you can add new data types, so I guess there are really an infinite number of data types possible.

There’s a table that enumerates at least 43 built-in data types, and reveals that along with the official name some types have aliases (used internally for historical reasons). For example “real” has the alias “float4” (a single precision 4-byte floating-point number).

Here’s the full table which shows the variety of data types available:

Name Aliases Description
bigint int8 signed eight-byte integer
bigserial serial8 autoincrementing eight-byte integer
bit [ (n) ]   fixed-length bit string
bit varying [ (n) ] varbit [ (n) ] variable-length bit string
boolean bool logical Boolean (true/false)
box   rectangular box on a plane
bytea   binary data (“byte array”)
character [ (n) ] char [ (n) ] fixed-length character string
character varying [ (n) ] varchar [ (n) ] variable-length character string
cidr   IPv4 or IPv6 network address
circle   circle on a plane
date   calendar date (year, month, day)
double precision float8 double precision floating-point number (8 bytes)
inet   IPv4 or IPv6 host address
integer int, int4 signed four-byte integer
interval [ fields ] [ (p) ]   time span
json   textual JSON data
jsonb   binary JSON data, decomposed
line   infinite line on a plane
lseg   line segment on a plane
macaddr   MAC (Media Access Control) address
macaddr8   MAC (Media Access Control) address (EUI-64 format)
money   currency amount
numeric [ (p, s) ] decimal [ (p, s) ] exact numeric of selectable precision
path   geometric path on a plane
pg_lsn   PostgreSQL Log Sequence Number
pg_snapshot   user-level transaction ID snapshot
point   geometric point on a plane
polygon   closed geometric path on a plane
real float4 single precision floating-point number (4 bytes)
smallint int2 signed two-byte integer
smallserial serial2 autoincrementing two-byte integer
serial serial4 autoincrementing four-byte integer
text   variable-length character string
time [ (p) ] [ without time zone ]   time of day (no time zone)
time [ (p) ] with time zone timetz time of day, including time zone
timestamp [ (p) ] [ without time zone ]   date and time (no time zone)
timestamp [ (p) ] with time zone timestamptz date and time, including time zone
tsquery   text search query
tsvector   text search document
txid_snapshot   user-level transaction ID snapshot (deprecated; see pg_snapshot)
uuid   universally unique identifier
xml   XML data

Table 1: Postgres Data Types (Name, Alias, Description)

But how do you know what you can do with each data type? Chapter 9 documents which functions and operators are applicable to each data type. The documentation also says that each data type has an external representation, which raises the question of what these “external representations” are either in standard SQL data types or for a specific programming language.

3. Using PostgreSQL From Java—the PgJDBC Driver

How do you use PostgreSQL from Java? With JDBC! (Java Database Connectivity). There’s a PostgreSQL JDBC Driver (PgJDBC for short) which allows Java programs to connect using standard, database independent, Java code. It’s an open source Pure Java (Type 4, which talks native PostgreSQL protocol) driver and is well documented.

It’s easy to download PostgreSQL, install it, and start the database server running. You also need to download the JDBC driver. 

Connecting to the database is easy from jdbc:

To create tables you need to specify PostgreSQL data types for all the columns. For example, here’s a simple test table with integer id (which is the primary key) and value columns:

You can then do simple things in Java/PgJDBC using statements:

 

4. Why Do Data Types Matter in JDBC?

For simple Statements, the INSERT statement is just a string, so you don’t need a real data type yet. However, to extract the returned value you do need to know the PostgreSQL type, and also the corresponding Java data type if you are going to do anything more useful than just print it out (i.e. store it, process it):

However, to use Prepared Statements you need to use the correct data types (lines 2, 3, 6, and 9) for both INSERT and SELECT, otherwise you’ll get a run time SQL exception:

Note that rs.getString(“value”) also seems to work ok for the basic data types (i.e. automatic casting), which is a feature we’ll find useful below.

 

5. Mappings From PostgreSQL Data Types to SQL/JDBC Data Types

 

 Mappings from shape to colour
Mappings from shape to colour
(Source: Wikimedia)

But how do you know which Java data types the PostgreSQL data types actually map to in advance?

Some searching resulted in the discovery of the getTypeInfo() method in the java.sql.DatabaseMetaData Interface, which is an interface implemented by driver vendors to let users know the capabilities of the database in combination with the JDBC driver. The getTypeInfo() method retrieves a description of all the data types supported by the database, ordered by DATA_TYPE and then by how closely they map to the corresponding JDBC SQL type. There’s potentially lots of useful information available, but I was mainly interested in TYPE_NAME and DATA_TYPE, which is the SQL data type from java.sql.Types. Initially it was reporting DATA_TYPE as integers (corresponding to the constants in the ENUM java.sql.Types), which wasn’t very useful, but then I managed to get it to report the constant name as follows:

Running this reveals the complete list of PostgreSQL data types and their mapping to SQL/JDBC Data types, for a total of 183 data types. Removing all the (many) OTHER and ARRAY (e.g. _record etc) data types leaves us with this more manageable table:

PostgreSQL Data Type SQL Data Type
bool BIT
bit BIT
int8 BIGINT
bigserial BIGINT
oid BIGINT
bytea BINARY
char CHAR
bpchar CHAR
numeric NUMERIC
int4 INTEGER
serial INTEGER
int2 SMALLINT
smallserial SMALLINT
float4 REAL
float8 DOUBLE
money DOUBLE
name VARCHAR
text VARCHAR
varchar VARCHAR
date DATE
time TIME
timetz TIME
timestamp TIMESTAMP
timestamptz TIMESTAMP
cardinal_number DISTINCT
character_data DISTINCT
sql_identifier DISTINCT
time_stamp DISTINCT
yes_or_no DISTINCT
xml SQLXML
refcursor REF_CURSOR

Table 2: Mappings From PostgreSQL to SQL/JDBC Data Types

Note that the PostgreSQL data types are using the aliases from Table 1. But this is only part of the answer. How do we know what Java data types correspond to the SQL data types?

6. Mappings From SQL/JDBC Data Types to Java Data Types

After doing some more searching, the best I could do to find mappings from SQL data types to Java data types was to consult the actual JDBC specifications, Appendix B tables. Here’s my solution using Appendix B.4, Java object types mapped to JDBC types.

PostgreSQL Data Type SQL/JDBC Data Type Java Type
bool BIT boolean
bit BIT boolean
int8 BIGINT long
bigserial BIGINT long
oid BIGINT long
bytea BINARY byte[]
char CHAR String
bpchar CHAR String
numeric NUMERIC java.math.BigDecimal
int4 INTEGER int
serial INTEGER int
int2 SMALLINT short
smallserial SMALLINT short
float4 REAL float
float8 DOUBLE double
money DOUBLE double
name VARCHAR String
text VARCHAR String
varchar VARCHAR String
date DATE java.sql.Date
time TIME java.sql.Time
timetz TIME java.sql.Time
timestamp TIMESTAMP java.sql.Timestamp
timestamptz TIMESTAMP java.sql.Timestamp
cardinal_number DISTINCT Mapping of underlying type
character_data DISTINCT Mapping of underlying type
sql_identifier DISTINCT Mapping of underlying type
time_stamp DISTINCT Mapping of underlying type
yes_or_no DISTINCT Mapping of underlying type
xml SQLXML java.sql.SQLXML
refcursor REF_CURSOR Undefined
_abc ARRAY java.sql.array

Table 3: Complete Mappings from PostgreSQL to SQL/JDBC to Java Data Types

Notes:

1 REF_CURSOR doesn’t appear in the jdbc appendices, but is mentioned in section “13.3.3.4 REF Cursor Support” of the specification, and may map to Types.REF_CURSOR.

2 _abc stands for one of many ARRAY data types available in PostgreSQL (_record to _yes_or_no).

7. Type Conversion

So now we know the correct Java types to use, everything should just magically work correctly right? Well, mostly, as long as you remember what the PostgreSQL data types used in the table columns are, and use the correct PostgreSQL to Java mappings. As an experiment I tried a few data types including int2/int4/int8 (short, int, long), char/text (String), numeric (BigDecimal). The main thing to watch out for is conversion/casting between different sized types—going from longer to short types results in run-times errors (if you are lucky) or truncation (if you are unlucky).

How about a really simple type such as bit, bool, and boolean? That should be idiot proof? True? (or False or Unknown…). Let’s try and see what happens (in my next PostgreSQL blog)!

8. Conclusions

In this blog I discovered what data types are available in PostgreSQL (a lot), and hopefully determined the definitive mapping from PostgreSQL to SQL/JDBC to Java data types. However, even armed with this information you have to be careful about type conversion/casting, and watch out for run-time errors, truncation, or loss of information.

And we’ve really only started to scratch the surface of PostgreSQL data types, as PostgreSQL allows for custom data types, user defined types, arrays, and OTHER data types (e.g. JSON). I was also surprised to discover that PostgreSQL is really an ORDBMS (an object-relational DB), and has support for some object-oriented features such as table inheritance and function overloading. How these all work from JDBC would be more than enough to fill my imaginary database book, or several actual (but currently chimerical) future blogs.

Transparent, fair, and flexible pricing for your data infrastructureSee Instaclustr Pricing Here or learn about our PostgreSQL Consulting service.

Instaclustr managed PostgreSQL® service streamlines deployment. Contact us to discover how!

Contact us