SOCI

SOCI
Simple Oracle Call Interface (and more)

Documentation and tutorial

Note:
The following documentation and tutorial apply to the original SOCI library, which supports only Oracle databases. Once the library will be ported to support other databases, this page will be redesigned.

Files
Errors
Connecting to the database
One-time statements
Binding local data
   Binding output data
   Binding input data
   Binding by position
   Binding by name
   Types
      Static type binding
      Static type binding for bulk operations
      Dynamic resultset binding
      Extending with user-provided datatypes
Indicators
Statement preparation and repeated execution
Stored procedures
Transactions
Advanced stuff
Rationale for some design decisions
Compiling

Files

The whole SOCI library consists of only two files: soci.h and soci.cpp. The first needs to be #included in your own code and the second should be just added to the whole project for compilation and linking. There was no reason to make a precompiled library out of this single file.

Everything in SOCI is declared in the namespace SOCI. All code examples presented in this documentation assume that your code begins with something like:

#include "soci.h"
// other includes

using namespace SOCI;

// ...

Errors

All DB-related errors manifest themselves as exceptions of type SOCIError, which is derived from std::runtime_error. This allows to handle Oracle errors within the standard exception framework:

int main()
{
    try
    {
        // regular code
    }
    catch (exception const &e)
    {
        cerr << "Bang! " << e.what() << endl;
    }
}

In addition, the SOCIError class has a public errNum_ member, which contains the Oracle error code. Of course, in order to access this code you have to catch the exception as SOCIError:

int main()
{
    try
    {
        // regular code
    }
    catch (SOCIError const &e)
    {
        cerr << "Oracle error: " << e.errNum_
             << " " << e.what() << endl;
    }
    catch (exception const &e)
    {
        cerr << "Some other error: " << e.what() << endl;
    }
}

Connecting to the database

The class Session encapsulates the database connection and other OCI-related details, which are common to all the statements that will be later executed. Its constructor expects three string parameters, which are service name, user name and password:

Session sql("service", "user", "password");

The constructor either connects successfully, or throws the exception. It is possible to have many active Sessions at the same time.

One-time statements

In many cases, the SQL query is intended to be executed only once, which means that statement parsing and execution can go together. The Session class provides a special once member, which triggers parsing and execution of such one-time statements:

sql.once << "drop table person";

For shorter syntax, the following form is also allowed:

sql << "drop table person";

The IOStream-like interface is exactly what it looks like, so that the statement text can be composed of many parts, involving anything that is streamable (including custom classes, if they have appropriate operator<<):

string tableName = "person";
sql << "drop table " << tableName;

Binding local data

Note:
The Oracle documentation uses two terms: defining (for instructing the library where the output data should go) and binding (for the input data and input/output PL/SQL parameters). For the sake of simplicity, SOCI uses the term binding for both of these.

Binding output data:

The into expression is used to add binding info to the statement:

int count;
sql << "select count(*) from person", into(count);

string name;
sql << "select name from person where id = 7", into(name);

Binding input data:

The use expression is used (no pun intended) to associate the SQL placeholder (written with colon) with the local data:

int val = 7;
sql << "insert into numbers(val) values(:val)", use(val);

In the above statement, the first "val" is a column name (assuming that there is appropriate table), the second "val" (with colon) is a placeholder and its name is ignored here, and the third "val" is a name of local variable.
To better understand the meaning of each "val" above, consider also:

int number = 7;
sql << "insert into numbers(val) values(:blabla)", use(number);

Binding by position

If there is more output or input "holes" in the single statement, it is possible to use many into and use expressions, separated by commas, where each expression will be responsible for the consecutive "hole" in the statement:

string firstName = "John", lastName = "Smith";
int personId = 7;

sql << "insert into person(id, firstname, lastname) values(:id, :fn, :ln)",
       use(personId), use(firstName), use(lastName);

sql << "select firstname, lastname from person where id = :id",
       into(firstName), into(lastName), use(personId);

In the code above, the order of "holes" in the SQL statement and the order of into and use expression should match.

Binding by name

The SQL placeholders that have their names (with colon) can be bound by name.
This allows to use different order:

string firstName = "John", lastName = "Smith";
int personId = 7;
sql << "insert into person(id, firstname, lastname) values(:id, :fn, :ln)",
       use(firstName, "fn"), use(lastName, "ln"), use(personId, "id");

or bind the same local data to many "holes" at the same time:

string addr = "...";
sql << "update person"
       " set mainaddress = :addr, contactaddress = :addr"
       " where id = 7",
       use(addr, "addr);

Types:

Static type binding

The static binding for types is most useful when the types used in the database are known at compile time - this was already presented with the help of into and use functions.

The following types are currently supported for use with into and use expressions:

char (for character values)
short, int, unsigned long, double (for numeric values)
char*, char[], std::string (for string values)
std::tm, std::time_t (for datetime values)
SOCI::Statement (for nested statements and PL/SQL cursors)
SOCI::BLOB (for Binary Large OBjects)
SOCI::RowID (for row identifiers)

See the test code that accompanies the library to see how each of these types is used.

Static type binding for bulk operations

Bulk inserts, updates, and selects are supported through the following std::vector based into and use types:

std::vector<char>
std::vector<short>
std::vector<int>
std::vector<unsigned long>
std::vector<double>
std::vector<std::string>
std::vector<std::tm>
std::vector<std::time_t>

Use of the vector based types mirrors that of the standard types, with the size of the vector used to specify the number of records to process at a time. See below for examples.

Dynamic resultset binding

For certain applications it is desirable to be able to select data from arbitrarily structured tables (e.g. via "select * ") and format the resulting data based upon its type. SOCI supports this through the SOCI::Row and SOCI::ColumnProperties classes.

Data is selected into a Row object, which holds ColumnProperties objects describing the type of data contained in each column. Once the data type for each column is known, the data can be formatted appropriately.

For example, the code below creates an xml document from a selected row of data from an arbitrary table:

Session sql("db1", "scott", "tiger");
  
sql << "create table ex2(num NUMBER, str VARCHAR2(20), dt DATE)";

int num_in = 123;
std::string str_in("my name");
std::time_t dt_in = time(0);

sql << "insert into ex2 values(:num1, :str1, :dt1)",
    use(num_in), use(str_in), use(dt_in);
    
Row r;
sql << "select * from ex2", into(r);
  
std::ostringstream doc;
doc << "<row>" << std::endl;
for(int i=0; i<r.size(); ++i)
{
    const ColumnProperties& props = r.getProperties(i);
    doc << '<' << props.getName() << '>';
    switch(props.getDataType())
    {
    case eString:
        doc << r.get<std::string>(i);
        break;
    case eDouble:
        doc << r.get<double>(i);
        break;
    case eInteger:
        doc << r.get<int>(i);
        break;
    case eUnsignedLong:
        doc << r.get<unsigned long>(i);
        break;
    case eDate:
        std::tm when = r.get<std::tm>(i);
        doc << asctime(&when);
        break;
    }
    doc << "</" << props.getName() << '>' << std::endl;  
}
doc << "</row>";

The following table shows the type T parameter that should be passed to Row::get<T>() for each possible SOCI Data Type that can be returned from ColumnProperties::getDataType(). Row::get<T>() throws an exception of type std::bad_cast if an incorrect type T is used.

Oracle Data Type	SOCI Data Type	`Row::get<T>` specializations
NUMBER	`eDouble`	double
NUMBER	`eInteger`	int
NUMBER	`eUnsignedLong`	unsigned long
CHAR, VARCHAR, VARCHAR2	`eString`	std::string
DATE	`eDate`	`std::tm`, `std::time_t`

Extending SOCI to support custom (user-defined) C++ types

SOCI can be easily extended through user code so that custom types can be used transparently to insert into, update, and select from the database.

To do so, provide an appropriate implementation of the TypeConversion class that converts to and from one of the SOCI Base Types listed below:

Oracle Data Type	SOCI Base Type
NUMBER	`double`
NUMBER	`int`
NUMBER	`unsigned long`
VARCHAR, VARCHAR2	`std::string`
CHAR	`char`
DATE	`std::tm`

For example, SOCI's built in support for std::time_t is implemented entirely as follows, utilizing std::tm as the Base Type:

namespace SOCI
{
  template<>
  class TypeConversion<std::time_t> 
  {
  public:
      typedef std::tm base_type;
      static std::time_t from(std::tm& t){return mktime(&t);}
      static std::tm to(std::time_t& t){return *localtime(&t);}
  };
}

There are three required class members for a valid TypeConversion specialization:

the base_type trait, defining the base type
the from() static member function, converting from the base type
the to() static member function, converting to the base type

(Note that no database-specific code is required.)

The following example shows how application code could extend SOCI to transparently support boost::gregorian::date:

#include "boost/date_time/gregorian/gregorian.hpp"
namespace SOCI
{
template<>
class TypeConversion<boost::gregorian::date>
{
public:
    typedef std::tm base_type;
    static boost::gregorian::date from(std::tm& t)
    {
        boost::gregorian::months_of_year month = 
            static_cast<boost::gregorian::months_of_year>(t.tm_mon + 1);
        
        boost::gregorian::date d(t.tm_year + 1900, month, t.tm_mday);
        return d;
    }
    static std::tm to(boost::gregorian::date& d)
    {
        std::tm t;
        
        t.tm_isdst = -1;
        t.tm_year = d.year() - 1900;
        t.tm_mon = d.month() - 1;
        t.tm_mday = d.day();
        t.tm_hour = 0;
        t.tm_min = 0;
        t.tm_sec = 0;
        std::mktime(&t); //normalize and compute remaining fields
          
        return t;
    }
};
} // namespace SOCI

With the above TypeConversion specialization in place, it is possible to use boost::gregorian::date directly with SOCI, binding input or output, either statically or via a dynamic resultset:

using boost::gregorian::date;
    
Session sql("db1", "scott", "tiger");

sql << "create table test_boost(when DATE)";
date dIn(1999, boost::gregorian::Feb, 1);
sql << "insert into test_boost values(:when)", use(dIn);
    
date dOut(boost::gregorian::not_a_date_time);
sql << "select when from test_boost", into(dOut);
assert(dIn == dOut);

Row r;
sql << "select * from test_boost", into(r);
assert(r.get<date>(0) == dIn);

std::vector<date> datesIn;
for (int i = 2; i != 12; ++i)
{
    date d(1999, boost::gregorian::Feb, i);
    datesIn.push_back(d);
}
sql << "insert into test_boost values(:when)", use(datesIn);

std::vector<date> datesOut(10);
sql << "select when from test_boost where when > :d", use(dIn), into(datesOut);
assert(datesIn == datesOut);

The other possibility to extend SOCI with custom data types is to use the IntoType<T> and UseType<T> class templates, which specializations can be user-provided. These specializations need to implement the interface defined by, respectively, the IntoTypeBase and UseTypeBase classes.
Note that when specializing these template classes for handling your own types, you are free to choose the expected parameters for the classes' constructors. The template functions into and use support up to 5 parameters, which are just forwarded to the constructor of appropriate class. The only convention is that when the indicator variable is used (see below), it should appear in the second position. Please refer to the library source code to see how this is done for the standard types.

Indicators

In order to support null values and other conditions which are not real errors, the concept of indicator is provided.
For example, when the following SQL query is executed:

select name from person where id = 7

there are three possible outcomes:

there is a person with id = 7 and his name is returned
there is a person with id = 7, but he has no name (his name is null)
there is no such person

Whereas the first alternative is easy, the other two are more complex. Moreover, they are not necessarily errors from the application's point of view and what's more interesting, they are different and the application may wish to detect which is the case.
The following example does this:

string name;
eIndicator ind;

sql << "select name from person where id = 7", into(name, ind);

switch (ind)
{
case eOK:
    // the data was returned without problems
    break;
case eNoData:
    // no such person
    break;
case eNull:
    // there is a person, but he has no name (his name is null)
    break;
case eTruncated:
    // the name was returned only in part,
    // because the provided buffer was too short
    // (not possible with std::string, but possible with char* and char[])
    break;
}

The use of indicator variable is optional, but if it is not used and the result would be either eNoData or eNull, then the exception is thrown. This means that you should use indicator variables everywhere where the application logic (and database schema) allow "no such object" or "attribute not set" conditions.

Indicator variables can be also used when binding input data, to control whether the data is to be used as provided, or explicitly overrided to be null:

int id = 7;
string name;
eIndicator ind = eNull;
sql << "insert into person(id, name) values(:id, :name)",
       use(id), use(name, ind);

In the above example, the row is inserted with name attribute set to null.

Indicator variables can also be used in conjunction with vector based insert, update, and select statements:

std::vector<string> names;
std::vector<eIndicator> inds;
sql << "select name from person where id = 7", into(names, inds);

The following example inserts null for each value of name:

std::vector<int> ids;
std::vector<string> names;
std::vector<eIndicator> nameIndicators;

for (int i = 0; i != 10; ++i)
{
    ids.push_back(i);
    nameIndicators.push_back(eNull);
}

sql << "insert into person(id, name) values(:id, :name)",
       use(ids), use(name, nameIndicators);

Statement preparation and repeated execution

Consider the following examples:

// Example 1.
for (int i = 0; i != 100; ++i)
{
    sql << "insert into numbers(value) values(" << i << ")";
}

// Example 2.
for (int i = 0; i != 100; ++i)
{
    sql << "insert into numbers(value) values(:val)", use(i);
}

Both examples will populate the table numbers with the values from 0 to 99.
The problem is that in both examples, not only the statement execution is repeated 100 times, but also the statement parsing and preparation. This means unnecessary overhead.
The following example uses the class Statement explicitly, by preparing the statement only once and repeating its execution with changing data (note the use of prepare member of Session class):

int i;
Statement st = (sql.prepare <<
                "insert into numbers(value) values(:val)",
                use(i));
for (i = 0; i != 100; ++i)
{
    st.execute(1);
}

The 1 parameter given to the execute method indicates the number of times the statement has to be executed. Here, it is executed once in each body of the loop.

Further performance improvements may be possible by having the underlying database API group operations together to reduce network roundtrips. SOCI makes such bulk operations possible by supporting std::vector based types:

// Example 3.
const int BATCH_SIZE = 25;
std::vector<int> valsIn;
for (int i = 0; i != BATCH_SIZE; ++i)
{
    ids.push_back(i);
}

Statement st = (sql.prepare << 
                "insert into numbers(value) values(:val)", 
                use(valsIn));
for (int i = 0; i != 4; ++i)
{
    st.execute(1);
}

(Of course, the size of the vector that will achieve optimum performance will vary, depending on many environmental factors, such as network speed.)

It is also possible to read all the numbers written in the above examples:

int i;
Statement st = (sql.prepare <<
                "select value from numbers order by value",
                into(i));
st.execute();
while (st.fetch())
{
    cout << i << '\n';
}

In the above example, the execute method is called with the default parameter 0. The following fetch calls perform the actual data retrieval and cursor traversal. The end-of-cursor condition is indicated by the fetch function returning false.

It is further possible to select records in batches into std::vector based types, with the size of the vector specifying the number of records to retrieve in each round trip:

std::vector<int> valsOut(100);
sql << "select val from numbers", into(valsOut);

The Statement::execute() and Statement::fetch() functions can also be used to repeatedly select all rows returned by a query into a vector based type:

std::vector<int> valsOut(30);
Statement st = (sql.prepare << 
                "select value from numbers", 
                into(valsOut));
st.execute();
while (st.fetch())
{
    std::vector<int>::iterator pos;
    for(; pos != valsOut.end(); ++pos)
    {
        cout << *pos << '\n'; 
    }
}

Assuming there are 100 rows returned by the query, the above code will retrieve and print all of them. Since the output vector was created with size 30, it will take (at least) 4 calls to fetch() to retrieve all 100 values. Each call to fetch() can potentially resize the vector to a size less than its initial size - how often this happens depends on the underlying database implementation.

Stored procedures

The Procedure class provides a convenient mechanism for calling stored procedures:

sql << "create or replace procedure echo(output out varchar2,"
       "input in varchar2) as "
       "begin output := input; end;";

std::string in("my message");
std::string out;
Procedure proc = (sql.prepare << "echo(:output, :input)", 
                                  use(out, "output"),
                                  use(in, "input"));
proc.execute(1);
assert(out == "my message");

Transactions

The Session class provides two methods for transaction processing: commit and rollback. The transaction is implicitly started with the first statement that modifies data in the database.

Advanced stuff

If there is any need to use more advanced features of the underlying OCI library (special connection steps, attribute manipulation, call-back functions, etc.), it can be done by accessing the low-level OCI handles available as public (see rationale below) members of the Session and Statement classes. Please refer to the soci.h file to see exactly what handles are available.

Rationale for some design decisions

Since public data members are considered harmful in the C++ community, some explanation is in order to defend SOCI against anticipated criticism. The intent is to allow reading, writing and object modifications through the low-level OCI handles, to enable experienced programmers to reuse SOCI in most demanding situations. The usual approach would be to provide the public getter/setter method pairs for private handles. However, OCI handles are opaque pointers and are strong abstractions by themselves - this is what makes them different from data members of standard types (like int or string). The getter/setter pairs (or even single accessors returning non-const references) were considered in the library design, but abandoned due to the lack of any real benefits. The only argument for providing getter/setter abstractions would be to allow future replacements of low-level OCI handles with "something else", but the reality is that OCI handles exist in a specific family and that "something else" (like handles for database other than Oracle) would not map into those abstractions. Porting the SOCI library to support other databases would require to change not only low-level handles, but also the getter/setter pairs, which would then prove to be not really abstract. Uff...
Having said that, it has to be reiterated that public data members are generally bad.

Compiling

This section explains how to compile your own programs with the SOCI library.

On MS Windows, in MSVC++7

In Project Properties:

In C/C++ - General:
Add the Additional Include Directory where the oci.h file is located (it can be something like C:\Oracle\Ora81\oci\include).
In Linker - General:
Add the Additional Library Directory where the oci.lib file is located (it can be something like C:\Oracle\Ora81\oci\lib\msvc).
In Linker - Input:
Add the Additional Dependency: oci.lib.

When executing programs, the oci.dll file should be located in the path where it can be found. Be careful when using many different Oracle tools, because they may overwrite each other's oci.dll versions. The "correct" oci.dll for running the program compiled as above is located somewhere near the oci.lib, for example in C:\Oracle\Ora81\BIN.

On Unix-like systems

The OCI library is usually inside the libclntsh.so or libclntsh.sl library (depending on the actual system), so you have to add the -lclntsh option for linking and -L option to provide the path to the linker. Of course, the -I compiler option will be needed to provide the location of the oci.h header file. Look inside the $ORACLE_HOME directory to find where these files are located - it may depend both on the Unix flavour and the Oracle version installed, but likely locations are $ORACLE_HOME/rdbms/demo for oci.h and $ORACLE_HOME/lib for libclntsh.so.