Subject: Re: DBIx::Class or Rose::DB::Object or other???



On 6/6/06, Kurt Hansen <[email protected]> wrote:
> I'm trying to understand what the differences are and which
> are relevant for my environment. So far, this is what I've picked up
> (yes a bit slim):
>
> 1. RDBO is faster.
> 2. DBIC has a Class::DBI compatibility layer that will make the
> transition from CDBI easier
>
> Any other key things I should note? Also, will the speed advantage of
> RDBO be maintained, or are both such young modules that it's hard to say
> where the speed advantage will end up?

The speed advantage will be maintained unless the design choices change
significantly for either module. Both are mature enough that the basic
design has settled down, so I doubt that will happen.

Aside from what you've already listed above, here are some more differences
between RDBO and DBIC. Full disclosure: I...

'm the author of RDBO :) (You can
skip to the end if you just want to read my advice.)

* Resultsets

DBIC has the concept of a "resultset" object which is sort of like an
encapsulated query. New resultsets can be spawned from existing resultsets,
further refining the scope of the encapsulated query or creating another
query based on some relationship to the original query. Methods on the
resultset can count the objects in the set, iterate over them, or page
through them. RDBO has no concept of an resultset object.

DBIC's resultsets can return data that is made up of something other than
...

row objects. (All RDBO Manager methods, OTOH, return collections of
inter-connected row objects.) This means that a DBIC resultset query can
use aggregate functions (e.g., SUM()) and GROUP BY/HAVING clauses in SQL
queries. Doing the same kind of query in RDBO currently requires the
creation of a custom Manager method.

DBIC's ResultSet classes offer extensive hooks for influencing each part of
the process of fetching multiple rows from the database. RDBO's bundled
Manager methods, OTOH, are essentially monolithic.

* Schema deployment

DBIC supports deploying a database schema from a set of Perl classes into a
database, and plans to support database schema versioning and migration.
RDBO does not go in this direction (Perl classes -> database schema), but
only in the reverse direction (database schema -> Perl classes), which DBIC
supports also (see: Loaders below).

* Database support

Both DBIC and RDBO support SQLite, Postgres, and MySQL. RDBO also supports
Informix, and Oracle support is in the early stages. DBIC supports DB2,
Oracle,...

and probably a few others I'm forgetting.

In general, database support in DBIC is more broad than in RDBO, but RDBO's
support is deeper for the databases that it does support. For example, if
you need DB2 support, DBIC has it and RDBO does not. OTOH, if you want,
say, built-in support for Postgres's INTERVAL data type (with emulation in
lesser databases), then RDBO has that.

* Performance

RDBO is performance-focused, and therefore more inclined support
performance-specific enhancements (such as the upcoming "ON DUPLICATE KEY
UPDATE" support) and to bend over backwards to get the maximum possible
performance. This means that the internals of RDBO are very ugly in places,
with significant inlining of code to avoid subroutine calls, and extensive
caching of "derived metadata," with all the expected bookkeeping required to
keep things in sync. DBIC's internals are more straight-forward, with few
"intentionally ugly" optimizations.

* Row objects

RDBO's row objects provide a lot of functionality out of the box, with
default inflate/deflate for over 30 column types. DBIC has hooks for this
type of thing, but less code is bundled/built-in.

RDBO has related object accessors and mutators that queue up changes and do
everything in a transaction at save() time. A For example, a "tree" of RDBO
objects can be created in Perl and then save()d into the database in a
single operation that runs single transaction and creates all of the related
objects in the correct order, ensuring that referential integrity is
maintained during the transaction.

RDBO also has "immediate" versions of these methods, which will add, set, or
delete related objects right now instead of on save().

There's an extensive infrastructure for deciding what kinds of methods get
created for each column, key, and relationship, and for adding support for
new column and method types.

* Special column values and inlining

RDBO supports "keywords" column values, where appropriate. Values that have
special meaning to the database are passed through as-is. For example, a
date value of "now" is passed through as the literal string "now" to
Postgres databases, where it is interpreted and evaluated by the database
itself. This works in reverse as well: a literal string of "-infinity"
coming out of a Postgres database is correctly inflated into a
DateTime::Infinite::Past object.

Similarly, strange values returned by some database (e.g., dates of
0000-00-00 returned by the ever-wacky MySQL) are handled "appropriately" by
RDBO (as opposed to dying with a fatal error when attempting to inflate
0000-00-00 into a DateTime object, which is decidedly inappropriate)

RDBO will also automatically "inline" column values when necessary. For
example, DBD::Informix does not correctly bind the special value "CURRENT"
to DATETIME columns. The value "CURRENT" must be put into the SQL query
directly as an literal string (sans quotes) in order for it to work. RDBO
detects this type of thing on a per-column-type basis and inlines values as
needed.

* Data source abstraction

DBIC is designed to support data sources other than databases accessed via
DBI (e.g., LDAP). RDBO only supports DBI.

DBIC associates a data source with an entire schema (traditionally, a
collection of table in a database). Multiple schema objects may exist,
allowing the same row classes to interact with different databases.

RDBO uses Rose::DB objects to abstract the database. Each thing accessing
the database "has a" Rose::DB object connected to that database. Swapping
out one Rose::DB object for another one allows the same classes or objects
to interact with different databases.

* Database portablilty

RDBO goes through great pains to hide the details of each supported
database. It is a goal of RDBO to never require any form of input that is
specific to a particular database (e.g., date/time formats that may differ
between database vendors.) This is extended to the Manager as well.
Abstract queries may, for example, ask for objects whose start_date is less
than a DateTime object, for example.

For each supported column type, there is a common interchange format for
values on the Perl side of the fence: DateTime objects for dates,
DateTime::Duration objects for intervals, Bit::Vector objects for bitfields,
BigInts for INT8 and SERIAL8 columns, and so on.

RDBO extends database portability as far as is reasonably possible. An
object can be loaded from, say, a MySQL database, and then that same object
can be saved into a Postgres database, as long as some reasonable equivalent
of each column type is available in both databases. RDBO will emulate
vendor-specific column types where possible (e.g. Postgres arrays are
emulated in other databases using varchar columns for storage).

Although DBIC has developing support for built-in inflate/deflate of
date/time values, in general, it requires column and query values to be
formatted appropriately for the target database.

* Loader

RDBO's loader is very aggressive when extracting information from databases
and generating Perl classes. Foreign keys, unique keys, and all kinds of
relationships, including many-to-many relationships, can be discovered and
configured automatically (and emitted by the Perl code generator). DBIC's
loader is not quite as mature.

Class generation in RDBO is governed by a convention manager, which can be
customized to exert control over naming and other by-convention decisions.
AFAIK, the conventions used DBIC's loader and elsewhere are not encapsulated
in any single place.

(The convention manager in RDBO also influences manual class setup in cases
where metadata is not explicitly specified.)

* Relationships

RDBO (optionally) "flattens" many-to-many relationships, making them operate
the same as one-to-many relationships from the perspective of the object.
This is true both on individual objects and in Manager queries.

* Abstract query syntax

RDBO's Manager abstracts SQL JOINs in terms of related objects. "Get Foos
with their Bars." "Get all Foos that have Bars, and also get their Zoobs."
The concepts of left/right inner/outer join are not part of the Manager
vocabulary or API (although of course JOINs are used under the covers).

DBIC also supports JOIN abstractions through relationship names, but uses
JOIN-like vocabulary to customize the behavior. E.g., to require related
objects instead of just fetching them if they're present, a custom join type
must be specified, which means you have to understand what the different
join types mean.

RDBO's Manager separates the concepts of which related objects to fetch and
whether those objects must exist (require_objects vs. with_objects). In
DBIC, the relationships themselves specify a particular behavior, which may
be overridden in a resultset (using a custom join type) on a per-query
basis.

* Documentation

In my admittedly biased opinion, RDBO has more comprehensive and cohesive
documentation :)

---

All of this aside, both modules are still in active development and new
features are added all the time. My advice is to choose the one that has
the features you need right now and the API that you like best. Both will
only get better and more capable with time.

-John




Programming list archiving by: Enterprise Git Hosting