Re: Performance of large queries

[prev] [thread] [next] [Date index for 2004/07/15]

From: colm-cdbi
Subject: Re: Performance of large queries
Date: 21:05 on 15 Jul 2004

On Thu, Jul 15, 2004 at 07:22:23PM +0000, cdbi-talk@xxxxxxxxx.xx wrote:
> The structure of the database is quite simple:
> 
> ---
> package GT::DBDyn::Stock::EOD::Data::Prices;
> use base 'GT::DBDyn::Stock::EOD::Data::DBI';
> 
> __PACKAGE__->table('prices');
> __PACKAGE__->columns(Primary => qw/source code datum exchange/);
> __PACKAGE__->columns(Essential => qw/open high low close volume
> currency/); __PACKAGE__->columns(Others => qw/adjust/);
> __PACKAGE__->has_a(source => 'GT::DBDyn::Stock::EOD::Data::Source');
> __PACKAGE__->has_a(exchange =>
> 'GT::DBDyn::Stock::Exchange::Data::Exchange'); __PACKAGE__->has_a(code =>
> 'GT::DBDyn::Names::Data::Names'); ---

Can I just check that your PK really needs to have all those
components?

What methods are you calling on the results you get back?  Are you
calling source / exchange as these are has_a and will have to make new
objects.

> And the query is the following:
> 
> my @ret =
>   GT::DBDyn::Stock::EOD::Data::Prices->search( { @_ }, {order_by =>
>   'datum'} );
> 
> It takes quite a long time for the processing. The query return ca. 2000
> datasets. Using the classical DBI-Query speeds up the query by an factor
> of 3 or 4. ...

Its a pretty basic suggestion, but have you tried using an iterator for
the result set? - it should mean you can start looping over your results
more quickly rather than building all the objects at the outset.

e.g. 

  my $iter = GT::DBDyn::Stock::EOD::Data::Prices->search({ @_ },
               { order_by => 'datum'}});

  while (my $obj = $iter->next) {
    # ...
  }

Alternatively check out set_sql in the docs which allows you to return
raw SQL results using normal DBI methods.  

Do people think there would be a general case for search_raw() and
retrve_all_raw() class methods in Class::DBI or is this breaking the
encapsulation too much?  It would seem that these would be simple to add
and fairly handy given that the search() stuff does a lot of useful
things that would probably lead to less cases where people would need
roll-you-own SQL.  I think there is an argument for this at least being
something you can use internally - the current search doesn't seem to
lend itself to you being able to write a wrapper to it that doesn't
involve objects if you'd like a middle ground between objects and
roll-your-own SQL.  Just a thought.

Colm

Performance of large queries
cdbi-talk 07:00 on 15 Jul 2004

Re: Performance of large queries
colm-cdbi 09:27 on 15 Jul 2004

Re: Performance of large queries
Tony Bowden 09:35 on 15 Jul 2004

Re: Performance of large queries
colm-cdbi 09:58 on 15 Jul 2004

Re: Performance of large queries
Tony Bowden 10:04 on 15 Jul 2004

Re: Performance of large queries
Perrin Harkins 17:13 on 15 Jul 2004

Re: Performance of large queries
cdbi-talk 19:22 on 15 Jul 2004

Re: Performance of large queries
colm-cdbi 21:05 on 15 Jul 2004

Re: Performance of large queries
Tony Bowden 23:05 on 15 Jul 2004