Re: best strategy: check before create
[Date index for 2005/03/18]
Ask Bjørn Hansen wrote:
>
> On Mar 17, 2005, at 11:10 PM, Ofer Nave wrote:
>
>> 1) It will hit the database for every title. I could avoid this by
>> building a quick hash of all titles already in my DB and checking
>> myself in perl, then calling 'create' for the new ones, but that's
>> less CDBI-ish.
>
>
> Then you are trading more small lookups for a big data transfer and
> memory usage. If you are concerned about it you should make a test
> to see if it's even faster. How many titles do you need to compare
> before it's faster?
>
Well, memory is pretty cheap, and data transfers are pretty fast.
Usually the bottleneck is the sheer number of queries run against
a database... I think.

In this case, we're talking peanuts. There are currently about 10
articles per day on Wikinews, and I'm working in the context of a single
day. So I'd be hashing ten article titles once instead of hitting
the database ten times. But I like to think about scaling and best
coding practices, even when the data is still small. :)
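
For illustration, here is a rough sketch of the hash-first approach described
above: one query loads the existing titles, the membership check happens in
memory, and only the new titles are inserted. It is written in Python with the
standard sqlite3 module standing in for MySQL and CDBI, purely so it runs on
its own. The `article` table, its columns, and the helper name
`insert_new_titles` are assumptions made up for this example, not anything
from the actual schema.

```python
import sqlite3

def insert_new_titles(conn, titles):
    """Insert only the titles not already stored; return the ones inserted."""
    # One query pulls every existing title into memory (the "quick hash").
    existing = {row[0] for row in conn.execute("SELECT title FROM article")}
    # dict.fromkeys drops duplicates in the incoming list but keeps its order.
    new_titles = [t for t in dict.fromkeys(titles) if t not in existing]
    conn.executemany(
        "INSERT INTO article (title) VALUES (?)",
        [(t,) for t in new_titles],
    )
    conn.commit()
    return new_titles

# Hypothetical in-memory table so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT UNIQUE)")
conn.execute("INSERT INTO article (title) VALUES ('Old story')")

inserted = insert_new_titles(conn, ["Old story", "New story", "New story"])
print(inserted)
```

The trade-off is the one raised in the quoted reply: a single larger read plus
memory for the set, against one small lookup per title. Which side wins at a
given table size is something to measure rather than assume.
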
>> 2) It uses the same data structure to both 'find' and 'create' -
>> meaning you can't say "just match on title, but if you don't find the
>> title, then create it with that title as well as this body and this
>> other misc data". Since it returns the object, I could then set the
>> non-matchable fields afterwards, and then call 'update', but that's
>> two DB hits instead of one. Not a big deal, but I am wondering if
>> there's a more elegant CDBI solution that I don't know of.
>
>
> Do a search on the title and then either update or create as appropriate.
>
So basically do my own find-or-create? :) I thought about it, but it
seems like both methods would be equivalent in DB hits.
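
A hand-rolled version of the "search on the title, then create as appropriate"
idea might look like the sketch below: it matches on the title alone and, on a
miss, inserts the title together with the non-matchable fields in the same
statement. Again this is Python with sqlite3 as a stand-in, and the `article`
table, the `body` column, and the `find_or_create` helper here are
hypothetical names for this example only, not the CDBI method.

```python
import sqlite3

def find_or_create(conn, title, defaults):
    """Match on title only; on a miss, insert title plus the extra fields.

    Returns (row_id, created).
    """
    row = conn.execute(
        "SELECT id FROM article WHERE title = ?", (title,)
    ).fetchone()
    if row is not None:
        # Found: leave the stored row untouched.
        return row[0], False
    cur = conn.execute(
        "INSERT INTO article (title, body) VALUES (?, ?)",
        (title, defaults.get("body", "")),
    )
    conn.commit()
    return cur.lastrowid, True

# Hypothetical in-memory table so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT UNIQUE, body TEXT)"
)

first = find_or_create(conn, "Title A", {"body": "first body"})
second = find_or_create(conn, "Title A", {"body": "ignored"})
print(first, second)
```

Each call costs a lookup followed by at most one insert, and no separate
update is needed to fill in the extra fields of a newly created row.
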
> If the data load is truly high it should be a little faster to do a
> "insert ... on duplicate key update ..." statement if your MySQL is
> new enough to support that.
I did not know about this feature. I'll look it up, since it sounds
cool, although it is not what I need right now: I'm not considering
updates, just inserting new records. I've currently got whatever comes
with FC3 (Fedora Core 3), which pathetically is MySQL 3.23. I can
upgrade if I need to.
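
For the insert-only case, one option is to put a unique key on the title and
let the database skip duplicates in a single statement. The sketch below is
not the "insert ... on duplicate key update ..." statement itself: it uses
SQLite's `INSERT OR IGNORE` through Python's sqlite3 module so it can run
standalone, and MySQL's closest insert-only counterpart would be
`INSERT IGNORE`. The table layout and the `insert_if_new` helper are
assumptions for illustration.

```python
import sqlite3

# Hypothetical in-memory table; the UNIQUE constraint on title is what
# lets the database reject duplicates on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT UNIQUE, body TEXT)"
)

def insert_if_new(conn, title, body):
    """One statement per title; returns True if a row was actually inserted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO article (title, body) VALUES (?, ?)",
        (title, body),
    )
    conn.commit()
    # rowcount is 0 when the unique key caused the insert to be skipped.
    return cur.rowcount == 1

first = insert_if_new(conn, "Title A", "first body")
again = insert_if_new(conn, "Title A", "second body")
print(first, again)
```

This keeps it to one DB hit per title with no prior lookup, at the cost of
relying on a database-specific statement rather than plain CDBI calls.
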
-ofer