[nycphp-talk] talk Digest, Vol 28, Issue 23
Nicholas Hart
nhart at partsauthority.com
Mon Feb 16 17:41:10 EST 2009
Many thanks for the considerable response! I have changed the code to
reduce a large number of SELECTs by saving a query to an array. My problem
now is that although this array is appox. 3,000 records, I need a way to
pass this to a function. Note: the array contains all discount data while
the function pulls out several sub-queries per user and product codes.
Since there are 5 sub queries, it is probably easier to pass the main one.
Any idea how to pass this array, a pointer to it or make this a global
array? Making a global array appears to be depricated or discouraged. I
have tried passing this array which is called $discount but it doesn't
appear to be working when I do. I also run out of memory which at 128MB is
kind of odd.
Any help is much appreciated..
Best,
Nick
On Sat, Feb 14, 2009 at 12:00 PM, <talk-request at lists.nyphp.org> wrote:
> Send talk mailing list submissions to
> talk at lists.nyphp.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.nyphp.org/mailman/listinfo/talk
> or, via email, send a message with subject or body 'help' to
> talk-request at lists.nyphp.org
>
> You can reach the person managing the list at
> talk-owner at lists.nyphp.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of talk digest..."
>
>
> Today's Topics:
>
> 1. Re: slow php (Hans Zaunere)
> 2. Re: slow php (Rolan Yang)
> 3. Re: slow php (slow database) (csnyder)
> 4. Re: slow php (slow database) (Hans Zaunere)
> 5. Re: slow php (slow database) (Paul A Houle)
> 6. Re: slow php (Daniel Convissor)
> 7. Party Like It's 1234567890 (Chris Shiflett)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 13 Feb 2009 15:46:10 -0500
> From: "Hans Zaunere" <lists at zaunere.com>
> Subject: Re: [nycphp-talk] slow php
> To: "'NYPHP Talk'" <talk at lists.nyphp.org>
> Message-ID: <01ed01c98e1c$1a14a980$4e3dfc80$@com>
> Content-Type: text/plain; charset="us-ascii"
>
> > I have a php 'cron job' script which is inserting some records. It
> > uses arrays to store 12 field records along with several nested loops
> > in which it tests various currency values from a separate 4 field table
> > and sorts them to get the lowest one for insert into a new 4 field
> > table record. It has about 640,000 Product records per 4 Customers or
> > apporx. 2.4 million records to insert. It takes several hours+ to
> > complete and I am hoping there is a way to speed it up.
> >
> > So far, I've tried updating instead of inserting records and changing a
> > few functions but nothing seems to make much difference. Am I being
> > unreasonable for the size of the files here? I'm considering using a
> > Derived Table in mySQL and will also consider another language such as
> > python. Please let me know if you have any suggestions. Thanks.
>
> Why would this be considered a result of PHP being slow? You're doing many
> millions of database queries (both INSERT and SELECT from what you
> describe), which is the real culprit.
>
> Some things to consider:
>
> -- minimize lookup queries by caching, or loading your lookup tables into
> PHP arrays, thus avoiding database queries
>
> -- make sure that any lookup queries you do perform are against a well
> indexed table - similarly, consider key buffer sizes, buffer pool sizes,
> and
> all that good stuff
>
> -- attempt to batch inserts (see http://dev.mysql.com/insert for various
> techniques)
>
> -- look at dev.mysql.com and google for mysql bulk insert and mysql bulk
> insert configuration - there are a lot of tunables pertaining to these
> types
> of operations, including temp. turning off indexes, etc.
>
> The only PHP specific item I'd consider at this point is to explicitly free
> memory and resources to keep the memory footprint as small as possible.
> PHP's garbage collection cannot always be, well, perfect, for long running
> processes.
>
> H
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 13 Feb 2009 16:16:50 -0500
> From: Rolan Yang <rolan at omnistep.com>
> Subject: Re: [nycphp-talk] slow php
> To: NYPHP Talk <talk at lists.nyphp.org>
> Message-ID: <4995E342.2040208 at omnistep.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Nicholas Hart wrote:
> > I have a php 'cron job' script which is inserting some records. It
> > uses arrays to store 12 field records along with several nested loops
> > in which it tests various currency values from a separate 4 field
> > table and sorts them to get the lowest one for insert into a new 4
> > field table record. It has about 640,000 Product records per 4
> > Customers or apporx. 2.4 million records to insert. It takes several
> > hours+ to complete and I am hoping there is a way to speed it up.
> >
>
> If you can get away with it, you might try dumping the data to a csv
> file and using a "load data infile" within mysql to import the whole
> batch of data at once. Another thing that might be slowing down your
> inserts are mysql indexes. If you can disable your indexes, import the
> data, then re-index afterwards, that should speed up the import.
>
> ~Rolan
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 13 Feb 2009 16:28:37 -0500
> From: csnyder <chsnyder at gmail.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: NYPHP Talk <talk at lists.nyphp.org>
> Message-ID:
> <b76252690902131328m2070741l92220afb54e03f82 at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Fri, Feb 13, 2009 at 3:39 PM, Kenneth Downs <ken at secdat.com> wrote:
> > This ain't PHP,and Python won't help. Inserting 2.4 million rows into a
> > database will take hours if you go line by line. You are paying a big
> > overhead of a round trip to the server for each row.
> >
> > The best way to speed it up is to use what they call "bulk copy" on MS
> SQL
> > Server and "copy" on PostgreSQL. I'd have to assume mySQL has something
> by
> > the same name.
> >
> > Also, it may be that mySQL supports multiple inserts per line, in which
> case
> > you can do 10 inserts per round trip. This will speed it up w/o
> requiring
> > you to do to a different command for the bulk copy.
> >
>
> To the best of my knowledge, you can't send more than one statement at
> a time through the php interface, unless there's something in the
> mysqli extension that allows it.
>
> Perhaps you could generate the SQL insert statements and append them
> to a file as you go. When all the loops are done, pipe the file into
> MySQL using the command line interface:
>
> mysql -u username -ppassword databasename < file.sql
>
> If some inserts depend on previous inserts (for related row ids or
> something) it might take a little more logic to work that all out, but
> it should still be possible.
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 13 Feb 2009 16:36:32 -0500
> From: "Hans Zaunere" <lists at zaunere.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: "'NYPHP Talk'" <talk at lists.nyphp.org>
> Message-ID: <020d01c98e23$232ea9b0$698bfd10$@com>
> Content-Type: text/plain; charset="us-ascii"
>
> > > Also, it may be that mySQL supports multiple inserts per line, in which
> case
> > > you can do 10 inserts per round trip. This will speed it up w/o
> requiring
> > > you to do to a different command for the bulk copy.
> >
> > To the best of my knowledge, you can't send more than one statement at
> > a time through the php interface, unless there's something in the
> > mysqli extension that allows it.
>
> There is http://us2.php.net/manual/en/mysqli.multi-query.php
>
>
> ...but this is really about this...
>
> http://dev.mysql.com/doc/refman/5.0/en/insert.html
>
> "INSERT statements that use VALUES syntax can insert multiple rows..."
>
> H
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Fri, 13 Feb 2009 16:43:44 -0500
> From: Paul A Houle <paul at devonianfarm.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: NYPHP Talk <talk at lists.nyphp.org>
> Message-ID: <4995E990.6010504 at devonianfarm.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> The fastest way to get data into mysql is like this:
>
> http://dev.mysql.com/doc/refman/5.1/en/load-data.html
>
> I needed to load 2.5M RDF triples into mysql to look up some
> identifiers. My RDF store balked at the turtle file, so I hacked away
> the header and treated it like a space-separated file. LOAD DATA INFILE
> was scary fast... It loaded the data in 3 seconds!
>
> Once you've loaded raw data into mysql you can do all kinds of tricks like
>
> INSERT INTO ... SELECT
>
> to process the information into the desired form. If you can do this
> processing w/ clever mysql coding, you can probably do it in 1/10 the
> LOC and more than 10 times the speed that it would take you in a
> procedural language. The main thing is that you'll need to be
> comfortable with joins, subselects and writing extreme queries.
>
>
> ------------------------------
>
> Message: 6
> Date: Fri, 13 Feb 2009 17:11:33 -0500
> From: Daniel Convissor <danielc at analysisandsolutions.com>
> Subject: Re: [nycphp-talk] slow php
> To: NYPHP Talk <talk at lists.nyphp.org>
> Message-ID: <20090213221133.GA22107 at panix.com>
> Content-Type: text/plain; charset=us-ascii
>
> Hi Nicholas:
>
> Here are two additional options to consider, both revolving around the
> concept of getting the database to do the work.
>
> First, many people write scripts to read, modify and then write the data
> back to the database because they are unfamilliar with the INSERT SELECT
> and UPDATE SELECT syntaxes. These syntaxes are very powerful.
>
> Second, if you have very complex, interdependent operations to do that
> can't be done via queries, turn it into a stored procedure inside MySQL.
>
> Oh, and a third possiblity is to combine the two. Write some functions
> that you can then use in the queries.
>
> All that aside, you could be having issues with transaction buffers
> getting huge or slow query logging bogging things down. Are you using
> transactions at all?
>
> --Dan
>
> --
> T H E A N A L Y S I S A N D S O L U T I O N S C O M P A N Y
> data intensive web and database programming
> http://www.AnalysisAndSolutions.com/
> 4015 7th Ave #4, Brooklyn NY 11232 v: 718-854-0335 f: 718-854-0409
>
>
> ------------------------------
>
> Message: 7
> Date: Fri, 13 Feb 2009 17:37:48 -0500
> From: Chris Shiflett <shiflett at php.net>
> Subject: [nycphp-talk] Party Like It's 1234567890
> To: NYPHP Talk <talk at lists.nyphp.org>
> Message-ID: <41BDD58C-89D0-43C5-9E4A-83AC88A21379 at php.net>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> It's a little late, but if anyone is interested in dropping by our
> office to christen our new kegerator and celebrate Unix time
> 1234567890, let me know. Address and map here:
>
> http://omniti.com/is/here
>
> The New York office, of course. :-)
>
> Chris
>
>
> ------------------------------
>
> _______________________________________________
> talk mailing list
> talk at lists.nyphp.org
> http://lists.nyphp.org/mailman/listinfo/talk
>
> End of talk Digest, Vol 28, Issue 23
> ************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.nyphp.org/pipermail/talk/attachments/20090216/1aaaede6/attachment.html>
More information about the talk
mailing list