
[nycphp-talk] slow php

Tim Lieberman tim_lists at o2group.com
Mon Feb 16 18:05:31 EST 2009


On Feb 16, 2009, at 5:41 PM, Nicholas Hart wrote:

> Many thanks for the considerable response!  I have changed the code
> to reduce a large number of SELECTs by saving a query result to an
> array.  My problem now is that although this array is approx. 3,000
> records, I need a way to pass it to a function.  Note: the array
> contains all discount data, while the function pulls out several
> sub-queries per user and product code.  Since there are 5
> sub-queries, it is probably easier to pass the main one.


> Any idea how to pass this array, a pointer to it, or make this a
> global array?

I'm not 100% sure (someone will correct me if I'm wrong), but PHP
should be pretty smart about this.  If you're just calling a function
with a big array as a param, PHP passes a reference (a pointer) under
the hood.  If you then actually change data in the array inside your
function, PHP will (probably?) clone it first.

Too busy today to find out for sure.  You can always force the  
reference by defining your function like: function myFunc(&$my_array)  
{...}

That should ensure that the array is passed by reference.
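A quick illustration -- untested sketch, and the array contents here
are made up:

$discount = array_fill(0, 3000, array('pct' => 5));  // stand-in data

// By value: PHP copies the array lazily -- only if (and when) the
// function writes to it.  A read-only pass is cheap.
function countRows($arr) {
    return count($arr);        // no write, so no copy
}

// Explicit by-reference: writes hit the caller's array directly.
function tagFirst(&$arr) {
    $arr[0]['seen'] = true;    // modifies $discount itself
}

countRows($discount);          // cheap even for a big array
tagFirst($discount);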

Think carefully about how that array is structured, and how much you  
loop on it.  It may be that you can speed things up a lot by  
preprocessing the discount data in such a way that you can avoid  
scanning through it all every time.
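For example -- a sketch only, since I'm guessing at field names like
user_id and product_code:

// One-time preprocessing: key the discounts by user + product code.
$byKey = array();
foreach ($discount as $row) {
    $byKey[$row['user_id'] . ':' . $row['product_code']][] = $row;
}

// Inside the main loop: direct lookup, no scan of all 3,000 rows.
function lookupDiscounts(array $byKey, $userId, $productCode) {
    $key = $userId . ':' . $productCode;
    return isset($byKey[$key]) ? $byKey[$key] : array();
}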

> Making a global array appears to be deprecated or discouraged.  I
> have tried passing this array, which is called $discount, but it
> doesn't appear to be working when I do.  I also run out of memory,
> which at 128MB is kind of odd.

Leaning on globals is somewhat frowned upon, but if this is a little
self-contained process, there's nothing wrong with storing the array
globally and then referencing the global inside your function.
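Something like this (sketch; buildDiscountArray() is a made-up
stand-in for your query code):

$discount = buildDiscountArray();

function applyDiscounts($userId, $productCode) {
    global $discount;    // bind the global inside the function
    // ... read $discount here; reading alone won't copy it ...
}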

As for the memory limit issue -- you're probably not cleaning up after
yourself.  Try to unset or otherwise destroy your large temporaries at
the end of each loop iteration.
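E.g. (sketch; the helpers stand in for your own code):

foreach ($customers as $customer) {
    $rows = buildRows($customer);   // big temporary structure
    insertRows($rows);
    unset($rows);                   // release it before the next pass
}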

>
> Any help is much appreciated.
>
> Best,
> Nick
>
> On Sat, Feb 14, 2009 at 12:00 PM, <talk-request at lists.nyphp.org>  
> wrote:
>
> Today's Topics:
>
>   1. Re: slow php (Hans Zaunere)
>   2. Re: slow php (Rolan Yang)
>   3. Re: slow php (slow database) (csnyder)
>   4. Re: slow php (slow database) (Hans Zaunere)
>   5. Re: slow php (slow database) (Paul A Houle)
>   6. Re: slow php (Daniel Convissor)
>   7. Party Like It's 1234567890 (Chris Shiflett)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 13 Feb 2009 15:46:10 -0500
> From: "Hans Zaunere" <lists at zaunere.com>
> Subject: Re: [nycphp-talk] slow php
> To: "'NYPHP Talk'" <talk at lists.nyphp.org>
>
> > I have a php 'cron job' script which is inserting some records.  It
> > uses arrays to store 12-field records, along with several nested
> > loops in which it tests various currency values from a separate
> > 4-field table and sorts them to get the lowest one for insert into
> > a new 4-field table record.  It has about 640,000 Product records
> > per 4 Customers, or approx. 2.4 million records to insert.  It
> > takes several hours+ to complete and I am hoping there is a way to
> > speed it up.
> >
> > So far, I've tried updating instead of inserting records and
> > changing a few functions, but nothing seems to make much
> > difference.  Am I being unreasonable for the size of the files
> > here?  I'm considering using a Derived Table in MySQL and will also
> > consider another language such as Python.  Please let me know if
> > you have any suggestions.  Thanks.
>
> Why would this be considered a result of PHP being slow?  You're
> doing many millions of database queries (both INSERT and SELECT from
> what you describe), which is the real culprit.
>
> Some things to consider:
>
> -- minimize lookup queries by caching, or by loading your lookup
> tables into PHP arrays, thus avoiding database queries
>
> -- make sure that any lookup queries you do perform are against a
> well-indexed table - similarly, consider key buffer sizes, buffer
> pool sizes, and all that good stuff
>
> -- attempt to batch inserts (see http://dev.mysql.com/insert for
> various techniques; a sketch follows this list)
>
> -- look at dev.mysql.com and google for mysql bulk insert and mysql
> bulk insert configuration - there are a lot of tunables pertaining
> to these types of operations, including temporarily turning off
> indexes, etc.
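A sketch of the batching idea from the list above (the old mysql_*
API was standard in 2009; the table and column names are invented):

$batch = array();
foreach ($rows as $row) {
    $batch[] = sprintf('(%d, %d, %.2f)',
        $row['customer_id'], $row['product_id'], $row['price']);
    if (count($batch) >= 500) {    // one round trip per 500 rows
        mysql_query('INSERT INTO prices (customer_id, product_id, price) VALUES '
            . implode(',', $batch));
        $batch = array();
    }
}
if ($batch) {                      // flush the remainder
    mysql_query('INSERT INTO prices (customer_id, product_id, price) VALUES '
        . implode(',', $batch));
}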
>
> The only PHP specific item I'd consider at this point is to
> explicitly free memory and resources to keep the memory footprint as
> small as possible.  PHP's garbage collection cannot always be, well,
> perfect, for long running processes.
>
> H
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 13 Feb 2009 16:16:50 -0500
> From: Rolan Yang <rolan at omnistep.com>
> Subject: Re: [nycphp-talk] slow php
> To: NYPHP Talk <talk at lists.nyphp.org>
>
> Nicholas Hart wrote:
> > I have a php 'cron job' script which is inserting some records.  It
> > uses arrays to store 12-field records, along with several nested
> > loops in which it tests various currency values from a separate
> > 4-field table and sorts them to get the lowest one for insert into
> > a new 4-field table record.  It has about 640,000 Product records
> > per 4 Customers, or approx. 2.4 million records to insert.  It
> > takes several hours+ to complete and I am hoping there is a way to
> > speed it up.
> >
>
> If you can get away with it, you might try dumping the data to a CSV
> file and using a "load data infile" within mysql to import the whole
> batch of data at once.  Another thing that might be slowing down your
> inserts is MySQL's indexing.  If you can disable your indexes, import
> the data, then re-index afterwards, that should speed up the import.
>
> ~Rolan
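A rough sketch of Rolan's suggestion (path, table, and columns are
invented; LOAD DATA INFILE needs the FILE privilege, and DISABLE KEYS
applies to MyISAM tables):

$fp = fopen('/tmp/prices.csv', 'w');
foreach ($rows as $row) {
    fputcsv($fp, array($row['customer_id'], $row['product_id'], $row['price']));
}
fclose($fp);

mysql_query("ALTER TABLE prices DISABLE KEYS");    // skip index upkeep
mysql_query("LOAD DATA INFILE '/tmp/prices.csv'
             INTO TABLE prices
             FIELDS TERMINATED BY ','
             (customer_id, product_id, price)");
mysql_query("ALTER TABLE prices ENABLE KEYS");     // rebuild indexes once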
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 13 Feb 2009 16:28:37 -0500
> From: csnyder <chsnyder at gmail.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: NYPHP Talk <talk at lists.nyphp.org>
>
> On Fri, Feb 13, 2009 at 3:39 PM, Kenneth Downs <ken at secdat.com> wrote:
> > This ain't PHP, and Python won't help.  Inserting 2.4 million rows
> > into a database will take hours if you go line by line.  You are
> > paying a big overhead of a round trip to the server for each row.
> >
> > The best way to speed it up is to use what they call "bulk copy" on
> > MS SQL Server and "copy" on PostgreSQL.  I'd have to assume MySQL
> > has something by the same name.
> >
> > Also, it may be that MySQL supports multiple inserts per line, in
> > which case you can do 10 inserts per round trip.  This will speed
> > it up w/o requiring you to use a different command for the bulk
> > copy.
> >
>
> To the best of my knowledge, you can't send more than one statement at
> a time through the php interface, unless there's something in the
> mysqli extension that allows it.
>
> Perhaps you could generate the SQL insert statements and append them
> to a file as you go. When all the loops are done, pipe the file into
> MySQL using the command line interface:
>
> mysql -u username -ppassword databasename < file.sql
>
> If some inserts depend on previous inserts (for related row ids or
> something) it might take a little more logic to work that all out, but
> it should still be possible.
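A sketch of the append-to-file approach (filename and columns
invented):

$fp = fopen('/tmp/import.sql', 'a');
foreach ($rows as $row) {
    fwrite($fp, sprintf("INSERT INTO prices VALUES (%d, %d, %.2f);\n",
        $row['customer_id'], $row['product_id'], $row['price']));
}
fclose($fp);
// then: mysql -u username -ppassword databasename < /tmp/import.sql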
>
>
> ------------------------------
>
> Message: 4
> Date: Fri, 13 Feb 2009 16:36:32 -0500
> From: "Hans Zaunere" <lists at zaunere.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: "'NYPHP Talk'" <talk at lists.nyphp.org>
>
> > > Also, it may be that MySQL supports multiple inserts per line, in
> > > which case you can do 10 inserts per round trip.  This will speed
> > > it up w/o requiring you to use a different command for the bulk
> > > copy.
> >
> > To the best of my knowledge, you can't send more than one statement
> > at a time through the php interface, unless there's something in
> > the mysqli extension that allows it.
>
> There is http://us2.php.net/manual/en/mysqli.multi-query.php
>
>
> ...but this is really about this...
>
> http://dev.mysql.com/doc/refman/5.0/en/insert.html
>
> "INSERT statements that use VALUES syntax can insert multiple rows..."
>
> H
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Fri, 13 Feb 2009 16:43:44 -0500
> From: Paul A Houle <paul at devonianfarm.com>
> Subject: Re: [nycphp-talk] slow php (slow database)
> To: NYPHP Talk <talk at lists.nyphp.org>
>
> The fastest way to get data into mysql is like this:
>
> http://dev.mysql.com/doc/refman/5.1/en/load-data.html
>
> I needed to load 2.5M RDF triples into mysql to look up some
> identifiers.  My RDF store balked at the turtle file, so I hacked
> away the header and treated it like a space-separated file.  LOAD
> DATA INFILE was scary fast...  It loaded the data in 3 seconds!
>
> Once you've loaded raw data into mysql you can do all kinds of
> tricks like
>
> INSERT INTO ... SELECT
>
> to process the information into the desired form.  If you can do this
> processing w/ clever mysql coding, you can probably do it in 1/10 the
> LOC and more than 10 times the speed that it would take you in a
> procedural language.  The main thing is that you'll need to be
> comfortable with joins, subselects, and writing extreme queries.
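A sketch of that pattern, tied to the original problem of picking the
lowest price (staging and target tables are invented):

// Bulk-load the raw rows into a staging table first.
mysql_query("LOAD DATA INFILE '/tmp/raw.csv' INTO TABLE staging
             FIELDS TERMINATED BY ','");

// Then one set-based statement replaces the per-row PHP loop.
mysql_query("INSERT INTO prices (customer_id, product_id, price)
             SELECT customer_id, product_id, MIN(price)
             FROM staging
             GROUP BY customer_id, product_id");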
>
>
> ------------------------------
>
> Message: 6
> Date: Fri, 13 Feb 2009 17:11:33 -0500
> From: Daniel Convissor <danielc at analysisandsolutions.com>
> Subject: Re: [nycphp-talk] slow php
> To: NYPHP Talk <talk at lists.nyphp.org>
>
> Hi Nicholas:
>
> Here are two additional options to consider, both revolving around the
> concept of getting the database to do the work.
>
> First, many people write scripts to read, modify and then write the
> data back to the database because they are unfamiliar with the
> INSERT ... SELECT and multi-table UPDATE syntaxes.  These syntaxes
> are very powerful.
>
> Second, if you have very complex, interdependent operations to do
> that can't be done via queries, turn it into a stored procedure
> inside MySQL.
>
> Oh, and a third possibility is to combine the two.  Write some
> functions that you can then use in the queries.
>
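A sketch of that third option (the function and its body are
invented; CREATE FUNCTION needs the appropriate privileges):

// Define a function once inside MySQL...
mysql_query("CREATE FUNCTION best_price(base DECIMAL(10,2), fx DECIMAL(10,4))
             RETURNS DECIMAL(10,2) DETERMINISTIC
             RETURN ROUND(base * fx, 2)");

// ...then use it from set-based queries instead of PHP loops.
mysql_query("INSERT INTO prices (customer_id, product_id, price)
             SELECT customer_id, product_id, MIN(best_price(base, fx))
             FROM staging
             GROUP BY customer_id, product_id");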
> All that aside, you could be having issues with transaction buffers
> getting huge or slow query logging bogging things down.  Are you using
> transactions at all?
>
> --Dan
>
> --
>  T H E   A N A L Y S I S   A N D   S O L U T I O N S   C O M P A N Y
>            data intensive web and database programming
>                http://www.AnalysisAndSolutions.com/
>  4015 7th Ave #4, Brooklyn NY 11232  v: 718-854-0335 f: 718-854-0409
>
>
> ------------------------------
>
> Message: 7
> Date: Fri, 13 Feb 2009 17:37:48 -0500
> From: Chris Shiflett <shiflett at php.net>
> Subject: [nycphp-talk] Party Like It's 1234567890
> To: NYPHP Talk <talk at lists.nyphp.org>
>
> It's a little late, but if anyone is interested in dropping by our
> office to christen our new kegerator and celebrate Unix time
> 1234567890, let me know. Address and map here:
>
> http://omniti.com/is/here
>
> The New York office, of course. :-)
>
> Chris
>
>