[nycphp-talk] Best way to accomplish this task

Justin Dearing zippy1981 at gmail.com
Sun Feb 14 21:30:12 EST 2010


You might want to look at a queuing system to hold the input. Your
options include Microsoft Message Queuing (MSMQ), Apache ActiveMQ, and
products from IBM and TIBCO. I am sure there are others. MSMQ is
supported by PHP and built into Windows. ActiveMQ probably is supported
as well. TIBCO is expensive.

Any of them will solve your problem of preventing duplicate processing.
Depending on the size of your input data, you might want to store the
data in a file and just put a pointer to the file in the messages.
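
As a rough sketch of the pointer-in-the-message idea: the snippet below
uses PHP's SplQueue as an in-process stand-in for a real broker, since
the actual client API depends on which queue you pick. The helper names
enqueueJob() and dequeueJob() and the JSON message layout are made up
for illustration.

```php
<?php
// Sketch only: SplQueue stands in for a real message broker.
// enqueueJob() and dequeueJob() are hypothetical helper names.

function enqueueJob(SplQueue $queue, string $payload, string $spoolDir): string
{
    // Write the (possibly large) payload to a uniquely named file.
    $path = tempnam($spoolDir, 'job_');
    file_put_contents($path, $payload);

    // The message itself carries only a pointer to that file.
    $queue->enqueue(json_encode(['file' => $path]));
    return $path;
}

function dequeueJob(SplQueue $queue): ?string
{
    if ($queue->isEmpty()) {
        return null;
    }
    // Dequeuing removes the message, so it is handed out only once.
    $message = json_decode($queue->dequeue(), true);
    $payload = file_get_contents($message['file']);
    unlink($message['file']); // remove the spool file once it has been read
    return $payload;
}

$queue = new SplQueue();
enqueueJob($queue, str_repeat('x', 100000), sys_get_temp_dir());
$data = dequeueJob($queue);
echo strlen($data), "\n";
```

The messages stay small no matter how big the input is, and the worker
only touches the file once it has taken the message off the queue.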

On 2/14/10, Anthony Papillion <papillion at gmail.com> wrote:
> Hello Everyone,
>
> I'm designing a system that will work on a schedule. Users will submit data
> for processing into the database and then, every minute, a PHP script will
> pass through the db looking for unprocessed rows (marked pending) and
> process them.
>
> The problem is, I may eventually have a few million records to process at a
> time. Each record could take anywhere from a few seconds to a few minutes to
> perform the required operations on. My concern is making sure that the
> script, on the next scheduled pass, doesn't grab the records currently being
> processed and start processing them again.
>
> Right now, I'm thinking of accomplishing this by updating a 'status' field
> in the database. So unprocessed records would have a status of 'pending',
> records being processed would have a status of 'processing' and completely
> processed records will have a status of 'complete'.
>
> For some reason, I see this as ugly but that's the only way I can think of
> making sure that records aren't processed twice. So when I select
> records to process, I'm ONLY selecting ones with the status of 'pending',
> which means they are new and unprocessed.
>
> Is there a better, more elegant way of doing this or is this pretty much it?
>
> Thanks!
> Anthony Papillion
>
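
If you do stay with the status column, the part that matters is claiming
a row atomically, so that a second pass cannot pick up a row the first
pass already took. A minimal sketch, assuming PDO with an in-memory
SQLite database and a made-up `jobs` table; claimJob() is a hypothetical
helper, not something from your code:

```php
<?php
// Sketch only: the in-memory SQLite database and `jobs` table are assumptions.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE jobs (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending'
)");
$db->exec("INSERT INTO jobs (status) VALUES ('pending'), ('pending')");

// Hypothetical helper: flips one row from 'pending' to 'processing'.
// Only the caller whose UPDATE actually changed the row gets true back.
function claimJob(PDO $db, int $id): bool
{
    $stmt = $db->prepare(
        "UPDATE jobs SET status = 'processing'
         WHERE id = :id AND status = 'pending'"
    );
    $stmt->execute([':id' => $id]);
    return $stmt->rowCount() === 1;
}

$first  = claimJob($db, 1); // row 1 is pending, so this claim succeeds
$second = claimJob($db, 1); // row 1 is already 'processing', so this one fails
var_dump($first, $second);
```

Because the status check and the status change happen in a single UPDATE,
a worker processes a row only when its own claim succeeded, rather than
relying on an earlier SELECT.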

-- 
Sent from my mobile device


