[nycphp-talk] Caching, proxies, sharding and other scaling questions
Mitch Pirtle
mitch.pirtle at gmail.com
Sat Jul 25 14:18:22 EDT 2009
Memcache is your safest option for an in-memory solution, for sure.
Realistically, does memcache even have a competitor in that regard?
For persistent storage, you should look at MongoDB and Project
Voldemort. Voldemort is insanely fast as a key/value store, and
coupled with BerkeleyDB storage is unbeatable for scale, shard and
speed needs.
MongoDB provides a persistent datastore that also goes direct to disk,
and is wikkid fast - MongoDB tries to bridge the gap between key/value
stores (scale, shard, speed) and relational databases (lists, finds).
I have a site that will probably break 1B page views in 6 months,
running on three physical webservers, each of which runs memcache as
well. It certainly can be done, but depends greatly on the quality of
your developers, how much time you give them to do their thing, and
the overall performance characteristics and requirements of your site.
Making massive scale websites isn't actually hard, it just takes time.
Not giving your engineers time to think things through takes their
efficiency and sets it on fire outside in the street. :-)
-- Mitch
On Fri, Jul 24, 2009 at 7:29 PM, Jake McGraw<jmcgraw1 at gmail.com> wrote:
> On Fri, Jul 24, 2009 at 6:45 PM, Ajai Khattri<ajai at bitblit.net> wrote:
>> On Fri, 24 Jul 2009, Jake McGraw wrote:
>>
>>> What's your data size like? How many requests per second do you plan on
>>> handling?
>>
>> It's a very big site. Last year, we handled a total of 945 million page
>> views. And we expect those numbers to go up of course :-)
>>
>>> a relational database to a key/value store (memcache is nice,
>>> personally, I'm becoming a big fan of Redis) is to set up a single
>>> instance and see how it handles the load.
>>
>> Yes, my thoughts exactly. (BTW, I also looked at Redis earlier today, but
>> I have yet to see a comparison with memcache). Any thoughts?
>>
>
> Memcache is a proven product with a long (in web terms) history. Redis
> is brand new; the RC for version 1.0 was put out fairly recently.
> The things I like about Redis are:
>
> Data Persistence (not just in memory)
> * Very easy to take a snapshot of your entire data store, just backup
> the data dump dir.
> * Very easy to prime a new data store. Let's say part of your scaling
> strategy includes mirroring your data, that is, you'll have multiple
> cache servers with the same data. Simply take a snapshot of your data
> dir, move the files to a new server and start redis.
> * If your server goes down you can still recover information from the
> last active state.
>
> Lists
> Redis is not just a key/value store, it also provides lists of values
> under a single key. You can push, pop, get the length, get an
> arbitrary value within a list and a bunch of other features. Doing all
> of this computation within the server provides two benefits: 1. No round trip
> and (de)serialization, 2. Atomic transactions.
>
> KEYS command for wildcard matching on keys.
> http://code.google.com/p/redis/wiki/KeysCommand
>
> Sets
> Though I haven't played around with sets yet, they look pretty powerful.
>
> In general, I think the KEYS and List commands make the whole
> key/value thing a lot easier to use when coming from an RDBMS
> background. For performance information, check out this post:
>
> http://groups.google.com/group/redis-db/browse_thread/thread/0c706a43bc78b0e5/455dd41883d90101#455dd41883d90101
>
> - jake
>
>>> For example, with modern
>>> hardware, value look up from a single, untaxed instance of memcache
>>> should take around 1ms. At a certain point, based almost entirely on
>>> traffic, that'll go up. When it gets to an undesirable level, throw in
>>> another memcache instance and hash the keys to spread the load (or
>>> allow your memcached client to hash the keys for you). Continue this
>>> until some other bottleneck rears its head.
>>
>> We know where the bottlenecks are, so right now it's a case of selecting
>> some solutions to test with.
>>
>>
>> --
>> Aj.
>>
>> _______________________________________________
>> New York PHP User Group Community Talk Mailing List
>> http://lists.nyphp.org/mailman/listinfo/talk
>>
>> http://www.nyphp.org/show_participation.php
>>
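A minimal sketch of the advice quoted above to "throw in another memcache instance and hash the keys to spread the load", written in Python. The server addresses and the server_for helper are made up for illustration and are not from the thread. As the quoted message notes, your memcached client may well do this hashing for you.

```python
import hashlib

# Hypothetical memcache instances, for illustration only.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]


def server_for(key, servers=SERVERS):
    """Pick a server by hashing the key and taking it modulo the pool size."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ("user:42", "page:/home", "session:abc123"):
        print(key, "->", server_for(key))
```

The same key always maps to the same instance, so reads find what writes stored. One caveat with plain modulo hashing: adding or removing an instance remaps most keys, which is why many clients offer consistent hashing instead.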