[repost]37signals Still Happily Scaling On Moore RAM And SSDs

original:http://highscalability.com/blog/2012/1/30/37signals-still-happily-scaling-on-moore-ram-and-ssds.html

There are so many architectural ideas swirling in the bit wind these days. Two of the biggest battles are cloud vs. bare metal and RAM vs. disk vs. SSD. 37signals has published two solid articles that are counter hype cycle in their message:

Technologists who grew up when RAM cost $1,000 per megabyte can have a hard time dealing with the luxury of RAM being virtually free.

The progress of technology is throwing an ever greater number of optimizations into the “premature evil” bucket never to be seen again.

37signals made quite a stir with their money shot of the 864GB of RAM they bought for a mere $12K as part of their caching layer for Basecamp. That’s a lot of memory for not a lot of money. There’s nothing like actually seeing it in the flesh to bring the point home. Does that make Memory Based Architectures a little more appealing?

37signals then followed up with another provocative article: Three years later, Mr. Moore is still letting us punt on database sharding. The gist is scaling up is working for them. RAM is getting cheaper and FusionIO is getting faster, so they’ve been able to avoid architecture complexifications like sharding. Does that make SSD based architectures a little more appealing?

StackExchange is in much the same position, with a different stack, but with simpatico core ideas and comparable results. The lesson: for your transaction-oriented features, if your requirements aren’t Google-scale, then scaling up on bare metal, RAM, and SSD may be the way to go. The tug you feel towards the cloud and horizontal scaling may just be a strong consensus wind a blowin’.

Some of the key takeaways are:

  • SSD technology is accelerating which makes it unlikely they will need to shard Basecamp in the future.
  • Memory is still expensive in the cloud and on a VPS, where RAM is at a premium. So to move to a RAM-based architecture you need to go bare metal. A generous commenter priced ElastiCache at $20K per month for ~800 GB of memory (12 Quadruple Extra Large nodes at 68 GB each).
  • 37signals uses FusionIO for their databases, but because the cache memory was to be spread across 3 servers, it was cheaper to go with RAM than with FusionIO.
  • Basecamp has a relatively predictable capacity planning problem, so bare metal is far more cost efficient than the cloud. If you are Netflix with wild swings in your usage patterns, then the tradeoffs may be different.
  • SSD has higher densities and lower cost than RAM, but is far slower than RAM. RAM accelerates read and write loads whereas SSD accelerates reads better than writes.
  • 37signals handles failure by having redundancy in all systems. All databases have replicas. Excess capacity is maintained, as are spare servers. There’s no geographic redundancy as of yet.
  • Schema changes, a notorious bottleneck in most relational databases, are less of a problem with SSD. They still cache tables in RAM as much as possible.
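
The cost argument in the takeaways can be checked with a little arithmetic. The sketch below uses only the figures quoted above ($12K for 864 GB of RAM, and the commenter's estimate of $20K per month for twelve 68 GB nodes). It assumes a 30-day month and ignores servers, power, hosting, and staff, so treat it as a rough illustration rather than a full cost comparison.

```python
# Figures quoted in the article; everything else is derived from them.
BARE_METAL_RAM_GB = 864
BARE_METAL_RAM_COST_USD = 12_000   # one-time purchase, RAM only

CLOUD_NODES = 12
CLOUD_NODE_GB = 68
CLOUD_COST_PER_MONTH_USD = 20_000  # commenter's estimate

# Total cloud cache memory: the "~800 GB" in the commenter's quote.
cloud_total_gb = CLOUD_NODES * CLOUD_NODE_GB

# One-time cost per GB when buying RAM outright.
bare_metal_usd_per_gb = BARE_METAL_RAM_COST_USD / BARE_METAL_RAM_GB

# Recurring cost per GB, per month, when renting cache nodes.
cloud_usd_per_gb_month = CLOUD_COST_PER_MONTH_USD / cloud_total_gb

# Days of cloud rental that cost as much as the RAM purchase
# (assumption: a 30-day month).
breakeven_days = BARE_METAL_RAM_COST_USD * 30 / CLOUD_COST_PER_MONTH_USD

print(f"cloud memory:      {cloud_total_gb} GB")
print(f"bare metal RAM:    ${bare_metal_usd_per_gb:.2f} per GB, once")
print(f"cloud cache nodes: ${cloud_usd_per_gb_month:.2f} per GB, every month")
print(f"break-even after:  {breakeven_days:.0f} days of rental")
```

On these numbers the purchased RAM costs less per GB once than the rented memory costs per GB each month, which is the point the 37signals article makes.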

[repost]37signals Architecture

original:

37signals Architecture

Update 7: Basecamp, now with more vroom. Basecamp application servers running Ruby code were upgraded and virtualization was removed. The result: “A 66% reduction in the response time while handling multiples of the traffic is beyond what I expected.” They still use virtualization (Linux KVM), just less of it now.
Update 6: Things We’ve Learned at 37Signals. Themes: less is more; don’t worry be happy.
Update 5: Nuts & Bolts: HAProxy. A nice explanation (post, screencast) by Mark Imbriaco of why HAProxy (a load balancing proxy server) is their favorite (fast, efficient, graceful configuration, queues requests when Mongrels are busy) for spreading dynamic content between Apache web servers and Mongrel application servers.
Update 4: O’Reilly’s Tim O’Brien interviews David Hansson, Rails creator and 37signals partner. He says Basecamp scales horizontally on the application and web tier, and scales up for the database, using one “big ass” 128GB machine. Says: As technology moves on, hardware gets cheaper and cheaper. In my mind, you don’t want to shard unless you positively have to, sort of a last resort approach.
Update 3: The need for speed: Making Basecamp faster. Pages now load twice as fast, cut CPU usage by a third and database time by about half. Results achieved by: Analysis, Caching, MySQL optimizations, Hardware upgrades.
Update 2: Customer support is handled in real-time using Campfire.
Update: highly useful information on creating a customer billing system.


In the giving spirit of Christmas the folks at 37signals have shared a bit about how their system works. 37signals is most famous for loosing Ruby on Rails into the world, and they’ve used RoR to make their very popular Basecamp, Highrise, Backpack, and Campfire products. RoR takes a lot of heat for being a performance dog, but 37signals seems to handle a lot of traffic with relatively normal sounding resources. This is just an initial data dump; they promise to add more details later. As they add more I’ll update it here.

Site: http://www.37signals.com

Information Sources

  • Ask 37signals: Numbers?
  • Ask 37signals: How do you process credit cards?
  • Behind the scenes at 37signals: Support
  • Ask 37signals: Why did you restart Highrise?

    Platform

  • Ruby on Rails
  • Memcached
  • Xen
  • MySQL
  • S3 for image storage

    The Stats

  • 30 servers ranging from single processor file servers to 8 CPU application servers for about 100 CPUs and 200GB of RAM.
  • Plan to diagonally scale by reducing the number of servers to 16 for about 92 CPU cores (each significantly faster than what are used today) and 230 GB of combined RAM.
  • Xen virtualization will be used to improve system management.
  • Basecamp (web based project management)
    * 2,000,000 people with accounts
    * 1,340,000 projects
    * 13,200,000 to-do items
    * 9,200,000 messages
    * 12,200,000 comments
    * 5,500,000 time tracking entries
    * 4,000,000 milestones
  • Backpack (personal and small business information management)
    * Just under 1,000,000 pages
    * 6,800,000 to-do items
    * 1,500,000 notes
    * 829,000 photos
    * 370,000 files
  • Overall storage stats (Nov 2007)
    * 5.9 terabytes of customer-uploaded files
    * 888 GB files uploaded (900,000 requests)
    * 2 TB files downloaded (8,500,000 requests)
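
To make "diagonal scaling" concrete, the sketch below works out per-server averages before and after the planned consolidation, using only the totals quoted above. Note that the article counts "CPUs" for the current fleet and "CPU cores" for the planned one, so the processor figures are not strictly comparable; the averages are an illustration, not a description of any actual machine.

```python
# Fleet totals as quoted in the stats above.
current = {"servers": 30, "processors": 100, "ram_gb": 200}
planned = {"servers": 16, "processors": 92, "ram_gb": 230}


def per_server(fleet):
    """Average processors and RAM per server for a fleet summary."""
    return {
        "processors": fleet["processors"] / fleet["servers"],
        "ram_gb": fleet["ram_gb"] / fleet["servers"],
    }


for name, fleet in (("current", current), ("planned", planned)):
    avg = per_server(fleet)
    print(
        f"{name}: {fleet['servers']} servers, "
        f"{avg['processors']:.2f} processors and "
        f"{avg['ram_gb']:.2f} GB RAM per server on average"
    )
```

Fewer machines, each with more memory and more (and faster) processors: scaling up and consolidating at the same time.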

    The Architecture

  • Memcached caching is used and they are looking to add more. It yields impressive performance results.
  • URL helper methods are used rather than building the URLs by hand.
  • Standard ActiveRecord built queries are used, but for performance reasons they will also “dig in and use” find_by_sql when necessary.
  • They fix Rails when they run into performance problems. It pays to be king :-)
  • Amazon’s S3 is used for storage of files uploaded by users. Extremely happy with results.
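
The article does not say how Memcached is wired into the applications. A common way to use such a cache is the cache-aside pattern, sketched below in Python rather than Ruby. Everything here is an assumption for illustration: `TTLCache` is an in-process stand-in for a Memcached client, and `fetch_project`, the key format, and the 300-second expiry are hypothetical.

```python
import time


class TTLCache:
    """In-process stand-in for a Memcached client: get/set with expiry."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def fetch_project(project_id, cache, load_from_db, ttl_seconds=300):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"project:{project_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = load_from_db(project_id)  # the expensive query
    if row is not None:
        cache.set(key, row, ttl_seconds)
    return row
```

With this shape, only the first read of a given project within the expiry window reaches the database; later reads are served from memory, which is where cheap RAM pays off.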

    Credit Card Processing Process

  • Bill monthly. It makes credit card companies more comfortable because they won’t be on the hook for a large chunk of change if your company goes out of business. Customers also like it better because it costs less up front and you don’t need a contract. Just pay as long as you want the service.
  • Get a Merchant Account. One is needed to process credit cards. They use Chase Bank. Use someone you trust and later negotiate rates when you get enough volume that it matters.
  • Authorize.net is the gateway they use to process the credit card charge.
  • A custom built system handles the monthly billing. It runs each night and bills the appropriate people and records the result.
  • On success an invoice is sent via email.
  • On failure an explanation is sent to the customer.
  • If the card is declined three times the account is frozen until a valid card number is provided.
  • Error handling is critical because problems with charges are common. Freezing too fast is bad; freezing too slowly is also bad.
  • All products are being converted to using a centralized billing service.
  • You need to be PCI DSS (Payment Card Industry Data Security Standard) compliant.
  • Use a gateway service that makes it so you don’t have to store credit card numbers on your site. That makes your life easier because of the greater security. Some gateway services do have recurring billing so you don’t have to do it yourself.
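
The nightly billing flow described in the list can be sketched as follows. This is not 37signals' code: the `Account` fields, the `charge` and `send_email` callables, and the choice to reset the decline counter after a successful charge are all assumptions made for illustration. Only the overall flow (charge, record the result, email an invoice or an explanation, freeze after three declines) comes from the article.

```python
from dataclasses import dataclass, field

MAX_DECLINES = 3  # from the article: three declines freeze the account


@dataclass
class Account:
    email: str
    amount_cents: int
    declines: int = 0
    frozen: bool = False
    history: list = field(default_factory=list)


def run_nightly_billing(due_accounts, charge, send_email):
    """Bill every account that is due tonight.

    charge(account) -> (ok, reason) stands in for the payment gateway call.
    send_email(to, subject, body) stands in for the mailer.
    Both are hypothetical interfaces.
    """
    for account in due_accounts:
        if account.frozen:
            continue  # frozen accounts wait for a valid card
        ok, reason = charge(account)
        # Record the result of every attempt.
        account.history.append(("charged" if ok else "declined", reason))
        if ok:
            account.declines = 0  # assumption: a success resets the count
            amount = f"{account.amount_cents / 100:.2f}"
            send_email(account.email, "Invoice", f"Charged {amount}")
        else:
            account.declines += 1
            send_email(account.email, "Payment problem", reason)
            if account.declines >= MAX_DECLINES:
                account.frozen = True
```

Because the gateway is passed in as a function, the card number never has to be stored by this code, in line with the advice to let the gateway hold card data.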

    Customer Support

  • Campfire is used for customer service. Campfire is a web-based group chat tool, password-protectable, with chatting, file sharing, image previewing, and decision making.
  • Issues discussed are used to drive code changes, and the Subversion commit is shown in the conversation. They seem to skip a bug tracking system, which would make it hard to manage bugs and features in any traditional sense, i.e., you can’t track Subversion changes back to a bug and you can’t report what features and bugs are in a release.
  • Support can solve problems by customers uploading images, sharing screens, sharing files, and chatting in real-time.
  • Developers are always on within Campfire addressing problems in real-time with the customers.

    Lessons Learned

  • Take a lesson from Amazon and build internal functions as services from the start. This makes it easier to share them across all product lines and transparently upgrade features.
  • Don’t store credit card numbers on your site. This greatly reduces your security risk.
  • Developers and customers should interact in real-time on a public forum. Customers get better service as developers handle issues as they come up in the normal flow of their development cycle. Several layers of the usual BS are removed. Developers learn what customers like and dislike, which makes product development more agile. Customers can see the responsiveness of the company to customers by reading the interactions. This goes a long way to give potential customers the confidence and the motivation to sign up.
  • Evolve your software by actual features needed by users instead of making up features someone might need someday. Otherwise you end up building something that nobody wants and that won’t work anyway.