Technology Stacks Used by Other Companies

I really like StackShare’s listing of the technology stacks used by various companies, big and small, many of them names we’ve all heard of.

It’s not terribly useful to take a large company’s use of a technology as an endorsement.  Amazon, for example, wrote its original web technology in PHP, and still has significant chunks of it left.  According to an insider, PHP is “banned” for new functionality on the grounds that it can’t be adequately secured.

It is useful, though, to see where new technologies have been adopted, and to get some context on how other companies are evolving their technology.  After all, there is something of a network effect for any technology or product: more users means more developers and support, which means more users, and so on.

Experimenting with Google Compute Engine

I must say I’ve been happy with Amazon Web Services.  I use accounts for both business and personal projects, and I’ve been very pleased with the progress of their development of additional services, including SQS, SES, and RDS.  I’ve been aware of some of the holes in the stretched pizza dough, but like many consumers, there’s no reason to evaluate other options until things actually get painful.

To be clear, there have been points where the pain came close to inspiring me to at least see what else is out there.  Some examples come to mind:

  1. If you stop or reboot a running instance (which obviously stops your production instance), you’re required to confirm your intention.  If you create a new machine image from a running instance (which, not at all obviously, also stops your instance), there’s no warning.
  2. If you use the Amazon Web Services console to manage your various tools, you’re shown only the obscure initials of the services (EC2, SES, S3).  If you try to manage the administrative logs, you’re shown only the fully spelled-out service names.
  3. Meeting every point on their recommended security checklist requires that you turn off the default login.  But if you already have a retail account connected with your AWS account (which is encouraged, and the two can’t be separated), then you must use the default login.

The pain arrived today.  According to AWS billing records, my otherwise innocent micro instance spent several days last month spewing obscene amounts of data, for an unknown reason, to an unknown destination, racking up a huge bill.  Chances are this is something I could have done something about, but there’s little evidence immediately available even to corroborate that the data actually transferred.  I haven’t yet submitted a ticket to Amazon to see if there’s anything they can do to, at a minimum, explain what happened.
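
Before filing that ticket, one avenue worth trying is pulling the instance’s NetworkOut metric from CloudWatch to see whether the billed transfer shows up there at all.  A minimal sketch using boto3, assuming credentials are configured and the datapoints are still within CloudWatch’s retention window; the instance ID, region, and dates are placeholders:

    import datetime

    import boto3  # assumes boto3 is installed and credentials are configured

    # Pull daily NetworkOut totals for the suspect instance; the instance
    # ID, region, and date range below are placeholders.
    cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
    stats = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='NetworkOut',  # bytes sent out by the instance
        Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],
        StartTime=datetime.datetime(2015, 11, 1),
        EndTime=datetime.datetime(2015, 12, 1),
        Period=86400,             # one datapoint per day
        Statistics=['Sum'],
    )
    for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
        print(point['Timestamp'], point['Sum'] / 1e9, 'GB out')

At least that would corroborate (or contradict) the billing records independently.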

In any case, this has inspired me to evaluate deploying my software on other platforms.  It’s certainly advantageous to at least be very clear on the extent to which you’re committed to a vendor.

I’ve begun separating the actual requirements for the services I use from the niceties that AWS has been providing.  To wit:

  • SSH keys to access from any terminal, plus SFTP service (a quick check of both is sketched below)
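
That requirement boils down to plain key-based SSH, which any provider exposing port 22 can satisfy.  Here’s the kind of quick check I have in mind, as a sketch assuming the paramiko package, with the host address, user name, and key path as placeholders:

    import paramiko  # assumes the paramiko package

    # Verify that key-based SSH and SFTP both work against a candidate host.
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect('203.0.113.10', username='ec2-user',
                   key_filename='/home/me/.ssh/id_rsa')  # placeholders
    stdin, stdout, stderr = client.exec_command('uname -a')
    print(stdout.read().decode())
    sftp = client.open_sftp()  # the same key covers SFTP
    print(sftp.listdir('.'))
    client.close()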

Niceties from AWS I’ll probably miss:

  • EC2 (instance) roles (see the sketch below)
  • AWS command line tools to talk to S3
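
To illustrate the instance-role nicety: on an EC2 instance with an IAM role attached, boto3 picks up temporary credentials automatically, so nothing sensitive sits on disk.  A minimal sketch, with hypothetical bucket and object names:

    import boto3  # on an instance with an IAM role attached, boto3 finds
                  # temporary credentials automatically; no keys on disk

    s3 = boto3.client('s3')
    # The bucket and object names here are hypothetical.
    s3.download_file('my-backup-bucket', 'backups/latest.tar.gz',
                     '/tmp/latest.tar.gz')

Losing that means going back to managing long-lived access keys by hand.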

Niceties from Google I might learn to appreciate:

  • Sustained-use discounts: save money on instances that stay up, without having to pay up front for reserved instances
  • Customizable instance sizes
  • Automatic detailed monitoring stats

Here’s an existing comparison article that has been useful:

http://cloudacademy.com/blog/ec2-vs-google-compute-engine/


Review: Introducing Maven

If I’m going to read a whole book on a technology, I’d like a bit more depth: why it works, rather than just how it works.  Following the examples in the book, I was able to learn how to build configuration files and utilize Maven.  But a different book might have provided more underpinnings: what is the fundamental paradigm that would obviate the need to work through tutorials and practice?

MySQL 5.7.9

MySQL 5.7.9 has some additional security features that are at least useful for slamming your head against, if not actually improving your security.

  1. There’s a default password set for ‘root’ when first running the server.  The password is not generated as a result of the installation, as the documentation claims, but only when the server is actually started (it’s written to the server’s error log).  Don’t go hunting for it until you’ve started the server.
  2. There’s a new default password security policy.  The current setting can be viewed with this query:

     mysql> SHOW VARIABLES LIKE 'validate_password%';
     +--------------------------------------+--------+
     | Variable_name                        | Value  |
     +--------------------------------------+--------+
     | validate_password_dictionary_file    |        |
     | validate_password_length             | 8      |
     | validate_password_mixed_case_count   | 1      |
     | validate_password_number_count       | 1      |
     | validate_password_policy             | MEDIUM |
     | validate_password_special_char_count | 1      |
     +--------------------------------------+--------+
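
If the MEDIUM policy gets in the way, the validate_password settings can be inspected and adjusted at runtime.  A minimal sketch, assuming the mysql-connector-python package, with placeholder credentials:

    import mysql.connector  # assumes the mysql-connector-python package

    # Placeholder host and credentials; adjust for your server.
    conn = mysql.connector.connect(host='127.0.0.1', user='root',
                                   password='the-password-from-the-log')
    cur = conn.cursor()
    cur.execute("SHOW VARIABLES LIKE 'validate_password%'")
    for name, value in cur:
        print(name, '=', value)
    # The policy can be relaxed (or tightened) without a restart:
    cur.execute("SET GLOBAL validate_password_policy = 'LOW'")
    conn.close()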

Comparing Data Storage Costs

Comparing a pay-per-use data storage service like Amazon AWS’s S3 against a service that sells a fixed maximum capacity is a bit of an apples-to-oranges comparison.  If you have, say, 100 gigabytes available through a SugarSync account, your unused capacity will always be greater than zero.  You will always pay for some amount of unused capacity, making your actual gigabyte-month costs higher than they would be if you were charged only for actual usage.  That said, here are some current costs by gigabyte-month:

[table id=1 /]
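
To make the apples-to-oranges gap concrete, the effective cost of a fixed-capacity plan depends on how much of it you actually fill.  A quick worked example, with a hypothetical plan price and usage figures:

    # Effective cost per gigabyte-month for a fixed-capacity plan;
    # the price and usage numbers below are hypothetical.
    plan_price = 7.49      # dollars per month for the plan
    plan_capacity = 100.0  # gigabytes included
    actual_usage = 40.0    # gigabytes actually stored

    print('nominal:   $%.3f/GB-month' % (plan_price / plan_capacity))  # $0.075
    print('effective: $%.3f/GB-month' % (plan_price / actual_usage))   # $0.187

At 40% utilization, the effective rate is two and a half times the nominal one, which is the number a pay-per-use service like S3 should really be compared against.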


CentOS v AWS Linux

This post got me thinking about which version of Linux to run on my AWS EC2 instances.  At first, I found AWS Linux perfectly suitable: it’s endorsed by Amazon, and obviously runs well.  All of the appropriate interactions between the operating system and the virtualization system clearly work, where it isn’t immediately obvious that another distribution would do the same.

The investigation was inspired by the realization that AWS Linux cannot (reasonably) be taken out of AWS to investigate, reproduce problems, or develop offline.  Similarly, you can’t build an AWS Linux image on a local (screaming fast) machine and then push it back to EC2.

I chose CentOS to be that much more agnostic to the hosting virtualization service, though I’m an AWS fan.

Here are some of the pleasant side effects I found of using CentOS:

  • SELinux: more security if you want it (and even if you don’t)
  • Clearer documentation, with examples from the wider world
  • Free, just like AWS Linux
  • Fresher package repositories
  • The environment can be simulated locally
  • No lock-in: you can’t pull AWS Linux out of AWS
  • You can build your own kernel


Fast WordPress Hosting

Since I’ve switched all of my content-only web sites to WordPress, I’m necessarily obsessed with performance, trying to find the most cost-effective way to flexibly run a site and get screamingly fast performance.  I say “flexibly” because I’ve gotten used to installing a variety of software directly from the command line.  This may be unnecessary as I come to trust the available set of WordPress modules more and more, but I still like the flexibility of managing my own Linux server.

So what’s fast enough?

An Amazon t2.micro instance provides a 1.016s average total round trip on repeated viewings.

DreamHost shared hosting averages 2.530s.
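
For what it’s worth, here’s a minimal sketch of how averages like these can be collected, assuming the requests package, with the URL as a placeholder:

    import time

    import requests  # assumes the requests package

    URL = 'http://www.example.com/'  # placeholder for the site under test
    samples = []
    for _ in range(10):
        start = time.time()
        requests.get(URL)
        samples.append(time.time() - start)
    print('average: %.3fs   max: %.3fs'
          % (sum(samples) / len(samples), max(samples)))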

But this wonderful post has me thinking that drawing averages may not be particularly helpful, and its “max response time” values imply that real user experience isn’t something that can be terribly well controlled.

Ignorance of IP Name Resolution

An ideal blog post provides answers for others who may be searching.  This one doesn’t.  It’s an accounting of the mysteries of “localhost” on CentOS Linux instances.  I’ll post updates as I actually start to understand the answers.

Fetching a web page from localhost is slower than fetching from 127.0.0.1

127.0.0.1: average response 0.015s

localhost: average response 0.166s

So an entire end-to-end fetch is roughly ten times as slow using localhost, and the name resolution is apparently not cached, as it doesn’t get any faster with repeated runs.

Yes, “localhost” is an alias for 127.0.0.1 in my hosts file, so I expect that an actual DNS request on the wire should not be necessary.
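
One way to isolate the lookup cost from the fetch itself is to time the resolution calls directly; a sketch using only Python’s standard library:

    import socket
    import time

    # Time just the name-to-address lookup, separate from any HTTP fetch.
    for host in ('localhost', '127.0.0.1'):
        start = time.time()
        for _ in range(100):
            socket.getaddrinfo(host, 80)
        per_lookup = (time.time() - start) / 100
        print('%-12s %.4fs per lookup' % (host, per_lookup))

If the gap shows up here too, the slowdown is in resolution rather than in the web server.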

WordPress won’t find my database on host “localhost”, but it will find it at 127.0.0.1

Okay?