Technology Stacks Used by Other Companies

I really like Stackshare’s listing of technology stacks used by various companies from big to small, many of which we’ve heard of.

A large company’s use of a technology isn’t much of an endorsement in itself.  Amazon, for example, wrote their original web technology in PHP, and still has significant chunks left.  According to an insider, PHP is “banned” for new functionality on the grounds that PHP can’t be adequately secured.

It is useful to see where new technologies have been adopted, and get some context on how other companies are evolving their technology.  After all, there is something of a network effect for any technology or product: more users means more developers and support which means more users, etc.

Review: Learning Docker

Learning Docker

1. Getting Started with Docker

Installing Docker via AWS Elastic Beanstalk or on a local Linux instance.  Brief reference to other container systems on Windows and Mac.

2. Handling Docker Containers

3. Building Images

4. Publishing Images

5. Running Your Private Docker Infrastructure

6. Running Services in a Container

7. Sharing Data with Containers

 

Detailed.

Experimenting with Google Compute Engine

I must say I’ve been happy with Amazon Web Services.  I use accounts for both business and personal projects, and I’ve been very pleased with the progress of their development of additional services, including SQS, SES, and RDS.  I’ve been aware of some of the holes in the stretched pizza dough, but like many consumers, I’ve seen no reason to evaluate other options until things actually get painful.

To be clear, there have been points where the pain has come close to inspiring me at least to see what else is out there.  Some examples come to mind:

  1. If you stop or reboot a running instance (which obviously stops your production instance), you’re required to confirm your intention.  If you create a new machine image from a running instance (which, not at all obviously, also stops your instance), there’s no warning.
  2. If you use the Amazon Web Services console to manage your various tools, you’re shown only the obscure initials for the services— EC2, SES, S3.  If you try to manage the administrative logs, you’re shown only the fully spelled out service names.
  3. Meeting all the recommended security points on their checklist requires that you turn off the default login.  But if you already have a retail account connected with your AWS account— which is encouraged and can’t be separated— then you must use the default login.

The pain arrived today.  According to AWS billing records, my otherwise innocent micro instance spent several days last month spewing obscene amounts of data for an unknown reason to an unknown destination, racking up a huge bill.  While chances are this is something I might have been able to do something about, there’s little evidence immediately available to even corroborate that this data was actually transferred.  I haven’t yet submitted a ticket to Amazon to see if there’s anything they can do to, at a minimum, explain what happened.

In any case, this has inspired me to evaluate deploying my software on other platforms.  It’s certainly advantageous to at least be very clear on the extent to which you’re committed to a vendor.

I’ve begun separating the actual requirements for the services I use from the niceties that AWS has been providing.  To wit:

  • SSH keys to access from any terminal and SFTP service

Niceties from AWS I’ll probably miss:

  • EC2 (instance) roles
  • AWS command line tools to talk to S3

Niceties from Google I might learn to appreciate:

  • Save money on instances that stay up without having to pay for reserved instances
  • Customizable instance sizes
  • Automatic detailed monitoring stats

Here are some existing comparison articles that have been useful:

http://cloudacademy.com/blog/ec2-vs-google-compute-engine/

 

Review: Introducing Maven

Introducing Maven

Introducing Maven offers comparisons with Gradle, Ant, Ivy, then explains the tasks that can be performed with it, and details of configuring a pom.xml project file.

If I’m going to read a whole book on a technology, I’d like a bit more depth: why it works, rather than just how it works.  By following the examples in the book, I was able to learn how to build configuration files and use Maven.  But a different book might have provided more underpinnings: what is the fundamental paradigm that would obviate the need to go through tutorials and practice?

MySQL 5.7.9

MySQL 5.7.9 has some additional security features that are at least useful for slamming your head against, if not actually improving your security.

  1. There’s a default password set for ‘root’ when first running the server.  The password is not generated as a result of the installation as the documentation claims, but only when the server is actually started.  Don’t go hunting for it until you’ve started the server.
  2. There’s a new default password security policy.  The current setting can be viewed with this query:
    mysql> SHOW VARIABLES LIKE 'validate_password%';
    +--------------------------------------+--------+
    | Variable_name                        | Value  |
    +--------------------------------------+--------+
    | validate_password_dictionary_file    |        |
    | validate_password_length             | 8      |
    | validate_password_mixed_case_count   | 1      |
    | validate_password_number_count       | 1      |
    | validate_password_policy             | MEDIUM |
    | validate_password_special_char_count | 1      |
    +--------------------------------------+--------+
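
To avoid guessing at what the policy will reject, here is a minimal sketch that checks a candidate password against the values shown in the output above.  The function name and messages are hypothetical, and it assumes that a "special" character is anything that is not a letter or a digit; the server's own validation is the final word.

```python
# Values copied from the SHOW VARIABLES output above.
POLICY = {
    "length": 8,
    "mixed_case_count": 1,
    "number_count": 1,
    "special_char_count": 1,
}


def policy_violations(password, policy=POLICY):
    """Return a list of rules the candidate password fails (hypothetical helper)."""
    lower = sum(c.islower() for c in password)
    upper = sum(c.isupper() for c in password)
    digits = sum(c.isdigit() for c in password)
    # Assumption: "special" means any character that is not a letter or digit.
    special = sum(not c.isalnum() for c in password)

    problems = []
    if len(password) < policy["length"]:
        problems.append("too short")
    # The mixed-case count applies to lowercase and uppercase separately.
    if lower < policy["mixed_case_count"] or upper < policy["mixed_case_count"]:
        problems.append("needs mixed-case letters")
    if digits < policy["number_count"]:
        problems.append("needs a digit")
    if special < policy["special_char_count"]:
        problems.append("needs a special character")
    return problems


if __name__ == "__main__":
    for candidate in ("password", "Passw0rd!"):
        print(candidate, policy_violations(candidate))
```

An empty list means the candidate should pass the MEDIUM settings listed above; anything else names the rule to fix before trying to set the password.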

Review: Core Java Volume I

Core Java Volume I-Fundamentals
Cay S. Horstmann; Gary Cornell
Prentice Hall

One measure of a great novel is whether it's worth reading a second time.  Similarly, the measure of a great technical book is whether it's worth reading cover to cover when you are already familiar with the topic.  As a Java programmer, I might have assumed that reading an entire book on the Java language would not be a worthwhile way to pick up the handful of tidbits that I'd managed never to have acquired.  But I would have been wrong.

While this text could be used by someone with a programming background but new to Java, it's structured well enough to solidify and enhance the understanding of someone who's already done this for a few years.  Specifically, the authors' experience shows when they explain various options for using a feature of the language, compare them, and then unabashedly advise "do it this way; don't do it the other way".  For example, the section on multithreading gives clear and concise explanations of how to create threads and locks and use the "synchronized" keyword, and compellingly explains why you'd want to use each of the various options, including the "never do this" cases.

It would have been easy to gloss over "type erasure", as the phrase doesn't imply what it is or why we would care.  Because the authors include clear opinions and warnings from their own experience, I now understand that generic types in Java are a compile-time construct rather than a full enhancement to the underpinnings of the runtime.  The authors explain every consequence of the fact that Java containers actually lose some of their type information after compilation, and therefore why there are arbitrary-seeming limitations and otherwise inexplicable compile-time errors.

While this book is titled as if it would be a solid reference book to leave collecting dust on your shelf, I found it much better suited to enhancing my mental framework of what Java can and cannot do, simply by reading it.

Comparing Data Storage Costs

Comparing a pay-per-use data storage service like Amazon AWS’s S3 against a service with maximum reserved storage is a bit of an apples-to-oranges comparison.  If you have, say, 100 gigabytes available through a Sugarsync account, your unused capacity will always be greater than zero.  You will always pay for some amount of unused capacity, making your actual gigabyte-month costs higher than they would be if you were only charged for actual usage.  That said, here are some current costs by gigabyte-month:

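+-------------------+--------------------+-----------------+------------------------+-------------------------+
| Service           | Option             | Advertised rate | Effective rate at 50GB | Effective rate at 100GB |
+-------------------+--------------------+-----------------+------------------------+-------------------------+
| Sugarsync         | $7.49/month 100 GB | $.0749          | $.1498                 | $.075                   |
| AWS S3            | less than 1TB      | $.0300          | $.03                   | $.030                   |
| AWS Glacier       | less than 1TB      | $0.007 per GB   | $.007                  | $.007                   |
| Dropbox           | 1TB $9.99          | $.01            | $.199                  | $.099                   |
| Amazon CloudDrive | $69.99/year        |                 | $.1166                 | $.058                   |
+-------------------+--------------------+-----------------+------------------------+-------------------------+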
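
The effective rates above are just the monthly price divided by the gigabytes actually stored.  Here is a minimal sketch of that arithmetic, assuming the flat-rate prices listed above and treating the yearly CloudDrive price as one twelfth per month; the names are hypothetical.

```python
# Flat-rate plans listed above, expressed as a monthly price in dollars.
# Amazon CloudDrive is billed yearly, so its price is divided by 12.
FLAT_PLANS = {
    "Sugarsync": 7.49,
    "Dropbox": 9.99,
    "Amazon CloudDrive": 69.99 / 12,
}

# Pay-per-use services charge the same per-GB rate no matter how much is
# stored (within the "less than 1TB" tier listed above).
METERED_RATES = {
    "AWS S3": 0.03,
    "AWS Glacier": 0.007,
}


def effective_rate(monthly_price, used_gb):
    """Dollars per gigabyte-month actually paid when only used_gb is stored."""
    if used_gb <= 0:
        raise ValueError("used_gb must be positive")
    return monthly_price / used_gb


if __name__ == "__main__":
    for used_gb in (50, 100):
        for name, price in FLAT_PLANS.items():
            rate = effective_rate(price, used_gb)
            print(f"{name} at {used_gb}GB: ${rate:.4f} per GB-month")
        for name, rate in METERED_RATES.items():
            print(f"{name} at {used_gb}GB: ${rate:.4f} per GB-month")
```

The flat-rate numbers fall as usage approaches the plan's capacity, while the metered rates stay put, which is the apples-to-oranges problem in a nutshell.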

 

CentOS v AWS Linux

This post got me thinking about which distribution of Linux to run on my AWS EC2 instances.  At first, I found the AWS Linux perfectly suitable: it’s endorsed by Amazon, and obviously runs well.  All of the appropriate interactions between the operating system and the virtualization system obviously work, whereas it isn’t immediately obvious that other distributions would do the same.

My investigation was inspired by the realization that AWS Linux cannot (reasonably) be taken out of AWS to investigate issues, reproduce problems, or develop offline.  Similarly, you can’t build an AWS instance on a local (screaming fast) machine and then push an image back to EC2.

I chose CentOS to be that much more agnostic to the hosting virtualization service, though I’m an AWS fan.

Here are some of the pleasant bonuses I found from using CentOS:

  • SELinux — more security if you want it (and if you don’t)
  • Clearer documentation – examples from the world
  • CentOS is free, just like AWS Linux
  • fresher repositories
  • simulate environment locally
  • can’t pull AWS Linux
  • Build your own kernel

 

Fast WordPress Hosting

Since I’ve switched all my web sites  providing only content to WordPress, I’m necessarily obsessed with performance and trying to find the most cost-effective way to flexibly run a site and get screamingly fast performance.  I say “flexibly” because I’ve gotten used to installing a variety of software directly from the command line.  This may be unnecessary as I trust the available set of WordPress modules more and more, but I still like the flexibility of managing my own Linux server.

So what’s fast enough?

An Amazon t2.micro instance provides 1.016s average total round trip on repeated viewings.

Dreamhost shared hosting averages 2.530s.

But this wonderful post has me thinking that averages may not be particularly helpful.  It also implies that, given how large “max response time” values can be, real user experience is not something that can be controlled terribly well.
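
To see why an average can mislead, here is a minimal sketch comparing the mean with the median, a high percentile, and the maximum.  The sample timings are made up for illustration and are not measurements from either host.

```python
import statistics

# Hypothetical response times in seconds; one slow outlier among fast ones.
SAMPLES = [0.9, 1.0, 0.95, 1.1, 0.98, 1.02, 0.97, 1.05, 0.99, 6.2]


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    print("mean   :", statistics.mean(SAMPLES))
    print("median :", statistics.median(SAMPLES))
    print("p90    :", percentile(SAMPLES, 90))
    print("max    :", percentile(SAMPLES, 100))
```

With data shaped like this, the mean sits well above what most requests experienced and well below what the unluckiest one did, so it describes nobody's actual visit.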

Ignorance of IP Name Resolution

An ideal blog post probably provides answers to others who may be searching.  This one doesn’t.  This is an accounting of the mysteries of “localhost” on CentOS Linux instances.  I’ll post updates as I actually start to understand the answers.

Fetching a web page from localhost is slower than fetching from 127.0.0.1

127.0.0.1: average response .015s

localhost: average response .166s

So an entire end-to-end fetch is ten times as slow using localhost, and the IP address is apparently not cached, as it doesn’t get any faster with repeated runs.

Yes, “localhost” is an alias to 127.0.0.1 in my hosts file, so I expect that an actual DNS request on the wire should not be necessary.
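
One way to narrow this down is to time the name lookup by itself, separately from the web fetch, and to list what the name resolves to.  This is a minimal sketch using Python's standard socket module; the helper names are hypothetical, and whether “localhost” also resolves to an IPv6 address such as ::1 on a given machine is something to check, not a diagnosis.

```python
import socket
import time


def resolved_addresses(host, port=80):
    """Return the addresses the system resolver gives for host, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def average_lookup_seconds(host, port=80, runs=20):
    """Average wall-clock time of a resolver lookup for host over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    for host in ("127.0.0.1", "localhost"):
        try:
            addresses = resolved_addresses(host)
            seconds = average_lookup_seconds(host)
            print(f"{host}: {addresses} ({seconds:.6f}s per lookup)")
        except socket.gaierror as error:
            print(f"{host}: lookup failed ({error})")
```

If the lookup times are nearly identical, the extra time is being spent somewhere after resolution, such as in the connection itself.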

WordPress won’t find my database on host “localhost”, but it will find it at 127.0.0.1

Okay?