{"id":218,"date":"2017-05-05T20:22:17","date_gmt":"2017-05-05T20:22:17","guid":{"rendered":"http:\/\/www.morsello.com\/?p=218"},"modified":"2017-05-05T20:22:17","modified_gmt":"2017-05-05T20:22:17","slug":"remembering-fail-safe","status":"publish","type":"post","link":"https:\/\/www.morsello.com\/index.php\/2017\/05\/05\/remembering-fail-safe\/","title":{"rendered":"Remembering Fail-Safe"},"content":{"rendered":"<p class=\"p2\"><span class=\"s1\">The description \u201cfail safe\u201d is commonly used to mean something foolproof, or a system with backup systems to prevent failure.<span class=\"Apple-converted-space\">\u00a0 <\/span>In other words, \u201csafe from failure\u201d.<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\">That\u2019s a shame, since we have plenty of words that already mean that. <span class=\"Apple-converted-space\">\u00a0 <\/span>My dictionary defines fail-safe as <i>\u2026 a system \u2026 that insures safety if the system fails to operate<span class=\"Apple-converted-space\">\u00a0 <\/span>properly.<span class=\"Apple-converted-space\">\u00a0 <\/span><\/i>The original meaning was \u201csafe <i>in case<\/i> of failure\u201d.<span class=\"Apple-converted-space\">\u00a0 <\/span>Things break.<span class=\"Apple-converted-space\">\u00a0 <\/span>How do we head off catastrophe?<\/span><\/p>\n<p class=\"p3\"><span class=\"s1\"><b>Real-world examples<\/b><\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> The TCP network protocol \u201cguarantees\u201d delivery, but it\u2019s fail-safe.<span class=\"Apple-converted-space\">\u00a0 <\/span>If a packet can\u2019t be delivered even after retransmission, as happens, the connection is dropped rather than accepting either partial or corrupted data.<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> In the movie <i>Die Hard<\/i>, the engineers of Nakatomi Plaza decided that safety meant that in the event of a power failure all the security systems of the building would be <i>dis<\/i>-abled.<span class=\"Apple-converted-space\">\u00a0 
<\/span>In the movie that meant the bad guys could get into the vault.<span class=\"Apple-converted-space\">\u00a0 <\/span>In the real world, that decision would prevent people from being locked into the building.<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> After thousands of deaths resulting from train accidents, train car brakes are now <i>engaged<\/i> by default.<span class=\"Apple-converted-space\">\u00a0 <\/span>A pressure line powered by the locomotive pulls the brake pads away from the wheels.<span class=\"Apple-converted-space\">\u00a0 <\/span>In the event that any part of the braking system (the non-braking system?) fails, the brakes are pressed against the wheels.<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> Airplanes use positive indicators for the status of important functions such as the landing gear being down.<span class=\"Apple-converted-space\">\u00a0 <\/span>Instead of an error light if the gear has failed, there\u2019s a no-error light if the gear is locked.<span class=\"Apple-converted-space\">\u00a0 <\/span>Should the sensor, wiring, or bulb fail, the indication is that the gear is <i>not<\/i> down.<span class=\"Apple-converted-space\">\u00a0 <\/span>Better to have the gear down and think it\u2019s not than to think it\u2019s down when it isn\u2019t.<\/span><\/p>\n<p class=\"p3\"><span class=\"s1\"><b>Value in software<\/b><\/span><\/p>\n<p class=\"p2\"><span class=\"s1\">This idea that we should expect failure isn\u2019t novel; it\u2019s called <i>testing<\/i>.<span class=\"Apple-converted-space\">\u00a0 <\/span>But arguably the primary purpose of testing is to identify defects in the software to avoid failure in production.<span class=\"Apple-converted-space\">\u00a0 <\/span>Is there value in assuming that we won\u2019t be successful at preventing every possible anomalous condition, including our own code not doing what we expect?<span class=\"Apple-converted-space\">\u00a0 <\/span>Consider the questions that <i>fail safe<\/i> raises.<\/span><\/p>\n<p 
class=\"p5\"><span class=\"s2\"> What can fail?<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> Your software has bugs in it.<span class=\"Apple-converted-space\">\u00a0 <\/span>Networks go down.<span class=\"Apple-converted-space\">\u00a0 <\/span>You may get broken input.<span class=\"Apple-converted-space\">\u00a0 <\/span>You may get correct input that breaks your system because you didn\u2019t know the correct format.<span class=\"Apple-converted-space\">\u00a0 <\/span>You may get data in the wrong order.<span class=\"Apple-converted-space\">\u00a0 <\/span>Software you didn\u2019t write but you\u2019re counting on may fail.<\/span><\/p>\n<p class=\"p5\"><span class=\"s2\"> What is \u201csafe\u201d?<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> What\u2019s the best result when failure happens?<span class=\"Apple-converted-space\">\u00a0 <\/span>Roll back a transaction?<span class=\"Apple-converted-space\">\u00a0 <\/span>Immediately kill a system?<span class=\"Apple-converted-space\">\u00a0 <\/span>Display an error?<span class=\"Apple-converted-space\">\u00a0 <\/span>Throw an exception?<\/span><\/p>\n<p class=\"p5\"><span class=\"s2\"> How do we get back from \u201csafe\u201d to operational again?<\/span><\/p>\n<p class=\"p2\"><span class=\"s1\"> Having decided what failure means and how to enter a safe mode, we may not yet have asked ourselves how to get things going again.<span class=\"Apple-converted-space\">\u00a0 <\/span>If we reject entry of a file that contains erroneous data, how do we notify someone to deal with that?<span class=\"Apple-converted-space\">\u00a0 <\/span>How do we get it out of a queue to be processed again?<\/span><\/p>\n<p class=\"p3\">\n","protected":false},"excerpt":{"rendered":"<p>The description \u201cfail safe\u201d is commonly used to mean something foolproof, or a system with backup systems to prevent failure.\u00a0 In other words, \u201csafe from failure\u201d. 
That\u2019s a shame, since we have plenty of words that already mean that. \u00a0 My dictionary defines fail-safe as \u2026 a system \u2026 that insures safety if the system&hellip;<\/p>\n<p><a class=\"more-link\" href=\"https:\/\/www.morsello.com\/index.php\/2017\/05\/05\/remembering-fail-safe\/\" title=\"Continue reading &lsquo;Remembering Fail-Safe&rsquo;\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/posts\/218"}],"collection":[{"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/comments?post=218"}],"version-history":[{"count":2,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/posts\/218\/revisions"}],"predecessor-version":[{"id":220,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/posts\/218\/revisions\/220"}],"wp:attachment":[{"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/media?parent=218"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/categories?post=218"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.morsello.com\/index.php\/wp-json\/wp\/v2\/tags?post=218"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}