How we improved availability through iterative simplification
GitHub used several tools and tests to improve system performance and reduce request timeouts through code and query adjustments. One source of frequent timeouts was the Command Palette code that loaded lists of repositories. Testing a candidate replacement for that code showed an 80 to 90% improvement in performance. Further experiments followed: application-level sorting, which eliminated a SQL query but made performance 40 to 80% worse and was therefore discarded, and a batched query, which improved performance by 20 to 80%. A similar pattern was found in related code, where testing confirmed a 30 to 40% performance improvement. An analysis of the busiest request endpoints showed further room for improvement, which was addressed through filtering and feature flag changes.
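To make the batched-query idea concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not GitHub's code: the `repositories` table, its columns, and the function names are hypothetical, chosen only to contrast one query per owner with a single batched `IN (...)` query.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repositories (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO repositories (owner_id, name) VALUES (?, ?)",
        [(1, "zeta"), (1, "alpha"), (2, "mango"), (3, "kiwi"), (2, "beta")],
    )
    return conn


def names_one_query_per_owner(conn, owner_ids):
    """Unbatched version: issues one SELECT per owner, so N owners cost N round trips."""
    names = []
    for owner_id in owner_ids:
        rows = conn.execute(
            "SELECT name FROM repositories WHERE owner_id = ?", (owner_id,)
        ).fetchall()
        names.extend(row[0] for row in rows)
    return names


def names_batched(conn, owner_ids):
    """Batched version: fetches the rows for all owners in a single SELECT."""
    owner_ids = list(owner_ids)
    if not owner_ids:
        return []  # "IN ()" is not valid SQL, so short-circuit on empty input
    placeholders = ",".join("?" for _ in owner_ids)
    rows = conn.execute(
        f"SELECT name FROM repositories WHERE owner_id IN ({placeholders}) ORDER BY name",
        owner_ids,
    ).fetchall()
    return [row[0] for row in rows]
```

Both functions return the same set of names; the batched one trades several round trips for one larger query. Whether that is a net win depends on the database and the data, which is why measuring each candidate before adopting it matters.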