Datadog homework for Christopher Wright

Running

  • Restore NuGet packages (I use MonoDevelop to do this)
  • Compile with msbuild or your IDE
  • Run the resulting exe; the first parameter is the path to the CSV file

Implementation notes

Memory usage should be relatively low. Mild improvements could be achieved by switching a couple of classes to structs, but that makes them significantly more error-prone to work with, since they're mutable and we need to mutate them as rvalues (e.g. within a collection). Interning strings might also help the GC reclaim memory faster. And, of course, switching to the newfangled UTF-8 types in recent versions of .NET Core would let us reuse memory much more effectively.

The tracked stats are stored as fields in a struct, which isn't the most flexible design, but it's straightforward. We do the same aggregation and tracking for every purpose; that's not optimal, but the cost is small in this case. We have to aggregate over multiple time ranges because we print stats on 10-second periods but alert on 2-minute periods.
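The two time ranges compose neatly: the 2-minute alert window is just the last twelve 10-second buckets. A minimal sketch of that idea (in Python for brevity, since the actual project is C#; all names here are hypothetical, not taken from the code):

```python
from collections import deque

BUCKET_SECONDS = 10
ALERT_WINDOW_BUCKETS = 12  # 12 x 10 s = the 2-minute alert window


class BucketedStats:
    """Aggregate request counts in 10-second buckets and evaluate the
    alert condition over the sum of the most recent buckets."""

    def __init__(self):
        # Only the last 12 closed buckets matter for alerting.
        self.buckets = deque(maxlen=ALERT_WINDOW_BUCKETS)
        self.current = 0

    def record(self, count=1):
        """Add a request to the bucket currently being filled."""
        self.current += count

    def close_bucket(self):
        """Called every 10 seconds: report this bucket's total, then
        roll it into the alert window and start a fresh bucket."""
        finished = self.current
        self.buckets.append(finished)
        self.current = 0
        return finished

    def alert_window_total(self):
        """Total requests over the last 2 minutes of closed buckets."""
        return sum(self.buckets)
```

The deque's `maxlen` silently discards buckets older than the alert window, so memory stays bounded regardless of how long the reader runs.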

The rolling window aggregation isn't the fastest possible implementation. A faster one would maintain a running aggregate for the current window, subtracting stats that slide out of the window and adding those that enter. However, that would require implementing a de-aggregation mechanism. Reducing that work by a factor of roughly 60 would yield a moderate performance improvement.

On my machine, I process about 33 records per millisecond, which is likely fast enough to evaluate whether this approach would hold up in production.